Conversation


@steveharter steveharter commented Oct 1, 2020

ArrayBufferWriter performance can become unusable when calling GetMemory(sizeHint) once the previously allocated size is >= int.MaxValue / 2. When a re-alloc is necessary the buffer would normally double, but the current algorithm says that once a size of int.MaxValue / 2 is reached it can no longer double (to avoid an OutOfMemoryException), so instead it grows by sizeHint, which may be a smallish number (4,096 using byte[] in testing), causing many re-allocs over time. This PR changes the algorithm to grow by half of the remaining available range instead of by sizeHint.

This was discovered while adding benchmarks, where a test appeared to be hanging; in fact the benchmark forgot to clear the ArrayBufferWriter between runs and was doing hundreds of re-allocs, each calling Buffer.Memmove, which is slow: at ~1GB (when using byte) each copy takes about 0.5 seconds on local hardware.

For example, before this PR, the following takes 50 seconds for only 100 GetMemory requests of 4K each:

    const int OneGig = (Int32.MaxValue / 2) + 1; // 1,073,741,824

    var output = new ArrayBufferWriter<byte>(OneGig);
    output.GetMemory(OneGig);
    output.Advance(OneGig);

    // This takes 50 seconds for only 409,600 bytes
    for (int i = 0; i < 100; i++)
    {
        output.GetMemory(4096);
        output.Advance(4096);
    }

With the changes in this PR, the same loop takes 0.75 seconds, with a worst case of 17 seconds when allocating all the way to the 2GB barrier in 4K requests. The previous worst case was ~36 hours(!): each resize takes ~0.5 seconds, so (1GB / 4K = 262,144) resizes * 0.5 seconds ≈ 36 hours.
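The resize counts above can be sanity-checked with a small model of the two growth policies. This is a sketch, not the actual runtime code: `INT_MAX`, `count_resizes`, and the 1 GiB starting size are assumptions standing in for the real ArrayBufferWriter<byte> internals.

```python
# Model of ArrayBufferWriter<byte> growth, starting from a 1 GiB buffer.
INT_MAX = 2**31 - 1

def count_resizes(grow_by, start, size_hint=4096):
    """Count re-allocs until the next growth would exceed INT_MAX (OOM)."""
    current, resizes = start, 0
    while current + grow_by(current, size_hint) <= INT_MAX:
        current += grow_by(current, size_hint)
        resizes += 1
    return resizes

# Old policy: once doubling is no longer possible, grow by sizeHint only.
old = count_resizes(lambda cur, hint: hint, 2**30)

# New policy: grow by the larger of sizeHint and half of the remaining range.
new = count_resizes(lambda cur, hint: max(hint, (INT_MAX - cur) // 2 + 1), 2**30)

print(old, new)  # ~262k re-allocs under the old policy vs. a couple dozen under the new one
```

At ~0.5 seconds per re-alloc this is the difference between hours and a few seconds of total copy time.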

@steveharter steveharter added the tenet-performance Performance related issue label Oct 1, 2020
@steveharter steveharter added this to the 6.0.0 milestone Oct 1, 2020
@steveharter steveharter self-assigned this Oct 1, 2020
    {
        newSize = currentLength + sizeHint;
        // Attempt to grow by the larger of the minimum size and half of the available size.
        growBy = Math.Max(sizeHint, (int.MaxValue - currentLength) / 2 + 1);
Contributor Author

The +1 is to attempt to allocate an extra byte to have it align on a nice boundary

Contributor

How does adding 1 align on a boundary? Can't currentLength be a wide variety of even or odd numbers?
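One way to see the intent of the +1 (an editorial sketch, not from the PR): when currentLength is itself a power of two, the remaining range `int.MaxValue - currentLength` has the form 2^k - 1, so integer-halving it gives 2^(k-1) - 1, and the +1 restores a power of two:

```python
INT_MAX = 2**31 - 1

current = 2**30                      # a 1 GiB buffer
grow = (INT_MAX - current) // 2 + 1  # the PR's growth formula
print(grow == 2**29)                 # True; without the +1 it would be 2**29 - 1
print((current + grow) % 2**29 == 0) # True; the new size lands on a 512 MiB boundary
```

As the review comment points out, this only helps when currentLength is itself nicely aligned; for arbitrary odd lengths the +1 does not produce alignment.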

    if ((uint)newSize > int.MaxValue)
    {
        newSize = currentLength + sizeHint;
        // Attempt to grow by the larger of the minimum size and half of the available size.
Contributor Author

Instead of doing this, we could allocate the magic number that represents the largest array size, but allocating half of the available size seems less risky.

Member

the magic number that represents the largest array size

It is what we do in a number of other places. I do not see a problem with doing it here as well.

We can consider adding API that returns the max size to make it less magic.

Contributor

@ahsonkhan ahsonkhan Oct 2, 2020

What would be that magic number? int.max - 56, or 2 billion, or something else?

Until we have such an API, let's add it as a named constant here, similar to DefaultInitialBufferSize = 256.

Either approach would be reasonable.

The downside of going all the way to ~2 billion right away is that for output data that is around 1.1 to 1.5 billion, we are over-allocating by quite a lot (even if that is relatively fast because we only resize once, it isn't very memory efficient). So the current approach of incrementing half-way between current and max has some benefit there.

On the other hand, the closer you get to 2 billion by incrementing the size this way, the increments decay exponentially and become small again, making things slow again for those edge cases (possibly even 2-100 bytes).

with a max of 17 seconds if allocating all the way to the 2GB barrier (4K requests).

Can we hit a balance where we get the best of both worlds, and keep the worst case time/number of growths down to < 5 seconds too?

Contributor Author

What would be that magic number? int.max - 56, or 2 billion, or something else?

It varies a bit between byte and other element types: https://github.com/dotnet/runtime/search?q=MaxArrayLength

For now, I think using the max makes sense: since we have already allocated at least Int32.MaxValue / 2, we would always be allocating less than double the current buffer size. I.e. the current algorithm doubles on each alloc, and the last alloc using the max doesn't quite double, so it seems fine from that viewpoint.
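The "doesn't quite double" claim can be checked directly with a small model (a sketch; the sample step of 10,000,000 is illustrative):

```python
INT_MAX = 2**31 - 1

# Once the buffer is past int.MaxValue / 2, growing by half of the remaining
# range is always at most the current length, i.e. the size never more than doubles.
never_doubles = all(
    (INT_MAX - cur) // 2 + 1 <= cur
    for cur in range(INT_MAX // 2 + 1, INT_MAX, 10_000_000)
)
print(never_doubles)  # True
```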

@steveharter steveharter changed the title Improve ArrayBufferWriter perf when total memory allocated is >=1GB Improve ArrayBufferWriter re-alloc perf when index is > int.MaxSize / 2 Oct 1, 2020
@steveharter steveharter changed the title Improve ArrayBufferWriter re-alloc perf when index is > int.MaxSize / 2 Improve ArrayBufferWriter re-alloc perf when size is > int.MaxSize / 2 Oct 1, 2020
@steveharter
Contributor Author

steveharter commented Oct 1, 2020

/azp run runtime-libraries-coreclr outerloop

@dotnet dotnet deleted a comment from azure-pipelines bot Oct 1, 2020
@dotnet dotnet deleted a comment from azure-pipelines bot Oct 1, 2020
@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dotnet dotnet deleted a comment from azure-pipelines bot Oct 1, 2020
@steveharter
Contributor Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

    sealed class ArrayBufferWriter<T> : IBufferWriter<T>
    {
        // Copy of Array.MaxArrayLength. For byte arrays the limit is slightly larger.
        private const int MaxArrayLength = 0X7FEFFFFF;
Member

Can we just use Array.MaxArrayLength, so we don't have this constant in multiple places?

Member

This file is compiled in NS2.0 assemblies, so it needs to have a local copy. I agree it would be nice to have an API for this - reactivating #31366

Member

Will it matter on .NET Framework, where MaxArrayLength might be smaller (I believe not, due to how the buffer grows, but I thought I'd double check)?

Member

Right, it is not easy to find the maximum array length in .NET Framework. It depends on gcAllowVeryLargeObjects configuration setting. It just means that this code may throw OOM on .NET Framework even when it could have been avoided. I think that is fine.

@steveharter
Contributor Author

We should consider this for servicing since 5.0 has a regression from 3.1. In 3.1, the code would throw an exception eagerly rather than try to slowly grow the buffer. In 5.0, the code was changed to slowly grow the buffer, increasing by sizeHint once more than half the max size is reached. When sizeHint is small, code can appear to hang: as mentioned earlier, it could take ~36 hours when sizeHint == 4K before OOM is thrown.

Also note that 5.0 changed from OverflowException to OutOfMemoryException (which is more appropriate).

cc @ericstj

@ericstj
Member

ericstj commented Oct 5, 2020

in 5.0 changed from OverflowException to OutOfMemoryException

I've marked #32587 as breaking. @steveharter can you file the breaking change doc?

We should consider this for servicing since 5.0 has a regression from 3.1.

Typically we don't consider it breaking when we avoid throwing and instead make progress. How likely would you think production code is to get over 1 GB while still advancing small amounts? Have we heard about this from any customers?

@ahsonkhan
Contributor

in 5.0 changed from OverflowException to OutOfMemoryException

I've marked #32587 as breaking. @steveharter can you file the breaking change doc?

Changing exception types isn't considered a breaking change, correct? I am not sure we need a breaking change doc for that.

Additionally, having the performance degrade beyond 1 GB in certain edge cases isn't really a breaking change either.

@steveharter
Contributor Author

Changing exception types isn't considered a breaking change, correct? I am not sure we need a breaking change doc for that.

Additionally, having the performance degrade beyond 1 GB in certain edge cases isn't really a breaking change either.

I think this is more a bug than a breaking change. The exception is now more appropriate, and OOM exceptions are not normally caught either. A potential breaking change is that there are "hang-like" performance characteristics in certain scenarios.

Typically we don't consider it breaking when we avoid throwing and instead make progress. How likely would you think production code is to get over 1 GB while still advancing small amounts? Have we heard about this from any customers?

I am not aware of any customer issues yet. We have to be cognizant that it may be reported as a hang, not a perf issue.

I believe the sample code that I pasted in the description is a fairly common pattern for ABW: a smallish sizeHint passed to GetMemory() with many Advance() calls. The questionable part is the >1GB size -- I'd guess it is unlikely, and when encountered it is usually a flag to refactor the code to not require that much data.

As an aside, one benefit of this PR is that it allows a deterministic max size (~2GB) along with a deterministic OOM message that is not "randomized" by the amount of free space in the currently allocated buffer. The previous approaches of either always doubling or growing by sizeHint did not account for free space and thus might not allow the full ~2GB size -- in the worst case, the previous code would only allow 1GB before throwing.
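The worst case described above can be illustrated with a toy model of a strict always-double policy (a sketch; the function name and starting size are hypothetical, not from the PR):

```python
INT_MAX = 2**31 - 1

def max_size_with_doubling(start):
    """Largest size reachable if every grow must exactly double (else OOM)."""
    size = start
    while size * 2 <= INT_MAX:
        size *= 2
    return size

# A buffer that starts just past the 1 GiB mark can never double again, so a
# strict doubling policy strands it at ~1 GiB of a ~2 GiB addressable range.
print(max_size_with_doubling(2**30 + 1))  # 1073741825 (~1 GiB)
```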

@ericstj
Member

ericstj commented Oct 6, 2020

Changing exception types isn't considered a breaking change, correct?

Changing the exception for an existing codepath to a new exception which is not derived from an existing thrown exception (in similarly common codepath) is considered breaking. https://github.com/dotnet/runtime/blob/master/docs/coding-guidelines/breaking-change-rules.md#exceptions

@stephentoub
Member

stephentoub commented Oct 6, 2020

OutOfMemoryException was already possible here on the same code path: you go to allocate the array, and it could throw an OOM because it couldn't allocate the humongous array. Previously, in some cases it might throw an OverflowException, but not always. This doesn't seem to me like a change worth documenting as breaking.

@ericstj
Member

ericstj commented Oct 6, 2020

To be clear @stephentoub I'm not talking about this change being breaking, but this one: #32587
@steveharter said we explicitly changed it from throwing OverflowException to OOM. For things like that it's a permissive breaking change (in a major release), but we should document it because it is easy and cheap and may save someone from finding out in production.

@stephentoub
Member

stephentoub commented Oct 6, 2020

To be clear @stephentoub I'm not talking about this change being breaking, but this one: #32587
@steveharter said we explicitly changed it from throwing OverflowException to OOM.

Yup, understood. The code was essentially:

    int newSize = checked(ComputeNewSize()); // might throw OverflowException
    byte[] newArray = new byte[newSize];     // might throw OutOfMemoryException

and now it's more like:

    int newSize = ComputeNewSize();
    if (aboveOverflowed) throw new OutOfMemoryException();
    byte[] newArray = new byte[newSize]; // might throw OutOfMemoryException

My point was that OutOfMemoryException was always possible, regardless of whether an OverflowException was thrown or not. You could not overflow and still get an OOM because you were trying to allocate more than was available. So any consumer that cared about handling such extreme conditions would already need to be prepared to handle OOMs.

Hence why it doesn't seem to me worthy of calling out as a breaking change. If you disagree, that's of course fine :) My $.02. I just want to make sure that the list of breaking changes isn't so overwhelmed by minutiae that readers lose the forest for the trees.

@ericstj
Member

ericstj commented Oct 6, 2020

So any consumer that cared about handling such extreme conditions would already need to be prepared to handle OOMs.

Perfect. That makes it non breaking. Thanks for clarifying. This was exactly what I was referring to with

which is not derived from an existing thrown exception (in similarly common codepath)

Removed the tag.

@steveharter
Contributor Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@tannergooding
Member

@stephentoub, we document the OOM that can be thrown by failing to allocate as a "catastrophic failure" and distinct from the user-defined variant: https://docs.microsoft.com/en-us/dotnet/api/system.outofmemoryexception?view=netcore-3.1#remarks

It might be that the docs are wrong here, but I would still classify it as breaking even if you already could get OOM, because its general classification is different and the user might interpret it differently.

    Debug.Assert(_arrayBufferWriter != null);

    _memory = _arrayBufferWriter.GetMemory(checked(BytesPending + sizeHint));
    int needed = BytesPending + sizeHint;
Contributor Author

This was changed to keep the OOM exception semantics (instead of OverflowException). There is no effective perf difference, since the overhead of the new if statement roughly equals the extra logic of checked().

Note there are other cases in STJ that also use checked() and could be changed like this, but they don't use ABW so are not changed in this PR. I may create a new issue to track this if deemed useful.

Contributor

@ahsonkhan ahsonkhan Oct 13, 2020

This was changed to keep the OOM exception semantics (instead of OverflowException).

Add a test, if one doesn't already exist?

Contributor Author

Yes the existing test that was modified in this PR was failing due to this (not throwing OOM as expected).

@steveharter
Contributor Author

runtime test failures unrelated; due to #43178

@steveharter
Contributor Author

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@steveharter
Contributor Author

Outerloop failures appear unrelated in:

System.Net.Http.Functional.Tests.SocketsHttpHandler_HttpClientHandler_ClientCertificates_Test.Manual_CertificateSentMatchesCertificateReceived_Success
System.Net.Http.Functional.Tests.SyncHttpHandler_HttpClientHandler_Cancellation_Test.Expect100Continue_WaitsExpectedPeriod
System.IO.Tests.File_GetSetTimes.SetDateTimeMax [FAIL]
System.Net.Security.Tests.SslStreamNetworkStreamTest.SslStream_NetworkStream_Renegotiation_Succeeds(useSync: False) [FAIL]
System.Net.Security.Tests.SslStreamNetworkStreamTest.SslStream_NetworkStream_Renegotiation_Succeeds(Boolean useSync)
System.Net.Security.Tests.SslStreamAllowRenegotiationTests_Async.SslStream_AllowRenegotiation_True_Succeeds [FAIL]
System.Net.Security.Tests.SslStreamAllowRenegotiationTestsBase.SslStream_AllowRenegotiation_True_Succeeds()

@steveharter
Contributor Author

@layomia @tannergooding can you take another look and approve / request changes? Need owners to commit. Thanks

Contributor

@layomia layomia left a comment

Reviewed offline with @steveharter - LGTM.

@steveharter steveharter merged commit 035b729 into dotnet:master Oct 14, 2020
@steveharter steveharter deleted the HangOnOverflow branch October 14, 2020 19:55
@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020