Improve ArrayBufferWriter re-alloc perf when size is > int.MaxSize / 2 #42950
Conversation
    {
        newSize = currentLength + sizeHint;
        // Attempt to grow by the larger of the minimum size and half of the available size.
        growBy = Math.Max(sizeHint, (int.MaxValue - currentLength) / 2 + 1);
The +1 is to attempt to allocate an extra byte to have it align on a nice boundary
How does adding 1 align on a boundary? Can't currentLength be a wide variety of even or odd numbers?
    if ((uint)newSize > int.MaxValue)
    {
        newSize = currentLength + sizeHint;
        // Attempt to grow by the larger of the minimum size and half of the available size.
Instead of doing this, we could allocate the magic number that represents the largest array size, but allocating half of the available size seems less risky.
"the magic number that represents the largest array size"

It is what we do in a number of other places. I do not see a problem with doing it here as well.

We can consider adding an API that returns the max size to make it less magic.
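For reference, an API of exactly this shape later shipped as Array.MaxLength in .NET 6. A minimal illustration (the exact value is a runtime implementation detail, not a contract):

    using System;

    // Array.MaxLength (added in .NET 6) exposes the largest allowed array length.
    // On current runtimes it is 0x7FFFFFC7 (2,147,483,591), slightly below int.MaxValue.
    Console.WriteLine(Array.MaxLength);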
What would be that magic number? int.MaxValue - 56, or 2 billion, or something else?

Until we have such an API, let's add it as a named constant here, similar to DefaultInitialBufferSize = 256.

Either approach would be reasonable.

The downside of going all the way to ~2 billion right away is that for output data that ends up around 1.1 to 1.5 billion bytes, we over-allocate by quite a lot. Even if that is relatively fast (because we only resize once), it isn't very memory efficient. So the current approach, growing half-way between the current size and the max, has some benefit there.

On the other hand, the closer you get to 2 billion by growing this way, the smaller each increment becomes (the increments decay exponentially), making things slow again for those edge cases (possibly growing by as little as 2-100 bytes).

"with a max of 17 seconds if allocating all the way to the 2GB barrier (4K requests)"

Can we hit a balance where we get the best of both worlds, and keep the worst-case time/number of growths down to under 5 seconds too?
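A quick simulation (a sketch, not code from the PR) of how many re-allocations the grow-by-half-of-remaining strategy actually needs once the buffer is past int.MaxValue / 2, assuming 4K requests and treating int.MaxValue as the ceiling:

    using System;

    const int SizeHint = 4096;     // smallish per-call request, as in the benchmarks above
    const int Max = int.MaxValue;  // simplification: ignore the slightly smaller array limit

    int size = Max / 2;            // start where doubling is no longer possible
    int growths = 0;
    while (size < Max - SizeHint)  // stop once a 4K request fits without resizing
    {
        long available = (long)Max - size;
        long next = size + Math.Max(SizeHint, available / 2 + 1);
        size = (int)Math.Min(Max, next);
        growths++;
    }
    Console.WriteLine($"grow-by-half-of-remaining: {growths} re-allocs");
    // For comparison: growing by sizeHint alone needs about (Max / 2) / 4096,
    // i.e. roughly 262,000 re-allocs, while jumping straight to the max needs exactly 1.

Because the remaining headroom halves on each growth, the count stays logarithmic (a few dozen at most), so the ~17 second worst case cited above comes from a handful of large copies rather than thousands of small ones.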
"What would be that magic number? int.MaxValue - 56, or 2 billion, or something else?"

It varies a bit between byte and other element types: https://github.com/dotnet/runtime/search?q=MaxArrayLength

For now, I think using the max makes sense: since we have already allocated at least int32.MaxValue / 2, we would always allocate less than double the current buffer size. That is, the current algorithm doubles on each alloc, and the final alloc using the max doesn't quite double, so it seems fine from that viewpoint.
/azp run runtime-libraries-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).
src/libraries/System.Memory/tests/ArrayBufferWriter/ArrayBufferWriterTests.Byte.cs (outdated review thread, resolved)
/azp run runtime-libraries-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).
    sealed class ArrayBufferWriter<T> : IBufferWriter<T>
    {
        // Copy of Array.MaxArrayLength. For byte arrays the limit is slightly larger.
        private const int MaxArrayLength = 0X7FEFFFFF;
Can we just use Array.MaxArrayLength, so we don't have this constant in multiple places?
This file is compiled in NS2.0 assemblies, so it needs to have a local copy. I agree it would be nice to have an API for this - reactivating #31366
Will it matter on full framework where MaxArrayLength might be smaller (I believe not due to how the buffer grows, but I thought I'd double check)?
Right, it is not easy to find the maximum array length on .NET Framework; it depends on the gcAllowVeryLargeObjects configuration setting. It just means that this code may throw OOM on .NET Framework even when it could have been avoided. I think that is fine.
We should consider this for servicing since 5.0 has a regression from 3.1. In 3.1, the code would throw an exception more eagerly and not try to slowly grow the buffer. In 5.0, the code was changed to slowly grow the buffer, increasing by sizeHint. Also, note that in 5.0 the exception thrown changed from OverflowException to OutOfMemoryException.

cc @ericstj
I've marked #32587 as breaking. @steveharter can you file the breaking change doc?
Typically we don't consider it breaking when we avoid throwing and instead make progress. How likely do you think production code is to get over 1 GB while still advancing in small amounts? Have we heard about this from any customers?
Changing exception types isn't considered a breaking change, correct? I am not sure we need a breaking change doc for that. Additionally, having the performance degrade beyond 1 GB in certain edge cases isn't really a breaking change either.
I think this is more a bug than a breaking change. The exception is now more appropriate, and OOM exceptions are not normally caught either. A potential breaking change is that there are "hang-like" performance characteristics in certain scenarios.

I am not aware of any customer issues yet. We have to be cognizant that it may be reported as a hang, not a perf issue. I believe the sample code that I pasted in the description is a fairly common pattern for ABW: smallish sizeHint requests against a single, ever-growing buffer.

Aside: one benefit of this PR is that it allows a deterministic max size (~2GB) along with a deterministic OOM message that is not "randomized" by the amount of free space in the currently allocated buffer. The previous approaches of either always doubling or growing by sizeHint didn't have that property.
Changing the exception for an existing codepath to a new exception which is not derived from an existing thrown exception (in a similarly common codepath) is considered breaking: https://github.com/dotnet/runtime/blob/master/docs/coding-guidelines/breaking-change-rules.md#exceptions
OutOfMemoryException was already possible here on the same code path: you go to allocate the array, and it could throw an OOM because it couldn't allocate the humongous array. Previously, in some cases it might throw an OverflowException, but not always. This doesn't seem to me like a change worth documenting as breaking.
To be clear @stephentoub, I'm not talking about this change being breaking, but this one: #32587
Yup, understood. The code was essentially:

    int newSize = checked(ComputeNewSize()); // might throw OverflowException
    byte[] newArray = new byte[newSize];     // might throw OutOfMemoryException

and now it's more like:

    int newSize = ComputeNewSize();
    if (aboveOverflowed) throw new OutOfMemoryException();
    byte[] newArray = new byte[newSize];     // might throw OutOfMemoryException

My point was that OutOfMemoryException was always possible, regardless of whether an OverflowException was thrown or not. You could not overflow and still get an OOM because you were trying to allocate more than was available. So any consumer that cared about handling such extreme conditions would already need to be prepared to handle OOMs. Hence why it doesn't seem to me worthy of calling out as a breaking change. If you disagree, that's of course fine :) My $.02. I just want to make sure that a list of breaking changes isn't so overwhelmed by minutia that readers lose the forest for the trees.
Perfect. That makes it non-breaking. Thanks for clarifying. This was exactly what I was referring to with the breaking-change rule above.

Removed the tag.
/azp run runtime-libraries-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).
@stephentoub, we document the OOM that can be thrown by failing to allocate as a "catastrophic failure", distinct from the user-defined variant: https://docs.microsoft.com/en-us/dotnet/api/system.outofmemoryexception?view=netcore-3.1#remarks

It might be that the docs are wrong here, but I would still classify it as breaking even if you could already get an OOM, because its general classification is different and the user might interpret it differently.
    Debug.Assert(_arrayBufferWriter != null);

    - _memory = _arrayBufferWriter.GetMemory(checked(BytesPending + sizeHint));
    + int needed = BytesPending + sizeHint;
This was changed to keep the OOM exception semantics (instead of OverflowException). There is no effective perf difference, since the overhead of the new if statement roughly equals the extra logic of using checked().

Note there are other cases in STJ that also use checked() and could be changed like this code, but they don't use ABW, so they are not changed in this PR. I may create a new issue to track this if deemed useful.
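A minimal sketch of the pattern being described, using the names from the diff above (the surrounding shape is assumed, not the exact PR code):

    // Unchecked add: on overflow the sum wraps negative instead of throwing OverflowException.
    int needed = BytesPending + sizeHint;
    if (needed < 0)
    {
        // Surface the same exception type the allocation path itself would produce.
        throw new OutOfMemoryException();
    }
    _memory = _arrayBufferWriter.GetMemory(needed);

Since BytesPending and sizeHint are both non-negative, a negative sum is an unambiguous overflow signal, so a single sign check can replace the checked() arithmetic.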
"This was changed to keep the OOM exception semantics (instead of OverflowException)."

Add a test, if one doesn't already exist?
Yes, the existing test that was modified in this PR was failing due to this (it was not throwing OOM as expected).
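A sketch of the kind of test being discussed (hypothetical, not the PR's actual test; it needs a 64-bit process with a few GB of memory, so it would belong in an OuterLoop suite):

    using System;
    using System.Buffers;
    using Xunit;

    public class ArrayBufferWriterOverflowTests
    {
        [Fact]
        public void GetMemory_WhenTotalWouldOverflowInt32_ThrowsOutOfMemoryException()
        {
            // Fill a buffer that is already past the doubling threshold.
            var writer = new ArrayBufferWriter<byte>(int.MaxValue / 2 + 1);
            writer.Advance(writer.FreeCapacity);

            // The required total (written + sizeHint) exceeds the max array length,
            // which should surface as OutOfMemoryException, not OverflowException.
            Assert.Throws<OutOfMemoryException>(() => writer.GetMemory(int.MaxValue));
        }
    }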
Runtime test failures are unrelated; due to #43178.

/azp run runtime-libraries-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).
Outerloop failures appear unrelated.
@layomia @tannergooding, can you take another look and approve / request changes? Owners are needed to commit. Thanks.
layomia left a comment:
Reviewed offline with @steveharter - LGTM.
ArrayBufferWriter performance may become unusable when calling GetMemory(sizeHint) once the previously allocated size is >= int.MaxValue / 2. When a re-alloc is necessary, the buffer would normally double, but the current algorithm says that once a size of int.MaxValue / 2 is reached it can no longer double (to avoid an OutOfMemory exception), so instead it increases memory by sizeHint, which may be a smallish number (4,096 using byte[] in testing), causing many re-allocs over time. The PR changes this to allocate half of the available range instead of sizeHint.

This was discovered while adding benchmarks where it appeared a test was hanging; in fact the benchmark forgot to clear the ArrayBufferWriter between runs and was doing hundreds of re-allocs, causing many calls to Buffer.Memmove, which is slow: at ~1GB (when using byte) each copy takes about 0.5 seconds on local hardware.

For example, before this PR, a loop like the sketch below takes 50 seconds for only 100 allocations of 4K.

With the changes in this PR, this takes 0.75 seconds, with a max of 17 seconds if allocating all the way to the 2GB barrier (4K requests). The previous max was ~36 hours(!) since each resize takes ~0.5 seconds, so (1GB / 4K = 262,144) resizes * 0.5 seconds ≈ 36 hours.
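A minimal sketch of the pattern described above (a hypothetical repro, not the original benchmark; it allocates multiple ~1-2 GB arrays, so it needs a 64-bit process with several GB of memory):

    using System;
    using System.Buffers;
    using System.Diagnostics;

    // Start with a buffer already past the doubling threshold, completely full.
    var writer = new ArrayBufferWriter<byte>(int.MaxValue / 2 + 1);
    writer.Advance(writer.FreeCapacity);

    var sw = Stopwatch.StartNew();
    for (int i = 0; i < 100; i++)
    {
        // Before the PR: capacity grows by only 4096 each time, so every iteration
        // re-allocs and copies the entire ~1 GB buffer (~0.5 s each, per the numbers above).
        // After the PR: the first request grows by half of the remaining range, so the
        // remaining 99 iterations find enough free space and never copy.
        writer.GetMemory(4096);
        writer.Advance(4096);
    }
    Console.WriteLine($"100 x 4K advances: {sw.Elapsed.TotalSeconds:F2} s, capacity {writer.Capacity:N0}");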