Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@ahsonkhan
Copy link

@ahsonkhan ahsonkhan commented Feb 5, 2019

Fixes https://github.com/dotnet/corefx/issues/34894

TODO:

  • Add XML comments
  • Add tests for T != byte (char, reference type like string, struct, class).
  • Figure out which assembly to expose this from. We decided on System.Buffers.dll, but I am not sure if we can add APIs to that assembly or not given ArrayPool, the only API implemented in that assembly, got moved down to corelib (Remove System.Buffers package #32096). Currently, exposing this from System.Memory.dll. cc @safern
    <ItemGroup>
    <ReferenceFromRuntime Include="System.Private.CoreLib" />
    </ItemGroup>

Changes based on dotnet/apireviews#88:

  • No optional argument on the ctor to avoid exposing internal implementation detail around default initial capacity (of 256 bytes).
  • Sealed the class
  • Renamed BytesCommitted to TotalBytesWritten for clarity
  • Exposing Capacity and BytesAvailable
  • Implementing the IBufferWriter<byte> interface explicitly to hide the APIs on ArrayBufferWriter itself.
  • The output span/memory properties are now ReadOnly
  • Renamed Clear() to Reset() and CopyToAndClear() to CopyToAndReset() based on feedback from @davidfowl

cc @davidfowl, @KrzysztofCwalina, @stephentoub, @steveharter, @JeremyKuhne, @pakrym, @layomia

@ahsonkhan ahsonkhan added this to the 3.0 milestone Feb 5, 2019
public ArrayBufferWriter(int initialCapacity) { }
public int BytesAvailable { get { throw null; } }
public int BytesWritten { get { throw null; } }
public int Capacity { get { throw null; } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesAvailable + BytesWritten == Capacity? Do we need BytesAvailable? What does one do with it? Why do two have the Bytes prefix and the third doesn't?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesAvailable + BytesWritten == Capacity?

Yes.

Do we need BytesAvailable? What does one do with it?

BytesAvailable is useful when the user wants to avoid resizing when they call GetSpan() since they can know up front if the data they want to write will fit or not in the leftover space.

@ahsonkhan ahsonkhan self-assigned this Feb 6, 2019
@ahsonkhan
Copy link
Author

ahsonkhan commented Feb 6, 2019

API changes based on feedback:

  • Make ArrayBufferWriter generic
  • Remove explicit implementation of the interface
  • OutputAsMemory/OutputAsSpan return ROM<T>/ROS<T> instead of byte
  • Remove CopyToAndReset / CopyToAndResetAsync / TotalBytesWritten
  • Rename BytesWritten to CurrentIndex
  • Rename BytesAvailable to AvailableSpace
  • Rename Reset() back to Clear()

public void Clear() { }
public void Dispose() { }
public System.Memory<T> GetMemory(int sizeHint = 0) { throw null; }
public System.Span<T> GetSpan(int sizeHint = 0) { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have another API review discussion about these names.

Thank you for making these implicitly implemented rather than explicitly. The crux of my feedback was that, without doing so or exposing similar functionality, the type was useless without being cast to the interface. The problem now, though, is that these names are a bit confusing, especially with OutputAsMemory and OutputAsSpan exposed on the type. We could still use explicit interface implementations if we wanted to and then expose this functionality as public methods with different names, in order to make them less confusing when using the public surface area of the type directly rather than via the interface. Or, we could leave them as is, and either accept the small confusion or potentially come up with different names for OutputAsMemory/Span.

public int Capacity { get { throw null; } }
public int CurrentIndex { get { throw null; } }
public System.ReadOnlyMemory<T> OutputAsMemory { get { throw null; } }
public System.ReadOnlySpan<T> OutputAsSpan { get { throw null; } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to expose both OutputAsMemory and OutputAsSpan? You can get a Span from a Memory, so the latter is equivalent to OutputAsMemory.Span. Given the typical use for this type, does the difference in perf between the two when you actually want a span make any measurable difference?

Same question goes for exposing GetSpan; as long as GetMemory is exposed, GetSpan could be implemented explicitly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my primary scenario for this was writing the data to a stream (asynchronously), we can defer adding OutputAsSpan until there is a customer request for it (Mem.Span is not noticeably slower (i.e. 15-25 ns) even for small writes like hello world since it's called infrequently).

Same question goes for exposing GetSpan; as long as GetMemory is exposed, GetSpan could be implemented explicitly.

However, I don't know if the same applies to GetSpan/GetMemory. If we expect ArrayBufferWriter to be used similar to how IBufferWriter is used today (i.e. we have Advance/GetMemory as regular interface implementations), we should be consistent and implement GetSpan that way as well. Writing some data to a span (synchronously) is faster than going through memory (especially if its a small amount). That was the primary reason to have GetSpan on the interface to begin with. I haven't measured myself but presumably the perf was significant enough to justify adding it, given how we currently use the API. It looks like we do this often with high granularity:
https://github.com/aspnet/AspNetCore/blob/68cc5e7846240bacc146854dfc6a175e30ff0f60/src/Servers/Kestrel/Core/src/Internal/BufferWriter.cs#L20
https://github.com/aspnet/AspNetCore/blob/b1f778bfb887f5c8282f5de42302c3f5736da790/src/Servers/Kestrel/Core/src/Internal/Http2/Http2FrameWriter.cs#L404
https://github.com/aspnet/AspNetCore/blob/3cd5054eb5d8851b25881cd05cccb931aaa022e6/src/SignalR/common/Shared/TextMessageFormatter.cs#L17

We only call GetMemory for async calls or interop where the work being done is expensive already:
https://github.com/aspnet/AspNetCore/blob/6d442c1e3d7875132c662a9e8c537e30cea585b8/src/Servers/Kestrel/Core/src/Adapter/Internal/AdaptedPipeline.cs#L119
https://github.com/aspnet/AspNetCore/blob/68cc5e7846240bacc146854dfc6a175e30ff0f60/src/Servers/Kestrel/Transport.Libuv/src/Internal/LibuvConnection.cs#L148
https://github.com/aspnet/AspNetCore/blob/ad11f890ef0646ecf3cefde63c9021865aa9490e/src/Servers/Kestrel/Transport.Sockets/src/Internal/SocketConnection.cs#L165
https://github.com/aspnet/AspNetCore/blob/3cd5054eb5d8851b25881cd05cccb931aaa022e6/src/SignalR/clients/csharp/Http.Connections.Client/src/Internal/WebSocketsTransport.cs#L222

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A benchmark like this would regress ~5%:
https://github.com/aspnet/AspNetCore/blob/dbf82dc8c4cdb410f036811143812db9f8bcc096/src/SignalR/perf/Microbenchmarks/HandshakeProtocolBenchmark.cs#L81
https://github.com/aspnet/AspNetCore/blob/dbf82dc8c4cdb410f036811143812db9f8bcc096/src/SignalR/common/SignalR.Common/src/Protocol/HandshakeProtocol.cs#L59-L70

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17763.253 (1809/October2018Update/Redstone5)
Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview-010174
  [Host] : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT
  Core   : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT

Job=Core  EnvironmentVariables=COMPlus_TieredCompilation=0  Runtime=Core  
Toolchain=.NET Core 3.0  
Method Mean Error StdDev Ratio Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
WriteArrayBufferWriterMemSpan 257.4 ns 2.561 ns 2.396 ns 1.05 0.0076 - - 32 B
WriteArrayBufferWriterSpan 245.8 ns 2.107 ns 1.645 ns 1.00 0.0076 - - 32 B

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢


public ArrayBufferWriter()
{
_rentedBuffer = ArrayPool<T>.Shared.Rent(MinimumBufferSize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rent on first use vs .ctor? e.g. could be _rentedBuffer = Array<T>.Empty();

Copy link
Author

@ahsonkhan ahsonkhan Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the benefit? With this change, we end up going through the unnecessary resize logic when _rentedBuffer == 0 on first GetSpan/GetMemory (because sizeHint > availableSpace)

Instead of just renting the min sized array (what we have today), you end up copying an empty buffer, clearing it, and returning it to the pool.

@davidfowl
Copy link
Member

I didn't realize that this was a single array at first. Is there any concern that this type basically shouldn't be used if buffering over 85K (LOH size?). What about a strategy to make a linked list once buffers get to that size instead of doubling? OR maybe that's a different type?

@benaadams
Copy link
Member

benaadams commented Feb 7, 2019

Is there any concern that this type basically shouldn't be used if buffering over 85K (LOH size?).

ArrayPool buffers though; which go to 2MB? Shouldn't hold on to it longer than needed though... (its backing array)

<ItemGroup Condition="'$(TargetGroup)' == 'uap' OR '$(TargetGroup)' == 'netcoreapp'">
<ItemGroup>
<ProjectReference Include="..\..\System.Runtime\ref\System.Runtime.csproj" />
<ProjectReference Include="..\..\System.Memory\ref\System.Memory.csproj" />
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @ericstj, can you please review the props/csproj changes related to System.Buffers? I removed the netstandard configuration since we only build this library in-box now, targeting .NET Core.

Copy link
Author

@ahsonkhan ahsonkhan Feb 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out I can't remove the netstandard configuration so conditionally including the new APIs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we need to keep building System.Memory for netstandard

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you don't need to. You can use a simple-name reference to get the version we restore from packages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you aren't shipping the netstandard config nor changing it removing it is correct. You should ensure that you harvest and bin place the old assembly (so that folks don't build against a new version that doesn't ship).

<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<IsPartialFacadeAssembly>true</IsPartialFacadeAssembly>
<ExcludeResourcesImport>true</ExcludeResourcesImport>
<Configurations>netcoreapp-Unix-Debug;netcoreapp-Unix-Release;netcoreapp-Windows_NT-Debug;netcoreapp-Windows_NT-Release;netcoreappaot-Windows_NT-Debug;netcoreappaot-Windows_NT-Release;uap-Windows_NT-Debug;uap-Windows_NT-Release;uapaot-Windows_NT-Debug;uapaot-Windows_NT-Release</Configurations>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to keep building the aot configurations?

}

// Returns the rented buffer back to the pool
public void Dispose()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgetting to Dispose the ArrayBufferWriter means the _rentedBuffer becomes a memory leak, right?
Should the implementation take care of that via a finalizer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a leak, it just means the memory won't be returned to ArrayPool<T>.Shared, and will instead be collected by the GC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no problem then. Thanks for clarifying.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I followed this same pattern in my Sequence<T> class, and wished I could go back and remove the IDisposable interface. I called a new public method Reset() and the instance is reusable after that, allowing for recycling of everything allocated.

@ahsonkhan ahsonkhan requested a review from ericstj February 8, 2019 04:11
<PropertyGroup>
<!-- Must match version supported by frameworks which support 4.0.* inbox.
Can be removed when API is added and this assembly is versioned to 4.1.* -->
<AssemblyVersion>4.0.2.0</AssemblyVersion>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should bump minor version of the netcoreapp assembly since it has more API than last version (as well as netstandard version)

@@ -1,5 +1,64 @@
<?xml version="1.0" encoding="utf-8"?>
<root>
<!--
Copy link
Member

@ericstj ericstj Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding this garbage back? We shouldn't churn: either leave it out or keep it in but don't toggle.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't realized we previously removed it (#16284). I just added a new String resource in VS (used in an exception) and it auto-updated to add the boiler plate. I removed the unused string that was there previously.

Other resx file have it too:
https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/Resources/Strings.resx
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/Resources/Strings.resx
https://github.com/dotnet/corefx/blob/master/src/System.IO.Pipelines/src/Resources/Strings.resx

@danmosemsft, should we be removing this comment? It looks like VS adds it back so not sure its worth it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion is we shouldn't be fighting VS on it and shouldn't have removed them in the first place, since any modifications via the GUI end up adding it back. Agreeing with @ericstj, I'm fine putting it back as part of this change as long as we leave it that way. If VS ever allows us to configure the behavior, then we could set that configuration and do a one-time sweep to remove them from everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a bug to remove this 17 years ago. Until they do, there is no point fighting it.

<ItemGroup Condition="'$(TargetGroup)' == 'uap' OR '$(TargetGroup)' == 'netcoreapp'">
<ItemGroup>
<ProjectReference Include="..\..\System.Runtime\ref\System.Runtime.csproj" />
<ProjectReference Include="..\..\System.Memory\ref\System.Memory.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you aren't shipping the netstandard config nor changing it removing it is correct. You should ensure that you harvest and bin place the old assembly (so that folks don't build against a new version that doesn't ship).

netcoreappaot-WebAssembly;
netcoreapp-Unix;
uap-Windows_NT;
uapaot-Windows_NT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? UAPAOT has a different core lib.

@karelz
Copy link
Member

karelz commented Mar 4, 2019

@ahsonkhan what is status & plan for this PR? I don't see update in last 3.5 weeks ...

@ahsonkhan
Copy link
Author

what is status & plan for this PR? I don't see update in last 3.5 weeks ...

I am going to finish up an investigation on an alternate approach and update this in the next 2-3 days based on my findings. Thanks for the ping :)

@safern
Copy link
Member

safern commented Mar 15, 2019

I am going to finish up an investigation on an alternate approach and update this in the next 2-3 days based on my findings. Thanks for the ping :)

@ahsonkhan did you find an alternate approach or you planning on update this PR?

@wli3
Copy link

wli3 commented Mar 15, 2019

any update?


Span<T> previousBuffer = oldBuffer.AsSpan(0, _index);
previousBuffer.CopyTo(_rentedBuffer);
previousBuffer.Clear();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Sequence<T> I check whether T is a reference type and only bother clearing the array when it is (to ensure the GC can collect those objects).
Not clearing value type arrays seems reasonable IMO. If there were "secrets" that shouldn't be leaked, they shouldn't use this type anyway since it doesn't "pin" the array and the GC could have moved objects around leaving old copies in memory.

Debug.Assert(_rentedBuffer != null);

if (sizeHint < 0)
throw new ArgumentException(nameof(sizeHint));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't ArgumentOutOfRangeException be a more precise and appropriate exception type to throw?


T[] oldBuffer = _rentedBuffer;

_rentedBuffer = ArrayPool<T>.Shared.Rent(newSize);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it useful both to accept an optional ArrayPool<T> instance from the user, and to allow the user to specify a minimum array length to rent. I would recommend this here as well.
For me (where I form a sequence of arrays) it prevents an IBufferWriter<T> consumer (e.g. a serialier) from allocating thousands of tiny arrays when it calls GetSpan(sizeof(some-small-struct)) a lot.
For your class, it would prevent a great many array rental and copy operations every time the one array grows by just a few bytes. Either that, or perhaps have a growing factor similar to List<T> so you always double the length or something like that.

@karelz
Copy link
Member

karelz commented Apr 1, 2019

@ahsonkhan if you don't have time to finish it now, let's close it, until you do have the time ;) Thanks!

@ahsonkhan
Copy link
Author

Sounds good. Closing for now, pending potential re-design.

@ahsonkhan ahsonkhan closed this Apr 1, 2019
@wli3
Copy link

wli3 commented Apr 1, 2019

Although what is the new ETA of this change?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.