Commit 02416c2

stephentoub authored and michaelgsharp committed
Enable TensorPrimitives to perform in-place operations (dotnet#92820)
Some operations would produce incorrect results if the same span was passed as both an input and an output. When vectorization was employed but the span's length wasn't an exact multiple of the vector width, we'd do the standard trick of performing one last operation on the last vector's worth of data. However, that trick relies on the operation being idempotent: if an earlier iteration has already overwritten an input element with its result because the same memory is used for input and output, re-applying a non-idempotent operation to that element yields the wrong value. This fixes that by masking off the already processed elements. It adds tests to validate that in-place use works, and it updates the docs to carve out this valid form of overlap.
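The failure mode described above can be reproduced with a small simulation. The sketch below is not the .NET implementation: it uses Python lists in place of spans, a hypothetical `VECTOR_WIDTH` of 4 lanes, and squaring as a stand-in for any non-idempotent operation. The names `square_naive` and `square_masked` are made up for illustration, and the masking shown is only one plausible reading of "masking off the already processed elements".

```python
VECTOR_WIDTH = 4  # hypothetical lane count, chosen only for illustration


def square_naive(src, dst):
    """Square each element using the 'redo the last full vector' trick for the tail."""
    n = len(src)
    if n < VECTOR_WIDTH:
        # Too short for a single vector: plain scalar loop.
        for k in range(n):
            dst[k] = src[k] * src[k]
        return
    i = 0
    while i + VECTOR_WIDTH <= n:
        # Load one vector, compute, then store.
        chunk = [x * x for x in src[i:i + VECTOR_WIDTH]]
        dst[i:i + VECTOR_WIDTH] = chunk
        i += VECTOR_WIDTH
    if i < n:
        # Tail: reprocess the final VECTOR_WIDTH elements, which overlap
        # elements already handled by the main loop.
        start = n - VECTOR_WIDTH
        chunk = [x * x for x in src[start:n]]
        dst[start:n] = chunk  # wrong when src is dst: overlapped lanes get squared twice


def square_masked(src, dst):
    """Same as square_naive, but the tail only writes lanes not yet processed."""
    n = len(src)
    if n < VECTOR_WIDTH:
        for k in range(n):
            dst[k] = src[k] * src[k]
        return
    i = 0
    while i + VECTOR_WIDTH <= n:
        chunk = [x * x for x in src[i:i + VECTOR_WIDTH]]
        dst[i:i + VECTOR_WIDTH] = chunk
        i += VECTOR_WIDTH
    if i < n:
        start = n - VECTOR_WIDTH
        chunk = [x * x for x in src[start:n]]
        for lane in range(VECTOR_WIDTH):
            idx = start + lane
            if idx >= i:  # mask: skip lanes the main loop already wrote
                dst[idx] = chunk[lane]


data = [1, 2, 3, 4, 5, 6]
square_naive(data, data)   # in place: elements at indices 2 and 3 are squared twice
print(data)

data = [1, 2, 3, 4, 5, 6]
square_masked(data, data)  # in place: every element is squared exactly once
print(data)
```

With separate input and output buffers both versions agree, because the tail reads untouched input. The difference appears only when the same buffer is passed for both, which is the case this commit sets out to support.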
1 parent 4088f05 commit 02416c2

4 files changed: +740 -96 lines changed


0 commit comments
