Add scalable vector type to JIT and HFA type for Vector<T> #121114

snickolls-arm · 2025-10-27T13:35:47Z

This branch introduces a new type (TYP_SIMDSV) to the JIT to support scalable vectors: registers whose size is determined by the hardware at runtime but remains constant for the lifetime of a process. On ARM64, this means vectors are sized in powers of 2 from 128 bits up to 2048 bits depending on the hardware implementation, with an instruction available to query this size for compiler use. I've also adjusted the implementation of TYP_MASK to scale with the vector length on ARM64 in a similar manner.

This builds on and borrows much of Kunal's work in: #115948.

This PR focuses on enabling scalable type awareness as a foundation for future vector-length-agnostic code generation. I've refactored existing systems so that they retrieve the size of a type with access to the compiler's runtime state. This mainly involves changing areas that depend on genTypeSize to call a new instance method, Compiler::getSizeOfType. This allows the JIT to emit SVE register moves, loads and stores for Vector<T>, but doesn't change the implementation of the Vector<T> API surface: the JIT still emits NEON for arithmetic, logical and floating-point operations and so on. The codegen is functionally equivalent while the vector length is set to 128 bits.
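As a rough illustration of the refactoring described above, the sketch below contrasts a static size lookup with an instance method that can consult per-compilation state. Everything in it is a simplified stand-in written for this example: the enum values, `kStaticSize`, `genTypeSizeSketch` and `CompilerSketch` are hypothetical and are not the real JIT definitions. Only the overall shape (a free function replaced by a method on the compiler instance) is taken from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily reduced stand-in for the JIT's type enum.
enum var_types : uint8_t { TYP_INT, TYP_DOUBLE, TYP_SIMD16, TYP_SIMDSV, TYP_COUNT };

// Static table: only works for types whose size is fixed when the JIT is built.
// In this sketch, 0 means "size not statically known".
constexpr unsigned kStaticSize[TYP_COUNT] = {4, 8, 16, 0};

inline unsigned genTypeSizeSketch(var_types type)
{
    return kStaticSize[type];
}

// Stand-in for the compiler instance, which knows the vector length
// reported by the hardware for the current process.
class CompilerSketch
{
public:
    explicit CompilerSketch(unsigned vectorLengthBytes) : m_vectorLengthBytes(vectorLengthBytes)
    {
        // 128 to 2048 bits, and a power of two.
        assert(vectorLengthBytes >= 16 && vectorLengthBytes <= 256);
        assert((vectorLengthBytes & (vectorLengthBytes - 1)) == 0);
    }

    // Instance method: falls back to the static table for fixed-size types
    // and uses runtime state for the scalable type.
    unsigned getSizeOfType(var_types type) const
    {
        return (type == TYP_SIMDSV) ? m_vectorLengthBytes : genTypeSizeSketch(type);
    }

private:
    unsigned m_vectorLengthBytes;
};
```

With a 16-byte vector length, `getSizeOfType(TYP_SIMDSV)` and `getSizeOfType(TYP_SIMD16)` in this sketch both return 16, which mirrors the statement that codegen is functionally equivalent at 128 bits. The types still remain distinguishable, which is what later vector-length-agnostic work would rely on.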

With this type being passed around, we can begin implementing vector length agnostic code in subsequent work, as we can now test for a TYP_SIMDSV and distinguish it from a fixed size TYP_SIMD16.

Testing

Importing Vector<T> as the new type is gated behind DOTNET_JitUseScalableVectorT, meaning TYP_SIMDSV will not appear in compilation unless that variable is set. Likewise, the VM will not report use of the HFA type CORINFO_HFA_ELEM_VECTORT unless this variable is set. With the variable set, testing validates the implementation of the new type. Without the variable set, testing validates that TYP_SIMD16 behavior remains stable under these changes.

SuperPMI method contexts are currently out of sync with the updated JIT-EE interface, which causes mismatches in return values between the JIT and EE. I don't expect this to be resolved until DOTNET_JitUseScalableVectorT is removed and the new behavior becomes standard, so this feature will need to be tested with specially generated MCH files while it is being developed.

Future

We will be able to remove DOTNET_JitUseScalableVectorT once Vector<T> is working in AOT compilation. This will require all phases to be aware that TYP_SIMDSV is dynamically sized and to decide, based on this, whether a given pass can run.

For a transitional approach, we could allow specifying a fixed target vector length for AOT compilation while broader vector-length agnostic support is implemented. In some cases we might be able to take advantage of knowing the vector size at compilation time, so having both approaches available might be advantageous for JIT mode.

Code Example

static void SimdAdd(ReadOnlySpan<float> a, ReadOnlySpan<float> b, Span<float> c)
{
    int len = Math.Min(Math.Min(a.Length, b.Length), c.Length);
    int i = 0;

    if (Vector.IsHardwareAccelerated)
    {
        int width = Vector<float>.Count;
        for (; i <= len - width; i += width)
        {
            var va = new Vector<float>(a.Slice(i, width));
            var vb = new Vector<float>(b.Slice(i, width));
            (va + vb).CopyTo(c.Slice(i, width));
        }
    }

    // Scalar tail (or full path if no HW acceleration)
    for (; i < len; i++)
        c[i] = a[i] + b[i];
}

With DOTNET_JitUseScalableVectorT=1, the main vector loop body compiles to:

G_M23455_IG03:        ; offs=0x000020, size=0x0044, bbWeight=4, PerfScore 100.00, gcrefRegs=0000 {}, byrefRegs=0015 {x0 x2 x4}, BB03 [0002], BB20 [0018], BB32 [0033], BB44 [0048], byref, isz

IN000b: 000020      mov     w8, w7
IN000c: 000024      add     x9, x8, #4
IN000d: 000028      cmp     x9, w1, UXTW
IN000e: 00002C      bhi     G_M23455_IG10
IN000f: 000030      lsl     x8, x8, #2
IN0010: 000034      add     x10, x0, x8
IN0011: 000038      ldr     z24, [x10]                                 ;; was ldr q24, [x10]
IN0012: 00003C      cmp     x9, w3, UXTW
IN0013: 000040      bhi     G_M23455_IG10
IN0014: 000044      add     x10, x2, x8
IN0015: 000048      ldr     z25, [x10]                                 ;; was ldr q25, [x10]
IN0016: 00004C      fadd    v24.4s, v24.4s, v25.4s
IN0017: 000050      cmp     x9, w5, UXTW
IN0018: 000054      bhi     G_M23455_IG10
IN0019: 000058      add     x8, x4, x8
IN001a: 00005C      str     z24, [x8]                                  ;; was str q24, [x8]
IN001b: 000060      add     w7, w7, #4

G_M23455_IG04:        ; offs=0x000064, size=0x000C, bbWeight=8, PerfScore 16.00, gcrefRegs=0000 {}, byrefRegs=0015 {x0 x2 x4}, loop=IG03, BB04 [0003], byref, isz

IN001c: 000064      sub     w8, w6, #4
IN001d: 000068      cmp     w8, w7
IN001e: 00006C      bge     G_M23455_IG03

Contributing towards #120599

The current implementation derives a size from the identified SIMD type, and then uses that size to derive the node type. It should instead derive the node type directly from the identified SIMD type, because some SIMD types will not have a statically known size, and their size may coincide with that of other SIMD types.
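The problem with going through the size can be sketched as follows. This is a minimal illustration under stated assumptions, not the JIT's code: `SimdKind`, `NodeType` and the three functions are hypothetical names invented for this example, and the real mapping covers many more cases.

```cpp
#include <cassert>

// Hypothetical stand-ins for "identified SIMD type" and "node type".
enum class SimdKind { Vector64, Vector128, VectorT };
enum class NodeType { Simd8, Simd16, SimdScalable, Unknown };

// Size in bytes of each kind. For VectorT the size is only known at run time,
// so the caller has to pass in the current vector length.
unsigned sizeOfKind(SimdKind kind, unsigned vectorLengthBytes)
{
    switch (kind)
    {
        case SimdKind::Vector64:  return 8;
        case SimdKind::Vector128: return 16;
        case SimdKind::VectorT:   return vectorLengthBytes;
    }
    assert(false && "unhandled SimdKind");
    return 0;
}

// Size-first mapping: the kind has already been thrown away, so a 16-byte
// VectorT is indistinguishable from Vector128, and other lengths have no answer.
NodeType nodeTypeFromSize(unsigned sizeBytes)
{
    switch (sizeBytes)
    {
        case 8:  return NodeType::Simd8;
        case 16: return NodeType::Simd16;
        default: return NodeType::Unknown;
    }
}

// Direct mapping: no size is needed, so there is nothing to collide on.
NodeType nodeTypeFromKind(SimdKind kind)
{
    switch (kind)
    {
        case SimdKind::Vector64:  return NodeType::Simd8;
        case SimdKind::Vector128: return NodeType::Simd16;
        case SimdKind::VectorT:   return NodeType::SimdScalable;
    }
    assert(false && "unhandled SimdKind");
    return NodeType::Unknown;
}
```

In this sketch, `nodeTypeFromSize(sizeOfKind(SimdKind::VectorT, 16))` yields `NodeType::Simd16`, losing the fact that the value is scalable, while `nodeTypeFromKind(SimdKind::VectorT)` yields `NodeType::SimdScalable` regardless of the vector length.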

The function that determines the size of a local variable needs to have access to compiler state at runtime to handle variable types with sizes that depend on some runtime value, for example Vector<T> when backed by ARM64 scalable vectors.

The function that determines the size of an indirection needs to have access to compiler state at runtime to handle variable types with sizes that depend on some runtime value, for example Vector<T> when backed by ARM64 scalable vectors.

Creates a new type designed to support Vector<T> with a size evaluated at runtime, and adds a new HFA type to the VM to support passing Vector<T> in a scalable vector register on ARM64. Both types are experimental and locked behind the DOTNET_JitUseScalableVectorT configuration option. This first patch implements SVE codegen for Vector<T>, mainly for managing Vector<T> as a data structure that can be placed in a Z register. When DOTNET_JitUseScalableVectorT=1, the JIT moves the type around using SVE instructions operating on Z registers. It does not yet unlock longer vector lengths or implement operations from the Vector<T> API surface using SVE. This API still generates NEON code, which is functionally equivalent so long as Vector<T> is limited to 128 bits. When DOTNET_JitUseScalableVectorT=0, the code generated for Vector<T> should have no functional difference but may have some cosmetic differences, as general SIMD codegen has been refactored to support the new type.

snickolls-arm · 2025-10-27T13:42:50Z

@dotnet/arm64-contrib @a74nh @tannergooding

This is the outcome of the investigation I've been doing into supporting scalable vector types in the JIT. I've worked through a few problems and test failures, but I still consider this early experimentation; feedback or suggestions on direction are appreciated.

snickolls-arm added 4 commits October 24, 2025 12:56

github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 27, 2025

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Oct 27, 2025

jkotas added the arm-sve Work related to arm64 SVE/SVE2 support label Oct 27, 2025

snickolls-arm added 3 commits October 28, 2025 10:35

Fix typo in immediate comparison in Compiler::instGen_Set_Reg_To_Base_Plus_Imm

8ae80c3

Fix conditional compilation for config values

5558a08

Fix macro expansion for MSVC

79ba8ba
