Bounds checks are no longer elided when using nint for indexing. #44415

@tannergooding

Examine the following two methods that only differ based on using int vs nint for indexing into an array:

public static int SumWithInt32Index(int[] a)
{
    int result = 0;

    for (int i = 0; i < a.Length; i++)
    {
        result += a[i];
    }

    return result;
}

public static int SumWithNativeIntIndex(int[] a)
{
    int result = 0;
        
    for (nint i = 0; i < a.Length; i++)
    {
        result += a[i];
    }
        
    return result;
}

These two methods result in the following codegen:

; ConsoleApp64.Benchmarks.SumWithInt32Index(Int32[])
       xor       eax,eax
       xor       edx,edx
       mov       r8d,[rcx+8]
       test      r8d,r8d
       jle       short M01_L01
M01_L00:
       movsxd    r9,edx
       add       eax,[rcx+r9*4+10]
       inc       edx
       cmp       r8d,edx
       jg        short M01_L00
M01_L01:
       ret
; Total bytes of code 29

; ConsoleApp64.Benchmarks.SumWithNativeIntIndex(Int32[])
       sub       rsp,28
       xor       eax,eax
       xor       edx,edx
       mov       r8d,[rcx+8]
       movsxd    r8,r8d
       test      r8,r8
       jle       short M01_L01
M01_L00:
       cmp       rdx,r8
       jae       short M01_L02
       add       eax,[rcx+rdx*4+10]
       inc       rdx
       cmp       r8,rdx
       jg        short M01_L00
M01_L01:
       add       rsp,28
       ret
M01_L02:
       call      CORINFO_HELP_RNGCHKFAIL
       int       3
; Total bytes of code 48

As can be seen, the latter is no longer able to elide the bounds check because of the additional cast of a.Length to nint, which results in significantly worse codegen: 48 bytes of code instead of 29, with an extra compare and branch on every iteration. To work around this, a developer must use Unsafe.Add to avoid the bounds check inside the loop and MemoryMarshal.GetArrayDataReference to avoid the bounds check outside the loop. Even then, while the inner loop does improve, the overall codegen is still slightly worse than the int version:

; ConsoleApp64.Benchmarks.SumWithNativeIntIndex(Int32[])
       xor       eax,eax
       lea       rdx,[rcx+10]
       xor       r8d,r8d
       mov       ecx,[rcx+8]
       movsxd    rcx,ecx
       test      rcx,rcx
       jle       short M01_L01
M01_L00:
       add       eax,[rdx+r8*4]
       inc       r8
       cmp       rcx,r8
       jg        short M01_L00
M01_L01:
       ret
; Total bytes of code 33
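
For reference, the workaround described above might look roughly like the following. This is a minimal sketch, not necessarily the exact source that produced the listing: the class name NativeIndexWorkaround and the method name SumWithNativeIntIndexUnsafe are hypothetical, and it assumes .NET 5 or later, where MemoryMarshal.GetArrayDataReference is available.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical container type, used only for illustration.
public static class NativeIndexWorkaround
{
    public static int SumWithNativeIntIndexUnsafe(int[] a)
    {
        int result = 0;

        // Take a reference to the first element without a bounds check.
        // For an empty array the reference is never dereferenced below.
        ref int start = ref MemoryMarshal.GetArrayDataReference(a);

        // Widen the length once, outside the loop.
        nint length = a.Length;

        for (nint i = 0; i < length; i++)
        {
            // Unsafe.Add performs no bounds check, so the loop condition
            // is the only thing keeping the access in range.
            result += Unsafe.Add(ref start, i);
        }

        return result;
    }
}
```

Because neither API validates the index, correctness depends entirely on the loop condition, which is exactly the safety that plain a[i] indexing with an int index provides for free.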

It would be beneficial if using nint wherever it is accepted "just worked", with the same bounds-check elision that int indexing gets.

category:cq
theme:bounds-checks
skill-level:intermediate
cost:small
impact:small

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
