Closed
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
Description
(applies to Vector256 as well)
Consider Vector128.ShiftRightLogical on a Vector128<byte>. x86 does not have a logical right shift instruction that operates on byte elements:

Vector128<byte> v0 = Vector128.LoadUnsafe(ref source);
Vector128<byte> v1 = Vector128.ShiftRightLogical(v0, 4);

This currently emits a scalar fallback:

TestClass.Foo(Byte ByRef)
L0000: push rsi
L0001: sub rsp, 0x40
L0005: vzeroupper
L0008: vmovdqu xmm0, [rcx]
L000c: vmovapd [rsp+0x20], xmm0
L0012: xor esi, esi
L0014: lea rcx, [rsp+0x20]
L0019: movsxd rdx, esi
L001c: movzx ecx, byte ptr [rcx+rdx]
L0020: mov edx, 4
L0025: mov rax, 0x7ffa0845bc60
L002f: call qword ptr [rax]
L0031: lea rdx, [rsp+0x30]
L0036: movsxd rcx, esi
L0039: mov [rdx+rcx], al
L003c: inc esi
L003e: cmp esi, 0x10
L0041: jl short L0014
L0043: vmovapd xmm0, [rsp+0x30]
L0049: vpmovmskb eax, xmm0
L004d: add rsp, 0x40
L0051: pop rsi
L0052: ret

It could instead emit a 32-bit shift followed by an AND that clears the bits shifted in from the neighboring byte:

Vector128<byte> v0 = Vector128.LoadUnsafe(ref source);
Vector128<byte> v1 = Vector128.ShiftRightLogical(v0.AsInt32(), 4).AsByte() & Vector128.Create((byte)0xF);

TestClass.Bar(Byte ByRef)
L0000: vzeroupper
L0003: vmovdqu xmm0, [rcx]
L0007: vpsrld xmm0, xmm0, 4
L000c: vpand xmm0, xmm0, [0x7ffa087600d0]
L0014: vpmovmskb eax, xmm0
L0018: ret

We have a few places in the runtime that are aware of this issue and employ workarounds, e.g.:

- runtime/src/libraries/System.Private.CoreLib/src/System/IndexOfAnyValues/IndexOfAnyAsciiSearcher.cs, line 875 in c1abf87:
  : Sse2.ShiftRightLogical(source.AsInt32(), 4).AsByte() & Vector128.Create((byte)0xF);
- runtime/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Decoder.cs, line 594 in dc6ad37:
  Vector128<byte> hiNibbles = Vector128.ShiftRightLogical(str.AsInt32(), 4).AsByte() & mask2F;
- https://github.com/dotnet/runtime/blob/8482f562a8b5d96bb0a0fb201bfabea7e5e6b115/src/libraries/System.Private.CoreLib/src/System/IndexOfAnyValues/ProbabilisticMap.cs#L168-L170
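
The shift-and-mask rewrite above is not specific to a shift count of 4. Here is a minimal sketch of how it generalizes, assuming shift counts from 0 to 7 and a mask of 0xFF >> count. `ShiftRightLogicalBytes` is a hypothetical helper name used only for illustration; it is not a runtime API and not necessarily how the JIT would implement the fix. The loop compares the result against a plain per-byte shift.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper (not a runtime API): byte-wise logical right shift built
// from a 32-bit shift plus a mask. Assumes 0 <= count <= 7.
static Vector128<byte> ShiftRightLogicalBytes(Vector128<byte> value, int count)
{
    // Shifting the 32-bit lanes moves the low `count` bits of each byte into
    // the top bits of the byte below it in the same lane.
    Vector128<byte> shifted = Vector128.ShiftRightLogical(value.AsInt32(), count).AsByte();

    // Keep only the low (8 - count) bits of every byte; for count == 4 this is 0x0F.
    byte mask = (byte)(0xFF >> count);
    return shifted & Vector128.Create(mask);
}

// Compare against a scalar per-byte shift for every supported shift count.
byte[] input = new byte[16];
new Random(1234).NextBytes(input);
Vector128<byte> source = Vector128.Create(input);

bool allMatch = true;
for (int count = 0; count < 8; count++)
{
    Vector128<byte> actual = ShiftRightLogicalBytes(source, count);
    for (int i = 0; i < Vector128<byte>.Count; i++)
    {
        allMatch &= actual.GetElement(i) == (byte)(input[i] >> count);
    }
}
Console.WriteLine(allMatch);
```

The same idea should carry over to Vector256 by swapping the vector type, since the mask depends only on the shift count and not on the vector width.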