This repository was archived by the owner on Jan 23, 2023. It is now read-only.
Generate efficient code for rotation patterns. #1830
Merged
erozenfeld (Member) commented on Oct 22, 2015
This change adds code to recognize rotation idioms and generate efficient instructions for them.
Two new operators are added: GT_ROL and GT_ROR.
The patterns recognized:
(x << c1) | (x >>> c2) => x rol c1
(x >>> c1) | (x << c2) => x ror c1
where c1 and c2 are constant and c1 + c2 == bitsize(x)
(x << y) | (x >>> (N - y)) => x rol y
(x >>> y) | (x << (N - y)) => x ror y
where N == bitsize(x)
(x << (y & M1)) | (x >>> ((N - y) & M2)) => x rol y
(x >>> (y & M1)) | (x << ((N - y) & M2)) => x ror y
where N == bitsize(x)
M1 & (N - 1) == N - 1
M2 & (N - 1) == N - 1
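The three pattern families above can be written out in C# roughly as follows. This is a minimal sketch for 32-bit operands, not code from the change itself: the class and method names are hypothetical, and the choices c1 = 5, c2 = 27 and M1 = M2 = 31 are only examples. On `uint`, the C# `>>` operator is a logical shift, which corresponds to `>>>` in the notation above.

```csharp
using System;

// Hypothetical helper class for illustration; not part of the JIT or mscorlib.
public static class RotationIdioms
{
    private const int N = 32; // bitsize(uint)

    // Constant form: (x << c1) | (x >>> c2) with c1 + c2 == N (here 5 + 27).
    public static uint RolConst5(uint x) => (x << 5) | (x >> 27);

    // Variable form: (x << y) | (x >>> (N - y)).
    public static uint RolVar(uint x, int y) => (x << y) | (x >> (N - y));

    // Variable form: (x >>> y) | (x << (N - y)).
    public static uint RorVar(uint x, int y) => (x >> y) | (x << (N - y));

    // Masked form with M1 == M2 == 31: the masks are applied to the shift amounts.
    public static uint RolMasked(uint x, int y) => (x << (y & 31)) | (x >> ((N - y) & 31));

    // The mask condition from the description: M & (N - 1) == N - 1,
    // i.e. the mask keeps every bit that a shift by up to N - 1 can use.
    public static bool MaskKeepsShiftBits(int mask, int bitSize) =>
        (mask & (bitSize - 1)) == bitSize - 1;

    // Reference rotate-left, one bit at a time, used only to check the idioms.
    public static uint RolReference(uint x, int y)
    {
        uint r = x;
        for (int i = 0; i < (y & 31); i++)
        {
            r = (r << 1) | (r >> 31);
        }
        return r;
    }
}
```

For example, `RotationIdioms.RolConst5(0x80000001)`, `RotationIdioms.RolVar(0x80000001, 5)` and `RotationIdioms.RolMasked(0x80000001, 37)` all evaluate to `0x30`. A mask such as 31 or 63 satisfies the condition for N == 32, while 15 does not, because it would discard a bit of the shift amount.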
For a simple benchmark with 4 rotation patterns in a tight loop,
time goes from 7.324 to 2.600 (a 2.8x speedup).
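The benchmark itself is not included in this conversation. A loop of the kind described might look like the sketch below; the class name, the seed value, the particular shift amounts and the use of `Stopwatch` are all assumptions made for illustration, not the author's benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical micro-benchmark; not the benchmark used for the numbers above.
public static class RotateBench
{
    // Four rotation idioms in one loop body: two constant, two variable.
    public static uint Run(int iterations)
    {
        uint a = 0x12345678u;
        for (int i = 0; i < iterations; i++)
        {
            a = (a << 1) | (a >> 31);        // constant rol 1
            a = (a >> 3) | (a << 29);        // constant ror 3
            a = (a << i) | (a >> (32 - i));  // variable rol (C# uses the low 5 bits of i)
            a = (a >> i) | (a << (32 - i));  // variable ror (C# uses the low 5 bits of i)
            a += (uint)i;                    // make each iteration depend on the loop counter
        }
        return a;
    }

    // Times one call to Run and returns the elapsed milliseconds.
    public static double Measure(int iterations, out uint result)
    {
        Stopwatch sw = Stopwatch.StartNew();
        result = Run(iterations);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

Calling `RotateBench.Measure(100000000, out uint r)` once on a build with the optimization and once without would give a rough comparison; the result is returned through `r` so the loop cannot be discarded as dead code.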
Rotations found and optimized in mscorlib:
System.Security.Cryptography.SHA256Managed::RotateRight
System.Security.Cryptography.SHA384Managed::RotateRight
System.Security.Cryptography.SHA512Managed::RotateRight
System.Security.Cryptography.RIPEMD160Managed::MDTransform (320 instances!)
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol1
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol5
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Rol30
System.Diagnostics.Tracing.EventSource.Sha1ForNonSecretPurposes::Drain
(9 instances of Sha1ForNonSecretPurposes::Rol* inlined)
Closes #1619.
Member
Author
@sivarv PTAL

Reviewed offline - thanks for adding the test!

Member
LGTM
erozenfeld added a commit that referenced this pull request on Oct 22, 2015
Generate efficient code for rotation patterns.

This does not appear to be working for 32-bit. For this function:

[MethodImpl(MethodImplOptions.NoInlining)]
private static int Foo(int left)
{
    uint rol5 = ((uint)left << 5) | ((uint)left >> 27);
    return (int)rol5;
}

a rol is getting emitted for x64, but not for regular x86.

; x64
G_M30394_IG01:
G_M30394_IG02:
       8BC1         mov      eax, ecx
       C1C005       rol      eax, 5
G_M30394_IG03:
       C3           ret

; x86
G_M30394_IG01:
       55           push     ebp
       8BEC         mov      ebp, esp
G_M30394_IG02:
       8BC1         mov      eax, ecx
       C1E005       shl      eax, 5
       C1E91B       shr      ecx, 27
       0BC1         or       eax, ecx
G_M30394_IG03:
       5D           pop      ebp
       C3           ret

Member
Author
Yes, this optimization was added only to RyuJIT (which is the default for 64-bit), not to the current 32-bit JIT. 32-bit RyuJIT work is in progress, and it will support this optimization.

Ah, so that explains it. Thank you for clarifying 😄

In the case with the masks, does the

Member
Author
No, these patterns are not recognized with the current implementation.
picenka21 pushed a commit to picenka21/runtime that referenced this pull request on Feb 18, 2022
Generate efficient code for rotation patterns. Commit migrated from dotnet/coreclr@429bb1c