[RISCV] Mark VFIRST and VCPOP as SignExtendingOpW #77022
@llvm/pr-subscribers-backend-risc-v
Author: Min-Yih Hsu (mshockwave)
Changes
Since the results of VFIRST and VCPOP are small enough ([-1, 65535] and [0, 65535], respectively) to fit into a signed 32-bit integer, any sext (or downcast followed by sext) of those results is redundant. Hence this patch marks both as SignExtendingOpW.
Full diff: https://github.com/llvm/llvm-project/pull/77022.diff
2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index 30deeaa064486f..fcb18b67623e7c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -6719,12 +6719,14 @@ defm PseudoVMSET : VPseudoNullaryPseudoM<"VMXNOR">;
// 15.2. Vector mask population count vcpop
//===----------------------------------------------------------------------===//
+let IsSignExtendingOpW = 1 in
defm PseudoVCPOP: VPseudoVPOP_M;
//===----------------------------------------------------------------------===//
// 15.3. vfirst find-first-set mask bit
//===----------------------------------------------------------------------===//
+let IsSignExtendingOpW = 1 in
defm PseudoVFIRST: VPseudoV1ST_M;
//===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/opt-w-instrs.mir b/llvm/test/CodeGen/RISCV/opt-w-instrs.mir
index 0ecf8fd6bef33a..afdf5f9c72ba75 100644
--- a/llvm/test/CodeGen/RISCV/opt-w-instrs.mir
+++ b/llvm/test/CodeGen/RISCV/opt-w-instrs.mir
@@ -1,5 +1,5 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
-# RUN: llc -mtriple=riscv64 -mattr='+d,+zfa' -verify-machineinstrs -run-pass=riscv-opt-w-instrs %s -o - | FileCheck %s --check-prefix=CHECK-ZFA
+# RUN: llc -mtriple=riscv64 -mattr='+d,+zfa,+v' -verify-machineinstrs -run-pass=riscv-opt-w-instrs %s -o - | FileCheck %s
---
name: fcvtmod_w_d
@@ -8,16 +8,16 @@ body: |
bb.0.entry:
liveins: $x10, $x11
- ; CHECK-ZFA-LABEL: name: fcvtmod_w_d
- ; CHECK-ZFA: liveins: $x10, $x11
- ; CHECK-ZFA-NEXT: {{ $}}
- ; CHECK-ZFA-NEXT: [[COPY:%[0-9]+]]:fpr64 = COPY $x10
- ; CHECK-ZFA-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x11
- ; CHECK-ZFA-NEXT: [[FCVTMOD_W_D:%[0-9]+]]:gpr = nofpexcept FCVTMOD_W_D [[COPY]], 1
- ; CHECK-ZFA-NEXT: [[ADD:%[0-9]+]]:gpr = ADD [[COPY1]], [[FCVTMOD_W_D]]
- ; CHECK-ZFA-NEXT: $x10 = COPY [[ADD]]
- ; CHECK-ZFA-NEXT: $x11 = COPY [[FCVTMOD_W_D]]
- ; CHECK-ZFA-NEXT: PseudoRET
+ ; CHECK-LABEL: name: fcvtmod_w_d
+ ; CHECK: liveins: $x10, $x11
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:fpr64 = COPY $x10
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x11
+ ; CHECK-NEXT: [[FCVTMOD_W_D:%[0-9]+]]:gpr = nofpexcept FCVTMOD_W_D [[COPY]], 1
+ ; CHECK-NEXT: [[ADD:%[0-9]+]]:gpr = ADD [[COPY1]], [[FCVTMOD_W_D]]
+ ; CHECK-NEXT: $x10 = COPY [[ADD]]
+ ; CHECK-NEXT: $x11 = COPY [[FCVTMOD_W_D]]
+ ; CHECK-NEXT: PseudoRET
%0:fpr64 = COPY $x10
%1:gpr = COPY $x11
@@ -28,3 +28,55 @@ body: |
$x11 = COPY %4
PseudoRET
...
+---
+ name: vfirst
+ tracksRegLiveness: true
+ body: |
+ bb.0.entry:
+ liveins: $x10, $v8
+
+ ; CHECK-LABEL: name: vfirst
+ ; CHECK: liveins: $x10, $v8
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:vr = COPY $v8
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gprnox0 = COPY $x10
+ ; CHECK-NEXT: [[PseudoVFIRST_M_B1_:%[0-9]+]]:gpr = PseudoVFIRST_M_B1 [[COPY]], [[COPY1]], 0 /* e8 */
+ ; CHECK-NEXT: [[ADD:%[0-9]+]]:gpr = ADD [[COPY1]], [[PseudoVFIRST_M_B1_]]
+ ; CHECK-NEXT: $x10 = COPY [[ADD]]
+ ; CHECK-NEXT: $x11 = COPY [[PseudoVFIRST_M_B1_]]
+ ; CHECK-NEXT: PseudoRET
+ %0:vr = COPY $v8
+ %1:gprnox0 = COPY $x10
+ %2:gpr = PseudoVFIRST_M_B1 %0:vr, %1:gprnox0, 0
+ %3:gpr = ADD %1, %2
+ %4:gpr = ADDIW %2, 0
+ $x10 = COPY %3
+ $x11 = COPY %4
+ PseudoRET
+...
+---
+ name: vcpop
+ tracksRegLiveness: true
+ body: |
+ bb.0.entry:
+ liveins: $x10, $v8
+
+ ; CHECK-LABEL: name: vcpop
+ ; CHECK: liveins: $x10, $v8
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:vr = COPY $v8
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gprnox0 = COPY $x10
+ ; CHECK-NEXT: [[PseudoVCPOP_M_B1_:%[0-9]+]]:gpr = PseudoVCPOP_M_B1 [[COPY]], [[COPY1]], 0 /* e8 */
+ ; CHECK-NEXT: [[ADD:%[0-9]+]]:gpr = ADD [[COPY1]], [[PseudoVCPOP_M_B1_]]
+ ; CHECK-NEXT: $x10 = COPY [[ADD]]
+ ; CHECK-NEXT: $x11 = COPY [[PseudoVCPOP_M_B1_]]
+ ; CHECK-NEXT: PseudoRET
+ %0:vr = COPY $v8
+ %1:gprnox0 = COPY $x10
+ %2:gpr = PseudoVCPOP_M_B1 %0:vr, %1:gprnox0, 0
+ %3:gpr = ADD %1, %2
+ %4:gpr = ADDIW %2, 0
+ $x10 = COPY %3
+ $x11 = COPY %4
+ PseudoRET
+...
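The range argument in the PR description can be checked with a small model. The sketch below is not part of the patch: `to_u64`, `sext_w`, and `sext_is_redundant` are hypothetical helper names, and the two ranges are taken from the description as stated. It models `ADDIW rd, rs, 0` (that is, `sext.w`) on a 64-bit register image and confirms that it leaves every value in those ranges unchanged.

```python
MASK64 = (1 << 64) - 1


def to_u64(value: int) -> int:
    """Two's-complement 64-bit register image of a Python int."""
    return value & MASK64


def sext_w(reg: int) -> int:
    """Model of `ADDIW rd, rs, 0`: sign-extend bits 31:0 to 64 bits."""
    low = reg & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low


def sext_is_redundant(lo: int, hi: int) -> bool:
    """True if sext.w is a no-op for every value in [lo, hi]."""
    return all(sext_w(to_u64(v)) == to_u64(v) for v in range(lo, hi + 1))


# Ranges as given in the PR description (assumed here, not re-derived).
VFIRST_RANGE = (-1, 65535)
VCPOP_RANGE = (0, 65535)

vfirst_ok = sext_is_redundant(*VFIRST_RANGE)
vcpop_ok = sext_is_redundant(*VCPOP_RANGE)
```

A value outside the signed 32-bit range, such as `1 << 31`, is changed by `sext_w`, which is why the flag is only safe for instructions whose results are known to be bounded.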
What is the ADD instruction for? I see it was in the fcvt test too but I can't recall why
IIRC we need an additional use of %2, otherwise ADDI will be removed regardless of this change. Let me double check.
> What is the ADD instruction for? I see it was in the fcvt test too but I can't recall why
If the user of the sext only uses the lower 32 bits, RISCVOptWInstrs will remove the sext regardless of the SignExtendingOpW flag (see the hasAllWUsers function). Previously, I tried to use ADD to create that artificial use. Although the syntax was incorrect (ADD was using %2, not %4), it somehow stopped RISCVOptWInstrs from removing the sext in the absence of SignExtendingOpW.
But even ADD is not the best choice: RISCVOptWInstrs transitively looks into the users of ADD to determine whether only the lower 32 bits are used, so we're only pushing the problem off. A better choice would be SRLI with a shift amount greater than 32, which creates an artificial use beyond the lower 32 bits. I'll update the patch accordingly.
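The interaction described in this comment can be illustrated with a toy model. This is a rough sketch only, not the actual hasAllWUsers implementation: the `Inst` class, the two opcode sets, and both function names are hypothetical, and the real pass recognizes far more opcodes. It shows why a sext whose users (transitively, through ADD) read only the lower 32 bits is removable even without the flag, while a use such as SRLI forces the decision to depend on the flag.

```python
from dataclasses import dataclass, field

# Hypothetical opcode classes for this sketch only.
READS_LOW32_ONLY = {"ADDW", "SW"}  # consume only bits 31:0 of the operand
PASSES_DEMAND_ON = {"ADD"}         # low 32 result bits depend only on low 32 operand bits


@dataclass
class Inst:
    opcode: str
    users: list = field(default_factory=list)


def all_users_read_low32(inst: Inst) -> bool:
    """True if every (transitive) user reads only the lower 32 bits."""
    for user in inst.users:
        if user.opcode in READS_LOW32_ONLY:
            continue
        if user.opcode in PASSES_DEMAND_ON and all_users_read_low32(user):
            continue
        return False  # e.g. SRLI, or any opcode this sketch does not know
    return True


def can_remove_sext(sext: Inst, producer_is_sign_extending: bool) -> bool:
    """The sext goes away if nobody reads the upper bits, or if the
    producer's result is already sign-extended."""
    return all_users_read_low32(sext) or producer_is_sign_extending


# sext -> ADD -> ADDW: removable with or without the flag.
add_chain = Inst("ADDIW", [Inst("ADD", [Inst("ADDW")])])
# sext -> SRLI: removable only when the producer is marked sign-extending.
srli_use = Inst("ADDIW", [Inst("SRLI")])
```

In this model, `can_remove_sext(add_chain, False)` is true, so a test built that way would pass even without the patch, whereas `srli_use` distinguishes the two cases.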
I'm confused. The ADD and ADDIW were separate users of the VFIRST. The ADDIW had no users other than a COPY. A COPY should be assumed to use all bits. So how was it allowed to be removed?
f8781e1 to b21c7a9
I've rebased to leverage COPY to create artificial (64-bit) uses, enabled by 4dd5d96.
LGTM