-
Couldn't load subscription status.
- Fork 15k
Description
Alive for the specific example, with opt+llc: https://alive2.llvm.org/ce/z/7oofsh
+label: llvm:optimizations
I tried the following:
define i64 @src(i64 noundef %0) {
start:
%1 = mul nuw nsw i64 %0, 52
%2 = lshr exact i64 %1, 2
ret i64 %2
}Since 52 = 4 * 13 I expected to see that fold down to just a single mul:
define i64 @tgt(i64 noundef %0) {
start:
%1 = mul nuw nsw i64 %0, 13
ret i64 %1
}But it doesn't -- the mul and lshr are left in the optimized IR, and aren't folded by the target either:
src: # @src
imul rax, rdi, 52
shr rax, 2
retWhereas the single-multiplication version ends up not even needing an imul:
tgt: # @tgt
lea rax, [rdi + 2*rdi]
lea rax, [rdi + 4*rax]
retSo in general, %1 = mul nuw %0, CONST1; %2 = lshr exact %1, CONST2 should simplify to a single mul nuw whenever CONST1 is a multiple of 1 << CONST2.
In fact, it looks like this already happens in opt if written with div: https://alive2.llvm.org/ce/z/L-VouQ
%1 = mul nuw i64 %0, 52
%2 = udiv exact i64 %1, 4
=>
%1 = mul nuw nsw i64 %0, 13So that code path just needs to take into account places where the udiv -> lshr transformation had already happened.