
Commit 663c58d

vchuravy and giordano authored
Change SIMD Loop from Fast to only reassoc/contract (#49405)
Addresses #49387

Co-authored-by: Mosè Giordano <[email protected]>
1 parent 4e0da0d commit 663c58d

6 files changed, +16 -11 lines changed


NEWS.md

Lines changed: 4 additions & 0 deletions
@@ -21,6 +21,10 @@ Language changes
   that significantly improves load and inference times for heavily overloaded methods that
   dispatch on Types (such as traits and constructors).
 * The "h bar" `ℏ` (`\hslash` U+210F) character is now treated as equivalent to `ħ` (`\hbar` U+0127).
+* The `@simd` macro now has a more limited and clearer semantics, it only enables reordering and contraction
+  of floating-point operations, instead of turning on all "fastmath" optimizations.
+  If you observe performance regressions due to this change, you can recover previous behavior with `@fastmath @simd`,
+  if you are OK with all the optimizations enabled by the `@fastmath` macro. ([#49405])
 * When a method with keyword arguments is displayed in the stack trace view, the textual
   representation of the keyword arguments' types is simplified using the new
   `@Kwargs{key1::Type1, ...}` macro syntax ([#49959]).

base/simdloop.jl

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ The object iterated over in a `@simd for` loop should be a one-dimensional range
 By using `@simd`, you are asserting several properties of the loop:

 * It is safe to execute iterations in arbitrary or overlapping order, with special consideration for reduction variables.
-* Floating-point operations on reduction variables can be reordered, possibly causing different results than without `@simd`.
+* Floating-point operations on reduction variables can be reordered or contracted, possibly causing different results than without `@simd`.

 In many cases, Julia is able to automatically vectorize inner for loops without the use of `@simd`.
 Using `@simd` gives the compiler a little extra leeway to make it possible in more situations. In
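
The "reordered" part of the docstring can be illustrated outside Julia. The following is a minimal C++ sketch, not Julia's implementation: the names `sum_in_order`, `sum_two_lanes`, `sample_input` and `demo_reassociation` are hypothetical, and the two-accumulator layout is just one ordering a vectorizer might pick once reassociation is allowed. It assumes ordinary IEEE double arithmetic with no fast-math compiler flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict left-to-right summation: the order required when
// reassociation is not permitted.
double sum_in_order(const std::vector<double>& x) {
    double acc = 0.0;
    for (double v : x) acc += v;
    return acc;
}

// Two independent accumulators combined at the end (hypothetical
// 2-lane layout, chosen only for illustration).
double sum_two_lanes(const std::vector<double>& x) {
    double lane0 = 0.0, lane1 = 0.0;
    std::size_t i = 0;
    for (; i + 1 < x.size(); i += 2) {
        lane0 += x[i];
        lane1 += x[i + 1];
    }
    if (i < x.size()) lane0 += x[i];  // leftover element, if any
    return lane0 + lane1;
}

// 2^53: adding 1.0 to this value is lost to rounding in double precision.
std::vector<double> sample_input() {
    const double big = 9007199254740992.0;
    return {big, 1.0, -big, 1.0};
}

void demo_reassociation() {
    // In order: (big + 1) rounds back to big, so the first 1.0 vanishes.
    assert(sum_in_order(sample_input()) == 1.0);
    // Two lanes: big cancels exactly in lane 0, both 1.0s survive in lane 1.
    assert(sum_two_lanes(sample_input()) == 2.0);
}
```

Both answers are valid roundings of some evaluation order; the point is only that a reduction marked `reassoc` no longer promises which one you get.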

src/llvm-muladd.cpp

Lines changed: 4 additions & 4 deletions
@@ -40,10 +40,10 @@ STATISTIC(TotalContracted, "Total number of multiplies marked for FMA");
  * Combine
  * ```
  * %v0 = fmul ... %a, %b
- * %v = fadd fast ... %v0, %c
+ * %v = fadd contract ... %v0, %c
  * ```
  * to
- * `%v = call fast @llvm.fmuladd.<...>(... %a, ... %b, ... %c)`
+ * `%v = call contract @llvm.fmuladd.<...>(... %a, ... %b, ... %c)`
  * when `%v0` has no other use
  */

@@ -87,13 +87,13 @@ static bool combineMulAdd(Function &F) JL_NOTSAFEPOINT
             it++;
             switch (I.getOpcode()) {
             case Instruction::FAdd: {
-                if (!I.isFast())
+                if (!I.hasAllowContract())
                     continue;
                 modified |= checkCombine(I.getOperand(0), ORE) || checkCombine(I.getOperand(1), ORE);
                 break;
             }
             case Instruction::FSub: {
-                if (!I.isFast())
+                if (!I.hasAllowContract())
                     continue;
                 modified |= checkCombine(I.getOperand(0), ORE) || checkCombine(I.getOperand(1), ORE);
                 break;
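
Contraction, which this pass enables by rewriting `fmul` + `fadd` into `llvm.fmuladd`, can change results because a fused multiply-add rounds once instead of twice. Here is a small C++ sketch of that effect, separate from the pass itself: `mul_then_add`, `fused_mul_add` and `demo_contraction` are hypothetical names, and `std::fma` stands in for the case where the backend actually emits a fused instruction (`llvm.fmuladd` only permits fusion, it does not guarantee it).

```cpp
#include <cassert>
#include <cmath>

// a*b is rounded to double, then c is added (two roundings).
// `volatile` keeps the compiler from fusing the two operations on its own.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a*b + c computed with a single final rounding.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

void demo_contraction() {
    const double eps = std::ldexp(1.0, -27);  // 2^-27
    const double a = 1.0 + eps;
    const double b = 1.0 - eps;
    // Exact product is 1 - 2^-54, which rounds to 1.0 in double precision.
    assert(mul_then_add(a, b, -1.0) == 0.0);
    // The fused form keeps the exact product until the final rounding.
    assert(fused_mul_add(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

This is the sense in which the updated `@simd` docstring says results may differ "than without `@simd`": neither answer is wrong, but they are not bitwise identical.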

src/llvm-simdloop.cpp

Lines changed: 2 additions & 1 deletion
@@ -149,7 +149,8 @@ static void enableUnsafeAlgebraIfReduction(PHINode *Phi, Loop *L, OptimizationRe
                 return OptimizationRemark(DEBUG_TYPE, "MarkedUnsafeAlgebra", *K)
                        << "marked unsafe algebra on " << ore::NV("Instruction", *K);
             });
-            (*K)->setFast(true);
+            (*K)->setHasAllowReassoc(true);
+            (*K)->setHasAllowContract(true);
             ++length;
         }
         ReductionChainLength += length;
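
The switch from `setFast(true)` to the two narrower setters also explains why `llvm-muladd.cpp` had to stop testing `isFast()`: in LLVM, `fast` implies every fast-math flag, so an instruction carrying only `reassoc` and `contract` is no longer "fast". The sketch below is a toy model of that relationship, not the LLVM API: `ToyInst`, the bit values, and the `mark_old`/`mark_new` helpers are all hypothetical, and only the method names mirror the calls visible in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for LLVM's seven fast-math flags (bit values are arbitrary).
enum ToyFlag : std::uint8_t {
    kReassoc  = 1 << 0,
    kNoNaNs   = 1 << 1,
    kNoInfs   = 1 << 2,
    kNoSignedZeros = 1 << 3,
    kReciprocal    = 1 << 4,
    kContract = 1 << 5,
    kApproxFunc    = 1 << 6,
    kAll = (1 << 7) - 1
};

struct ToyInst {
    std::uint8_t flags = 0;
    void setFast()             { flags = kAll; }
    void setHasAllowReassoc()  { flags |= kReassoc; }
    void setHasAllowContract() { flags |= kContract; }
    bool isFast() const           { return flags == kAll; }  // all flags set
    bool hasAllowContract() const { return (flags & kContract) != 0; }
    bool hasNoNaNs() const        { return (flags & kNoNaNs) != 0; }
};

// Marking before this commit: everything enabled.
ToyInst mark_old() { ToyInst i; i.setFast(); return i; }

// Marking after this commit: only reassociation and contraction.
ToyInst mark_new() {
    ToyInst i;
    i.setHasAllowReassoc();
    i.setHasAllowContract();
    return i;
}

void demo_flags() {
    assert(mark_old().isFast() && mark_old().hasNoNaNs());
    assert(!mark_new().isFast());          // an isFast() gate would now skip it
    assert(mark_new().hasAllowContract()); // the new gate still fires
    assert(!mark_new().hasNoNaNs());       // NaN assumptions are no longer made
}
```

Under this model, a pass that kept checking `isFast()` would silently stop forming `fmuladd` for `@simd` reductions, which is presumably why both files changed together.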

test/llvmpasses/loopinfo.jl

Lines changed: 3 additions & 3 deletions
@@ -29,10 +29,10 @@ function simdf(X)
         acc += x
 # CHECK: call void @julia.loopinfo_marker(), {{.*}}, !julia.loopinfo [[LOOPINFO:![0-9]+]]
 # LOWER-NOT: llvm.mem.parallel_loop_access
-# LOWER: fadd fast double
+# LOWER: fadd reassoc contract double
 # LOWER-NOT: call void @julia.loopinfo_marker()
 # LOWER: br {{.*}}, !llvm.loop [[LOOPID:![0-9]+]]
-# FINAL: fadd fast <{{(vscale x )?}}{{[0-9]+}} x double>
+# FINAL: fadd reassoc contract <{{(vscale x )?}}{{[0-9]+}} x double>
     end
     acc
 end
@@ -46,7 +46,7 @@ function simdf2(X)
 # CHECK: call void @julia.loopinfo_marker(), {{.*}}, !julia.loopinfo [[LOOPINFO2:![0-9]+]]
 # LOWER: llvm.mem.parallel_loop_access
 # LOWER-NOT: call void @julia.loopinfo_marker()
-# LOWER: fadd fast double
+# LOWER: fadd reassoc contract double
 # LOWER: br {{.*}}, !llvm.loop [[LOOPID2:![0-9]+]]
     end
     acc

test/llvmpasses/simdloop.ll

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ loop:
 ; CHECK: llvm.mem.parallel_loop_access
   %aval = load double, double *%aptr
   %nextv = fsub double %v, %aval
-; CHECK: fsub fast double %v, %aval
+; CHECK: fsub reassoc contract double %v, %aval
   %nexti = add i64 %i, 1
   call void @julia.loopinfo_marker(), !julia.loopinfo !3
   %done = icmp sgt i64 %nexti, 500
@@ -59,7 +59,7 @@ loop:
   %aptr = getelementptr double, double *%a, i64 %i
   %aval = load double, double *%aptr
   %nextv = fsub double %v, %aval
-; CHECK: fsub fast double %v, %aval
+; CHECK: fsub reassoc contract double %v, %aval
   %nexti = add i64 %i, 1
   call void @julia.loopinfo_marker(), !julia.loopinfo !2
   %done = icmp sgt i64 %nexti, 500
