Skip to content

[DON'T MERGE] Test pull request #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

shahmishal
Copy link
Member

No description provided.

@shahmishal
Copy link
Member Author

@swift-ci test

2 similar comments
@shahmishal
Copy link
Member Author

@swift-ci test

@shahmishal
Copy link
Member Author

@swift-ci test

@hyp
Copy link

hyp commented Oct 27, 2019

@swift-ci please test

(checking if lit tests are fixed now)

@hyp
Copy link

hyp commented Oct 27, 2019

@swift-ci please test

adrian-prantl pushed a commit to adrian-prantl/llvm-project that referenced this pull request Oct 29, 2019
This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30):

PASS:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325
    frame swiftlang#1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295
    frame swiftlang#2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame swiftlang#3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame swiftlang#4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame swiftlang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame swiftlang#6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243
    frame swiftlang#7: 0x000000000040106e a.out`_start + 46

vs.

FAIL - unrecognized abort() function:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325
    frame swiftlang#1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295
    frame swiftlang#2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame swiftlang#3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame swiftlang#4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame swiftlang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame swiftlang#6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243
    frame swiftlang#7: 0x000000000040106e a.out`.annobin_init.c.hot + 46

The extra ELF symbols are there due to Annobin (I did not investigate why this problem happened specifically since F-30 and not since F-28).
It is due to:

Symbol table '.dynsym' contains 2361 entries:
Valu e          Size Type   Bind   Vis     Name
0000000000022769   5 FUNC   LOCAL  DEFAULT _nl_load_domain.cold
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_loadmsgcat.c_end.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_textdomain.c_end.unlikely
000000000002276e 548 FUNC   GLOBAL DEFAULT abort
000000000002276e 548 FUNC   GLOBAL DEFAULT abort@@GLIBC_2.2.5
000000000002276e 548 FUNC   LOCAL  DEFAULT __GI_abort
0000000000022992   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c_end.unlikely

Differential Revision: https://reviews.llvm.org/D63540

llvm-svn: 364773
(cherry picked from commit 3f594ed)
@hyp
Copy link

hyp commented Nov 1, 2019

@swift-ci please test

swift-ci pushed a commit that referenced this pull request Nov 4, 2019
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
@shahmishal
Copy link
Member Author

@swift-ci test Linux

4 similar comments
@shahmishal
Copy link
Member Author

@swift-ci test Linux

@shahmishal
Copy link
Member Author

@swift-ci test Linux

@shahmishal
Copy link
Member Author

@swift-ci test Linux

@shahmishal
Copy link
Member Author

@swift-ci test Linux

@shahmishal
Copy link
Member Author

shahmishal commented Nov 6, 2019

@hyp Do you know if this is known failure on Linux?

23:17:41 [3281/4299] /usr/bin/c++   -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/lib/Tooling/Refactor -I/home/buildnode/jenkins/workspace/apple-llvm-project-pr-linux/llvm-project/clang/lib/Tooling/Refactor -I/home/buildnode/jenkins/workspace/apple-llvm-project-pr-linux/llvm-project/clang/include -Itools/clang/include -I/usr/include/libxml2 -Iinclude -I/home/buildnode/jenkins/workspace/apple-llvm-project-pr-linux/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3    -UNDEBUG  -fno-exceptions -fno-rtti -std=c++14 -MMD -MT tools/clang/lib/Tooling/Refactor/CMakeFiles/obj.clangToolingRefactor.dir/Extract.cpp.o -MF tools/clang/lib/Tooling/Refactor/CMakeFiles/obj.clangToolingRefactor.dir/Extract.cpp.o.d -o tools/clang/lib/Tooling/Refactor/CMakeFiles/obj.clangToolingRefactor.dir/Extract.cpp.o -c /home/buildnode/jenkins/workspace/apple-llvm-project-pr-linux/llvm-project/clang/lib/Tooling/Refactor/Extract.cpp
23:17:41 /home/buildnode/jenkins/workspace/apple-llvm-project-pr-linux/llvm-project/clang/lib/Tooling/Refactor/Extract.cpp:871:2: warning: extra ‘;’ [-Wpedantic]
23:17:41  };
23:17:41   ^
23:17:41 ninja: build stopped: subcommand failed.
23:17:41 Build step 'Execute shell' marked build as failure

swift-ci pushed a commit that referenced this pull request Nov 7, 2019
…_call is understood"

This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.

> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
>   * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
>     frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
>     frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
swift-ci pushed a commit that referenced this pull request Nov 13, 2019
During register coalescing, we update the live-intervals on-the-fly.
To do that we are in this strange mode where the live-intervals can
be slightly out-of-sync (more precisely they are forward looking)
compared to what the IR actually represents.
This happens because the register coalescer only updates the IR when
it is done with updating the live-intervals and it has to do it this
way because updating the IR on-the-fly would actually clobber some
information on how the live-ranges that are being updated look like.

This is problematic for updates that rely on the IR to accurately
represents the state of the live-ranges. Right now, we have only
one of those: stripValuesNotDefiningMask.
To reconcile this need of out-of-sync IR, this patch introduces a
new argument to LiveInterval::refineSubRanges that allows the code
doing the live range updates to reason about how the code should
look like after the coalescer will have rewritten the registers.
Essentially this captures how a subregister index with be offseted
to match its position in a new register class.

E.g., let say we want to merge:
    V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>

We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
Put differently we align V2's sub3 with V1's sub1:
    V2: sub0 sub1 sub2 sub3
    V1: <offset>  sub0 sub1

This offset will look like a composed subregidx in the the class:
     V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
 =>  V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>

Now if we didn't rewrite the uses and def of V1, all the checks for V1
need to account for this offset to match what the live intervals intend
to capture.

Prior to this patch, we would fail to recognize the uses and def of V1
and would end up with machine verifier errors: No live segment at def.
This could lead to miscompile as we would drop some live-ranges and
thus, miss some interferences.

For this problem to trigger, we need to reach stripValuesNotDefiningMask
while having a mismatch between the IR and the live-ranges (i.e.,
we have to apply a subreg offset to the IR.)

This requires the following three conditions:
1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
2. An update with Tuple registers with a possibility to coalesce the
   subreg index: e.g., v1.dsub_1 == v2.dsub_3
3. Subreg liveness enabled.

looking at the IR to decide what is alive and what is not, i.e., calling
stripValuesNotDefiningMask.
coalescer maintains for the live-ranges information.

None of the targets that currently use subreg liveness (i.e., the targets
that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and
and #2, so this patch also artificial enables subreg liveness for ARM,
so that a nice test case can be attached.
@dcci
Copy link
Member

dcci commented Nov 18, 2019

Cam we close this one?

@shahmishal shahmishal closed this Nov 18, 2019
@shahmishal shahmishal deleted the shahmishal/test-pr branch November 18, 2019 21:48
swift-ci pushed a commit that referenced this pull request Nov 19, 2019
…ood (reland with fixes)

Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).

Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.

Pre-patch debug session with a cross-TU tail call:

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

Post-patch (note that the tail-calling frame, "helper", is visible):

```
  * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
    frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
    frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```

This was reverted in 5b9a072 because it attached declaration
subprograms to inlinable builtin calls, which interacted badly with the
MergeICmps pass. The fix is to not attach declarations to builtins.

rdar://46577651

Differential Revision: https://reviews.llvm.org/D69743
swift-ci pushed a commit that referenced this pull request Nov 21, 2019
…iant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151"

Summary:
Revert "[DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151"

 This reverts commit 5f026b6.

We're (tensorflow.org/xla team) seeing some misscompiles with the new change, only at -O3, with fast math disabled.

I'm still trying to come up with a useful/small/external example, but for now, the following IR:

```
; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

@0 = private unnamed_addr constant [4 x i8] c"\DB\0F\C9@"
@1 = private unnamed_addr constant [4 x i8] c"\00\00\00?"

; Function Attrs: uwtable
define void @jit_wrapped_fun.31(i8* %retval, i8* noalias %run_options, i8** noalias %params, i8** noalias %buffer_table, i64* noalias %prof_counters) #0 {
entry:
  %fusion.invar_address.dim.2 = alloca i64
  %fusion.invar_address.dim.1 = alloca i64
  %fusion.invar_address.dim.0 = alloca i64
  %fusion.1.invar_address.dim.2 = alloca i64
  %fusion.1.invar_address.dim.1 = alloca i64
  %fusion.1.invar_address.dim.0 = alloca i64
  %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1
  %1 = load i8*, i8** %0, !invariant.load !0, !dereferenceable !1, !align !2
  %parameter.3 = bitcast i8* %1 to [2 x [1 x [4 x float]]]*
  %2 = getelementptr inbounds i8*, i8** %buffer_table, i64 5
  %3 = load i8*, i8** %2, !invariant.load !0, !dereferenceable !1, !align !2
  %fusion.1 = bitcast i8* %3 to [2 x [1 x [4 x float]]]*
  store i64 0, i64* %fusion.1.invar_address.dim.0
  br label %fusion.1.loop_header.dim.0

fusion.1.loop_header.dim.0:                       ; preds = %fusion.1.loop_exit.dim.1, %entry
  %fusion.1.indvar.dim.0 = load i64, i64* %fusion.1.invar_address.dim.0
  %4 = icmp uge i64 %fusion.1.indvar.dim.0, 2
  br i1 %4, label %fusion.1.loop_exit.dim.0, label %fusion.1.loop_body.dim.0

fusion.1.loop_body.dim.0:                         ; preds = %fusion.1.loop_header.dim.0
  store i64 0, i64* %fusion.1.invar_address.dim.1
  br label %fusion.1.loop_header.dim.1

fusion.1.loop_header.dim.1:                       ; preds = %fusion.1.loop_exit.dim.2, %fusion.1.loop_body.dim.0
  %fusion.1.indvar.dim.1 = load i64, i64* %fusion.1.invar_address.dim.1
  %5 = icmp uge i64 %fusion.1.indvar.dim.1, 1
  br i1 %5, label %fusion.1.loop_exit.dim.1, label %fusion.1.loop_body.dim.1

fusion.1.loop_body.dim.1:                         ; preds = %fusion.1.loop_header.dim.1
  store i64 0, i64* %fusion.1.invar_address.dim.2
  br label %fusion.1.loop_header.dim.2

fusion.1.loop_header.dim.2:                       ; preds = %fusion.1.loop_body.dim.2, %fusion.1.loop_body.dim.1
  %fusion.1.indvar.dim.2 = load i64, i64* %fusion.1.invar_address.dim.2
  %6 = icmp uge i64 %fusion.1.indvar.dim.2, 4
  br i1 %6, label %fusion.1.loop_exit.dim.2, label %fusion.1.loop_body.dim.2

fusion.1.loop_body.dim.2:                         ; preds = %fusion.1.loop_header.dim.2
  %7 = load float, float* bitcast ([4 x i8]* @0 to float*)
  %8 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2
  %9 = load float, float* %8, !invariant.load !0, !noalias !3
  %10 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2
  %11 = load float, float* %10, !invariant.load !0, !noalias !3
  %12 = fmul float %9, %11
  %13 = fmul float %7, %12
  %14 = call float @llvm.log.f32(float %13)
  %15 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2
  store float %14, float* %15, !alias.scope !7, !noalias !8
  %invar.inc2 = add nuw nsw i64 %fusion.1.indvar.dim.2, 1
  store i64 %invar.inc2, i64* %fusion.1.invar_address.dim.2
  br label %fusion.1.loop_header.dim.2

fusion.1.loop_exit.dim.2:                         ; preds = %fusion.1.loop_header.dim.2
  %invar.inc1 = add nuw nsw i64 %fusion.1.indvar.dim.1, 1
  store i64 %invar.inc1, i64* %fusion.1.invar_address.dim.1
  br label %fusion.1.loop_header.dim.1

fusion.1.loop_exit.dim.1:                         ; preds = %fusion.1.loop_header.dim.1
  %invar.inc = add nuw nsw i64 %fusion.1.indvar.dim.0, 1
  store i64 %invar.inc, i64* %fusion.1.invar_address.dim.0
  br label %fusion.1.loop_header.dim.0

fusion.1.loop_exit.dim.0:                         ; preds = %fusion.1.loop_header.dim.0
  %16 = getelementptr inbounds i8*, i8** %buffer_table, i64 4
  %17 = load i8*, i8** %16, !invariant.load !0, !dereferenceable !9, !align !2
  %parameter.1 = bitcast i8* %17 to float*
  %18 = getelementptr inbounds i8*, i8** %buffer_table, i64 2
  %19 = load i8*, i8** %18, !invariant.load !0, !dereferenceable !10, !align !2
  %parameter.2 = bitcast i8* %19 to [3 x [1 x float]]*
  %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 0
  %21 = load i8*, i8** %20, !invariant.load !0, !dereferenceable !11, !align !2
  %fusion = bitcast i8* %21 to [2 x [3 x [4 x float]]]*
  store i64 0, i64* %fusion.invar_address.dim.0
  br label %fusion.loop_header.dim.0

fusion.loop_header.dim.0:                         ; preds = %fusion.loop_exit.dim.1, %fusion.1.loop_exit.dim.0
  %fusion.indvar.dim.0 = load i64, i64* %fusion.invar_address.dim.0
  %22 = icmp uge i64 %fusion.indvar.dim.0, 2
  br i1 %22, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0

fusion.loop_body.dim.0:                           ; preds = %fusion.loop_header.dim.0
  store i64 0, i64* %fusion.invar_address.dim.1
  br label %fusion.loop_header.dim.1

fusion.loop_header.dim.1:                         ; preds = %fusion.loop_exit.dim.2, %fusion.loop_body.dim.0
  %fusion.indvar.dim.1 = load i64, i64* %fusion.invar_address.dim.1
  %23 = icmp uge i64 %fusion.indvar.dim.1, 3
  br i1 %23, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1

fusion.loop_body.dim.1:                           ; preds = %fusion.loop_header.dim.1
  store i64 0, i64* %fusion.invar_address.dim.2
  br label %fusion.loop_header.dim.2

fusion.loop_header.dim.2:                         ; preds = %fusion.loop_body.dim.2, %fusion.loop_body.dim.1
  %fusion.indvar.dim.2 = load i64, i64* %fusion.invar_address.dim.2
  %24 = icmp uge i64 %fusion.indvar.dim.2, 4
  br i1 %24, label %fusion.loop_exit.dim.2, label %fusion.loop_body.dim.2

fusion.loop_body.dim.2:                           ; preds = %fusion.loop_header.dim.2
  %25 = mul nuw nsw i64 %fusion.indvar.dim.2, 1
  %26 = add nuw nsw i64 0, %25
  %27 = udiv i64 %26, 4
  %28 = mul nuw nsw i64 %fusion.indvar.dim.0, 1
  %29 = add nuw nsw i64 0, %28
  %30 = udiv i64 %29, 2
  %31 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %29, i64 0, i64 %26
  %32 = load float, float* %31, !alias.scope !7, !noalias !8
  %33 = mul nuw nsw i64 %fusion.indvar.dim.1, 1
  %34 = add nuw nsw i64 0, %33
  %35 = udiv i64 %34, 3
  %36 = load float, float* %parameter.1, !invariant.load !0, !noalias !3
  %37 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %parameter.2, i64 0, i64 %34, i64 0
  %38 = load float, float* %37, !invariant.load !0, !noalias !3
  %39 = fsub float %36, %38
  %40 = fmul float %39, %39
  %41 = mul nuw nsw i64 %fusion.indvar.dim.2, 1
  %42 = add nuw nsw i64 0, %41
  %43 = udiv i64 %42, 4
  %44 = mul nuw nsw i64 %fusion.indvar.dim.0, 1
  %45 = add nuw nsw i64 0, %44
  %46 = udiv i64 %45, 2
  %47 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42
  %48 = load float, float* %47, !invariant.load !0, !noalias !3
  %49 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42
  %50 = load float, float* %49, !invariant.load !0, !noalias !3
  %51 = fmul float %48, %50
  %52 = fdiv float %40, %51
  %53 = fadd float %32, %52
  %54 = fneg float %53
  %55 = load float, float* bitcast ([4 x i8]* @1 to float*)
  %56 = fmul float %54, %55
  %57 = getelementptr inbounds [2 x [3 x [4 x float]]], [2 x [3 x [4 x float]]]* %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 %fusion.indvar.dim.1, i64 %fusion.indvar.dim.2
  store float %56, float* %57, !alias.scope !8, !noalias !12
  %invar.inc5 = add nuw nsw i64 %fusion.indvar.dim.2, 1
  store i64 %invar.inc5, i64* %fusion.invar_address.dim.2
  br label %fusion.loop_header.dim.2

fusion.loop_exit.dim.2:                           ; preds = %fusion.loop_header.dim.2
  %invar.inc4 = add nuw nsw i64 %fusion.indvar.dim.1, 1
  store i64 %invar.inc4, i64* %fusion.invar_address.dim.1
  br label %fusion.loop_header.dim.1

fusion.loop_exit.dim.1:                           ; preds = %fusion.loop_header.dim.1
  %invar.inc3 = add nuw nsw i64 %fusion.indvar.dim.0, 1
  store i64 %invar.inc3, i64* %fusion.invar_address.dim.0
  br label %fusion.loop_header.dim.0

fusion.loop_exit.dim.0:                           ; preds = %fusion.loop_header.dim.0
  %58 = getelementptr inbounds i8*, i8** %buffer_table, i64 3
  %59 = load i8*, i8** %58, !invariant.load !0, !dereferenceable !2, !align !2
  %tuple.30 = bitcast i8* %59 to [1 x i8*]*
  %60 = bitcast [2 x [3 x [4 x float]]]* %fusion to i8*
  %61 = getelementptr inbounds [1 x i8*], [1 x i8*]* %tuple.30, i64 0, i64 0
  store i8* %60, i8** %61, !alias.scope !14, !noalias !8
  ret void
}

; Function Attrs: nounwind readnone speculatable willreturn
declare float @llvm.log.f32(float) #1

attributes #0 = { uwtable "no-frame-pointer-elim"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }

!0 = !{}
!1 = !{i64 32}
!2 = !{i64 8}
!3 = !{!4, !6}
!4 = !{!"buffer: {index:0, offset:0, size:96}", !5}
!5 = !{!"XLA global AA domain"}
!6 = !{!"buffer: {index:5, offset:0, size:32}", !5}
!7 = !{!6}
!8 = !{!4}
!9 = !{i64 4}
!10 = !{i64 12}
!11 = !{i64 96}
!12 = !{!13, !6}
!13 = !{!"buffer: {index:3, offset:0, size:8}", !5}
!14 = !{!13}
```

gets (correctly) optimized to the one below without the change:

```
; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

; Function Attrs: nofree nounwind uwtable
define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8** noalias nocapture readnone %params, i8** noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 {
entry:
  %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1
  %1 = bitcast i8** %0 to [2 x [1 x [4 x float]]]**
  %2 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  %3 = getelementptr inbounds i8*, i8** %buffer_table, i64 5
  %4 = bitcast i8** %3 to [2 x [1 x [4 x float]]]**
  %5 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>*
  %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3
  %8 = fmul <4 x float> %7, %7
  %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000>
  %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9)
  %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>*
  store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8
  %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0
  %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0
  %14 = bitcast float* %12 to <4 x float>*
  %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3
  %16 = fmul <4 x float> %15, %15
  %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000>
  %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17)
  %19 = bitcast float* %13 to <4 x float>*
  store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8
  %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 4
  %21 = bitcast i8** %20 to float**
  %22 = load float*, float** %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2
  %23 = getelementptr inbounds i8*, i8** %buffer_table, i64 2
  %24 = bitcast i8** %23 to [3 x [1 x float]]**
  %25 = load [3 x [1 x float]]*, [3 x [1 x float]]** %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2
  %26 = load i8*, i8** %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2
  %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3
  %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0
  %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3
  %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>*
  %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3
  %30 = insertelement <2 x float> undef, float %27, i32 0
  %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer
  %32 = fsub <2 x float> %31, %29
  %33 = fmul <2 x float> %32, %32
  %shuffle30 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1>
  %34 = fsub float %27, %.pre29
  %35 = fmul float %34, %34
  %36 = insertelement <4 x float> undef, float %35, i32 0
  %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer
  %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %38 = fmul <4 x float> %7, %7
  %shuffle31 = shufflevector <4 x float> %38, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %39 = fdiv <8 x float> %shuffle30, %shuffle31
  %40 = fadd <8 x float> %shuffle, %39
  %41 = fmul <8 x float> %40, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %42 = bitcast i8* %26 to <8 x float>*
  store <8 x float> %41, <8 x float>* %42, align 8, !alias.scope !8, !noalias !12
  %43 = getelementptr inbounds i8, i8* %26, i64 32
  %44 = fdiv <4 x float> %37, %38
  %45 = fadd <4 x float> %10, %44
  %46 = fmul <4 x float> %45, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %47 = bitcast i8* %43 to <4 x float>*
  store <4 x float> %46, <4 x float>* %47, align 8, !alias.scope !8, !noalias !12
  %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0
  %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0
  %48 = bitcast float* %.phi.trans.insert to <4 x float>*
  %49 = load <4 x float>, <4 x float>* %48, align 8, !alias.scope !7, !noalias !8
  %50 = bitcast float* %.phi.trans.insert12 to <4 x float>*
  %51 = load <4 x float>, <4 x float>* %50, align 8, !invariant.load !0, !noalias !3
  %shuffle.1 = shufflevector <4 x float> %49, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %52 = getelementptr inbounds i8, i8* %26, i64 48
  %53 = fmul <4 x float> %51, %51
  %shuffle31.1 = shufflevector <4 x float> %53, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %54 = fdiv <8 x float> %shuffle30, %shuffle31.1
  %55 = fadd <8 x float> %shuffle.1, %54
  %56 = fmul <8 x float> %55, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %57 = bitcast i8* %52 to <8 x float>*
  store <8 x float> %56, <8 x float>* %57, align 8, !alias.scope !8, !noalias !12
  %58 = getelementptr inbounds i8, i8* %26, i64 80
  %59 = fdiv <4 x float> %37, %53
  %60 = fadd <4 x float> %49, %59
  %61 = fmul <4 x float> %60, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %62 = bitcast i8* %58 to <4 x float>*
  store <4 x float> %61, <4 x float>* %62, align 8, !alias.scope !8, !noalias !12
  %63 = getelementptr inbounds i8*, i8** %buffer_table, i64 3
  %64 = bitcast i8** %63 to [1 x i8*]**
  %65 = load [1 x i8*]*, [1 x i8*]** %64, align 8, !invariant.load !0, !dereferenceable !2, !align !2
  %66 = getelementptr inbounds [1 x i8*], [1 x i8*]* %65, i64 0, i64 0
  store i8* %26, i8** %66, align 8, !alias.scope !14, !noalias !8
  ret void
}

; Function Attrs: nounwind readnone speculatable willreturn
declare <4 x float> @llvm.log.v4f32(<4 x float>) #1

attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }

!0 = !{}
!1 = !{i64 32}
!2 = !{i64 8}
!3 = !{!4, !6}
!4 = !{!"buffer: {index:0, offset:0, size:96}", !5}
!5 = !{!"XLA global AA domain"}
!6 = !{!"buffer: {index:5, offset:0, size:32}", !5}
!7 = !{!6}
!8 = !{!4}
!9 = !{i64 4}
!10 = !{i64 12}
!11 = !{i64 96}
!12 = !{!13, !6}
!13 = !{!"buffer: {index:3, offset:0, size:8}", !5}
!14 = !{!13}

```

and (incorrectly) optimized to the one below with the change:

```
; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

; Function Attrs: nofree nounwind uwtable
define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8** noalias nocapture readnone %params, i8** noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 {
entry:
  %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1
  %1 = bitcast i8** %0 to [2 x [1 x [4 x float]]]**
  %2 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  %3 = getelementptr inbounds i8*, i8** %buffer_table, i64 5
  %4 = bitcast i8** %3 to [2 x [1 x [4 x float]]]**
  %5 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>*
  %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3
  %8 = fmul <4 x float> %7, %7
  %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000>
  %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9)
  %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>*
  store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8
  %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0
  %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0
  %14 = bitcast float* %12 to <4 x float>*
  %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3
  %16 = fmul <4 x float> %15, %15
  %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000>
  %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17)
  %19 = bitcast float* %13 to <4 x float>*
  store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8
  %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 4
  %21 = bitcast i8** %20 to float**
  %22 = load float*, float** %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2
  %23 = getelementptr inbounds i8*, i8** %buffer_table, i64 2
  %24 = bitcast i8** %23 to [3 x [1 x float]]**
  %25 = load [3 x [1 x float]]*, [3 x [1 x float]]** %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2
  %26 = load i8*, i8** %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2
  %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3
  %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0
  %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3
  %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>*
  %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3
  %30 = insertelement <2 x float> undef, float %27, i32 0
  %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer
  %32 = fsub <2 x float> %31, %29
  %33 = fmul <2 x float> %32, %32
  %shuffle32 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1>
  %34 = fsub float %27, %.pre29
  %35 = fmul float %34, %34
  %36 = insertelement <4 x float> undef, float %35, i32 0
  %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer
  %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %38 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 0, i64 0, i64 3
  %39 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 0, i64 0, i64 3
  %40 = fmul <4 x float> %7, %7
  %41 = shufflevector <4 x float> %40, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
  %42 = fdiv <8 x float> %shuffle32, %41
  %43 = fadd <8 x float> %shuffle, %42
  %44 = fmul <8 x float> %43, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %45 = bitcast i8* %26 to <8 x float>*
  store <8 x float> %44, <8 x float>* %45, align 8, !alias.scope !8, !noalias !12
  %46 = extractelement <4 x float> %10, i32 0
  %47 = getelementptr inbounds i8, i8* %26, i64 32
  %48 = extractelement <4 x float> %10, i32 1
  %49 = extractelement <4 x float> %10, i32 2
  %50 = load float, float* %38, align 4, !alias.scope !7, !noalias !8
  %51 = load float, float* %39, align 4, !invariant.load !0, !noalias !3
  %52 = fmul float %51, %51
  %53 = insertelement <4 x float> undef, float %52, i32 3
  %54 = fdiv <4 x float> %37, %53
  %55 = insertelement <4 x float> undef, float %46, i32 0
  %56 = insertelement <4 x float> %55, float %48, i32 1
  %57 = insertelement <4 x float> %56, float %49, i32 2
  %58 = insertelement <4 x float> %57, float %50, i32 3
  %59 = fadd <4 x float> %58, %54
  %60 = fmul <4 x float> %59, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %61 = bitcast i8* %47 to <4 x float>*
  store <4 x float> %60, <4 x float>* %61, align 8, !alias.scope !8, !noalias !12
  %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0
  %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0
  %62 = bitcast float* %.phi.trans.insert to <4 x float>*
  %63 = load <4 x float>, <4 x float>* %62, align 8, !alias.scope !7, !noalias !8
  %64 = bitcast float* %.phi.trans.insert12 to <4 x float>*
  %65 = load <4 x float>, <4 x float>* %64, align 8, !invariant.load !0, !noalias !3
  %shuffle.1 = shufflevector <4 x float> %63, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %66 = getelementptr inbounds i8, i8* %26, i64 48
  %67 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 3
  %68 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 3
  %69 = fmul <4 x float> %65, %65
  %70 = shufflevector <4 x float> %69, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
  %71 = fdiv <8 x float> %shuffle32, %70
  %72 = fadd <8 x float> %shuffle.1, %71
  %73 = fmul <8 x float> %72, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %74 = bitcast i8* %66 to <8 x float>*
  store <8 x float> %73, <8 x float>* %74, align 8, !alias.scope !8, !noalias !12
  %75 = extractelement <4 x float> %69, i32 0
  %76 = extractelement <4 x float> %63, i32 0
  %77 = getelementptr inbounds i8, i8* %26, i64 80
  %78 = extractelement <4 x float> %69, i32 1
  %79 = extractelement <4 x float> %63, i32 1
  %80 = extractelement <4 x float> %69, i32 2
  %81 = extractelement <4 x float> %63, i32 2
  %82 = load float, float* %67, align 4, !alias.scope !7, !noalias !8
  %83 = load float, float* %68, align 4, !invariant.load !0, !noalias !3
  %84 = fmul float %83, %83
  %85 = insertelement <4 x float> undef, float %75, i32 0
  %86 = insertelement <4 x float> %85, float %78, i32 1
  %87 = insertelement <4 x float> %86, float %80, i32 2
  %88 = insertelement <4 x float> %87, float %84, i32 3
  %89 = fdiv <4 x float> %37, %88
  %90 = insertelement <4 x float> undef, float %76, i32 0
  %91 = insertelement <4 x float> %90, float %79, i32 1
  %92 = insertelement <4 x float> %91, float %81, i32 2
  %93 = insertelement <4 x float> %92, float %82, i32 3
  %94 = fadd <4 x float> %93, %89
  %95 = fmul <4 x float> %94, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01>
  %96 = bitcast i8* %77 to <4 x float>*
  store <4 x float> %95, <4 x float>* %96, align 8, !alias.scope !8, !noalias !12
  %97 = getelementptr inbounds i8*, i8** %buffer_table, i64 3
  %98 = bitcast i8** %97 to [1 x i8*]**
  %99 = load [1 x i8*]*, [1 x i8*]** %98, align 8, !invariant.load !0, !dereferenceable !2, !align !2
  %100 = getelementptr inbounds [1 x i8*], [1 x i8*]* %99, i64 0, i64 0
  store i8* %26, i8** %100, align 8, !alias.scope !14, !noalias !8
  ret void
}

; Function Attrs: nounwind readnone speculatable willreturn
declare <4 x float> @llvm.log.v4f32(<4 x float>) #1

attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" }
attributes #1 = { nounwind readnone speculatable willreturn }

!0 = !{}
!1 = !{i64 32}
!2 = !{i64 8}
!3 = !{!4, !6}
!4 = !{!"buffer: {index:0, offset:0, size:96}", !5}
!5 = !{!"XLA global AA domain"}
!6 = !{!"buffer: {index:5, offset:0, size:32}", !5}
!7 = !{!6}
!8 = !{!4}
!9 = !{i64 4}
!10 = !{i64 12}
!11 = !{i64 96}
!12 = !{!13, !6}
!13 = !{!"buffer: {index:3, offset:0, size:8}", !5}
!14 = !{!13}

```

This results in bad numerical answers when used through XLA.
Again, it's not that easy to give a small fully-reproducible example, but the misscompare is:

```
Expected literal:
(
f32[2,3,4] {
{
  { nan, -inf, -3181.35, -inf },
  { nan, -inf, -28.2577019, -inf },
  { nan, -inf, -28.2577019, -inf }
},
{
  { -inf, -inf, -inf, -inf },
  { -6.60753046e+28, -1.47314833e+23, -inf, -inf },
  { -2.43504347e+30, -5.42892693e+24, -inf, -inf }
}
}
)

Actual literal:
(
f32[2,3,4] {
{
  { nan, -inf, -3181.35, -inf },
  { nan, -inf, -inf, -inf },
  { inf, -inf, -28.2577019, -inf }
},
{
  { -inf, -inf, -inf, -inf },
  { -6.60753046e+28, -1.47314833e+23, -inf, -inf },
  { -2.43504347e+30, -5.42892693e+24, -inf, -inf }
}
}
)
```

Reviewers: sanjoy.google, sanjoy, ebrevnov, jdoerfert, reames, chandlerc

Subscribers: hiraditya, Charusso, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70516
swift-ci pushed a commit that referenced this pull request Dec 3, 2019
This patch adds the following intrinsics for gather loads with 64-bit offsets:
      * @llvm.aarch64.sve.ld1.gather (unscaled offset)
      * @llvm.aarch64.sve.ld1.gather.index (scaled offset)

These intrinsics map 1-1 to the following AArch64 instructions respectively (examples for half-words):
      * ld1h { z0.d }, p0/z, [x0, z0.d]
      * ld1h { z0.d }, p0/z, [x0, z0.d, lsl #1]

Committing on behalf of Andrzej Warzynski (andwar)

Reviewers: sdesmalen, huntergr, rovka, mgudim, dancgr, rengolin, efriedma

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70542
swift-ci pushed a commit that referenced this pull request Dec 3, 2019
This patch adds intrinsics for SVE gather loads for which the offsets are 32-bits wide and are:
* unscaled
  * @llvm.aarch64.sve.ld1.gather.sxtw
  * @llvm.aarch64.sve.ld1.gather.uxtw
* scaled (offsets become indices)
  * @llvm.arch64.sve.ld1.gather.sxtw.index
  * @llvm.arch64.sve.ld1.gather.uxtw.index
The offsets are either zero (uxtw) or sign (sxtw) extended to 64 bits.

These intrinsics map 1-1 to the corresponding SVE instructions (examples for half-words):
* unscaled
  * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw]
  * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw]
* scaled
  * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1]
  * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw #1]

Committed on behalf of Andrzej Warzynski (andwar)

Reviewers: sdesmalen, kmclaughlin, eli.friedman, rengolin, rovka, huntergr, dancgr, mgudim, efriedma

Reviewed By: sdesmalen

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70782
swift-ci pushed a commit that referenced this pull request Dec 18, 2019
Summary:
This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions.
It also adds a check in `Worklist::Add` that ensures that all added instructions have parents.

Simple test case that illustrates the difference when run with `--debug-only=instcombine`:

```
define i32 @test35(i32 %a, i32 %b) {
  %1 = or i32 %a, 1135
  %2 = or i32 %1, %b
  ret i32 %2
}
```

Before this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   <badref> = or i32 %2, 1135
...
```

With this patch:
```
INSTCOMBINE ITERATION #1 on test35
IC: ADDING: 3 instrs to worklist
IC: Visiting:   %1 = or i32 %a, 1135
IC: Visiting:   %2 = or i32 %1, %b
IC: ADD:   %2 = or i32 %a, %b
IC: Old =   %3 = or i32 %1, %b
    New =   <badref> = or i32 %2, 1135
IC: ADD:   %3 = or i32 %2, 1135
...
```

Reviewers: fhahn, davide, spatel, foad, grosser, nikic

Reviewed By: nikic

Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71093
swift-ci pushed a commit that referenced this pull request Jan 13, 2020
…t binding

This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30):

PASS:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325
    frame #1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295
    frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame #3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame #6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243
    frame #7: 0x000000000040106e a.out`_start + 46

vs.

FAIL - unrecognized abort() function:
./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit
  * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325
    frame #1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295
    frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2
    frame #3: 0x000000000040113a a.out`func_b at main.c:18:2
    frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2
    frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2
    frame #6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243
    frame #7: 0x000000000040106e a.out`.annobin_init.c.hot + 46

The extra ELF symbols are there due to Annobin (I did not investigate why this
problem happened specifically since F-30 and not since F-28).

It is due to:

Symbol table '.dynsym' contains 2361 entries:
Valu e          Size Type   Bind   Vis     Name
0000000000022769   5 FUNC   LOCAL  DEFAULT _nl_load_domain.cold
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_loadmsgcat.c_end.unlikely
...
000000000002276e   0 NOTYPE LOCAL  HIDDEN  .annobin_textdomain.c_end.unlikely
000000000002276e 548 FUNC   GLOBAL DEFAULT abort
000000000002276e 548 FUNC   GLOBAL DEFAULT abort@@GLIBC_2.2.5
000000000002276e 548 FUNC   LOCAL  DEFAULT __GI_abort
0000000000022992   0 NOTYPE LOCAL  HIDDEN  .annobin_abort.c_end.unlikely

GDB has some more complicated preferences between overlapping and/or sharing
address symbols, I have made here so far the most simple fix for this case.

Differential revision: https://reviews.llvm.org/D63540
swift-ci pushed a commit that referenced this pull request Jan 15, 2020
TSan spuriously reports for any OpenMP application a race on the initialization
of a runtime internal mutex:

```
Atomic read of size 1 at 0x7b6800005940 by thread T4:
  #0 pthread_mutex_lock <null> (a.out+0x43f39e)
  #1 __kmp_resume_64 <null> (libomp.so.5+0x84db4)

Previous write of size 1 at 0x7b6800005940 by thread T7:
  #0 pthread_mutex_init <null> (a.out+0x424793)
  #1 __kmp_suspend_initialize_thread <null> (libomp.so.5+0x8422e)
```

According to @AndreyChurbanov this is a false positive report, as the control
flow of the runtime guarantees the ordering of the mutex initialization and
the lock:
https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/530363

To suppress this report, I suggest the use of
TSAN_OPTIONS='ignore_uninstrumented_modules=1'.
With this patch, a runtime warning is provided in case an OpenMP application
is built with Tsan and executed without this Tsan-option.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70412
swift-ci pushed a commit that referenced this pull request Jan 16, 2020
The test is currently failing on some systems with ASAN enabled due to:
```
==22898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003da4 at pc 0x00010951c33d bp 0x7ffee6709e00 sp 0x7ffee67095c0
READ of size 5 at 0x603000003da4 thread T0
    #0 0x10951c33c in wrap_memmove+0x16c (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x1833c)
    #1 0x7fff4a327f57 in CFDataReplaceBytes+0x1ba (CoreFoundation:x86_64+0x13f57)
    #2 0x7fff4a415a44 in __CFDataInit+0x2db (CoreFoundation:x86_64+0x101a44)
    #3 0x1094f8490 in main main.m:424
    #4 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084)
0x603000003da4 is located 0 bytes to the right of 20-byte region [0x603000003d90,0x603000003da4)
allocated by thread T0 here:
    #0 0x109547c02 in wrap_calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x43c02)
    #1 0x7fff763ad3ef in class_createInstance+0x52 (libobjc.A.dylib:x86_64+0x73ef)
    #2 0x7fff4c6b2d73 in NSAllocateObject+0x12 (Foundation:x86_64+0x1d73)
    #3 0x7fff4c6b5e5f in -[_NSPlaceholderData initWithBytes:length:copy:deallocator:]+0x40 (Foundation:x86_64+0x4e5f)
    #4 0x7fff4c6d4cf1 in -[NSData(NSData) initWithBytes:length:]+0x24 (Foundation:x86_64+0x23cf1)
    #5 0x1094f8245 in main main.m:404
    #6 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084)
```

The reason is that we create a string "HELLO" but get the size wrong (it's 5 bytes instead
of 4). Later on we read the buffer and pretend it is 5 bytes long, causing an OOB read
which ASAN detects.

In general this test probably needs some cleanup as it produces on macOS 10.15 around
100 compiler warnings which isn't great, but let's first get the bot green.
swift-ci pushed a commit that referenced this pull request May 9, 2025
llvm#138091)

Check this error for more context
(https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531)

This fails with
```
* thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3)
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
    frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58
    frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838
    frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13
    frame #5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98
    frame #6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102
    frame #7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13
    frame #8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919
```

Problem :

1) The destructor currently handles no clearance for the DeviceParser
and the DeviceAct. We currently only have this

https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419

2) The ownership for DeviceCI currently is present in
IncrementalCudaDeviceParser. But this should be similar to how the
combination for hostCI, hostAction and hostParser are managed by the
Interpreter. As on master the DeviceAct and DeviceParser are managed by
the Interpreter but not DeviceCI. This is problematic because :
IncrementalParser holds a Sema& which points into the DeviceCI. On
master, DeviceCI is destroyed before the base class ~IncrementalParser()
runs, causing Parser::reset() to access a dangling Sema (and as Sema
holds a reference to Preprocessor which owns PragmaNamespace) we see
this
```
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830

```

(cherry picked from commit 529b6fc)
swift-ci pushed a commit that referenced this pull request May 13, 2025
… `getForwardSlice` matchers (llvm#115670)

Improve mlir-query tool by implementing `getBackwardSlice` and
`getForwardSlice` matchers. As an addition `SetQuery` also needed to be
added to enable custom configuration for each query. e.g: `inclusive`,
`omitUsesFromAbove`, `omitBlockArguments`.

Note: backwardSlice and forwardSlice algoritms are the same as the ones
in `mlir/lib/Analysis/SliceAnalysis.cpp`
Example of current matcher. The query was made to the file:
`mlir/test/mlir-query/complex-test.mlir`

```mlir
./mlir-query /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir -c "match getDefinitions(hasOpName(\"arith.add
f\"),2)"

Match #1:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:5:8:
  %0 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<5x5xf32>) outs(%arg1 : tensor<5x5xf32>) {
       ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:7:10: note: "root" binds here
    %2 = arith.addf %in, %in : f32
         ^
Match #2:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:10:16:
  %collapsed = tensor.collapse_shape %0 [[0, 1]] : tensor<5x5xf32> into tensor<25xf32>
               ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:13:11:
    %c2 = arith.constant 2 : index
          ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:14:18:
    %extracted = tensor.extract %collapsed[%c2] : tensor<25xf32>
                 ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:15:10: note: "root" binds here
    %2 = arith.addf %extracted, %extracted : f32
         ^
2 matches.
```
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request May 16, 2025
When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame swiftlang#1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame swiftlang#2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame swiftlang#3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request May 16, 2025
When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame swiftlang#1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame swiftlang#2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame swiftlang#3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request May 23, 2025
When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame swiftlang#1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame swiftlang#2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame swiftlang#3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request May 23, 2025
When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame swiftlang#1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame swiftlang#2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame swiftlang#3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
felipepiovezan added a commit to felipepiovezan/llvm-project that referenced this pull request May 23, 2025
When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame swiftlang#1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame swiftlang#2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame swiftlang#3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
adrian-prantl pushed a commit that referenced this pull request May 28, 2025
* [lldb][nfc] Create helper functions in SwiftLanguageRuntimeNames

These will be useful to reuse code in upcoming commits.

* [lldb] Fix stepping into ObjcC ctor from Swift

When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame #1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame #2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame #3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).
felipepiovezan added a commit that referenced this pull request May 30, 2025
* [lldb][nfc] Create helper functions in SwiftLanguageRuntimeNames

These will be useful to reuse code in upcoming commits.

* [lldb] Fix stepping into ObjcC ctor from Swift

When constructing an Objective C object of type `Foo` from Swift, this
sequence of function calls is used:

```
  * frame #0: 0x000000010000147c test.out`-[Foo initWithString:](self=0x00006000023ec000, _cmd="initWithString:", value=@"Bar") -[Foo initWithString:]  at Foo.m:9:21
    frame #1: 0x00000001000012bc test.out`@nonobjc Foo.init(string:) $sSo3FooC6stringABSS_tcfcTO  at <compiler-generated>:0
    frame #2: 0x0000000100001170 test.out`Foo.__allocating_init(string:) $sSo3FooC6stringABSS_tcfC  at Foo.h:0
    frame #3: 0x0000000100000ed8 test.out`work() $s4test4workyyF  at main.swift:5:18
```

Frames 1 and 2 are common with pure Swift classes, and LLDB has a Thread
Plan to go from `Foo.allocating_init` -> `Foo.init`.

In the case of Objcetive C interop, `Foo.init` has no user code, and is
annotated with `@nonobjc`. The debugger needs a plan to go from that
code to the Objective C implementation. This is what this patch attempts
to fix by creating a plan that runs to any symbol matching `Foo init`
(this will match all the :withBlah suffixes).

This seems to be the only possible fix for this. While Objective C
constructors are not necessarily called init, the interop layer seems to
assume this.

The only other alternative has some obstacles that could not be easily
overcome. Here's the main idea for that. The assembly for `@nonobjc
Foo.init` looks like (deleted all non branches):

```
test.out`@nonobjc Foo.init(string:):
...
    0x1000012a0 <+20>: bl     0x100001618    ; symbol stub for: Swift.String._bridgeToObjectiveC() -> __C.NSString
...
    0x1000012b8 <+44>: bl     0x100001630    ; symbol stub for: objc_msgSend
...
    0x1000012e8 <+92>: ret
```

If we had more String arguments, there would be more calls to
`_bridgeToObjectiveC`. The call to `objc_msgSend` is the important one,
and LLDB knows how to go from that to the target of the message, LLDB
has ThreadPlans for that. However, setting a breakpoint on
`objc_msgSend` would fail: the calls to `_bridgeToObjectiveC` may also
call `objc_msgSend`, so LLDB would end up in the wrong `objc_msgSend`.
This is not entirely bad, LLDB would step back to `Foo.init`.

Here's the catch: the language runtime refuses to create other plans if
PC is not at the start of the function, which makes sense, as it would
not be able to distinguish if its job was already done previously or
not, unless it had a stateful plan (which it doesn't today).

(cherry picked from commit 3f9b875)
swift-ci pushed a commit that referenced this pull request Jun 2, 2025
Fixes llvm#123300

What is seen 
```
clang-repl> int x = 42;
clang-repl> auto capture = [&]() { return x * 2; };
In file included from <<< inputs >>>:1:
input_line_4:1:17: error: non-local lambda expression cannot have a capture-default
    1 | auto capture = [&]() { return x * 2; };
      |                 ^
zsh: segmentation fault  clang-repl --Xcc="-v"

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x0000000107b4f8b8 libclang-cpp.19.1.dylib`clang::IncrementalParser::CleanUpPTU(clang::PartialTranslationUnit&) + 988
    frame #1: 0x0000000107b4f1b4 libclang-cpp.19.1.dylib`clang::IncrementalParser::ParseOrWrapTopLevelDecl() + 416
    frame #2: 0x0000000107b4fb94 libclang-cpp.19.1.dylib`clang::IncrementalParser::Parse(llvm::StringRef) + 612
    frame #3: 0x0000000107b52fec libclang-cpp.19.1.dylib`clang::Interpreter::ParseAndExecute(llvm::StringRef, clang::Value*) + 180
    frame #4: 0x0000000100003498 clang-repl`main + 3560
    frame #5: 0x000000018d39a0e0 dyld`start + 2360
```

Though the error is justified, we shouldn't be interested in exiting
through a segfault in such cases.

The issue is that empty named decls weren't being taken care of
resulting into this assert


https://github.com/llvm/llvm-project/blob/c1a229252617ed58f943bf3f4698bd8204ee0f04/clang/include/clang/AST/DeclarationName.h#L503

Can also be seen when the example is attempted through xeus-cpp-lite.


![image](https://github.com/user-attachments/assets/9b0e6ead-138e-4b06-9ad9-fcb9f8d5bf6e)
swift-ci pushed a commit that referenced this pull request Jun 2, 2025
# Symptom

We have seen SIGSEGV like this:
```
* thread #1, name = 'lldb-server', stop reason = SIGSEGV
    frame #0: 0x00007f39e529c993 libc.so.6`__pthread_kill_internal(signo=11, threadid=<unavailable>) at pthread_kill.c:46:37
    ...
  * frame #5: 0x000056027c94fe48 lldb-server`lldb_private::process_linux::GetPtraceScope() + 72
    frame #6: 0x000056027c92f94f lldb-server`lldb_private::process_linux::NativeProcessLinux::Attach(int) + 1087
    ...
```
See [full stack trace](https://pastebin.com/X0d6QhYj).

This happens on Linux where LLDB doesn't have access to
`/proc/sys/kernel/yama/ptrace_scope`.

A similar error (an unchecked `Error`) can be reproduced by running the
newly added unit test without the fix. See the "Test" section below.


# Root cause

`GetPtraceScope()`
([code](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/lldb/source/Plugins/Process/Linux/Procfs.cpp#L77))
has the following `if` statement:
```
llvm::Expected<int> lldb_private::process_linux::GetPtraceScope() {
  ErrorOr<std::unique_ptr<MemoryBuffer>> ptrace_scope_file =
      getProcFile("sys/kernel/yama/ptrace_scope");
  if (!*ptrace_scope_file)
    return errorCodeToError(ptrace_scope_file.getError());
  ...
}
```

The intention of the `if` statement is to check whether the
`ptrace_scope_file` is an `Error` or not, and return the error if it is.
However, the `operator*` of `ErrorOr` returns the value that is stored
(which is a `std::unique_ptr<MemoryBuffer>`), so what the `if` condition
actually do is to check if the unique pointer is non-null.

Note that the method `ErrorOr::getStorage()` ([called
by](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/llvm/include/llvm/Support/ErrorOr.h#L162-L164)
`ErrorOr::operator *`) **does** assert on whether or not `HasError` has
been set (see
[ErrorOr.h](https://github.com/llvm/llvm-project/blob/328f40f408c218f25695ea42c844e43bef38660b/llvm/include/llvm/Support/ErrorOr.h#L235-L243)).
However, it seems this wasn't executed, probably because the LLDB was a
release build.

# Fix

The fix is simply remove the `*` in the said `if` statement.
swift-ci pushed a commit that referenced this pull request Jun 5, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
Michael137 added a commit that referenced this pull request Jun 5, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
(cherry picked from commit 5a91892)
Michael137 added a commit that referenced this pull request Jun 6, 2025
…142952)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
(cherry picked from commit 5a91892)
adrian-prantl pushed a commit that referenced this pull request Jun 6, 2025
…142952) (#10789)

This was removed in llvm#135343 in
favour of making it a format variable, which we do here. This follows
the precedent of the `[opt]` and `[artificial]` markers.

Before:
```
 thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
 * frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3
   frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
   frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43
   frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
   frame #4: 0x0000000186345be4 dyld`start + 7040
```

After (note the `[inlined]` markers):
```
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
* frame #0: 0x000000010000037c a.out`inlined1() at inline.cpp:4:3 [inlined]
  frame #1: 0x000000010000037c a.out`regular() at inline.cpp:6:17
  frame #2: 0x00000001000003b8 a.out`inlined2() at inline.cpp:7:43 [inlined]
  frame #3: 0x00000001000003b4 a.out`main at inline.cpp:10:3
  frame #4: 0x0000000186345be4 dyld`start + 7040
```

rdar://152642178
(cherry picked from commit 5a91892)
swift-ci pushed a commit that referenced this pull request Jun 13, 2025
These were failing on our Windows on Arm bot, or more precisely,
not even completing.

This is because Microsoft's C runtime does extra parameter validation.
So when we called _read with an invalid fd, it called an invalid
parameter handler instead of returning an error.

https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/reference/read?view=msvc-170
https://learn.microsoft.com/en-us/%20cpp/c-runtime-library/parameter-validation?view=msvc-170

(lldb) run
Process 8440 launched: 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\unittests\Host\HostTests.exe' (aarch64)
Process 8440 stopped
* thread #1, stop reason = Exception 0xc0000409 encountered at address 0x7ffb7453564c
    frame #0: 0x00007ffb7453564c ucrtbase.dll`_get_thread_local_invalid_parameter_handler + 652
ucrtbase.dll`_get_thread_local_invalid_parameter_handler:
->  0x7ffb7453564c <+652>: brk    #0xf003

ucrtbase.dll`_invalid_parameter_noinfo:
    0x7ffb74535650 <+0>:   b      0x7ffb745354d8 ; _get_thread_local_invalid_parameter_handler + 280
    0x7ffb74535654 <+4>:   nop
    0x7ffb74535658 <+8>:   nop

You can override this handler but I'm assuming that this reading
after close isn't a crucial feature, so disabling the tests seems
like the way to go.

If it is crucial, we can check the fd before we use it.

Tests added by llvm#143946.
swift-ci pushed a commit that referenced this pull request Jun 25, 2025
# Benefit

This patch fixes:
1. After `platform select ios-simulator`, `platform process list` will
now print processes which are running in the iOS simulator. Previously,
no process will be listed.
2. After `platform select ios-simulator`, `platform attach --name
<name>` will succeed. Previously, it will error out saying no process is
found.


# Several bugs that is being fixed

1. During the process listing, add `aarch64` to the list of CPU types
for which iOS simulators are checked for.
2. Given a candidate process, when checking for simulators, the original
code will find the desired environment variable (`SIMULATOR_UDID`) and
set the OS to iOS, but then the immediate next environment variable will
set it back to macOS.
3. For processes running on simulator, set the triple's `Environment` to
`Simulator`, so that such processes can pass the filtering [in this
line](https://fburl.com/8nivnrjx). The original code leave it as the
default `UnknownEnvironment`.



# Manual test

**With this patch:**
```
royshi-mac-home ~/public_llvm/build % bin/lldb
(lldb) platform select ios-simulator

(lldb) platform process list
240 matching processes were found on "ios-simulator"

PID    PARENT USER       TRIPLE                         NAME
====== ====== ========== ============================== ============================
40511  28844  royshi     arm64-apple-ios-simulator      FocusPlayground // my toy iOS app running on simulator
... // omit
28844  1      royshi     arm64-apple-ios-simulator      launchd_sim

(lldb) process attach --name FocusPlayground
Process 40511 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x0000000104e3cb70 libsystem_kernel.dylib`mach_msg2_trap + 8
libsystem_kernel.dylib`mach_msg2_trap:
->  0x104e3cb70 <+8>: ret
... // omit
```

**Without this patch:**
```
$ bin/lldb
(lldb) platform select ios-simulator

(lldb) platform process list
error: no processes were found on the "ios-simulator" platform

(lldb) process attach --name FocusPlayground
error: attach failed: could not find a process named FocusPlayground
```


# Unittest

See PR.
swift-ci pushed a commit that referenced this pull request Jun 26, 2025
The function already exposes a work list to avoid deep recursion, this
commit starts utilizing it in a helper that could also lead to a deep
recursion.

We have observed this crash on `clang/test/C/C99/n590.c` with our
internal builds that enable aggressive optimizations and hit the limit
earlier than default release builds of Clang.

See the added test for an example with a deeper recursion that used to
crash in upstream Clang before this change with the following stack
trace:

```
  #0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
  #1 llvm::sys::RunSignalHandlers() /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Signals.cpp:106:18
  #2 SignalHandler(int, siginfo_t*, void*) /usr/local/google/home/ibiryukov/code/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
  #3 (/lib/x86_64-linux-gnu/libc.so.6+0x3fdf0)
  #4 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12772:0
  #5 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
  #6 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
  #7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
  #8 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
  #9 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 #10 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 #11 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 #12 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 #13 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 #14 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 #15 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 #16 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
 #17 CheckCommaOperand /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:0:3
 #18 AnalyzeImplicitConversions /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12644:7
 #19 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) /usr/local/google/home/ibiryukov/code/llvm-project/clang/lib/Sema/SemaChecking.cpp:12776:5
... 700+ more stack frames.
```
swift-ci pushed a commit that referenced this pull request Jul 15, 2025
Fix unnecessary conversion of C-String to StringRef in the `Cmp` lambda
inside `lookupLLVMIntrinsicByName`. This both fixes an ASAN error in the
code that happens when the `Name` StringRef passed in is not a Null
terminated StringRef, and additionally can potentially speed up the code
as well by eliminating the unnecessary computation of string length
every time a C String is converted to StringRef in this code (It seems
practically this computation is eliminated in optimized builds, but this
will avoid it in O0 builds as well).

Added a unit test that demonstrates this issue by building LLVM with
these options:

```
CMAKE_BUILD_TYPE=Debug
LLVM_USE_SANITIZER=Address
LLVM_OPTIMIZE_SANITIZED_BUILDS=OFF
```

The error reported is as follows:

```
==462665==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5030000391a2 at pc 0x56525cc30bbf bp 0x7fff9e4ccc60 sp 0x7fff9e4cc428
READ of size 19 at 0x5030000391a2 thread T0
    #0 0x56525cc30bbe in strlen (upstream-llvm-second/llvm-project/build/unittests/IR/IRTests+0x713bbe) (BuildId: 0651acf1e582a4d2)
    #1 0x7f8ff22ad334 in std::char_traits<char>::length(char const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9
    #2 0x7f8ff22a34a0 in llvm::StringRef::StringRef(char const*) /home/rjoshi/upstream-llvm-second/llvm-project/llvm/include/llvm/ADT/StringRef.h:96:33
    #3 0x7f8ff28ca184 in _ZZL25lookupLLVMIntrinsicByNameN4llvm8ArrayRefIjEENS_9StringRefES2_ENK3$_0clIjPKcEEDaT_T0_ upstream-llvm-second/llvm-project/llvm/lib/IR/Intrinsics.cpp:673:18
```
swift-ci pushed a commit that referenced this pull request Jul 16, 2025
…lvm#148205)

In the original motivating test case,
[FoldList](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1764)
had entries:
```
  #0: UseMI: %224:sreg_32 = S_OR_B32 %219.sub0:sreg_64, %219.sub1:sreg_64, implicit-def dead $scc
      UseOpNo: 1

  #1: UseMI: %224:sreg_32 = S_OR_B32 %219.sub0:sreg_64, %219.sub1:sreg_64, implicit-def dead $scc
      UseOpNo: 2
```
After calling
[updateOperand(#0)](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1773),
[tryConstantFoldOp(#0.UseMI)](https://github.com/llvm/llvm-project/blob/d8a2141ff98ee35cd1886f536ccc3548b012820b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1786)
removed operand 1, and entry #&#8203;1.UseOpNo was no longer valid,
resulting in an
[assert](https://github.com/llvm/llvm-project/blob/4a35214bddbb67f9597a500d48ab8c4fb25af150/llvm/include/llvm/ADT/ArrayRef.h#L452).

This change defers constant folding until all operands have been updated
so that UseOpNo values remain stable.
swift-ci pushed a commit that referenced this pull request Jul 29, 2025
Extend support in LLDB for WebAssembly. This PR adds a new Process
plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly
targets. It adds support for WebAssembly's memory model with separate
address spaces, and the ability to fetch the call stack from the
WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR,
https://github.com/bytecodealliance/wasm-micro-runtime) which implements
a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```

This PR is based on an unmerged patch from Paolo Severini:
https://reviews.llvm.org/D78801. I intentionally stuck to the
foundations to keep this PR small. I have more PRs in the pipeline to
support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled
to WebAssembly:
https://www.swift.org/documentation/articles/wasm-getting-started.html
swift-ci pushed a commit that referenced this pull request Aug 4, 2025
…erver (llvm#148774)

Summary:
There was a deadlock was introduced by [PR
llvm#146441](llvm#146441) which changed
`CurrentThreadIsPrivateStateThread()` to
`CurrentThreadPosesAsPrivateStateThread()`. This change caused the
execution path in
[`ExecutionContextRef::SetTargetPtr()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L513)
to now enter a code block that was previously skipped, triggering
[`GetSelectedFrame()`](https://github.com/llvm/llvm-project/blob/10b5558b61baab59c7d3dff37ffdf0861c0cc67a/lldb/source/Target/ExecutionContext.cpp#L522)
which leads to a deadlock.

Thread 1 gets m_modules_mutex in
[`ModuleList::AppendImpl`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Core/ModuleList.cpp#L218),
Thread 3 gets m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501),
but then Thread 1 waits for m_language_runtimes_mutex in
[`GetLanguageRuntime`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Process.cpp#L1501)
while Thread 3 waits for m_modules_mutex in
[`ScanForGNUstepObjCLibraryCandidate`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Plugins/LanguageRuntime/ObjC/GNUstepObjCRuntime/GNUstepObjCRuntime.cpp#L57).

This fixes the deadlock by adding a scoped block around the mutex lock
before the call to the notifier, and moved the notifier call outside of
the mutex-guarded section. The notifier call
[`NotifyModuleAdded`](https://github.com/llvm/llvm-project/blob/96148f92146e5211685246722664e51ec730e7ba/lldb/source/Target/Target.cpp#L1810)
should be thread-safe, since the module should be added to the
`ModuleList` before the mutex is released, and the notifier doesn't
modify the module list further, and the call is operates on local state
and the `Target` instance.

### Deadlocked Thread backtraces:
```
* thread #3, name = 'dbg.evt-handler', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563786bd5f40) at    futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007f2f0f1927b0, __m=0x0000563786bd5f40) at std_mutex.h:229:19
    frame #8: 0x00007f2f27946eb7 liblldb.so.21.0git`ScanForGNUstepObjCLibraryCandidate(modules=0x0000563786bd5f28, TT=0x0000563786bd5eb8) at GNUstepObjCRuntime.cpp:60:41
    frame #9: 0x00007f2f27946c80 liblldb.so.21.0git`lldb_private::GNUstepObjCRuntime::CreateInstance(process=0x0000563785e1d360, language=eLanguageTypeObjC) at GNUstepObjCRuntime.cpp:87:8
    frame #10: 0x00007f2f2746fca5 liblldb.so.21.0git`lldb_private::LanguageRuntime::FindPlugin(process=0x0000563785e1d360, language=eLanguageTypeObjC) at LanguageRuntime.cpp:210:36
    frame #11: 0x00007f2f2742c9e3 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeObjC) at Process.cpp:1516:9
    ...
    frame #21: 0x00007f2f2750b5cc liblldb.so.21.0git`lldb_private::Thread::GetSelectedFrame(this=0x0000563785e064d0, select_most_relevant=DoNoSelectMostRelevantFrame) at Thread.cpp:274:48
    frame #22: 0x00007f2f273f9957 liblldb.so.21.0git`lldb_private::ExecutionContextRef::SetTargetPtr(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:525:32
    frame #23: 0x00007f2f273f9714 liblldb.so.21.0git`lldb_private::ExecutionContextRef::ExecutionContextRef(this=0x00007f2f0f193778, target=0x0000563786bd5be0, adopt_selected=true) at ExecutionContext.cpp:413:3
    frame #24: 0x00007f2f270e80af liblldb.so.21.0git`lldb_private::Debugger::GetSelectedExecutionContext(this=0x0000563785d83bc0) at Debugger.cpp:1225:23
    frame #25: 0x00007f2f271bb7fd liblldb.so.21.0git`lldb_private::Statusline::Redraw(this=0x0000563785d83f30, update=true) at Statusline.cpp:136:41
    ...
* thread #1, name = 'lldb', stop reason = signal SIGSTOP
  * frame #0: 0x00007f2f1e2973dc libc.so.6`futex_wait(private=0, expected=2, futex_word=0x0000563785e1dd98) at futex-internal.h:146:13
   /*... a bunch of mutex related bt ... */    
   liblldb.so.21.0git`std::lock_guard<std::recursive_mutex>::lock_guard(this=0x00007ffe62be0488, __m=0x0000563785e1dd98) at std_mutex.h:229:19
    frame #8: 0x00007f2f2742c8d1 liblldb.so.21.0git`lldb_private::Process::GetLanguageRuntime(this=0x0000563785e1d360, language=eLanguageTypeC_plus_plus) at Process.cpp:1510:41
    frame #9: 0x00007f2f2743c46f liblldb.so.21.0git`lldb_private::Process::ModulesDidLoad(this=0x0000563785e1d360, module_list=0x00007ffe62be06a0) at Process.cpp:6082:36
    ...
    frame #13: 0x00007f2f2715cf03 liblldb.so.21.0git`lldb_private::ModuleList::AppendImpl(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, use_notifier=true) at ModuleList.cpp:246:19
    frame #14: 0x00007f2f2715cf4c liblldb.so.21.0git`lldb_private::ModuleList::Append(this=0x0000563786bd5f28, module_sp=ptr = 0x563785cec560, notify=true) at ModuleList.cpp:251:3
    ...
    frame #19: 0x00007f2f274349b3 liblldb.so.21.0git`lldb_private::Process::ConnectRemote(this=0x0000563785e1d360, remote_url=(Data = "connect://localhost:1234", Length = 24)) at Process.cpp:3250:9
    frame #20: 0x00007f2f27411e0e liblldb.so.21.0git`lldb_private::Platform::DoConnectProcess(this=0x0000563785c59990, connect_url=(Data = "connect://localhost:1234", Length = 24), plugin_name=(Data = "gdb-remote", Length = 10), debugger=0x0000563785d83bc0, stream=0x00007ffe62be3128, target=0x0000563786bd5be0, error=0x00007ffe62be1ca0) at Platform.cpp:1926:23
```

## Test Plan:
Built a hello world a.out
Run server in one terminal:
```
~/llvm/build/Debug/bin/lldb-server g :1234 a.out
```
Run client in another terminal
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
```

Before:
Client hangs indefinitely
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b main"
(lldb) gdb-remote 1234

^C^C
```

After:
```
~/llvm/build/Debug/bin/lldb -o "gdb-remote 1234" -o "b hello.cc:3"
(lldb) gdb-remote 1234
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fe4a60
ld-linux-x86-64.so.2`_start:
->  0x7ffff7fe4a60 <+0>: movq   %rsp, %rdi
    0x7ffff7fe4a63 <+3>: callq  0x7ffff7fe5780 ; _dl_start at rtld.c:522:1

ld-linux-x86-64.so.2`_dl_start_user:
    0x7ffff7fe4a68 <+0>: movq   %rax, %r12
    0x7ffff7fe4a6b <+3>: movl   0x18067(%rip), %eax ; _dl_skip_args
(lldb) b hello.cc:3
Breakpoint 1: where = a.out`main + 15 at hello.cc:4:13, address = 0x00005555555551bf
(lldb) c
Process 837068 resuming
Process 837068 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x00005555555551bf a.out`main at hello.cc:4:13
   1   	#include <iostream>
   2
   3   	int main() {
-> 4   	  std::cout << "Hello World" << std::endl;
   5   	  return 0;
   6   	}
```
JDevlieghere added a commit that referenced this pull request Aug 6, 2025
Extend support in LLDB for WebAssembly. This PR adds a new Process
plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly
targets. It adds support for WebAssembly's memory model with separate
address spaces, and the ability to fetch the call stack from the
WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR,
https://github.com/bytecodealliance/wasm-micro-runtime) which implements
a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```

This PR is based on an unmerged patch from Paolo Severini:
https://reviews.llvm.org/D78801. I intentionally stuck to the
foundations to keep this PR small. I have more PRs in the pipeline to
support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled
to WebAssembly:
https://www.swift.org/documentation/articles/wasm-getting-started.html

(cherry picked from commit a28e7f1)
swift-ci pushed a commit that referenced this pull request Aug 7, 2025
…lvm#152156)

With this new A320 in-order core, we follow adding the
FeatureUseFixedOverScalableIfEqualCost feature to A510 and A520
(llvm#132246), which reaps the same code generation benefits of preferring
fixed over scalable when the cost is equal.

So when we have:
```
void foo(float* a, float* b, float* dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

When compiling without the feature enabled, we get:
```
...
    ld1b    { z0.b }, p0/z, [x0, x10]
    ld1b    { z2.b }, p0/z, [x1, x10]
    add     x12, x0, x10
    ldr     z1, [x12, #1, mul vl]
    add     x12, x1, x10
    ldr     z3, [x12, #1, mul vl]
    fadd    z0.s, z2.s, z0.s
    add     x12, x2, x10
    fadd    z1.s, z3.s, z1.s
    dech    x11
    st1b    { z0.b }, p0, [x2, x10]
    incb    x10, all, mul #2
    str     z1, [x12, #1, mul vl]
...
```

When compiling with, we get:
```
...
  	ldp	    q0, q1, [x12, #-16]
	ldp	    q2, q3, [x11, #-16]
	subs	x13, x13, #8
	fadd	v0.4s, v2.4s, v0.4s
	fadd	v1.4s, v3.4s, v1.4s
	add	    x11, x11, #32
	add	    x12, x12, #32
	stp	    q0, q1, [x10, #-16]
	add	    x10, x10, #32

...
```
swift-ci pushed a commit that referenced this pull request Aug 12, 2025
M68k's SETCC instruction (`scc`) distinctly fills the destination byte
with all 1s. If boolean contents are set to `ZeroOrOneBooleanContent`,
LLVM can mistakenly think the destination holds `0x01` instead of `0xff`
and emit broken code as a result. This change corrects the boolean
content type to `ZeroOrNegativeOneBooleanContent`.

For example, this IR:

```llvm
define dso_local signext range(i8 0, 2) i8 @testBool(i32 noundef %a) local_unnamed_addr #0 {
entry:
  %cmp = icmp eq i32 %a, 4660
  %. = zext i1 %cmp to i8
  ret i8 %.
}
```

would previously build as:

```asm
testBool:                               ; @testBool
	cmpi.l	#4660, (4,%sp)
	seq	%d0
	and.l	#255, %d0
	rts
```

Notice the `zext` is erroneously not clearing the low bits, and thus the
register returns with 255 instead of 1. This patch fixes the issue:

```asm
testBool:                               ; @testBool
	cmpi.l	#4660, (4,%sp)
	seq	%d0
	and.l	#1, %d0
	rts
```

Most of the tests containing `scc` suffered from the same value error as
described above, so those tests have been updated to match the new
output (which also logically corrects them).
swift-ci pushed a commit that referenced this pull request Aug 13, 2025
## Problem

When the new setting

```
set target.parallel-module-load true
```
was added, lldb began fetching modules from the devices from multiple
threads simultaneously. This caused crashes of lldb when debugging on
android devices.

The top of the stack in the crash look something like this:
```
#0 0x0000555aaf2b27fe llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/llvm/bin/lldb-dap+0xb87fe)
 #1 0x0000555aaf2b0a99 llvm::sys::RunSignalHandlers() (/opt/llvm/bin/lldb-dap+0xb6a99)
 #2 0x0000555aaf2b2fda SignalHandler(int, siginfo_t*, void*) (/opt/llvm/bin/lldb-dap+0xb8fda)
 #3 0x00007f9c02444560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0
 #4 0x00007f9c04ea7707 lldb_private::ConnectionFileDescriptor::Disconnect(lldb_private::Status*) (usr/bin/../lib/liblldb.so.15+0x22a7707)
 #5 0x00007f9c04ea5b41 lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5b41)
 #6 0x00007f9c04ea5c1e lldb_private::ConnectionFileDescriptor::~ConnectionFileDescriptor() (usr/bin/../lib/liblldb.so.15+0x22a5c1e)
 #7 0x00007f9c052916ff lldb_private::platform_android::AdbClient::SyncService::Stat(lldb_private::FileSpec const&, unsigned int&, unsigned int&, unsigned int&) (usr/bin/../lib/liblldb.so.15+0x26916ff)
 #8 0x00007f9c0528b9dc lldb_private::platform_android::PlatformAndroid::GetFile(lldb_private::FileSpec const&, lldb_private::FileSpec const&) (usr/bin/../lib/liblldb.so.15+0x268b9dc)
```
Our workaround was to set `set target.parallel-module-load ` to `false`
to avoid the crash.

## Background

PlatformAndroid creates two different classes with one stateful adb
connection shared between the two -- one through AdbClient and another
through AdbClient::SyncService. The connection management and state is
complex, and seems to be responsible for the segfault we are seeing. The
AdbClient code resets these connections at times, and re-establishes
connections if they are not active. Similarly, PlatformAndroid caches
its SyncService, which uses an AdbClient class, but the SyncService puts
its connection into a different 'sync' state that is incompatible with a
standard connection.

## Changes in this diff

* This diff refactors the code to (hopefully) have clearer ownership of
the connection, clearer separation of AdbClient and SyncService by
making a new class for clearer separations of concerns, called
AdbSyncService.
* New unit tests are added
* Additional logs were added (see
llvm#145382 (comment)
for details)
swift-ci pushed a commit that referenced this pull request Aug 13, 2025
…namic (llvm#153420)

Canonicalizing the following IR:

```
func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> {
  %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32>
  %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8>
  %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32>
  return %2 : tensor<?x17xf32>
}
```

resulted in a crash

```
#0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db)                                                                                                                                                                                                                                                                                                                   
 #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8                                                                                                                                                                                                                
 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1                                                                                                                                                                                                                                
 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18                                                                                                                                                                                                                                         
 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t*, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3                                                                                                                                                                                                                              
 #5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050)                                                                                                                                                                                                                                                                                                                            
 #6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76                                                                                                                                                                                                                                                                                                            
 #7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6                                                                                                                                                                                                                                                                                                                         
 #8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7                                                                                                                                                                                                                                                                                                                                          
 #9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9                                                                                                                                                                                                                                                                                                                           
#10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2)                                                                                                                                                                                                                                                                                                                            
#11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3                                                                                                                                                                                    
#12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10                                                                                                                                                                                                              
#13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0
```

from the folder for `tosa::mul` since the zero value was being reshaped
to `?x17` size which isn't supported. AFAIK, `tosa.const` requires all
dimensions to be static. So in this case, the fix is to not to fold the
op.
swift-ci pushed a commit that referenced this pull request Aug 19, 2025
…vm#153560)

Fixes llvm#153157

The proposed solution has been discussed here
(llvm#153157 (comment))

This is what we would be seeing now 

```
base) anutosh491@Anutoshs-MacBook-Air bin % ./lldb /Users/anutosh491/work/xeus-cpp/a.out
(lldb) target create "/Users/anutosh491/work/xeus-cpp/a.out"
Current executable set to '/Users/anutosh491/work/xeus-cpp/a.out' (arm64).
(lldb) b main
Breakpoint 1: where = a.out`main, address = 0x0000000100003f90
(lldb) r
Process 71227 launched: '/Users/anutosh491/work/xeus-cpp/a.out' (arm64)
Process 71227 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003f90 a.out`main
a.out`main:
->  0x100003f90 <+0>:  sub    sp, sp, #0x10
    0x100003f94 <+4>:  str    wzr, [sp, #0xc]
    0x100003f98 <+8>:  str    w0, [sp, #0x8]
    0x100003f9c <+12>: str    x1, [sp]
(lldb) expression --repl -l c -- 
  1> 1 + 1
(int) $0 = 2
  2> 2 + 2
(int) $1 = 4
```

```
base) anutosh491@Anutoshs-MacBook-Air bin % ./lldb /Users/anutosh491/work/xeus-cpp/a.out
(lldb) target create "/Users/anutosh491/work/xeus-cpp/a.out"
Current executable set to '/Users/anutosh491/work/xeus-cpp/a.out' (arm64).
(lldb) b main
Breakpoint 1: where = a.out`main, address = 0x0000000100003f90
(lldb) r
Process 71355 launched: '/Users/anutosh491/work/xeus-cpp/a.out' (arm64)
Process 71355 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100003f90 a.out`main
a.out`main:
->  0x100003f90 <+0>:  sub    sp, sp, #0x10
    0x100003f94 <+4>:  str    wzr, [sp, #0xc]
    0x100003f98 <+8>:  str    w0, [sp, #0x8]
    0x100003f9c <+12>: str    x1, [sp]
(lldb) expression --repl -l c -- 3 + 3
Warning: trailing input is ignored in --repl mode
  1> 1 + 1
(int) $0 = 2
```
swift-ci pushed a commit that referenced this pull request Aug 19, 2025
This can happen when JIT code is run, and we can't symbolize those
frames, but they should remain numbered in the stack. An example
spidermonkey trace:

```
    #0 0x564ac90fb80f  (/builds/worker/dist/bin/js+0x240e80f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #1 0x564ac9223a64  (/builds/worker/dist/bin/js+0x2536a64) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #2 0x564ac922316f  (/builds/worker/dist/bin/js+0x253616f) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #3 0x564ac9eac032  (/builds/worker/dist/bin/js+0x31bf032) (BuildId: 5d053c76aad4cfbd08259f8832e7ac78bbeeab58)
    #4 0x0dec477ca22e  (<unknown module>)
```

Without this change, the following symbolization is output:

```
    #0 0x55a6d72f980f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
    #1 0x55a6d72f980f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
    #2 0x55a6d7421a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
    #3 0x55a6d742116f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
    #4 0x55a6d80aa032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
    #4 0x2c803bd8f22e  (<unknown module>)
```

The last frame has a duplicate number. With this change the numbering is
correct:

```
    #0 0x5620c58ec80f in MOZ_CrashSequence /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:248:3
    #1 0x5620c58ec80f in Crash(JSContext*, unsigned int, JS::Value*) /builds/worker/checkouts/gecko/js/src/shell/js.cpp:4223:5
    #2 0x5620c5a14a64 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:501:13
    #3 0x5620c5a1416f in js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:597:12
    #4 0x5620c669d032 in js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICFallbackStub*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) /builds/worker/checkouts/gecko/js/src/jit/BaselineIC.cpp:1705:10
    #5 0x349f24c7022e  (<unknown module>)
```
swift-ci pushed a commit that referenced this pull request Aug 20, 2025
…gic (llvm#153086)

Given the test case:

```llvm
define fastcc i16 @testbtst(i16 %a) nounwind {
  entry:
    switch i16 %a, label %no [
      i16 11, label %yes
      i16 10, label %yes
      i16 9, label %yes
      i16 4, label %yes
      i16 3, label %yes
      i16 2, label %yes
    ]

  yes:
    ret i16 1

  no:
    ret i16 0
}
```

We currently get this result:

```asm
testbtst:                               ; @testbtst
; %bb.0:                                ; %entry
	move.l	%d0, %d1
	and.l	llvm#65535, %d1
	sub.l	#11, %d1
	bhi	.LBB0_3
; %bb.1:                                ; %entry
	and.l	llvm#65535, %d0
	move.l	#3612, %d1
	btst	%d0, %d1
	bne	.LBB0_3        ; <------- Erroneous condition
; %bb.2:                                ; %yes
	moveq	#1, %d0
	rts
.LBB0_3:                                ; %no
	moveq	#0, %d0
	rts
```

The cause of this is a line that explicitly reverses the `btst`
condition code. But on M68k, `btst` sets condition codes the same as
`and` with a bitmask, meaning `EQ` indicates failure (bit is zero) and
not success, so the condition does not need to be reversed.

In my testing, I've only been able to get switch statements to lower to
`btst`, so I wasn't able to explicitly test other options for lowering.
But (if possible to trigger) I believe they have the same logical error.
For example, in `LowerAndToBTST()`, a comment specifies that it's
lowering a case where the `and` result is compared against zero, which
means the corresponding `btst` condition should also not be reversed.

This patch simply flips the ternary expression in
`getBitTestCondition()` to match the ISD condition code with the same
M68k code, instead of the opposite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants