@zyw-bot (Collaborator) commented Aug 26, 2025

Link: llvm/llvm-project#155415
Requested by: @nikic

@github-actions github-actions bot mentioned this pull request Aug 26, 2025
@zyw-bot (Collaborator, Author) commented Aug 26, 2025

Diff mode

runner: ariselab-64c-docker
baseline: llvm/llvm-project@769d5c2
patch: llvm/llvm-project#155415
sha256: 2c3de90a4ed87e8a7ccc826dbcbb772297fac1bf24ab4fd2eb019a14012e930f
commit: 9f81cbd

19800 files changed, 5876089 insertions(+), 5884141 deletions(-)

Improvements:
  loop-load-elim.NumLoopLoadEliminted 1575 -> 1883 +19.56%
  licm.NumGEPsHoisted 22577 -> 24059 +6.56%
  instcombine.NumSel 32190 -> 33336 +3.56%
  memdep.NumCacheDirtyNonLocal 304 -> 307 +0.99%
  memdep.NumCacheNonLocal 21059 -> 21245 +0.88%
  simplifycfg.NumLookupTablesHoles 2590 -> 2611 +0.81%
  indvars.NumElimIdentity 1837 -> 1847 +0.54%
  loop-simplifycfg.NumTerminatorsFolded 10698 -> 10743 +0.42%
  instcombine.NumCombined 125411575 -> 125808668 +0.32%
  memdep.NumUncacheNonLocal 26466 -> 26546 +0.30%
Regressions:
  correlated-value-propagation.NumSExt 49183 -> 48037 -2.33%
  correlated-value-propagation.NumSelects 240759 -> 237551 -1.33%
  indvars.NumElimExt 306037 -> 303299 -0.89%
  correlated-value-propagation.NumAddNSW 271868 -> 269754 -0.78%
  correlated-value-propagation.NumAddNW 500858 -> 498382 -0.49%
  simple-loop-unswitch.NumSelects 2619 -> 2609 -0.38%
  correlated-value-propagation.NumNSW 632061 -> 629953 -0.33%
  correlated-value-propagation.NumNW 1039580 -> 1037097 -0.24%
  gvn.NumGVNInstr 155875 -> 155558 -0.20%
  gvn.NumGVNPRE 155875 -> 155558 -0.20%

11 12 bench/abc/optimized/aigCanon.ll
2 3 bench/abc/optimized/extraBddMisc.ll
15 21 bench/abc/optimized/wlnRead.ll
8 9 bench/abseil-cpp/optimized/log_message.ll
11 13 bench/arrow/optimized/cpu_info.ll
21 23 bench/assimp/optimized/Q3BSPFileImporter.ll
17 21 bench/boost/optimized/process_name.ll
33 37 bench/box2d/optimized/settings.ll
40 41 bench/brotli/optimized/compound_dictionary.ll
18 20 bench/bullet3/optimized/btConvexInternalShape.ll
18 20 bench/clamav/optimized/output.ll
40 38 bench/clap-rs/optimized/421wxj3t0b5xgmkw.ll
17 18 bench/cmake/optimized/archive_blake2s_ref.ll
12 11 bench/cmake/optimized/frm_driver.ll
12 16 bench/coremark/optimized/core_main.ll
5 4 bench/coreutils-rs/optimized/2wc2yx8ferzqfnf3.ll
9 7 bench/coreutils-rs/optimized/3z39203exqv32wuh.ll
22 20 bench/coreutils-rs/optimized/h56aibhqef681ic.ll
34 35 bench/cpython/optimized/funcobject.ll
25 26 bench/curl/optimized/keylog.ll
15 17 bench/cvc5/optimized/smt2_state.ll
7 8 bench/darktable/optimized/iop_order.ll
38 30 bench/diesel-rs/optimized/2gwia6lwj254vbd7.ll
20 19 bench/double_conversion/optimized/double-to-string.ll
19 20 bench/duckdb/optimized/onepass.ll
2 1 bench/eastl/optimized/EATest.ll
16 17 bench/faiss/optimized/partitioning.ll
12 14 bench/ffmpeg/optimized/celp_math.ll
9 8 bench/ffmpeg/optimized/pixelutils.ll
15 16 bench/freetype/optimized/pshinter.ll
12 13 bench/g2o/optimized/vertex_ellipse.ll
11 12 bench/git/optimized/clar.ll
32 29 bench/git/optimized/name-hash.ll
8 6 bench/git/optimized/remote-fd.ll
9 10 bench/graphviz/optimized/agerror.ll
21 25 bench/gromacs/optimized/gmx_covar.ll
32 35 bench/gromacs/optimized/nbnxm.ll
45 38 bench/grpc/optimized/time.ll
17 19 bench/hdf5/optimized/H5Ofsinfo.ll
12 19 bench/hermes/optimized/HermesInternal.ll
13 15 bench/hermes/optimized/Math.ll
11 16 bench/hermes/optimized/hermes.ll
3 4 bench/icu/optimized/gentest.ll
8 10 bench/icu/optimized/ucnv.ll
11 12 bench/libigl/optimized/ImGuizmoWidget.ll
14 15 bench/libjpeg-turbo/optimized/jmemmgr.ll
31 30 bench/libpng/optimized/pngrutil.ll
23 24 bench/libquic/optimized/rsa.ll
13 14 bench/libquic/optimized/source_address_token.pb.ll
12 14 bench/libsodium/optimized/crypto_kx.ll
18 19 bench/libwebp/optimized/cost.ll
24 25 bench/libwebp/optimized/cost_sse2.ll
6 8 bench/lief/optimized/debug.ll
19 21 bench/linux/optimized/80003es2lan.ll
14 15 bench/linux/optimized/net.ll
18 22 bench/llama.cpp/optimized/ggml-cpu.ll
21 22 bench/llvm/optimized/ASTReaderDecl.ll
20 13 bench/llvm/optimized/DebugInfoMetadata.ll
9 8 bench/luau/optimized/Lexer.ll
30 31 bench/luau/optimized/lmem.ll
4 5 bench/lvgl/optimized/lv_file_explorer.ll
30 29 bench/lz4/optimized/lz4frame.ll
22 23 bench/meilisearch-rs/optimized/4bitt7og17dqjles.ll
25 31 bench/memcached/optimized/crawler.ll
26 24 bench/meshoptimizer/optimized/vcacheanalyzer.ll
26 24 bench/meshoptimizer/optimized/vfetchanalyzer.ll
20 21 bench/minetest/optimized/inputhandler.ll
10 14 bench/minetest/optimized/l_item.ll
7 10 bench/minetest/optimized/mesh_generator_thread.ll
16 18 bench/mold/optimized/linker-script.cc.X86_64.ll
32 44 bench/msdfgen/optimized/edge-coloring.ll
17 21 bench/nix/optimized/context.ll
21 23 bench/nix/optimized/shared.ll
19 23 bench/nlohmann_json/optimized/unit-concepts.ll
8 10 bench/node/optimized/libnode.data.ll
23 25 bench/nuttx/optimized/fs_procfs.ll
30 35 bench/nuttx/optimized/fs_procfsproc.ll
28 34 bench/ocio/optimized/Lut3DOpCPU_AVX.ll
14 12 bench/ockam-rs/optimized/17lrt90yj9gplgzp.ll
18 16 bench/ockam-rs/optimized/2zpb9qmdbtl1z92t.ll
6 7 bench/oiio/optimized/iptc.ll
30 31 bench/openblas/optimized/blas_server.ll
19 23 bench/opencv/optimized/cap_gstreamer.ll
22 24 bench/opencv/optimized/grfmt_png.ll
17 21 bench/openexr/optimized/exrmetrics.ll
10 9 bench/openjdk/optimized/NativeUtil.ll
4 6 bench/openjdk/optimized/bytecodes.ll
17 19 bench/openmpi/optimized/pctrl.ll
17 18 bench/openspiel/optimized/PBN.ll
12 13 bench/openssl/optimized/cts128.ll
21 17 bench/openssl/optimized/quic_wire_pkt.ll
8 10 bench/openusd/optimized/assetPath.ll
10 11 bench/openusd/optimized/plane.ll
20 24 bench/openvdb/optimized/AttributeArrayString.ll
9 7 bench/php/optimized/ftp_fopen_wrapper.ll
7 6 bench/php/optimized/is_tar.ll
3 5 bench/protobuf/optimized/ruby_generator.ll
11 13 bench/quantlib/optimized/fireflyalgorithm.ll
58 56 bench/quickjs/optimized/qjs.ll
14 15 bench/re2/optimized/onepass.ll
21 18 bench/recastnavigation/optimized/SampleInterfaces.ll
47 54 bench/redis/optimized/bitops.ll
30 32 bench/redis/optimized/config.ll
17 21 bench/ring-rs/optimized/4prppzcttbsz5zvc.ll
12 10 bench/ripgrep-rs/optimized/rwbxp5vay147miz.ll
2 3 bench/rocksdb/optimized/block_cache.ll
19 22 bench/rocksdb/optimized/unique_id_gen.ll
45 43 bench/ropey-rs/optimized/ch9o6osntnscbtd.ll
23 21 bench/rust-analyzer-rs/optimized/1au8fupciwcmum6.ll
19 12 bench/rust-analyzer-rs/optimized/4nk4vk785ylcn5k7.ll
17 16 bench/sdl/optimized/e_rem_pio2.ll
35 38 bench/sentencepiece/optimized/sentencepiece.pb.ll
16 19 bench/slurm/optimized/pmi1.ll
15 16 bench/slurm/optimized/slurm_step_layout.ll
7 11 bench/stat-rs/optimized/2y2d191rk1p8v5y4.ll
18 14 bench/tls-rs/optimized/1pt3w3786vo2dyk0.ll
20 18 bench/tls-rs/optimized/4vvnrvl2eceao62c.ll
39 36 bench/tree-sitter-rs/optimized/1cv8rmziqotlzxv3.ll
8 10 bench/velox/optimized/Utils.ll
30 34 bench/verilator/optimized/V3Gate.ll
13 11 bench/wasmtime-rs/optimized/4zpfk2x34146qelg.ll
29 26 bench/wireshark/optimized/packet-5co-legacy.ll
20 34 bench/wireshark/optimized/packet-gmr1_bcch.ll
6 7 bench/wireshark/optimized/packet-rtcp.ll
10 14 bench/yalantinglibs/optimized/EnumFieldGenerator.ll
12 15 bench/zed-rs/optimized/1534rgxx4q286z7j1ga0u291x.ll
17 18 bench/zed-rs/optimized/74s0htufyupfabszhrulapmbp.ll
15 14 bench/zxing/optimized/ODCode39Reader.ll
19 21 bench/zxing/optimized/ODCode39Writer.ll

@github-actions (Contributor) commented

The provided patch consists of numerous changes across multiple files, primarily involving modifications to getelementptr (GEP) instructions in LLVM IR. Here are the major changes:

  1. Simplification of GEP Instructions: Many getelementptr instructions that previously used array indexing (e.g., [N x Type]) have been simplified to direct element access (e.g., Type). For example, %23 = getelementptr inbounds nuw [13 x i32], ptr %21, i64 0, i64 %indvars.iv is changed to %23 = getelementptr inbounds nuw i32, ptr %21, i64 %indvars.iv. This change removes the need for the zero index and simplifies the addressing.

  2. Elimination of Redundant Calculations: Some patches remove redundant arithmetic operations, such as unnecessary shifts and additions, replacing them with more direct pointer arithmetic. For instance, in wlnRead.ll, the calculation of indices is streamlined by directly using the base pointer and offset.

  3. Improvement in Loop Handling: In several files, loop induction variables are updated to use more efficient pointer arithmetic, reducing the number of instructions needed for array access within loops. This is evident in changes like %28 = getelementptr inbounds nuw [99 x i32], ptr %2, i64 0, i64 %indvars.iv41 being simplified to %28 = getelementptr inbounds nuw i32, ptr %2, i64 %indvars.iv41.

  4. Enhancement of String and Array Operations: Changes in string and array operations, such as in Rtl_ShortenName, involve more direct pointer manipulation to set specific characters, reducing the number of intermediate calculations and improving performance.

  5. Optimization of Switch Tables: In functions using switch tables, the GEP instructions for accessing table entries are simplified, removing unnecessary indexing and directly accessing the required elements. For example, %switch.gep = getelementptr inbounds nuw [97 x ptr], ptr @switch.table.Rtl_NtkPrintOpers, i64 0, i64 %33 is changed to %switch.gep = getelementptr inbounds nuw ptr, ptr @switch.table.Rtl_NtkPrintOpers, i64 %33.

These changes collectively aim to improve the efficiency of the generated code by reducing the complexity of pointer arithmetic and eliminating redundant operations.
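
As a rough illustration of item 1, the sketch below models GEP address computation in Python as "base plus index times stride", wrapped to 64 bits. The helper names (`gep_array`, `gep_scalar`) and the base address are hypothetical, and the model ignores flags such as `inbounds` and `nuw`; it only shows that the array-typed form with a leading zero index and the scalar-typed form produce the same address.

```python
# Hypothetical model of GEP address arithmetic (flags such as inbounds/nuw are ignored).
MASK64 = (1 << 64) - 1  # pointers are modelled as 64-bit unsigned integers


def gep_array(base, n, elem_size, idx0, idx1):
    """Model of: getelementptr [n x T], ptr base, i64 idx0, i64 idx1."""
    array_size = n * elem_size
    return (base + idx0 * array_size + idx1 * elem_size) & MASK64


def gep_scalar(base, elem_size, idx):
    """Model of: getelementptr T, ptr base, i64 idx."""
    return (base + idx * elem_size) & MASK64


I32_SIZE = 4      # size of i32 in bytes
BASE = 0x1000     # arbitrary example address

# With a leading index of 0, the array dimension contributes nothing to the address.
same = all(
    gep_array(BASE, 13, I32_SIZE, 0, i) == gep_scalar(BASE, I32_SIZE, i)
    for i in range(13)
)
```

Because the leading index is zero, the `[13 x i32]` dimension never enters the sum, which is why the two spellings are interchangeable in this model.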

model: qwen-plus-latest
CompletionUsage(completion_tokens=533, prompt_tokens=112432, total_tokens=112965, completion_tokens_details=None, prompt_tokens_details=None)

@nikic nikic left a comment

Not fully reviewed yet, but the tl;dr here is that this causes the gep + add -> gep + gep fold to fire much more often. This causes many improvements (where the variable part of the GEP is CSEd and we're left with a constant GEP), and a few regressions where null checks are not eliminated anymore due to lost flags.
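
A minimal Python sketch of the fold mentioned above, assuming byte-sized (`i8`) GEPs and 64-bit wrapping arithmetic. The function names are hypothetical and flags are not modelled. It shows that splitting `gep(base, i + c)` into `gep(gep(base, i), c)` yields the same address, and that two accesses with different constants can share the variable part.

```python
# Hypothetical model of the "gep + add -> gep + gep" fold on i8 GEPs.
MASK64 = (1 << 64) - 1  # pointers are modelled as 64-bit unsigned integers


def gep_i8(ptr, offset):
    """Model of: getelementptr i8, ptr %ptr, i64 %offset (no flags)."""
    return (ptr + offset) & MASK64


def before_fold(base, i):
    idx = (i + (-1)) & MASK64   # %idx = add i64 %i, -1
    return gep_i8(base, idx)    # single GEP with the combined index


def after_fold(base, i):
    var_part = gep_i8(base, i)  # GEP with only the variable index
    return gep_i8(var_part, -1) # GEP with only the constant offset


def neighbours(base, i):
    """Two accesses at i-1 and i+1 reuse one variable GEP after the fold."""
    shared = gep_i8(base, i)
    return gep_i8(shared, -1), gep_i8(shared, 1)
```

In this model the shared `gep_i8(base, i)` is the part that can be CSEd, leaving only constant-offset GEPs per access.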

%.0.i.i = or i1 %6, %7
%3 = getelementptr i8, ptr %0, i64 %1
%4 = getelementptr i8, ptr %3, i64 -1
%5 = icmp eq ptr %4, null

Regression due to flag loss from gep+add reassociation.
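
One plausible way a flag can be lost in such a reassociation, sketched in Python under a deliberately simplified reading of `inbounds` (the address must stay within the object or one past its end). The object layout and helper names are hypothetical, and the thread does not say which flag was dropped in these cases; the sketch only shows why a flag that held for the combined offset need not hold for each half of the split.

```python
# Simplified, hypothetical model of an inbounds-style check; not the full LangRef semantics.

def in_bounds(obj_start, obj_size, addr):
    """True if addr lies inside the object or exactly one past its end."""
    return obj_start <= addr <= obj_start + obj_size


def single_gep_in_bounds(obj_start, obj_size, i):
    """One GEP with the combined offset i - 1."""
    return in_bounds(obj_start, obj_size, obj_start + (i - 1))


def split_gep_in_bounds(obj_start, obj_size, i):
    """Two GEPs: first offset i, then offset -1; both results must be in bounds."""
    mid = obj_start + i
    return (in_bounds(obj_start, obj_size, mid)
            and in_bounds(obj_start, obj_size, mid - 1))


OBJ_START, OBJ_SIZE = 0x3000, 4  # arbitrary 4-byte object

# i = 5: the combined offset 4 is one past the end (allowed),
# but the intermediate offset 5 is outside the object.
single_ok = single_gep_in_bounds(OBJ_START, OBJ_SIZE, 5)
split_ok = split_gep_in_bounds(OBJ_START, OBJ_SIZE, 5)
```

Here `single_ok` is true while `split_ok` is false, so in this model the split GEPs could not both carry the flag for every input.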

br i1 %.not12, label %.critedge, label %.lr.ph
%23 = getelementptr i16, ptr %22, i64 %.promoted
%24 = getelementptr i8, ptr %23, i64 -2
%25 = icmp eq ptr %24, null

Another null comparison regression.

@nikic commented Sep 1, 2025

/close

@github-actions github-actions bot closed this Sep 1, 2025
@dtcxzyw dtcxzyw deleted the test-run17239904636 branch September 1, 2025 15:02