Ignore intrinsic calls in cross-crate-inlining cost model #145910

saethlin · 2025-08-27T01:08:42Z

I noticed in a side project that a function which just compares two [u64; 2] arrays for equality is not cross-crate-inlinable. That surprised me because I didn't think that code contained a function call, but of course our array comparisons are lowered to an intrinsic. An intrinsic call doesn't stop a function from being a leaf, so it makes sense to add this as an exception to the "only leaves" cross-crate-inline heuristic.
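For illustration, a minimal sketch of the kind of function described above. The name `same_key` is hypothetical and not taken from the side project; the body contains no visible call, yet per the description the `==` on arrays is lowered to an intrinsic call.

```rust
/// Compares two fixed-size arrays for equality.
///
/// There is no explicit function call here, but (as described above) the
/// array comparison is lowered to an intrinsic call, which the "only leaves"
/// heuristic previously counted against cross-crate inlining.
pub fn same_key(a: &[u64; 2], b: &[u64; 2]) -> bool {
    a == b
}
```

A downstream crate calling `same_key` is the case where cross-crate inlinability matters.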

This is the useful compare link: https://perf.rust-lang.org/compare.html?start=7cb1a81145a739c4fd858abe3c624ce8e6e5f9cd&end=c3f0a64dbf9fba4722dacf8e39d2fe00069c995e&stat=instructions%3Au because it disables CGU merging in both commits, which excludes effects where changes in the sysroot perturb partitioning downstream. Perturbations to what is and isn't cross-crate-inlinable in the sysroot have chaotic effects on which items end up in which CGUs after merging. It looks like, before this PR, some of the CGUs dirtied by the patch in eza incr-unchanged happened by sheer luck to be merged together, and with this PR they are not.

The perf runs on this PR point to a nice runtime performance improvement.

saethlin · 2025-08-27T01:08:51Z

@bors try @rust-timer queue

Ignore intrinsic calls in cross-crate-inlining cost model

joshtriplett · 2025-08-27T02:18:32Z

compiler/rustc_mir_transform/src/cross_crate_inline.rs

+                if let Some((fn_def_id, _)) = func.const_fn_def() {
+                    if self.tcx.intrinsic(fn_def_id).is_some() {


Nit: this would benefit from combining into one if using either let-chaining or is_some_and.
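A minimal sketch of the `is_some_and` form of this nit. `DefId`, `Operand`, and `Tcx` below are hypothetical stand-ins so the example compiles on its own; they are not the real rustc types, and only the method names `const_fn_def` and `intrinsic` mirror the diff.

```rust
// Hypothetical stand-ins for the compiler types (NOT the real rustc API).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct DefId(u32);

struct Operand {
    callee: Option<DefId>,
}

impl Operand {
    // Mirrors the shape used in the diff: an optional (DefId, args) pair.
    fn const_fn_def(&self) -> Option<(DefId, ())> {
        self.callee.map(|id| (id, ()))
    }
}

struct Tcx {
    intrinsics: Vec<DefId>,
}

impl Tcx {
    // Returns Some(..) only when the given DefId is registered as an intrinsic.
    fn intrinsic(&self, id: DefId) -> Option<&'static str> {
        self.intrinsics.contains(&id).then_some("intrinsic")
    }
}

// The two nested `if`s collapsed into a single expression.
fn is_intrinsic_call(tcx: &Tcx, func: &Operand) -> bool {
    func.const_fn_def()
        .is_some_and(|(fn_def_id, _)| tcx.intrinsic(fn_def_id).is_some())
}
```

A let-chain (`if let Some((fn_def_id, _)) = func.const_fn_def() && self.tcx.intrinsic(fn_def_id).is_some()`) would express the same condition without the closure.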

rust-bors · 2025-08-27T03:29:12Z

☀️ Try build successful (CI)
Build commit: e8d1f9d (e8d1f9d5716f4389b8330b02fb30ec690c68624a, parent: 160e7623e8cbbf1feab2b6e2a24733a98c7bde9c)

rust-timer · 2025-08-27T04:40:23Z

Finished benchmarking commit (e8d1f9d): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	1.1%	[0.4%, 2.6%]	8
Regressions ❌ (secondary)	0.3%	[0.1%, 0.5%]	4
Improvements ✅ (primary)	-0.5%	[-0.7%, -0.3%]	4
Improvements ✅ (secondary)	-0.3%	[-0.6%, -0.0%]	15
All ❌✅ (primary)	0.6%	[-0.7%, 2.6%]	12

Max RSS (memory usage)

Results (primary -1.1%, secondary -3.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.0%	[1.0%, 1.0%]	1
Regressions ❌ (secondary)	3.0%	[0.8%, 5.1%]	2
Improvements ✅ (primary)	-2.2%	[-3.6%, -0.8%]	2
Improvements ✅ (secondary)	-4.0%	[-5.9%, -1.8%]	13
All ❌✅ (primary)	-1.1%	[-3.6%, 1.0%]	3

Cycles

Results (primary 0.8%, secondary 0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.7%	[2.1%, 3.2%]	2
Regressions ❌ (secondary)	3.0%	[2.1%, 4.0%]	4
Improvements ✅ (primary)	-2.9%	[-2.9%, -2.9%]	1
Improvements ✅ (secondary)	-4.2%	[-6.0%, -2.5%]	2
All ❌✅ (primary)	0.8%	[-2.9%, 3.2%]	3

Binary size

Results (primary 0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 1.0%]	32
Regressions ❌ (secondary)	0.1%	[0.1%, 0.3%]	10
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	14
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.1%]	37
All ❌✅ (primary)	0.1%	[-0.2%, 1.0%]	46

Bootstrap: 466.645s -> 466.461s (-0.04%)
Artifact size: 391.15 MiB -> 391.41 MiB (0.07%)

saethlin · 2025-08-27T18:36:24Z

@bors try @rust-timer queue

Ignore intrinsic calls in cross-crate-inlining cost model

saethlin · 2025-08-27T18:45:42Z

@bors try cancel

rust-bors · 2025-08-27T18:45:45Z

Try build cancelled. Cancelled workflows:

https://github.com/rust-lang/rust/actions/runs/17275566119

saethlin · 2025-08-27T18:45:50Z

@bors try @rust-timer queue

Ignore intrinsic calls in cross-crate-inlining cost model

rust-bors · 2025-08-27T21:11:42Z

☀️ Try build successful (CI)
Build commit: 0f272e5 (0f272e5b0ae53eac2844ed412fcedfbe9ecf3a9d, parent: 3c91be712d3d84f6345cd22eae34c47b3a22a3d3)

rust-timer · 2025-08-27T22:31:37Z

Finished benchmarking commit (0f272e5): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	1.1%	[0.1%, 2.8%]	9
Regressions ❌ (secondary)	1.8%	[0.1%, 3.0%]	10
Improvements ✅ (primary)	-0.5%	[-0.6%, -0.3%]	4
Improvements ✅ (secondary)	-0.3%	[-0.6%, -0.1%]	14
All ❌✅ (primary)	0.6%	[-0.6%, 2.8%]	13

Max RSS (memory usage)

Results (primary -0.0%, secondary -2.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.9%	[1.6%, 6.2%]	2
Regressions ❌ (secondary)	4.2%	[3.3%, 5.1%]	3
Improvements ✅ (primary)	-3.9%	[-4.7%, -3.1%]	2
Improvements ✅ (secondary)	-4.1%	[-6.7%, -1.7%]	13
All ❌✅ (primary)	-0.0%	[-4.7%, 6.2%]	4

Cycles

Results (primary 2.5%, secondary 0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.5%	[2.3%, 2.7%]	2
Regressions ❌ (secondary)	3.1%	[2.1%, 4.3%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.6%	[-5.5%, -2.0%]	4
All ❌✅ (primary)	2.5%	[2.3%, 2.7%]	2

Binary size

Results (primary 0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 1.1%]	32
Regressions ❌ (secondary)	0.2%	[0.1%, 0.3%]	10
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	15
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.1%]	37
All ❌✅ (primary)	0.1%	[-0.2%, 1.1%]	47

Bootstrap: 468.329s -> 467.725s (-0.13%)
Artifact size: 391.15 MiB -> 391.41 MiB (0.07%)

saethlin · 2025-09-06T00:45:20Z

@Kobzol I figured out this case, see the updated PR description

rustbot · 2025-09-06T00:45:26Z

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2025-09-06T00:45:28Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

cjgillot · 2025-09-07T17:55:10Z

Thanks a lot for working on this. I wonder how much we should extend this to simple wrapper functions that do almost nothing except calling another function.

For perf triage: the perf run results in CGU shuffling, with wild changes in perf. Without this CGU effect, this PR is a net improvement.

@bors r+

bors · 2025-09-07T17:55:13Z

📌 Commit ab91a63 has been approved by cjgillot

It is now in the queue for this repository.

saethlin · 2025-09-07T19:35:34Z

> I wonder how much we should extend this to simple wrapper functions that do almost nothing except calling another function.

I tried that before in #116898, and at a glance it has the same CGU shuffling problem and needs the same comparison trick I used here. Also that PR is 2 years old so the perf might be completely different now.

bors · 2025-09-08T03:03:25Z

⌛ Testing commit ab91a63 with merge a09fbe2...

bors · 2025-09-08T06:20:17Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing a09fbe2 to master...

github-actions · 2025-09-08T06:23:07Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 2f3f27b (parent) -> a09fbe2 (this PR)

Test differences

Show 1 test diff

1 doctest diff were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard a09fbe2c8372643a27a8082236120f95ed4e6bba --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

pr-check-1: 1348.2s -> 1538.3s (14.1%)
x86_64-gnu-tools: 3372.4s -> 3747.5s (11.1%)
dist-aarch64-apple: 6537.0s -> 7224.9s (10.5%)
x86_64-gnu-llvm-19-2: 6619.3s -> 5980.8s (-9.6%)
pr-check-2: 2256.5s -> 2440.5s (8.2%)
x86_64-gnu-llvm-19: 2528.2s -> 2728.7s (7.9%)
aarch64-msvc-1: 6585.9s -> 7091.6s (7.7%)
x86_64-rust-for-linux: 3032.3s -> 2810.3s (-7.3%)
aarch64-apple: 5716.5s -> 6132.0s (7.3%)
aarch64-msvc-2: 4939.7s -> 5244.2s (6.2%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-09-08T07:30:14Z

Finished benchmarking commit (a09fbe2): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

If the regression was expected or you think it can be justified,
please write a comment with sufficient written justification, and add
@rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
If you think that you know of a way to resolve the regression, try to create
a new PR with a fix for the regression.
If you do not understand the regression or you think that it is just noise,
you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	1.0%	[0.3%, 2.4%]	10
Regressions ❌ (secondary)	1.9%	[0.2%, 3.0%]	9
Improvements ✅ (primary)	-0.5%	[-0.7%, -0.4%]	5
Improvements ✅ (secondary)	-0.4%	[-0.6%, -0.1%]	14
All ❌✅ (primary)	0.5%	[-0.7%, 2.4%]	15

Max RSS (memory usage)

Results (primary 2.1%, secondary 3.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	7.8%	[7.8%, 7.8%]	1
Regressions ❌ (secondary)	4.0%	[1.5%, 6.8%]	11
Improvements ✅ (primary)	-3.5%	[-3.5%, -3.5%]	1
Improvements ✅ (secondary)	-2.3%	[-2.8%, -1.8%]	2
All ❌✅ (primary)	2.1%	[-3.5%, 7.8%]	2

Cycles

Results (primary 1.6%, secondary 2.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.6%	[1.3%, 2.0%]	3
Regressions ❌ (secondary)	3.9%	[2.7%, 4.8%]	6
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.2%	[-4.2%, -4.2%]	1
All ❌✅ (primary)	1.6%	[1.3%, 2.0%]	3

Binary size

Results (primary 0.0%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 0.9%]	27
Regressions ❌ (secondary)	0.1%	[0.1%, 0.3%]	10
Improvements ✅ (primary)	-0.1%	[-0.2%, -0.0%]	25
Improvements ✅ (secondary)	-0.1%	[-0.2%, -0.1%]	40
All ❌✅ (primary)	0.0%	[-0.2%, 0.9%]	52

Bootstrap: 468.032s -> 468.082s (0.01%)
Artifact size: 387.45 MiB -> 387.72 MiB (0.07%)

Starting with Rust 1.91.0 (released 2025-10-30), in upstream commit ab91a63d403b ("Ignore intrinsic calls in cross-crate-inlining cost model") [1][2], `bindings.o` stops containing DWARF debug information because the `Default` implementations contained `write_bytes()` calls which are now ignored in that cost model (note that `CLIPPY=1` does not reproduce it).

This means `gendwarfksyms` complains:

      RUSTC L rust/bindings.o
    error: gendwarfksyms: process_module: dwarf_get_units failed: no debugging information?

For the moment, conditionally skip `gendwarfksyms` for Rust >= 1.91.0.

Cc: [email protected] # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Reported-by: Haiyue Wang <[email protected]>
Closes: https://lore.kernel.org/rust-for-linux/[email protected]/
Link: rust-lang/rust@ab91a63 [1]
Link: rust-lang/rust#145910 [2]
Signed-off-by: Miguel Ojeda <[email protected]>

Starting with Rust 1.91.0 (released 2025-10-30), in upstream commit ab91a63d403b ("Ignore intrinsic calls in cross-crate-inlining cost model") [1][2], `bindings.o` stops containing DWARF debug information because the `Default` implementations contained `write_bytes()` calls which are now ignored in that cost model (note that `CLIPPY=1` does not reproduce it).

This means `gendwarfksyms` complains:

      RUSTC L rust/bindings.o
    error: gendwarfksyms: process_module: dwarf_get_units failed: no debugging information?

There are several alternatives that would work here: conditionally skipping in the cases needed (but that is subtle and brittle), forcing DWARF generation with e.g. a dummy `static` (ugly and we may need to do it in several crates), skipping the call to the tool in the Kbuild command when there are no exports (fine) or teaching the tool to do so itself (simple and clean).

Thus do the last one: don't attempt to process files if we have no symbol versions to calculate.

  [ I used the commit log of my patch linked below since it explained the root issue and expanded it a bit more to summarize the alternatives. - Miguel ]

Cc: [email protected] # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Reported-by: Haiyue Wang <[email protected]>
Closes: https://lore.kernel.org/rust-for-linux/[email protected]/
Suggested-by: Miguel Ojeda <[email protected]>
Link: https://lore.kernel.org/rust-for-linux/CANiq72nKC5r24VHAp9oUPR1HVPqT+=0ab9N0w6GqTF-kJOeiSw@mail.gmail.com/
Link: rust-lang/rust@ab91a63 [1]
Link: rust-lang/rust#145910 [2]
Signed-off-by: Sami Tolvanen <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>

Starting with Rust 1.91.0 (released 2025-10-30), in upstream commit ab91a63d403b ("Ignore intrinsic calls in cross-crate-inlining cost model") [1][2], `bindings.o` stops containing DWARF debug information because the `Default` implementations contained `write_bytes()` calls which are now ignored in that cost model (note that `CLIPPY=1` does not reproduce it).

This means `gendwarfksyms` complains:

      RUSTC L rust/bindings.o
    error: gendwarfksyms: process_module: dwarf_get_units failed: no debugging information?

There are several alternatives that would work here: conditionally skipping in the cases needed (but that is subtle and brittle), forcing DWARF generation with e.g. a dummy `static` (ugly and we may need to do it in several crates), skipping the call to the tool in the Kbuild command when there are no exports (fine) or teaching the tool to do so itself (simple and clean).

Thus do the last one: don't attempt to process files if we have no symbol versions to calculate.

  [ I used the commit log of my patch linked below since it explained the root issue and expanded it a bit more to summarize the alternatives. - Miguel ]

Cc: [email protected] # Needed in 6.17.y.
Reported-by: Haiyue Wang <[email protected]>
Closes: https://lore.kernel.org/rust-for-linux/[email protected]/
Suggested-by: Miguel Ojeda <[email protected]>
Link: https://lore.kernel.org/rust-for-linux/CANiq72nKC5r24VHAp9oUPR1HVPqT+=0ab9N0w6GqTF-kJOeiSw@mail.gmail.com/
Link: rust-lang/rust@ab91a63 [1]
Link: rust-lang/rust#145910 [2]
Signed-off-by: Sami Tolvanen <[email protected]>
Tested-by: Haiyue Wang <[email protected]>
Reviewed-by: Alice Ryhl <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Miguel Ojeda <[email protected]>
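The commit messages above attribute the change to `Default` implementations that call `write_bytes()`. A rough sketch of what such an implementation might look like, assuming the generated bindings zero-initialize the struct through `ptr::write_bytes`; the struct `ExampleBinding` and its fields are hypothetical and not taken from the kernel's `bindings` crate.

```rust
// Hypothetical C-layout struct standing in for a generated binding.
#[repr(C)]
#[derive(Debug)]
pub struct ExampleBinding {
    pub a: u32,
    pub b: [u8; 4],
}

impl Default for ExampleBinding {
    fn default() -> Self {
        let mut s = std::mem::MaybeUninit::<Self>::uninit();
        // SAFETY: every field of this struct is valid when all bytes are zero,
        // and `write_bytes` fills exactly one `Self` worth of memory.
        unsafe {
            // `write_bytes` is the only call in this body; if it is treated as
            // an intrinsic call and ignored by the cost model, the function
            // counts as a leaf.
            std::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
            s.assume_init()
        }
    }
}
```

Under that assumption, such a `default()` becomes eligible for cross-crate inlining, which is consistent with the commit messages' report that `bindings.o` no longer carries DWARF debug information.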

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 27, 2025

rust-bors bot added a commit that referenced this pull request Aug 27, 2025

Auto merge of #145910 - saethlin:ignore-intrinsic-calls, r=<try>

e8d1f9d

Ignore intrinsic calls in cross-crate-inlining cost model

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 27, 2025

joshtriplett reviewed Aug 27, 2025

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 27, 2025

rust-bors bot added a commit that referenced this pull request Aug 27, 2025

Auto merge of #145910 - saethlin:ignore-intrinsic-calls, r=<try>

551798d

Ignore intrinsic calls in cross-crate-inlining cost model

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 27, 2025

rust-bors bot added a commit that referenced this pull request Aug 27, 2025

Auto merge of #145910 - saethlin:ignore-intrinsic-calls, r=<try>

0f272e5

Ignore intrinsic calls in cross-crate-inlining cost model

saethlin force-pushed the ignore-intrinsic-calls branch from 5dc2b2e to 53bb74b Compare September 6, 2025 00:34

Ignore intrinsic calls in cross-crate-inlining cost model

ab91a63

saethlin force-pushed the ignore-intrinsic-calls branch from 53bb74b to ab91a63 Compare September 6, 2025 00:44

saethlin marked this pull request as ready for review September 6, 2025 00:45

rustbot assigned jieyouxu Sep 6, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 6, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 7, 2025

cjgillot added the perf-regression-triaged The performance regression has been triaged. label Sep 7, 2025

jieyouxu assigned cjgillot and unassigned jieyouxu Sep 7, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 8, 2025

bors merged commit a09fbe2 into rust-lang:master Sep 8, 2025
11 checks passed

rustbot added this to the 1.91.0 milestone Sep 8, 2025

saethlin deleted the ignore-intrinsic-calls branch September 8, 2025 06:33

saethlin mentioned this pull request Oct 5, 2025

Avoid LocalCopy instantiation for #[inline] on -Copt-level=0 #147351

Draft
