Skip to content

Conversation

@AndyAyersMS
Copy link
Member

The inline budget is set by the root method being compiled. If we inline a large method into a small one, we sometimes run out of budget trying to inline the large method's callees.

Increase the budget from 20 to 22 to give us a bit more breathing room.

Fixes some regressions from #114996 / #116054

The inline budget is set by the root method being compiled. If we inline
a large method into a small one, we sometimes run out of budget trying
to inline the large method's callees.

Increase the budget from 20 to 22 to give us a bit more breathing room.

Fixes some regressions from dotnet#114996 / dotnet#116054
Copilot AI review requested due to automatic review settings August 12, 2025 15:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR increases the JIT inline budget from 20 to 22 to address performance regressions introduced in previous changes. The inline budget controls how much estimated compile time increase is allowed through method inlining.

  • Increases the DEFAULT_INLINE_BUDGET constant from 20 to 22
  • Addresses scenarios where inlining large methods into small ones exhausts the budget before the large method's callees can be inlined
  • Fixes regressions from previous JIT optimizations

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 12, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

Locally on my box

baseline: 45
***** 20
| FormatterDecimal | 123456.789 | 68.46 ns | 1.235 ns | 1.155 ns | 68.70 ns | 66.59 ns | 71.06 ns |         - |
| FormatterDecimal | 123456.789 | 68.58 ns | 1.008 ns | 0.943 ns | 68.60 ns | 66.79 ns | 70.01 ns |         - |
| FormatterDecimal | 123456.789 | 40.62 ns | 0.682 ns | 0.638 ns | 40.29 ns | 39.55 ns | 41.81 ns |         - |
| FormatterDecimal | 123456.789 | 68.15 ns | 1.418 ns | 1.633 ns | 68.03 ns | 65.81 ns | 71.17 ns |         - |
| FormatterDecimal | 123456.789 | 68.30 ns | 1.540 ns | 1.773 ns | 67.78 ns | 65.93 ns | 72.92 ns |         - |
***** 21
| FormatterDecimal | 123456.789 | 36.85 ns | 0.739 ns | 0.821 ns | 36.62 ns | 35.65 ns | 38.08 ns |         - |
| FormatterDecimal | 123456.789 | 38.49 ns | 0.807 ns | 0.863 ns | 38.76 ns | 36.76 ns | 39.47 ns |         - |
| FormatterDecimal | 123456.789 | 45.16 ns | 0.866 ns | 0.810 ns | 45.16 ns | 43.89 ns | 46.35 ns |         - |
| FormatterDecimal | 123456.789 | 43.05 ns | 0.851 ns | 0.835 ns | 42.94 ns | 41.76 ns | 44.59 ns |         - |
| FormatterDecimal | 123456.789 | 45.21 ns | 0.924 ns | 0.864 ns | 45.17 ns | 43.88 ns | 46.75 ns |         - |
***** 22
| FormatterDecimal | 123456.789 | 35.85 ns | 0.599 ns | 0.560 ns | 35.85 ns | 34.73 ns | 36.99 ns |         - |
| FormatterDecimal | 123456.789 | 38.21 ns | 0.619 ns | 0.579 ns | 38.09 ns | 37.21 ns | 39.22 ns |         - |
| FormatterDecimal | 123456.789 | 35.21 ns | 0.785 ns | 0.904 ns | 35.30 ns | 33.59 ns | 37.29 ns |         - |
| FormatterDecimal | 123456.789 | 35.34 ns | 0.730 ns | 0.716 ns | 35.24 ns | 34.19 ns | 37.07 ns |         - |
| FormatterDecimal | 123456.789 | 38.67 ns | 0.803 ns | 0.788 ns | 38.62 ns | 37.35 ns | 40.09 ns |         - |

and

baseline: 90
***** 20
| Decode_DecodingRequired | 97.17 ns | 1.962 ns | 1.739 ns | 96.72 ns | 95.03 ns | 100.8 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 97.99 ns | 2.766 ns | 2.960 ns | 97.60 ns | 92.90 ns | 102.3 ns | 0.0341 |     216 B |
| Decode_DecodingRequired | 94.46 ns | 2.287 ns | 2.634 ns | 94.54 ns | 90.68 ns | 99.26 ns | 0.0344 |     216 B |
| Decode_DecodingRequired | 96.87 ns | 2.415 ns | 2.781 ns | 96.27 ns | 92.78 ns | 102.7 ns | 0.0343 |     216 B |
| Decode_DecodingRequired | 101.4 ns |  1.50 ns |  1.40 ns | 101.3 ns | 98.71 ns | 103.3 ns | 0.0341 |     216 B |
***** 21
| Decode_DecodingRequired | 83.74 ns | 1.674 ns | 1.566 ns | 84.11 ns | 80.29 ns | 85.55 ns | 0.0343 |     216 B |
| Decode_DecodingRequired | 80.26 ns | 1.170 ns | 1.037 ns | 80.43 ns | 78.02 ns | 81.68 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 83.16 ns | 1.599 ns | 1.496 ns | 83.25 ns | 80.98 ns | 85.58 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 82.10 ns | 1.566 ns | 1.465 ns | 81.83 ns | 80.19 ns | 84.77 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 82.50 ns | 1.530 ns | 1.432 ns | 82.27 ns | 80.49 ns | 85.39 ns | 0.0341 |     216 B |
***** 22
| Decode_DecodingRequired | 83.84 ns | 1.556 ns | 1.456 ns | 83.67 ns | 81.61 ns | 86.10 ns | 0.0343 |     216 B |
| Decode_DecodingRequired | 88.27 ns | 2.496 ns | 2.875 ns | 88.95 ns | 82.64 ns | 93.10 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 82.22 ns | 1.000 ns | 0.835 ns | 82.54 ns | 80.77 ns | 83.64 ns | 0.0342 |     216 B |
| Decode_DecodingRequired | 80.53 ns | 1.614 ns | 1.794 ns | 80.55 ns | 78.02 ns | 85.28 ns | 0.0343 |     216 B |
| Decode_DecodingRequired | 81.59 ns | 1.951 ns | 2.246 ns | 81.41 ns | 77.85 ns | 85.67 ns | 0.0341 |     216 B |

@tannergooding
Copy link
Member

The inline budget is set by the root method being compiled. If we inline a large method into a small one, we sometimes run out of budget trying to inline the large method's callees.

Might it be worth special casing this particular scenario, especially for cases like forwarding methods which are common for public entry points?

One could imagine that if we have a simple forwarding call or a trivial method that calls a big method then we adjust the budget to the larger. -- I'm sure there's some tuning that could be done to find a good balance on when/why we override the base budget

@AndyAyersMS
Copy link
Member Author

The inline budget is set by the root method being compiled. If we inline a large method into a small one, we sometimes run out of budget trying to inline the large method's callees.

Might it be worth special casing this particular scenario, especially for cases like forwarding methods which are common for public entry points?

One could imagine that if we have a simple forwarding call or a trivial method that calls a big method then we adjust the budget to the larger. -- I'm sure there's some tuning that could be done to find a good balance on when/why we override the base budget

Yeah I think we need to revisit this whole area.

Another thing we have in mind is to prioritize inlines instead of inlining in IL order and "depth first" like we do now. Our current approach often leaves simple top-level inlines on the table when the budget is reached. Unfortunately this is not a simple change.

@tannergooding
Copy link
Member

Another thing we have in mind is to prioritize inlines instead of inlining in IL order and "depth first" like we do now. Our current approach often leaves simple top-level inlines on the table when the budget is reached. Unfortunately this is not a simple change.

👍. Particularly if we could do something like "below always inline threshold" first, then some mix of AggressiveInlining weighted against IL size, I think we'd see some improvements in the larger edge cases

@AndyAyersMS
Copy link
Member Author

@EgorBot --arm --intel --filter System.Buffers.Text.Tests.Utf8FormatterTests.FormatterDecimal(value: 123456.789)

@EgorBo
Copy link
Member

EgorBo commented Aug 12, 2025

@AndyAyersMS to be fair, I'm not sure it's going to work on an already merged PR, I think you actually need --commit 4a5eb9de38e0f45937e5a688a980da32f3a5f83c vs previous

@EgorBo
Copy link
Member

EgorBo commented Aug 12, 2025

@EgorBot --arm --intel --commit 4a5eb9d vs previous --filter System.Buffers.Text.Tests.Utf8FormatterTests.FormatterDecimal(value: 123456.789)

@AndyAyersMS
Copy link
Member Author

Ah, thanks... in these results I assume PR is main before this PR?

@EgorBo
Copy link
Member

EgorBo commented Aug 12, 2025

Ah, thanks... in these results I assume PR is main before this PR?

Yeah, it is something I need to fix, or at least rename to "commit" vs "previous". So yeah, in the results "PR" is actually showing changes before your PR.

@DrewScoggins
Copy link
Member

Strange bounce back behavior from this change. dotnet/perf-autofiling-issues#61217

@github-actions github-actions bot locked and limited conversation to collaborators Sep 26, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants