Skip to content

Conversation

bjorn3
Copy link
Member

@bjorn3 bjorn3 commented Sep 22, 2020

I was experimenting with using per function sections instead of a single .text section in cg_clif to see if it reduced binary size. It brings the size down for the test I used from 26MB to 15MB like the output of cg_llvm in debug mode. When I measured the linker time the result was a lot less positive: ~0.4s turned into ~0.7s. That is a 75% slowdown. It would be interesting to know what the cost is for cg_llvm over the full rustc-perf benchmark suite.

Can someone start a perf run for me?

I was experimenting with using per function sections instead of a single `.text` section in cg_clif to see if it reduced binary size. It brings the size down for the test I used from 26MB to 15MB like the output of cg_llvm in debug mode. When I measured the linker time the result was a lot less positive: ~0.4s turned into ~0.7s. That is a 75% slowdown. It would be interesting to know what the cost is for cg_llvm over the full rustc-perf benchmark suite.
@rust-highfive
Copy link
Contributor

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 22, 2020
@bjorn3 bjorn3 changed the title [DO NOT MERGE] Measure cost of function section [DO NOT MERGE] Measure cost of per function section Sep 22, 2020
@jonas-schievink
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Sep 22, 2020

⌛ Trying commit 71f63d0 with merge 8a4aabcca5f2e9187df7a96c20f8cba616c0be39...

@bors
Copy link
Collaborator

bors commented Sep 22, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: 8a4aabcca5f2e9187df7a96c20f8cba616c0be39 (8a4aabcca5f2e9187df7a96c20f8cba616c0be39)

@rust-timer
Copy link
Collaborator

Queued 8a4aabcca5f2e9187df7a96c20f8cba616c0be39 with parent c113030, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (8a4aabcca5f2e9187df7a96c20f8cba616c0be39): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@jyn514
Copy link
Member

jyn514 commented Sep 22, 2020

OMG that's incredible!

@jyn514 jyn514 closed this Sep 22, 2020
@bjorn3
Copy link
Member Author

bjorn3 commented Sep 22, 2020

Big wins of up to 20%. No regressions at all.

@jyn514
Copy link
Member

jyn514 commented Sep 22, 2020

(sorry, GitHub on mobile is awful)

@jyn514 jyn514 reopened this Sep 22, 2020
@jyn514 jyn514 added A-linkage Area: linking into static, shared libraries and binaries I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 22, 2020
@mati865
Copy link
Member

mati865 commented Sep 22, 2020

It brings the size down for the test I used from 26MB to 15MB like the output of cg_llvm in debug mode

Function sections are supposed to reduce the size of final binary since we pass -Wl,--gc-sections, at the cost of linker having more work to do.
Maybe something went wrong?

@bjorn3
Copy link
Member Author

bjorn3 commented Sep 22, 2020

That the final size got smaller when using per function sections as it is supposed to. I may have worded it a bit confusingly.

@eddyb eddyb marked this pull request as draft September 22, 2020 15:35
@eddyb
Copy link
Member

eddyb commented Sep 22, 2020

(Changed this to a draft to make the "[DO NOT MERGE]" part even more clear)

@bjorn3
Copy link
Member Author

bjorn3 commented Sep 28, 2020

It would also be nice to check the perf impact of not passing --gc-sections, but keeping -ffunction-sections enabled.

@bjorn3
Copy link
Member Author

bjorn3 commented Sep 30, 2020

I have replaced disabling -ffunction-sections with disabling --gc-sections. This should be possible to enable by default in debug mode. Can I get another perf run?

@lqd
Copy link
Member

lqd commented Sep 30, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Sep 30, 2020

⌛ Trying commit c566b5dcaf8e29cf8c6ca16164cd032ab9a1be4c with merge 62ede63726a44d18001f60e2a93b55965b9ca2d8...

@bors
Copy link
Collaborator

bors commented Sep 30, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: 62ede63726a44d18001f60e2a93b55965b9ca2d8 (62ede63726a44d18001f60e2a93b55965b9ca2d8)

@rust-timer
Copy link
Collaborator

Queued 62ede63726a44d18001f60e2a93b55965b9ca2d8 with parent 511ed9f, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (62ede63726a44d18001f60e2a93b55965b9ca2d8): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@lqd
Copy link
Member

lqd commented Sep 30, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Sep 30, 2020

⌛ Trying commit 6d1124e with merge 7aa1ca89673ff9147df9467391cd50b2eb496016...

@bors
Copy link
Collaborator

bors commented Sep 30, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: 7aa1ca89673ff9147df9467391cd50b2eb496016 (7aa1ca89673ff9147df9467391cd50b2eb496016)

@rust-timer
Copy link
Collaborator

Queued 7aa1ca89673ff9147df9467391cd50b2eb496016 with parent d92d28e, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (7aa1ca89673ff9147df9467391cd50b2eb496016): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@bjorn3
Copy link
Member Author

bjorn3 commented Nov 11, 2020

As of #78414 you can use -Zfunction-sections=no optionally combined with cargo build -Zbuild-std to also build the sysroot with this flag.

@bjorn3 bjorn3 closed this Nov 11, 2020
@bjorn3 bjorn3 deleted the patch-3 branch November 11, 2020 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linkage Area: linking into static, shared libraries and binaries I-compiletime Issue: Problems and improvements with respect to compile times. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants