Skip to content

Enable LTO for rustc_driver.so#101403

Merged
bors merged 5 commits intorust-lang:masterfrom
bjorn3:dylib_lto
Oct 23, 2022
Merged

Enable LTO for rustc_driver.so#101403
bors merged 5 commits intorust-lang:masterfrom
bjorn3:dylib_lto

Conversation

@bjorn3
Copy link
Member

@bjorn3 bjorn3 commented Sep 4, 2022

Alternative to #97154

This enables LTO'ing dylibs behind a feature flag and uses this feature for compiling rustc_driver.so.

@rustbot rustbot added T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 4, 2022
@rust-highfive
Copy link
Contributor

r? @oli-obk

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 4, 2022
@bjorn3
Copy link
Member Author

bjorn3 commented Sep 4, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2022
@bors
Copy link
Collaborator

bors commented Sep 4, 2022

⌛ Trying commit d3365e41c4520a61804dbf8bdc1501523b90b7ad with merge af5e9125d450e00077d4680e4e02685ba33e9f8e...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Sep 4, 2022

☀️ Try build successful - checks-actions
Build commit: af5e9125d450e00077d4680e4e02685ba33e9f8e (af5e9125d450e00077d4680e4e02685ba33e9f8e)

@rust-timer
Copy link
Collaborator

Queued af5e9125d450e00077d4680e4e02685ba33e9f8e with parent c2d140b, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (af5e9125d450e00077d4680e4e02685ba33e9f8e): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-4.7% [-10.5%, -0.5%] 227
Improvements ✅
(secondary)
-4.3% [-10.6%, -0.6%] 259
All ❌✅ (primary) -4.7% [-10.5%, -0.5%] 227

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
2.9% [1.0%, 5.1%] 27
Regressions ❌
(secondary)
3.3% [0.9%, 11.7%] 169
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-4.1% [-5.9%, -2.3%] 2
All ❌✅ (primary) 2.9% [1.0%, 5.1%] 27

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-4.9% [-10.3%, -2.1%] 177
Improvements ✅
(secondary)
-5.3% [-15.7%, -2.3%] 177
All ❌✅ (primary) -4.9% [-10.3%, -2.1%] 177

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2022
@bjorn3
Copy link
Member Author

bjorn3 commented Sep 5, 2022

This will need to add a check that the dependency formats match for all crate types. Otherwise LTO might think it needs to link in a crate at LTO time when compiling, but one of the crate types would link in a dylib containing the crate, giving duplicate symbols, or the other way around.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 6, 2022

cc @Kobzol

@Kobzol
Copy link
Member

Kobzol commented Sep 6, 2022

We have actually been discussing this with bjorn3 😅 But thanks!

@Kobzol
Copy link
Member

Kobzol commented Sep 29, 2022

Regarding the perf. results:

  • -3 to -11% wall-time reductions on primary benchmarks, without a single regression (!) and similar results for instruction counts and cycles. This is an incredible result. As an example, it improves diesel walltime by around -10%.
  • 1% reduction in bootstrap time.
  • Sadly there are some Max-RSS regressions. It's mostly around 5% for helloworld and similar small secondary crates. The primary crates have smaller regressions, up to 2%.

Overall, I think that it's a great result. The RSS hit is unfortunate, but I think that it's worth it to take it for ~5-10% improvement on real world crates. Using LTO could also remedy some of the issues with small functions not being inlined and requiring manual #[inline] sprinkling.

It remains to be seen if there will be some problems with it. One thing that comes to mind is whether LTO won't make perf. swings more common (i.e. modifying a small part of code could produce large perf. changes because of LTO instability), but we have to test that in real usage I suppose.

Also I wonder if we should print some warning or bail completely for now if LTO for dylibs is used outside of rustc. We should also maybe add some configuration option (like rust.lto), only enable it for Linux x64 builds and check it in the bootstrap compilation code, since we haven't tested it anywhere else and currently LTO is being applied unconditionally for stage 2 builds.

@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Sep 29, 2022
@Kobzol
Copy link
Member

Kobzol commented Sep 29, 2022

r? @Mark-Simulacrum

This PR implements LTO for the librustc_driver dylib. It's usage is currently only intended for this specific dylib, therefore it requires -Zunstable-options because of its experimental status. A new config entry rust.lto was added, which controls whether the LTO will be enabled. Currently it is enabled for x64 Linux dist CI builds. My previous comment discusses the perf results.

@rust-log-analyzer

This comment has been minimized.

@Noratrieb
Copy link
Member

Hi, I've seen you changed some diagnostic structs in your PR. After #103345, the way we refer to fluent messages changed. They are now in a flat namespace with the same identifier as in the fluent file. For example, parser::cool_thing is now parser_cool_thing and parser::suggestion just suggestion.
You should rebase to the latest master and change your fluent message references as described above. Thanks!

@bjorn3
Copy link
Member Author

bjorn3 commented Oct 23, 2022

Done by @Kobzol

@bors r=Mark-Simulacrum

@bors
Copy link
Collaborator

bors commented Oct 23, 2022

📌 Commit 565b7e0 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 23, 2022
@bors
Copy link
Collaborator

bors commented Oct 23, 2022

⌛ Testing commit 565b7e0 with merge 1ca6777...

@bors
Copy link
Collaborator

bors commented Oct 23, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 1ca6777 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 23, 2022
@bors bors merged commit 1ca6777 into rust-lang:master Oct 23, 2022
@rustbot rustbot added this to the 1.66.0 milestone Oct 23, 2022
@bjorn3 bjorn3 deleted the dylib_lto branch October 23, 2022 18:40
@bors bors mentioned this pull request Oct 23, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (1ca6777): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-4.2% [-9.6%, -0.4%] 230
Improvements ✅
(secondary)
-4.0% [-9.5%, -0.4%] 257
All ❌✅ (primary) -4.2% [-9.6%, -0.4%] 230

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
1.7% [0.4%, 3.1%] 27
Regressions ❌
(secondary)
3.2% [2.1%, 5.4%] 8
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.7% [0.4%, 3.1%] 27

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-4.7% [-10.3%, -1.8%] 182
Improvements ✅
(secondary)
-5.1% [-10.9%, -2.2%] 191
All ❌✅ (primary) -4.7% [-10.3%, -1.8%] 182

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@RalfJung
Copy link
Member

Looks like this possibly caused #103538

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-testsuite Area: The testsuite used to check the correctness of rustc A-translation Area: Translation infrastructure, and migrating existing diagnostics to SessionDiagnostic merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.