Enable LTO for rustc_driver.so by bjorn3 · Pull Request #101403 · rust-lang/rust

bjorn3 · 2022-09-04T09:27:35Z

Alternative to #97154

This enables LTO'ing dylibs behind a feature flag and uses this feature for compiling rustc_driver.so.

rust-highfive · 2022-09-04T09:27:38Z

r? @oli-obk

(rust-highfive has picked a reviewer for you, use r? to override)

bjorn3 · 2022-09-04T09:28:25Z

@bors try @rust-timer queue

rust-timer · 2022-09-04T09:28:26Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-09-04T09:28:34Z

⌛ Trying commit d3365e41c4520a61804dbf8bdc1501523b90b7ad with merge af5e9125d450e00077d4680e4e02685ba33e9f8e...

bors · 2022-09-04T11:01:59Z

☀️ Try build successful - checks-actions
Build commit: af5e9125d450e00077d4680e4e02685ba33e9f8e

rust-timer · 2022-09-04T11:02:00Z

Queued af5e9125d450e00077d4680e4e02685ba33e9f8e with parent c2d140b, future comparison URL.

rust-timer · 2022-09-04T21:48:49Z

Finished benchmarking commit (af5e9125d450e00077d4680e4e02685ba33e9f8e): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.7%	[-10.5%, -0.5%]	227
Improvements ✅ (secondary)	-4.3%	[-10.6%, -0.6%]	259
All ❌✅ (primary)	-4.7%	[-10.5%, -0.5%]	227

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	2.9%	[1.0%, 5.1%]	27
Regressions ❌ (secondary)	3.3%	[0.9%, 11.7%]	169
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.1%	[-5.9%, -2.3%]	2
All ❌✅ (primary)	2.9%	[1.0%, 5.1%]	27

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.9%	[-10.3%, -2.1%]	177
Improvements ✅ (secondary)	-5.3%	[-15.7%, -2.3%]	177
All ❌✅ (primary)	-4.9%	[-10.3%, -2.1%]	177

¹ the arithmetic mean of the percent change
² number of relevant changes
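
For readers unfamiliar with the report format, the `mean`, `range` and `count` columns in the tables above can be sketched as follows. This is only an illustration of the arithmetic described in the footnotes, not the rust-timer implementation; `Summary` and `summarize` are hypothetical names, and the sample values in the usage note are made up rather than taken from this run.

```rust
/// Summary of a set of per-benchmark percent changes.
/// Hypothetical helper type, not part of rust-timer.
#[derive(Debug, PartialEq)]
pub struct Summary {
    /// Arithmetic mean of the percent changes (footnote 1).
    pub mean: f64,
    /// Lower end of the reported range.
    pub min: f64,
    /// Upper end of the reported range.
    pub max: f64,
    /// Number of relevant changes (footnote 2).
    pub count: usize,
}

/// Computes mean, range and count for a slice of percent changes.
/// Returns `None` for an empty slice, which the tables show as "-".
pub fn summarize(changes: &[f64]) -> Option<Summary> {
    if changes.is_empty() {
        return None;
    }
    let sum: f64 = changes.iter().sum();
    let min = changes.iter().copied().fold(f64::INFINITY, f64::min);
    let max = changes.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(Summary {
        mean: sum / changes.len() as f64,
        min,
        max,
        count: changes.len(),
    })
}
```

Calling `summarize(&[-10.0, -4.0, -1.0])` on these made-up values yields a mean of -5.0, a range of [-10.0, -1.0] and a count of 3.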

bjorn3 · 2022-09-05T09:59:55Z

This will need to add a check that the dependency formats match for all crate types. Otherwise LTO might think it needs to link in a crate at LTO time when compiling, but one of the crate types would link in a dylib containing the crate, giving duplicate symbols, or the other way around.
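
A minimal sketch of the kind of check described above, assuming a simplified model in which each output crate type carries one linkage decision per dependency. The `Linkage` enum and `mismatched_crate_types` function are hypothetical illustrations and do not reflect the actual rustc data structures.

```rust
use std::collections::BTreeMap;

/// How one dependency is linked into one output crate type.
/// Simplified, hypothetical model for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Linkage {
    NotLinked,
    Static,
    Dynamic,
}

/// Returns the crate types whose dependency formats differ from the
/// first crate type in the map. An empty result means every crate type
/// agrees, so LTO can make one consistent decision about which crates
/// to pull in at LTO time.
pub fn mismatched_crate_types(
    formats: &BTreeMap<&'static str, Vec<Linkage>>,
) -> Vec<&'static str> {
    let mut iter = formats.iter();
    // Use the first crate type as the reference format.
    let Some((_, reference)) = iter.next() else {
        return Vec::new();
    };
    iter.filter(|(_, f)| *f != reference)
        .map(|(name, _)| *name)
        .collect()
}
```

With such a check, a build that requests two crate types with different formats for the same dependency could be rejected up front instead of failing later with duplicate symbols.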

oli-obk · 2022-09-06T10:41:10Z

cc @Kobzol

Kobzol · 2022-09-06T10:43:17Z

We have actually been discussing this with bjorn3 😅 But thanks!

Kobzol · 2022-09-29T13:48:48Z

Regarding the perf. results:

- 3 to 11% wall-time reductions on primary benchmarks, without a single regression (!), and similar results for instruction counts and cycles. This is an incredible result. As an example, it reduces diesel wall time by around 10%.
- 1% reduction in bootstrap time.
- Sadly, there are some Max-RSS regressions. They are mostly around 5% for helloworld and similar small secondary crates. The primary crates have smaller regressions, up to 2%.

Overall, I think that it's a great result. The RSS hit is unfortunate, but I think it's worth taking for a ~5-10% improvement on real-world crates. Using LTO could also remedy some of the issues with small functions not being inlined across crates and requiring manual #[inline] sprinkling.
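
To illustrate the "#[inline] sprinkling" mentioned above: without LTO, the body of a small non-generic function defined in one crate is generally not available for inlining in other crates unless it is annotated. The sketch below uses a hypothetical `Span` type, not actual compiler code.

```rust
/// Hypothetical stand-in for a small, frequently used compiler type.
pub struct Span {
    lo: u32,
    hi: u32,
}

impl Span {
    /// Creates a span covering `lo..hi`; callers must pass `lo <= hi`.
    pub fn new(lo: u32, hi: u32) -> Self {
        Span { lo, hi }
    }

    /// Without cross-crate LTO, this attribute is what lets callers in
    /// other crates inline the body. With LTO across the whole dylib,
    /// the optimizer can see the body regardless of the attribute.
    #[inline]
    pub fn len(&self) -> u32 {
        self.hi - self.lo
    }
}
```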

It remains to be seen whether there will be problems with it. One thing that comes to mind is whether LTO will make perf. swings more common (i.e. modifying a small part of the code could produce large perf. changes because of LTO instability), but I suppose we have to test that in real usage.

Also I wonder if we should print some warning or bail completely for now if LTO for dylibs is used outside of rustc. We should also maybe add some configuration option (like rust.lto), only enable it for Linux x64 builds and check it in the bootstrap compilation code, since we haven't tested it anywhere else and currently LTO is being applied unconditionally for stage 2 builds.

Kobzol · 2022-09-29T18:39:48Z

r? @Mark-Simulacrum

This PR implements LTO for the librustc_driver dylib. Its use is currently intended only for this specific dylib, so it requires -Zunstable-options because of its experimental status. A new config entry, rust.lto, was added to control whether LTO is enabled. Currently it is enabled for x64 Linux dist CI builds. My previous comment discusses the perf results.
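
As a rough sketch of how a bootstrap-style build script might turn a `rust.lto` setting into compiler flags: the `LtoMode` enum, the accepted strings (`thin-local`, `thin`, `fat`) and the `dylib_lto_flags` function are assumptions made for illustration and are not taken from this PR's diff. `-Clto=thin` and `-Clto=fat` are existing rustc codegen options, and `-Zdylib-lto` is the dedicated flag named in this PR's commit list.

```rust
use std::str::FromStr;

/// Hypothetical representation of the `rust.lto` config value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LtoMode {
    /// LTO within a single crate only (assumed default).
    ThinLocal,
    /// ThinLTO across the whole dylib.
    Thin,
    /// Fat LTO across the whole dylib.
    Fat,
}

impl FromStr for LtoMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "thin-local" => Ok(LtoMode::ThinLocal),
            "thin" => Ok(LtoMode::Thin),
            "fat" => Ok(LtoMode::Fat),
            other => Err(format!("invalid value for rust.lto: {other}")),
        }
    }
}

/// Extra rustc flags for building the driver dylib in the given mode.
/// Cross-crate modes need the unstable dylib LTO opt-in as well.
pub fn dylib_lto_flags(mode: LtoMode) -> Vec<&'static str> {
    match mode {
        LtoMode::ThinLocal => vec![],
        LtoMode::Thin => vec!["-Clto=thin", "-Zdylib-lto"],
        LtoMode::Fat => vec!["-Clto=fat", "-Zdylib-lto"],
    }
}
```

Keeping the mode as an enum rather than a boolean leaves room for per-platform defaults, which matches the comment above that only x64 Linux dist builds enable it for now.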

Noratrieb · 2022-10-23T11:52:06Z

Hi, I've seen that you changed some diagnostic structs in your PR. After #103345, the way we refer to fluent messages changed. They are now in a flat namespace with the same identifier as in the fluent file. For example, parser::cool_thing is now parser_cool_thing, and parser::suggestion is now just suggestion.
You should rebase onto the latest master and change your fluent message references as described above. Thanks!

bjorn3 · 2022-10-23T12:18:05Z

Done by @Kobzol

@bors r=Mark-Simulacrum

bors · 2022-10-23T12:18:07Z

📌 Commit 565b7e0 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2022-10-23T15:00:34Z

⌛ Testing commit 565b7e0 with merge 1ca6777...

bors · 2022-10-23T18:10:45Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 1ca6777 to master...

rust-timer · 2022-10-23T20:06:19Z

Finished benchmarking commit (1ca6777): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.2%	[-9.6%, -0.4%]	230
Improvements ✅ (secondary)	-4.0%	[-9.5%, -0.4%]	257
All ❌✅ (primary)	-4.2%	[-9.6%, -0.4%]	230

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	1.7%	[0.4%, 3.1%]	27
Regressions ❌ (secondary)	3.2%	[2.1%, 5.4%]	8
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[0.4%, 3.1%]	27

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.7%	[-10.3%, -1.8%]	182
Improvements ✅ (secondary)	-5.1%	[-10.9%, -2.2%]	191
All ❌✅ (primary)	-4.7%	[-10.3%, -1.8%]	182

¹ the arithmetic mean of the percent change
² number of relevant changes

RalfJung · 2022-10-25T18:01:01Z

Looks like this possibly caused #103538

rust-highfive assigned oli-obk Sep 4, 2022

rustbot added T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 4, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 4, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2022

bjorn3 mentioned this pull request Sep 4, 2022

[WIP] Use static linking for compiling rustc in CI #97154

Closed

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 4, 2022

Kobzol mentioned this pull request Sep 7, 2022

Use linker plugin LTO for compiling rustc #101524

Closed

bjorn3 force-pushed the dylib_lto branch from d3365e4 to 88b7c40 Compare September 29, 2022 13:49

rustbot added the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Sep 29, 2022

bjorn3 and others added 4 commits October 23, 2022 13:43

Allow LTO for dylibs

32238ce

Add rust.lto config option

cba1681

Introduce dedicated -Zdylib-lto flag for enabling LTO on dylibs

c5c8680

Update LLVM submodule

565b7e0

bjorn3 force-pushed the dylib_lto branch from df4d1b1 to 565b7e0 Compare October 23, 2022 12:16

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 23, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 23, 2022

bors merged commit 1ca6777 into rust-lang:master Oct 23, 2022

rustbot added this to the 1.66.0 milestone Oct 23, 2022

bors mentioned this pull request Oct 23, 2022

Support #[global_allocator] without the allocator shim #86844

Merged

bjorn3 deleted the dylib_lto branch October 23, 2022 18:40

bors mentioned this pull request Oct 23, 2022

Port pgo.sh to Python #103019

Merged

Mark-Simulacrum mentioned this pull request Oct 25, 2022

rustc-dev component recently became a lot bigger #103538

Closed

nnethercote mentioned this pull request Oct 26, 2022

Tracking issue for speeding up rustc via its build configuration #103595

Open

30 tasks

lqd mentioned this pull request Oct 26, 2022

Enable ThinLTO for rustc on x64 msvc #103591

Merged

This was referenced Dec 13, 2022

Missing ICE info after some compiler panics #105637

Closed

Temporarily disable building rustc with ThinLTO on x86_64-unknown-linux-gnu and x86_64-pc-windows-msvc #105662

Closed

tmandry mentioned this pull request Apr 21, 2023

ICE: assertion failed: ptr::eq(context.tcx.gcx as *const _ as *const (), tcx.gcx as *const _ as *const ()), compiler/rustc_middle/src/ty/context/tls.rs #110564

Open

bjorn3 mentioned this pull request May 26, 2023

Allow linking against dylibs in LTO mode #31854

Open

bjorn3 mentioned this pull request Jul 13, 2023

Linking dylib with "lto = true": assertion failed: !is_full_lto_enabled(sess) #50324

Closed
