Fix variable deallocation order in panic unwinding paths #149435
sladyn98 wants to merge 9 commits into rust-lang:main
r? @wesleywiser rustbot has assigned @wesleywiser. Use |
r? @dianne
It looks like there's a bug causing an assertion failure when building the standard library. I've given it a look and offered a guess at what's causing it below. There's still work to do here beyond fixing that, though.
First, could you add a ui test to demonstrate that this fixes #147875? It looks like it might not yet, since the code for scheduling unwind drops on calls panicking looks unchanged.
Second, after verifying that this results in the correct borrow-checking behavior, we need to make sure that this change doesn't negatively affect codegen. Per the old comment on needs_cleanup, at least at the time it was written, LLVM didn't handle the unnecessary cleanup blocks and StorageDeads particularly well. If you can demonstrate with codegen tests that that's not an issue anymore, and perf isn't too bad, that might be all that's needed. But my expectation is that we'll have to get rid of or ignore the StorageDeads later in compilation (sometime after they serve their purpose in borrowck). Unless there's a reason to keep the StorageDeads around longer, my gut feeling is that this cleanup would be best as a post-borrowck MIR pass (maybe as part of CleanupPostBorrowck?), since then optimization passes can be done on cleaner MIR and we can test it works with MIR tests rather than codegen tests. Could you also add a test for this not affecting later stages of compilation? If you accomplish that by removing the unwind-path StorageDeads as part of a MIR pass, that'd be a mir-opt test.
Before you push again, you'll probably want to run the codegen and mir-opt tests to make sure the former is clean and to bless the latter. Regardless of what approach we take here, if we're changing how the MIR is built, there should be differences in the MIR building test output (part of the mir-opt suite).
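The post-borrowck cleanup suggested in this review can be pictured with a toy model. This is only a sketch: `Statement`, `BasicBlock`, `strip_unwind_storage_dead`, and `example_body` are hypothetical stand-ins invented for illustration, not rustc's MIR types or the actual CleanupPostBorrowck pass. It shows the intended effect, assuming the pass only needs to drop StorageDead statements from cleanup (unwind-path) blocks and leave everything else alone.

```rust
// Toy model only: hypothetical stand-ins, not rustc's MIR API.
#[derive(Debug, Clone, PartialEq)]
enum Statement {
    StorageDead(u32),
    Other(&'static str),
}

#[derive(Debug, Clone, PartialEq)]
struct BasicBlock {
    is_cleanup: bool,
    statements: Vec<Statement>,
}

/// Removes `StorageDead` statements from cleanup (unwind-path) blocks only.
/// Blocks on the normal execution path are left untouched.
fn strip_unwind_storage_dead(blocks: &mut [BasicBlock]) {
    for block in blocks.iter_mut().filter(|b| b.is_cleanup) {
        block
            .statements
            .retain(|s| !matches!(s, Statement::StorageDead(_)));
    }
}

/// A made-up two-block body: one normal block and one cleanup block.
fn example_body() -> Vec<BasicBlock> {
    vec![
        BasicBlock {
            is_cleanup: false,
            statements: vec![Statement::Other("assign"), Statement::StorageDead(1)],
        },
        BasicBlock {
            is_cleanup: true,
            statements: vec![Statement::StorageDead(1), Statement::Other("nop")],
        },
    ]
}

fn main() {
    let mut body = example_body();
    strip_unwind_storage_dead(&mut body);
    for (index, block) in body.iter().enumerate() {
        println!("bb{index}: {block:?}");
    }
}
```

In this model the normal-path block keeps its StorageDead, so only the unwind path changes, which is what a mir-opt test for such a pass would be expected to show.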
Reminder, once the PR becomes ready for a review, use |
Also, could you change the PR description? #147875 on its own doesn't allow destructors to access freed memory, it doesn't allow for the creation of dangling references, and I'm at least not aware of a safety guarantee that it violates. You should only get unsoundness out of it if you write unsafe code on the assumption that the borrow checker will enforce the relative drop order of locals that may have destructors and those that definitely don't. Of course, per language team decision, consistent drop order is a promise Rust would like to make. But it's not quite the same as the borrow-checker failing to ensure places outlive their references.
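For reference, the runtime side of drop order can be observed directly. The sketch below uses hypothetical names (`Noisy`, `run`, `drop_order`) and records the order in which two locals with destructors are dropped, both on normal return and while unwinding from a panic. It does not reproduce #147875: deallocation of a local without a destructor is not observable at runtime, so that issue is only about what the borrow checker assumes on the unwind path.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Declares two locals with destructors, then optionally panics.
fn run(should_panic: bool, log: &RefCell<Vec<&'static str>>) {
    let _first = Noisy { name: "first", log };
    let _second = Noisy { name: "second", log };
    if should_panic {
        panic!("unwinding through `run`");
    }
}

/// Returns the order in which the locals of `run` were dropped.
fn drop_order(should_panic: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    // The panic (if any) is caught here so the log can be inspected afterwards.
    let _ = panic::catch_unwind(AssertUnwindSafe(|| run(should_panic, &log)));
    log.into_inner()
}

fn main() {
    println!("normal return: {:?}", drop_order(false));
    println!("unwinding:     {:?}", drop_order(true));
}
```

Both paths drop the locals in reverse declaration order. The open question in this PR is whether borrowck should also see the storage of destructor-free locals end at the matching point on the unwind path.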
What I did was write a simple Rust program, panic drop.rs. I ran it through LLVM to get the intermediate representation (IR), and on looking at the IR I cannot find any llvm.lifetime.end calls. That suggests that on master the StorageDead statements are missing, which, as I understand it, means that the borrow checker does not know when the storage becomes invalid. Let me now write the UI test to see what is up.
edit: adjusted wording
Force-pushed from 5afe7c2 to 59a7e56.
This still needs CI to pass before I can review it properly. I've left a few comments on obvious things, but I don't think reviewing the code changes would be helpful at this point. Please test your changes locally. You don't have to run the whole test suite yourself, but for this change, you'll at least want to make sure that the mir-opt and codegen tests all pass, that any relevant ui tests pass, and that tidy passes as well.
Could you rebase onto a more recent commit, also? I don't expect there will be conflicts in the MIR building part of this, but I'm not sure about the rest.
I don't mean to be harsh, but this is a relatively complex and nuanced change. If you're not familiar with what's being changed, why it's being changed, the consequences/needs of that, and general contribution procedure, I'd recommend gaining familiarity with easier issues instead.
Force-pushed from 59a7e56 to 44fbdb3.
Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt
Force-pushed from d8a1764 to 2ed7b45.
Force-pushed from e6b9a18 to 1556a29.
Force-pushed from 1556a29 to 44b8dda.
Force-pushed from 424738c to f05ca97.
This commit fixes several issues related to StorageDead and ForLint drops:

1. Add StorageDead and ForLint drops to unwind_drops for all functions
   - Updated diverge_cleanup_target to include StorageDead and ForLint drops
     in the unwind_drops tree for all functions (not just coroutines), but only
     when there's a cleanup path (i.e., when there are Value or ForLint drops)
   - This ensures proper drop ordering for borrow-checking on panic paths

2. Fix break_for_tail_call to handle StorageDead and ForLint drops
   - Don't skip StorageDead drops for non-drop types
   - Adjust unwind_to pointer for StorageDead and ForLint drops, matching
     the behavior in build_scope_drops
   - Only adjust unwind_to when it's valid (not DropIdx::MAX)
   - This prevents debug assert failures when processing drops in tail calls

3. Fix index out of bounds panic when unwind_to is DropIdx::MAX
   - Added checks to ensure unwind_to != DropIdx::MAX before accessing
     unwind_drops.drop_nodes[unwind_to]
   - Only emit StorageDead on unwind paths when there's actually an unwind path
   - Only add entry points to unwind_drops when unwind_to is valid
   - This prevents panics when there's no cleanup needed

4. Add test for explicit tail calls with StorageDead drops
   - Tests that tail calls work correctly when StorageDead and ForLint drops
     are present in the unwind path
   - Verifies that unwind_to is correctly adjusted for all drop kinds

These changes make the borrow-checker stricter and more consistent by ensuring
that StorageDead statements are emitted on unwind paths for all functions when
there's a cleanup path, allowing unsafe code to rely on drop order being enforced
consistently.
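The rule in item 1 can be sketched with a toy model. Everything here is an assumption made for illustration: `DropKind`, `DropData`, `unwind_sequence`, and `example_scope` are hypothetical names, and the scheduling order in `example_scope` (storage drop at declaration, value drop after it) is assumed rather than taken from `scope.rs`. The sketch only encodes what the commit message states: storage drops join the unwind path when the scope has at least one Value or ForLint drop, and not otherwise.

```rust
// Toy model only: hypothetical stand-ins, not the real `scope.rs` structures.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DropKind {
    Value,
    ForLint,
    Storage,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct DropData {
    local: u32,
    kind: DropKind,
}

/// Drops performed on the unwind path under the rule described in item 1:
/// if the scope has any Value or ForLint drop, every scheduled drop
/// (including Storage) runs in reverse order; otherwise nothing is emitted.
fn unwind_sequence(scheduled: &[DropData]) -> Vec<DropData> {
    let has_cleanup_path = scheduled
        .iter()
        .any(|d| matches!(d.kind, DropKind::Value | DropKind::ForLint));
    if !has_cleanup_path {
        return Vec::new();
    }
    scheduled.iter().rev().copied().collect()
}

/// Assumed schedule: `_1` has a destructor, `_2` is declared later without one.
fn example_scope() -> Vec<DropData> {
    vec![
        DropData { local: 1, kind: DropKind::Storage },
        DropData { local: 1, kind: DropKind::Value },
        DropData { local: 2, kind: DropKind::Storage },
    ]
}

fn main() {
    for drop in unwind_sequence(&example_scope()) {
        println!("{drop:?}");
    }
    // A lint-only drop still counts as a cleanup path in this model.
    let lint_only = [DropData { local: 3, kind: DropKind::ForLint }];
    println!("{:?}", unwind_sequence(&lint_only));
}
```

Under these assumptions the storage of `_2` ends before the destructor of `_1` runs on the unwind path, which is the ordering borrowck would then check.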
- Add StorageDead to unwind paths for all functions (not just coroutines)
- Modify CleanupPostBorrowck to remove StorageDead from cleanup blocks
- Add tests for the fix and StorageDead removal
When processing drops in reverse order, unwind_to might not point to the current drop. Only adjust unwind_to when the drop matches what unwind_to is pointing to, rather than asserting they must match.
Fix lifetime issues in rust-analyzer where automaton doesn't live long enough for op.union(). Move op declaration inside each match arm to ensure proper lifetime scope. This fixes compilation errors that are blocking CI, though these are pre-existing issues unrelated to the StorageDead changes.
When processing drops in reverse order, unwind_to might not point to the current drop. Make the unwind_to adjustment conditional on the drop matching, matching the behavior in build_scope_drops. This prevents assertion failures when unwind_to points to a different drop than the one being processed.
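The conditional adjustment described in these commits, together with the DropIdx::MAX guard from the earlier one, can be sketched as follows. This is a toy model under stated assumptions: `NO_UNWIND` stands in for DropIdx::MAX, and `DropNode`, `advance_unwind_to`, and `example_nodes` are hypothetical names, not the compiler's own code. It shows only the control flow: advance the cursor when it is valid and points at the drop being processed, and leave it unchanged otherwise.

```rust
// Toy model only: `NO_UNWIND` is a stand-in for `DropIdx::MAX`.
const NO_UNWIND: usize = usize::MAX;

#[derive(Debug, Clone, Copy, PartialEq)]
struct DropNode {
    local: u32,
    /// Index of the next node on the unwind path, or `NO_UNWIND`.
    next: usize,
}

/// Advances `unwind_to` past the drop of `local`, but only when the cursor is
/// valid and currently points at that drop. Otherwise it is returned as is.
fn advance_unwind_to(nodes: &[DropNode], unwind_to: usize, local: u32) -> usize {
    if unwind_to == NO_UNWIND {
        return NO_UNWIND; // no unwind path: nothing to index into
    }
    match nodes.get(unwind_to) {
        Some(node) if node.local == local => node.next,
        _ => unwind_to,
    }
}

/// A made-up two-node chain: node 1 (local 2) unwinds to node 0 (local 1).
fn example_nodes() -> Vec<DropNode> {
    vec![
        DropNode { local: 1, next: NO_UNWIND },
        DropNode { local: 2, next: 0 },
    ]
}

fn main() {
    let nodes = example_nodes();
    println!("{}", advance_unwind_to(&nodes, 1, 2)); // cursor matches: advances
    println!("{}", advance_unwind_to(&nodes, 1, 1)); // cursor does not match
    println!("{}", advance_unwind_to(&nodes, NO_UNWIND, 2) == NO_UNWIND);
}
```

The trade-off in this shape is that a mismatch is silently tolerated instead of being caught by an assertion, so a scheduling bug would no longer fail loudly.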
- Remove conditional logic and optimization from diverge_cleanup_target
- Remove conditional logic from build_exit_tree
- Always add StorageDead when there are Value/ForLint drops
- Cleanup passes (CleanupPostBorrowck, RemoveNoopLandingPads) handle removal
- Fixes reviewer comments about code duplication and fragility
Simplify boolean expressions in scope.rs to fix clippy::needless_bool lint failures, and update test expectations for dropck_trait_cycle_checked and ctfe-arg-bad-borrow to match new StorageDead behavior.
Made-with: Cursor
Force-pushed from f05ca97 to 3fd5fc6.
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
The job failed. Click to see the possible cause of the failure (guessed by this bot). For more information on how to resolve CI failures of this job, visit this link.
Well, that's awkward. I took another look at the r-a error too, and I think it and the cargo error are the same thing: a Drop local is modified to reference a later-declared non-Drop local, then moved so it's dropped before the non-Drop local is deallocated. Previously this worked fine. After this change, if there's a function call before the move happens, the non-Drop local is StorageDead on the call's unwind path before the Drop local's destructor is called, even though that's not the order on the happy path. Given that this has come up multiple times in PR CI, I imagine it'd be fairly common in the wild.
I've only looked at the smaller test diffs so far, but I have some comments I'd like addressed before I dig deeper.
@rustbot author
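The pattern described in this review can be sketched as below. The names (`Tracker`, `might_unwind`, `pattern`) are hypothetical; this is a guess at a minimal shape of the code, not a minimization of the actual rust-analyzer or cargo failures. It is expected to compile without this PR's change, because the destructor-bearing local is moved before the scope ends. As the review describes it, the change would put the later-declared local's StorageDead ahead of the destructor on the unwind path of the intervening call, so borrowck would be expected to reject it.

```rust
use std::cell::Cell;

/// Hypothetical type with a destructor that reads through a borrowed reference.
struct Tracker<'a> {
    seen: Option<&'a Cell<u32>>,
}

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        if let Some(counter) = self.seen {
            counter.set(counter.get() + 1);
        }
    }
}

/// Any call works here: what matters is that it has an unwind edge.
fn might_unwind() {}

fn pattern() -> u32 {
    let mut tracker = Tracker { seen: None }; // Drop local, declared first
    let counter = Cell::new(0);               // non-Drop local, declared later
    tracker.seen = Some(&counter);            // tracker now borrows counter
    might_unwind();                           // on unwind, tracker's destructor runs
    drop(tracker);                            // moved: dropped before counter's scope ends
    counter.get()
}

fn main() {
    println!("destructor ran {} time(s)", pattern());
}
```

On the happy path the destructor always runs while `counter` is still allocated, which is why this shape reads as natural to write and is likely common in existing code.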
  bb8: {
+     _12 = const false;
      StorageDead(_6);
      StorageDead(_5);
Why's there an extra StorageDead on the normal execution path here?
What is this testing? A test for the change in static semantics should be something that compiled before this change and doesn't compile after, like the example in #147875 or a minimization of the r-a or cargo failures in CI.
I imagine this is from an old revision?
o2.set0(&o2); //~ ERROR `o2` does not live long enough
o2.set1(&o3); //~ ERROR `o3` does not live long enough
o2.set0(&o2);
o2.set1(&o3);
o3.set0(&o1); //~ ERROR `o1` does not live long enough
o3.set1(&o2); //~ ERROR `o2` does not live long enough
o3.set1(&o2);
Do you know what's going on here?
This PR fixes a soundness bug where local variables are deallocated out of order during panic unwinding, allowing destructors to access freed memory. This violates Rust's safety guarantees and has caused real-world unsoundness in crates like generatively.
This PR removes the is_generator check and unconditionally emits StorageDead statements during unwinding for ALL functions, bringing non-generator behavior in line with generators. It ensures that during unwinding, when a local variable goes out of scope, its storage is properly marked as dead via StorageDead, allowing the borrow checker to enforce the invariant that values must outlive their references even in panic paths.
Fixes #147875