- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
Preserve local scopes in generator MIR #60840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserve local scopes in generator MIR #60840
Conversation
| cc @rust-lang/lang The assumption this PR is making is that once  Let me know if I need to file an issue separately to discuss this! | 
        
          
                src/librustc_mir/build/scope.rs
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (StorageDead statements) that it removed, in order to do the analysis for the generator layout optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this unifies the unwind and generator_drop paths again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (
StorageDeadstatements) that it removed, in order to do the analysis for the generator layout optimization.
That doesn't make sense. StorageDead is unrelated to drop flags. We do not want drop flags to be created.
Additionally unwind has to drop generator arguments (we probably want to add those back) while the generator_drop path does not have them. I think you should just generate StorageDead statements for the generator_drop path, but not do the other changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, yeah I completely misunderstood what was going on here.
Still, from looking at the code I couldn't find any places where the two paths are handled any differently (after my change to generate StorageDead statements for the generator_drop path), which is why I thought this change was okay. So, I guess I'm still not 100% sure what's going on here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll split up the commit so this is easier to see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I split up the commit and reverted this change (but left it in the history for now).
| For additional context, @RalfJung pointed out that under stacked borrows, the creation of the  This means that code like this is UB: playground example | 
| This seems reasonable to me; the only thing I would ask is that we document the assumed invariant in a less ephemeral place than the code-base since it seems easy to forget for users, lang designers, and compiler devs alike. | 
This reverts commit 26a7228f0fd5929f2134ac36180eb1e8ff5e16e3.
9f6afd0    to
    26c37d7      
    Compare
  
    | 
 There is an issue on the reference to document guarantees around drop and panicing: rust-lang/reference#348 I added a reference to my above comment, so it should get included once documentation is added :) | 
| We probably want a pass to remove the StorageDead statements in the unwind path after the generator transformation. | 
| 
 Do you think it would make sense to have a  OTOH, this seemingly only affects the unwind path, of which test coverage is generally spotty at best. ;) (And Miri doesn't even support unwinding yet.) 
 I realized later that there is a gap in this argument: the  Relying on  
 I don't think we have a topic yet to track  I'll open one in the UCG repository. EDIT: It's at rust-lang/unsafe-code-guidelines#129. 
 Okay, so this is the meat of the PR as far as I am concerned: this changes the semantics of  I have a few questions about this though: 
 | 
| I think drop elaboration can turn a drop of a local into drops of its fields, which would effectively remove the implicit  | 
| Hm, that might be a problem. Depending on how we decide about the semantics for  | 
| 
 We already remove StorageLive/StorageDead statements for any of the saved locals. For the rest of them, would it buy us anything to remove them? | 
| Would it be possible to split further discussion of the proper semantics of  Note that this PR is blocking a significant generator size optimization (#60187) which is currently blocking the stabilization of async/await (#52924). The lack of this optimization is currently causing frequent stack overflows and memory usage issues for production users (Fuchsia), as well as other nightly users of async/await. I'd love it if we could land these PRs, unblocking async/await and fixing these problems for production users, and continue asynchronous discussion on the best way to model things in MIR (which, AFAICT, does not affect the validity of the optimization here). Is there a blocking issue that needs to be addressed first before we can merge? | 
| 
 Fine for me. Feel free to open an issue. :) The worst-case outcome I can see here is that we end up having to materialize the implicit  | 
| 
 Opened #61015. 
 I certainly don't have any opposition to this in theory, but it seems worth experimenting to see what solutions we can come up with that don't significantly regress compiler performance (or other ways we can recover the compiler performance). In the meantime, let's get this in-- | 
| 📌 Commit 26c37d7 has been approved by  | 
| @cramertj Did you actually review the PR? | 
| 
 Well, the idea of "observing a frozen stack" is probably incompatible with this change (and certainly incompatible with the optimization it enables for generators). Technically it isn't wrong to remove the  The reasons given in the code aren't all that compelling because it boils down to "we thought clang did it this way but actually it doesn't." My understanding is that we never changed the behavior because that would hurt performance. However, we're already going to pay the cost here of emitting these  | 
| @Zoxc Yes, I looked over it and it looked reasonable to me, but I'm much less familiar with this code than you are, so if you have specific concerns you'd like to see addressed before merging feel free to r-. | 
…mertj Preserve local scopes in generator MIR Part of #52924, depended upon by the generator layout optimization #60187. This PR adds `StorageDead` statements in more places in generators, so we can see when non-`Drop` locals have gone out of scope and recover their storage. The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions. For `Drop` locals, we modify the `MaybeStorageLive` analysis to use `drop` to indicate that storage is no longer live for the local. Once `drop` returns or unwinds to our function, we implicitly assume that the local is `StorageDead`. Instead of using `drop`, it is possible to emit more `StorageDead` statements in the MIR for `Drop` locals so we can handle all locals the same. I am fine with doing it that way, but this was the simplest approach for my purposes. It is also likely to be more performant. r? @Zoxc (feel free to reassign) cc @cramertj @eddyb @RalfJung @rust-lang/wg-async-await
| ☀️ Test successful - checks-travis, status-appveyor | 
Emit StorageDead along unwind paths for generators Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way). This finally enables the optimization implemented in #60187. r? @eddyb cc @Zoxc @cramertj @RalfJung
Emit StorageDead along unwind paths for generators Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way). This finally enables the optimization implemented in #60187. r? @eddyb cc @Zoxc @cramertj @RalfJung
…nd, r=eddyb Emit StorageDead along unwind paths for generators Completion of the work done in rust-lang#60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also rust-lang#61060 which tried to fix this in a different way). This finally enables the optimization implemented in rust-lang#60187. r? @eddyb cc @Zoxc @cramertj @RalfJung
Emit StorageDead along unwind paths for generators Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way). This finally enables the optimization implemented in #60187. r? @eddyb cc @Zoxc @cramertj @RalfJung
Emit StorageDead along unwind paths for generators Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way). This finally enables the optimization implemented in #60187. r? @eddyb cc @Zoxc @cramertj @RalfJung
Generator optimization: Overlap locals that never have storage live at the same time The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization. More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.** To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with. Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout. For the layout computation, we use a simplified approach. 1. Start with the set of locals assigned to only one variant. The rest are disqualified. 2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping. Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone". So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this: You can think of a generator as an enum, where some fields are shared between variants. e.g. ```rust enum Generator { Unresumed, Poisoned, Returned, Suspend0(A, B, X), Suspend1(B), Suspend2(A, Y, Z), } ``` where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them. Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time. --- Depends on: - [x] #59897 Multi-variant layouts for generators - [x] #60840 Preserve local scopes in generator MIR - [x] #61373 Emit StorageDead along unwind paths for generators Before merging: - [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889) - [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location) - [x] Clean up TODO - [x] Fix the layout code to enforce that the same field never moves around in the generator - [x] Add tests for async/await - [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity) - [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at #59897 (comment))
…ddyb Generator optimization: Overlap locals that never have storage live at the same time The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization. More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.** To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with. Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout. For the layout computation, we use a simplified approach. 1. Start with the set of locals assigned to only one variant. The rest are disqualified. 2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping. Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone". So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this: You can think of a generator as an enum, where some fields are shared between variants. e.g. ```rust enum Generator { Unresumed, Poisoned, Returned, Suspend0(A, B, X), Suspend1(B), Suspend2(A, Y, Z), } ``` where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them. Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time. --- Depends on: - [x] rust-lang#59897 Multi-variant layouts for generators - [x] rust-lang#60840 Preserve local scopes in generator MIR - [x] rust-lang#61373 Emit StorageDead along unwind paths for generators Before merging: - [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened rust-lang#60889) - [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location) - [x] Clean up TODO - [x] Fix the layout code to enforce that the same field never moves around in the generator - [x] Add tests for async/await - [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity) - [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at rust-lang#59897 (comment))
Part of #52924, depended upon by the generator layout optimization #60187.
This PR adds
StorageDeadstatements in more places in generators, so we can see when non-Droplocals have gone out of scope and recover their storage.The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.
For
Droplocals, we modify theMaybeStorageLiveanalysis to usedropto indicate that storage is no longer live for the local. Oncedropreturns or unwinds to our function, we implicitly assume that the local isStorageDead.Instead of using
drop, it is possible to emit moreStorageDeadstatements in the MIR forDroplocals so we can handle all locals the same. I am fine with doing it that way, but this was the simplest approach for my purposes. It is also likely to be more performant.r? @Zoxc (feel free to reassign)
cc @cramertj @eddyb @RalfJung @rust-lang/wg-async-await