Skip to content

Conversation

@tmandry
Copy link
Member

@tmandry tmandry commented May 14, 2019

Part of #52924, depended upon by the generator layout optimization #60187.

This PR adds StorageDead statements in more places in generators, so we can see when non-Drop locals have gone out of scope and recover their storage.

The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.

For Drop locals, we modify the MaybeStorageLive analysis to use drop to indicate that storage is no longer live for the local. Once drop returns or unwinds to our function, we implicitly assume that the local is StorageDead.

Instead of using drop, it is possible to emit more StorageDead statements in the MIR for Drop locals so we can handle all locals the same. I am fine with doing it that way, but this was the simplest approach for my purposes. It is also likely to be more performant.

r? @Zoxc (feel free to reassign)
cc @cramertj @eddyb @RalfJung @rust-lang/wg-async-await

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 14, 2019
@tmandry
Copy link
Member Author

tmandry commented May 14, 2019

cc @rust-lang/lang

The assumption this PR is making is that once drop returns to a function, whether it succeeds or panics, it is UB for the function to then access that local.

Let me know if I need to file an issue separately to discuss this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (StorageDead statements) that it removed, in order to do the analysis for the generator layout optimization.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this unifies the unwind and generator_drop paths again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (StorageDead statements) that it removed, in order to do the analysis for the generator layout optimization.

That doesn't make sense. StorageDead is unrelated to drop flags. We do not want drop flags to be created.
Additionally unwind has to drop generator arguments (we probably want to add those back) while the generator_drop path does not have them. I think you should just generate StorageDead statements for the generator_drop path, but not do the other changes.

Copy link
Member Author

@tmandry tmandry May 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, yeah I completely misunderstood what was going on here.

Still, from looking at the code I couldn't find any places where the two paths are handled any differently (after my change to generate StorageDead statements for the generator_drop path), which is why I thought this change was okay. So, I guess I'm still not 100% sure what's going on here :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll split up the commit so this is easier to see.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split up the commit and reverted this change (but left it in the history for now).

@cramertj
Copy link
Member

For additional context, @RalfJung pointed out that under stacked borrows, the creation of the &mut Self reference in Drop::drop means that all existing raw pointers to the local are invalidated, so the requirement introduced in this PR is likely already enforced under the existing model (though there was also some mention that drop is already kind of "special", so it might be that we could create some kind of exception here).

This means that code like this is UB: playground example
Note that when run under MIRI, this code already produces stacked-borrows errors.

@Centril
Copy link
Contributor

Centril commented May 15, 2019

This seems reasonable to me; the only thing I would ask is that we document the assumed invariant in a less ephemeral place than the code-base since it seems easy to forget for users, lang designers, and compiler devs alike.

@tmandry tmandry force-pushed the preserve-scope-in-generator-mir branch from 9f6afd0 to 26c37d7 Compare May 15, 2019 01:33
@tmandry
Copy link
Member Author

tmandry commented May 16, 2019

the only thing I would ask is that we document the assumed invariant in a less ephemeral place than the code-base

There is an issue on the reference to document guarantees around drop and panicing: rust-lang/reference#348

I added a reference to my above comment, so it should get included once documentation is added :)

@Zoxc
Copy link
Contributor

Zoxc commented May 19, 2019

We probably want a pass to remove the StorageDead statements in the unwind path after the generator transformation.

@RalfJung
Copy link
Member

RalfJung commented May 20, 2019

The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.

Do you think it would make sense to have a -Z flag for this that Miri could set, so that Miri checks these stricter rules in more places?

OTOH, this seemingly only affects the unwind path, of which test coverage is generally spotty at best. ;) (And Miri doesn't even support unwinding yet.)

@RalfJung pointed out that under stacked borrows, the creation of the &mut Self reference in Drop::drop means that all existing raw pointers to the local are invalidated, so the requirement introduced in this PR is likely already enforced under the existing model.

I realized later that there is a gap in this argument: the drop implementation itself could create a raw pointer and leak it into global state, and taht could legitimately be used later.

Relying on StorageDead is probably better.

Let me know if I need to file an issue separately to discuss this!

I don't think we have a topic yet to track StorageDead/StorageLive issues in general. That would also cover #42371 then.

I'll open one in the UCG repository. EDIT: It's at rust-lang/unsafe-code-guidelines#129.


For Drop locals, we modify the MaybeStorageLive analysis to use drop to indicate that storage is no longer live for the local. Once drop returns or unwinds to our function, we implicitly assume that the local is StorageDead.

Okay, so this is the meat of the PR as far as I am concerned: this changes the semantics of Drop to implicitly have a StorageDead in both the return and the unwind edge. If we decide to go with this approach, please open an issue in Miri so that we can implement this in Miri's semantics. It would also be helpful if you could come up with some example code that should be flagged as UB.

I have a few questions about this though:

  • Drop works on an arbitrary place, StorageDead works on a local. I suppose the new semantics only occurs when the place is exactly a local, and does not involve any projections or so?

  • In early passes of MIR construction (before drop flag elaboration), Drop is actually a conditional drop. Does the StorageDead only occur if the Drop happens, or does it always occur? Are we sure that drop flag elaboration is semantics-preserving under whatever choice we make here (and document, I hope)?

  • What about DropAndReplace?

@Zoxc
Copy link
Contributor

Zoxc commented May 20, 2019

I think drop elaboration can turn a drop of a local into drops of its fields, which would effectively remove the implicit StorageDead.

@RalfJung
Copy link
Member

RalfJung commented May 21, 2019

Hm, that might be a problem. Depending on how we decide about the semantics for StorageLive, when _3 is live doing StorageDead(_3); StorageLive(_3) might be okay while just StorageLive(_3) is not okay. So, removing a StorageDead for a local that later gets made live again could be a non-semantics-preserving transformation.

@tmandry
Copy link
Member Author

tmandry commented May 21, 2019

We probably want a pass to remove the StorageDead statements in the unwind path after the generator transformation.

We already remove StorageLive/StorageDead statements for any of the saved locals. For the rest of them, would it buy us anything to remove them?

@Zoxc
Copy link
Contributor

Zoxc commented May 21, 2019

@tmandry There's a list of reasons here. Not sure how important it is.

@cramertj
Copy link
Member

Would it be possible to split further discussion of the proper semantics of StorageDead/StorageLive out into a separate issue, and continue with landing this PR? I think there's general agreement that touching variables after drop has been called on them is undefined behavior, which AFAICT is the only guarantee necessary to make the optimization this unblocks valid.

Note that this PR is blocking a significant generator size optimization (#60187) which is currently blocking the stabilization of async/await (#52924). The lack of this optimization is currently causing frequent stack overflows and memory usage issues for production users (Fuchsia), as well as other nightly users of async/await.

I'd love it if we could land these PRs, unblocking async/await and fixing these problems for production users, and continue asynchronous discussion on the best way to model things in MIR (which, AFAICT, does not affect the validity of the optimization here). Is there a blocking issue that needs to be addressed first before we can merge?

@RalfJung
Copy link
Member

Would it be possible to split further discussion of the proper semantics of StorageDead/StorageLive out into a separate issue, and continue with landing this PR?

Fine for me. Feel free to open an issue. :)

The worst-case outcome I can see here is that we end up having to materialize the implicit StorageDead after drop in generators, to get the semantics straight. Would that be a deal-breaker?

@cramertj
Copy link
Member

Fine for me. Feel free to open an issue. :)

Opened #61015.

The worst-case outcome I can see here is that we end up having to materialize the implicit StorageDead after drop in generators, to get the semantics straight. Would that be a deal-breaker?

I certainly don't have any opposition to this in theory, but it seems worth experimenting to see what solutions we can come up with that don't significantly regress compiler performance (or other ways we can recover the compiler performance).

In the meantime, let's get this in--
@bors r+

@bors
Copy link
Collaborator

bors commented May 21, 2019

📌 Commit 26c37d7 has been approved by cramertj

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 21, 2019
@Zoxc
Copy link
Contributor

Zoxc commented May 21, 2019

@cramertj Did you actually review the PR?

@tmandry
Copy link
Member Author

tmandry commented May 21, 2019

@tmandry There's a list of reasons here. Not sure how important it is.

Well, the idea of "observing a frozen stack" is probably incompatible with this change (and certainly incompatible with the optimization it enables for generators). Technically it isn't wrong to remove the StorageDead after the optimization runs; it's equivalent to recovering these "frozen stack" semantics for the locals that didn't happen to be saved in the generator. But I don't think there's a good reason why we'd want that, and it could lead to more confusion.

The reasons given in the code aren't all that compelling because it boils down to "we thought clang did it this way but actually it doesn't." My understanding is that we never changed the behavior because that would hurt performance. However, we're already going to pay the cost here of emitting these StorageDeads in generators, and I doubt removing them later will recover that cost.

@cramertj
Copy link
Member

@Zoxc Yes, I looked over it and it looked reasonable to me, but I'm much less familiar with this code than you are, so if you have specific concerns you'd like to see addressed before merging feel free to r-.

bors added a commit that referenced this pull request May 22, 2019
…mertj

Preserve local scopes in generator MIR

Part of #52924, depended upon by the generator layout optimization #60187.

This PR adds `StorageDead` statements in more places in generators, so we can see when non-`Drop` locals have gone out of scope and recover their storage.

The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.

For `Drop` locals, we modify the `MaybeStorageLive` analysis to use `drop` to indicate that storage is no longer live for the local. Once `drop` returns or unwinds to our function, we implicitly assume that the local is `StorageDead`.

Instead of using `drop`, it is possible to emit more `StorageDead` statements in the MIR for `Drop` locals so we can handle all locals the same. I am fine with doing it that way, but this was the simplest approach for my purposes. It is also likely to be more performant.

r? @Zoxc (feel free to reassign)
cc @cramertj @eddyb @RalfJung @rust-lang/wg-async-await
@bors
Copy link
Collaborator

bors commented May 22, 2019

⌛ Testing commit 26c37d7 with merge 1cc822c...

@bors
Copy link
Collaborator

bors commented May 22, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: cramertj
Pushing 1cc822c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 22, 2019
@bors bors merged commit 26c37d7 into rust-lang:master May 22, 2019
@tmandry tmandry deleted the preserve-scope-in-generator-mir branch May 22, 2019 17:41
bors added a commit that referenced this pull request May 30, 2019
Emit StorageDead along unwind paths for generators

Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way).

This finally enables the optimization implemented in #60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung
bors added a commit that referenced this pull request May 31, 2019
Emit StorageDead along unwind paths for generators

Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way).

This finally enables the optimization implemented in #60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung
Centril added a commit to Centril/rust that referenced this pull request Jun 4, 2019
…nd, r=eddyb

Emit StorageDead along unwind paths for generators

Completion of the work done in rust-lang#60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also rust-lang#61060 which tried to fix this in a different way).

This finally enables the optimization implemented in rust-lang#60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung
bors added a commit that referenced this pull request Jun 5, 2019
Emit StorageDead along unwind paths for generators

Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way).

This finally enables the optimization implemented in #60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung
bors added a commit that referenced this pull request Jun 6, 2019
Emit StorageDead along unwind paths for generators

Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way).

This finally enables the optimization implemented in #60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung
bors added a commit that referenced this pull request Jun 12, 2019
Generator optimization: Overlap locals that never have storage live at the same time

The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization.

More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**

To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.

Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout.

For the layout computation, we use a simplified approach.

1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.

Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".

So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:

You can think of a generator as an enum, where some fields are shared between variants. e.g.

```rust
enum Generator {
  Unresumed,
  Poisoned,
  Returned,
  Suspend0(A, B, X),
  Suspend1(B),
  Suspend2(A, Y, Z),
}
```

where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.

Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.

---

Depends on:
- [x] #59897 Multi-variant layouts for generators
- [x] #60840 Preserve local scopes in generator MIR
- [x] #61373 Emit StorageDead along unwind paths for generators

Before merging:

- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at #59897 (comment))
Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019
…ddyb

Generator optimization: Overlap locals that never have storage live at the same time

The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization.

More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**

To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.

Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout.

For the layout computation, we use a simplified approach.

1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.

Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".

So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:

You can think of a generator as an enum, where some fields are shared between variants. e.g.

```rust
enum Generator {
  Unresumed,
  Poisoned,
  Returned,
  Suspend0(A, B, X),
  Suspend1(B),
  Suspend2(A, Y, Z),
}
```

where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.

Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.

---

Depends on:
- [x] rust-lang#59897 Multi-variant layouts for generators
- [x] rust-lang#60840 Preserve local scopes in generator MIR
- [x] rust-lang#61373 Emit StorageDead along unwind paths for generators

Before merging:

- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened rust-lang#60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at rust-lang#59897 (comment))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants