-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++/C#: Reuse unaliased SSA when building aliased SSA #3235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, since whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA. The main changes in this PR are in `SSAConstruction.qll`. Instead of throwing away all `Phi` instructions from the previous IR iteration, we bring them along. When computing the definition for a given use, if that use already had a definition in the previous iteration, we reuse that definition. This is slightly complicated by the possibility of degenerate (single-operand) `Phi` instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate `Phi` instruction, we recurse to the definition of that `Phi` instruction's sole reachable input operand. See the new test cases for a couple examples. In aliased SSA's `AliasConfiguration.qll`, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time. I also tweaked the `getInstructionUniqueId()` predicate to reuse the unique ID from the previous stage, which preserves ordering of `Phi` instructions in a block to minimize test output diffs. The `points_to` test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR. Finally, I added `PhiInstruction.getInputOperand()`. I'm surprised we didn't have it already.
@@ -129,11 +191,12 @@ private module Cached { | |||
tag = oldOperand.getOperandTag() and | |||
( | |||
( | |||
not instruction instanceof UnmodeledUseInstruction and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious as to why we didn't need this condition before? Or is it you having to tell the optimizer that one instruction cannot satisfy both disjunctions after changing the predicate?
@@ -58,10 +77,20 @@ private module Cached { | |||
) | |||
} | |||
|
|||
private predicate willResultBeModeled(OldInstruction oldInstruction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this name confusing, partly because it reads as a question rather than an assertion, partly because it seems like it could mean will it ever be modeled rather than will this stage model it.
if | ||
phiOperandOverlap instanceof MustExactlyOverlap and | ||
originalOverlap instanceof MustExactlyOverlap | ||
then overlap instanceof MustExactlyOverlap | ||
else | ||
// Pedantically, multiple levels of `MustTotallyOverlap` could combine to yield a | ||
// `MustExactlyOverlap`. We won't worry about that because unaliased SSA always produces | ||
// exact overlap, and even if it did not, `MustTotallyOverlap` is a valid conservative | ||
// approximation. | ||
overlap instanceof MustTotallyOverlap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A helper predicate would read better.
unique(OldIR::PhiInputOperand operand | | ||
operand = oldInstruction.(OldIR::PhiInstruction).getAnInputOperand() and | ||
operand.getPredecessorBlock() instanceof OldBlock | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to get the green light to use unique
.
Superceded by #3340 |
Leaving this PR as a draft until I have perf measurements from real snapshots to validate the expected speedup
We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, since whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA.
The main changes in this PR are in
SSAConstruction.qll
. Instead of throwing away allPhi
instructions from the previous IR iteration, we bring them along. When computing the definition for a given use, if that use already had a definition in the previous iteration, we reuse that definition. This is slightly complicated by the possibility of degenerate (single-operand)Phi
instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degeneratePhi
instruction, we recurse to the definition of thatPhi
instruction's sole reachable input operand. See the new test cases for a couple examples.In aliased SSA's
AliasConfiguration.qll
, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time.I also tweaked the
getInstructionUniqueId()
predicate to reuse the unique ID from the previous stage, which preserves ordering ofPhi
instructions in a block to minimize test output diffs.The
points_to
test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR.Finally, I added
PhiInstruction.getInputOperand()
. I'm surprised we didn't have it already.