C++/C#: Reuse unaliased SSA when building aliased SSA #3235

dbartol · 2020-04-08T19:38:04Z

Leaving this PR as a draft until I have perf measurements from real snapshots to validate the expected speedup.

We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA.

The main changes in this PR are in `SSAConstruction.qll`. Instead of throwing away all `Phi` instructions from the previous IR iteration, we bring them along. When computing the definition for a given use, if that use already had a definition in the previous iteration, we reuse that definition. This is slightly complicated by the possibility of degenerate (single-operand) `Phi` instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate `Phi` instruction, we recurse to the definition of that `Phi` instruction's sole reachable input operand. See the new test cases for a couple of examples.
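
A rough sketch of the degenerate-`Phi` handling, written as a toy Python model rather than the actual QL. The names `Def`, `Phi`, and `resolve` are hypothetical and do not come from `SSAConstruction.qll`; blocks are plain strings, and the set of blocks still reachable in the new iteration is assumed to be given.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Union


@dataclass
class Def:
    """A non-Phi definition such as a store (hypothetical toy type)."""
    name: str


@dataclass
class Phi:
    """A Phi node mapping predecessor block name -> incoming definition."""
    name: str
    inputs: Dict[str, "Node"] = field(default_factory=dict)


Node = Union[Def, Phi]


def resolve(node: Node, reachable: Set[str]) -> Node:
    """Return the definition to reuse for `node` in the next iteration."""
    if isinstance(node, Def):
        return node
    # Keep only the inputs whose predecessor block is still reachable.
    live = [d for pred, d in node.inputs.items() if pred in reachable]
    if len(live) == 1:
        # Degenerate Phi: recurse to the definition of its sole reachable input.
        return resolve(live[0], reachable)
    # Two or more reachable inputs: the Phi is carried over as-is.
    return node


x1, x2 = Def("x1"), Def("x2")
phi = Phi("phi", {"B1": x1, "B2": x2})
assert resolve(phi, {"B1", "B2"}) is phi  # both predecessors reachable
assert resolve(phi, {"B1"}) is x1         # B2 eliminated, Phi collapses
```

The recursion matters because the sole surviving input may itself be a `Phi` that became degenerate in the same way.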

In aliased SSA's `AliasConfiguration.qll`, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time.

I also tweaked the `getInstructionUniqueId()` predicate to reuse the unique ID from the previous stage, which preserves ordering of `Phi` instructions in a block to minimize test output diffs.

The `points_to` test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR.

Finally, I added `PhiInstruction.getInputOperand()`. I'm surprised we didn't have it already.

MathiasVP · 2020-04-09T09:23:39Z

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll

@@ -129,11 +191,12 @@ private module Cached {
      tag = oldOperand.getOperandTag() and
      (
        (
+          not instruction instanceof UnmodeledUseInstruction and


I'm curious why we didn't need this condition before. Or do you now have to tell the optimizer that a single instruction cannot satisfy both disjuncts after changing the predicate?

rdmarsh2 · 2020-04-08T21:30:05Z

csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll

@@ -58,10 +77,20 @@ private module Cached {
    )
  }

+  private predicate willResultBeModeled(OldInstruction oldInstruction) {


I found this name confusing, partly because it reads as a question rather than an assertion, and partly because it could mean "will it ever be modeled" rather than "will this stage model it".

jbj · 2020-04-14T14:08:32Z

csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll

+          if
+            phiOperandOverlap instanceof MustExactlyOverlap and
+            originalOverlap instanceof MustExactlyOverlap
+          then overlap instanceof MustExactlyOverlap
+          else
+            // Pedantically, multiple levels of `MustTotallyOverlap` could combine to yield a
+            // `MustExactlyOverlap`. We won't worry about that because unaliased SSA always produces
+            // exact overlap, and even if it did not, `MustTotallyOverlap` is a valid conservative
+            // approximation.
+            overlap instanceof MustTotallyOverlap


A helper predicate would read better.
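
One possible shape for such a helper, sketched as a toy Python model rather than QL. `Overlap` and `combine_overlap` are hypothetical names, and only the two overlap kinds that appear in the hunk above are modeled.

```python
from enum import Enum


class Overlap(Enum):
    """Toy stand-ins for the two overlap kinds named in the hunk."""
    MUST_EXACTLY = "MustExactlyOverlap"
    MUST_TOTALLY = "MustTotallyOverlap"


def combine_overlap(phi_operand_overlap: Overlap, original_overlap: Overlap) -> Overlap:
    """Combine the overlap along a collapsed Phi with the original overlap."""
    if (
        phi_operand_overlap is Overlap.MUST_EXACTLY
        and original_overlap is Overlap.MUST_EXACTLY
    ):
        return Overlap.MUST_EXACTLY
    # Anything else falls back to the conservative approximation.
    return Overlap.MUST_TOTALLY
```

Pulling the `if`/`then`/`else` into one named function keeps the conservative fallback and its justification in a single place.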

jbj · 2020-04-14T14:16:13Z

csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll

+      unique(OldIR::PhiInputOperand operand |
+        operand = oldInstruction.(OldIR::PhiInstruction).getAnInputOperand() and
+        operand.getPredecessorBlock() instanceof OldBlock
+      )


We need to get the green light to use `unique`.

dbartol · 2020-04-23T17:32:59Z

Superseded by #3340

dbartol added C# C++ labels Apr 8, 2020

dbartol added this to the 1.24 milestone Apr 8, 2020

dbartol assigned rdmarsh2 Apr 8, 2020

MathiasVP reviewed Apr 9, 2020

alexet changed the base branch from master to rc/1.24 April 9, 2020 16:03

rdmarsh2 reviewed Apr 9, 2020

jbj reviewed Apr 14, 2020

dbartol mentioned this pull request Apr 23, 2020

C++/C#: Reuse some SSA def/use info from previous iteration #3340

Closed

dbartol closed this Apr 23, 2020

kamarcum unassigned rdmarsh2 Apr 28, 2020
