Skip to content

C++: Fix data-flow dispatch perf with globals #3637

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 8, 2020

Conversation

jbj
Copy link
Contributor

@jbj jbj commented Jun 8, 2020

There wasn't a good join order for the "store to global var" case in the virtual dispatch library. When a global variable had millions of accesses but few stores to it, the flowsFrom predicate would join to see all those millions of accesses before filtering down to stores only. The solution is to pull out a storeIntoGlobal helper predicate that pre-computes which accesses are stores.

To make the code clearer, I've also pulled out a repeated chunk of code into a new addressOfGlobal helper predicate.

For the kamailio/kamailio project, these are the tuple counts before:

Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@21a1df (iteration 3)
Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
...
59002      ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
58260      ~1%     {3} r31 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#fb AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
2536187389 ~6%     {3} r32 = JOIN r31 WITH Instruction::VariableInstruction::getASTVariable_dispred#fb_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r31.<2>
2536187389 ~6%     {3} r33 = JOIN r32 WITH project#Instruction::VariableAddressInstruction#class#3#ff AS R ON FIRST 1 OUTPUT r32.<0>, true, r32.<2>
58208      ~0%     {3} r34 = JOIN r33 WITH Instruction::StoreInstruction::getDestinationAddress_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r33.<2>

Tuple counts after:

Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@6073c5 (iteration 3)
Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
...
59002    ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
58260    ~1%     {3} r23 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
58208    ~0%     {3} r24 = JOIN r23 WITH DataFlowDispatch::VirtualDispatch::storeIntoGlobal#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r23.<2>
58208    ~0%     {3} r25 = JOIN r24 WITH DataFlowUtil::InstructionNode#ff_10#join_rhs AS R ON FIRST 1 OUTPUT true, r24.<2>, R.<1>

Notice that the final tuple count, 58208, is the same before and after.

The kamailio/kamailio project seems to have been affected by this issue because it has global variables to do with logging policy, and these variables are loaded from in every place where their logging macro is used.

There wasn't a good join order for the "store to global var" case in the
virtual dispatch library. When a global variable had millions of
accesses but few stores to it, the `flowsFrom` predicate would join to
see all those millions of accesses before filtering down to stores only.
The solution is to pull out a `storeIntoGlobal` helper predicate that
pre-computes which accesses are stores.

To make the code clearer, I've also pulled out a repeated chunk of code
into a new `addressOfGlobal` helper predicate.

For the kamailio/kamailio project, these are the tuple counts before:

    Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@21a1df (iteration 3)
    Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
    ...
    59002      ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
    58260      ~1%     {3} r31 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#fb AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
    2536187389 ~6%     {3} r32 = JOIN r31 WITH Instruction::VariableInstruction::getASTVariable_dispred#fb_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r31.<2>
    2536187389 ~6%     {3} r33 = JOIN r32 WITH project#Instruction::VariableAddressInstruction#class#3#ff AS R ON FIRST 1 OUTPUT r32.<0>, true, r32.<2>
    58208      ~0%     {3} r34 = JOIN r33 WITH Instruction::StoreInstruction::getDestinationAddress_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r33.<2>

Tuple counts after:

    Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@6073c5 (iteration 3)
    Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
    ...
    59002    ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
    58260    ~1%     {3} r23 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
    58208    ~0%     {3} r24 = JOIN r23 WITH DataFlowDispatch::VirtualDispatch::storeIntoGlobal#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r23.<2>
    58208    ~0%     {3} r25 = JOIN r24 WITH DataFlowUtil::InstructionNode#ff_10#join_rhs AS R ON FIRST 1 OUTPUT true, r24.<2>, R.<1>

Notice that the final tuple count, 58208, is the same before and after.

The kamailio/kamailio project seems to have been affected by this issue
because it has global variables to do with logging policy, and these
variables are loaded from in every place where their logging macro is
used.
@jbj jbj added the C++ label Jun 8, 2020
@jbj jbj requested a review from a team as a code owner June 8, 2020 10:09
Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaner code and better performance. LGTM!

@jbj
Copy link
Contributor Author

jbj commented Jun 8, 2020

I introduced this performance bug in #3501, which targeted rc/1.24 😬. That PR was merged after the last deploy of rc/1.24 to lgtm.com, so it's never been tested on lgtm.com.

I have no idea how many codebases will experience this issue. Apparently none of the projects in CPP-Differences did.

We'll have to find out whether this PR should be targeted at rc/1.24 so it'll show up in the next point release. Cc @alexet.

@dbartol dbartol merged commit 398678a into github:master Jun 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants