C++: IR dataflow edges through outparams #2704
This looks promising! Please add tests to demonstrate that it works.
csharp/ql/src/semmle/code/csharp/ir/implementation/raw/Instruction.qll
@@ -743,7 +743,7 @@ class ReturnValueInstruction extends ReturnInstruction {
   final Instruction getReturnValue() { result = getReturnValueOperand().getDef() }
 }

-class ReturnIndirectionInstruction extends Instruction {
+class ReturnIndirectionInstruction extends VariableInstruction {
Until now I thought that `ReturnIndirectionInstruction` was for simulating a read of a returned pointer. I can see now it's for simulating a read of a pointer parameter. Please add QLDoc.
I've got the input parameter indirection case covered in #2737, so it's fine that it's missing from this PR.
399cb85 to 677f0f0
Update test expectations
There are internal tests failing now. The changes in CWE-120 look incorrect; the others are restoring results that were lost when we enabled the IR. I'll look into the CWE-120 changes before I open an internal PR.
It looks like the new false positives in the CWE-120 test come via the Chi chain on |
I don't fully understand that problem. Can you turn it into a test case so we can see the IR? As I wrote in my initial review, I'd also like to see test cases that demonstrate that the new flow is working as intended.
I added an example of that to |
This PR will fix 2 out of the 3 SAMATE test regressions identified in https://git.semmle.com/Semmle/code/pull/36464 (the CWE-78 and CWE-197 ones). Is there anything I can do to help with this PR?
For the record, I believe these are the changes in
|
Sorry for taking so long to follow up on this. The additional results in CWE-120 don't look terrible to me, and I don't think they're blocking for this PR. The query still has the same primary-location results as before, and we can address the duplication of sources in a separate PR. All these result changes show that this is a disruptive PR that can uncover other latent bugs, so we'll want to run CPP-Differences. I'll trigger one now.
I've been working my way through the CPP-Differences results with the path explanations from #3189. I'm seeing a lot of results where fields are getting conflated when a |
#3118 doesn't immediately fix those. I'll look at it some more tomorrow.
The CPP-Differences results have arrived: https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1025/
@rdmarsh2 It looks like your latest merge from master accidentally added dozens of little test files from your working copy 😬. I recommend rebasing them away.
I had a deeper look at the results now and came to a different set of conclusions:
g/openjdk/jdk
cpp/unbounded-write
These should be recoverable with field flow. Overall jbj/ql@a715866 reduces query results from 92 to 89.
cpp/path-injection
Overall jbj/ql@a715866 reduces query results from 16 to 8. The remaining results look like true positives.
cpp/uncontrolled-allocation-size
Overall jbj/ql@a715866 reduces query results from 4 to 0. All of them seem like false positives.
cpp/uncontrolled-process-operation
jbj/ql@a715866 keeps the 2 query results. They both look like true positives.
g/torvalds/linux
cpp/path-injection
Overall jbj/ql@a715866 reduces query results from 25 to 23. Both results look like true positives.
Here are my notes about the three snapshots I looked at. TL;DR: cherry-picking jbj@a715866 will fix all new FPs but move four TPs that should be recoverable with field flow.
Git
Wireshark
PHP
except if it's from a union. This prevents field conflation through buffers of `UnknownType`.
b09ff54 to 9f40886
I think that characters from the
Fixed by merging from the latest master.
I think so, but I hit a join order issue with |
This would otherwise have lost a good qltest result at CWE-134/semmle/funcs/funcsLocal.c:58:9:58:10
The |
I've just been hit by this join order issue as well. I'm investigating it now.
Latest CPP-Differences finished: https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1032
Looking through the latest openjdk results, all the results that @MathiasVP mentioned going away with a cherry-pick of a715866 are back in the current PR after restoring flow from the source value operand of partial loads of arrays and unions. Results on the other snapshots line up with the master merge and cherry-pick at 9f40886.
g/openjdk/jdk
cpp/unbounded-write
cpp/path-injection
cpp/uncontrolled-allocation-size
The |
I'm now convinced that none of these result changes come from new incorrect flow edges. @MathiasVP @jbj I think this is mergeable once the QL changes are reviewed.
For submodule consistency
LGTM!
This connects `ReturnIndirectionInstruction`s to `ReadSideEffectInstruction`s in the data flow library by adding a new case to `TReturnKind`. It doesn't cover flow into functions via indirect parameters - I think that will need some changes to the shared library to use a class similar to `ReturnKind` for argument IDs.
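For readers unfamiliar with the pattern, here is a minimal C++ sketch of the kind of out-parameter flow this description refers to. The names `source`, `sink`, `fill`, `caller` and `last_sunk` are hypothetical and chosen only for illustration; they do not come from the PR or its tests, and the sketch says nothing about how the IR actually represents these steps.

```cpp
// Hypothetical stand-ins for a taint source and a sink. A real query
// would model these; they are assumptions for this illustration only.
static int last_sunk = 0;

int source() { return 42; }                  // pretend this value is untrusted
void sink(int value) { last_sunk = value; }  // pretend this is a sensitive use

// Writes its result through a pointer parameter instead of returning it.
// The value held in `*out` when the function returns is what has to
// flow back to the caller.
void fill(int *out) { *out = source(); }

// Caller side: `x` is only ever assigned through the out-parameter, so
// tracking source() -> sink() requires a data flow edge through `fill`.
int caller() {
  int x = 0;
  fill(&x);
  sink(x);
  return last_sunk;
}
```

Calling `caller()` passes the value from `source()` to `sink()` purely via the write to `*out`, which is the direction (out of the callee) that the description says is covered. The opposite direction, where the callee reads a value the caller stored behind a pointer argument, is the case the description says is not covered here.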