Skip to content

C++: Fix struct field conflation in IR data flow #3501

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 19, 2020

Conversation

jbj
Copy link
Contributor

@jbj jbj commented May 18, 2020

I recommend reading this PR one commit at a time.

From the main commit:

The virtual-dispatch code for globals was missing any relationship between the union field access and the global variable, which meant it propagated function-pointer flow between any two fields of a global struct. This resulted in false positives from cpp/tainted-format-string on projects using SDL, such as WohlSoft/PGE-Project.

In addition to fixing that bug, this commit also brings the code up to date with the new style of modeling flow through global variables: DataFlow::Node.asVariable().

The false positives can be seen on https://lgtm.com/rules/2159571090/alerts/ if you're lucky enough that WohlSoft/PGE-Project shows up at the top.

Cc @alexet as release manager. I've targeted this at 1.24 because it's a regression since 1.23 that could affect a lot of projects.

jbj added 3 commits May 18, 2020 10:42
This cleans up the test results, which were confusing because functions
like `sink` had multiple locations.

There are some additional results now involving casts to `const char *`
because previously it varied whether `sink` used `const`, and now it
always does.
This test demonstrates that IR data flow conflates unrelated fields of a
global struct-typed variable and that this bug is not present in the old
AST-based implementation of `semmle.code.cpp.security.TaintTracking`.
The virtual-dispatch code for globals was missing any relationship
between the union field access and the global variable, which meant it
propagated function-pointer flow between any two fields of a global
struct. This resulted in false positives from
`cpp/tainted-format-string` on projects using SDL, such as
WohlSoft/PGE-Project.

In addition to fixing that bug, this commit also brings the code up to
date with the new style of modeling flow through global variables:
`DataFlow::Node.asVariable()`.
@jbj jbj added the C++ label May 18, 2020
@jbj jbj added this to the 1.24 milestone May 18, 2020
@jbj jbj requested a review from MathiasVP May 18, 2020 14:39
@jbj
Copy link
Contributor Author

jbj commented May 18, 2020

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes LGTM (assuming the CPP-differences does not reveal any performance problems). I'll let @alexet decide if this can go into rc/1.24.

@jbj
Copy link
Contributor Author

jbj commented May 19, 2020

Performance looks unchanged.

@alexet
Copy link
Contributor

alexet commented May 19, 2020

LGTM as it fixes a regression without performance issues.

@alexet alexet merged commit 57dbe57 into github:rc/1.24 May 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants