Skip to content

JS: add dataflow barrier for if(!x) #2778

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Feb 20, 2020

Conversation

erik-krogh
Copy link
Contributor

Consider the example:

var source = ...
if (source) {
  source = sanitize(source)
}
sink(source);

Previously we would detect flow from source to sink, as it can pass unsanitized through the "else" branch of the if.

Looking at how to solve this I encountered a barrier that @asgerf wrote in PrototypePollutionUtility.ql that solved the exact issue.

I moved that barrier into an AdditionalBarrierGuardNode that works for all DataFlow configurations.
(It looks very different when moved from isBarrier to a AdditionalBarrierGuardNode, but it does the same thing).

@erik-krogh erik-krogh added the JS label Feb 6, 2020
@erik-krogh erik-krogh requested a review from a team as a code owner February 6, 2020 12:04
@erik-krogh
Copy link
Contributor Author

erik-krogh commented Feb 6, 2020

Given that I added a barrier for all DataFlow configurations, we will need a thorough evaluation.

But I would prefer some feedback before i start that evaluation.

@asgerf
Copy link
Contributor

asgerf commented Feb 6, 2020

I agree that this should be the default behavior for taint-tracking configurations 👍

For general data-flow configurations I'm not so sure.
Unfortunately there's no way to opt out of a standard barrier guard, so this means it's no longer possible to track falsy values at all. Although it's rarely needed, it should be possible to do so if you want to.

For example, CorMisconfigurationForCredentials actually tracks null literals for which this barrier is too restrictive (in theory; I'm not aware of any results that would be affected). That happens to be a taint-tracking configuration so it would be affected anyway, but it's a nice example of a query that actually cares about falsy values.

Another way to handle this which we've talked about is to block all nodes that the type inference has inferred to be falsy. So the way to "opt out" might one day be to make it configurable which abstract values you want to track/block. (This is on the JS2020 list and I'm still hoping to get around to it).

In the meantime, I'd suggest making this the default for taint-tracking configurations and expose a nice way to opt-in for data-flow configurations.

@erik-krogh
Copy link
Contributor Author

In the meantime, I'd suggest making this the default for taint-tracking configurations and expose a nice way to opt-in for data-flow configurations.

👍

Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, other LGTM. 👍

Co-Authored-By: Asger F <[email protected]>
@erik-krogh erik-krogh added the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label Feb 6, 2020
@erik-krogh
Copy link
Contributor Author

Evaluation came back bad, I'll look into it.

@erik-krogh
Copy link
Contributor Author

Seems like the bad evaluation was due to the misoptimization in PrototypePollutionUtility that got fixed in #2779.

I'm running another evaluation.

@erik-krogh
Copy link
Contributor Author

The evaluation (part2) is looking like the performance hit is too great for the benefit we get.

There is 1 instance of VarAccessBarrierGuard for every VarAccess, and I'm thinking that might be causing the performance regression.
I'll try to use UselessConditional::isExplicitConditional to limit the number of VarAccessBarrierGuard instances.

@erik-krogh
Copy link
Contributor Author

erik-krogh commented Feb 17, 2020

An evaluation over the weekend looks good in terms of performance, except gecko-dev seemingly timing out. (I will look at that)

@asgerf
Copy link
Contributor

asgerf commented Feb 17, 2020

Maybe we should try the same solution I used in PrototypePollutionUtility, in which the guard node for x is directly marked as a barrier instead of using a barrier guard. The difference is that barrier guards trigger reasoning about dominators, which is redundant in this case where SSA encodes dominance information anyway.

@erik-krogh
Copy link
Contributor Author

Maybe we should try the same solution I used in PrototypePollutionUtility, in which the guard node for x is directly marked as a barrier instead of using a barrier guard.

I've done this now, and performance seems fine from a first look (I forgot --dpm on that run, and performance was ±10%, so I'm redoing with --dpm).

Although I don't fancy the or i placed in TaintTracking::Configuration::isBarrier.

@erik-krogh
Copy link
Contributor Author

Performance with --dpm looks good.
(The amphtml performance result is a leftover that I didn't clear).

Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! A few nits otherwise LGTM.

@@ -355,6 +355,11 @@ module TaintedPath {
}
}

/**
* A check of the form `if(x)`, which sanitizes `x` in its "else" branch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Could you make the class private?

This was referenced May 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish JS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants