-
Notifications
You must be signed in to change notification settings - Fork 1.8k
CPP: Don't taint the return value of strlen #3592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
override predicate hasSpecificReadSideEffect(ParameterIndex i, boolean buffer) { | ||
getParameter(i).getUnspecifiedType() instanceof PointerType and | ||
buffer = true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you see a good alternative to repeating all these predicates? As I understand it, all we want is to exclude strlen
and friends from the taint rule. Could that be done with a subclass that just overrides the taint predicate to none()
? It might require repeating part of the charpred, but maybe that can be pulled out to a private predicate or class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could certainly avoid the repetition with this strategy, but I think there will be a cost in comprehensibility. I consider the repeated predicates here to have a very low mental cost, because this kind of repetition is expected across the models library and the repeated predicate bodies themselves are either trivial or very simple. Like rows in a database.
If we do make this change, I'd definitely want to add the helper predicate for the StrLenFunction
charpred, because otherwise we'd be solving repetition in one place by adding repetition in another. That means we'd be adding a helper predicate and complicating the class hierarchy. Probably slightly fewer lines of code, but arguably harder to read overall.
Another approach for improving things here might be to add helpful tools / defaults to the abstract classes in the model interfaces, so that this stuff can be expressed more concisely in the most common cases (e.g. having the three AliasFunction
predicates default to none()
so that you only need to override the one(s) you want). That will make the models more concise, though potentially mistakes might be harder to spot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm convinced by your argument that the intended way to use models
is to inherit from all relevant interface classes every time. If that's too much boiler-plate, we should solve it everywhere and not just for strlen
.
Remove taint flow to the return value of
strlen
and similar functions. The current thinking is that string lengths should not be considered tainted - see #3533 where we decided we didn't want the returned lengths from formatting functions to be considered tainted (and discussedstrlen
), and #3569 where we removed taint flow to the return value ofstrftime
for this reason. This approach should lead to better query results (fewer false positives where the existence of taint is highly dubious).This approach can be seen as old wisdom from the security taint library (now DefaultTaintTracking), which had long made an exception of
strlen
for practical reasons. Now that this is encoded in our models library, I was able to remove one of these exceptions from DefaultTaintTracking without affecting results.