C++: Remove more dataflow FPs after frontend upgrade #13965
Merged
This PR implements two dataflow improvements. Combined, they mean we won't get new FPs on `cpp/invalid-pointer-deref` after the frontend upgrade that @jketema is working on.

The first commit improves our detection of which writes are considered "certain" (i.e., which completely overwrite an allocation) by using value numbering. Previously, we'd detect that a write such as `x = 0` overwrites an entire allocation because it writes directly to a `VariableAddressInstruction`. Thus we wouldn't report flow from `source` to `sink`.

However, we didn't detect this for a write such as `*p = 0`, since `*p = 0` writes to the result of a `LoadInstruction` (and not a `VariableAddressInstruction`). Value numbering, however, can easily see that `*p` writes to an address that is equal to a `VariableAddressInstruction`.

The second commit is a bit more complex. It addresses the following problem:
Consider the following snippet: `*--(*x) = 0;`

After the frontend upgrade, consider the IR this translates into (ignoring the irrelevant parts), and assume we have flow into `*r4` (i.e., the `CopyValue` instruction behind one level of indirection; this can be interpreted as "the value of `x` is tainted"). Dataflow then transfers flow from `*r4` to `*&:r4` (i.e., the address operand of `r9`). However, notice that there's a `Store` to the exact same address just before the load of the address. So, in fact, we shouldn't have flow from `*r4` to `*&:r4`, because there's a write to `*&:r4` in between.

This last commit addresses this problem by removing those dataflow edges when there's a `StoreInstruction` in between the defining instruction and the operand.

The predicate introduced isn't exactly pretty, but it seems to perform okay on the databases I've tried so far. Ideally, I'd like to replace this with use-use flow through the IR's instructions and operands (instead of the current def-use strategy that's native to what the IR provides), but that's a slightly more involved change than what I'd like to do here.
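
For reference, the source-level write patterns discussed in this description can be sketched in plain C++ roughly as follows. This is only an illustration: `source`, `sink`, and the three wrapper functions are hypothetical names rather than the actual test cases from this PR, and the IR-related comments simply restate the description above.

```cpp
// Hypothetical stand-ins for a taint source and sink (names are assumptions).
int source() { return 42; }
void sink(int) {}

// Direct write: `x = 0` stores straight to the address of `x`, so it
// overwrites the whole allocation and no source value should reach sink.
int direct_overwrite() {
  int x = source();
  x = 0;
  sink(x);
  return x;  // the value that reaches sink
}

// Indirect write: `*p = 0` stores through the loaded value of `p`.
// That address is equal to the address of `x`, so the write is just as
// "certain" as the direct one above.
int indirect_overwrite() {
  int x = source();
  int* p = &x;
  *p = 0;
  sink(x);
  return x;  // the value that reaches sink
}

// Second pattern: pre-decrement the pointer stored in `*x`, then store 0
// through the decremented pointer. The pointer `*x` is both written and
// re-read within a single statement.
int decrement_then_store(int** x) {
  *--(*x) = 0;
  return **x;  // the element the decremented pointer now refers to
}
```

In both of the first two functions the value passed to `sink` is `0`, which is why flow from `source` to `sink` would be a false positive. In the third, calling `decrement_then_store(&p)` with `p` pointing at the second element of an array moves `p` back to the first element and zeroes that element.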