C++: Fix dataflow inconsistencies #15040
Force-pushed from 9cc0853 to e648058.
Two questions.
@@ -84,7 +83,7 @@ private predicate parameterIsRedefined(Parameter p) {
 class FieldAddress extends Operand {
   FieldAddressInstruction fai;

-  FieldAddress() { fai = this.getDef() }
+  FieldAddress() { fai = this.getDef() and not Ssa::ignoreOperand(this) }
Why is this change needed?
Otherwise we end up generating dataflow nodes that aren't used anywhere: https://github.com/github/codeql/pull/15040/files#diff-e9e4c8dbaa54c9f16e1533c09bb66ff3ec456e747f98de139672cfa0412a463cR40-R42. Consider, for example, this:
void f(int*);
struct S { int x; };
void test() {
  S s;
  f(&s.x);
}
The IR will look like:
r1 = &s.x;
r2 = call to f : r1
m3 = WriteSideEffect : r1
...
And that WriteSideEffect isn't used for dataflow (and neither is its operand), so we shouldn't generate a PostUpdateNode for that r1 operand occurring on the WriteSideEffect.
Ah, it's because we now do operand = any(FieldAddress fa).getObjectAddressOperand() instead of having a node for the field address.
Exactly, yeah.
-  override Expr getDefinedExpr() {
-    result = fieldAddress.getObjectAddress().getUnconvertedResultExpression()
+  final override Node getPreUpdateNode() { hasOperandAndIndex(result, operand, indirectionIndex) }
+
+  final override Expr getDefinedExpr() {
+    result = operand.getDef().getUnconvertedResultExpression()
   }
 }
Do I understand correctly that most of the test changes are due to the definition of getDefinedExpr now being different for fields?
I don't think so, no. The behavior of getDefinedExpr is basically never used in any queries or tests. The only place I can think of is to implement the asPartialDefinition predicate, which also really isn't used by any of our queries.
A few small things that are used by queries did change, though. In particular, the location of a PostFieldUpdateNode used to be given by the location of the field, whereas now it's given by the location of the qualifier.
And slightly more subtle: the post-update node used to be its own dataflow node (i.e., the PostFieldUpdateNode IPA branch), and the pre-update node of that used to be the qualifier of the field. Now, the post-update node is a PostUpdateNodeImpl IPA branch, and the pre-update node is the field itself. This shouldn't matter for queries if they're not touching internal stuff, though.
For some reason I made the incorrect decision to have the post-update node for a field write, and the post-update node for an argument node, as two separate dataflow nodes. That doesn't make a lot of sense since the two aren't mutually exclusive. For example, you can have a field write to an argument node in a situation like:
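A minimal sketch of such a situation, modeled on the `f(&s.x)` example from the review thread. The names `set_to_one` and `test` are hypothetical, and this is not necessarily the snippet the author had in mind:

```cpp
// Hypothetical illustration only; all names are made up for this sketch.
struct S { int x; };

// Writes through its pointer parameter, so the call argument is updated
// by the call and would need a post-update node.
void set_to_one(int* p) { *p = 1; }

int test() {
  S s{0};
  // `&s.x` is a field address (so the call partially updates `s`) and,
  // at the same time, an argument to `set_to_one`. Both kinds of
  // post-update node would apply to this one expression.
  set_to_one(&s.x);
  return s.x;
}
```

Here the same expression is both a field access and a call argument, which is presumably why keeping the two kinds of post-update node as separate, non-overlapping branches does not work well.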
This PR cleans up this situation by merging those two dataflow nodes into a single IPA branch. This gets rid of a bunch of inconsistency errors 🎉