C++: Fix more conflation in dataflow #13425
void does_not_write_source_to_dereference(int *p) // $ ast-def=p ir-def=*p
{
  int x = source();
  p = &x;
  *p = 42;
}

void test_does_not_write_source_to_dereference()
{
  int x;
  does_not_write_source_to_dereference(&x);
  sink(x); // $ ast,ir=733:7 SPURIOUS: ast,ir=726:11
}
This was a testcase I came up with as an attempt to further reduce the testcase added in 90ffb45. However, it looks like this is a different conflation problem, since this test wasn't fixed by this PR.
To be clear: This isn't a regression caused by this PR. I just tagged it along here to avoid a merge conflict with this PR.
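To make the intent of the testcase concrete, here is a minimal runtime sketch (not part of the PR) of why flow from `source()` to the caller's `x` is spurious: the callee rebinds its local copy of `p` before writing through it, so the caller's object is never touched. The `source()` stub, its return value, and the helper `observe_caller_value` are hypothetical, and the caller's `x` is initialised to 0 here only so that the result can be observed.

```cpp
// Hypothetical stand-in for the test suite's `source()`; the value is arbitrary.
int source() { return 1234; }

void does_not_write_source_to_dereference(int *p) {
  int x = source();
  p = &x;   // rebinds the callee's local copy of the pointer
  *p = 42;  // writes to the callee's local x, not to the caller's object
}

// Hypothetical helper: returns the caller-side value after the call.
int observe_caller_value() {
  int x = 0;  // initialised (unlike the original test) so it can be read safely
  does_not_write_source_to_dereference(&x);
  return x;   // neither 1234 nor 42 is ever stored here
}
```

Since the caller's `x` keeps its initial value, neither the `source()` result nor the `42` reaches `sink(x)` at runtime, which is why the annotated result is marked SPURIOUS.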
We seem to be missing some taint flow now (see missing DCA results).
jketema left a comment
Some minor comments
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll
DCA nightly suite result changes:
DCA MCTV suite result changes:
That would explain the large performance improvement on vim 😅
Question is what. The vim dataflow paths are rather horrendous as usual.
If you want to get started on debugging this while I'm away I suggest:
1-3 is probably a whole-day kind of task already. So if you're too busy with other things, feel free to leave it for me to do once I'm back.
I think I might just give it a go.
Gave it a go, but wasn't able to make any progress. I tried restricting the source of one of the problematic queries as much as possible, and tried making it start later on a path that is disappearing. However, even with this, partial flow is basically not computable on my machine.
Thanks for all the investigations on this Jeroen! I've pushed a reduced testcase demonstrating the missing flow on vim (turns out our model for Luckily, the fix was super easy. I'll start another DCA run to see what the impact of this is.
I don't think the
Good point. Yeah, the strncpy fix was strictly a taintflow fix, and that query uses only dataflow. So we still need to figure out if those lost results are because we broke something, or because we fixed conflation issues |
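As background on the taint-flow versus dataflow distinction mentioned here, the following is a small sketch (not taken from the PR) of why a bounded copy is naturally described as taint: `strncpy` may copy only a prefix of its source, so the destination is derived from the source without necessarily being equal to it. The helper `copy_prefix` and the 16-byte buffer are hypothetical, and how the actual model expresses this is an assumption, not something the thread states.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copies at most `n` bytes of `src` via strncpy
// into a zero-initialised buffer and returns the buffer's contents.
std::string copy_prefix(const char *src, std::size_t n) {
  char buf[16] = {};  // zero-initialised so the result stays NUL-terminated
  if (n > sizeof(buf) - 1) {
    n = sizeof(buf) - 1;  // leave room for the terminating NUL
  }
  std::strncpy(buf, src, n);  // copies a prefix; adds no NUL if src is longer than n
  return std::string(buf);
}
```

For example, `copy_prefix("tainted", 5)` yields a string that depends on the input but differs from it, so a query that tracks only value-preserving dataflow would not be expected to follow this step.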
Investigation so far: It looks like we're losing all the vim results that come from the uf_name field. These are indeed all FPs (because it's a flexible array member). But I'm still trying to figure out why we lose them.
I've done a spot check of some of the lost results and it does look like it's caused by the now-fixed conflation 🎉. For example, the flow starts here and then moves to:
And at this point we're suddenly tracking the indirection now (instead of the pointer). And after this PR this is no longer happening 🎉.
…s to not allow pointer conflation.
…e.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
…e.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
1ec3eb9 to 992af55
DCA looks great! We got the lost result on
There is also a test regression that needs to be looked at.
Thanks for the heads up. That looks like 992af55 fixed a missing flow, but I'll double check to verify that. |
992af55 to aca4716
@jketema this PR should be all ready now. I've force-pushed a testcase that demonstrates the missing flow we saw on The query test change was just some changes to path explanations. So nothing major to see there.
aca4716 to 79fb6a6
This PR fixes two conflation issues that were giving us a bunch of FPs on https://github.com/lief-project/lief.
The first fix is very simple (see 153df2c), and the second fix took a couple of days of debugging (see 7c32721) 😂.
Commit-by-commit review recommended.