C++: Remove more dataflow FPs after frontend upgrade #13965

@MathiasVP MathiasVP commented Aug 15, 2023

This PR implements two dataflow improvements. Together, they mean we won't get new FPs on the cpp/invalid-pointer-deref query after the frontend upgrade that @jketema is working on.

  • The first commit improves our detection of which writes are considered "certain" (i.e., which completely overwrite an allocation) by using value numbering. Previously, we'd detect that x = 0 in:

    int x = source();
    x = 0;
    sink(x);

    overwrites an entire allocation because it writes directly to the result of a VariableAddressInstruction. Thus we wouldn't report flow from source to sink.
    However, we didn't detect the same thing for *p = 0 in:

    int x = source();
    int* p = &x;
    *p = 0;
    sink(*p);

    since *p = 0 writes to the result of a LoadInstruction (and not a VariableAddressInstruction). Value numbering, however, can easily see that *p = 0 writes to an address that is equal to the result of a VariableAddressInstruction.

  • The second commit is a bit more complex. Consider the following snippet:

    *--(*x) = 0;

    After the frontend upgrade, this translates into the following IR (where I've stripped away the irrelevant parts):

    r1(char)           = Constant[0]              :
    r2(glval<char **>) = VariableAddress[x]       :
    r3(char **)        = Load[x]                  : &:r2
    r4(glval<char *>)  = CopyValue                : r3
    r5(char *)         = Load[?]                  : &:r4
    r6(int)            = Constant[1]              :
    r7(char *)         = PointerSub[1]            : r5, r6
    m10(char *)        = Store[?]                 : &:r4, r7
    r9(char *)         = Load[?]                  : &:r4
    r10(glval<char>)   = CopyValue                : r9
    m11(char)          = Store[?]                 : &:r10, r1

    Now assume we have flow into *r4 (i.e., the CopyValue instruction behind one level of indirection; this can be interpreted as "the value of x is tainted"). Dataflow then transfers flow from *r4 to *&:r4 (i.e., the address operand of r9). However, notice that there's a Store to the exact same address just before the load of the address. So, in fact, we shouldn't have flow from *r4 to *&:r4, because there's a write to *&:r4 in between.

    The second commit addresses this problem by removing those dataflow edges when there's a StoreInstruction between the defining instruction and the operand.

    The predicate introduced isn't exactly pretty, but it seems to perform okay on the databases I've tried so far. Ideally, I'd like to replace this with use-use flow through the IR's instructions and operands (instead of the current def-use strategy that's native to what the IR provides), but that's a slightly more involved change than I'd like to make here.
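
To make both cases concrete at runtime, here is a minimal C++ sketch. It is not part of the PR or its tests: the `source`/`sink` bodies, the `sunk` recorder, and the two function names are hypothetical stand-ins chosen for illustration. It only shows why reporting flow in these two snippets would be a false positive: in both cases the earlier value has been overwritten before it is used.

```cpp
#include <vector>

// Hypothetical stand-ins for the source/sink helpers in the snippets above.
static int source() { return 42; }
static std::vector<int> sunk;                 // records every value passed to sink
static void sink(int v) { sunk.push_back(v); }

// Case 1: the write goes through a pointer whose value equals &x,
// so it overwrites the entire allocation of x.
static void overwrite_through_pointer() {
    int x = source();
    int* p = &x;
    *p = 0;       // certain write: x no longer holds the source value
    sink(*p);     // observes 0
}

// Case 2: *--(*x) = 0 first stores the decremented pointer back into *x,
// then reloads *x to get the address for the final store.
static bool decrement_then_store() {
    char buf[2] = {'a', 'b'};
    char* cur = buf + 1;     // initially points at buf[1]
    char** x = &cur;
    *--(*x) = 0;             // *x becomes buf, then buf[0] is set to 0
    // The final store used the decremented pointer, not the old value of *x.
    return cur == buf && buf[0] == 0 && buf[1] == 'b';
}
```

Calling `overwrite_through_pointer()` from a `main` leaves a single `0` in `sunk`, and `decrement_then_store()` returns `true`, which matches the intuition that the intervening store replaces the old value in both cases.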

@MathiasVP MathiasVP requested a review from a team as a code owner August 15, 2023 10:24
@github-actions github-actions bot added the C++ label Aug 15, 2023
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Aug 15, 2023
@jketema jketema left a comment

LGTM if DCA is happy

@MathiasVP MathiasVP merged commit 90888e5 into github:main Aug 15, 2023
14 of 15 checks passed