Go: Make flow configurations use new data flow API #13820
Removed edges and labels are mostly duplicates. No alerts are lost.
Good work updating all of this!
- For queries where the expected test output changed and you haven't done so yet, it would be good to note why it changed in the commit description or in the PR description.
- Most/all of the old data/taint flow configurations had docs comments explaining their purpose. However, the new ConfigSig implementations and data/taint flow configurations mostly do not have matching docs comments. In general, it would be nice to have more documentation/comments.
There also seem to be a good few ConfigSigs which are basically the same as a result of the [..]Customizations approach, along the lines of:
private module Config implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source instanceof Source }
  predicate isSink(DataFlow::Node sink) { sink instanceof Sink }
  predicate isBarrier(DataFlow::Node node) { node instanceof Barrier }
}
Would it be possible (in the future, not as part of this PR) to refactor this so that we have a generic implementation that we can just parameterise over the module that's providing the Source, Sink, and Barrier?
For queries where the expected test output has changed, it would be good to include some discussion in the PR description of why it has changed.
It generally isn't obvious why. I know one cause is when there was more than one config in scope in the old version. I'm not sure how much value there is in tracking it down. I'm going to bet that the corresponding PRs for Java and C# have similar changes and that nobody has looked into what caused them. (Of course, if there was a change in the results it would be different. It's just because the only change is in path nodes and edges.)
  predicate isBarrier(DataFlow::Node nd) { nd instanceof Sanitizer }
}

private module FindLargeLensFlow = TaintTracking::Global<FindLargeLensConfig>;
I note that this is private while FindLargeLensConfiguration is not. Is that an intentional change? If so, do we think that marking FindLargeLensConfiguration as deprecated and pointing at FindLargeLensFlow as an alternative makes sense?
Also, more generally, we have documentation for the old taint tracking Configuration classes, but we don't currently have any documentation for the new taint tracking modules. Should we have documentation for each of the new modules?
I have tended to make things private when they don't need to be public. An example of when they need to be public: an additional predicate in the config module is used in the .ql file. (Though it could be moved out of the module and made a top-level predicate.)
FindLargeLensFlow is a helper flow that is only used in this file. I think it is appropriate for it to be private. If someone was using FindLargeLensConfiguration then I guess they should make a copy of FindLargeLensFlow and the things it relies on.
It makes sense that these could have been private to begin with. Although we don't know for sure whether e.g. FindLargeLensConfiguration has been used by anyone externally since it is public, we also aren't changing its visibility. Marking it as deprecated instead seems fair then, as long as the comment doesn't suggest using the private replacement and instead notes what you noted in your reply.
How come we get the edges here but didn't before?
How come we get the edges here but didn't before?
Why the differences?
@@ -148,6 +150,115 @@ class ConversionWithoutBoundsCheckConfig extends TaintTracking::Configuration {
  }
}

/** Flow state for ConversionWithoutBoundsCheckConfig. */
newtype MyFlowState =
Should this have a less generic name, e.g. IntegerConversionState?
source =
  any(Function f | f.getName() = ["getUntrustedString", "getUntrustedStruct"])
      .getACall()
      .getResult()
I note that the test previously ranged over all of UntrustedFlowSource but now is an inlined UntrustedSource. Do we miss out anything important by not having this range over all of UntrustedFlowSource anymore?
I don't think so - we're just trying to match the two functions used as sources in the test files. I can only imagine it was written like this to be more like a real source definition.
exists(Function fn | fn.hasQualifiedName(_, ["getUntrustedString", "getUntrustedBytes"]) |
  source = fn.getACall().getResult()
)
This previously ranged over all of UntrustedFlowSource, but is now more limited. Do we miss out on anything as a result of this change?
As above, I don't think it matters for tests.
@@ -9,86 +9,88 @@ import (
 //go:generate depstubber -vendor k8s.io/apimachinery/pkg/runtime ProtobufMarshaller,ProtobufReverseMarshaller

 func source() interface{} {
-	return make([]byte, 1, 1)
+	return make([]byte, 1)
What's the reason for this change?
My IDE told me it was better to do it this way. Presumably it's a built-in Go linter of some kind.
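For context, the two forms should behave identically: when the capacity argument to make is omitted for a slice, Go uses the length as the capacity. The sketch below illustrates this under some assumptions: oldSource and newSource are hypothetical names introduced here for comparison, and they return []byte rather than interface{} as the test helper does. Which linter flagged the original line is not stated in the thread.

```go
package main

import "fmt"

// oldSource is a hypothetical stand-in for the helper before the change:
// length 1 with an explicit (redundant) capacity of 1.
func oldSource() []byte {
	return make([]byte, 1, 1)
}

// newSource is a hypothetical stand-in for the helper after the change:
// the capacity is omitted, so it defaults to the length.
func newSource() []byte {
	return make([]byte, 1)
}

func main() {
	a, b := oldSource(), newSource()
	// Both slices have the same length and the same capacity.
	fmt.Println(len(a), cap(a), len(b), cap(b))
}
```

Since length and capacity match in both cases, the change should not affect what the test exercises.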
  )
}

int fieldFlowBranchLimit() { result = 1000 }
Could you add a comment noting why you added this here for this particular configuration?
It isn't needed. I've removed it. I must have added it by mistake, probably copying from somewhere else.
I highly recommend reviewing commit by commit (and spreading it out over several sessions). All configurations have tests (normally of the related query). There are no changes in results, though sometimes there are changes in exact PathNodes and edges between them.