Start sharing Concepts across dynamic languages #8307
hmac wants to merge 20 commits into github:main
  import RouteHandler_getAResponseHeader
  import HeaderDefinition_defines
- import SystemCommandExecution
+ import SystemCommandExecutions
I changed this because the previous name generated a warning in the test output, that it clashed with the module of the same name in ConceptsSpecific.qll.
/**
 * Holds if this expression flows into `sink` in zero or more (possibly
 * inter-procedural) steps.
 */
Not related to this PR, but it was causing a QLDoc test failure.
This may not be the approach we want to take, but I think it's a reasonable one and can help continue the discussion of how we want to share this stuff in practice. I'd be keen to hear your thoughts! @github/codeql-ruby @github/codeql-javascript @github/codeql-python
Splendid work overall! I do have some concerns about breaking changes though.
The Another point to consider is that the
Some of the refactorings to a
/** DEPRECATED. Extend `CommandExecution` or `CommandExecution::Range` instead. */
deprecated class SystemCommandExecution = CommandExecution::Range;
class CommandExecution { .. }
module CommandExecution {
  abstract class Range { ... }
}
It sucks having to arbitrarily rename classes, but it also sucks to break things for our users.
That makes sense for this scenario. I wonder, what should we do if we want to share query-related files like this one? Simple files like this are 90% boilerplate and often just depend on
I agree. Do we have any policy on how long we keep deprecated classes around before completing such a breaking change? It would be sad if we had to keep such things around forever, and we can't have that many users that it's literally impossible for them to all migrate, surely? In Ruby I reckon we have around ~0 which makes it a bit easier!
Assuming we have to rename all the existing classes in the JS
Then for the remaining three: I think we should leave these in
This PR will only tackle the first four names, of course. How does that sound?
The Python and Ruby concepts already all use the
(Sorry about the ping there, I think it's because I updated some class names in documentation)
I haven't had a lot of bandwidth to look at this yet, but I am very interested in getting this to work 👍
To revisit this, I realise we can use the exact same pattern, e.g. as done here. So I've removed the
// ruby/ql/lib/codeql/ConceptsSpecific.qll
module Imports {
  import codeql.ruby.DataFlow
}
module Concepts {
  ...
}
// ruby/ql/lib/codeql/Concepts.qll
private import ConceptsSpecific::Imports
import ConceptsSpecific::Concepts
I think this does what we need without polluting the global scope with either unnecessary extra files or extra modules.
Thanks so much for taking initiative on this 🔥 💪 🙏
Strategy for sharing concepts
I'm not 100% convinced that this strategy for sharing concepts is the right one. After a bit of standardizing, we'll have a beefy Concepts.qll with many items. But what will happen when we start adding support for a new language? No concepts will have any concrete models initially, and I assume that it will take some time to fill in the gaps. This means that potential customers who write QL code will see these concepts as available but will not be able to use them (since they will produce no results) 😬
Another point is that maybe not all concepts even make sense for a specific language; the most prominent example I can think of is
When I thought about how to share concepts initially, I thought about another solution: Define each concept in a shared file, such as
import concepts.SqlExecution as SharedSqlExecution
class Range extends SharedSqlExecution::SqlExecution instanceof SharedSqlExecution::SqlExecution::Range {
  predicate additionalPredicateOnlyNeededInThisLanguage() {
    super.additionalPredicateOnlyNeededInThisLanguage()
  }
}
module SqlExecution {
  abstract class Range extends SharedSqlExecution::SqlExecution::Range {
    abstract predicate additionalPredicateOnlyNeededInThisLanguage();
  }
}
Although I'm not 100% convinced whether that's a good idea 🤔
Problems with concepts I've seen
I think we need to aim for building a solution that is flexible enough that you will not be blocked for weeks just because you want to change an existing concept slightly, but also a solution that encourages "upstreaming" changes, and not just leaving them in your own version.
PRO/CON comparison
This is an attempt to get my thoughts and feelings down on the two proposed solutions (although there might be others)
Solution 1: Shared
This prevents a circular dependency when we override shared concepts, which would otherwise lead to ambiguous name resolution errors.
Update a member predicate of SystemCommandExecution to match the naming in the JS version.
Also add a member predicate `isShellInterpreted`, and rename `getCommand` to `getACommandArgument`. This brings it in line with the JS version.
This brings it in line with the Ruby and Python versions.
This aligns it with the approach taken in Python and Ruby.
This lays the groundwork to override the CommandExecution concept.
Move the isSync and getOptionsArg predicates to a JS-specific version of the CommandExecution concept, as they are not used in Python or Ruby.
@asgerf, to keep you in the loop:
Moving forward, Before we merge, we should add some architecture documentation to the top of
RasmusWL left a comment
With the current way the changes are introduced in this PR, we would be breaking our deprecation policy 😳
"python/ql/lib/semmle/python/ConceptsShared.qll",
"ruby/ql/lib/codeql/ruby/ConceptsShared.qll",
"javascript/ql/lib/semmle/javascript/ConceptsShared.qll"
Can we move this to an internal location, such as "python/ql/lib/semmle/python/internal/ConceptsShared.qll"? -- then it's obvious that end-users and query writers should not use this file directly
We can also move ConceptsImports to that location.
/** DEPRECATED: use `CommandExecution::Range` instead. */
deprecated class SystemCommandExecution = CommandExecution::Range;
Can we move the deprecated aliases to the language specific Concepts.qll files instead? Then we don't introduce deprecated aliases for new languages that adopt ConceptsShared 😉
  /** Gets the argument that specifies the command to be executed. */
- DataFlow::Node getCommand() { result = range.getCommand() }
+ DataFlow::Node getACommandArgument() { result = super.getACommandArgument() }
This rename is not ok in itself, since we need to follow standard deprecation policy for getCommand ...
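One way the rename could stay within deprecation policy for callers is to keep the old predicate as a deprecated forwarder. This is only a sketch: it reuses the Ruby `DataFlow` import path quoted elsewhere in this thread, and it does not cover existing subclasses that override the abstract `getCommand` on `Range`, which would need their own migration path.

```ql
// Sketch only: assumes the Ruby dataflow library path mentioned in this thread.
import codeql.ruby.DataFlow

/** A data-flow node that executes an operating system command. */
class SystemCommandExecution extends DataFlow::Node instanceof SystemCommandExecution::Range {
  /** Gets an argument that specifies the command to be executed. */
  DataFlow::Node getACommandArgument() { result = super.getACommandArgument() }

  /** DEPRECATED: Use `getACommandArgument` instead. */
  deprecated DataFlow::Node getCommand() { result = this.getACommandArgument() }
}

/** Provides a class for modeling new command execution APIs. */
module SystemCommandExecution {
  /** Extend this class to model a new command execution API. */
  abstract class Range extends DataFlow::Node {
    /** Gets an argument that specifies the command to be executed. */
    abstract DataFlow::Node getACommandArgument();
  }
}
```

Existing queries that call `getCommand()` would keep compiling, with a deprecation warning, until the old name is removed.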
  abstract class Range extends DataFlow::Node {
    /** Gets the argument that specifies the command to be executed. */
-   abstract DataFlow::Node getCommand();
+   abstract DataFlow::Node getACommandArgument();
   * for instance by spawning a new process.
   */
- abstract class SystemCommandExecution extends DataFlow::Node {
+ class SystemCommandExecution extends DataFlow::Node instanceof SystemCommandExecution::Range {
this change in itself is also not compatible with deprecation policy 😞
I tried to rewrite the commits so we would follow deprecation policy (like I pointed out in review above). That spurred me to look closer at the suggested changes. I specifically looked at CommandExecution, and how it was used in the 3 languages, where I found some inconsistencies. Overall I'm not sure we want the shared concept to contain as many member-predicates initially (at least not without reworking the queries in both Python and Ruby). When we have come to an agreement on how we want to share concepts, as highlighted by @esbena here, I would propose we split this PR up into:
If you're interested in the details of my investigation, see details below.
Details: investigation of how command execution concept is used
Currently, on main:
Ruby
👍🏻 to the split of this PR.
agreed 👌
Superseded by #8476
This PR attempts to move towards the sharing of `Concepts.qll` across Ruby, Python and JavaScript. To achieve this it adds two main things:

Re-export common libraries at common paths
The classes in `Concepts.qll` all depend on at least the language's dataflow library. This is currently imported using something like `import codeql.ruby.DataFlow` or `import semmle.javascript.dataflow.DataFlow`. In order to share `Concepts.qll`, we need common paths for these libraries across the languages.
I've chosen `codeql` as the common prefix for these files, so we have
- `codeql.DataFlow`
- `codeql.Concepts`
- `codeql.TaintTracking`
Not all of these files are needed for the rest of the changes in this PR, but they will allow us to share common queries in the future.
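As a rough illustration of the re-export idea, the sketch below shows two separate files from one language pack in a single listing. The file locations are hypothetical, `FileSystemAccess` is reduced to a single member predicate, and the Ruby import path is the one quoted above; none of this is taken from the actual commits.

```ql
// File 1 (hypothetical location): ruby/ql/lib/codeql/DataFlow.qll
// A non-private import re-exports the language's own dataflow library,
// so shared code can refer to it by the common path `codeql.DataFlow`.
import codeql.ruby.DataFlow

// File 2 (hypothetical location): ruby/ql/lib/codeql/Concepts.qll
// Kept identical in every language pack; only the common path is imported.
private import codeql.DataFlow

/** A data-flow node that performs a file system access. */
class FileSystemAccess extends DataFlow::Node instanceof FileSystemAccess::Range {
  /** Gets an argument to this file system access that is interpreted as a path. */
  DataFlow::Node getAPathArgument() { result = super.getAPathArgument() }
}

/** Provides a class for modeling new file system access APIs. */
module FileSystemAccess {
  /** Extend this class to model a new file system access API. */
  abstract class Range extends DataFlow::Node {
    /** Gets an argument to this file system access that is interpreted as a path. */
    abstract DataFlow::Node getAPathArgument();
  }
}
```

With this layout, the JavaScript pack would provide its own `codeql/DataFlow.qll` containing `import semmle.javascript.dataflow.DataFlow`, and the shared file would stay unchanged.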
Introduce `ConceptsSpecific.qll`
This module contains concept classes which are specific to the language. What is left in `Concepts.qll` are classes that are shared across all three languages. Over time, as we standardise our concepts, the idea is we move classes from `ConceptsSpecific` to `Concepts`.
I've started by sharing the following concepts:
- `FileSystemAccess`
- `FileSystemReadAccess`
- `FileSystemWriteAccess`
- `SystemCommandExecution`
This required changing a few member predicate names to be consistent across the languages, and using the `Range` pattern in JS, as we do in Ruby and Python.
This PR is structured to be reviewed commit-by-commit. I've tried to make it clear that nothing has changed when moving everything from `Concepts` to `ConceptsSpecific` by renaming the file and re-adding the original.

To do