JS: Add individual per-security-query counting queries #9193

TomBolton · 2022-05-17T12:33:00Z

The ATM project uses an AML pipeline to select which CodeQL databases to use as the evaluation databases for each security query being boosted.

Previously, evaluation sets for all security queries were produced at once using the CountAlertsAndEndpoints.ql query. However, this query does not scale as more security queries are added, and creating evaluation sets for every query on every pipeline run is inefficient: when adding a new query, you only want to create an evaluation set for that one query, not the others.

Therefore, after a discussion, it was deemed better to specify the one security query for which you want to create an evaluation set. This requires a way to count alerts and endpoints for a single query. The simplest solution is to add per-query counting queries, Count{QUERY_NAME}.ql, which can be used by the selection pipeline.

The selection pipeline has been tested with the CountCodeInjection.ql query.

esbena · 2022-05-17T13:06:23Z

This introduces a lot of duplicated CodeQL code that can be avoided by using imports and query predicates.

I think each query could look something like this:

import semmle.javascript.security.dataflow.CodeInjectionQuery
import CountThings

Where CountThings.qll looks something like:

import javascript
import evaluation.EndToEndEvaluation

// Counts non-excluded (source, sink) flows and candidate sinks
// over the data-flow configurations in scope.
query predicate countThings(int numAlerts, int numSinks) {
  numAlerts =
    count(DataFlow::Configuration cfg, DataFlow::Node source, DataFlow::Node sink |
      cfg.hasFlow(source, sink) and not isFlowExcluded(source, sink)
    ) and
  numSinks =
    count(DataFlow::Node sink |
      exists(DataFlow::Configuration cfg | cfg.isSink(sink) or cfg.isSink(sink, _))
    )
}

Could you consider whether such a refactoring is possible and worth it?

TomBolton · 2022-05-17T14:14:51Z

Yes, good point @esbena and I definitely think it's worth it.

Basic QL question: what would the individual queries look like? I.e., how would they actually use the CountThings import? I've checked the documentation and it's not clear how to import and use a predicate.

esbena · 2022-05-17T14:35:16Z

The trick is that a query predicate contributes to the results of the query file that imports it. You can view it as an importable from/where/select.
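A minimal sketch of this mechanism, assuming two files in the same directory. The file names Greeting.qll and UseGreeting.ql and the predicate greeting are hypothetical and are not part of this PR; they only illustrate how an imported query predicate can stand in for a select clause.

```ql
// --- Greeting.qll (hypothetical library file) ---
// A query predicate is an ordinary predicate annotated with `query`.
query predicate greeting(string msg) { msg = "hello" }

// --- UseGreeting.ql (hypothetical query file, same directory) ---
// No from/where/select is written here: the imported query predicate
// supplies the results of this query file.
import Greeting
```

The per-query counting files suggested above follow the same shape: each one imports a security query's library plus the shared counting library, and contains no select clause of its own.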

esbena · 2022-05-17T14:37:33Z

Ah, there was a critical typo: I have updated the example file name to end with .qll now!

TomBolton · 2022-05-17T15:30:53Z

I've just attempted the suggested refactor @esbena - would you mind giving it another look when you have time?

TomBolton · 2022-05-23T14:19:35Z

@github/codeql-ml-powered-queries-reviewers if anyone has a spare moment, could someone provide a quick review?

esbena

The "as CodeInjection/TaintedPath/..." namings can be dropped now.
Otherwise LGTM.

TomBolton · 2022-05-24T14:04:56Z

Thanks for the review @esbena - I've now removed the "as" aliases in the imports.

I accidentally dismissed your approval. Sorry, would you mind approving again?

esbena

(Your edit caused the approval dismissal)

TomBolton added the JS label May 17, 2022

TomBolton requested a review from a team May 17, 2022 12:33

github-actions bot removed the JS label May 17, 2022

esbena previously approved these changes May 23, 2022

TomBolton dismissed esbena’s stale review via 5b34741 May 24, 2022 14:01

TomBolton added 3 commits May 24, 2022 15:02

add individual per-security-query counting queries

3396438

refactor counting code into a library

7e32614

simplify imports in counting queries

91fa17a

TomBolton force-pushed the tombolton/add-counting-queries branch from 5b34741 to 91fa17a Compare May 24, 2022 14:02

esbena approved these changes May 25, 2022

TomBolton merged commit 67572bb into main May 25, 2022

TomBolton deleted the tombolton/add-counting-queries branch May 25, 2022 09:02
