Ruby: query to automatically extract type definitions from library code #13750

Draft
wants to merge 5 commits into
base: main
from

Conversation

alexrford
Contributor

@alexrford alexrford commented Jul 14, 2023

This is currently limited and WIP. It automatically extracts some type definitions from a library codebase in a format amenable to MaD (Models as Data).

Predicates

5 query predicates are added: sourceModel, sinkModel, summaryModel, typeModel, and typeVariableModel. These correspond to the extensible predicates in ApiGraphModelsExtensions.qll. Of these, currently only typeModel is populated.

There are 3 major predicates that contribute to the typeModel output:

typeModelReturns

This predicate looks at returned values from methods in the codebase and determines if they return an instance of some class that we can find. For instance, in the case of some code like:

class Foo
  def get_bar
    Bar.new
  end

  def get_bar_indirectly
    get_bar
  end
end

class Bar
end

We can determine that the return value from get_bar or get_bar_indirectly may be a Bar instance, giving us typeModel rows of:

type1  type2  path
Bar    Foo    Method[get_bar].ReturnValue
Bar    Foo    Method[get_bar_indirectly].ReturnValue
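
As a rough illustration of the propagation described above (this is not the QL implementation in this PR), the following Ruby sketch resolves return types through intermediate calls on a hand-written summary of the example. The `METHODS` table and the `return_types` and `type_model_rows` helpers are hypothetical names invented for this sketch.

```ruby
require "set"

# Hypothetical, hand-written summary of the example above. For each
# [class, method] pair we list what the method body returns:
#   [:new, "Bar"]      -> returns Bar.new
#   [:call, "get_bar"] -> returns the result of calling get_bar on self
METHODS = {
  ["Foo", "get_bar"]            => [[:new, "Bar"]],
  ["Foo", "get_bar_indirectly"] => [[:call, "get_bar"]],
}.freeze

# Collects the class names a method may return, following calls on self.
# `seen` guards against infinite recursion between mutually recursive methods.
def return_types(klass, method, seen = Set.new)
  key = [klass, method]
  return Set.new if seen.include?(key)

  seen = seen.dup.add(key)
  METHODS.fetch(key, []).each_with_object(Set.new) do |(kind, target), types|
    case kind
    when :new  then types << target
    when :call then types.merge(return_types(klass, target, seen))
    end
  end
end

# Builds (type1, type2, path) rows in the same shape as the table above.
def type_model_rows
  METHODS.keys.flat_map do |klass, method|
    return_types(klass, method).map do |type1|
      [type1, klass, "Method[#{method}].ReturnValue"]
    end
  end
end

type_model_rows.each { |row| puts row.join(" ") }
```

Run as written, this prints the two rows shown above, one for `get_bar` and one for `get_bar_indirectly`.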

typeModelParameters

This predicate looks at cases where we call some method and pass it an expression with a known type as an argument. For example, in:

class Foo
  def execute_with_database(db, action)
    ...
  end
end
...

db = Some::Database.new
foo = Foo.new

foo.execute_with_database(db, "SELECT name FROM users")

we get:

type1           type2  path
Some::Database  Foo    Method[execute_with_database].Parameter[0]
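
To make the mapping from call sites to rows concrete, here is a minimal Ruby sketch, again not the QL in this PR. `CallSite` and `parameter_rows` are hypothetical names. The sketch assumes each call site has already been reduced to a receiver class, a method name, and one known class per positional argument, with `nil` where no type is known. Because the example output above lists only the `db` argument, the string literal is treated as unknown here, which is also an assumption.

```ruby
# Hypothetical record of one call site: the receiver's class, the method
# name, and the known class of each positional argument (nil if unknown).
CallSite = Struct.new(:receiver_class, :method_name, :argument_classes)

# Emits one (type1, type2, path) row per positional argument whose class
# is known, de-duplicated across call sites.
def parameter_rows(call_sites)
  call_sites.flat_map do |site|
    site.argument_classes.each_with_index.filter_map do |arg_class, index|
      next if arg_class.nil?

      [arg_class, site.receiver_class,
       "Method[#{site.method_name}].Parameter[#{index}]"]
    end
  end.uniq
end

sites = [
  # foo.execute_with_database(db, "SELECT name FROM users")
  CallSite.new("Foo", "execute_with_database", ["Some::Database", nil]),
]

parameter_rows(sites).each { |row| puts row.join(" ") }
```

The parameter index in the path comes directly from the argument's position at the call site, which is why this approach covers only positional parameters.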

typeModelBlockArgumentParameters

This looks at cases where a method calls yield or block.call in order to invoke its block argument, and passes an argument of a known type to that block. For example, in:

class Foo
  def initialize
    # do some initialization...
    if block_given?
      yield self
    end
  end
end
...

In this case, we can determine that the initialize method (which will usually be invoked via Foo.new) will call its block argument, if one is provided, with an instance of Foo as the first argument to that block. This gives us:

type1  type2  path
Foo    Foo!   Method[new].Argument[block].Parameter[0]

where the ! suffix indicates a call against the class object rather than an instance of that class.
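
The runtime behaviour that this row models can be checked with a short, self-contained Ruby snippet. The variable names `yielded` and `foo` are chosen only for this illustration.

```ruby
class Foo
  def initialize
    # Hand the freshly constructed instance to the caller's block, if any.
    yield self if block_given?
  end
end

yielded = nil
# The call is made against the class object (Foo!), and the block's first
# parameter receives an instance of Foo.
foo = Foo.new { |instance| yielded = instance }

puts yielded.equal?(foo) # true
puts yielded.class       # Foo
```

Ruby forwards a block given to `Foo.new` on to `initialize`, which is why the path is written against `Method[new]` on `Foo!` rather than against `initialize` on `Foo`.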

Tooling

This PR also adds a very (very) rough Python script at ruby/scripts/generate_model.py. Pointing this script at a CodeQL Ruby database runs GenerateModel.ql against that database, with options to write the output to a file.


This is heavily WIP at the moment. Many of the limitations are TODOs in the code, but here is a quick list of things that are not yet supported or need additional work:

  • singleton methods
  • non-positional parameters
  • parameters to block arguments
  • non-positional parameters to block arguments
  • better tracking of types of DataFlow::ExprNodes using ApiGraphs - struggled with this in practice due to various performance and implementation issues
  • cases where there are multiple potential access paths to a value (improved, still not perfect)
  • construction of access paths in general is rough - possibly API graphs can help here (extension work)
  • heuristic ways to determine types outside of calls to SomeClass.new (extension work)

@alexrford alexrford added the WIP (This is a work-in-progress, do not merge yet!) and Ruby labels Jul 14, 2023
@alexrford alexrford force-pushed the rb/extract-types-experiment branch from 035f053 to ae36bba Compare July 20, 2023 14:17
@alexrford alexrford force-pushed the rb/extract-types-experiment branch from f7aecd2 to 4089bc5 Compare July 21, 2023 16:30