Ruby: model the standard library's Pathname class #9708

Open
wants to merge 6 commits into
base: main

nickrolfe
Contributor

@nickrolfe nickrolfe commented Jun 24, 2022

No description provided.

@github-actions github-actions bot added the Ruby label Jun 24, 2022
@nickrolfe nickrolfe added the no-change-note-required label Jun 27, 2022
@nickrolfe nickrolfe marked this pull request as ready for review Jun 27, 2022
@nickrolfe nickrolfe requested a review from as a code owner Jun 27, 2022
@hmac
Contributor

@hmac hmac commented Jun 28, 2022

I think this could be a case where Models as Data (MaD) comes in handy. A lot of the flow summaries here match on just the method name, which I worry could cause false positives. If we use type summaries, we can construct a concept of what a Pathname instance is and then target just those instances. For example:

  private class PathnameTypeSummary extends ModelInput::TypeModelCsv {
    override predicate row(string row) {
      // package1;type1;package2;type2;path
      row =
        [
          // Pathname.new : Pathname
          ";Pathname;;;Member[Pathname].Instance",
          // pathname.join(path) : Pathname
          ";Pathname;;Pathname;Method[join].ReturnValue"
        ]
    }
  }

  private class PathnameTaintSummary extends ModelInput::SummaryModelCsv {
    override predicate row(string row) {
      row =
        [
          // Pathname.new(path)
          ";;Member[Pathname].Method[new];Argument[0];ReturnValue;taint",
          // pathname.join
          ";Pathname;Method[join];Argument[self,any];ReturnValue;taint",
          // pathname.parent
          ";Pathname;Method[parent];Argument[self];ReturnValue;taint",
        ]
    }
  }

This will pass the following test:

def m_join
	a = Pathname.new(source 'a')
	b = Pathname.new('foo')
	c = Pathname.new(source 'c')
	sink a.join(b, c) # $ hasTaintFlow=a $ hasTaintFlow=c
end

def m_parent
	a = Pathname.new(source 'a')
	sink a.parent() # $ hasTaintFlow=a
	
	b = a.join("foo")
	sink b.parent() # $ hasTaintFlow=a
end

The last example, where we call parent on the result of a join call, wouldn't work (I think) with just API Graphs.
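For reference, a minimal Ruby sketch of the runtime behaviour that the type summary row for join is meant to capture, assuming only Ruby's bundled pathname library (the path strings are made up for illustration): join returns another Pathname, so the receiver of the later parent call is a Pathname even though it did not come directly from Pathname.new.

```ruby
require 'pathname'

# Illustrative paths only; nothing here touches the file system.
a = Pathname.new('/srv/data')
b = a.join('foo')

# join returns a new Pathname, not a String...
puts b.class          # prints "Pathname"

# ...so parent is still being called on a Pathname instance.
puts b.parent.class   # prints "Pathname"
puts b.parent == a    # prints "true"
```

This is why the model needs to know the return type of join: without it, the call to parent on b has no receiver type to match against.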

@nickrolfe
Contributor Author

@nickrolfe nickrolfe commented Jun 29, 2022

@hmac thank you – you have no idea how useful your comment was! I had been quietly worried about the flow summaries matching only on the method name, but wasn't sure we had any way of dealing with that. And now, based on your example, I understand how to write MaD-based summaries.

I've replaced the flow summaries with MaD-based ones, beefed them up to cover a few more methods, and also bulked up the test.

It seems like PathnameInstance and PathnameTypeSummary are now trying to do pretty similar things. I wonder if it's possible to avoid that duplication.

@@ -14,6 +14,7 @@ import core.Hash
import core.String
import core.Regexp
import core.IO
import core.Pathname
Contributor

@hmac hmac Jun 30, 2022

Super pedantic nit, but I think pathname is technically an extension to the standard library. It's vendored here but I think the canonical source is https://github.com/ruby/pathname. So it would be slightly more correct to move core/Pathname.qll to stdlib/Pathname.qll and move this import to Stdlib.qll.

* Every `PathnameInstance` is considered to be a `FileNameSource`.
*/
class PathnameInstance extends FileNameSource, DataFlow::Node {
PathnameInstance() { this = pathnameInstance() }
Contributor

@hmac hmac Jun 30, 2022

I see what you mean about duplication between pathnameInstance and the type summary. This seems to work:

PathnameInstance() {
  this = ModelOutput::getATypeNode("", "Pathname").getAValueReachableFromSource()
}

Contributor Author

@nickrolfe nickrolfe Jun 30, 2022

Thanks, that's good to know, but it doesn't seem to follow assignments to other variables, so the test I added for the file-related concepts has some failures. I think I'll leave my existing implementation of PathnameInstance for now.
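As a hypothetical illustration of the shape of code in question (this is not the actual test file from this PR, and the variable names and file name are invented), a Pathname can be assigned to a second variable before a file-system method is called on it:

```ruby
require 'pathname'

# Hypothetical example; 'config.yml' need not exist.
original = Pathname.new('config.yml')

# Plain assignment: both variables refer to the same Pathname object.
copy = original
puts copy.equal?(original)   # prints "true"

# A file-system query on the second variable. For the analysis to treat
# this as a file access, it has to recognise `copy` as a Pathname instance.
puts copy.exist?.inspect
```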

@hmac
Copy link
Contributor

@hmac hmac commented Jun 30, 2022

It would be interesting to see if we get any DCA change with the new FileNameSource and FileSystemAccess instances. Ah I'm blind - I see you've already done one!

Labels
no-change-note-required Ruby

2 participants