Ruby: add rb/sensitive-get-query query #10369

Merged
alexrford merged 26 commits into github:main from alexrford:rb/sensitive-get-query
Oct 14, 2022

Conversation

@alexrford alexrford commented Sep 10, 2022

Contributor

Finds cases where an HTTP GET request handler takes sensitive input, such as a password or other credential, from the query string of the request. The bulk of this PR concerns porting SensitiveNode to Ruby.

This uses a string getAnHttpMethod() predicate for HTTP::Server::RequestHandlers, which is currently implemented for Rails ActionController and the Ruby graphql gem. In the case of GraphQL, both POST and GET requests are accepted by default and handled uniformly, so the string getAnHttpMethod() implementation is trivial here, as I couldn't find a documented way to change this behaviour.

github-actions Bot commented Sep 10, 2022

Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-598/SensitiveGetQuery.qhelp

Sensitive data read from GET request

Sensitive information such as passwords should not be transmitted within the query string of the requested URL. Sensitive information within URLs may be logged in various locations, including the user's browser, the web server, and any proxy servers between the two endpoints. URLs may also be displayed on-screen, bookmarked or emailed around by users. They may be disclosed to third parties via the Referer header when any off-site links are followed. Placing sensitive information into the URL therefore increases the risk that it will be captured by an attacker.

Recommendation

Use HTTP POST to send sensitive information as part of the request body; for example, as form data.
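
To make the difference concrete, here is a minimal Ruby sketch that builds (but never sends) a GET and a POST request with the standard library. The endpoint and credentials are hypothetical and exist only for illustration. It shows that the GET request carries the password in the request target, which is what ends up in logs and browser history, while the POST request carries it in the form-encoded body.

```ruby
require "uri"
require "net/http"

# Hypothetical endpoint and credentials, used only for illustration.
credentials = { "username" => "alice", "password" => "s3cret" }

# GET: the credentials are encoded into the URL's query string.
get_uri = URI("https://example.com/users/login")
get_uri.query = URI.encode_www_form(credentials)
get_request = Net::HTTP::Get.new(get_uri)

# POST: the URL stays free of credentials; they go into the request body.
post_uri = URI("https://example.com/users/login")
post_request = Net::HTTP::Post.new(post_uri)
post_request.set_form_data(credentials)

puts get_request.path   # path plus query string, which includes the password
puts post_request.path  # path only, no credentials
puts post_request.body  # form-encoded credentials in the body
```

No network traffic is generated here; the request objects are only constructed and inspected.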

Example

The following example shows two route handlers that both receive a username and a password. The first receives this sensitive information from the query parameters of a GET request, which is transmitted in the URL. The second receives this sensitive information from the request body of a POST request.

Rails.application.routes.draw do
  get "users/login", to: "users#login_get" # BAD: sensitive data transmitted through query parameters
  post "users/login", to: "users#login_post" # GOOD: sensitive data transmitted in the request body
end
class UsersController < ActionController::Base
  def login_get
    password = params[:password]
    authenticate_user(params[:username], password)
  end

  def login_post
    password = params[:password]
    authenticate_user(params[:username], password)
  end

  private
  def authenticate_user(username, password)
    # ... authenticate the user here
  end
end

References

Comment thread ruby/ql/lib/codeql/ruby/frameworks/GraphQL.qll Fixed
Comment thread ruby/ql/src/queries/security/cwe-598/SensitiveGetQuery.ql Fixed
@github github deleted a comment from ThanhVy1121 Sep 12, 2022
@alexrford alexrford marked this pull request as ready for review September 19, 2022 19:57
@alexrford alexrford requested a review from a team as a code owner September 19, 2022 19:57
@calumgrant calumgrant requested a review from hmac September 20, 2022 08:38

@hmac hmac left a comment

Contributor

This looks good to me. I have a few small comments and I also think we should do a DCA run, but otherwise 👍.

|
localFlowWithElementReference(mid, to)
)
}

Contributor

Would a normal dataflow configuration-style query work here? Or is local flow a better choice?

Contributor Author

I should probably try this out with a full TaintTracking configuration as well to see if there's any difference in performance or results. In the JS version it seems like using global dataflow/taint tracking didn't improve results, but this may be different for Ruby. FWIW Java uses full taint tracking for its version of this query.

CredentialsMethodName() { nameIndicatesSensitiveData(this, classification) }

override SensitiveDataClassification getClassification() { result = classification }
}

Contributor

Since we've ported this from the JS version, it makes me wonder: is there value in these abstract classes being part of some shared Concept? It's a lot of extra work so I'm ok if we don't do it now, but it might be worth talking about.

@alexrford alexrford Sep 23, 2022

Contributor Author

Yeah, I think that would be sensible. When I was working on this PR I ran into some issues where the JS version of this class had been moved from the AST layer to the DataFlow layer. I only noticed this change by chance, but it would have been easier to notice and stay synched on if there had been some shared Concept used by both.


override string getFramework() { result = "ActionController" }

override string getAnHttpMethod() { result = this.getARoute().getHttpMethod() }

Contributor

Nice to see that the routes modelling is coming in useful in some places!

Comment thread ruby/ql/lib/codeql/ruby/security/SensitiveGetQueryQuery.qll Fixed
Comment thread ruby/ql/src/queries/security/cwe-598/SensitiveGetQuery.ql Fixed
@alexrford

Contributor Author

Finally got around to updating this PR. There are a couple of main changes:

  • Switched to using full taint tracking rather than local flow. I've kicked off a DCA run to see how this affects results/performance. I've used the Query/Customization pattern to implement this, which leaves us with an unfortunately named SensitiveGetQueryQuery.qll library. I converted the query to a path-problem as part of this change, which is not consistent with the JS version, but it seems worth using the path graph if we have it.
  • Fixed an issue where data from request headers was also treated as if it were transmitted as plaintext, whereas the other versions of this query are correctly limited to just query params. This required adding a string RequestInputAccess#getKind() predicate.

@alexrford

Contributor Author

I've rolled the taint tracking changes back since this had a fairly large performance impact and the results that it found should all be picked up with local flow anyway.

@alexrford

Contributor Author

> I've rolled the taint tracking changes back since this had a fairly large performance impact and the results that it found should all be picked up with local flow anyway.

I spoke too soon - the performance impact seems to be due to something else. I'll look into this properly tomorrow.

@alexrford

Contributor Author

I think the bulk of the performance impact (~3.5% increase in analysis time) is from computing the new SensitiveNodes. I've not been able to make any significant improvements here. I'm conflicted over whether this regression is acceptable: the SensitiveNode class supports 5 different queries in JS, of which I think 4 can be ported to Ruby whilst reusing this class. As it stands, we only use this class for rb/sensitive-get-query in this PR.

@aibaars aibaars left a comment

Contributor

Good work; I have some comments and questions.

<qhelp>
<overview>
<p>
Sensitive information such as user passwords should not be transmitted within the query string of the requested URL.

@aibaars aibaars Oct 13, 2022

Contributor

Did you mean passwords, or user names and passwords? While "user passwords" is a correct term, I think transmitting non-user passwords would be equally problematic.

Contributor Author

Yeah, just passwords makes more sense here. We don't report user names here as they're not considered sensitive in this context.

Comment thread ruby/ql/src/queries/security/cwe-598/SensitiveGetQuery.qhelp Outdated
Comment thread ruby/ql/lib/codeql/ruby/Concepts.qll Outdated
* Gets the kind of the accessed input,
* Can be one of "parameter", "header", "body", "url", "cookie".
*
* Note that this predicate is functional.

Contributor

What is the purpose of this note? I guess "functional" in this context indicates that the predicate returns exactly one result. This is kind of implied by the name of the predicate (get instead of getA), by the "Gets the" wording, and by the lack of an "if any" suffix in the comment.

I think the note may confuse users and make them wonder if there are any predicates that do not work/function somehow.

Contributor Author

I'm going to back out the kind-related changes here, as #10602 implements this in a neater way. FWIW, this was more or less copied from the JS version. My understanding of "functional" in this sense was that the return values could be relied on to have a precise semantic meaning. This is in contrast to string getSourceType(), which is a bit fuzzier in what it might return.

Contributor

Sorry for conflicting with your changes here! When we talked about it in the meeting, I misunderstood and thought your changes were already merged, so when I couldn't find them I assumed they were for a different concept and so I would have to add them in my PR.

Contributor Author

> Sorry for conflicting with your changes here! When we talked about it in the meeting, I misunderstood and thought your changes were already merged, so when I couldn't find them I assumed they were for a different concept and so I would have to add them in my PR.

No worries, thanks for the improved implementation. I've updated this PR to use it.

Comment thread ruby/ql/lib/codeql/ruby/Concepts.qll Outdated
* Gets the kind of the accessed input,
* Can be one of "parameter", "header", "body", "url", "cookie".
*
* Note that this predicate is functional.

Contributor

same here.

Contributor Author

This is gone now due to pulling in the changes from the other PR.

*/
pragma[nomagic]
private predicate writesProperty(DataFlow::Node node, string name) {
exists(VariableWriteAccess vwa | vwa.getVariable().getName() = name |

Contributor

If I recall correctly, instance and class variables have @ and @@ prefixes in their names. Do we need to account for those here?

Contributor Author

I don't think it matters in this case since nameIndicatesSensitiveData matches against some fairly permissive regexps (no anchors). This does remind me that at some point we discussed making Variable#getName() strip the prefixes from class/instance variables, but we didn't end up making this change. It's a bit of a shame as I think that's probably better default behaviour.

Contributor Author

Sorry, I take that back - the heuristics exclude anything containing non-alphanumerics. It might be worth changing nameIndicatesSensitiveData to ignore leading @ characters as we deal with variable names in multiple places in this file.

Contributor Author

nameIndicatesSensitiveData is in a shared file dealing with cross-language heuristics, so I've instead updated the places in this file that deal with variable accesses.
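
The prefix problem discussed in this thread can be illustrated with a small Ruby sketch. Everything in it is an assumption made for illustration: the regular expression, the alphanumeric check, and both helper names are hypothetical stand-ins, not the actual shared heuristics. It only shows why a name such as @password would be rejected by a check that excludes non-alphanumeric characters, and how stripping the prefix at the call site avoids touching the shared code.

```ruby
# Hypothetical, heavily simplified stand-in for a name-based heuristic.
SENSITIVE_NAME = /passw(or)?d|secret|credential/i

# Mimics a heuristic that rejects names containing characters other than
# letters, digits, and underscores, so "@password" is rejected outright.
def name_indicates_sensitive_data?(name)
  name.match?(/\A\w+\z/) && name.match?(SENSITIVE_NAME)
end

# Strips the leading "@" or "@@" of instance and class variable names.
def unprefixed_variable_name(name)
  name.sub(/\A@{1,2}/, "")
end

%w[password @password @@password username].each do |name|
  raw = name_indicates_sensitive_data?(name)
  stripped = name_indicates_sensitive_data?(unprefixed_variable_name(name))
  puts "#{name}: raw=#{raw} stripped=#{stripped}"
end
```

Normalising the name before calling the heuristic keeps the cross-language predicate unchanged, at the cost of repeating the normalisation wherever variable names are checked.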

class UsersController < ApplicationController

def login_get
password = params[:password] # BAD: route handler uses GET query parameters to receive sensitive data

Contributor

Could you also add a test case with an instance variable? Like

@password = params[:some_thing]
authenticate_user(params[:username], @password)

class UsersController < ApplicationController

def login_get
password = params[:password] # BAD: route handler uses GET query parameters to receive sensitive data

Contributor

I think it would be best to change the test case into one that uses params[:password] with a non-sensitive variable name, and to add another with a non-sensitive query parameter name and a sensitive variable name.
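
A sketch of what the requested test cases could look like, including the instance variable variant suggested earlier. The method names are hypothetical, and the ApplicationController stub is only there so the snippet runs without Rails. Which of these handlers the query actually reports is not stated in this thread, so the comments describe only the shape of each case.

```ruby
# Minimal stand-in so the sketch runs without Rails; in a real test file
# ApplicationController comes from the Rails application.
class ApplicationController
  attr_reader :params

  def initialize(params = {})
    @params = params
  end
end

class UsersController < ApplicationController
  # Sensitive query parameter name, non-sensitive variable name.
  def login_get_sensitive_param
    value = params[:password]
    authenticate_user(params[:username], value)
  end

  # Non-sensitive query parameter name, sensitive local variable name.
  def login_get_sensitive_local
    password = params[:some_thing]
    authenticate_user(params[:username], password)
  end

  # Non-sensitive query parameter name, sensitive instance variable name.
  def login_get_sensitive_ivar
    @password = params[:some_thing]
    authenticate_user(params[:username], @password)
  end

  private

  def authenticate_user(username, password)
    # ... authenticate the user here; returns its inputs for demonstration
    [username, password]
  end
end
```

Separating the two signals (parameter name and variable name) makes it clear which heuristic is responsible when a result appears or disappears.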

alexrford commented Oct 13, 2022

Contributor Author

I'll update this tomorrow to reflect the kind changes from the other PR before running DCA again.

@alexrford

Contributor Author

Updated to address review comments. DCA has a couple of FPs that are interesting, as they look to be safe by virtue of appearing in an if statement body that checks for request.post? or request.get?.
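
For illustration, the kind of guarded handler described here might look like the following sketch. It is not taken from the DCA results: the controller, the action, and the Request stub are hypothetical, and the stub exists only so the snippet runs without Rails (in Rails, request.post? and request.get? are provided by the request object). The action is assumed to be routed for both GET and POST, but it only reads the password when the request is a POST.

```ruby
# Minimal stand-in for the Rails request object, for illustration only.
Request = Struct.new(:request_method) do
  def get?
    request_method == "GET"
  end

  def post?
    request_method == "POST"
  end
end

# Hypothetical controller whose login action serves both GET and POST.
class SessionsController
  attr_reader :request, :params

  def initialize(request, params)
    @request = request
    @params = params
  end

  def login
    if request.post?
      # Only reached for POST requests, so the password comes from the body.
      password = params[:password]
      authenticate_user(params[:username], password)
    else
      :render_login_form
    end
  end

  private

  def authenticate_user(_username, _password)
    # ... authenticate the user here
    :authenticated
  end
end
```

A purely local-flow analysis sees the params[:password] read in a handler that accepts GET and cannot tell that the surrounding condition rules GET out.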

@aibaars aibaars left a comment

Contributor

LGTM, perhaps we should implement some sanitizers/barriers for if request.post? checks.

@alexrford

Contributor Author

> LGTM, perhaps we should implement some sanitizers/barriers for if request.post? checks.

Do you think that this is a blocker for the current PR?

@alexrford alexrford merged commit 2c5129e into github:main Oct 14, 2022
@alexrford alexrford deleted the rb/sensitive-get-query branch October 14, 2022 21:35
aibaars commented Oct 17, 2022

Contributor

> > LGTM, perhaps we should implement some sanitizers/barriers for if request.post? checks.
>
> Do you think that this is a blocker for the current PR?

No problem ;-)
