JS/PY/RB: get ReDoSUtil in sync for ruby #7173

Merged
merged 3 commits into from Nov 24, 2021


@erik-krogh erik-krogh commented Nov 18, 2021

No description provided.

@erik-krogh erik-krogh marked this pull request as ready for review Nov 18, 2021
@erik-krogh erik-krogh requested review from as code owners Nov 18, 2021

@esbena esbena commented Nov 19, 2021

From the PR title, it sounds to me like this only changes the Ruby implementation, but that is clearly not the case. Can you elaborate a bit?


@nickrolfe nickrolfe commented Nov 19, 2021

When I added the ReDoS queries for Ruby, I started from the existing Python parser and the JS/Python ReDoSUtil.qll, but I found the only forms of character class they handled were simple escapes like \d and \w. I had to extend them to support POSIX character classes like [[:digit:]] and the similar \p{} construct.

I haven't reviewed @erik-krogh's changes closely yet, but it looks like he's generalised the concept of a character class so that the code can be shared with Ruby without losing support for those constructs. Is that right?

Edit: if that is correct, then this also paves the way for supporting the \N{name} syntax added in Python 3.8.
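
For reference, here is a minimal sketch of what the \N{name} syntax looks like on the Python side. It only shows the regex feature itself (available in Python 3.8 and later) and says nothing about how the QL regex parser would model it; the pattern and inputs are made up for illustration.

```python
import re

# \N{name} inside a regex pattern refers to a single character by its
# Unicode name. Support for this in the `re` module requires Python 3.8+.
# The pattern below is a made-up example: "1", then "-", then "2".
pattern = re.compile(r"\N{DIGIT ONE}\N{HYPHEN-MINUS}\N{DIGIT TWO}")

match = pattern.fullmatch("1-2")      # matches the literal text "1-2"
no_match = pattern.fullmatch("1_2")   # underscore is not HYPHEN-MINUS
```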


@erik-krogh erik-krogh commented Nov 19, 2021

> I haven't reviewed @erik-krogh's changes closely yet, but it looks like he's generalised the concept of a character class so that the code can be shared with Ruby without losing support for those constructs. Is that right?

That is exactly right.
I introduced a new predicate isEscapeClass(RegExpTerm term, string clazz), which is used instead of RegExpCharacterClassEscape.

The new predicate generalizes escape classes, and "normalizes" them to \d, \s, \w.
Each language must implement this new predicate.
The JS/Python implementations are currently trivial (and identical), but the Ruby implementation maps e.g. [[:digit:]] to \d.
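
The normalization idea can be sketched outside QL roughly as follows. This is an illustrative Python model, not the actual predicate: the function names, the string-based representation of terms, and every table entry other than the [[:digit:]] to \d mapping mentioned above are assumptions.

```python
from typing import Optional

# Escape classes that JS/Python already express directly as \d, \s, \w
# (and their negations). Assumed set, for illustration only.
SIMPLE_ESCAPES = {"d", "D", "s", "S", "w", "W"}

# Hypothetical Ruby-side table. Only the [[:digit:]] entry comes from the
# discussion; the other entries are assumptions about what might be mapped.
RUBY_NAMED_CLASSES = {
    "[[:digit:]]": "d",
    "[[:space:]]": "s",   # assumed
    "[[:word:]]": "w",    # assumed
    "\\p{Digit}": "d",    # assumed
}


def is_escape_class_js_py(term: str) -> Optional[str]:
    """Trivial case: a simple escape such as \\d normalizes to itself."""
    if len(term) == 2 and term[0] == "\\" and term[1] in SIMPLE_ESCAPES:
        return term[1]
    return None


def is_escape_class_ruby(term: str) -> Optional[str]:
    """Ruby case: simple escapes plus named classes mapped onto d/s/w."""
    simple = is_escape_class_js_py(term)
    if simple is not None:
        return simple
    return RUBY_NAMED_CLASSES.get(term)
```

With this shape, shared analysis code only ever sees the normalized class ("d", "s", "w"), and each language decides which of its own constructs map onto it.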


@nickrolfe nickrolfe left a comment

Just a typo in a comment, but otherwise looks great. Thanks for doing this!

Co-authored-by: Nick Rolfe <nickrolfe@github.com>
esbena approved these changes Nov 24, 2021
@erik-krogh erik-krogh merged commit 3bab8c6 into github:main Nov 24, 2021
20 checks passed