Better unsafe regex detector

See original GitHub issue

Hi all,

I’m a systems/security researcher at Virginia Tech and have been studying the incidence of vulnerable regexes in the wild.

This plugin’s unsafe regex detector relies on safe-regex, which uses star height (nested quantifiers) to identify unsafe regexes.
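To make the star-height idea concrete, here is a minimal sketch of a nesting-depth estimate for unbounded quantifiers. It is a simplification written for illustration, not safe-regex's actual implementation: the function name `starHeight` is hypothetical, the pattern is assumed to be well-formed, and only `*`, `+` and `{n,}` are counted.

```javascript
// Simplified star-height estimate (illustration only, NOT safe-regex's code).
// Assumes a well-formed pattern; counts only *, + and {n,} as unbounded.
function starHeight(pattern) {
  const stack = [0]; // stack[k] = max height seen so far inside the k-th open group
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern[i];
    let atomHeight = 0; // height of the atom that ends at position i
    if (ch === '\\') {
      i += 2; // escaped character, e.g. \d
    } else if (ch === '[') {
      i += 1; // skip a character class as one atom
      while (i < pattern.length && pattern[i] !== ']') {
        i += pattern[i] === '\\' ? 2 : 1;
      }
      i += 1;
    } else if (ch === '(') {
      stack.push(0); // entering a group
      i += 1;
      continue;
    } else if (ch === ')') {
      atomHeight = stack.length > 1 ? stack.pop() : 0; // the group is the atom
      i += 1;
    } else {
      i += 1; // ordinary character
    }
    const quantifier = /^(?:[*+]|\{\d+,\})/.exec(pattern.slice(i));
    if (quantifier) {
      atomHeight += 1; // an unbounded quantifier adds one level
      i += quantifier[0].length;
    }
    const top = stack.length - 1;
    stack[top] = Math.max(stack[top], atomHeight);
  }
  return stack[0];
}

// A star-height check flags anything with height greater than 1.
console.log(starHeight('(ab+)+')); // nested quantifiers
console.log(starHeight('a*b*'));   // adjacent, not nested
```

A check of this kind treats `(a+)+` and `(ab+)+` the same way, since both nest one quantifier inside another, even though only the first is ambiguous about how the input is split across iterations.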

Pros:

  1. safe-regex is fast.
  2. safe-regex is an npm module which makes it easy to work with.
  3. safe-regex has no non-JS dependencies.

As a result, safe-regex is great for CI use cases.

Cons:

  1. safe-regex is incorrectly implemented and substack is not maintaining it.
  2. safe-regex has lots of false positives (e.g. (ab+)+).
  3. safe-regex will only identify one type of exponential-time vulnerability, and ignores all polynomial-time vulnerabilities. In my research I found that, in the wild, polynomial-time vulnerabilities are far more common than exp-time vulnerabilities.
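The difference between the two classes can be observed by timing failed matches in a backtracking engine such as the one in Node.js. The sketch below uses two illustrative patterns that are not taken from this issue: `/^(a+)+$/` (nested quantifiers, typically exponential) and `/^a*a*$/` (adjacent overlapping quantifiers, typically quadratic). The helper name `timeMatch` is hypothetical, and absolute timings depend on the machine and engine version.

```javascript
// Time a single match attempt and report whether it matched.
function timeMatch(regex, input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// Illustrative patterns (assumptions, not from the issue).
const exponential = /^(a+)+$/; // nested quantifiers
const polynomial = /^a*a*$/;   // adjacent quantifiers over the same characters

// The trailing "!" forces the match to fail, so the engine explores
// every way of splitting the run of "a" characters before giving up.
for (const n of [12, 16, 20]) {
  const { elapsedMs } = timeMatch(exponential, 'a'.repeat(n) + '!');
  console.log(`exponential n=${n}: ${elapsedMs.toFixed(2)} ms`);
}
for (const n of [500, 1000, 2000]) {
  const { elapsedMs } = timeMatch(polynomial, 'a'.repeat(n) + '!');
  console.log(`polynomial  n=${n}: ${elapsedMs.toFixed(2)} ms`);
}
```

The input sizes are kept small on purpose. The exponential pattern becomes unusable after a few dozen characters, while the polynomial one needs inputs in the thousands before the slowdown is noticeable, which is why it is easy to miss.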

There are some alternatives to safe-regex that report exploit strings so you can tell if they’re correct or not.

  1. Rathnayake’s rxxr2. Like safe-regex, this only checks for star height-style vulnerabilities. But it doesn’t have false positives as far as I can tell.
  2. Wustholz’s REXPLOITER. This tests star height and other exp-time vulnerabilities, plus poly-time vulnerabilities.
  3. Weideman’s RegexStaticAnalysis. Like Wustholz’s REXPLOITER, but open-source and it works better.

Unfortunately:

  1. These alternatives all have non-JS dependencies (e.g. OCaml or Java) and have inconsistent interfaces.
  2. Some (especially Weideman) can take minutes to test a single regex.

My project vuln-regex-detector provides a convenient wrapper for these alternatives, and enforces time and memory limits to get results or fail relatively quickly.
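The general shape of such a wrapper can be sketched with Node's `child_process.spawnSync`, which accepts a `timeout` option. This is an assumption-laden illustration, not vuln-regex-detector's code: the function name `runWithLimits` and the status strings are hypothetical, a Node child process stands in for a real detector, and only the time limit is enforced here (a memory limit would need an OS-level or tool-specific mechanism).

```javascript
const { spawnSync } = require('child_process');

// Run an external analysis and give up after timeoutMs.
// Hypothetical helper; status strings are made up for this sketch.
function runWithLimits(command, args, { timeoutMs, maxBufferBytes = 1024 * 1024 }) {
  const result = spawnSync(command, args, {
    timeout: timeoutMs,        // the child is killed once this elapses
    maxBuffer: maxBufferBytes, // cap on captured output
    encoding: 'utf8',
  });
  if (result.error && result.error.code === 'ETIMEDOUT') {
    return { status: 'TIMEOUT' };
  }
  if (result.error || result.status !== 0) {
    return { status: 'ERROR' };
  }
  return { status: 'OK', output: result.stdout.trim() };
}

// Stand-ins for a detector: one child loops forever, one answers at once.
const slow = runWithLimits(process.execPath, ['-e', 'for(;;){}'], { timeoutMs: 200 });
const fast = runWithLimits(process.execPath, ['-e', 'console.log("SAFE")'], { timeoutMs: 5000 });
console.log(slow.status, fast.status, fast.output);
```

Reporting a timeout as its own status, rather than as "safe" or "unsafe", lets the caller decide whether to retry with a larger budget or fall back to a cheaper check.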

However, I’d be surprised if developers were willing to wait even 30 seconds for linting. To address that, I’m nearly done implementing a server side so queries can be answered by hitting the server for a pre-computed answer instead of doing the expensive computation locally. The server processes not-seen-before queries in the background so subsequent queries will get a real answer.

Once that’s done, would you folks be interested in hitting my server first and falling back to safe-regex if my server hasn’t seen the query before? I’ve got a sample client that can be used with a one-line tweak for this use case.
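A minimal sketch of the proposed flow, server first and local heuristic as fallback, might look like the following. Everything here is hypothetical: `checkRegex`, `queryServer`, `localCheck` and the answer strings `'VULNERABLE'`, `'SAFE'` and `'UNKNOWN'` are names invented for illustration and do not reflect the actual client or server API.

```javascript
// Ask the server first; fall back to a local check when the server
// has no answer yet or cannot be reached. All names are hypothetical.
async function checkRegex(pattern, { queryServer, localCheck }) {
  try {
    const answer = await queryServer(pattern);
    if (answer === 'VULNERABLE') return { unsafe: true, source: 'server' };
    if (answer === 'SAFE') return { unsafe: false, source: 'server' };
    // Any other answer (e.g. 'UNKNOWN') falls through to the local check.
  } catch (err) {
    // Network or server errors also fall through.
  }
  // localCheck returns true when the pattern looks unsafe. With safe-regex,
  // which returns true for patterns it considers safe, the result would be negated.
  return { unsafe: localCheck(pattern), source: 'local' };
}

// Usage with stubbed dependencies.
const known = new Map([['(a+)+$', 'VULNERABLE']]);
const stubServer = async (p) => known.get(p) || 'UNKNOWN';
const stubLocal = (p) => /\)[*+]/.test(p); // crude stand-in heuristic

checkRegex('(a+)+$', { queryServer: stubServer, localCheck: stubLocal })
  .then((result) => console.log(result));
```

Passing the server query and the local check in as functions keeps the fallback logic independent of any particular HTTP client or linter rule.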

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 4
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

2 reactions
davisjam commented, Sep 12, 2019

The analysis could be written in JavaScript, and I would happily incorporate it into safe-regex. I can point anyone interested in this in the right direction.

On Wed, Sep 11, 2019, 9:51 PM Matthew Herbst wrote:

@davisjam is there a technical reason preventing the tools from being written in JS, or has that work just not been done by anyone?

2 reactions
davisjam commented, Sep 11, 2019

The server-side code is available, but would have to be run on the user’s side.

Unfortunately the existing advanced analyses are written in Java and OCaml, not JS.

I have half a million labeled regexes, so I suppose another option is to ship this database in safe-regex in compressed form as a “cache” of sorts.
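As a rough illustration of the "compressed cache" idea, the sketch below gzips a list of labeled patterns and loads it back into a lookup table using Node's built-in `zlib`. The data, the label strings and the helper names (`packLabels`, `loadLabels`, `lookup`) are all hypothetical; the format of the real database is not described in this thread.

```javascript
const zlib = require('zlib');

// Hypothetical sample data; the real database format is not described here.
const labeled = [
  ['(a+)+$', 'VULNERABLE'],
  ['^[a-z]+$', 'SAFE'],
];

// Serialize and compress the [pattern, label] pairs for shipping.
function packLabels(pairs) {
  return zlib.gzipSync(JSON.stringify(pairs));
}

// Decompress and rebuild an in-memory lookup table.
function loadLabels(buffer) {
  return new Map(JSON.parse(zlib.gunzipSync(buffer).toString('utf8')));
}

// Patterns that are not in the cache are reported as unknown, so the
// caller can fall back to another check.
function lookup(cache, pattern) {
  return cache.has(pattern) ? cache.get(pattern) : 'UNKNOWN';
}

const cache = loadLabels(packLabels(labeled));
console.log(lookup(cache, '(a+)+$'), lookup(cache, 'abc'));
```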
