Rule proposal: no-catastrophic-backtracking
Please describe what the rule should do:
Flag RegExps that are vulnerable to catastrophic backtracking, like the one that took down Cloudflare’s global network for an hour.
See also https://www.regular-expressions.info/catastrophic.html.
What category of rule is this? (place an “X” next to just one item)
[X] Warns about a potential error (problem)
[ ] Suggests an alternate way of doing something (suggestion)
[ ] Enforces code style (layout)
[ ] Other (please specify:)
Provide 2-3 code examples that this rule will warn about:
/.*.*=.*/
new RegExp('.*.*=.*')
// maybe even some basic things like this?
const keyValue = '.*=.*'
new RegExp(`.*${keyValue}`)
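To make the risk concrete, here is a minimal timing sketch for the first pattern above. The helper name `timeFailedMatch` is hypothetical, and the sketch assumes a runtime with a global `performance` object (recent Node.js or a browser). On a backtracking engine the elapsed time is expected to grow much faster than the input length, but the actual numbers depend on the engine and machine and may be noisy for short inputs.

```javascript
// Hypothetical helper: time one failed match of `regex` against a string of
// `length` copies of "x". The input has no "=", so every attempt must fail.
function timeFailedMatch(regex, length) {
  const input = 'x'.repeat(length);
  const start = performance.now();
  const matched = regex.test(input);
  const elapsedMs = performance.now() - start;
  return { length, matched, elapsedMs };
}

// Each run doubles the input length. On a backtracking engine the elapsed
// time is expected to more than double; exact figures vary by environment.
const results = [200, 400, 800].map((n) => timeFailedMatch(/.*.*=.*/, n));
for (const { length, matched, elapsedMs } of results) {
  console.log(`length=${length} matched=${matched} time=${elapsedMs.toFixed(2)}ms`);
}
```

The same pattern matches quickly when the input does contain an `=`; the slow path is the failed match, which is what an attacker-controlled input would trigger.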
Why should this rule be included in ESLint (instead of a plugin)?
RegExps are a fundamental and commonly-used JavaScript feature and there is a serious risk of devs writing bad ones. People would be far less likely to discover or adopt this rule (especially into popular presets) if this rule were in an external plugin, and I want as many people as possible to be able to catch unsafe RegExps before they reach production.
Also, a rule like this may have been able to catch a potential catastrophic backtracking regex that one user discovered in eslint itself 😎
Are you willing to submit a pull request to implement this rule?
Yes, as long as I can depend on safe-regex to perform the actual safety check (I can write the parsing logic to find regexps, evaluate some expressions in RegExp constructors, and then pass the patterns to safe-regex).
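A rough sketch of what such a rule could look like follows. It is not an actual ESLint or safe-regex implementation: `createRule` and `isSafe` are hypothetical names, and the safety check is injected so the sketch has no dependencies. In the proposal, `isSafe` would presumably be the function exported by safe-regex. Only regex literals and `new RegExp()` calls with a plain string argument are handled; template literals and other expressions are left out.

```javascript
// Sketch only: `createRule` and `isSafe` are hypothetical names.
// `isSafe(pattern)` should return true when the pattern is considered safe.
function createRule(isSafe) {
  return {
    meta: {
      type: 'problem',
      docs: {
        description: 'disallow regular expressions vulnerable to catastrophic backtracking',
      },
      schema: [],
      messages: { unsafe: 'Potentially unsafe regular expression: {{pattern}}' },
    },
    create(context) {
      // Report the node when the injected checker rejects the pattern.
      function check(node, pattern) {
        if (!isSafe(pattern)) {
          context.report({ node, messageId: 'unsafe', data: { pattern } });
        }
      }
      return {
        // Regex literals such as /.*.*=.*/ carry a `regex` property.
        Literal(node) {
          if (node.regex) check(node, node.regex.pattern);
        },
        // new RegExp('...') where the first argument is a string literal.
        NewExpression(node) {
          const [arg] = node.arguments;
          if (
            node.callee.type === 'Identifier' &&
            node.callee.name === 'RegExp' &&
            arg &&
            arg.type === 'Literal' &&
            typeof arg.value === 'string'
          ) {
            check(node, arg.value);
          }
        },
      };
    },
  };
}
```

Injecting the checker keeps the AST traversal separate from the safety analysis, so the detection algorithm could be swapped without touching the rule itself.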
Yes. Any regex’s super-linear behavior can be traced to a corresponding NFA with two properties:
Wustholz and Weideman independently defined the necessary and sufficient conditions to identify these properties in 2015-2017.
Their analyses and implementations only apply to a subset of regex features.
Weideman’s tool is open-source here.
@jedwards1211 I’m actually the maintainer of safe-regex now. The v2.0 release has my improvements (fixes some false negatives). I thought I made a PR to eslint to pick up v2.0, but maybe I misremember. Though I guess that was to update the security plugin rather than to eslint itself?
@not-an-aardvark
True. The documentation says that pretty clearly: “WARNING: This module has both false positives and false negatives.”
Correct. That regex is quadratic, not exponential-time. As advertised, the star height heuristic currently used in safe-regex only detects (some) exponential-time regexes. For more on heuristics (Star Height, QOD, and QOA), see section 5.1 in this paper.
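To illustrate why a star height check misses that regex, here is a deliberately simplified star height calculator. It is not safe-regex's implementation: the function name `starHeight` is hypothetical, it counts only `*` and `+` (not `{n,}`), and it skips escapes and character classes rather than parsing them properly. Star height here means the deepest nesting of unbounded quantifiers, and the heuristic flags patterns with a height above 1.

```javascript
// Simplified sketch, NOT safe-regex's code. Counts nesting of * and + only.
function starHeight(pattern) {
  const stack = [0]; // max height seen so far in each currently open group
  let lastAtomHeight = 0; // height of the atom a quantifier would apply to
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    const top = stack.length - 1;
    if (ch === '\\') {
      i++; // skip the escaped character
      lastAtomHeight = 0;
    } else if (ch === '[') {
      // Skip the whole character class; quantifier characters inside are literal.
      i++;
      while (i < pattern.length && pattern[i] !== ']') {
        if (pattern[i] === '\\') i++;
        i++;
      }
      lastAtomHeight = 0;
    } else if (ch === '(') {
      stack.push(0);
    } else if (ch === ')' && stack.length > 1) {
      // The closed group becomes the atom; its height also counts for the parent.
      lastAtomHeight = stack.pop();
      stack[top - 1] = Math.max(stack[top - 1], lastAtomHeight);
    } else if (ch === '*' || ch === '+') {
      lastAtomHeight += 1;
      stack[top] = Math.max(stack[top], lastAtomHeight);
    } else {
      lastAtomHeight = 0; // ordinary character starts a new atom
    }
  }
  return stack[0];
}

console.log(starHeight('(a+)+'));   // 2
console.log(starHeight('.*.*=.*')); // 1
```

Under this sketch, `(a+)+` has height 2 and would be flagged, while `.*.*=.*` has height 1 and passes, even though it is still super-linear.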
This would be nice. The dicey bit is ensuring that changing the algorithm doesn’t change the semantics of the regex match. Many regex engines claim to have the same semantics but exhibit subtle differences – see section 7 of this paper (short version). Another option is to optimize the regex engines instead.
Another option, of course, is to use the RE2 bindings for JavaScript. But I know of at least one semantic difference between RE2 and Node.js’s built-in Irregexp engine, so that might be risky.