Code cleanup: Replace regex-strings by native regex
Throughout the code of highlight.js, there are many places where strings containing regular expressions are used instead of native regex literals. For example:
// Common regexps
hljs.IDENT_RE = '[a-zA-Z]\\w*';
hljs.UNDERSCORE_IDENT_RE = '[a-zA-Z_]\\w*';
hljs.NUMBER_RE = '\\b\\d+(\\.\\d+)?';
I think the code would be much more readable with native regex literals. The primary gain would be getting rid of the confusing double backslashes (or quadruple backslashes in other cases):
// Common regexps
hljs.IDENT_RE = /[a-zA-Z]\w*/;
hljs.UNDERSCORE_IDENT_RE = /[a-zA-Z_]\w*/;
hljs.NUMBER_RE = /\b\d+(\.\d+)?/;
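To make the equivalence concrete, here is a small sketch (the variable names are only illustrative, not taken from the highlight.js source) showing that the escaped string and the native literal describe the same pattern once the string is compiled:

```javascript
// The string form needs every backslash doubled, because the string
// literal consumes one level of escaping before RegExp sees the pattern.
const fromString = new RegExp('\\b\\d+(\\.\\d+)?');

// The native literal is written exactly as the regex engine reads it.
const fromLiteral = /\b\d+(\.\d+)?/;

// Both compile to the same pattern source.
const sameSource = fromString.source === fromLiteral.source;

// And both find the same number in a sample input.
const match = fromLiteral.exec('x = 3.14;');

console.log(sameSource);
console.log(match[0]);
```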
I actually tried this change and a lot of tests fail, so other code would have to be changed as well.
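One plausible cause of such failures (an assumption on my part, not something confirmed in this issue) is code that builds larger patterns by string concatenation. A RegExp object converts to a string with its surrounding slashes, so the composed pattern silently changes meaning. A minimal sketch with illustrative names:

```javascript
const IDENT_RE_STRING = '[a-zA-Z]\\w*';
const IDENT_RE_NATIVE = /[a-zA-Z]\w*/;

// Concatenating strings composes the pattern as intended.
const composedFromString = '\\b' + IDENT_RE_STRING + '\\b';

// Concatenating a RegExp calls its toString(), which includes the
// delimiting slashes, so the result now requires literal "/" characters.
const composedFromNative = '\\b' + IDENT_RE_NATIVE + '\\b';

// Using .source recovers the bare pattern and composes correctly.
const composedFromSource = '\\b' + IDENT_RE_NATIVE.source + '\\b';

console.log(new RegExp(composedFromString).test('foo'));
console.log(new RegExp(composedFromNative).test('foo'));
console.log(composedFromSource === composedFromString);
```

Any place that concatenates these constants would therefore need to switch to `.source` (or an equivalent helper) as part of the change.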
The question is: Do you think this is a valuable change? I’m not sure if I have the time to work on it, but I might if you want it.
Issue Analytics
- Created 4 years ago
- Comments:7 (6 by maintainers)
If we’re making “deep dives” into fixing lots of issues with a certain grammar (like the recent Handlebars work, etc.), I think it makes sense to turn strings into regexes while we’re there (in a separate commit, to make review easy), wherever no fancy string addition is necessary. I’d be ok with doing it as it comes up in the natural course of things.
But there are also 3rd party languages… so the “API” we offer now of serving up strings for the main library might stay the same for a long time, since changing it would be a breaking change (didn’t think of that earlier).
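One way the two forms could coexist, sketched here purely as an illustration: a small normalizing step that accepts either a string or a RegExp. The names `toSource` and `compile` are hypothetical and are not claimed to be part of the highlight.js API.

```javascript
// Hypothetical helper: return the bare pattern text for either input form.
function toSource(pattern) {
  if (pattern instanceof RegExp) {
    return pattern.source; // note: flags on the RegExp are dropped here
  }
  return String(pattern);
}

// Hypothetical wrapper: compile a pattern regardless of how it was supplied.
function compile(pattern, flags = '') {
  return new RegExp(toSource(pattern), flags);
}

// A third-party grammar that still supplies strings keeps working...
const fromString = compile('\\b\\d+(\\.\\d+)?');

// ...while newer code can pass native literals.
const fromLiteral = compile(/\b\d+(\.\d+)?/);

console.log(fromString.source === fromLiteral.source);
```

With something like this, the string constants could stay as they are for external grammars while internal code migrates gradually. The trade-off is that any flags on a passed-in RegExp are lost unless they are handled separately.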
Or just do it organically… there are so many strings that a few more aren’t making my day significantly better or worse. 😃
If such a reason exists, I’m not aware of it.
Agree.