Anti-slash not properly handled
Steps to reproduce
When applying mark.js to the text www.happy.com\ (note the trailing backslash), the backslash (anti-slash) prevents marking from working properly:
mark.mark("www.happy.com\\", { "separateWordSearch": false, "accuracy": "complementary", "ignoreJoiners": true });
Adding the (escaped) backslash as a synonym does not help:
mark.mark("www.happy.com\\", { "separateWordSearch": false, "accuracy": "complementary", "ignoreJoiners": true, "synonyms": { "\\": " " } });
Worse, when I add other synonyms (depending on how many and which ones), it throws an exception:
mark.mark("www.happy.com\\", { "separateWordSearch": false, "accuracy": "complementary", "ignoreJoiners": true, "synonyms": { "-": " ", ".": " ", "+": " ", "\\": " " } });
mark.js:555 Uncaught SyntaxError: Invalid regular expression: /()([^\s!"#\$%&'\(\)\*\+,\-\.\/:;<=>\?@\[\\\]\^_`\{\|\}~¡¿]*w[\u00ad\u200b\u200c\u200d]*w[\u00ad\u200b\u200c\u200d]*w[\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*([\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*.|[\u00ad\u200b\u200c\u200d]*([\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*+|[\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*)[\u00ad\u200b\u200c\u200d]*)[\u00ad\u200b\u200c\u200d]*h[\u00ad\u200b\u200c\u200d]*[aàáảãạăằắẳẵặâầấẩẫậäåāąAÀÁẢÃẠĂẰẮẲẴẶÂẦẤẨẪẬÄÅĀĄ][\u00ad\u200b\u200c\u200d]*p[\u00ad\u200b\u200c\u200d]*p[\u00ad\u200b\u200c\u200d]*[yýỳỷỹỵÿYÝỲỶỸỴŸ][\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*([\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*.|[\u00ad\u200b\u200c\u200d]*([\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*+|[\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*)[\u00ad\u200b\u200c\u200d]*)[\u00ad\u200b\u200c\u200d]*[cçćčCÇĆČ][\u00ad\u200b\u200c\u200d]*[oòóỏõọôồốổỗộơởỡớờợöøōOÒÓỎÕỌÔỒỐỔỖỘƠỞỠỚỜỢÖØŌ][\u00ad\u200b\u200c\u200d]*m[\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*(\\|[\s]+)[\u00ad\u200b\u200c\u200d]*[^\s!"#\$%&'\(\)\*\+,\-\.\/:;<=>\?@\[\\\]\^_`\{\|\}~¡¿]*)/: Nothing to repeat
at new RegExp (<anonymous>)
at handler (mark.js:555)
at Mark.mark (mark.js:582)
at window.Mark.mark (mark.js:1022)
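For readers unfamiliar with this error: the failing pattern above contains the sequence `[\u00ad\u200b\u200c\u200d]*+`, which suggests that the `+` synonym key ended up in the generated expression unescaped, right after a `*` quantifier. The following minimal sketch reproduces that engine error without mark.js. The helper name `tryCompile` is hypothetical and used only for illustration.

```javascript
// Minimal reproduction of the engine error, independent of mark.js.
// tryCompile is a hypothetical helper, not part of any library.
function tryCompile(source) {
  try {
    new RegExp(source);
    return "ok";
  } catch (e) {
    // Return only the error name; the message text differs between engines.
    return e.name;
  }
}

// The joiner class that mark.js inserts when ignoreJoiners is enabled,
// copied from the error message above.
const joiners = "[\\u00ad\\u200b\\u200c\\u200d]*";

// An unescaped "+" directly after "*" is a quantifier with nothing to repeat.
const unescaped = tryCompile(joiners + "+");

// Escaping the "+" turns it into a literal plus sign, which compiles.
const escaped = tryCompile(joiners + "\\+");

console.log(unescaped, escaped);
```

JavaScript has no possessive quantifiers, so `*+` is rejected outright rather than being interpreted as in some other regex flavors.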
Issue Analytics
- Created 6 years ago
- Comments: 12 (7 by maintainers)
Top GitHub Comments
I did some investigation. It looks like we need to doubly escape special regular expression characters.
That’s why only the . was replaced in the test\.com string to produce test\(\.|p)com. I’ve fixed the problem and will add a PR.
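The double-escaping idea can be sketched as follows. This is not the mark.js source or the actual patch; the function names `escapeRegExp`, `applySynonymNaive` and `applySynonymDoubleEscaped` are hypothetical, and the synonym pair "." and "p" is taken from the test\.com example in the comment. The assumption is that synonyms are substituted into a search string that has already been regex-escaped.

```javascript
// Hypothetical helpers for illustration only; not the mark.js implementation.

// Escape characters that are special in a regular expression.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Naive version: searches the already-escaped pattern for the raw key.
// For "." this matches only the dot and leaves the preceding backslash behind.
function applySynonymNaive(escapedPattern, key, value) {
  const k = escapeRegExp(key);
  const v = escapeRegExp(value);
  return escapedPattern.replace(new RegExp(k, "g"), () => `(${k}|${v})`);
}

// Doubly escaped version: searches for the escaped form of the key ("\."),
// so the backslash and the dot are replaced together.
function applySynonymDoubleEscaped(escapedPattern, key, value) {
  const k = escapeRegExp(key);
  const v = escapeRegExp(value);
  return escapedPattern.replace(
    new RegExp(escapeRegExp(k), "g"),
    () => `(${k}|${v})`
  );
}

const pattern = escapeRegExp("test.com");                   // test\.com
const naive = applySynonymNaive(pattern, ".", "p");         // test\(\.|p)com
const fixed = applySynonymDoubleEscaped(pattern, ".", "p"); // test(\.|p)com

console.log(naive);
console.log(fixed);
console.log(new RegExp(fixed).test("testpcom"));
```

The replacement is passed as a function so that characters such as `$` in a synonym are not interpreted as replacement patterns by `String.prototype.replace`.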
I also wanted to discuss the messiness of the regular expression when there is more than one synonym; but I’ll open a new issue for that.
I’ve just released v8.11.1 with @Mottie’s patch for this issue. Thanks for your help @Mottie and thanks for reporting @mayerwin!