Accidental literal substitution in regex search for some matches (CRLF line endings)

Version: 1.42.0-insider
Commit: 241d4048aabc31db0baed66bb3d58cf3210d981d
Date: 2019-12-17T10:43:45.626Z
Electron: 6.1.6
Chrome: 76.0.3809.146
Node.js: 12.4.0
V8: 7.6.303.31-electron.0
OS: Windows_NT x64 10.0.18362

Steps to Reproduce:

  1. Download and unzip test files: issue87247.zip
  2. Open the folder in VSCode
  3. Open search (Ctrl+Shift+F)
  4. Enter (^.+)(\n\{% aqlexample) as the search pattern, enable regex search, and enter $1\n$2 as the replacement
  5. It finds 16 results in 2 files.
  6. Note the first 3 matches in graphs-traversal.md and click on the first to open the replace preview. Instead of inserting a blank line above {% aqlexample, the preview appears to insert $1 and $2 literally:
     Examples:
    -
    -{% aqlexample examplevar="examplevar" type="type" query="query" bind="bind" result="result" %}
    +$1
    +$2 examplevar="examplevar" type="type" query="query" bind="bind" result="result" %}
    
  7. Click on the 4th match for that file. The replace preview shows the insertion of a blank line as expected:
     #### Filtering edges on the path
    +
     {% aqlexample examplevar="examplevar" type="type" query="query" bind="bind" result="result" %}
    
  8. Click on the sole match for examples-join.md and see the 3 messed up matches disappear from the result list for the other file.
  9. Click on refresh and they will re-appear.
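
For reference, the replacement that the steps above expect can be sketched in Python. This is only an illustration under assumptions: Python's `re` module is not the engine VS Code search uses, the sample text is a shortened, made-up stand-in for the test files, and `insert_blank_line` is a hypothetical helper name. Python writes the replacement `$1\n$2` as `\1\n\2`.

```python
import re

# Search pattern from step 4. MULTILINE lets ^ match at the start of every line.
PATTERN = re.compile(r"(^.+)(\n\{% aqlexample)", re.MULTILINE)


def insert_blank_line(text: str) -> str:
    """Insert an empty line between a non-empty line and a following aqlexample tag."""
    # \1 and \2 are the two capture groups; \n in the template is a newline.
    return PATTERN.sub(r"\1\n\2", text)


# Made-up LF-only sample: a heading directly followed by the tag.
sample = '#### Filtering edges on the path\n{% aqlexample query="query" %}\n'
print(insert_blank_line(sample))
```

On LF-only input this inserts one empty line between the heading and the tag, which is what the preview for the 4th match shows.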

Note: I can observe this behavior with CRLF line endings, but not LF line endings.

Does this issue occur when all extensions are disabled? Yes

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

roblourens commented, Dec 13, 2022 (1 reaction)

rg has the --crlf flag which makes $ able to match CRLF. But with this flag, . still matches \r, and that seems wrong. --crlf sounds like a bit of a hack in rg for $ specifically. You could file an issue on the ripgrep repo for this.

We could also work around it on our end by rewriting . to [^\r\n], similar to how we rewrite \n already, and I think that would fix it.
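
A minimal sketch of what such a rewrite could look like, written in Python for illustration. `rewrite_dot` is a hypothetical helper and not VS Code's actual code; it only skips escape pairs and simple character classes, and the `\n` in the pattern is written by hand as `\r?\n` to stand in for the existing newline rewrite mentioned above.

```python
import re


def rewrite_dot(pattern: str) -> str:
    # Replace each unescaped "." outside a character class with [^\r\n].
    # Simplification: a class that starts with "]" (such as []a]) is not handled.
    out = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # copy escape pairs like \. or \n unchanged
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == ".":
            out.append(r"[^\r\n]")
            i += 1
            continue
        out.append(ch)
        i += 1
    return "".join(out)


rewritten = re.compile(rewrite_dot(r"(^.+)(\r?\n\{% aqlexample)"), re.MULTILINE)

# The "blank" CRLF line (a lone \r) no longer counts as a non-empty line.
blank_line_hit = rewritten.search("Examples:\r\n\r\n{% aqlexample %}")
# A heading directly above the tag still matches, and group 1 excludes the \r.
heading_hit = rewritten.search("#### Filtering edges on the path\r\n{% aqlexample %}")

print(blank_line_hit)
print(repr(heading_hit.group(1)))
```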

Simran-B commented, Dec 15, 2022 (0 reactions)

According to https://www.regular-expressions.info/dot.html#linebreak, there is no consensus on what is a line break character other than \n. It varies from one character to the full list of Unicode whitespace characters…

The --crlf option seems like it could also control the . behavior but doesn’t at the moment. It’s a workaround on rg’s side (replacing $ with (?:\r??$)), not a regex engine feature, so you may as well rewrite the expressions in VSCode as a workaround.
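
To illustrate the point, the same kind of `$` rewrite can be tried in Python. This is a sketch under assumptions: Python's `re` module stands in for the Rust regex engine that rg uses, and `rewrite_dollar` is a hypothetical, purely textual helper that does not handle an escaped `\$` or a `$` inside a character class.

```python
import re


def rewrite_dollar(pattern: str) -> str:
    # Naive textual rewrite, for illustration only.
    return pattern.replace("$", r"(?:\r??$)")


text = "abc\r\n"

# Without the rewrite, $ cannot match between "c" and "\r".
plain = re.search(r"^abc$", text, re.MULTILINE)

# With the rewrite, the optional \r lets the anchor match on a CRLF line.
crlf_aware = re.search(rewrite_dollar(r"^abc$"), text, re.MULTILINE)

# The dot is unaffected by the rewrite and still consumes the \r.
dot = re.search(rewrite_dollar(r"^(.+)$"), text, re.MULTILINE)

print(plain)
print(repr(crlf_aware.group(0)))
print(repr(dot.group(1)))
```

In this sketch the rewritten `$` makes the anchored pattern match on a CRLF line, while `.+` still captures the trailing `\r`.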

@connor4312 wrote that (^.+)(\r?\n\{% aqlexample) would match Examples:\r\n\r\n{% aqlexample. I cannot confirm that with VSCode v1.72.0. Searching and replacing all matches without having any files open only replaces the empty line and the {% aqlexample string in my test but leaves Examples: untouched. The same goes for (^.+)(\n\{% aqlexample). The Examples: line is only replaced if I use \r?\n\r?\n or \n\n in the regex. So it seems like there is only the odd behavior caused by the CRLF vs. LF mismatch but no additional bug.
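
The CRLF vs. LF difference described here can be reproduced outside VS Code. The sketch below uses Python's `re` module, which is an assumption standing in for the real search engine; like the engine discussed in this thread, its `.` matches `\r` but not `\n`. The sample strings are made up.

```python
import re

PATTERN = re.compile(r"(^.+)(\n\{% aqlexample)", re.MULTILINE)

# Same content, once with LF and once with CRLF line endings.
lf = "Examples:\n\n{% aqlexample %}"
crlf = "Examples:\r\n\r\n{% aqlexample %}"

lf_match = PATTERN.search(lf)
crlf_match = PATTERN.search(crlf)

# LF: the empty line has no character for .+ to consume, so there is no match.
print(lf_match)

# CRLF: the "empty" line contains a lone \r, which .+ accepts as line content.
print(repr(crlf_match.group(1)), crlf_match.start())
```

With CRLF endings, group 1 is the lone `\r` of the empty line rather than the `Examples:` line, which is consistent with only the empty line and the `{% aqlexample` string being replaced.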
