Unexpected behavior with anchor and word boundary regex
/^./ does not highlight anything, but it should highlight the first letter. /\b./ highlights all English letters, but it should highlight the first letter.
- I haven’t investigated why /^./ is not highlighting anything.
- \b highlights everything because once a match is highlighted, the highlighted text is no longer in the textContent of the node. When the regex is applied again, \b matches the second letter, because the first letter is no longer there. This repeats until all English letters are highlighted.
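The repetition described above can be reproduced on a plain string, without any DOM. The sketch below is only an illustration: `matchOnRemainder` is a hypothetical helper, not part of mark.js, and it assumes the regex is simply re-run on whatever text remains after each match.

```javascript
// Hypothetical helper: re-runs a non-global regex on the text that remains
// after each match, imitating a search that restarts once a match is wrapped.
function matchOnRemainder(regex, text) {
  const found = [];
  let rest = text;
  let match;
  while (rest.length > 0 && (match = regex.exec(rest)) !== null) {
    if (match[0].length === 0) break; // guard against empty matches
    found.push(match[0]);
    // Drop everything up to and including the match, as if it had been wrapped.
    rest = rest.slice(match.index + match[0].length);
  }
  return found;
}

// Each remainder starts at a fresh word boundary, so every letter matches.
const remainderMatches = matchOnRemainder(/\b./, "hello");
console.log(remainderMatches); // [ 'h', 'e', 'l', 'l', 'o' ]
```

The same effect applies to /^./, because each remainder has its own start of string.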
Steps to reproduce
Enter either of the regexes above in this fiddle: https://jsfiddle.net/julmot/ova17daa
Environment
Latest version of Chrome and version 8.1.1 of jquery.mark.min.js
Proposed Fix
Pseudocode to go in the wrapMatches function:
text = node.textContent // save the text so it doesn't change after each highlight
if no global flag: // highlight the first match only
    match = regex.exec(text)
    if match is not null:
        highlight(match)
if global flag: // highlight all matches
    match = regex.exec(text)
    while match is not null:
        highlight(match)
        match = regex.exec(text)
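As a rough, runnable version of the pseudocode above, the following sketch works on a plain string and only collects the ranges that would be highlighted. `collectMatchRanges` is a hypothetical name, and the actual DOM wrapping that wrapMatches performs is deliberately left out, so this is not the library's implementation.

```javascript
// Hypothetical helper (not part of mark.js): returns the [start, end) ranges
// that would be highlighted, computed against a saved copy of the text.
function collectMatchRanges(regex, text) {
  const ranges = [];
  regex.lastIndex = 0; // always start from the beginning of the saved text
  let match;
  while ((match = regex.exec(text)) !== null) {
    if (match[0].length === 0) {
      if (!regex.global) break; // nothing to highlight
      regex.lastIndex++;        // step past an empty match to avoid looping forever
      continue;
    }
    ranges.push([match.index, match.index + match[0].length]);
    if (!regex.global) break;   // without the g flag, only the first match counts
  }
  return ranges;
}

console.log(collectMatchRanges(/\b./, "hello"));          // [ [ 0, 1 ] ]
console.log(collectMatchRanges(/\b\w/g, "hello world"));  // [ [ 0, 1 ], [ 6, 7 ] ]
```

Because every match is computed against the same saved string, earlier highlights cannot shift the position of ^ or \b for later matches.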
Issue Analytics
- Created 7 years ago
- Comments: 5 (2 by maintainers)
Top GitHub Comments
It looks like /^./ is matching whitespace, so that’s why it doesn’t look like anything was found.

@Almenon Sorry for my late reply, I was travelling.
I understand your situation, and I agree that, strictly speaking, a small group of users might expect this. But the requirement is uncommon and the expected behavior is technically impossible. We’re not searching inside a simple string as in your demonstration. We’re searching inside real DOM text nodes, and matches need to be wrapped. Once one match is wrapped, the search has to continue in the remaining text node value, and because that is a “new” RegExp context, it will find further characters. Also, the RegExp is applied across different text nodes, so there is no chance of correctly handling multiline or global matches. That is why the limit option exists.

I’m sorry to disappoint you, but I don’t see any way to resolve your request. Since this regular expression probably only exists while a complete one is being typed, you might implement a method that ignores such patterns.
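The point about separate text nodes can be illustrated without a browser. In this sketch the array `textNodes` is an assumed stand-in for the text nodes of markup such as `Hello <b>world</b> again`; it does not use mark.js itself.

```javascript
// Assumed text nodes for the markup: Hello <b>world</b> again
const textNodes = ["Hello ", "world", " again"];

// Applied per node, ^ anchors at the start of every node.
const perNodeMatches = textNodes.map((value) => {
  const match = /^./.exec(value);
  return match ? match[0] : null;
});

// Applied to the combined text, ^ anchors only once.
const wholeTextMatch = /^./.exec(textNodes.join(""));

console.log(perNodeMatches);    // [ 'H', 'w', ' ' ]
console.log(wholeTextMatch[0]); // 'H'
```

Each node is its own string as far as the regex is concerned, which is why anchors and boundaries can match more often than they would in the visible text as a whole.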
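The earlier observation that /^./ may be matching whitespace can be checked on plain strings. The sample strings below are assumptions, not the fiddle's actual content, and /\S/ is shown only as one possible way to reach the first visible character.

```javascript
// A text node that begins with spaces: the first character is a space.
const leadingSpace = /^./.exec("  Lorem ipsum");
console.log(JSON.stringify(leadingSpace[0])); // " "

// A text node that begins with a newline: "." does not match line
// terminators (without the s flag), so there is no match at all.
const leadingNewline = /^./.exec("\nLorem ipsum");
console.log(leadingNewline); // null

// Matching the first non-whitespace character instead.
const firstVisible = /\S/.exec("\n  Lorem ipsum");
console.log(firstVisible[0], firstVisible.index); // L 3
```

Either case would look like "nothing was highlighted": a wrapped space is invisible, and a leading newline prevents any match.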