Unicode in input string is not handled

/👍/u

parses differently to

/\u{1f44d}/u

The first is parsed as two chars, \ud83d and \udc4d.

I might try to detect any Unicode in the input string and error out if that's the case, but I'm wondering whether this lib could handle both of the above the same way, or maybe throw an error?
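
For illustration, here is a minimal sketch of why the literal emoji shows up as two chars, plus one possible way to detect such input before parsing. `containsAstralSymbol` is a hypothetical helper written for this example; it is not part of the library discussed in the issue.

```javascript
// Hypothetical helper (not part of any library): reports whether a string
// contains a symbol outside the Basic Multilingual Plane.
function containsAstralSymbol(source) {
  // for...of iterates by code point, so a surrogate pair arrives as one item.
  for (const symbol of source) {
    if (symbol.codePointAt(0) > 0xffff) {
      return true;
    }
  }
  return false;
}

const thumbsUp = '\u{1f44d}'; // the same string as the literal emoji

// Collect the UTF-16 code units, as seen by length and charCodeAt.
const codeUnits = [];
for (let i = 0; i < thumbsUp.length; i++) {
  codeUnits.push(thumbsUp.charCodeAt(i).toString(16));
}

console.log(thumbsUp.length);                        // 2
console.log(codeUnits);                              // [ 'd83d', 'dc4d' ]
console.log(thumbsUp.codePointAt(0).toString(16));   // 1f44d
console.log(containsAstralSymbol('/' + thumbsUp + '/u')); // true
console.log(containsAstralSymbol('/abc/u'));         // false
```

A parser that walks the source one code unit at a time sees the two surrogates separately, which would be consistent with the behavior reported above.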

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

mathiasbynens commented, Apr 15, 2021 (1 reaction)

The u flag indeed acts as an opt-in to using code points as the character boundary, instead of UCS-2/UTF-16 code units (without the u flag). I wrote about that here along with some examples: https://mathiasbynens.be/notes/es6-unicode-regex
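
The difference can be seen directly in the JavaScript engine's own regex behavior. This is a small sketch using only built-in `RegExp`; the variable names are made up for the example.

```javascript
const thumbsUpSymbol = '\u{1f44d}';

// With the u flag, "." consumes one whole code point.
const oneDotWithU = /^.$/u.test(thumbsUpSymbol);

// Without the u flag, "." consumes one UTF-16 code unit,
// so the surrogate pair needs two dots to match.
const oneDotWithoutU = /^.$/.test(thumbsUpSymbol);
const twoDotsWithoutU = /^..$/.test(thumbsUpSymbol);

// The escaped form from the issue matches the literal emoji under the u flag.
const escapeMatchesLiteral = /^\u{1f44d}$/u.test(thumbsUpSymbol);

console.log(oneDotWithU);          // true
console.log(oneDotWithoutU);       // false
console.log(twoDotsWithoutU);      // true
console.log(escapeMatchesLiteral); // true
```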

DmitrySoshnikov commented, Apr 14, 2021 (0 reactions)

@tjenkinson thanks for the report and investigation, I think the change looks reasonable. @mathiasbynens, what are your thoughts on this?

Also, yes, when u is not enabled, charCodeAt might be a good alternative.
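
As a rough sketch of that idea (an assumption about how a parser could behave, not the library's actual implementation), the pattern source could be split by code point when the u flag is present and by code unit otherwise. `splitPatternChars` is a hypothetical name.

```javascript
// Hypothetical function: splits a pattern source into the "chars" a parser
// would see, depending on whether the u flag is set.
function splitPatternChars(source, hasUnicodeFlag) {
  if (hasUnicodeFlag) {
    // Array.from uses the string iterator, which yields whole code points.
    return Array.from(source);
  }
  // Without the flag, fall back to UTF-16 code units via charCodeAt.
  const units = [];
  for (let i = 0; i < source.length; i++) {
    units.push(String.fromCharCode(source.charCodeAt(i)));
  }
  return units;
}

const emojiSource = '\u{1f44d}';
const withFlag = splitPatternChars(emojiSource, true);
const withoutFlag = splitPatternChars(emojiSource, false);

console.log(withFlag.length);                          // 1
console.log(withFlag[0].codePointAt(0).toString(16));  // 1f44d
console.log(withoutFlag.length);                       // 2
console.log(withoutFlag.map((u) => u.charCodeAt(0).toString(16))); // [ 'd83d', 'dc4d' ]
```

Under this sketch, /👍/u and /\u{1f44d}/u would both end up as a single char with code point 0x1f44d, while the flagless form keeps the two surrogates.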
