String literals are incorrectly parsed
subj
module.exports = '\u0009\u000A\u000B\u000C\u000D\u0020\u00A0\u1680\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\u2028\u2029\uFEFF';
here is the source https://raw.githubusercontent.com/zloirock/core-js/master/packages/core-js/internals/whitespaces.js
Issue Analytics
- Created 4 years ago
- Comments: 25 (20 by maintainers)
Top GitHub Comments
I see multiple technical issues involved in this thread.
\u180e
Correct. Historically, when MONGOLIAN VOWEL SEPARATOR was introduced, it was categorized as Zs (whitespace). In 2013 it was changed to Cf (ref: https://www.unicode.org/L2/L2013/13004-vowel-sep-change.pdf), and the change was published in Unicode version 6.3. Unfortunately, such a change will need decades to sync to every downstream project of Unicode. So please file a bug on angular that \u180e should not be included in WS_CHARS.
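A quick way to see which category a given engine assigns to \u180e is to test it with Unicode property escapes. This is only a sketch; the variable names are made up for illustration, and the result depends on the Unicode data bundled with the engine. An engine that already ships the post-change data should report Cf and leave the character untouched by trim().

```javascript
// Unicode property escapes (ES2018) report the General_Category that the
// engine's bundled Unicode data assigns to a code point.
const MVS = '\u180e'; // MONGOLIAN VOWEL SEPARATOR

const isSpaceSeparator = /^\p{Zs}$/u.test(MVS); // Zs: space separator
const isFormat = /^\p{Cf}$/u.test(MVS);         // Cf: format character

// String.prototype.trim only strips what the engine treats as whitespace or
// a line terminator, so a Cf character is left in place.
const survivesTrim = MVS.trim().length === 1;

console.log({ isSpaceSeparator, isFormat, survivesTrim });
```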
weird looking texts on REPL
There are three red dots in the parsed "value" key of the string literal. They represent \u2028, \u2029 and \ufeff respectively. Meriyah REPL uses CodeMirror to pretty-print the AST, and CodeMirror uses a \u2022 (Bullet) to represent a “special char”, so a red dot is printed: https://github.com/codemirror/CodeMirror/blob/01758b19565384414306816b43b5f35d81f039a3/src/line/line_data.js#L122
Note that when you copy from the AST, CodeMirror will send you the raw text, so you can compare it to the escaped version on your DevTools console (yes, Chrome DevTools also uses CodeMirror).
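To make that comparison easier, the copied text can be converted to its escaped form. The helper below is a hypothetical sketch (escapeNonPrintable is not part of meriyah or CodeMirror); it rewrites every UTF-16 code unit outside printable ASCII as a \uXXXX escape.

```javascript
// Hypothetical helper: rewrite every code unit outside printable ASCII as a
// \uXXXX escape so that invisible characters become readable.
function escapeNonPrintable(text) {
  return text.replace(/[^\x20-\x7e]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0')
  );
}

// Stand-in for text copied out of the AST view.
const copied = 'a\u2028b\u2029c\ufeff';
console.log(escapeNonPrintable(copied)); // a\u2028b\u2029c\ufeff
```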
how it can break an app
I have no idea how a parser can break an app without generating the app code from the parsed AST. So I guess the process here is that the source is parsed into an AST, and a generator then prints the app code from that AST.
For example, astring is a generator that can print an estree AST (generated by meriyah) to JavaScript code. TypeScript has a builtin parser and generator. One may also have their own generator.
In this case it can break the app because there are \u2028 and \u2029 in the literal. When a generator prints the string literal directly from decl.init.value, the generated code will break on legacy platforms because \u2028 and \u2029 must be escaped in string literals prior to ES2019 (https://ecma-international.org/ecma-262/#sec-intro). Since \u2028 and \u2029 are not printed in an equivalent escaped form in decl.init.value, the generator may print the unescaped characters to the source.
To preserve the raw text of the string literal, you can pass raw: true in the meriyah options, which will append a "raw" property to the literal node. The generator may then print the string literal using decl.init.raw. If you are using your own generator, please revise it to use decl.init.raw.
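For a custom generator, the advice above could look roughly like the sketch below. printStringLiteral and the hand-written node objects are hypothetical and are not astring's or meriyah's actual code; the only assumption taken from the thread is that a Literal node has a "value" and, with raw: true, a "raw" property.

```javascript
// Hypothetical printer for an ESTree string Literal node.
function printStringLiteral(node) {
  // With the parser's raw option the node carries the original source text,
  // escapes included, so it can be emitted unchanged.
  if (typeof node.raw === 'string') {
    return node.raw;
  }
  // Otherwise re-escape the value. JSON.stringify handles quotes, backslashes
  // and control characters but leaves U+2028 and U+2029 as-is, so those two
  // are escaped explicitly for pre-ES2019 targets.
  return JSON.stringify(node.value)
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Node shapes written by hand for illustration.
const withoutRaw = { type: 'Literal', value: 'a\u2028b\u2029c' };
const withRaw = { type: 'Literal', value: 'a\u2028b', raw: "'a\\u2028b'" };

console.log(printStringLiteral(withoutRaw)); // "a\u2028b\u2029c"
console.log(printStringLiteral(withRaw));    // 'a\u2028b'
```

Preferring raw keeps the author's original quoting and escapes; the fallback is only needed when the AST was produced without it.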
I’ll just make it clear as I found the original problem. All this stuff is borderline black magic, so I think we all need to take a step back and appreciate for a second how hard this shit is and how big-brained we all are. It’s basically computer science. Coming from a lowly angular developer.
I just want to build my angular app in ES5 as I have IE11-using customers. If I use meriyah, it breaks in this single and specific way. If I use ts, it builds fine but much slower. Can we focus on just solving this and moving forward pls