Handling escape symbols

From https://acss.io/#pseudo-classes:

Each line should parse as one word (i.e. one identifier):

C\(\#0280ae\)
C\(brandColor\)
C\(\#fff\)\:h:hover

This can be verified with https://rawgit.com/tabatkins/parse-css/master/example.html
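
To make the expectation concrete, here is a minimal sketch of consuming a name with escapes, loosely following the CSS Syntax rules as I understand them. It is not the PostCSS or postcss-values-parser tokenizer; `consumeNameEnd` and the other helper names are hypothetical. It does not check ident-start rules, and it stops at the first unescaped code point that is not a name character.

```typescript
// Hypothetical helpers for illustration only; not an existing library API.
const isNameStart = (ch: string): boolean => /[A-Za-z_]/.test(ch) || ch > "\u007f";
const isNameChar = (ch: string): boolean => isNameStart(ch) || /[0-9-]/.test(ch);
const isHex = (ch: string): boolean => /[0-9A-Fa-f]/.test(ch);

// A backslash starts a valid escape unless a newline or the end of input follows.
function isValidEscape(css: string, i: number): boolean {
  return css[i] === "\\" && i + 1 < css.length && css[i + 1] !== "\n";
}

// Returns the index just past the escape whose backslash is at `i`.
function skipEscape(css: string, i: number): number {
  let j = i + 1; // step over the backslash
  if (isHex(css[j])) {
    // Hex escape: up to six hex digits, then one optional whitespace.
    let digits = 0;
    while (j < css.length && digits < 6 && isHex(css[j])) {
      j++;
      digits++;
    }
    if (j < css.length && /\s/.test(css[j])) j++;
    return j;
  }
  return j + 1; // any other escaped code unit, e.g. "\(" or "\#"
}

// Returns the index where the name starting at `start` ends.
function consumeNameEnd(css: string, start: number = 0): number {
  let i = start;
  while (i < css.length) {
    if (isNameChar(css[i])) i++;
    else if (isValidEscape(css, i)) i = skipEscape(css, i);
    else break;
  }
  return i;
}

// Backslashes are doubled here only because these are string literals.
const first = "C\\(\\#0280ae\\)";
const second = "C\\(brandColor\\)";
console.log(consumeNameEnd(first) === first.length);   // true
console.log(consumeNameEnd(second) === second.length); // true
```

The point of the sketch is that a backslash and the code point after it are consumed together as part of the name, so `(`, `#` and `)` never get a chance to end the token early.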

This issue continues from https://github.com/shellscape/postcss-values-parser/issues/93

With thanks and credit to @nex3 and @ai for identifying this issue.


Update: I have started work on updating the Tokenizer, but I may need assistance as I integrate or abandon the current multichar tokens. I don’t necessarily see how those tokens benefit speed. Any risk of inaccuracy seems too steep a price to pay.

@ai, thank you for the wonderful documentation at https://github.com/postcss/postcss/blob/7.0.27/docs/architecture.md#tokenizer--libtokenizees6-

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 17 (15 by maintainers)

Top GitHub Comments

1 reaction
ai commented, May 15, 2020

> Andrey, please let me know if I’m pestering with these updates or if I can make them more helpful.

A new performance breakthrough is awesome 😍.

Let’s change the tokenizer in 8.0.

Is it possible to use it in the safe parser or the SCSS parser without changes? I forked the current one, and I can fork the new one too if it turns out to be impossible to customise.

1 reaction
jonathantneal commented, May 15, 2020

Andrey, please let me know if I’m pestering with these updates or if I can make them more helpful.

While analyzing the “slower” parts of the tokenizer, I found that eagerly checking the character ahead seems to improve overall performance. I have rewritten the tokenizer to take advantage of this.

I have also added two fields to each token: the opening and closing distances between the meaningful value of a token and its delimiters. “Delimiter” is a poor term, but it refers to the split between things like the @ and media in an @media At-Identifier token, the 2 and em in a 2em Number token, or the ", hello, and " in a "hello" String token.
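
As a rough illustration of those two distances, here is a small sketch. The `Token` shape, the field names `open` and `close`, and the `splitToken` helper are all hypothetical and are not taken from the tokenizer itself. It also assumes that for a Number token the unit counts as the closing part.

```typescript
// Hypothetical token shape for illustration; the real tokenizer may differ.
interface Token {
  type: string;  // illustrative label, e.g. "at-identifier"
  text: string;  // full source text of the token
  open: number;  // length of the opening part before the value
  close: number; // length of the closing part after the value
}

// Splits a token's text into [opening, value, closing] using the two distances.
function splitToken(token: Token): [string, string, string] {
  const valueEnd = token.text.length - token.close;
  return [
    token.text.slice(0, token.open),
    token.text.slice(token.open, valueEnd),
    token.text.slice(valueEnd),
  ];
}

const examples: Token[] = [
  { type: "at-identifier", text: "@media", open: 1, close: 0 },
  { type: "number", text: "2em", open: 0, close: 2 },
  { type: "string", text: '"hello"', open: 1, close: 1 },
];

for (const token of examples) {
  console.log(token.type, splitToken(token));
}
```

Storing two small numbers rather than separate substrings would let a consumer recover the parts of a token on demand without the tokenizer allocating extra strings up front.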

Anyway, I think you’ll really like these results!

Compressing PostCSS Tokenizer...

PostCSS Tokenizer Development:       1910 B
PostCSS Tokenizer Development (min):  638 B
PostCSS Tokenizer Development (web):  639 B

Collecting PostCSS Tokenizer Benchmarks...

PostCSS Tokenizer Development:       58721 tokens in 8 ms (1.0 times faster)
PostCSS Tokenizer Development (min): 58721 tokens in 8 ms (1.0 times faster)
PostCSS Tokenizer 7.0.30:            49548 tokens in 8 ms


Compressing PostCSS Parser...

PostCSS Parser Development:       1369 B
PostCSS Parser Development (min):  836 B
PostCSS Parser Development (web):  805 B

Collecting PostCSS Parser Benchmarks...

PostCSS Experimental Parser:       56024 nodes in 10 ms (1.6 times faster)
PostCSS Parser 7.0.30:              6240 nodes in 15 ms
PostCSS + Selector + Value Parser: 28491 nodes in 86 ms (5.5 times slower)

— From https://github.com/csstools/tokenizer#collecting-postcss-parser-benchmarks
