Object division and regex misparsed


There are several issues with how JavaScript code is parsed: a regex is sometimes interpreted as a comment, comments are interpreted as regexes, object literals as code blocks, ternary expressions as labels, and so on.

// Semi-keywords as identifier names (allowed by the specs)
`${async++}//`;
`${await++}//`;
`${let++}//`;

// Undefined after an object
+{1:+{}/undefined};

// Regex is highlighted like a comment when an object is divided by it inside a template literal substitution
`${{}/ /\//}`;

// Regex in a class extends clause
+class extends{1:{}/ /\//}{};

// Regex in a default function parameter
(a={}/ /\//)=>a;

// Regex in a ternary expression
{}/./gym?
{}/ /\//:
{}/ /\//;

// Spread operator before regex
[.../\//];

// Void before regex
void/\//;

// A regex in a do...while loop
do/\//
while(0);

// A regex in an if...else statement
if(0);else/\//;

// Template literal escaping sequence
`${1}//`;

// Generator function with empty name as a method of an object (wrongly interpreted)
+{*''(){}};

I have a few more, but I am not sure whether they depend on some of these, so I am not posting all of the issues for now.
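
One way to confirm that snippets like these are valid syntax (and so should survive a beautifier unchanged in meaning) is to hand them to the engine's own parser. The following is only a sketch: the `parses` helper and the `snippets` array are names made up for illustration, and only a subset of the cases above is included. `new Function(source)` parses the source as a sloppy-mode function body without running it.

```javascript
// Hypothetical helper: returns true when `source` parses as a function body.
// The body is compiled but never executed, so runtime errors cannot occur here.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// A subset of the cases from this issue, written as ordinary strings
// (each backslash of the original is doubled for string escaping).
const snippets = [
  "+{1:+{}/undefined};",
  "+class extends{1:{}/ /\\//}{};",
  "(a={}/ /\\//)=>a;",
  "[.../\\//];",
  "void/\\//;",
  "do/\\//\nwhile(0);",
  "if(0);else/\\//;",
  "+{*''(){}};",
];

for (const source of snippets) {
  console.log(JSON.stringify(source), parses(source) ? "parses" : "syntax error");
}
```

Comparing this result before and after beautifying a snippet would show whether the formatter changed how the code parses.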

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
ghost commented, Jan 7, 2018

@bitwiseman

As an answer to your question

“Where these patterns come from and what their intent is. What does +{} mean? Is valid javascript I guess, but what does it mean? The same with {}/ ///.”

The code +{} is used to distinguish an object from a code block. If we want to create an object and immediately call one of its methods, the following example is wrong:

{
  func: function(){
    console.log(1);
  }
}.func();

It will throw a syntax error. Why? Because there is no object here: the braces delimit a code block. If we want to make it an object, we need to put some unary operator before it (+, -, !, ~, etc.). The same applies to immediately calling a function:

function(){
  console.log(1);
}();

This is also a syntax error (the function name is omitted), but if we place a unary operator before it, it is no longer a function declaration statement but an anonymous function used in an expression, and we can call it.
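
For comparison, here is a small sketch of the working forms. The `calls` array is only there to record what ran, and the parenthesized variant is a common alternative that is not mentioned above.

```javascript
const calls = [];

// A leading unary operator puts the statement in expression position,
// so the braces are parsed as an object literal instead of a block.
+{
  func: function () {
    calls.push("object method");
  }
}.func();

// Same idea for a function: with `!` in front it is a function expression,
// not a declaration, so it can be called immediately.
!function () {
  calls.push("function expression");
}();

// Wrapping in parentheses has the same effect as a unary operator.
({
  func() {
    calls.push("parenthesized object");
  }
}).func();

console.log(calls);
```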

Your second question “What does +{} / /// mean?” is more abstract. According to the standard, it is an object divided by a regex. However, it might be difficult to find a real situation where someone would use it. For example, I use it in my code obfuscator, where I exploit the ECMAScript standard to produce hard-to-disassemble code and prevent stealing and modifying. I thought about it and actually came up with a real situation in which it can be used:

RegExp.prototype.a = 4;
var m = {
  valueOf: () => 12
} / /\//.a;
console.log(m);

Here, we define property a on every regex instance we create. Then we declare and initialize variable m as an object divided by a regex’s property a. The object is reduced to the number 12 and the second operand to 4, so 12 / 4 == 3. Of course, it would be used very rarely (I haven’t actually seen anyone use it besides my obfuscator), but it is still correct ES syntax.

[One paragraph removed due to privacy reasons]

No matter if some code appears in real situations or not, why not cover all cases? Also, code beautifiers are mostly used on minified and obfuscated code, which may include a lot of nonsense code structures. There are examples like this one which never throw a syntax error even though the code is useless. Another example is the empty regex character class, which is not allowed in PCRE but is allowed in JavaScript. It is correct syntax, but that regex is useless.
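
To illustrate the empty character class case, here is a minimal sketch (the variable name is arbitrary). The regex is accepted by JavaScript but can never match, because the class contains no characters.

```javascript
// `[]` is an empty character class: valid syntax, but no character satisfies it.
const emptyClass = /[]/;

console.log(emptyClass.test("abc")); // false
console.log(emptyClass.test(""));    // false
```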

0 reactions
bitwiseman commented, May 24, 2017

“If we want to make it object, we need to put some unary operator before it (+, -, !, ~, etc).”

That is a fascinating point. I wasn’t sure but the beautifier does treat +{} as an object literal for formatting. It seems likely that the problem is that the object literal detection currently occurs after parsing. So, the parser doesn’t figure out that the / is an operator rather than a regex literal. Ouch.
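
To make the ambiguity concrete, here is a rough sketch of the usual "look at the previous token" heuristic for deciding whether a `/` starts a regex literal or is a division operator. This is not the beautifier's actual code: the token shape `{ type, value, closesObjectLiteral }` and the function name are assumptions made up for this example, and it deliberately ignores harder cases such as postfix `++` and the `)` that closes an `if (...)` head.

```javascript
// Keywords that evaluate to a value, so a following `/` is division.
const VALUE_KEYWORDS = new Set(["this", "super", "null", "true", "false"]);

// Hypothetical helper: `prev` is the previous significant token, or null at start of input.
function slashStartsRegex(prev) {
  if (prev === null) return true; // nothing to divide yet
  switch (prev.type) {
    case "identifier":
    case "number":
    case "string":
    case "template":
    case "regex":
      return false; // a value just ended, so `/` divides it
    case "keyword":
      // `void /x/`, `else /x/`, `do /x/` expect an operand; `this / 2` divides.
      return !VALUE_KEYWORDS.has(prev.value);
    case "punctuator":
      if (prev.value === ")" || prev.value === "]") return false; // simplification
      // The hard case: `}` ends either an object literal (division follows)
      // or a block (a regex may follow). The tokenizer must already know which.
      if (prev.value === "}") return !prev.closesObjectLiteral;
      return true; // after `(`, `,`, `=`, `...`, `?`, `:` an operand is expected
    default:
      throw new TypeError(`unknown token type: ${prev.type}`);
  }
}

console.log(slashStartsRegex({ type: "keyword", value: "void" }));
console.log(slashStartsRegex({ type: "punctuator", value: "}", closesObjectLiteral: true }));
```

The `closesObjectLiteral` flag is the point of the sketch: if object literal detection only happens after tokenizing, that flag is not available when the `/` is reached, which matches the failure described here.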

“You also asked ‘Where these patterns come from?’. Well, my hobby is to create random code generators for different languages in order to check syntax highlighting.”

That is pretty awesome. There have been times where I thought, “There aren’t many kinds of software development more thankless than writing a code beautifier.” You have reminded me that testing is more thankless, and exhaustive/depth tests like what you’re describing are even more thankless than that. 😄 I’m impressed and I appreciate you doing this.

“No matter if some code appears in real situations or not, why not cover all cases?”

Short answer: because resources are limited. The beautifier only recently reached the point of “parsing” the JavaScript input before formatting. That parsing is approximate, and it also attempts to handle a number of templating languages that are not JavaScript, as well as other languages similar to JavaScript.

I agree that this is a bug in the parsing, and that it should be fixed. This is an important bug highlighting a flaw in one of the assumptions of the current parser implementation. At the same time, I have to balance that against the number of users impacted, the resources needed to fix it, and long term goals of the project. So, this case may take some time before it is fixed.
