[JavaScript] JS runtime does not compute state sets correctly
This is in reference to https://github.com/antlr/grammars-v4/issues/2007
I am trying to find out why the JS grammar at github.com/grammars-v4/javascript/javascript/JavaScript{Parser,Lexer}.g4 does not work with the JavaScript target on the 4.9.1 runtime for the input “i = 1;”. I converted the JavaScript{Parser,Lexer}Base.js code to ES6 classes (the old code was incompatible with Antlr 4.9, nobody had fixed the antlr/grammars-v4 JS code, and there is no CI testing of the grammars across all targets), and then automatically generated a driver for the JavaScript and C# targets. I then compared the two parsers side by side on the input “i = 1;” to see where they diverge. I have found at least two bugs in the JS runtime, specifically in IntervalSet.
Stepping into the programs, I print out the value of the IntervalSet in param decisionState for ParserATNSimulator.predicateDFAState(), for C# here and in JS here. DecisionState, subclassed from ATNState, has field nextTokenWithinRule of type IntervalSet, defined in C# here, and here in JS. In the debugger, I print out the field using “toString()”. The values are:
in JS, {<EOF>, 1, 4..5, 7, 9, 11, 18..23, 59..70, 71, 73, 76..77, 80..82, 83..84, 85..87, 87..89, 91..92, 93, 95, 98, 101..105, 104..106, 108..109, 116..119}
in C#, {{<EOF>, 1, 4..5, 7, 9, 11, 18..23, 59..71, 73, 76..77, 80..89, 91..93, 95, 98, 101..106, 108..109, 116..119}}
The sets are not the same.
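To make the difference concrete, here is a minimal sketch of a checker that flags neighbouring intervals that should have been merged. It is not part of the ANTLR runtime: the function name `findUnmergedPairs` and the `[start, stop]` pair representation (both ends inclusive, as printed by toString()) are assumptions made for illustration. The input is assumed to be sorted by start, and only the tail of each set (from 59 onward) is used.

```javascript
// Hypothetical checker, not part of the ANTLR runtime.
// Each interval is a [start, stop] pair with BOTH ends inclusive.
// Assumes the list is already sorted by start.
function findUnmergedPairs(intervals) {
  const problems = [];
  for (let i = 1; i < intervals.length; i++) {
    const left = intervals[i - 1];
    const right = intervals[i];
    if (right[0] <= left[1]) {
      // The next interval starts inside the previous one.
      problems.push({ kind: "overlap", left, right });
    } else if (right[0] === left[1] + 1) {
      // The next interval starts immediately after the previous one.
      problems.push({ kind: "adjacent", left, right });
    }
  }
  return problems;
}

// Tails of the two sets quoted above, from 59 onward.
const jsTail = [[59, 70], [71, 71], [73, 73], [76, 77], [80, 82], [83, 84],
  [85, 87], [87, 89], [91, 92], [93, 93], [95, 95], [98, 98],
  [101, 105], [104, 106], [108, 109], [116, 119]];
const csTail = [[59, 71], [73, 73], [76, 77], [80, 89], [91, 93], [95, 95],
  [98, 98], [101, 106], [108, 109], [116, 119]];

console.log(findUnmergedPairs(jsTail).length); // prints 6
console.log(findUnmergedPairs(csTail).length); // prints 0
```

On this data the JS tail yields four adjacent pairs and two overlapping pairs, while the C# tail yields none.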
There are two problems:
- The IntervalSet in JS does not coalesce contiguous ranges correctly. In JS, we see 59..70, 71, but in C# we see 59..71. I believe there is an “off by 1” check that is incorrect.
- In JS, the IntervalSet contains values that can overlap, e.g., 101..105 and 104..106. Again, there seem to be some “off by one” checks here.
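Both symptoms go away if insertion merges any interval that overlaps or touches the new one. The following is a minimal sketch of that idea, not the runtime's actual IntervalSet code: `addInterval` is a hypothetical helper, and it assumes intervals are `[start, stop]` pairs with both ends inclusive, kept sorted and disjoint.

```javascript
// Hypothetical helper, not the ANTLR runtime's IntervalSet.
// Intervals are [start, stop] pairs with BOTH ends inclusive,
// kept sorted by start, disjoint and non-adjacent.
function addInterval(intervals, start, stop) {
  const result = [];
  let i = 0;
  // Keep intervals that end before the new one with a gap of at least one value.
  while (i < intervals.length && intervals[i][1] < start - 1) {
    result.push(intervals[i]);
    i++;
  }
  // Absorb every interval that overlaps or is directly adjacent to the new one.
  while (i < intervals.length && intervals[i][0] <= stop + 1) {
    start = Math.min(start, intervals[i][0]);
    stop = Math.max(stop, intervals[i][1]);
    i++;
  }
  result.push([start, stop]);
  // Keep the remaining intervals, which start after a gap.
  while (i < intervals.length) {
    result.push(intervals[i]);
    i++;
  }
  return result;
}

let merged = [];
merged = addInterval(merged, 59, 70);
merged = addInterval(merged, 71, 71);   // adjacent: coalesces into 59..71
merged = addInterval(merged, 101, 105);
merged = addInterval(merged, 104, 106); // overlapping: coalesces into 101..106
console.log(JSON.stringify(merged)); // prints [[59,71],[101,106]]
```

The `- 1` and `+ 1` in the two comparisons are what make adjacent ranges merge with inclusive bounds. With a half-open representation the same comparisons would need different offsets, which is the kind of place an off-by-one error can creep in.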
I don’t know whether these are the cause of the stack overflow reported for the bug, but the sets in JS are wrong.
–Ken
Top GitHub Comments
That fixes the issue with IntervalSet, but it doesn’t fix the stack overflow problem. So, this is a step in the right direction. I will continue to debug and find out what else is wrong.
I looked over the runtime code for Python2/3 and Go. They’re all different from this and from each other, which makes it hard to judge whether there are bugs in those runtimes. I’d still recommend a rewrite. IntervalSet is a set of integers, and there is probably a better way to represent a sparse set than an ordered list.
Looks like the bug has been there since 2013, and a typo prevented a wider disaster… Can you check if PR #3077 fixes the observed gap?