Addition of unrelated, unused parser rule to grammar causes different parsing results

This is related to a Stack Overflow question: https://stackoverflow.com/q/73522066/4779853

The grammar is poorly written as an EOF-start rule, but that is irrelevant: parsing results should stay the same when a completely unrelated, unused parser rule is added or deleted. Something must therefore be wrong in the runtime.

grammar my; 
st: 'H' hd | EOF ;
hd: 'D' d | 'C' c | st ;
d: hd ;
c: 'D' c | hd ;
s1: 'D' s1 | c ;
// p: hd ;
SKP: [ \t\r\n]+ -> skip;

Input: H C D C C D

With rule p commented out, the parser behaves one way; with rule p uncommented, it behaves another way. It is illogical for an unused rule to influence a parse, because it never appears in a derivation. This must be investigated.
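
To make "unused" concrete, here is a minimal Python sketch that checks which rules can be reached from st. It is a rough text-based approximation and does not use the ANTLR tool or runtime. The helper names (rule_references, reachable_from) are hypothetical, st is assumed to be the start rule, and the regexes only handle simple grammars like this one (no actions, no ';' inside literals).

```python
import re

# The grammar from this issue, with rule p uncommented.
GRAMMAR = """
st: 'H' hd | EOF ;
hd: 'D' d | 'C' c | st ;
d: hd ;
c: 'D' c | hd ;
s1: 'D' s1 | c ;
p: hd ;
"""


def rule_references(grammar_text):
    """Map each parser rule name to the set of parser rules its body mentions."""
    refs = {}
    rules = re.findall(r"^([a-z]\w*)\s*:(.*?);", grammar_text, flags=re.M | re.S)
    for name, body in rules:
        body = re.sub(r"'[^']*'", " ", body)  # drop string literals such as 'H'
        # Parser rule names start with a lowercase letter; tokens like EOF do not.
        refs[name] = set(re.findall(r"\b[a-z]\w*\b", body))
    return refs


def reachable_from(refs, start):
    """Return every rule reachable from `start` by following rule references."""
    seen = set()
    pending = [start]
    while pending:
        rule = pending.pop()
        if rule in seen:
            continue
        seen.add(rule)
        pending.extend(refs.get(rule, ()))
    return seen


refs = rule_references(GRAMMAR)
used = reachable_from(refs, "st")  # assumption: st is the start rule
unused = sorted(set(refs) - used)
print("reachable from st:", sorted(used))
print("never reachable from st:", unused)
```

Under these assumptions the sketch reports p, and also s1, as never reachable from st, which is the sense in which no derivation starting at st can use them.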

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

kaby76 commented, Sep 1, 2022 (1 reaction)

Some good news.

  • There are no rules with EOF followed by another element in grammars-v4.
  • There are a few grammars where an EOF occurs on one alt but not in another. I’ll take a look at these.
./cool/COOL.g4 has an EOF in one alt, but not in another.
./csharp/Antlr4cs/CSharpPreprocessorParser.g4 has an EOF in one alt, but not in another.
./csharp/CSharp/CSharpPreprocessorParser.g4 has an EOF in one alt, but not in another.
./csharp/CSharpPreprocessorParser.g4 has an EOF in one alt, but not in another.
./golang/Go/GoParser.g4 has an EOF in one alt, but not in another.
./golang/GoParser.g4 has an EOF in one alt, but not in another.
./hypertalk/HyperTalk.g4 has an EOF in one alt, but not in another.
./java/java/JavaParser.g4 has an EOF in one alt, but not in another.
./javadoc/JavadocParser.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/CSharp/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/CSharpSharwell/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/Go/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/JavaScript/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/Python/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/ecmascript/TypeScript/ECMAScript.g4 has an EOF in one alt, but not in another.
./javascript/javascript/JavaScriptParser.g4 has an EOF in one alt, but not in another.
./javascript/jsx/JavaScriptParser.g4 has an EOF in one alt, but not in another.
./javascript/typescript/TypeScriptParser.g4 has an EOF in one alt, but not in another.
./kirikiri-tjs/TJSParser.g4 has an EOF in one alt, but not in another.
./kotlin/kotlin-formal/KotlinParser.g4 has an EOF in one alt, but not in another.
./objc/two-step-processing/ObjectiveCPreprocessorParser.g4 has an EOF in one alt, but not in another.
./save.generated/CSharpPreprocessorParser.g4 has an EOF in one alt, but not in another.
./save.generated/Test/CSharpPreprocessorParser.g4 has an EOF in one alt, but not in another.
./smtlibv2/SMTLIBv2.g4 has an EOF in one alt, but not in another.
./sql/tsql/TSqlParser.g4 has an EOF in one alt, but not in another.
./v/V.g4 has an EOF in one alt, but not in another.
./wat/WatParser.g4 has an EOF in one alt, but not in another.
  • Both of these conditions are really easy to find with Trash. Here’s a Bash script:
#!/usr/bin/bash
# Scan every grammar file, skipping generated files, examples, and lexers.
for i in $(find . -name '*.g4' | grep -v Generated | grep -v examples | grep -v Lexer)
do
 # Count elements that follow an element containing EOF in the same alternative.
 count=$(trparse "$i" -t antlr4 2> /dev/null \
  | trxgrep ' //parserRuleSpec//alternative/element[.//TOKEN_REF/text()="EOF"]/following-sibling::element' \
  | trtext -c)
 if [ "$count" != "0" ]
 then
  echo "$i has an EOF usage followed by another element."
 fi
 # Count alternatives containing EOF in rules that have more than one alternative.
 count=$(trparse "$i" -t antlr4 2> /dev/null \
  | trxgrep ' //labeledAlt[.//TOKEN_REF/text()="EOF" and count(../labeledAlt) > 1]' \
  | trtext -c)
 if [ "$count" != "0" ]
 then
  echo "$i has an EOF in one alt, but not in another."
 fi
done
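
For readers without Trash installed, the second check can be loosely approximated in Python. This is only a sketch and is not equivalent to the XPath query above. It works on raw grammar text, splits alternatives on every '|' (so it ignores parenthesized sub-rules), and assumes no ';' or '|' appears inside string literals. The function names are hypothetical, and the sample input is the grammar from this issue.

```python
import re

# The grammar from this issue (rule p left out).
GRAMMAR = """
st: 'H' hd | EOF ;
hd: 'D' d | 'C' c | st ;
d: hd ;
c: 'D' c | hd ;
s1: 'D' s1 | c ;
"""


def alternatives_with_eof(rule_body):
    """Return one boolean per alternative: does that alternative mention EOF?"""
    body = re.sub(r"'[^']*'", " ", rule_body)  # drop string literals first
    return [re.search(r"\bEOF\b", alt) is not None for alt in body.split("|")]


def mixed_eof_rules(grammar_text):
    """List parser rules where some, but not all, alternatives mention EOF."""
    rules = re.findall(r"^([a-z]\w*)\s*:(.*?);", grammar_text, flags=re.M | re.S)
    mixed = []
    for name, body in rules:
        flags = alternatives_with_eof(body)
        if any(flags) and not all(flags):
            mixed.append(name)
    return mixed


for name in mixed_eof_rules(GRAMMAR):
    print(f"rule {name} has an EOF in one alt, but not in another.")
```

On the grammar from this issue, the sketch flags only st, whose second alternative is EOF while its first is not.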

parrt commented, Sep 1, 2022 (0 reactions)

I’d rather we focus on larger issues than grammar warnings that I might find too risky to merge. Maybe help with antlr4-lab?
