Parser gives up instead of getting out of nested rule to try alternatives

Here’s a repo with this exact reproducible case: https://github.com/deltaidea/chevro-section-conditions

Syntax

A script consists of sections, which consist of commands. Sections cannot be nested. There are also if statements, both top-level (with sections inside) and section-level (with commands inside). See a sample script below.

Here’s pseudo-code of how I see the syntax:

Script = TopLevelStatement+
TopLevelStatement = TopLevelIf | Section
TopLevelIf = 'if FOO' TopLevelStatement+ 'endif'

Section = '<SECTION>' SectionStatement+
SectionStatement = SectionIf | 'COMMAND'
SectionIf = 'if FOO' SectionStatement+ 'endif'

Example

<SECTION>
  COMMAND

  if FOO
    COMMAND
  endif

if FOO
  <SECTION>
    COMMAND

  <SECTION>
    COMMAND
endif

In the example above the Parser tries the following path (whitespace is ignored):

<SECTION>
  COMMAND

  if FOO
    COMMAND
  endif

  if FOO
    // Expecting 'COMMAND' here, Parser thinks we're still inside the section started on line 1.
    // Instead it gets '<SECTION>' here and gives up.

• How do I force the parser to rewind, pop the rule stack and try parsing the ‘if’ as a TopLevelStatement?
• I can adjust the syntax definition in any way, so if there’s a better way to parse the same language, please tell me!

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

bd82 commented, May 28, 2017

These are actually really good questions.

First, a context-free parser cannot take the rule hierarchy of nested repetitions into account, as in the example you showed. Doing so would mean it is no longer context-free, because the decision would depend on the parent rule in the hierarchy, which may change between invocations.

Second, it looks like no fixed LL(K) lookahead is enough to decide between the alternatives, even if they were both in the same alternation, because you can always add more “COMMAND” tokens before the deciding “section”. This means that for any lookahead K, there is an input where K+1 tokens will be needed to disambiguate the choices.
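The unbounded-lookahead argument can be sketched in a few lines. This is an illustration only, taking the pseudo-grammar from the question literally and treating `if FOO` as a single `IF` token. The token names and both helper functions are hypothetical, not part of Chevrotain. In this sketch the padding that delays the decision is a run of nested `if` openers: the first token that is not `IF` decides whether the ambiguous `if` is top-level (`SECTION`) or section-level (`COMMAND`).

```typescript
// Hypothetical token names; the lexer in the linked repo may differ.
type Tok = "IF" | "ENDIF" | "SECTION" | "COMMAND";

// Counts the tokens that must be read, starting at the ambiguous IF,
// up to and including the first token that is not IF.
function tokensNeededToDecide(tokens: Tok[], ifIndex: number): number {
  let i = ifIndex;
  while (i < tokens.length && tokens[i] === "IF") i++;
  return i - ifIndex + 1;
}

// Builds: SECTION COMMAND, then `depth` nested IFs, the deciding token,
// one COMMAND, and the matching ENDIFs. The ambiguous IF sits at index 2.
function nestedIfs(depth: number, decider: "SECTION" | "COMMAND"): Tok[] {
  const opens = new Array<Tok>(depth).fill("IF");
  const closes = new Array<Tok>(depth).fill("ENDIF");
  return ["SECTION", "COMMAND", ...opens, decider, "COMMAND", ...closes];
}
```

For any chosen K, `tokensNeededToDecide(nestedIfs(K, "SECTION"), 2)` returns K + 1, so a parser limited to K tokens of lookahead sees only `IF` tokens and cannot choose.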

You may be able to solve this by using backtracking. https://github.com/SAP/chevrotain/blob/6a8febdcc78351b2b8610debc23f6d1b9bb1ce4d/test/full_flow/backtracking/backtracking_parser.ts#L62

But I strongly suggest you try to avoid backtracking, as it is slow, error-prone (state has to be reset) and mutually exclusive with automatic error recovery.
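To make the backtracking idea concrete, here is a minimal hand-written recursive-descent sketch. It does not use the Chevrotain API, and it is not the code from the linked test file. The names `lex`, `parse`, `Ast` and the `backtrack` flag are hypothetical. It assumes the pseudo-grammar from the question, treats `if FOO` as one token, and silently skips any text the toy lexer does not recognize.

```typescript
type Tok = "IF" | "ENDIF" | "SECTION" | "COMMAND";
type Ast = { kind: "section" | "if" | "command"; body: Ast[] };

// Toy lexer: whitespace is ignored, unknown text is skipped.
function lex(src: string): Tok[] {
  const names: Record<string, Tok> = {
    "if FOO": "IF", endif: "ENDIF", "<SECTION>": "SECTION", COMMAND: "COMMAND",
  };
  return (src.match(/if FOO|endif|<SECTION>|COMMAND/g) || []).map((w) => names[w]);
}

function parse(tokens: Tok[], backtrack: boolean): Ast[] {
  let pos = 0;
  const peek = (): Tok | undefined => tokens[pos];
  const expect = (t: Tok): void => {
    if (tokens[pos] !== t) {
      throw new Error(`expected ${t} at token ${pos}, got ${tokens[pos]}`);
    }
    pos++;
  };

  // SectionStatement = SectionIf | COMMAND
  function sectionStatement(): Ast {
    if (peek() === "IF") {
      expect("IF");
      const body = [sectionStatement()];
      while (peek() === "IF" || peek() === "COMMAND") body.push(sectionStatement());
      expect("ENDIF");
      return { kind: "if", body };
    }
    expect("COMMAND");
    return { kind: "command", body: [] };
  }

  // Section = SECTION SectionStatement+
  function section(): Ast {
    expect("SECTION");
    const body = [sectionStatement()];
    while (peek() === "IF" || peek() === "COMMAND") {
      const saved = pos; // the only state this sketch has to restore
      try {
        body.push(sectionStatement());
      } catch (e) {
        if (!backtrack) throw e;
        pos = saved; // rewind: leave this `if` to the enclosing rule
        break;
      }
    }
    return { kind: "section", body };
  }

  // TopLevelStatement = TopLevelIf | Section
  function topLevelStatement(): Ast {
    if (peek() === "IF") {
      expect("IF");
      const body = [topLevelStatement()];
      while (peek() === "IF" || peek() === "SECTION") body.push(topLevelStatement());
      expect("ENDIF");
      return { kind: "if", body };
    }
    return section();
  }

  const script = [topLevelStatement()];
  while (pos < tokens.length) script.push(topLevelStatement());
  return script;
}
```

With `backtrack` set to `false`, the sample script from the question fails at the `<SECTION>` inside the second `if`, which is the behavior described in the issue. With `backtrack` set to `true`, the same input parses into a section followed by a top-level `if`. The speculative attempt re-reads tokens and swallows the original error, which hints at why backtracking costs time and interferes with error reporting.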

Suggested Approach.

Explicitly close the “Section”. Just like you have IF and ENDIF, you can also have “END_SECTION”.
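As a rough sketch of why an explicit terminator helps, the recognizer below makes every decision from a single token. It is hand-written and hypothetical: `parseClosed` and the token names are illustrative, and `END_SECTION` is the only change to the pseudo-grammar from the question. It counts top-level statements instead of building a tree.

```typescript
type Tok = "IF" | "ENDIF" | "SECTION" | "END_SECTION" | "COMMAND";

// Returns the number of top-level statements; throws on the first unexpected token.
function parseClosed(tokens: Tok[]): number {
  let pos = 0;
  const peek = (): Tok | undefined => tokens[pos];
  const expect = (t: Tok): void => {
    if (tokens[pos] !== t) {
      throw new Error(`expected ${t} at token ${pos}, got ${tokens[pos]}`);
    }
    pos++;
  };
  const startsSectionStatement = (): boolean =>
    peek() === "IF" || peek() === "COMMAND";

  // SectionStatement = SectionIf | COMMAND
  function sectionStatement(): void {
    if (peek() === "IF") {
      expect("IF");
      do sectionStatement(); while (startsSectionStatement());
      expect("ENDIF");
    } else {
      expect("COMMAND");
    }
  }

  // TopLevelStatement = TopLevelIf | Section
  function topLevelStatement(): void {
    if (peek() === "IF") {
      expect("IF");
      do topLevelStatement(); while (peek() === "IF" || peek() === "SECTION");
      expect("ENDIF");
    } else {
      // Section = SECTION SectionStatement+ END_SECTION
      expect("SECTION");
      do sectionStatement(); while (startsSectionStatement());
      expect("END_SECTION"); // an IF seen before this point is always a SectionIf
    }
  }

  let count = 0;
  do {
    topLevelStatement();
    count++;
  } while (pos < tokens.length);
  return count;
}
```

Because the section ends only at `END_SECTION`, an `IF` seen inside the section loop can never belong to the enclosing rule, so no rewinding is needed.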

deltaidea commented, May 28, 2017

Glad to know I’m not the only one. 😄 Flat list it is then.

Nah, the indentation doesn’t mean anything to the game that is the canonical interpreter. Newlines are the only part that has meaning, and only as a separator. Like newline after return in JS - it only matters whether at least one is there.

My goal is to be able to parse all existing valid scripts to provide tooling to the whole scripting community. That’s why I can’t change the syntax, only the parser implementation.
