Add possibility to jump in a state and push new state on stack

In the current implementation, a rule in a stateful lexer can use only one of pop: true, push: "someState" or next: "someOtherState".

Imagine you are in "currentState" and a rule could set next: "continueHereState" and push: "parseSomething" at the same time. The lexer would enter "parseSomething", and the next pop would land in "continueHereState" instead of going back to "currentState".

I found this useful for parsing, for example, function calls in JavaScript with nested arrays and objects. Something like:

identifier(["some", "array", {}, 123], {"object": {"values": ["a", "b"]}, "whatever": false})

I’ve just tweaked these lines https://github.com/no-context/moo/blob/24b23ca961232df15f870f9c8db1c933f2a31e21/moo.js#L484-L486 to this:

    if (group.pop) {
      this.popState()
    } else if (group.push && group.next) {
      this.setState(group.next)
      this.pushState(group.push)
    } else if (group.push) {
      this.pushState(group.push)
    } else if (group.next) {
      this.setState(group.next)
    } 
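
To trace what this change does to the state stack, here is a minimal sketch. StateStack is a hypothetical stand-in written for illustration, not moo's actual Lexer class; it assumes pushState saves the current state before switching and popState restores the most recently saved one.

```javascript
// Hypothetical stand-in for the lexer's state handling (not moo's real class).
class StateStack {
  constructor(initial) {
    this.state = initial
    this.stack = []
  }

  setState(state) {
    this.state = state
  }

  // Assumption: pushing remembers the current state, then switches.
  pushState(state) {
    this.stack.push(this.state)
    this.setState(state)
  }

  // Assumption: popping returns to the most recently remembered state.
  popState() {
    this.setState(this.stack.pop())
  }

  // Same branching as the tweak above.
  apply(group) {
    if (group.pop) {
      this.popState()
    } else if (group.push && group.next) {
      this.setState(group.next)
      this.pushState(group.push)
    } else if (group.push) {
      this.pushState(group.push)
    } else if (group.next) {
      this.setState(group.next)
    }
  }
}

const s = new StateStack('currentState')
s.apply({push: 'parseSomething', next: 'continueHereState'})
const during = s.state // 'parseSomething'
s.apply({pop: true})
const after = s.state // 'continueHereState'
```

Because setState runs before pushState, it is "continueHereState" that gets saved on the stack, which is why the later pop lands there rather than in "currentState".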

Is this something you might want a PR for? Would it make sense to allow next inside a pop as well (resulting in setState directly after the pop)?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8

Top GitHub Comments

2 reactions
nathan commented, Sep 23, 2018

This is a fairly common problem people have when writing lexers and parsers. You generally want your lexer to be as dumb and permissive as possible, i.e., it should know nothing about syntax except what the tokens are and the absolute minimum necessary to distinguish among them (in your example, the ability to distinguish between regular text and JavaScript code). I’d recommend writing your lexer like this:

const moo = require('moo')

const lexer = moo.states({
  main: {
    label: {match: /#/, next: 'label'},
    text: moo.fallback,
  },
  label: {
    call: {match: /\w+\(/, value: s => s.slice(0, -1), next: 'call'},
    name: {match: /\w+/, next: 'main'},
  },
  call: {
    comma: ',',
    colon: ':',
    lbrace: '{',
    rbrace: '}',
    lbracket: '[',
    rbracket: ']',
    rparen: {match: ')', next: 'main'},
    true: 'true',
    false: 'false',
    null: 'null',
    ws: {match: /\s+/, lineBreaks: true},
    // Multi-digit alternative first, so that 123 is lexed as one token rather than three.
    number: /-?(?:[1-9]\d+|\d)(?:\.\d+)?(?:[eE][-+]?\d+)?/,
    string: /"(?:\\["bfnrt/\\]|\\u[a-fA-F0-9]{4}|[^"\\])*"/,
  },
})

lexer.reset(`what a #neat #thing() to #look({"hi":null,"blubb":{}}, [1, [null, []], 1], "hello", 123, "blubb") at`)

That gives you a token stream like this:

text what a 
label #
name neat
text  
label #
call thing
rparen )
text  to 
label #
call look
lbrace {
string "hi"
colon :
null null
comma ,
string "blubb"
colon :
lbrace {
rbrace }
rbrace }
comma ,
ws  
lbracket [
number 1
comma ,
ws  
lbracket [
null null
comma ,
ws  
lbracket [
rbracket ]
rbracket ]
comma ,
ws  
number 1
rbracket ]
comma ,
ws  
string "hello"
comma ,
ws  
number 123
comma ,
ws  
string "blubb"
rparen )
text  at

The reason your lexer should be permissive and un-clever is twofold:

  1. You don’t end up duplicating your work. Your parser is going to encode the full syntax of the language anyway (e.g., that every { must be matched by a } and contain key-value pairs), so there’s no reason your lexer needs to know that too, and you can save yourself some time and maintenance effort by not writing the language syntax out twice. Also, parsers are designed to encode structural information (whereas lexers are designed to encode character-based information), so you’ll find it much easier to describe the structural features of the language in a parser (e.g., lbrace (string colon value (comma string colon value)*)? rbrace instead of every state that starts with object in your example).
  2. You can give better error messages. When you’re lexing, the only information you have is: in the current state, the remainder of the source file doesn’t start with a valid token—that’s not a lot to work with. When you make your lexer overly permissive, your parser can give more informed feedback by talking about language constructs at the token and parse tree levels instead of at the character level (e.g., “expected , or ) after argument, but got number” instead of “unexpected 4”).
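
To make both points concrete, here is a minimal sketch of a hand-written parser for the argument list of one call. It assumes the tokens are plain {type, value} objects using the token names from the lexer above, with ws tokens already filtered out and the list starting right after the call token. parseCallArgs and tok are hypothetical names chosen for this illustration, not part of moo.

```javascript
// Hypothetical helper: parses `(value (comma value)*)? rparen` and returns
// the argument values as plain JavaScript data.
function parseCallArgs(tokens) {
  let pos = 0
  const describe = t => (t ? t.type : 'end of input')

  function expect(type) {
    const t = tokens[pos]
    if (!t || t.type !== type) {
      throw new Error(`expected ${type}, but got ${describe(t)}`)
    }
    pos++
    return t
  }

  // Parses `(item (comma item)*)? close`. closeText and what are only used
  // to build the error message.
  function parseList(closeType, closeText, what, parseItem) {
    const items = []
    if (tokens[pos] && tokens[pos].type === closeType) {
      pos++
      return items
    }
    for (;;) {
      items.push(parseItem())
      const t = tokens[pos]
      if (t && t.type === closeType) {
        pos++
        return items
      }
      if (!t || t.type !== 'comma') {
        throw new Error(`expected , or ${closeText} after ${what}, but got ${describe(t)}`)
      }
      pos++
    }
  }

  // string colon value
  function parseEntry() {
    const key = JSON.parse(expect('string').value)
    expect('colon')
    return [key, parseValue()]
  }

  function parseValue() {
    const t = tokens[pos]
    switch (t ? t.type : undefined) {
      case 'string': pos++; return JSON.parse(t.value)
      case 'number': pos++; return Number(t.value)
      case 'true': pos++; return true
      case 'false': pos++; return false
      case 'null': pos++; return null
      case 'lbracket':
        pos++
        return parseList('rbracket', ']', 'array element', parseValue)
      case 'lbrace':
        pos++
        return Object.fromEntries(parseList('rbrace', '}', 'object entry', parseEntry))
      default:
        throw new Error(`expected a value, but got ${describe(t)}`)
    }
  }

  return parseList('rparen', ')', 'argument', parseValue)
}

// Usage with hand-built tokens, equivalent to the input `"hello", [1])`.
const tok = (type, value) => ({type, value})
const args = parseCallArgs([
  tok('string', '"hello"'), tok('comma', ','),
  tok('lbracket', '['), tok('number', '1'), tok('rbracket', ']'),
  tok('rparen', ')'),
])
```

All of the nesting rules live in parseList and parseValue, so the lexer needs no extra states for them, and two numbers in a row fail with "expected , or ) after argument, but got number" rather than a character-level complaint.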

0 reactions
tjvr commented, Sep 29, 2018

That’s right: I don’t think we want to add a feature like this at this time. Moo is intended to be used with a parser of some kind, and we’re not planning to add parser-like features to it. (If someone was using Moo with a parser, and could demonstrate a use case that required this, then we’d think about it again.)

Good luck with your project! 😊
