Context sensitive tokens
Following the discussions on #993 and #1023, I’m wondering if it’d make sense to have an option for stateful tokens.
const stringStartAndEnd = createToken({
  effect: ({ insideString }) => ({ insideString: !insideString }),
  pattern: /'/
});
const literal = createToken({
  gate: ({ insideString }) => insideString,
  pattern: /[^']*/
});
For this to work, the lexer would be initialized with a state variable, an empty object at first.
If present, the effect function of a token is invoked whenever that token is encountered during lexing:
state = { ...state, ...token.effect(state) };
The gate function of a token is called with the state variable before a match is attempted, and the token is only considered if it returns a truthy value.
@bd82 does this make any sense?
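To make the proposal concrete, here is a minimal standalone sketch of how such a lexer loop could behave. It does not use Chevrotain: the `createToken` and `tokenize` functions below are hypothetical stand-ins written only for this illustration, and the `name` property and the `identifier` token are assumptions added so the output is readable. Zero-length matches are skipped so that the `/[^']*/` pattern cannot stall the loop.

```javascript
// Hypothetical stand-in for the proposed API (NOT Chevrotain's createToken).
// A token type is a plain object with optional `gate` and `effect` functions.
function createToken({ name, pattern, gate, effect }) {
  // Sticky flag: the pattern may only match exactly at `lastIndex`.
  const sticky = new RegExp(pattern.source, "y");
  return { name, pattern: sticky, gate, effect };
}

// Hypothetical lexer loop implementing the gate/effect idea.
function tokenize(input, tokenTypes) {
  let state = {}; // an empty object at first
  let offset = 0;
  const tokens = [];

  while (offset < input.length) {
    let matched = false;
    for (const type of tokenTypes) {
      // The gate is called with the state before a match is attempted.
      if (type.gate && !type.gate(state)) continue;
      type.pattern.lastIndex = offset;
      const match = type.pattern.exec(input);
      // Ignore failed and zero-length matches so the loop always advances.
      if (match === null || match[0].length === 0) continue;
      tokens.push({ type: type.name, image: match[0] });
      offset += match[0].length;
      // The effect's result is merged into the state after a match.
      if (type.effect) state = { ...state, ...type.effect(state) };
      matched = true;
      break;
    }
    if (!matched) throw new Error(`Unexpected character at offset ${offset}`);
  }
  return tokens;
}

const stringStartAndEnd = createToken({
  name: "Quote",
  effect: ({ insideString }) => ({ insideString: !insideString }),
  pattern: /'/
});
const literal = createToken({
  name: "Literal",
  gate: ({ insideString }) => insideString,
  pattern: /[^']*/
});
// Assumed extra token, only allowed outside of strings.
const identifier = createToken({
  name: "Identifier",
  gate: ({ insideString }) => !insideString,
  pattern: /[a-z]+/
});

const tokens = tokenize("abc'abc'", [stringStartAndEnd, literal, identifier]);
console.log(tokens.map((t) => `${t.type}(${t.image})`).join(" "));
```

In this sketch the same text `abc` is lexed as an Identifier outside the quotes and as a Literal inside them, which is the context-sensitive behaviour the gate and effect properties are meant to provide.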
Issue Analytics
- State:
- Created 4 years ago
- Comments:11 (10 by maintainers)
Top GitHub Comments
It seems like this feature is more popular than I thought…
Having a full guide e.g:
Would be best, but that is a fair bit of work to create…
Perhaps (as suggested) a small note could be added in the lexer tutorial to aid discoverability.
Fellow noobs should help each other out right 😉
@bd82 Maybe it’s a good idea to make multi-mode lexing more prevalent in the documentation? There isn’t really a tutorial page for it (although the linked example is plenty to understand how it works) and I didn’t find the page until you linked to it from another issue. Perhaps we could add a section to the lexer tutorial page that mentions it?