Help: Basic Intro Question
I could not find an appropriate place to field this question. Is there a Slack/Gitter/Discord channel for this community?
Anyway, I am new to Chevrotain and language design and have what I feel is a rather basic question, but for some reason the answer eludes me. For the purposes of this ticket, my use case is generating a parser that handles something akin to (5 * 30) < 200. I’ve essentially taken the calculator sandbox example and want to throw in a comparison operation. I’ve scrambled through the documentation, but have not found anything that helped me conceptually.
I essentially have this:
class Parser extends chevrotain.Parser {
  constructor(input) {
    super(input, tokens.list, { outputCst: true });
    const self = this;

    this.RULE('expression', () => {
      this.MANY(() => {
        this.OR([
          { ALT: () => this.SUBRULE(this.comparisonExpression) },
          { ALT: () => this.SUBRULE(this.additionExpression) },
        ]);
      });
    });

    this.RULE('comparisonExpression', () => {
      this.SUBRULE(this.additionExpression);
      this.CONSUME(tokens.comparisonOperator);
      this.SUBRULE2(this.additionExpression);
    });

    this.RULE('additionExpression', () => {
      this.SUBRULE(this.multiplicationExpression);
      this.MANY(() => {
        this.CONSUME(tokens.additionOperator);
        this.SUBRULE2(this.multiplicationExpression);
      });
    });

    this.RULE('multiplicationExpression', () => {
      this.SUBRULE(this.atomicNumericExpression);
      this.MANY(() => {
        this.CONSUME(tokens.multiplicationOperator);
        this.SUBRULE2(this.atomicNumericExpression);
      });
    });

    this.RULE('atomicNumericExpression', () => {
      this.OR([
        { ALT: () => this.SUBRULE(this.parenthesisExpression) },
        { ALT: () => this.CONSUME(tokens.number) },
      ]);
    });

    this.RULE('parenthesisExpression', () => {
      this.CONSUME(tokens.lParen);
      this.SUBRULE(self.additionExpression);
      this.CONSUME(tokens.rParen);
    });

    Parser.performSelfAnalysis(this);
  }
}

const parserInstance = new Parser([]);

module.exports = {
  parserInstance,
  parse: (inputText) => {
    const lexResult = lexer.tokenize(inputText);
    parserInstance.input = lexResult.tokens;
    parserInstance.expression();
    if (parserInstance.errors.length) {
      console.log(parserInstance.errors);
      throw Error(parserInstance.errors[0]);
    }
  }
};

My issue is that I’m getting a lot of error dumps saying that the alternatives share common prefix paths, e.g.:
-------------------------------
Ambiguous alternatives: <1 ,2> in <OR> inside <expression> Rule,
<lParen, lParen, number, additionOperator> may appears as a prefix path in all these alternatives.
To Resolve this, try one of of the following:
1. Refactor your grammar to be LL(K) for the current value of k (by default k=5)
2. Increase the value of K for your grammar by providing a larger 'maxLookahead' value in the parser's config
3. This issue can be ignored (if you know what you are doing...), see http://sap.github.io/chevrotain/documentation/0_35_0/interfaces/_chevrotain_d_.iparserconfig.html#ignoredissues for more details
My assumption is that I am missing something quite basic in my logical understanding; however, after banging my head against this, I’m finally asking for help.
Any help is much appreciated and thanks for this library!
Scott
Top GitHub Comments
Alternative: use a bottom-up parser (LR).
These kinds of parsers support left recursion (https://en.wikipedia.org/wiki/Left_recursion), which makes writing such calculator grammars very clean and concise. However, like everything in life, they also have their drawbacks, such as code that is harder to debug.
Recursive descent parsers are very popular despite their limitations because they are fairly easy to hand-craft, debug and reason about.
Alright, basically Chevrotain is an LL(K) parser, a.k.a. a recursive descent parser.
What this means is that at any point where there are multiple alternatives, the choice must be decidable using a fixed number of lookahead tokens.
In your case the top-level expression rule is either a comparison or an addition. Now let's find an example where no fixed lookahead suffices, for instance a parenthesized addition on its own versus the same parenthesized addition followed by a comparison operator.
In both these cases we can always add more parentheses, so in fact we would need infinite lookahead (or backtracking) to decide between the comparison and the addition.
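To make the lookahead argument concrete, here is a small sketch in plain JavaScript (no Chevrotain involved). It builds two token sequences, an addition wrapped in `depth` pairs of parentheses and the same thing followed by `< 3`, and counts how many tokens must be read before they differ. The function names and the sample input `1 + 2` are assumptions chosen purely for illustration.

```javascript
// Build the tokens for `1 + 2` wrapped in `depth` pairs of parentheses,
// optionally followed by `< 3`, which turns it into a comparison.
function nestedTokens(depth, isComparison) {
  const tokens = [
    ...Array(depth).fill("("),
    "1", "+", "2",
    ...Array(depth).fill(")"),
  ];
  return isComparison ? [...tokens, "<", "3"] : tokens;
}

// Count how many tokens have to be inspected before the addition and
// the comparison can be told apart (end of input counts as a token).
function lookaheadNeeded(depth) {
  const addition = nestedTokens(depth, false);
  const comparison = nestedTokens(depth, true);
  let shared = 0;
  while (shared < addition.length && addition[shared] === comparison[shared]) {
    shared++;
  }
  return shared + 1; // the token right after the shared prefix decides
}

console.log(lookaheadNeeded(0)); // 4
console.log(lookaheadNeeded(1)); // 6
console.log(lookaheadNeeded(10)); // 24
```

The required lookahead grows by two tokens per level of nesting, so a single pair of parentheses already exceeds the default k=5 mentioned in the error message, and no larger fixed k helps in general.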
An LL parser’s limitations make expression parsing a little ugly. The basic solution is to have each precedence level as a separate non-terminal. A much more in-depth analysis can be found here: https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm (search for “the classic solution”)
The calculator example already uses this classic solution, but perhaps another level of operators is needed to make it clearer (a pull request would be welcome 😄).
Actual code to solve this using the classic solution:
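As a minimal sketch of the classic solution, the following is a hand-rolled recursive descent parser in plain JavaScript. It is not Chevrotain code and not the maintainer's snippet; `tokenize`, `parse` and the operator set are assumptions made for illustration. The key point is that the comparison level never competes with the addition level: `expression` always parses an addition first and only then checks, with a single token of lookahead, whether a comparison operator follows.

```javascript
// Illustrative tokenizer: numbers, parentheses, arithmetic and comparison
// operators. Whitespace and any unrecognized characters are simply skipped.
function tokenize(text) {
  return text.match(/\d+(?:\.\d+)?|<=|>=|[-+*\/()<>]/g) || [];
}

const comparisons = {
  "<": (a, b) => a < b,
  ">": (a, b) => a > b,
  "<=": (a, b) => a <= b,
  ">=": (a, b) => a >= b,
};

function parse(text) {
  const tokens = tokenize(text);
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  // expression := addition (comparisonOperator addition)?
  function expression() {
    const left = addition();
    const compare = comparisons[peek()];
    if (compare) {
      next();
      return compare(left, addition());
    }
    return left;
  }

  // addition := multiplication (("+" | "-") multiplication)*
  function addition() {
    let value = multiplication();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const right = multiplication();
      value = op === "+" ? value + right : value - right;
    }
    return value;
  }

  // multiplication := atomic (("*" | "/") atomic)*
  function multiplication() {
    let value = atomic();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const right = atomic();
      value = op === "*" ? value * right : value / right;
    }
    return value;
  }

  // atomic := number | "(" addition ")"
  function atomic() {
    const token = next();
    if (token === "(") {
      const value = addition();
      if (next() !== ")") throw new Error("Expected closing parenthesis");
      return value;
    }
    const number = Number(token);
    if (Number.isNaN(number)) throw new Error("Expected a number, got " + token);
    return number;
  }

  const result = expression();
  if (pos < tokens.length) throw new Error("Unexpected token " + peek());
  return result;
}

console.log(parse("(5 * 30) < 200")); // true
console.log(parse("((1 + 2)) * 3")); // 9
```

In the grammar from the question, the analogous change would presumably be to drop the OR in the expression rule and make the comparison operator plus the second additionExpression an optional suffix of the first one (Chevrotain's OPTION is the usual construct for that); this mapping has not been verified against Chevrotain 0.35.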