Handling rules that are both left- and right-recursive
I’m reading the literature on PEGs. This paper talks about a bug in @alexwarth’s approach to left recursion, which is present in Ohm:
This grammar:
MyLang {
Expr = Expr "-" Num --a
| Num --b
Num = digit+
}
… Correctly parses "1-2-3" to (((1)-2)-3). But when I change the grammar to this:
MyLang {
Expr = Expr "-" Expr --a
| Num --b
Num = digit+
}
The tree is flipped to (1-(2-(3))).
Maybe this isn’t an issue in practice? It’s hard to say. But I can certainly see it tripping some people up.
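To make the difference concrete: subtraction is not associative, so the two tree shapes evaluate to different numbers. The following is a minimal Python sketch in which trees are modelled as nested tuples. That representation and the `evaluate` helper are assumptions made for illustration, not Ohm's parse tree format or API.

```python
def evaluate(tree):
    """Evaluate a tree that is either an int or a (left, "-", right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    assert op == "-"
    return evaluate(left) - evaluate(right)


# Hypothetical tuple encodings of the two trees from the issue.
left_assoc = ((1, "-", 2), "-", 3)    # (((1)-2)-3)
right_assoc = (1, "-", (2, "-", 3))   # (1-(2-(3)))

print(evaluate(left_assoc))    # -4
print(evaluate(right_assoc))   # 2
```

Any semantic action that computes a value from the tree would therefore give a different answer for "1-2-3" depending on which grammar is used.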
Issue Analytics
- State:
- Created 8 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Hi Joseph,
I’ve known about Laurie’s paper for a while, and (to put it mildly) I violently disagree with it 😃
Who’s to say that the result of applying a rule like
expr = expr "-" expr | num
should be left-associative? Or right-associative, for that matter? PEGs don’t support left-recursion! The only “correct” result of applying this rule is non-termination. Of course that isn’t terribly useful, so researchers have proposed extensions to PEGs that enable left-recursive rules to produce useful results. The semantics of these extensions isn’t prescribed by Bryan Ford’s definition of PEGs – it’s really up to the people who created the extensions.
Ohm’s approach to left recursion happens to produce right-associative parse trees for your example. This makes sense if you think of left recursion as a kind of loop: the second application of expr in expr "-" expr is like a nested loop, which will consume as much input as it can before the “outer loop” gets a chance to consume more input. I can imagine a different semantics in which the result would be left-associative. That would be fine, too, but certainly no more correct or valid than our current semantics.
You asked if this is an issue in practice. It isn’t. I’ll go a step further and say that this kind of “ambiguous” rule definition is code smell. Do you really want people to wonder whether expr produces a left- or right-associative result? No! In a well-written grammar, it’s obvious. In this case, you want the - operator to be left-associative, so you’re better off defining expr like this:
expr = expr "-" num | num
Now there’s no question about it.
We have plans to release a bunch of exciting new features and tools for Ohm. I want to prioritize that work over this, which I don’t consider to be a real issue. So I’m closing it.
Thanks!
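The “nested loop” reading described in the comment above can be sketched in a few lines of Python. This is a simplified seed-growing parser for a single directly left-recursive rule, written only for illustration. It is not Ohm's actual implementation, and `parse` and its `right_operand` switch are hypothetical names. The idea, under those assumptions: the first left-recursive call at a position fails (the seed), and the rule body is then re-run in a loop for as long as it consumes more input.

```python
def parse(text, right_operand):
    """Parse with  expr = expr "-" <right_operand> | num.

    right_operand is "expr" or "num" (a hypothetical switch for this sketch).
    Returns a nested-tuple tree, or None if the whole input does not match.
    """
    memo = {}  # start position -> (tree, end position), or None for failure

    def num(pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        return (text[pos:end], end) if end > pos else None

    def body(pos):
        # Alternative 1: expr "-" <right_operand>
        left = expr(pos)
        if left is not None and text.startswith("-", left[1]):
            operand = expr if right_operand == "expr" else num
            right = operand(left[1] + 1)
            if right is not None:
                return ((left[0], "-", right[0]), right[1])
        # Alternative 2: num
        return num(pos)

    def expr(pos):
        if pos in memo:
            return memo[pos]
        memo[pos] = None  # seed: the left-recursive call fails at first
        while True:
            result = body(pos)
            best = memo[pos]
            # Stop growing once the body no longer consumes more input.
            if result is None or (best is not None and result[1] <= best[1]):
                return best
            memo[pos] = result

    result = expr(0)
    if result is not None and result[1] == len(text):
        return result[0]
    return None


print(parse("1-2-3", "num"))    # (('1', '-', '2'), '-', '3')
print(parse("1-2-3", "expr"))   # ('1', '-', ('2', '-', '3'))
```

With `right_operand="expr"`, the inner application starting at "2" runs its own growing loop and consumes "2-3" before the outer loop gets another turn, which yields the right-associative tree. With `right_operand="num"`, only the outer loop grows, so the tree comes out left-associative.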
Good idea! See #56.