Greediness of rules

Is it possible to adjust the greediness of rules with regex terminals?

My text to parse is this, for example:

OTHER STATEMENT
GO
CREATE TABLE mytable (
    var1 float,
    var2 float
)
GO

Here OTHER STATEMENT is something I don’t really care about, but it could span multiple lines and contain multiple tokens, quotation marks, parentheses, etc. All we know is that it terminates with the word GO. What I actually want to capture is the CREATE TABLE statement. My attempt at a Lark grammar is this:

program: (statement "GO")* statement ["GO"]
statement: create_table_statement | other_statement
create_table_statement.1: "CREATE TABLE" table_name signature
table_name: CNAME
signature: "(" typed_variable ("," typed_variable)* ")"
typed_variable: variable type
variable: CNAME
type: CNAME
other_statement: /.+/s

%import common.CNAME
%import common.WS
%ignore WS

The problem, though, is that the whole text matches other_statement, because /.+/s also consumes the GO keyword that is supposed to be the statement separator.
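The greediness described above can be reproduced outside Lark with Python's `re` module. This is only a sketch: it assumes that the `s` flag in the grammar corresponds to `re.S` (dot matches newlines), and the variable names are illustrative.

```python
import re

# The sample input from the question.
text = """OTHER STATEMENT
GO
CREATE TABLE mytable (
    var1 float,
    var2 float
)
GO
"""

# Assumed equivalent of the grammar's /.+/s terminal.
greedy = re.compile(r".+", re.S)

m = greedy.match(text)
# With dot matching newlines, ".+" runs to the end of the input,
# so both GO separators end up inside a single match.
consumed_everything = m.end() == len(text)
print(consumed_everything)
```

Because a single match already covers the whole input, nothing is left over for the "GO" terminal or for create_table_statement.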

Is there a way within Lark to achieve what I want? I have a vague idea of how to preprocess the text and remove the irrelevant statements before feeding it to Lark, but that may require some additional coding effort (making sure that “GO” is not part of a quoted string).

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
erezsh commented, Apr 14, 2022

Oh, sorry, bad formatting. Something like .+?(?=\sGO)
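As a rough illustration of how the suggested pattern behaves, here is a sketch using plain Python `re` rather than Lark. The helper `split_statements` is hypothetical, and it assumes every separator is exactly one whitespace character followed by GO.

```python
import re

text = """OTHER STATEMENT
GO
CREATE TABLE mytable (
    var1 float,
    var2 float
)
GO
"""

# The suggested pattern: lazy ".+?" stops as soon as the lookahead
# sees a whitespace character followed by GO. The lookahead itself
# consumes nothing, so the separator is left for the next step.
LAZY = re.compile(r".+?(?=\sGO)", re.S)

def split_statements(source):
    """Hypothetical helper: collect the text between GO separators."""
    statements = []
    pos = 0
    while True:
        m = LAZY.match(source, pos)
        if m is None:
            break
        statements.append(m.group(0).strip())
        # Skip the separator: one whitespace character plus "GO".
        pos = m.end() + len("\nGO")
    return statements

statements = split_statements(text)
print(statements)
```

Note that `\sGO` would also match inside a quoted string or in front of a longer word such as GOTO, so this pattern alone does not address the quoting concern raised in the question.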

“it’s also useful to be able to parse a document partially”

Absolutely, although it’s a less common use-case. We’ve been asked about it in the past, but I still haven’t figured out a way for Lark to give a good answer for this. It might be best to skip the unknown sections manually, and then only parse the structured parts. (You can use interactive_parse to get even more control over the parsing mechanism.)
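One way to skip the unknown sections manually might look like the sketch below. It is not a Lark feature. It assumes that GO sits on a line of its own and that strings are single-quoted with '' as the escaped quote. The names `TOKEN` and `split_on_go` are made up for this example.

```python
import re

# Scan the text piece by piece. Quoted strings are matched first, so a
# GO line inside a string is never treated as a separator.
TOKEN = re.compile(
    r"""
    (?P<quoted>'(?:[^']|'')*')   # single-quoted string (assumed syntax)
  | (?P<sep>^[ \t]*GO[ \t]*$)    # a line containing only GO (assumed)
  | (?P<other>.)                 # any other single character
    """,
    re.S | re.M | re.X,
)

def split_on_go(source):
    """Hypothetical helper: split into statements at GO lines."""
    statements, current = [], []
    for m in TOKEN.finditer(source):
        if m.lastgroup == "sep":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(m.group(0))
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements

text = (
    "INSERT INTO t VALUES ('a\nGO\nb')\n"
    "GO\n"
    "CREATE TABLE mytable (\n    var1 float\n)\n"
    "GO\n"
)

statements = split_on_go(text)
# Keep only the statements the grammar is meant to understand.
wanted = [s for s in statements if s.upper().startswith("CREATE TABLE")]
print(wanted)
```

Each string in `wanted` could then be handed to a parser built from a grammar that only describes create_table_statement, so no catch-all other_statement rule is needed.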

0 reactions
ymeiron commented, Apr 14, 2022

.?(?=\sGO), however, is a zero-width regexp; I’m not sure how it could be used.

I understand that it’s best to have the complete grammar, but it’s also useful to be able to parse a document partially. And it seems to mostly work fine when the separator is included in the statement.
