Understanding the parse result

As a test, I have a super-simple grammar that accepts numbers across lines.

However, the result has an incredible amount of array-nesting, which increases if the input has multiple lines.

For instance, the input “4” yields

[[[[{"type":"number","value":"4","text":"4","offset":0,"lineBreaks":0,"line":1,"col":1}]]],null]

Why four levels of arrays? And more if there are more lines.

In general, what are the rules for traversing the result as an AST? It’s hard to know how deep you need to start looking for actual objects. What do the arrays signify at each level?

Grammar below (yes, over-complicated for numbers, but this is the starting point for something else):

@{%
const moo = require("moo");

const lexer = moo.compile({
    ws: {match: /[ \t\n\r]+/,  lineBreaks: true },
    number: /[0-9]+/,
});
%}

@lexer lexer

start -> exprlist:* %ws:?
exprlist -> expr
    | exprlist %ws expr
expr -> %number

Top GitHub Comments

kach commented, Aug 24, 2017

Right. The first level of array-ness (the one with null as its second element) comes from the start nonterminal. Its first element corresponds to exprlist:*. The second element, null, records that the %ws:? item was not matched: “4” has no whitespace after it.

The second level of array-ness comes from the exprlist:* item. :* as you know means “match zero or more of these”. That means the “result” is actually an array of results, where each element in the array represents one exprlist. Since the input “4” only matches one exprlist, that array has one element.

The third level of array-ness is from the fact that you matched one expr in an exprlist, from the rule exprlist -> expr. Why is this in an array? Well, even though there’s only one item in the rule (expr), nearley always returns an array as the result: one element for each item. So, the first element in that array represents a parsed expr. If the rule was exprlist -> expr "cow" then the array would have two elements: one for the expr and one for the string "cow".

You can probably guess the fourth level of array-ness already, now! It’s from the rule expr -> %number, which returns a single-element array for exactly the same reason that the rule exprlist -> expr does.
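The four levels can be reproduced by hand. The sketch below is a plain-JavaScript model of the behaviour described above, not nearley's implementation; ruleResult and the trimmed-down tokens are hypothetical names used only for illustration. It also shows why a second line adds depth: the left-recursive rule exprlist -> exprlist %ws expr wraps the previous exprlist result inside a new array.

```javascript
// Model only: NOT nearley's code. ruleResult is a hypothetical helper.
// Without a postprocessor, a rule yields an array with one entry per item.
const ruleResult = (...items) => items;

// Trimmed-down stand-ins for moo tokens (real tokens carry more fields).
const four = { type: "number", value: "4" };
const five = { type: "number", value: "5" };
const ws = { type: "ws", value: "\n" };

// Input "4"
const expr = ruleResult(four);                 // expr -> %number
const exprlist = ruleResult(expr);             // exprlist -> expr
const exprlistStar = [exprlist];               // exprlist:* matched once
const start = ruleResult(exprlistStar, null);  // start -> exprlist:* %ws:?

console.log(JSON.stringify(start));
// [[[[{"type":"number","value":"4"}]]],null]

// Input "4\n5": the left-recursive rule nests the earlier result one level deeper.
const twoLines = ruleResult(exprlist, ws, ruleResult(five)); // exprlist -> exprlist %ws expr
console.log(twoLines[0][0][0].value, twoLines[2][0].value);  // 4 5
```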

Okay, so, how do we fix this? The easy answer is to use postprocessors (see Tim’s link). For example,

expr -> %number {% function(d) {return d[0];} %}

will now make expr return the contents of the %number, not an array! (It extracts the first element of the result array by doing d[0]).

In fact, the function `function(d) {return d[0];}` is so common that nearley provides it as a built-in: you can simply write

expr -> %number {% id %}
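To see what this does to the nesting, here is a small plain-JavaScript sketch. The id function below is a stand-in with the same body as the function quoted above, and the example assumes, beyond what the comment says, that {% id %} is also added to exprlist -> expr while start keeps the default behaviour.

```javascript
// Stand-in for the built-in postprocessor: return the first matched item.
const id = (d) => d[0];

// Trimmed-down stand-in for a moo token (real tokens carry more fields).
const token = { type: "number", value: "4" };

const expr = id([token]);          // expr -> %number {% id %}      gives the token itself
const exprlist = id([expr]);       // exprlist -> expr {% id %}     (assumed addition)
const start = [[exprlist], null];  // start -> exprlist:* %ws:?     no postprocessor

console.log(JSON.stringify(start));
// [[{"type":"number","value":"4"}],null]
```

Two of the four array levels disappear. The remaining ones come from exprlist:* (a list of matches) and from start itself (one entry per item).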

tjvr commented, Aug 24, 2017

@Hardmath123 will be along later to explain better, but I recommend reading the nearley documentation on postprocessors 🙂

By default, nearley wraps everything matched by a rule into an array.
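Because every rule without a postprocessor contributes one array level, a raw result can also be inspected without knowing the depth in advance by flattening it. This is only a debugging sketch: collectTokens is a hypothetical helper, not part of nearley, and postprocessors remain the way to shape the output properly.

```javascript
// Raw result from the question, with the token trimmed to two fields for brevity.
const raw = [[[[{ type: "number", value: "4" }]]], null];

// Hypothetical helper: flat(Infinity) removes every array level, and the
// filter drops the nulls left behind by optional items that did not match.
const collectTokens = (result) =>
  result.flat(Infinity).filter((item) => item !== null);

console.log(collectTokens(raw).map((t) => t.value)); // [ '4' ]
```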
