Generate parser as Python C-Extension


Suggestion

Lark really makes creating parsers easy, but unfortunately the generated parser is very slow. I recently had to debug why things were so slow and found that most of the time was spent on parsing. Hardcoding simple and frequently used cases like _id="UUID" improved performance 10 times. I’m not talking about 10 times faster parsing, but about overall app performance.

So maybe adding the possibility to generate the parser as a C extension with the same interface would really improve performance? Having a Python-based parser is nice for prototyping, but eventually it has to be ported to a lower-level language to increase performance.

Describe alternatives you’ve considered

Switching the grammar from Earley to LALR would probably increase performance, but a C parser would still be a lot faster.

I was looking at pegen, but I’m not sure whether it is intended for anything other than the Python language itself.

Another option would probably be to port the grammar from Lark to bison and flex, but that sounds too complicated.

Additional context

I’m working on a data management project where I use a small expression language for data transformation. The parser I use can be found here:

https://gitlab.com/atviriduomenys/spinta/-/blob/master/spinta/spyna.py

The main performance issues were in an upsert action, where I am upserting a lot of data and each upsert action has a "_where": "_id='UUID'" expression that is parsed with Lark. This parsing took most of the time.
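The hardcoding workaround described above could look roughly like the following minimal sketch. It is not the code from spyna.py: the names parse_where and slow_parse and the tuple returned for the fast path are hypothetical, chosen only for illustration. The idea is to recognize the one hot expression shape with a precompiled regular expression and hand everything else to the full parser.

```python
import re

# Matches only the literal shape _id='<uuid>' (single or double quotes).
_ID_EQ = re.compile(
    r"""\s*_id\s*=\s*(['"])([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\1\s*"""
)


def slow_parse(expr):
    """Hypothetical stand-in for the full Lark-based parser."""
    raise NotImplementedError("plug the real parser in here")


def parse_where(expr, fallback=slow_parse):
    """Return a result for the hot case directly, else defer to the fallback."""
    match = _ID_EQ.fullmatch(expr)
    if match:
        # Hypothetical result shape: (operator, field name, value).
        return ("eq", "_id", match.group(2))
    return fallback(expr)
```

In real use the fast path would have to return exactly the same structure as the full parser, so that callers cannot tell which path handled the expression.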

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 1
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

2 reactions
ThatXliner commented, Nov 12, 2020

I have successfully transpiled Lark (via Nuitka) into C, but common.lark and other metadata files caused some trouble during the transpilation.

2 reactions
erezsh commented, Oct 29, 2020

"Probably updating grammar from Earley to LALR would increase performance"

Yes, definitely. Actually, it’s possible that LALR in Python would be faster than Earley in C.

Your grammar looks simple enough that it should be possible to use LALR.
