Refactor parsers

The extract methods of the parsers have become unruly now that we support so many attributes, so it would be good to refactor them into smaller chunks. One possibility is to use decorators, which would let all wrapped functions update progress automatically, provide a method that returns which attributes are actually supported along with any parser-specific docstrings, and make it possible to parse just the attributes one cares about instead of the whole file.

I’ve pushed some code to my repo for this decorator and how it would be used: https://github.com/ATenderholt/cclib/commit/59453355360e3a682f10160778507cfdbfe646a3

Here’s an example:

>>> from cclib.parser import Gaussian
>>> Gaussian.get_supported_attributes().keys()
dict_keys(['newattr'])  # the only attribute parsed with decorated function is newattr
>>> Gaussian.get_supported_attributes()["newattr"].__name__
'parsing_test'             # the function is called 'parsing_test'
>>> Gaussian.get_supported_attributes()["newattr"].__doc__
'Parses newattr.'        # and here's its docstring
>>> parser = Gaussian("nonexistant.log")
>>> parser.fupdate = 0.5  # needed in this demo because fupdate isn't set unless parse() is called
>>> parser.parsing_test("some good line", None)  # let's parse a line that should actually enter this function; None is because I don't have an associated inputfile in this example.
some good line
>>> parser.parsing_test("some bad line", None)  # here's a block we should skip over
>>>
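
For readers who don't want to open the commit, here is a minimal sketch of how a decorator along these lines could produce the behaviour in the session above. It is not the code from the linked commit: `parses`, `DemoParser`, the `trigger` substring check, the `wanted` set and the `progress_updates` counter are all hypothetical names chosen for illustration, and the real cclib parsers are not involved.

```python
import functools


def parses(attribute, trigger):
    """Mark a method as the parser for `attribute` (hypothetical helper).

    The wrapped method only runs on lines containing `trigger`.
    """

    def decorator(func):
        @functools.wraps(func)  # keep __name__ and __doc__ for documentation
        def wrapper(self, line, inputfile):
            # Skip attributes the caller did not ask for.
            if self.wanted is not None and attribute not in self.wanted:
                return None
            # Skip lines that do not belong to this block.
            if trigger not in line:
                return None
            self.progress_updates += 1  # stand-in for a real progress callback
            return func(self, line, inputfile)

        wrapper.attribute = attribute  # tag used to discover supported attributes
        return wrapper

    return decorator


class DemoParser:
    """Toy stand-in for a parser class; not the real cclib API."""

    def __init__(self, wanted=None):
        self.wanted = wanted  # None means "parse every supported attribute"
        self.progress_updates = 0
        self.parsed = {}

    @classmethod
    def get_supported_attributes(cls):
        """Map each supported attribute name to its decorated method."""
        return {
            member.attribute: member
            for member in vars(cls).values()
            if hasattr(member, "attribute")
        }

    @parses("newattr", trigger="good")
    def parsing_test(self, line, inputfile):
        """Parses newattr."""
        self.parsed["newattr"] = line
        return line
```

Because the wrapper is built with `functools.wraps`, the name and docstring of each small parsing function survive decoration, which is what would make generated documentation possible. The cost is one extra function call per line per decorated method, which is the overhead mentioned below.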

The only downside I see is the overhead of calling into the decorators, but I think the simplified code and the methods that help generate documentation would make it worth it.

Issue Analytics

State:
Created 7 years ago
Reactions:2
Comments:7 (7 by maintainers)

Top GitHub Comments

berquist commented, Mar 1, 2022

According to https://twitter.com/mmwieclaw/status/1498652945394683911/photo/1, we are 10x slower than the custom parsing code in https://github.com/mishioo/tesliper/blob/master/tesliper/extraction/gaussian_parser.py for Gaussian.

berquist commented, Jan 21, 2022

An additional reason to refactor is to find where hotspots are, because parsing can be slow (proof in https://github.com/patonlab/GoodVibes/pull/43#issuecomment-1018148575).
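
As a rough illustration of why smaller functions help here: once parsing is split into per-attribute functions, a standard profiler reports each one separately. The sketch below uses Python's built-in `cProfile` and `pstats`; `extract_energy`, `parse` and the sample lines are made-up stand-ins, not cclib code.

```python
import cProfile
import io
import pstats


def extract_energy(line):
    """Hypothetical per-attribute parsing step."""
    return float(line.split()[-1])


def parse(lines):
    """Hypothetical top-level loop that delegates to small functions."""
    return [extract_energy(line) for line in lines if line.startswith("SCF Done")]


def profile_parse(lines, top=5):
    """Run parse() under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(parse, lines)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Made-up input: half the lines match, half do not.
lines = ["SCF Done: E = -76.0"] * 1000 + ["other text"] * 1000
energies, report = profile_parse(lines)
print(report)
```

With one monolithic extract method, the same report would attribute nearly all the time to a single entry, which says little about where to optimise.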
