Customization of base_paths for relative imports
Currently, if I understood correctly, the `base_paths` for relative grammar imports are derived only from the following:
- the directory containing the grammar file(-like object), taken from the grammar file's path
- the directory containing the main script, taken from `sys.modules['__main__'].__file__`
However, this limits the architecture of projects that use lark.
Imagine we have the following architecture:
    |- package/
       |- submodule1/
          |- parser.py      # with a string grammar wanting to import some
                            # shared rules/terminals from the shared grammar
       |- submodule2/       # maybe I have another submodule that wants to
                            # import some rules/terminals from the shared grammar
       |- shared/
          |- grammar/
             |- shared.lark # some shared rules/terminals
                            # say we have a terminal `INT` defined for the example below
Well, we could probably use the second option listed above (the main script's directory) to do the relative imports?
    // not sure if this works
    %import .shared.grammar.shared.INT
However, this doesn’t work if the package is not run as the main script. For example, under pytest, `sys.modules['__main__'].__file__` becomes the path to pytest.
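The two base paths described at the top, and the reason the second one breaks under a test runner, can be sketched roughly as follows. This is a minimal illustration using only the standard library; `default_base_paths` is a hypothetical helper name, not a function from lark, and the real lookup inside lark may differ in detail.

```python
import os
import sys


def default_base_paths(grammar_path=None):
    """Collect candidate base directories in the order described in the issue.

    Hypothetical helper for illustration only; not part of lark.
    """
    paths = []
    # 1) Directory of the grammar file, when the grammar came from a file.
    if grammar_path is not None:
        paths.append(os.path.dirname(os.path.abspath(grammar_path)))
    # 2) Directory of whatever module is currently running as __main__.
    #    Under a test runner this is the runner's entry point, not the project.
    main_file = getattr(sys.modules.get("__main__"), "__file__", None)
    if main_file is not None:
        paths.append(os.path.dirname(os.path.abspath(main_file)))
    return paths
```

When the grammar is given as a string, only the second entry is available, so the result depends entirely on how the process was started.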
Ideas/Proposals:
- Reading paths from an env variable, say `LARK_GRAMMAR_PATH` (just like `PYTHONPATH`).
- Opening up the `source` argument of the `Lark` constructor so that it can be overridden from outside.
- Allowing a grammar pre-loaded by `load_grammar.load_grammar`, since this function already allows specifying `grammar_name`, from which the `base_paths` can be derived.
  With the above architecture, say in `parser.py`, we can do:

      from pathlib import Path
      from lark import Lark, load_grammar

      # Absolute path of the shared grammar, relative to this file
      SHARED_GRAMMAR = str(Path(__file__).parent.parent.joinpath('shared', 'grammar', 'shared.lark').resolve())

      # Proposed usage: the second argument names the grammar,
      # so that base_paths can be derived from it
      grammar = load_grammar.load_grammar(
          r"""
          start: INT
          %import .shared.INT
          """,
          SHARED_GRAMMAR
      )
      parser = Lark(grammar, ...)

  Then in the `Lark` constructor, we need to tell whether the grammar is pre-loaded:

      from lark.load_grammar import Grammar

      class Lark:
          def __init__(self, grammar, **options):
              if isinstance(grammar, Grammar):
                  # Grammar was already loaded; use it as is
                  self.grammar = grammar
                  self.source = '<pre-loaded>'
                  # may have to deal with caching separately in such case
              else:
                  ...  # regular grammar loading
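For the first proposal, reading the search directories from an environment variable could look roughly like this. It is only a sketch: `LARK_GRAMMAR_PATH` is the name suggested in the issue, not a variable lark is known to read, and `grammar_paths_from_env` is a hypothetical helper. Entries are split with `os.pathsep`, the same separator `PYTHONPATH` uses.

```python
import os


def grammar_paths_from_env(var_name="LARK_GRAMMAR_PATH", environ=None):
    """Split a PATH-style variable into a list of absolute directories.

    Hypothetical helper; the variable name comes from the proposal above.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var_name, "")
    # Empty entries (e.g. from a trailing separator) are ignored.
    return [os.path.abspath(entry) for entry in raw.split(os.pathsep) if entry]
```

The resulting list could then be appended to whatever base paths are already in use, so that the existing behavior stays the default.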
Issue Analytics
- State:
- Created 3 years ago
- Comments: 9 (6 by maintainers)
Top GitHub Comments
Why not just add an `include_path` option to Lark? Like `Lark(..., include_path='./shared/grammar')`?

I’ve yet to look up all of the links sent (thanks!)… but maybe won’t have to! Sometimes, the good people of the internet are just much faster than I am:
https://github.com/sympy/sympy/pull/19825
I did raise my concerns about being able to modify behavior, but am mostly just excited about seeing the implementation.
I think I just saw one lark file in that PR, so it wouldn’t matter too much to this issue… but I’m still interested in how it develops!
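
To make the `include_path` suggestion from the first comment more concrete, here is a rough sketch of how a dotted import such as `.shared.INT` might be resolved against a list of include directories. Everything here is an assumption for illustration: `resolve_grammar_import` is a hypothetical function, and it only assumes that the last dotted component is the rule or terminal name and the rest is the path to a `.lark` file.

```python
from pathlib import Path


def resolve_grammar_import(dotted, include_paths):
    """Map '.shared.INT' to (grammar file, name) by searching include_paths.

    Hypothetical resolver; not lark's actual import mechanism.
    """
    *module_parts, name = dotted.lstrip(".").split(".")
    if not module_parts:
        raise ValueError(f"{dotted!r} does not name a grammar file")
    # 'shared' -> shared.lark, 'shared.grammar.shared' -> shared/grammar/shared.lark
    relative = Path(*module_parts).with_suffix(".lark")
    for base in include_paths:
        candidate = Path(base) / relative
        if candidate.is_file():
            return candidate, name
    raise FileNotFoundError(f"{relative} not found in any include path")
```

With the architecture from the issue, passing the `shared/grammar` directory as the only include path would let `.shared.INT` resolve to `shared.lark` regardless of which script started the process.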