Error parsing preprocessor statements with a space in front after stubbed macros
I’m noticing a misparse when a #define is preceded by a space. This input:
#define MACRO_STUB
#define MACRO x
yields this parse:
(translation_unit (preproc_def name: (identifier)) (preproc_def name: (identifier) value: (preproc_arg)))
but if I insert a space before the second #define like this:
#define MACRO_STUB
 #define MACRO x
yields this (incorrect) parse:
(translation_unit (preproc_def name: (identifier) value: (preproc_arg)))
Notably, the parser still produces correct output if the first macro is not stubbed:
#define MACRO_NOSTUB x
 #define MACRO x
(translation_unit (preproc_def name: (identifier) value: (preproc_arg)) (preproc_def name: (identifier) value: (preproc_arg)))
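For context, C allows whitespace before the # of a directive, so the space-prefixed input is legal and a compiler treats it exactly like the unindented form. Below is a minimal sketch of that. The value 42 (in place of x) and the helper functions macro_value and stub_is_defined are made up for illustration so the result can be checked; they are not part of the issue.

```c
/* Leading whitespace before '#' is permitted in C, so both lines
   below are ordinary object-like macro definitions. */
#define MACRO_STUB
 #define MACRO 42

/* Hypothetical helper: returns whatever MACRO expands to. */
int macro_value(void) {
    return MACRO;
}

/* Hypothetical helper: reports whether the stub macro was defined. */
int stub_is_defined(void) {
#ifdef MACRO_STUB
    return 1;
#else
    return 0;
#endif
}
```

Both macros end up defined, which is the behaviour the expected parse (two preproc_def nodes) should reflect.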
So far I’ve tried one change, which didn’t work (I don’t understand the parser well): changing preproc_arg to:
preproc_arg: $ => token(prec(-1, repeat1(/[^\n]|\\\r?\n/))),
Issue Analytics
- State:
- Created 2 years ago
- Comments: 10 (2 by maintainers)
In the extreme case, a preproc directive can occur between any two tokens. I think putting them in extras is the only correct solution.

@pluick – It seems comment tokens are returned by tree-sitter, there is even a test case (extra_non_terminals) in the tree-sitter repo which shows them being returned despite being in extras. Otherwise, every editor using tree-sitter for syntax highlighting must be using language-specific hacks for highlighting comments…

I’d just like to point out an example:
There are two versions of the file – both are syntactically valid, but it’s impossible to combine both versions in a way that fits the grammar. I suppose this could be handled by treating the different branches as ambiguities but tree-sitter isn’t designed to parse ambiguous grammars, is it?
If a ‘perfect’ parse is impossible, maybe the preprocessor aspect of the C/C++ grammars should be designed with that in mind.
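The example referred to above is not reproduced in this text. As a purely hypothetical illustration of that kind of file (not the commenter's own code), the sketch below has two conditional branches that each open an if block, with a single closing brace shared by both. The macro USE_UPPER_BOUND and the function clamp_is_needed are invented for this sketch.

```c
/* Hypothetical flag, not taken from the thread. */
#define USE_UPPER_BOUND 1

/* Each preprocessed version of this function is valid C, but the
   unpreprocessed text contains two opening braces for one closing
   brace, so no single syntax tree covers both branches. */
int clamp_is_needed(int value, int limit) {
#if USE_UPPER_BOUND
    if (value > limit) {
#else
    if (value < limit) {
#endif
        return 1;
    }
    return 0;
}
```

With USE_UPPER_BOUND set to 1 the compiler only sees the first if header; with 0 it only sees the second. A parser working on the raw text sees both headers and has to pick one, or give up on a balanced tree.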