Introduce MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE warning
Consider the following set: [A-z]. Most likely it is an incorrect definition, because the range
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
contains unintended characters:
[\]^_`
A correct and clear definition is [A-Za-z].
If these characters are intended, they can be added explicitly:
[A-Za-z[\\\]^_`]
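To make the problem concrete, here is a minimal Python sketch that expands the range the same way a character set does, by code point, and lists what falls between the two letter blocks. The helper name `expand_range` is made up for illustration and is not part of ANTLR.

```python
def expand_range(start: str, end: str) -> str:
    """Return every character from start to end inclusive, ordered by code point."""
    return "".join(chr(cp) for cp in range(ord(start), ord(end) + 1))


# The range written as [A-z] in a grammar.
full_range = expand_range("A", "z")

# Everything in that range that is not a letter.
unintended = "".join(ch for ch in full_range if not ch.isalpha())

print(full_range)
print(unintended)  # [\]^_`
```

The six extra characters sit at code points 91 through 96, between `Z` (90) and `a` (97), which is why [A-z] silently matches them.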
I suggest adding at least a new warning, MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE. It is especially relevant when the caseInsensitivity option is used. Alternatively, it could be named RANGE_PROBABLY_CONTAINS_NOT_IMPLIED_CHARACTERS.
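A rough sketch of what such a check could look like, written in Python only for illustration. This is not ANTLR's implementation: the function name `range_mixes_case` is hypothetical, and limiting the check to ASCII bounds is an assumption made here to keep the rule simple.

```python
def range_mixes_case(start: str, end: str) -> bool:
    """Hypothetical check: does an ASCII range run from an upper-case letter
    to a lower-case letter, as in [A-z]?"""
    # Assumption: only ASCII bounds are examined; other ranges are never flagged.
    if not (start.isascii() and end.isascii()):
        return False
    return start.isupper() and end.islower()


# Ranges as they might appear inside a lexer character set.
for start, end in [("A", "z"), ("A", "Z"), ("a", "z"), ("0", "9")]:
    if range_mixes_case(start, end):
        print(f"warning: range {start}-{end} probably contains unintended characters")
```

Only the `A-z` pair triggers the message. The reverse order (`a-Z`) is not handled because its start is above its end, so it is not a valid ascending range to begin with.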
@parrt what do you think?
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
I started testing our grammars-v4 and encountered some subtle cases with the current warning that relate to Unicode ranges. I was probably too hasty in settling on the solution here, but I'll resolve it (restrict to ANSI characters) or revert it tomorrow. Sorry.
Yet another argument for including grammars-v4 in the test infrastructure in some way (integration testing).
It would be good to take a look at this PR while I'm fixing the remaining case-insensitivity issues: https://github.com/antlr/antlr4/pull/3349 It resolves a lot of issues, including character-related ones (but it should be merged after the case-insensitivity PR).