Introduce MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE warning

Consider the following set: [A-z]. Most likely this is an incorrect definition, because the range

ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz

contains characters that were probably not intended:

[\]^_`

Correct and clear definition: [A-Za-z].
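
To make the problem concrete, here is a minimal Python sketch that expands the range by code point and lists the non-letter characters caught in between. The helper name expand_range is hypothetical and only for illustration; it is not part of ANTLR.

```python
def expand_range(start: str, end: str) -> str:
    """Return every character from start to end, inclusive, by code point."""
    return "".join(chr(cp) for cp in range(ord(start), ord(end) + 1))


# Everything matched by the set [A-z].
full = expand_range("A", "z")

# Characters that fall inside [A-z] but are not letters.
extras = [ch for ch in full if not ch.isalpha()]

print(full)
print(extras)
```

The six extra characters sit between 'Z' (code point 90) and 'a' (code point 97), which is why a range that crosses from upper case to lower case picks them up.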

If these characters are intended, they can be added explicitly:

[A-Za-z[\\\]^_`]

I suggest adding at least a new warning, MIX_OF_LOWER_AND_UPPER_CHAR_CASE_IN_RANGE. It’s especially relevant when the caseInsensitivity option is used. Alternatively, it could be named RANGE_PROBABLY_CONTAINS_NOT_IMPLIED_CHARACTERS.
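
A rough sketch of the kind of check such a warning could perform, written in Python for brevity. This is an illustration under assumptions, not ANTLR's implementation: the function name mixes_letter_case and the message text are hypothetical.

```python
def mixes_letter_case(start: str, end: str) -> bool:
    """Hypothetical check: True if one endpoint of a range is an
    upper-case letter and the other is a lower-case letter."""
    return (start.isupper() and end.islower()) or (
        start.islower() and end.isupper()
    )


# Endpoints of a few sample ranges: [A-z], [A-Z], [a-z], [0-9].
for start, end in [("A", "z"), ("A", "Z"), ("a", "z"), ("0", "9")]:
    if mixes_letter_case(start, end):
        print(f"warning: range {start}-{end} mixes upper and lower case")
```

Only the A-z range triggers the message here; ranges that stay within one case, or that contain no letters at their endpoints, pass silently.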

@parrt what do you think?

Top GitHub Comments

KvanTTT commented, Dec 26, 2021

I started testing our grammars-v4 and encountered some subtle cases with the current warning that are related to Unicode ranges. I was probably too hasty about the correct solution here, but I’ll either resolve it (restrict the check to ANSI characters) or revert it tomorrow. Sorry.
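
One possible reading of the restriction mentioned above, sketched in Python. Treating "ANSI characters" as the ASCII letters A-Z and a-z is an assumption, and the function name mixes_ascii_letter_case is hypothetical; the exact Unicode cases that caused trouble are not described in the thread. The sample range À-ÿ is only an example of a range that a check based on general Unicode case properties would also flag.

```python
import string

# Assumption: "ANSI characters" is read here as the ASCII letters only.
ASCII_LETTERS = set(string.ascii_letters)


def mixes_ascii_letter_case(start: str, end: str) -> bool:
    """Hypothetical check: flag a range only when both endpoints are
    ASCII letters and their cases differ."""
    if start not in ASCII_LETTERS or end not in ASCII_LETTERS:
        return False  # non-ASCII endpoints are left alone
    return start.isupper() != end.isupper()


print(mixes_ascii_letter_case("A", "z"))   # ASCII range crossing cases
print(mixes_ascii_letter_case("À", "ÿ"))   # non-ASCII range, not flagged
```

Limiting the check this way trades coverage for fewer surprising warnings on Unicode ranges.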

Yet another argument for including grammars-v4 in the test infrastructure in some way (integration testing).

KvanTTT commented, Dec 25, 2021

It would be good to take a look at this PR while I’m fixing the remaining case-insensitivity issues: https://github.com/antlr/antlr4/pull/3349. It resolves a lot of issues, including character-related ones (but it should be merged after the case-insensitivity PR).