Avoid using non-ASCII Unicode characters outside of comments and literals

Since upgrading to Error Prone 2.11.0, I’ve started getting the following error when building within IntelliJ:

Foo.java:17:2
java: [UnicodeInCode] Avoid using non-ASCII Unicode characters outside of comments and literals, as they can be confusing.
    (see https://errorprone.info/bugpattern/UnicodeInCode)

When I view the file in Vim or a hex dump, I can’t see any non-ASCII characters.

Line 17 is the end of the file. I can’t supply the whole file due to work constraints, but below is a screenshot of the end of the file from hexedit.

Within IntelliJ the formatter is doing

If I downgrade Error Prone to 2.10.0, the offending file builds fine.

Top GitHub Comments

chashnikov commented, Aug 10, 2022

I’ve found the cause: Javac modifies the content of the file passed to it as a char[] (see UnicodeReader.java:103), replacing the last character with 0x1a. If this array is cached (the original Javac implementation also caches it, but the code in IntelliJ does so differently to improve performance), Error Prone may receive the modified content and report an error. Note that this Javac code was rewritten as part of JDK-8224225, so the problem shouldn’t appear in Java 16 and newer versions.
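
The mechanism described above can be illustrated with a small, self-contained sketch. Everything here is hypothetical and only models the behaviour: `lexInPlace` stands in for a lexer that overwrites the last character of its input buffer with 0x1a, and `firstSuspiciousChar` stands in for a check that accepts only printable ASCII and common whitespace. Neither is the actual Javac or Error Prone code, and the exact set of characters Error Prone accepts is an assumption.

```java
import java.util.Arrays;

public class SentinelDemo {
    // Hypothetical stand-in for a lexer that marks end of input
    // by overwriting the last character of the buffer it was given.
    static void lexInPlace(char[] buf) {
        if (buf.length > 0) {
            buf[buf.length - 1] = 0x1a;
        }
    }

    // Hypothetical check (assumed rule): returns the index of the first
    // character that is neither printable ASCII nor \n, \r, \t; -1 if none.
    static int firstSuspiciousChar(char[] buf) {
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            boolean ok = (c >= 0x20 && c <= 0x7e) || c == '\n' || c == '\r' || c == '\t';
            if (!ok) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A cached buffer shared between the lexer and a later check.
        char[] cached = "class Foo {}\n".toCharArray();
        System.out.println(firstSuspiciousChar(cached)); // prints -1

        lexInPlace(cached); // the shared buffer is mutated
        System.out.println(firstSuspiciousChar(cached)); // prints 12 (the last index)

        // Handing the lexer a defensive copy keeps the cached content intact.
        char[] original = "class Foo {}\n".toCharArray();
        lexInPlace(Arrays.copyOf(original, original.length));
        System.out.println(firstSuspiciousChar(original)); // prints -1
    }
}
```

In this model the flagged position is always the last character of the file, which matches the report pointing at the end of the file even though the file on disk contains only ASCII.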

elefeint commented, May 20, 2022

FYI, there is an issue filed on the IntelliJ side, too – https://youtrack.jetbrains.com/issue/IDEA-288257
