Bug: `key-spacing` with character encoded as two code units

Environment

Node version: v18.2.0
npm version: 8.10.0
Local ESLint version: 8.16.0
Global ESLint version: none
Operating System: Ubuntu 22.04 LTS

What parser are you using?

Default (Espree)

What did you do?

Configuration
{
    "parserOptions": {
        "ecmaVersion": "latest",
        "sourceType": "module"
    },
    "rules": {
        "key-spacing": [2, { "align": "value" }]
    }
}

const foo = {
    "a": "bar",
    "𐌘": "baz" // U+10318 https://en.wikipedia.org/wiki/Old_Italic_(Unicode_block)
};

Object.keys(foo).forEach((key) => {
    console.log(key, key.length, [...key].length);
});
// a 1 1
// 𐌘 2 1

What did you expect to happen?

No error reported.

What actually happened?

$ npx eslint index.js

/home/regseb/testcase/index.js
  2:10  error  Missing space before value for key 'a'  key-spacing

✖ 1 problem (1 error, 0 warnings)
  1 error and 0 warnings potentially fixable with the `--fix` option.

Participation

  • I am willing to submit a pull request for this issue.

Additional comments

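JavaScript strings are encoded as UTF-16:

  • A character at or below U+FFFF is encoded as one code unit.
  • A character above U+FFFF is encoded as two code units (a surrogate pair).

String.length counts code units, not characters, so the key "𐌘" is measured as two units wide even though it renders as a single character.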
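To make the encoding point concrete, here is a small sketch that lists the UTF-16 code units of a string next to its code point count. The helper name `describeEncoding` is made up for this illustration and is not part of ESLint.

```javascript
// Hypothetical helper for illustration only; not part of ESLint.
// Returns the UTF-16 code units of a string (as hex) together with
// the code unit count and the code point count.
function describeEncoding(text) {
    const codeUnits = [];

    for (let i = 0; i < text.length; i++) {
        codeUnits.push(text.charCodeAt(i).toString(16).toUpperCase());
    }

    return {
        codeUnits,
        codeUnitCount: text.length,        // what String.length reports
        codePointCount: [...text].length   // string iteration walks code points
    };
}

console.log(describeEncoding("a"));
console.log(describeEncoding("\u{10318}")); // same character as "𐌘" in the test case
```

For U+10318 the two code units form the surrogate pair D800 DF18, which is why `key.length` is 2 while `[...key].length` is 1.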

I think we need to change the function getKeyWidth().
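As a rough sketch of why the width measurement matters for "align": "value", the snippet below computes how much padding each key would need so that all values start in the same column. Everything here (`widthByCodeUnits`, `widthByCodePoints`, `paddingFor`) is hypothetical and only mimics the idea; it is not the actual getKeyWidth() code from ESLint.

```javascript
// Hypothetical helpers for illustration; NOT ESLint's real implementation.
const widthByCodeUnits = (text) => text.length;
const widthByCodePoints = (text) => [...text].length;

// With value alignment, each key needs (longest width - own width)
// extra columns of padding so that every value starts in the same column.
function paddingFor(keys, measure) {
    const longest = Math.max(...keys.map(measure));

    return Object.fromEntries(keys.map((key) => [key, longest - measure(key)]));
}

// The two keys from the test case, including their quotes.
const keys = ['"a"', '"\u{10318}"'];

console.log(paddingFor(keys, widthByCodeUnits));  // "a" gets 1 column of padding
console.log(paddingFor(keys, widthByCodePoints)); // no key gets padding
```

Measuring in code units makes the second key look one column wider than it renders, so the sketch asks for an extra space on the `"a"` line. That is consistent with the "Missing space before value for key 'a'" report above, assuming the rule measures keys this way.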

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

nzakas commented, Jul 13, 2022 (3 reactions)

@regseb please open a separate issue for that.

grapheme-splitter seems like the right approach to me. Thanks @ljharb!

mdjermanovic commented, Jul 16, 2022 (0 reactions)

> grapheme-splitter seems like the right approach to me.

Agreed, marking as accepted.
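The comments settle on counting graphemes rather than code units or code points. The sketch below illustrates the idea with the built-in `Intl.Segmenter` instead of the grapheme-splitter package, purely so that it runs without dependencies; it is a stand-in for the approach discussed, not the change that was made in ESLint. The function name `countGraphemes` is chosen for this example.

```javascript
// Illustrative stand-in: uses the built-in Intl.Segmenter rather than the
// grapheme-splitter package mentioned in the comments.
function countGraphemes(text) {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    let count = 0;

    // Each segment is one user-perceived character (grapheme cluster).
    for (const segment of segmenter.segment(text)) {
        count += 1;
    }

    return count;
}

const astral = "\u{10318}";  // "𐌘": 2 code units, 1 code point
const combined = "e\u0301";  // "e" + combining acute accent: 2 code points

console.log(astral.length, [...astral].length, countGraphemes(astral));
console.log(combined.length, [...combined].length, countGraphemes(combined));
```

The second string shows why counting code points alone is not enough: a base letter followed by a combining mark is two code points but is displayed as one character, and only grapheme segmentation counts it as one.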
