Feature request: Treat Chinese text as a sequence of characters when using `Ctrl+Left/Right`
Currently, VSCode treats a long run of Chinese text as one “word”. Each time you press `Ctrl+Left/Right`, the cursor jumps to the beginning or end of the whole run.
The request is to treat Chinese text as a sequence of characters, so that each `Ctrl+Left/Right` moves the cursor just one step. This is the default behavior of the system's text programs.
Example (using `|` as the cursor):
|本文的学习公式
// Ctrl+Right
本文的学习公式|
Expected:
|本文的学习公式
// Ctrl+Right
本|文的学习公式
// Ctrl+Right
本文|的学习公式
// Ctrl+Right
本文的|学习公式
(Of course, it would be even better if it could support word segmentation.)
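The expected behavior can be sketched as a small function that computes the next cursor stop. This is only an illustration, not VSCode's actual cursor code: `isHanChar` and `nextStopRight` are hypothetical names, and the sketch assumes a "Chinese character" is a BMP code unit in the CJK Unified Ideographs block or its Extension A.

```typescript
// Hypothetical helper, not part of VSCode.
// Assumption: a "Chinese character" is a code unit in U+4E00..U+9FFF
// (CJK Unified Ideographs) or U+3400..U+4DBF (Extension A).
function isHanChar(ch: string): boolean {
  const code = ch.charCodeAt(0);
  return (code >= 0x4e00 && code <= 0x9fff) || (code >= 0x3400 && code <= 0x4dbf);
}

// Returns the offset where the cursor would stop after one Ctrl+Right.
// A Chinese character counts as its own step; any other run of
// non-space characters is skipped as a single word.
function nextStopRight(text: string, offset: number): number {
  let i = offset;
  // Skip any spaces directly after the cursor.
  while (i < text.length && text.charAt(i) === " ") i++;
  if (i >= text.length) return text.length;
  // One Chinese character is one step.
  if (isHanChar(text.charAt(i))) return i + 1;
  // Otherwise consume the rest of the non-Chinese, non-space run.
  while (i < text.length && text.charAt(i) !== " " && !isHanChar(text.charAt(i))) i++;
  return i;
}
```

With the example above, calling `nextStopRight("本文的学习公式", 0)` yields offset 1, and repeated calls advance one character at a time.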
Issue Analytics
- State:
- Created 5 years ago
- Reactions:18
- Comments:10 (3 by maintainers)
Top GitHub Comments
This is a longstanding problem which virtually all East-Asian developers will notice once they start editing natural sentences (say, in Markdown) on vscode. I think this is fundamentally a problem of wrong word-splitting for CJK languages (and perhaps Thai, too), which use no spaces to delimit words. A similar problem happens when you double-click a word in a line (the whole line will be selected instead of the target word) and when you trigger an autocompletion using <kbd>Ctrl</kbd>+<kbd>Space</kbd> (a whole line will be shown as a candidate).
Ideally, dictionary-based word segmentation is desirable (this is available on MS Word, Google Chrome browser, etc), but it’s not 100% correct, and I’m not sure if it is really necessary for a code editor. Another practical approach that works at least in Japanese is to split words based on character types, because a typical Japanese text is a mixture of kanji, hiragana and katakana (This algorithm is implemented on most domestic text editors and even MS Notepad.exe). Character types can be easily determined via Unicode code points.
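A minimal sketch of the character-type approach might look like the following. It is an assumption-laden illustration rather than the algorithm any particular editor uses: `classifyChar` and `splitByCharClass` are hypothetical names, only the basic Hiragana, Katakana and CJK Unified Ideographs blocks are checked, and everything else (Latin letters, punctuation such as `。`, spaces) is lumped into a single "other" class.

```typescript
type CharClass = "kanji" | "hiragana" | "katakana" | "other";

// Classify a single BMP character by its Unicode code point.
// Assumption: only the basic blocks are considered here.
function classifyChar(ch: string): CharClass {
  const code = ch.charCodeAt(0);
  if (code >= 0x4e00 && code <= 0x9fff) return "kanji";    // CJK Unified Ideographs
  if (code >= 0x3040 && code <= 0x309f) return "hiragana"; // Hiragana block
  if (code >= 0x30a0 && code <= 0x30ff) return "katakana"; // Katakana block
  return "other";
}

// Split text into runs of consecutive characters of the same class.
// Each boundary between runs is a candidate cursor stop.
function splitByCharClass(text: string): string[] {
  const runs: string[] = [];
  let start = 0;
  for (let i = 1; i <= text.length; i++) {
    const atEnd = i === text.length;
    if (atEnd || classifyChar(text.charAt(i)) !== classifyChar(text.charAt(start))) {
      runs.push(text.slice(start, i));
      start = i;
    }
  }
  return runs;
}

// splitByCharClass("東京タワーに行く") returns ["東京", "タワー", "に", "行", "く"]
```

Note that a purely class-based split separates a kanji stem from its trailing hiragana (`行` and `く` above), which is one reason it is only an approximation of real word boundaries.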
Example:
(1): Natural Japanese text with two sentences. `。` is a Japanese period.
(2): Dictionary-based word boundaries (`|`), available on MS Word, Chrome, etc.
(3): Codepoint-based kana-kanji boundaries, available on Firefox, Notepad.exe, etc.

There is already a popular extension that does (3) above for Japanese text. Unfortunately, it works on <kbd>Ctrl</kbd>+<kbd>←</kbd>/<kbd>→</kbd> but nowhere else. It does not work on double-clicks, <kbd>Ctrl</kbd>+<kbd>D</kbd>, autocompletion, text search, and so on.
Personally, I think (3) should be implemented as part of the basic functionality of VSCode, considering that it’s available in any other decent text editor. The dictionary-based solution (2) may be too costly to include in the main vscode repository, but I hope there is a way to allow extension developers to override the word-boundary detection algorithm or the double-click behavior.
By the way, in the meantime, you can alleviate this problem by tweaking the `"editor.wordSeparators"` setting and adding multibyte punctuation marks such as `。`. With this, you can stop the cursor at least at (double-byte) periods and commas using <kbd>Ctrl</kbd>+<kbd>←</kbd>/<kbd>→</kbd>.

Let’s see if we can find time for it during the holidays.