wcwidth needs several enhancements:

  1. An interface to configure the width of ambiguous characters at runtime, possibly locale dependent (see https://chromium.googlesource.com/native_client/nacl-newlib/+/master-backup/newlib/libc/string/wcwidth.c). This could be folded into item 2.
  2. An interface to load widths from an external source, see https://github.com/xtermjs/xterm.js/issues/942#issuecomment-388406591. This would make it possible to override the default width set with custom values. Those values could be based on the needs of the backend, the frontend, or anyone else.
  3. Still in doubt: maybe merge wcwidth with future grapheme handling. Most of the wcwidth rules are a subset of grapheme clustering, so once grapheme handling works, merging the two might save a lot of runtime. This is somewhat doubtful, because grapheme handling might not work at an early stage with the current cell-based buffer model and may have to be added at a later stage (closer to the renderer). See also #1468.
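To make items 1 and 2 more concrete, here is a minimal sketch of what such an interface might look like. Everything in it is an assumption for illustration: `ConfigurableWcwidth`, `WidthRange`, `loadOverrides` and the toy tables are hypothetical names and are not part of xterm.js. Real default tables would be generated from Unicode data.

```typescript
// Hypothetical sketch: none of these names exist in xterm.js.
type Width = 0 | 1 | 2;

interface WidthRange {
  start: number; // first code point, inclusive
  end: number;   // last code point, inclusive
  width: Width;
}

class ConfigurableWcwidth {
  // Item 1: width reported for ambiguous characters, changeable at runtime.
  ambiguousWidth: 1 | 2 = 1;
  private overrides: WidthRange[] = [];
  private base: (codepoint: number) => Width;
  private isAmbiguous: (codepoint: number) => boolean;

  constructor(
    base: (codepoint: number) => Width,
    isAmbiguous: (codepoint: number) => boolean
  ) {
    this.base = base;
    this.isAmbiguous = isAmbiguous;
  }

  // Item 2: load widths from an outer source. Ranges loaded later win.
  loadOverrides(ranges: WidthRange[]): void {
    this.overrides.push(...ranges);
  }

  wcwidth(codepoint: number): Width {
    // Custom overrides take precedence over everything else.
    for (let i = this.overrides.length - 1; i >= 0; i--) {
      const r = this.overrides[i];
      if (codepoint >= r.start && codepoint <= r.end) return r.width;
    }
    // Then the runtime setting for ambiguous characters.
    if (this.isAmbiguous(codepoint)) return this.ambiguousWidth;
    // Finally the built-in default table.
    return this.base(codepoint);
  }
}

// Toy tables for illustration only, not real Unicode data coverage.
const toyBase = (cp: number): Width =>
  cp >= 0x4e00 && cp <= 0x9fff ? 2 : 1; // CJK Unified Ideographs are wide
const toyAmbiguous = (cp: number): boolean =>
  cp === 0x00a1 || cp === 0x03b1; // two East Asian Ambiguous code points
```

A linear scan over the overrides keeps the sketch short. A real implementation would more likely merge overrides into a lookup table so that the per-character cost stays constant.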

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 11 (9 by maintainers)

Top GitHub Comments

egmontkob commented, Jun 13, 2018

VTE simply uses whatever’s returned by g_unichar_iswide(), which is already suboptimal; we have a pending bug 772890 to perhaps use glibc’s wcwidth() instead. In case these methods are locale-dependent (are they really? even among the UTF-8 ones?), the process’s locale is simply used (in case of gnome-terminal it’s the locale set up by systemd --user).

Plus there’s an API option in VTE (a compatibility preference in gnome-terminal) for ambiguous-width characters to be treated as wide. This changes the width of several characters. The set is somewhat arbitrary, and is subject to accidental as well as planned further changes, see bug 791452. Alas, there’s no locale definition to back this up. I vaguely recall a long-standing open (or rejected?) glibc bug on sourceware with a four-digit bug number, but the last time I tried I couldn’t find it. This means that apps can’t really use this behavior of the terminal emulator. Ideally, I guess, there’d be no such API in VTE; rather, you could switch both VTE and the apps running inside it into a different locale with a different wcwidth.

That still wouldn’t solve the problem of ssh, though. And it wouldn’t solve how the locale used by the terminal emulator would match up with the locale of the app. Maybe we need to introduce an escape sequence for that? And then juggle multiple locales inside a single terminal emulator process (e.g. in case of gnome-terminal), using newlocale/uselocale/whatever_l and friends.

Seems unsolvable without having a living global specification for wcwidth that everybody can follow.

Even worse, a specification couldn’t solve it either, see the mess caused by Unicode 9.0 changing the width of certain already existing codepoints (e.g. VTE 772812 and tons of duplicates out there). We’d either need a specification that’s guaranteed not to change in backwards incompatible ways (Unicode seems not to make such a guarantee), or introduce some kind of versioning. I guess versioning is required anyway: an app should be able to tell if the terminal emulator doesn’t yet recognize a new codepoint.
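As a rough illustration of the versioning idea, the sketch below keys width tables by Unicode version and returns `undefined` for codepoints a given version does not know. The names `toyTables` and `versionedWcwidth` are hypothetical, and the tables are tiny toy data chosen to show one codepoint whose width changed in 9.0 and one that was only added in 9.0. How an app would learn which version the terminal uses is left open here, as it is in the discussion.

```typescript
// Hypothetical sketch of versioned width lookup; names and tables are made up.
type CellWidth = 0 | 1 | 2;

// Toy per-version tables. Only the listed code points count as "known".
const toyTables: Record<string, Map<number, CellWidth>> = {
  "8.0": new Map<number, CellWidth>([
    [0x41, 1],    // LATIN CAPITAL LETTER A
    [0x1f600, 1], // GRINNING FACE, narrow before Unicode 9.0
  ]),
  "9.0": new Map<number, CellWidth>([
    [0x41, 1],
    [0x1f600, 2], // same code point, now wide
    [0x1f923, 2], // added in Unicode 9.0
  ]),
};

// Returns the width under the given version, or undefined when that
// version does not recognize the code point at all.
function versionedWcwidth(
  version: string,
  codepoint: number
): CellWidth | undefined {
  const table = toyTables[version];
  if (table === undefined) {
    throw new Error(`unknown width table version: ${version}`);
  }
  return table.get(codepoint);
}
```

With something like this, an app could compare its own table version against the terminal's and fall back to a safe rendering for codepoints the terminal reports as unknown.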

And of course, design all of these in a way that can even work across ssh, across different OSes with different notations for locales. Sigh.

Maybe, maybe just ignoring this whole problem set and living with the tiny breakages every now and then is better than overengineering it??

mofux commented, Jun 13, 2018

5. When using xterm.js as an SSH frontend, there is no way to reliably determine the wcwidth of the target system.

@egmontkob Do you know how VTE solves this, especially for the SSH case?
