UTF-8 characters have the wrong encoding when using OMEMO

In converse.js I see UTF-8 characters (e.g. ü) from apps like Conversations or Gajim as two bytes like Ã¼, and in Conversations I see for such characters sent via Conversations.

The page includes:

<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
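
The symptom can be reproduced in isolation. The following is a minimal sketch, assuming the cause is that each byte of the decrypted UTF-8 payload is turned into its own character; the issue itself does not show the affected code, so this is an illustration, not the converse.js implementation.

```javascript
// Sketch of the assumed failure mode; all variable names are illustrative.
// "ü" (U+00FC) is two bytes in UTF-8: 0xC3 0xBC.
const utf8Bytes = new TextEncoder().encode("ü");

// Wrong: treat every byte as a separate character code.
const perByte = String.fromCharCode(...utf8Bytes);

// Right: decode the whole byte sequence as UTF-8.
const decoded = new TextDecoder("utf-8").decode(utf8Bytes);

console.log(utf8Bytes.length); // 2
console.log(perByte);          // "Ã¼" (U+00C3 followed by U+00BC)
console.log(decoded);          // "ü"
```

The `<meta charset>` tags above only tell the browser how to decode the HTML document, so they would not affect a byte-to-string conversion done in script.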

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

1 reaction
allo- commented, Sep 8, 2018

I managed to fix it by modifying the arrayBuffer <-> string methods:

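  // Decode the bytes of the ArrayBuffer as UTF-8 text.
  u.arrayBufferToString2 = function (ab) {
    return new TextDecoder("utf-8").decode(ab);
  };
  // Encode the string as UTF-8 bytes (TextEncoder always produces UTF-8).
  u.stringToArrayBuffer2 = function (string) {
    const bytes = new TextEncoder().encode(string);
    return bytes.buffer;
  };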

Now it seems to work for me. I guess it needs to be implemented with an API that is also supported by older browsers.
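
For browsers without `TextEncoder`/`TextDecoder`, one possible fallback is the long-standing `unescape(encodeURIComponent(...))` idiom. This is only a sketch: the helper names `stringToUtf8Buffer` and `utf8BufferToString` are hypothetical and not part of converse.js, and `escape`/`unescape` are deprecated legacy functions, so a maintained polyfill may be the better choice.

```javascript
// Hypothetical helpers for environments lacking TextEncoder/TextDecoder.
function stringToUtf8Buffer(str) {
  // encodeURIComponent percent-encodes the UTF-8 bytes of str; unescape turns
  // each %XX back into one character whose code equals that byte value.
  var binary = unescape(encodeURIComponent(str));
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

function utf8BufferToString(buffer) {
  var bytes = new Uint8Array(buffer);
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // escape percent-encodes each byte; decodeURIComponent then interprets the
  // sequence as UTF-8 and throws a URIError if the bytes are not valid UTF-8.
  return decodeURIComponent(escape(binary));
}

var roundTrip = utf8BufferToString(stringToUtf8Buffer("Grüße"));
console.log(roundTrip === "Grüße"); // true
```

Because the decoding helper throws on malformed input, callers would need to decide how to handle undecodable payloads.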

0 reactions
jcbrand commented, Oct 2, 2018

I did a git bisect and found that the offending commit is this one: https://github.com/conversejs/converse.js/commit/e05b7e9de3083cf8c867548ee5bc51bf421f76f3
