UTF-8 characters have the wrong encoding when using OMEMO

In converse.js, UTF-8 characters (e.g. ü) sent from apps like Conversations or Gajim show up as two characters like Ã¼, and in Conversations I see � for such characters sent via Conversations.

The page includes:

<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
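The symptom above is consistent with UTF-8 bytes being turned into a string one byte per character. The issue does not confirm that this is the cause, so treat the following as a minimal sketch of how such output can arise, not as the actual converse.js code path. The variable names are illustrative only.

```javascript
// "ü" encoded as UTF-8 is two bytes: 0xC3 0xBC.
const utf8Bytes = new TextEncoder().encode("ü");

// Mapping each byte to its own character code yields two characters (U+00C3, U+00BC).
const misdecoded = String.fromCharCode(...utf8Bytes);

// Decoding the same bytes as UTF-8 restores the single original character.
const decoded = new TextDecoder("utf-8").decode(utf8Bytes);

console.log(Array.from(utf8Bytes), misdecoded, decoded);
```

If the page's own charset declaration is already UTF-8, as it is here, garbling of this shape would have to come from how the decrypted bytes are converted to a string rather than from the HTML.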

Issue Analytics

State:
Created 5 years ago
Comments:10 (6 by maintainers)

Top GitHub Comments

allo- commented, Sep 8, 2018

I managed to fix it by modifying the arrayBuffer <-> string methods:

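  // Decode the buffer's bytes as UTF-8 instead of one character per byte.
  u.arrayBufferToString2 = function (ab) {
    return new TextDecoder("utf-8").decode(ab);
  };
  // TextEncoder always emits UTF-8, so it takes no encoding argument.
  u.stringToArrayBuffer2 = function (string) {
    const bytes = new TextEncoder().encode(string);
    return bytes.buffer;
  };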

Now it seems to work for me. I guess you need to implement it with an API that is supported by more and older browsers as well.
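For browsers without TextEncoder/TextDecoder, one commonly used fallback goes through encodeURIComponent and the legacy escape/unescape globals. This is only a sketch of that technique: the function names below are hypothetical and not part of converse.js, and escape/unescape are deprecated, so a polyfill may be the better choice in practice.

```javascript
// Hypothetical helper names, not part of converse.js.
function stringToUtf8ArrayBuffer(string) {
  // encodeURIComponent percent-encodes the UTF-8 bytes;
  // unescape turns each %XX back into a single char code (0-255).
  var binary = unescape(encodeURIComponent(string));
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

function utf8ArrayBufferToString(ab) {
  var bytes = new Uint8Array(ab);
  var binary = "";
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  // escape percent-encodes each byte; decodeURIComponent reads them as UTF-8.
  return decodeURIComponent(escape(binary));
}

// Round trip: the decoded string should equal the input.
var roundTripped = utf8ArrayBufferToString(stringToUtf8ArrayBuffer("Grüße"));
console.log(roundTripped);
```

Note that decodeURIComponent throws a URIError on byte sequences that are not valid UTF-8, whereas TextDecoder substitutes U+FFFD by default, so callers may want to wrap the decode step in a try/catch.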

jcbrand commented, Oct 2, 2018

I did a git bisect and found that the offending commit is this one: https://github.com/conversejs/converse.js/commit/e05b7e9de3083cf8c867548ee5bc51bf421f76f3
