Mismatch of unicode to Big5

Hi, I just found that at least one Chinese character is not encoded correctly from Unicode to Big5. I'm not sure whether any other characters have the same problem. However, the mapping is correct in https://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT:

0xB05F	0x8D77	# <CJK>
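
To make the mapping line above easier to check by hand, here is a minimal sketch that parses one such line. The helper name `parseMappingLine` is hypothetical, and the column layout (Big5 code, Unicode code point, trailing `#` comment) is assumed from the single line quoted here rather than from the full file.

```javascript
// Hypothetical helper: parse one line of a BIG5.TXT-style mapping table.
// Assumed layout: "<Big5 hex>\t<Unicode hex>\t# comment".
function parseMappingLine(line) {
  // Drop the trailing comment, then split the two hex columns on whitespace.
  const [big5Hex, unicodeHex] = line.split('#')[0].trim().split(/\s+/);
  return {
    big5: parseInt(big5Hex, 16),
    codePoint: parseInt(unicodeHex, 16),
  };
}

const entry = parseMappingLine('0xB05F\t0x8D77\t# <CJK>');
console.log(entry.big5.toString(16));               // b05f
console.log(String.fromCodePoint(entry.codePoint)); // 起
```

So the table maps U+8D77 to the byte pair b0 5f, which is the output the reproduction below expects.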

To reproduce the problem:

const iconv = require('iconv-lite');

const char = '起';
const code = char.charCodeAt(0);
console.log(code.toString(16)); // 8d77

const buff = iconv.encode(char, 'big5');
console.log(buff.toString('hex')); // 8ffe, which is not correct; it should be b05f
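
One way to see how the two byte pairs differ is the pointer arithmetic from the WHATWG Encoding Standard's Big5 codec, sketched below. The helper names `big5Pointer` and `isEncoderOutput` are hypothetical. The standard's encoder skips index entries below pointer (0xA1 - 0x81) × 157 so that Hong Kong Supplementary Character Set extension bytes are never emitted. Whether iconv-lite follows the same rule, and whether this is what caused the output above, are assumptions; the thread excerpt does not say.

```javascript
// Pointer arithmetic as described in the WHATWG Encoding Standard's Big5 decoder.
// Returns null when the byte pair is outside the valid lead/trail ranges.
function big5Pointer(lead, trail) {
  const trailOk = (trail >= 0x40 && trail <= 0x7E) || (trail >= 0xA1 && trail <= 0xFE);
  if (lead < 0x81 || lead > 0xFE || !trailOk) return null;
  const offset = trail < 0x7F ? 0x40 : 0x62;
  return (lead - 0x81) * 157 + (trail - offset);
}

// The standard's encoder ignores index entries below this pointer.
const FIRST_ENCODABLE_POINTER = (0xA1 - 0x81) * 157; // 5024

// Hypothetical helper: would a WHATWG-style encoder ever emit this byte pair?
function isEncoderOutput(hex) {
  const bytes = Buffer.from(hex, 'hex');
  const pointer = big5Pointer(bytes[0], bytes[1]);
  return pointer !== null && pointer >= FIRST_ENCODABLE_POINTER;
}

console.log(isEncoderOutput('b05f')); // true  (pointer 7410)
console.log(isEncoderOutput('8ffe')); // false (pointer 2354)
```

Under that rule, 8ffe sits in the extension range an encoder would avoid, while b05f is in the range it would emit.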

My environment: Node.js v12.22.1, npm v7.9.0, iconv-lite@0.6.2

├─┬ body-parser@1.19.0
│ ├── iconv-lite@0.4.24
│ └─┬ raw-body@2.4.0
│   └── iconv-lite@0.4.24
├─┬ eslint@6.8.0
│ └─┬ inquirer@7.3.3
│   └─┬ external-editor@3.1.0
│     └── iconv-lite@0.4.24
├── iconv-lite@0.6.2
├─┬ mssql@5.1.4
│ └─┬ tedious@4.2.0
│   └── iconv-lite@0.4.24
├─┬ mysql2@1.6.5
│ └── iconv-lite@0.4.24
├─┬ node-fetch@1.7.3
│ └─┬ encoding@0.1.13
│   └── iconv-lite@0.6.2 deduped
└─┬ pdfmake@0.1.71
  └── iconv-lite@0.6.2 deduped

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
ashtuchkin commented, May 24, 2021

Just released a v0.6.3 with the fix. Let me know if you see any problems. Thanks for your contribution!

1 reaction
ashtuchkin commented, May 19, 2021

That’s a good point, thank you! It seems that I’ve missed that condition. I’ll look closer and see how I can fix it.
