Problem with the encoding of Cyrillic characters
Cyrillic characters are decoded incorrectly in some files. I uploaded a sample file to Google Drive: file xls. The file opens correctly in Excel 16.0. This is the result of converting it to CSV on the page http://oss.sheetjs.com/js-xlsx/:
Îñòàòêè ÒÌÖ íà ñêëàäàõ,,,,,,,,,,,,,,,,,,,,,,,,,,
Íà äàòó: 04.12.17,,,,,,,,,,,,,,,,,,,,,,,,,,
"Ïî íîìåíêëàòóðíûì ïîçèöèÿì èç ñïèñêà (""Àâòîøèíû"").",,,,,,,,,,,,,,,,,,,,,,,,,,
Íîìåíêëàòóðà,,,Åä.,Ñêëàä ã. ×åëÿáèíñê,,Ñëîáîäñêîé ïåð. 45,,,,,,,,,,,,,,,,,,,,
,,,,öåíà À,Ñâîáîäíûé,öåíà À,Ñâîáîäíûé,,,,,,,,,,,,,,,,,,,
51344,-----,16.5/70-18 TT ÂØÇ ÊÔ-97 íñ10 ñ îá.ëåíòîé,êîìïë, ,0,"14,008.00",>12,,,,,,,,,,,,,,,,,,,
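Can it be fixed by some manipulation like this?
const iconv = require('iconv-lite');
// Turn the garbled text back into its original bytes (cp1252), then decode those bytes as cp1251.
const garbled = 'Îñòàòêè ÒÌÖ íà ñêëàäàõ,,,,,,,,,,,,,,,,,,,,,,,,,,';
console.log(iconv.decode(iconv.encode(garbled, 'cp1252'), 'cp1251'));
// prints "Остатки ТМЦ на складах,,,,,,,,,,,,,,,,,,,,,,,,,,"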
Issue Analytics
- State:
- Created 6 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
@makcbrain thanks for sharing! This is a BIFF5 XLS (Excel 5.0/95) file with no CodePage record, so there’s no way to inspect the file and determine the correct encoding. To see that this is a file ambiguity, try opening this in Excel 2016 for Mac and you’ll see different content corresponding to the default Mac Roman codepage 10000:
That string does correspond to the original set of bytes, as you can verify manually:
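The same kind of byte-level check can be sketched for the Windows-1252 output shown in the question. This is not the maintainer's original verification, only a dependency-free illustration: it assumes the garbled text came from CP1251 bytes displayed as CP1252, and the helper name `fixCp1251Mojibake` is hypothetical. In both code pages the bytes 0xC0..0xFF form one contiguous block (U+00C0..U+00FF in CP1252, U+0410..U+044F in CP1251), so the mapping is a fixed offset.

```javascript
// Sketch: undo "CP1251 bytes shown as CP1252" mojibake without any library.
// Assumption: only the contiguous block 0xC0..0xFF is handled (the letters
// А..я in CP1251); Ё/ё and other CP1251 symbols outside it are left untouched.
function fixCp1251Mojibake(text) {
  let out = '';
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code >= 0xC0 && code <= 0xFF) {
      // Byte value N in 0xC0..0xFF is U+0410 + (N - 0xC0) in CP1251.
      out += String.fromCodePoint(0x0410 + (code - 0xC0));
    } else {
      out += ch; // ASCII and anything else passes through unchanged
    }
  }
  return out;
}

console.log(fixCp1251Mojibake('Îñòàòêè ÒÌÖ íà ñêëàäàõ'));
// prints "Остатки ТМЦ на складах"
```

Repairing the text after the fact only works when the wrong decoding was lossless; telling the parser the right code page up front is the more robust route.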
Just as discussed in #907, the final solution will involve adding a default codepage option to the read functions (e.g. XLSX.readFile("file.xls", {codepage:1251})).
Pass the codepage option to read or readFile: https://github.com/SheetJS/sheetjs/#parsing-options
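As a rough sketch of how that option might be used end to end, the helper below reads a legacy XLS file with an explicit default code page and converts the first sheet to CSV. The helper name `readXlsAsCsv`, its parameters, and the default of 1251 are illustrative assumptions, not part of SheetJS; the library is passed in as an argument so the snippet has no hard dependency. It relies only on `readFile`, the `codepage` parsing option, and `utils.sheet_to_csv`.

```javascript
// Sketch: read a legacy XLS with an explicit default codepage and emit CSV.
// `xlsxLib` is the SheetJS module, injected by the caller (hypothetical design);
// `codepage` defaults to 1251 (Windows Cyrillic) purely for this example.
function readXlsAsCsv(xlsxLib, path, codepage = 1251) {
  // The codepage option is used when the file itself does not declare one.
  const workbook = xlsxLib.readFile(path, { codepage });
  const firstSheet = workbook.Sheets[workbook.SheetNames[0]];
  return xlsxLib.utils.sheet_to_csv(firstSheet);
}

// Typical use in Node, assuming the `xlsx` package is installed:
//   const csv = readXlsAsCsv(require('xlsx'), 'file.xls');
//   console.log(csv);
```

Depending on the SheetJS version and how it is loaded, the code page tables may need to be registered separately before the option takes effect, so check the parsing options documentation for the build in use.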