Cheerio .html() should not encode cyrillic symbols
var cheerio = require('cheerio');
var $ = cheerio.load('<div>кириллица</div>');
console.log($.html()); // => <div>&#x43A;&#x438;&#x440;&#x438;&#x43B;&#x43B;&#x438;&#x446;&#x430;</div>
while the expected output is <div>кириллица</div>
Issue Analytics
- State:
- Created 8 years ago
- Comments:6 (1 by maintainers)
Top GitHub Comments
decodeEntities is not an option for .html, but for .load (or the constructor).
Found a solution in #466 with $.html({decodeEntities: false}), which works fine. Does anybody know what the downsides of using this option could be?
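For anyone who would rather not change the parser options, another route is to post-process the serialized string and turn the numeric character references back into literal characters. The sketch below is only an illustration under that assumption: decodeNonAsciiReferences is a hypothetical helper, not part of cheerio, and it deliberately leaves ASCII references such as &#x3C; alone so the markup keeps its meaning.

```javascript
// Hypothetical helper (not part of cheerio): turns numeric character
// references for non-ASCII code points back into literal characters.
function decodeNonAsciiReferences(html) {
  return html.replace(/&#(?:x([0-9a-f]+)|([0-9]+));/gi, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Keep ASCII and C1-range references (e.g. &#x3C; for "<") and
    // out-of-range values as they are, so the markup is not altered.
    if (codePoint < 0xa0 || codePoint > 0x10ffff) return match;
    // Lone surrogates are not valid characters; keep the reference.
    if (codePoint >= 0xd800 && codePoint <= 0xdfff) return match;
    return String.fromCodePoint(codePoint);
  });
}

// Sample input mirroring the encoded output reported in this issue.
const encoded =
  '<div>&#x43A;&#x438;&#x440;&#x438;&#x43B;&#x43B;&#x438;&#x446;&#x430;</div>';
console.log(decodeNonAsciiReferences(encoded)); // <div>кириллица</div>
```

In practice the input would be the string returned by $.html(). The helper works on plain strings only, so it has no knowledge of the document structure and should be treated as a workaround rather than a replacement for the parser option.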