[bug] - Garbled output when the linked page is in Chinese
When the URL is like http://media.people.com.cn/n1/2020/0617/c40606-31749210.html, the result was garbled.
However, this link is also in Chinese, and its result was correct:
https://juejin.cn/post/6931597891182002183
How can I solve this problem? Should I change the encoding (Unicode)? Thank you!
Issue Analytics
- State:
- Created 10 months ago
- Comments:8
Top GitHub Comments
Yes, as far as I can see this is just an exceptional case. Users should find a way to handle it themselves. In this case, the process could be: load HTML --> convert to UTF-8 --> pass the converted UTF-8 string to article-parser.
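The "load HTML --> convert to UTF-8 --> pass to article-parser" process above could be sketched roughly as follows. This is only an illustration, assuming the page is served in a legacy encoding such as GB2312 (the issue does not confirm which encoding the failing page uses). The helper name `decodeToString` is hypothetical, and Node's built-in `TextDecoder` stands in here for a conversion library so the snippet has no dependencies.

```javascript
// Hypothetical helper; the name and signature are illustrative only
// and are not part of article-parser.
function decodeToString(bytes, sourceEncoding) {
  // TextDecoder accepts WHATWG encoding labels such as "gb2312", "gbk", "big5".
  // It throws a RangeError if the label is not supported by the runtime.
  return new TextDecoder(sourceEncoding).decode(bytes);
}

// The two characters U+4E2D U+6587 ("Chinese text") encoded as GB2312
// are the byte sequence D6 D0 CE C4.
const gbBytes = Uint8Array.from([0xd6, 0xd0, 0xce, 0xc4]);

// Decoding with the right label yields an ordinary JavaScript string.
const decodedHtml = decodeToString(gbBytes, "gb2312");

// Decoding the same bytes as UTF-8 produces replacement characters instead,
// which is one way garbled output can arise.
const garbledHtml = decodeToString(gbBytes, "utf-8");
```

In a real pipeline the bytes would come from the raw HTTP response body (for example a `Buffer`, which is also a `Uint8Array`), and the decoded string would then be handed to the parser.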
@ndaidong Yup, we need a lib like iconv-lite to convert the encoding. But we can't know whether the HTML string's actual encoding matches the one declared in its meta tag (the input string may be UTF-8 while the meta tag declares another encoding). So I think the encoding should be specified via options, or users should convert the input themselves. Otherwise we would have to add encoding to the rules, which is too complex.
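One possible way to combine the ideas in this comment is sketched below, purely as an assumption about how a caller might do it. An explicit `encoding` option wins; otherwise the bytes are trusted if they are valid UTF-8 (covering the case where the meta tag declares something else); only then is the charset declared in the meta tag used. The names `sniffMetaCharset`, `toDecodedString`, and the `encoding` option are hypothetical and are not part of article-parser. Node's built-in `TextDecoder` is used in place of iconv-lite to keep the example dependency-free.

```javascript
// Hypothetical helpers; none of these names come from article-parser.

// Look for a charset declaration in the first bytes of the document.
function sniffMetaCharset(bytes) {
  // Charset declarations are ASCII, so a Latin-1 view of the head is enough.
  const head = new TextDecoder("latin1").decode(bytes.subarray(0, 1024));
  const match = head.match(/<meta[^>]+charset\s*=\s*["']?\s*([\w-]+)/i);
  return match ? match[1].toLowerCase() : null;
}

function toDecodedString(bytes, options = {}) {
  // 1. An explicit encoding supplied by the caller takes priority.
  if (options.encoding) {
    return new TextDecoder(options.encoding).decode(bytes);
  }
  // 2. If the bytes are valid UTF-8, trust them over the meta tag.
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    // Not valid UTF-8; fall through to the declared charset.
  }
  // 3. Otherwise use the declared charset, defaulting to lenient UTF-8.
  //    An unsupported label makes TextDecoder throw a RangeError.
  const declared = sniffMetaCharset(bytes) || "utf-8";
  return new TextDecoder(declared).decode(bytes);
}
```

The validity check in step 2 is a heuristic: short legacy-encoded inputs can occasionally also be valid UTF-8, which is part of why passing the encoding explicitly is the more reliable choice.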