Invalid decoded text
- Platform: Mac
- Mercury Parser Version: 2.1.0
- Node Version (if a Node bug): 10
- Browser Version (if a browser bug):
Expected Behavior
Parsed HTML should be properly encoded as per the original text
Current Behavior
The parsed HTML contains invalid text. This might be caused by a decoding issue.
Steps to Reproduce
- Fetch the HTML using any HTTP client
- Pass it to the parser using
Mercury.parse(url, { html: fetchedHtml })
- The returned HTML contains incorrectly decoded text
Detailed Description
I want to fetch the HTML myself and pass it to the parser, instead of having the parser fetch it.
Possible Solution
After looking at the code, it seems this case is handled for the browser only: the encoding declared in the HTML is checked only when the HTML is provided from the browser. Ideally, the parser should decode the text correctly whether or not it is running in a browser.
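To illustrate the kind of environment-independent decoding suggested here, the following is a minimal sketch, not Mercury's actual code. The helper name decodeHtml and the simple meta-tag regex are assumptions made for illustration. It assumes a Node build with full ICU, so that TextDecoder accepts legacy charset labels.

```javascript
// Hypothetical helper (not part of Mercury): sniff the charset declared
// near the top of the document and decode the raw bytes with it.
function decodeHtml(buffer) {
  // latin1 maps each byte to one character, so ASCII markup stays readable
  // even before we know the real encoding.
  const head = buffer.subarray(0, 1024).toString('latin1');
  const match = head.match(/<meta[^>]*charset\s*=\s*["']?([\w-]+)/i);
  const label = match ? match[1] : 'utf-8';
  try {
    return new TextDecoder(label).decode(buffer);
  } catch (err) {
    // Unknown or unsupported label: fall back to UTF-8.
    return new TextDecoder('utf-8').decode(buffer);
  }
}

// A page served as ISO-8859-1: "é" is the single byte 0xE9.
const latin1Page = Buffer.from(
  '<meta charset="iso-8859-1"><p>caf\u00e9</p>',
  'latin1'
);

// Decoding blindly as UTF-8 turns the invalid byte into U+FFFD.
const naive = latin1Page.toString('utf-8');

// Honouring the declared charset recovers the original text.
const decoded = decodeHtml(latin1Page);
```

A real implementation would probably also consult the Content-Type response header and byte-order marks; this sketch only looks at the meta tag.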
Issue Analytics
- State:
- Created 4 years ago
- Comments:8
For me, the problem occurred when I was trying to pass the local HTML as a string. Using Buffer fixed the issue.
Fixed it by passing the HTML as a Buffer with utf-8 instead of a string, as mentioned in the README.
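The workaround described in these comments could look roughly like the sketch below. The helper toHtmlBuffer is hypothetical. The commented-out Mercury.parse call follows the form shown in the issue, and the package name in the require line is an assumption.

```javascript
// Hypothetical helper (not part of Mercury): make sure the HTML handed to
// the parser is a Buffer rather than a plain string.
function toHtmlBuffer(html) {
  if (Buffer.isBuffer(html)) return html;
  // Encoding a string as UTF-8 only helps if the string itself was decoded
  // correctly; when possible, keep the raw bytes from the HTTP client or file.
  return Buffer.from(String(html), 'utf-8');
}

const htmlBuffer = toHtmlBuffer('<p>caf\u00e9</p>');

// Assumed usage, mirroring the call from the issue:
// const Mercury = require('@postlight/mercury-parser');
// Mercury.parse(url, { html: htmlBuffer })
//   .then(result => console.log(result.content));
```

For a local file, calling fs.readFileSync(path) without an encoding argument already returns a Buffer, so no conversion is needed in that case.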