Better handling of responses without correct content-type charset
Problem
If the server doesn’t provide a correct charset in the Content-Type header, we default to latin1 because the underlying requests library does that. This hurts the user experience.
Possible solutions
- Default to utf8
- For well-known content-types, handle them as defined in the respective RFC (utf8 for application/json, first line or BOM for XML, meta tag in HTML)
- Use chardet to detect from the body
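The second option above could be sketched roughly as follows. This is only an illustration using the standard library, not HTTPie's actual code: the function name `guess_encoding`, the 1024-byte sniff window, and the simplified regular expressions are all assumptions made for the example.

```python
import codecs
import re
from email.message import Message

# Byte-order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]
# Simplified patterns (assumption): real XML/HTML sniffing has more rules.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)
_HTML_META = re.compile(
    rb'<meta[^>]+?charset\s*=\s*["\']?([A-Za-z][A-Za-z0-9._-]*)', re.IGNORECASE
)


def _known(name):
    """Return the lowercased codec name if Python knows it, else None."""
    try:
        codecs.lookup(name)
    except LookupError:
        return None
    return name.lower()


def guess_encoding(content_type, body, default="utf-8"):
    """Hypothetical helper: pick an encoding for displaying `body`."""
    msg = Message()
    msg["Content-Type"] = content_type or "application/octet-stream"
    mime = msg.get_content_type()
    declared = msg.get_content_charset()
    # 1. Trust the header only when it names a codec we can actually use.
    if declared and _known(declared):
        return _known(declared)
    # 2. A BOM is an explicit signal from the body itself.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name
    # 3. Per-media-type rules.
    if mime == "application/json" or mime.endswith("+json"):
        return "utf-8"
    head = body[:1024]  # sniff window size is an arbitrary assumption
    match = None
    if mime.endswith("xml"):
        match = _XML_DECL.match(head)
    elif mime == "text/html":
        match = _HTML_META.search(head)
    if match:
        found = _known(match.group(1).decode("ascii"))
        if found:
            return found
    # 4. Fall back to the default instead of latin1.
    return default
```

Because only the header and the first bytes of the body are inspected, a sketch like this would also work on the first chunk of a streamed response.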
Considerations
- Streaming mode and chunks
- Discrepancies between streaming and display mode: should this only be done for readability when showing output in the terminal, or also when piping to other commands? Current decision: only do this for terminal display. Consider a flag to enforce it for piping as well.
- chardet requires downloading the whole body, so it would download a whole video to tell you it’s binary data
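To make the considerations above concrete, here is a minimal sketch of decoding chunk by chunk for terminal display only, while passing raw bytes through when piping. The names `prepare_output`, `looks_binary`, and `BINARY_NOTICE` are hypothetical. The NUL-byte check on the first chunk is an assumed heuristic for illustration, and it would misclassify UTF-16 or UTF-32 text.

```python
import codecs

BINARY_NOTICE = "<binary data, not shown>"  # hypothetical placeholder text


def looks_binary(chunk):
    """Assumed heuristic: a NUL byte in the first chunk means binary."""
    return b"\x00" in chunk


def prepare_output(chunks, to_terminal, encoding="utf-8"):
    """Yield str pieces for a terminal, or untouched bytes when piping."""
    if not to_terminal:
        # Piping to another command: leave the body exactly as received.
        yield from chunks
        return
    # An incremental decoder copes with multi-byte characters that are
    # split across chunk boundaries.
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    first = True
    for chunk in chunks:
        if first and looks_binary(chunk):
            # Decide from the first chunk instead of reading the whole body.
            yield BINARY_NOTICE
            return
        first = False
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

A caller would typically pass something like `sys.stdout.isatty()` as `to_terminal`. A flag that forces decoding when piping would simply override that value.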
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:12 (5 by maintainers)
Top GitHub Comments
For now, my first choice would be to do it directly using charset-normalizer after HTTPie got the whole body. Or resetting the buffer seek if this is how it is done/checked. (Or maybe storing the body?)
https://github.com/psf/requests/issues/2086 is interesting to follow.
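As a rough idea of what the charset-normalizer approach might look like once the whole body is in memory, here is a sketch. The helper name `decode_body` and the UTF-8 fallback are assumptions for the example. It uses `from_bytes(...).best()` from charset-normalizer and falls back to plain UTF-8 decoding if the package is not installed or finds no match.

```python
try:
    from charset_normalizer import from_bytes
except ImportError:  # package not installed; use the fallback path only
    from_bytes = None


def decode_body(body, fallback="utf-8"):
    """Hypothetical helper: decode a fully downloaded body for display."""
    if from_bytes is not None:
        best = from_bytes(body).best()  # None when no plausible match exists
        if best is not None:
            return str(best)  # str() of a match gives the decoded text
    # Assumed fallback: UTF-8 with replacement characters rather than latin1.
    return body.decode(fallback, errors="replace")
```

This needs the complete body, so it shares the drawback noted for chardet under Considerations.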