utils.get_encoding_from_headers returns ISO-8859-1 incorrectly
When I call get_encoding_from_headers on this URL:
http://thelastpsychiatrist.com/2012/02/my_fiancee_is_pushing_me_away.html
The returned encoding is ISO-8859-1:
(Pdb) get_encoding_from_headers(self.response.headers)
'ISO-8859-1'
Even though the headers don’t contain that character set:
(Pdb) self.response.headers
{'date': 'Sun, 11 Mar 2012 21:10:40 GMT', 'transfer-encoding': 'chunked', 'content-type': 'text/html', 'server': 'Apache/2.2.22'}
It looks like this was an intentional choice in the source, but it is problematic for me: if I knew that the encoding had been guessed, I’d want to check the HTML meta tag myself, which would then properly parse as UTF-8.
I think the better solution is to either return None explicitly, or provide a default kwarg that people could set to an encoding manually if they wanted to.
I can patch this if it sounds like a good solution.
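A minimal sketch of what that proposal could look like, using only the standard library. The function name `encoding_from_headers` and its `default` parameter are hypothetical and are not part of the requests API; the sketch also assumes the headers mapping uses lowercase keys, as in the pdb output above.

```python
from email.message import Message


def encoding_from_headers(headers, default=None):
    """Return the charset declared in Content-Type, or `default` if none is declared.

    Hypothetical helper: unlike the behavior reported above, it never
    invents ISO-8859-1 on its own; the caller opts in via `default`.
    """
    content_type = headers.get("content-type")
    if not content_type:
        return default

    # Let the stdlib parse the header parameters for us.
    msg = Message()
    msg["content-type"] = content_type
    charset = msg.get_content_charset()  # lowercased charset, or None
    return charset if charset else default


if __name__ == "__main__":
    headers = {"content-type": "text/html", "server": "Apache/2.2.22"}
    print(encoding_from_headers(headers))                        # no charset declared
    print(encoding_from_headers(headers, default="ISO-8859-1"))  # explicit opt-in
```

With a `None` result the caller can tell that nothing was declared and fall back to inspecting the HTML meta tag.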
For future reference to anyone who stumbles upon this, the spec is:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1
We already have an extensive hook system:
http://docs.python-requests.org/en/latest/user/advanced/#event-hooks
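As a rough illustration of how a response hook could cover this case: the sketch below overrides the encoding from the HTML meta tag only when the Content-Type header declares no charset (the case where RFC 2616 section 3.7.1 makes ISO-8859-1 the default for text types). The hook name `prefer_declared_charset`, the regular expression, and the 2048-byte scan window are assumptions for illustration, not something requests provides.

```python
import re

# Hypothetical pattern: matches <meta charset="..."> and the
# http-equiv form <meta ... content="text/html; charset=...">.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def prefer_declared_charset(response, *args, **kwargs):
    """Response hook: trust the meta tag when the header has no charset."""
    content_type = response.headers.get("content-type", "")
    if "charset" not in content_type.lower():
        # Only scan the start of the body; the scan window is an arbitrary choice.
        match = META_CHARSET.search(response.content[:2048])
        if match:
            response.encoding = match.group(1).decode("ascii")
    return response
```

It would be registered per request, for example `requests.get(url, hooks={"response": prefer_declared_charset})`, after which `response.text` is decoded with the overridden encoding.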