Make chardet/charset_normalizer optional?
With a routine version bump of requirements, I noticed chardet had been switched out for charset_normalizer (which I had never heard of before) in #5797, apparently due to LGPL license concerns.
I agree with @sigmavirus24’s comment https://github.com/psf/requests/pull/5797#issuecomment-875158955 that it’s strange for something as central in the Python ecosystem as requests (45k stars, 8k forks, many contributors at the time of writing) to switch to such a relatively unknown and unproven library (132 stars, 5 forks, 2 contributors) for a hard dependency.
The release notes say you could use pip install "requests[use_chardet_on_py3]" to use chardet instead of charset_normalizer, but with that extra set both libraries get installed.
I would imagine many users don’t really need the charset detection features in Requests; could we open a discussion on making both chardet and charset_normalizer optional, à la requests[chardet] or requests[charset_normalizer]?
AFAICS, the only place where chardet is actually used in requests is Response.apparent_encoding, which is used by Response.text when there is no determined encoding.
Maybe apparent_encoding could:
- as a built-in first attempt, try decoding the content as UTF-8 (which would likely be successful for many cases)
- if neither chardet nor charset_normalizer is installed, warn the user (“No encoding detection library is installed. Falling back to XXXX. Please see YYYY for instructions” or somesuch) and return e.g. ascii
- otherwise, use whichever detection library is installed, as per usual
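
To make the proposal concrete, here is a rough sketch of that fallback order. It is not how requests is implemented: guess_encoding is a hypothetical standalone helper, the "ascii" default is just the example value suggested above, and the only library call it assumes is the chardet-style detect() function that both chardet and charset_normalizer provide.

```python
import warnings

# Both libraries are treated as optional here. Each one exposes a
# chardet-style detect(bytes) returning a dict with an "encoding" key.
try:
    from chardet import detect as _detect
except ImportError:
    try:
        from charset_normalizer import detect as _detect
    except ImportError:
        _detect = None


def guess_encoding(content: bytes, fallback: str = "ascii") -> str:
    """Hypothetical helper sketching the proposed apparent_encoding logic."""
    # Step 1: built-in first attempt, strict UTF-8 decoding.
    try:
        content.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Step 2: no detection library is installed, so warn and fall back.
    if _detect is None:
        warnings.warn(
            "No encoding detection library is installed. "
            f"Falling back to {fallback!r}.",
            stacklevel=2,
        )
        return fallback

    # Step 3: delegate to whichever library was importable.
    # detect() may report None for the encoding, hence the "or fallback".
    return _detect(content)["encoding"] or fallback
```

With this shape, plain ASCII and UTF-8 bodies never touch a detection library at all; only content that fails strict UTF-8 decoding reaches step 2 or 3.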
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 8
- Comments: 23 (11 by maintainers)
apparent_encoding genuinely just needs to go away. That can’t be done until a major release. Once that happens, we don’t need dependencies on either library.

@Gagaro html5lib optionally requires chardet and likely behaves differently if it’s not installed. https://github.com/html5lib/html5lib-python/blob/f7cab6f019ce94a1ec0192b6ff29aaebaf10b50d/requirements-optional.txt#L7-L9