setup.cfg should standardize on UTF-8 for encoding
In jaraco/configparser#34, I learned that although setuptools v40.7.0 presumably added support for non-ASCII, there are still environments where loading non-ASCII is failing.
configparser # easy_install --version
setuptools 40.8.0 from c:\python37\lib\site-packages (Python 3.7)
configparser 3.7.2 # python setup.py egg_info
Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    package_dir={'': 'src'},
  File "C:\Python37\lib\site-packages\setuptools\__init__.py", line 144, in setup
    _install_setup_requires(attrs)
  File "C:\Python37\lib\site-packages\setuptools\__init__.py", line 137, in _install_setup_requires
    dist.parse_config_files(ignore_option_errors=True)
  File "C:\Python37\lib\site-packages\setuptools\dist.py", line 702, in parse_config_files
    self._parse_config_files(filenames=filenames)
  File "C:\Python37\lib\site-packages\setuptools\dist.py", line 599, in _parse_config_files
    (parser.read_file if six.PY3 else parser.readfp)(reader)
  File "C:\Python37\lib\configparser.py", line 717, in read_file
    self._read(f, source)
  File "C:\Python37\lib\configparser.py", line 1014, in _read
    for lineno, line in enumerate(fp, start=1):
  File "C:\Python37\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 103: character maps to <undefined>
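The traceback ends in the cp1252 codec, which suggests the file was opened with the Windows locale encoding rather than UTF-8. The following is a minimal sketch of that failure mode, not the setuptools code itself: the file contents, the author name, and the read_cfg helper are all hypothetical, chosen so that the UTF-8 bytes include 0x81 (the letter "Á" encodes as C3 81), a byte that cp1252 leaves undefined.

```python
import configparser
import os
import tempfile

# Hypothetical setup.cfg content (an assumption for illustration).
# "Á" is stored as the bytes C3 81 in UTF-8; 0x81 has no mapping in cp1252.
CFG_TEXT = "[metadata]\nname = example\nauthor = Ána Example\n"


def read_cfg(path, encoding):
    """Parse an INI-style file, decoding it with the given encoding."""
    parser = configparser.ConfigParser()
    with open(path, encoding=encoding) as reader:
        parser.read_file(reader)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "setup.cfg")
    with open(path, "w", encoding="utf-8") as f:
        f.write(CFG_TEXT)

    # Decoding the UTF-8 file as cp1252 fails while the parser iterates lines.
    try:
        read_cfg(path, "cp1252")
        cp1252_failed = False
    except UnicodeDecodeError as exc:
        cp1252_failed = True
        print("cp1252 read failed:", exc)

    # Passing the encoding explicitly avoids depending on the locale default.
    author = read_cfg(path, "utf-8")["metadata"]["author"]
    print("utf-8 read succeeded:", author)
```

Opening the file with an explicit encoding makes the result independent of the machine's locale, which is the behavior the issue title asks setuptools to standardize on.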
Issue Analytics
- State:
- Created 5 years ago
- Comments:10 (9 by maintainers)
Top GitHub Comments
Given that setopt and its edit_config function need to write to the config file, I’m even more strongly inclined now to remove support for specifying an encoding in setup.cfg files and instead insist on UTF-8, especially since commands like bdist_rpm invoke egg_info, which in turn rewrites the config file.
Hi @jaraco, I have a related issue. setup.cfg contains Unicode symbols and sets an explicit UTF-8 encoding:
When I run tox to test itself under Python 2:
This is because “edit_config” doesn’t pass down the original encoding:
The output is something like: