PIP doesn't read setup.cfg in UTF-8, which causes UnicodeDecodeError

Environment

  • pip version: 20.2.3
  • Python version: 3.8.5
  • OS: Win7x64

Description

When installing anything from a directory that contains a setup.cfg, pip reads that file but fails to decode it if it contains non-ASCII characters. The cfg file is valid UTF-8, but pip doesn't open it as UTF-8. It uses the locale encoding instead ("GBK" on my machine) and fails.

Related, but not the same issue: #8717 (this is caused by pyvenv.cfg)

Expected behavior

It should read setup.cfg in UTF-8.
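
The expected behavior can be illustrated without pip. The following is a minimal sketch using only the standard library: it writes the setup.cfg from this report as UTF-8 and parses it with `configparser`, passing the encoding explicitly. The helper name `read_description` is made up for this illustration; it is not part of pip or distutils.

```python
import configparser
import tempfile
from pathlib import Path

# Same content as the setup.cfg in this report.
SETUP_CFG = """\
[metadata]
name = test
version = 0.0.1
packages =
    test
description = '計算ツール'
"""


def read_description(encoding):
    """Write SETUP_CFG as UTF-8, then parse it with the given encoding.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        cfg_path = Path(tmp) / "setup.cfg"
        cfg_path.write_text(SETUP_CFG, encoding="utf-8")

        parser = configparser.ConfigParser()
        # ConfigParser.read() accepts an explicit encoding; without it,
        # the file is opened with the platform's default encoding.
        parser.read(str(cfg_path), encoding=encoding)
        return parser["metadata"]["description"]


if __name__ == "__main__":
    print(read_description("utf-8"))
```

Calling `read_description("gbk")` instead simulates what happens on a machine whose locale encoding is GBK, and raises UnicodeDecodeError.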

How to Reproduce

  1. Open a directory.
  2. Create a setup.cfg file in it with the following content:
[metadata]
name = test
version = 0.0.1
packages =
    test
description = '計算ツール'
  3. Run pip install requests

Output

D:\test>pip install requests
ERROR: Exception:
Traceback (most recent call last):
  File "c:\program files\python3\lib\site-packages\pip\_internal\cli\base_command.py", line 228, in _main
    status = self.run(options, args)
  File "c:\program files\python3\lib\site-packages\pip\_internal\cli\req_command.py", line 182, in wrapper
    return func(self, options, args)
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 245, in run
    options.use_user_site = decide_user_install(
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 664, in decide_user_install
    if site_packages_writable(root=root_path, isolated=isolated_mode):
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 609, in site_packages_writable
    get_lib_location_guesses(root=root, isolated=isolated))
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 600, in get_lib_location_guesses
    scheme = distutils_scheme('', user=user, home=home, root=root,
  File "c:\program files\python3\lib\site-packages\pip\_internal\locations.py", line 109, in distutils_scheme
    d.parse_config_files()
  File "c:\program files\python3\lib\distutils\dist.py", line 406, in parse_config_files
    parser.read(filename)
  File "c:\program files\python3\lib\configparser.py", line 697, in read
    self._read(fp, filename)
  File "c:\program files\python3\lib\configparser.py", line 1017, in _read
    for lineno, line in enumerate(fp, start=1):
UnicodeDecodeError: 'gbk' codec can't decode byte 0xab in position 88: illegal multibyte sequence
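
The error message can be reproduced at the byte level, independent of pip and distutils. This is a small sketch, assuming the file content from the reproduction steps with `\n` line endings; `try_decode` is a hypothetical helper name used only here.

```python
# The setup.cfg content from this report, encoded as UTF-8 bytes.
CFG_TEXT = (
    "[metadata]\n"
    "name = test\n"
    "version = 0.0.1\n"
    "packages =\n"
    "    test\n"
    "description = '計算ツール'\n"
)
CFG_BYTES = CFG_TEXT.encode("utf-8")


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure.

    Hypothetical helper for illustration only.
    """
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


if __name__ == "__main__":
    for encoding in ("utf-8", "gbk"):
        text, error = try_decode(CFG_BYTES, encoding)
        if error is None:
            print(f"{encoding}: decoded {len(text)} characters")
        else:
            # error.start is the offset of the byte the codec rejected.
            print(f"{encoding}: {error}")
```

GBK treats bytes above 0x80 as the first half of a two-byte character, so the three-byte UTF-8 sequences get paired up incorrectly until a pair is invalid. With this content, that should be the 0xab byte reported in the traceback above.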

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 23 (19 by maintainers)

Top GitHub Comments

1 reaction
uranusjr commented, Sep 28, 2020

Man, they basically re-implemented the whole thing. I'm not sure pip should be responsible for maintaining this implementation.

Maybe we should just catch the parsing error and carry on pretending the file does not exist instead. pip does not actually need any of the data you mentioned above; it just reads distutils configuration. The usage is honestly quite niche, and I doubt many would even notice the difference.
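
The "catch the parsing error and carry on" idea could look roughly like the sketch below. This is an illustration under stated assumptions, not pip's actual code: the function name `read_config_ignoring_decode_errors` and its `encoding` parameter are invented here, and only UnicodeDecodeError is handled.

```python
import configparser
import logging

logger = logging.getLogger(__name__)


def read_config_ignoring_decode_errors(path, encoding=None):
    """Parse a config file, treating an undecodable file as absent.

    Hypothetical helper for illustration only. With encoding=None the
    file is opened with the platform default encoding, as in the
    traceback above.
    """
    parser = configparser.ConfigParser()
    try:
        parser.read(path, encoding=encoding)
    except UnicodeDecodeError as exc:
        logger.warning("Ignoring %s: cannot decode it (%s)", path, exc)
        # Return a fresh parser so no half-read state leaks out.
        return configparser.ConfigParser()
    return parser
```

A caller would then see an empty configuration instead of a crash, which matches the point that pip only needs the distutils settings from this file.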

0 reactions
layday commented, Mar 1, 2021

That would not fail though. The parsing would only fail if the file contains content not decodable with the platform encoding.

Yeah, I understand that.
