problems with LC_ALL=C

There is a pattern of using open(path, 'r').read() without explicit encoding in pip:

This pattern causes issues under Python 3.x with an ASCII locale: the file contents are decoded as ASCII in that case, and decoding fails for non-ASCII data.
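A minimal sketch of the failure and of the usual fix, which is to pass the encoding explicitly. The `read_text` helper and the temporary demo file are hypothetical and not part of pip. Passing `encoding="ascii"` only imitates what a bare `open(path, 'r')` does under `LC_ALL=C` on the Python 3.2 setup described here.

```python
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the locale default."""
    with io.open(path, "r", encoding=encoding) as fp:
        return fp.read()


# Demo file (hypothetical): one non-ASCII character, stored as UTF-8 bytes.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as fp:
    fp.write(u"caf\u00e9\n".encode("utf-8"))

# Decoding the same bytes as ASCII imitates the ASCII-locale behaviour.
try:
    read_text(demo_path, encoding="ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# With an explicit UTF-8 encoding the result does not depend on the locale.
utf8_text = read_text(demo_path)
os.remove(demo_path)
```

`io.open` is used rather than the builtin so the same call accepts `encoding` on both Python 2.x and 3.x.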

The first occurrence (in setup.py) is clearly wrong IMHO: the utility function is used for reading pip’s own index.txt and news.txt files, which are encoded as UTF-8. It may cause the following exception:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 871: ordinal not in range(128)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "/var/folders/_5/cbsg50991szfp1r9nwxpx8580000gq/T/pip-61p_z7-build/setup.py", line 31, in <module>
        "\n\n" + read("docs", "news.txt"))
      File "/var/folders/_5/cbsg50991szfp1r9nwxpx8580000gq/T/pip-61p_z7-build/setup.py", line 9, in read
        return codecs.open(os.path.join(os.path.abspath(os.path.dirname(__file__)), *parts), 'r').read()
      File "/Users/kmike/svn/pip/.tox/py32-ascii/lib/python3.2/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 871: ordinal not in range(128)

if the following is added to pip’s own tox.ini:

[testenv:py32-ascii]
basepython = python3.2
setenv = LC_ALL=C

The second is trickier, and I didn’t debug it. It causes the following exception:

Unpacking /Users/kmike/svn/DAWG/.tox/dist/DAWG-0.5.3.zip
  Running setup.py egg_info for package from file:///Users/kmike/svn/DAWG/.tox/dist/DAWG-0.5.3.zip

Exception:
Traceback (most recent call last):
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/basecommand.py", line 107, in main
    status = self.run(options, args)
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/commands/install.py", line 256, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/req.py", line 1042, in prepare_files
    req_to_install.run_egg_info()
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/req.py", line 241, in run_egg_info
    "%(Name)s==%(Version)s" % self.pkg_info())
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/req.py", line 334, in pkg_info
    data = self.egg_info_data('PKG-INFO')
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/req.py", line 274, in egg_info_data
    data = fp.read()
  File "/Users/kmike/svn/DAWG/.tox/py32-locale/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2130: ordinal not in range(128)

in the test suite of https://github.com/kmike/DAWG (https://github.com/kmike/DAWG/blob/master/tox.ini). The DAWG package has a non-ASCII README.rst, which is loaded into long_description (as a byte string under Python 2.x and as unicode under Python 3.x).

Under Python 2.x this works fine because req.py doesn’t try to decode the data.
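One way to avoid the locale dependency when reading PKG-INFO is to read the file as bytes and decode it as UTF-8 before parsing. The sketch below assumes that approach; it is not pip's actual code. The `parse_pkg_info` helper and the sample metadata (in particular the Summary line) are made up for illustration, and only the name and version come from the traceback above.

```python
from email.parser import Parser


def parse_pkg_info(raw_bytes):
    """Decode PKG-INFO bytes as UTF-8 and parse the RFC 822 style headers."""
    text = raw_bytes.decode("utf-8")
    return Parser().parsestr(text)


# Illustrative metadata with one non-ASCII character in the summary.
sample = (
    u"Metadata-Version: 1.1\n"
    u"Name: DAWG\n"
    u"Version: 0.5.3\n"
    u"Summary: na\u00efve example summary\n"
).encode("utf-8")

info = parse_pkg_info(sample)

# Same formatting expression as in the traceback above.
requirement = "%(Name)s==%(Version)s" % info
```

Because the bytes are decoded explicitly, the outcome does not depend on `LC_ALL` or `LANG`.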

Issue Analytics

  • State: closed
  • Created 11 years ago
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

2 reactions
uranusjr commented, Apr 21, 2020

I believe this part of the code base has been removed. pip now uses distlib to read legacy metadata (egg-info), which always uses UTF-8.

2 reactions
pauleveritt commented, Oct 16, 2013

Indeed, this was the source of my problem. In one shell I had this in my environment:

LANG=en_US.UTF-8

…and pip worked. In another shell I didn’t, and got the UnicodeDecodeError exception.
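To see which case a given shell falls into, it can help to print the encoding Python derives from the environment. This is a small diagnostic sketch; the `describe_default_text_encoding` helper is hypothetical, while `locale.getpreferredencoding` is the standard library call that reports the locale-derived encoding.

```python
import locale
import os


def describe_default_text_encoding():
    """Collect the locale settings that affect the default text encoding."""
    return {
        # False: report the current setting without calling setlocale().
        "preferred_encoding": locale.getpreferredencoding(False),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LANG": os.environ.get("LANG"),
    }


info = describe_default_text_encoding()
print(info)
```

Running it once in each shell shows whether the two environments differ in `LANG` or `LC_ALL`.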
