UnicodeDecodeError when run multithreaded

Code Sample, a copy-pastable example if possible

# TODO: working on it

Problem description

This is something that’s been noticed in Satpy specifically and is being tracked here: https://github.com/pytroll/satpy/issues/1114

The bottom line is that a couple of our users have been getting UnicodeDecodeErrors or errors about bad proj definitions. The really annoying bit is that it seems to be some sort of race condition or other multi-threading related issue. We are using xarray+dask and have a pyproj CRS object in the .coords of our DataArrays. We get errors like:

    return [_execute_task(a, cache) for a in arg]
  File "/work/geo2grid/lib/python3.7/site-packages/dask/core.py", line 122, in _execute_task
    elif arg in cache:
  File "/work/geo2grid/lib/python3.7/site-packages/pyproj/crs/crs.py", line 869, in __hash__
    return hash(self.to_wkt())
  File "pyproj/_crs.pyx", line 451, in pyproj._crs.Base.to_wkt
  File "pyproj/_crs.pyx", line 120, in pyproj._crs._to_wkt
  File "pyproj/_crs.pyx", line 24, in pyproj._crs.cstrdecode
  File "/work/geo2grid/lib/python3.7/site-packages/pyproj/compat.py", line 21, in pystrdecode
    return cstr.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 0: invalid continuation byte
Command exited with non-zero status 1

Or:

  File "C:\ProgramData\Miniconda3\lib\site-packages\pyresample\geometry.py", line 1012, in invproj
    target_proj = Proj(proj_dict)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyresample\_spatial_mp.py", line 121, in __init__
    **kwargs)
  File "C:\ProgramData\Miniconda3\lib\site-packages\pyproj\proj.py", line 171, in __init__
    super().__init__(cstrencode(projstring.strip()))
  File "pyproj/_proj.pyx", line 30, in pyproj._proj.Proj.__init__
pyproj.exceptions.ProjError: Invalid projection b'C'.: (Internal Proj Error: proj_create: unrecognized format / unknown name)

Other times it prints the invalid projection with characters mixed in where they shouldn’t be. The changes are very clearly wrong: for example, in +proj=merc the p in proj is replaced by some odd unicode character.

I’m trying my best to reproduce this, but so far I have been unsuccessful, which is why I don’t have a reproducible example yet. I’ve only ever noticed this in logs.
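
To make the first traceback easier to read: the `elif arg in cache:` line in dask is a plain dict membership test, and a membership test hashes the key before looking it up. Since the traceback shows `__hash__` returning `hash(self.to_wkt())`, every such lookup ends up calling into the PROJ library from whichever worker thread runs the task. The sketch below only illustrates that call path. `StandInCRS` and the placeholder string are hypothetical stand-ins, not pyproj code, and this is not a reproduction of the bug.

```python
class StandInCRS:
    """Hypothetical stand-in for pyproj.CRS; it only mimics the hashing path."""

    def __init__(self, wkt):
        self._wkt = wkt
        self.to_wkt_calls = 0

    def to_wkt(self):
        # In pyproj this call goes into the PROJ C library.
        # Here it just returns a stored string and counts the call.
        self.to_wkt_calls += 1
        return self._wkt

    def __hash__(self):
        # Same shape as the line shown in the traceback.
        return hash(self.to_wkt())

    def __eq__(self, other):
        return isinstance(other, StandInCRS) and self.to_wkt() == other.to_wkt()


crs = StandInCRS("PLACEHOLDER_WKT")  # placeholder text, not a real WKT string
cache = {}

# A dict membership test hashes the key first, so to_wkt() runs here
# even though the dict is empty.
found = crs in cache
calls_after_lookup = crs.to_wkt_calls
```

In other words, no explicit use of the CRS inside the task is needed for a worker thread to touch it; having it among the task arguments is enough.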

Expected Output

No error.

Environment Information

  • Output from: python -m pyproj -v
pyproj info:
    pyproj: 2.5.0
      PROJ: 6.3.0
  data dir: /data1/users/davidh/miniconda3/envs/geo2grid_dist/share/proj

System:
    python: 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48)  [GCC 7.3.0]
executable: /data1/users/davidh/miniconda3/envs/geo2grid_dist/bin/python
   machine: Linux-2.6.32-573.12.1.el6.x86_64-x86_64-with-centos-6.10-Final

Python deps:
       pip: 20.0.2
setuptools: 45.2.0.post20200209
    Cython: None

Specific conda-forge builds:

proj                      6.3.0                hc80f0dc_0    conda-forge
pyproj                    2.5.0            py37h8ff28aa_0    conda-forge

Installation method

  • conda, pip wheel, from source, etc…

Conda environment information (if you installed with conda):

I mentioned specific conda packages above, but we’ve seen this now on Ubuntu, Windows, and a CentOS 7 docker container running a conda-pack’d version of a conda-forge environment.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Top GitHub Comments

snowman2 commented, Feb 26, 2021

djhoese commented, Apr 16, 2020

Sorry, I thought I closed this already. This was our fault for using a CRS object with a dask map_blocks function (passing CRS objects between threads).
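
One way to avoid sharing such an object between threads, sketched here as an assumption rather than as the change that was actually made in Satpy, is to pass only a plain string (for example the output of `CRS.to_wkt()`) to the worker function and to rebuild the object inside each thread, cached in a `threading.local`. To keep the sketch runnable without pyproj, `StandInCRS` is a hypothetical stand-in; with pyproj installed the factory would presumably be `pyproj.CRS.from_wkt`. The helper names `crs_for_this_thread` and `process_block` are also invented for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Each thread sees its own attributes on this object.
_local = threading.local()


def crs_for_this_thread(wkt, factory):
    """Return factory(wkt), built at most once per thread and per string."""
    cache = getattr(_local, "cache", None)
    if cache is None:
        cache = _local.cache = {}
    if wkt not in cache:
        cache[wkt] = factory(wkt)
    return cache[wkt]


class StandInCRS:
    """Hypothetical stand-in for pyproj.CRS; records the thread that built it."""

    def __init__(self, wkt):
        self.wkt = wkt
        self.owner = threading.get_ident()


def process_block(block, wkt):
    # Only the string crosses the thread boundary; the object is built here.
    crs = crs_for_this_thread(wkt, StandInCRS)
    built_in_this_thread = crs.owner == threading.get_ident()
    return sum(block), built_in_this_thread


wkt = "PLACEHOLDER_WKT"  # placeholder text, not a real WKT string
blocks = [[1, 2], [3, 4], [5, 6]]

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(lambda b: process_block(b, wkt), blocks))
```

The same idea would apply to a function handed to dask's `map_blocks`: pass the string as the extra argument and build the CRS inside the function, so the object never appears among the task arguments.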
