dvc.api.read() raises an "UnicodeDecodeError"

See original GitHub issue

I am trying to access a DICOM file (an image saved in the Digital Imaging and Communications in Medicine format):

import dvc.api

path = 'dir/image.dcm'
remote = 'remote_name'
repo = 'git_repo'
mode = 'r'

data = dvc.api.read(path = path, remote = remote, repo = repo, mode = mode)

When I run this code, the download progress bar completes and then I get the following error:

Traceback (most recent call last):
  File "draft.py", line 7, in <module>
    mode ='r')
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\dvc\api.py", line 91, in read
    return fd.read()
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 764: character maps to <undefined>

I tried to overcome this issue by using the encoding argument:

data = dvc.api.read(path = path, remote = remote, repo = repo, mode = mode, encoding='ANSI')

I chose ANSI because that is the encoding Notepad++ reports when I open a DICOM file. However, this raises the following error:

Exception ignored in: <bound method Pool.__del__ of <dvc.fs.pool.Pool object at 0x0000021D1347A160>>
Traceback (most recent call last):
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\dvc\fs\pool.py", line 42, in __del__
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\dvc\fs\pool.py", line 46, in close
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\dvc\fs\ssh\connection.py", line 71, in close
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\paramiko\sftp_client.py", line 194, in close
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\paramiko\sftp_client.py", line 185, in _log
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\site-packages\paramiko\sftp.py", line 158, in _log
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\logging\__init__.py", line 1372, in log
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\logging\__init__.py", line 1441, in _log
  File "C:\Users\lbrandao\anaconda3\envs\my_env\lib\logging\__init__.py", line 1411, in makeRecord
TypeError: 'NoneType' object is not callable

I also tried encoding = 'utf-8', but the “UnicodeDecodeError” continues to appear:

Traceback (most recent call last):
  File "draft.py", line 7, in <module>
    mode ='r', encoding='utf-8')
  File "C:\Users\lbrandao\anaconda3\envs\ccab_env_dev\lib\site-packages\dvc\api.py", line 91, in read
    return fd.read()
  File "C:\Users\lbrandao\anaconda3\envs\ccab_env_dev\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 140: invalid continuation byte

Can anyone please help? Thanks.
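
Both tracebacks above come from a text decoder being applied to binary data. The following is a minimal sketch of that failure, using only the standard library. The two bytes (0x81 and 0xd0) are taken from the error messages; the helper name try_decode and the rest of the payload are made up for illustration and are not part of DVC.

```python
import io


def try_decode(data: bytes, encoding: str) -> str:
    """Return 'ok' if the bytes decode, else the decoder's stated reason."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.reason
    return "ok"


# 0x81 has no character assigned in Python's cp1252 table, which is the
# codec named in the first traceback.
cp1252_result = try_decode(b"\x81", "cp1252")

# 0xd0 starts a two-byte UTF-8 sequence, so it must be followed by a
# continuation byte (0x80-0xbf); an ASCII "A" is not one.
utf8_result = try_decode(b"\xd0A", "utf-8")

# Reading in binary mode skips decoding entirely, so nothing can fail here.
payload = b"\x81\xd0A"  # illustrative bytes, not a real DICOM file
raw = io.BytesIO(payload).read()

print(cp1252_result)
print(utf8_result)
print(raw == payload)
```

If this reading is right, no choice of encoding would help, because the file content is not text in the first place.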

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
lilianabrandao commented, Aug 13, 2021

@pared Here it is:

dvc

By the way, can the API work with lower DVC versions, e.g. 0.9.4?

0 reactions
efiop commented, Sep 21, 2021

@lilianabrandao We’ve migrated to sshfs (asyncssh instead of paramiko inside), so that ssh error that you were getting with rb should be resolved in recent dvc versions. Please give it a try and let us know if you still run into this issue. Thank you!
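
The rb mode mentioned in the comment above can be sketched as follows. This is an illustration under stated assumptions, not a confirmed fix from the thread: the helper names read_tracked_bytes, has_dicom_magic and parse_dicom are hypothetical, pydicom is assumed to be installed for the parsing step, and the magic-number check assumes a standard DICOM file with a 128-byte preamble followed by the bytes DICM.

```python
import io


def read_tracked_bytes(path, repo=None, remote=None):
    """Fetch a DVC-tracked file as raw bytes, with no decoding step."""
    import dvc.api  # imported lazily so the other helpers work without DVC

    # mode="rb" makes dvc.api.read() return bytes instead of str.
    return dvc.api.read(path, repo=repo, remote=remote, mode="rb")


def has_dicom_magic(raw: bytes) -> bool:
    """Check for the 128-byte preamble followed by b"DICM"."""
    return len(raw) >= 132 and raw[128:132] == b"DICM"


def parse_dicom(raw: bytes):
    """Parse the bytes with pydicom (assumed to be installed)."""
    import pydicom

    # dcmread accepts a file-like object, so the bytes never touch disk.
    return pydicom.dcmread(io.BytesIO(raw))


# Hypothetical usage, reusing the placeholder names from the question:
# raw = read_tracked_bytes("dir/image.dcm", repo="git_repo", remote="remote_name")
# if has_dicom_magic(raw):
#     dataset = parse_dicom(raw)
```

The imports of dvc.api and pydicom sit inside the functions so that the byte-level check can be tried without either package installed.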
