sc.read_h5ad randomly produces AnnDataReadError/OSError

I am trying to load some datasets with sc.read_h5ad(file_name). Frequently, I get the error below. Re-running the same code on the same data sometimes works, but often fails again. This happens with different h5ad datasets (i.e., it is not specific to one dataset). There always seems to be enough free RAM, and roughly the same amount each time. It happens both in Jupyter Notebook and in plain Python.

Error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    155         try:
--> 156             return func(elem, *args, **kwargs)
    157         except Exception as e:

~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/anndata/_io/h5ad.py in read_group(group)
    505     if "h5sparse_format" in group.attrs:  # Backwards compat
--> 506         return SparseDataset(group).to_memory()
    507 

~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/anndata/_core/sparse_dataset.py in to_memory(self)
    370         mtx = format_class(self.shape, dtype=self.dtype)
--> 371         mtx.data = self.group["data"][...]
    372         mtx.indices = self.group["indices"][...]

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/h5py/_hl/dataset.py in __getitem__(self, args)
    572         fspace = selection.id
--> 573         self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
    574 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5d.pyx in h5py.h5d.DatasetID.read()

h5py/_proxy.pyx in h5py._proxy.dset_rw()

h5py/_proxy.pyx in h5py._proxy.H5PY_H5Dread()

OSError: Can't read data (file read failed: time = Sat Aug  1 13:27:54 2020
, filename = '/path.../filtered_gene_bc_matrices.h5ad', file descriptor = 47, errno = 5, error message = 'Input/output error', buf = 0x55ec782e9031, total read size = 7011, bytes this sub-read = 7011, bytes actually read = 18446744073709551615, offset = 0)

During handling of the above exception, another exception occurred:

AnnDataReadError                          Traceback (most recent call last)
<ipython-input-14-faac769583f8> in <module>
     17     #while True:
     18         #try:
---> 19             adatas.append(sc.read_h5ad(file))
     20             file_diffs.append('_'.join([file.split('/')[i] for i in diff_path_idx]))
     21             #break

~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    411                 d[k] = read_dataframe(f[k])
    412             else:  # Base case
--> 413                 d[k] = read_attribute(f[k])
    414 
    415         d["raw"] = _read_raw(f, as_sparse, rdasp)

~/miniconda3/envs/rpy2_3/lib/python3.8/functools.py in wrapper(*args, **kw)
    873                             '1 positional argument')
    874 
--> 875         return dispatch(args[0].__class__)(*args, **kw)
    876 
    877     funcname = getattr(func, '__name__', 'singledispatch function')

~/miniconda3/envs/rpy2_3/lib/python3.8/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    160             else:
    161                 parent = _get_parent(elem)
--> 162                 raise AnnDataReadError(
    163                     f"Above error raised while reading key {elem.name!r} of "
    164                     f"type {type(elem)} from {parent}."

AnnDataReadError: Above error raised while reading key '/X' of type <class 'h5py._hl.group.Group'> from /.
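
The low-level fields in the OSError above can be decoded with the standard library alone. This is a minimal sketch: the two values are copied from the message, and reading them as a failure in the storage layer (rather than in anndata itself) is an inference from the errno, not something confirmed in the thread.

```python
import ctypes
import errno
import os

# Values copied from the OSError message above.
reported_errno = 5
bytes_actually_read = 18446744073709551615

# errno 5 is EIO, the operating system's generic input/output error.
print(errno.errorcode[reported_errno], "-", os.strerror(reported_errno))

# The huge "bytes actually read" value is what -1 looks like when it is
# printed as an unsigned 64-bit integer, which suggests the underlying
# read call reported failure instead of returning data.
signed_bytes_read = ctypes.c_int64(bytes_actually_read).value
print(signed_bytes_read)
```

If that reading is right, the disk, network mount, or filesystem holding the .h5ad files is a more likely place to look than free RAM.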

Versions:

scanpy==1.5.1 anndata==0.7.4 umap==0.4.6 numpy==1.18.5 scipy==1.4.1 pandas==1.0.5 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.8.2 louvain==0.6.1 leidenalg==0.8.1
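
The commented-out `while True:` / `try:` lines visible in the traceback suggest a retry loop was being tried. A bounded version of that idea might look like the sketch below. `read_with_retry` and its parameters are hypothetical names introduced here, not part of scanpy or anndata, and the set of exceptions to retry on is an assumption you should adjust to what you actually see.

```python
import time


def read_with_retry(reader, path, retry_on=(OSError,), attempts=5, base_delay=1.0):
    """Call reader(path), retrying with exponential backoff on selected errors.

    reader   -- any callable taking a path, e.g. sc.read_h5ad
    retry_on -- exception types that trigger a retry (assumption: OSError)
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return reader(path)
        except retry_on as err:
            last_error = err
            # Sleep 1x, 2x, 4x, ... base_delay, but not after the final attempt.
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error


# Hypothetical usage with scanpy:
#   import scanpy as sc
#   adata = read_with_retry(sc.read_h5ad, file)
```

A retry only hides an intermittent I/O failure. It does not explain why the read fails in the first place.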

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions:1
  • Comments:17 (9 by maintainers)

Top GitHub Comments

9 reactions
ktpolanski commented, Jul 28, 2022

I’m pretty sure none of you are having the same issue as the original one reported here. Compare @abuchin's error message of KeyError: 'dict' to the original poster’s error of OSError: Can't read data.

The thing you’re seeing is a new one stemming from an update to anndata. You’re trying to read an h5ad file created with a newer version of the package using an older one. I think the cutoff point is 0.8.0, but I could be mistaken.

Upgrade your anndata and you should be ok.
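
A quick way to check which side of that suspected cutoff an environment is on is sketched below. It assumes Python 3.8+ (for `importlib.metadata`) and treats 0.8.0 as the cutoff only because the comment above suggests it. `version_tuple` and `older_than` are hypothetical helpers written for this example.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.7.5' into (0, 7, 5), dropping suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '0.8' so comparisons stay consistent.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def older_than(installed, cutoff="0.8.0"):
    """True if the installed version sorts before the cutoff."""
    return version_tuple(installed) < version_tuple(cutoff)


try:
    installed = metadata.version("anndata")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("anndata is not installed in this environment")
elif older_than(installed):
    print(f"anndata {installed} predates 0.8.0; consider: pip install --upgrade anndata")
else:
    print(f"anndata {installed} is at or past the suspected cutoff")
```

The environment that writes the file and the one that reads it both matter, so it is worth running this in each.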

5 reactions
abuchin commented, Jun 3, 2022

Found the same error in our internal workflows. Saved the data to h5py files, but could not open them anymore for some reason.

Error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    155         try:
--> 156             return func(elem, *args, **kwargs)
    157         except Exception as e:

/opt/conda/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_group(group)
    531     if encoding_type:
--> 532         EncodingVersions[encoding_type].check(
    533             group.name, group.attrs["encoding-version"]

/opt/conda/lib/python3.7/enum.py in __getitem__(cls, name)
    356     def __getitem__(cls, name):
--> 357         return cls._member_map_[name]
    358

KeyError: 'dict'

During handling of the above exception, another exception occurred:

AnnDataReadError                          Traceback (most recent call last)
<ipython-input-20-38a594ec7d06> in <module>
----> 1 adata_ast=sc.read_h5ad('…/…/data_processed/Leng_2020/adata_ast.h5ad')

/opt/conda/lib/python3.7/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    424                 d[k] = read_dataframe(f[k])
    425             else:  # Base case
--> 426                 d[k] = read_attribute(f[k])
    427
    428         d["raw"] = _read_raw(f, as_sparse, rdasp)

/opt/conda/lib/python3.7/functools.py in wrapper(*args, **kw)
    838                             '1 positional argument')
    839
--> 840         return dispatch(args[0].__class__)(*args, **kw)
    841
    842     funcname = getattr(func, '__name__', 'singledispatch function')

/opt/conda/lib/python3.7/site-packages/anndata/_io/utils.py in func_wrapper(elem, *args, **kwargs)
    161                 parent = _get_parent(elem)
    162                 raise AnnDataReadError(
--> 163                     f"Above error raised while reading key {elem.name!r} of "
    164                     f"type {type(elem)} from {parent}."
    165                 )

AnnDataReadError: Above error raised while reading key '/layers' of type <class 'h5py._hl.group.Group'> from /.

Versions (pip list output, each package name followed by its version):

absl-py 1.1.0 aiohttp 3.8.1 aiosignal 1.2.0 anndata 0.7.5 anndata2ri 1.0.6 annoy 1.17.0 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 asn1crypto 1.4.0 async-timeout 4.0.2 asynctest 0.13.0 attrs 20.3.0 backcall 0.2.0 beautifulsoup4 4.11.1 bleach 5.0.0 boto3 1.17.66 botocore 1.20.66 brotlipy 0.7.0 cached-property 1.5.2 cachetools 5.2.0 certifi 2020.12.5 cffi 1.14.5 chardet 4.0.0 charset-normalizer 2.0.12 chex 0.1.3 click 8.1.3 colormath 3.0.0 commonmark 0.9.1 conda 4.6.14 conda-package-handling 1.7.3 cryptography 3.4.7 cycler 0.10.0 Cython 0.29.30 decorator 5.0.7 defusedxml 0.7.1 dill 0.3.3 dm-tree 0.1.7 docrep 0.3.2 entrypoints 0.4 et-xmlfile 1.1.0 fa2 0.3.5 fastjsonschema 2.15.3 flatbuffers 2.0 flax 0.5.0 frozenlist 1.3.0 fsspec 2022.5.0 future 0.18.2 get-version 2.2 google-auth 2.6.6 google-auth-oauthlib 0.4.6 google-pasta 0.2.0 grpcio 1.46.3 h5py 3.2.1 idna 2.10 imageio 2.19.3 importlib-metadata 4.11.4 importlib-resources 5.7.1 ipykernel 5.5.4 ipython 7.23.1 ipython-genutils 0.2.0 ipywidgets 7.7.0 jax 0.3.13 jaxlib 0.3.10 jedi 0.18.0 Jinja2 3.1.2 jmespath 0.10.0 joblib 1.0.1 jsonschema 4.6.0 jupyter-client 6.1.12 jupyter-core 4.7.1 jupyterlab-pygments 0.2.2 jupyterlab-widgets 1.1.0 kiwisolver 1.3.1 legacy-api-wrap 1.2 leidenalg 0.8.4 llvmlite 0.35.0 loompy 3.0.7 louvain 0.7.0 Markdown 3.3.7 MarkupSafe 2.1.1 matplotlib 3.4.1 matplotlib-inline 0.1.2 mistune 0.8.4 msgpack 1.0.4 multidict 6.0.2 multipledispatch 0.6.0 multiprocess 0.70.11.1 natsort 7.1.1 nbclient 0.6.4 nbconvert 6.5.0 nbformat 5.4.0 nest-asyncio 1.5.5 networkx 2.5 notebook 6.4.11 numba 0.52.0 numexpr 2.7.3 numpy 1.19.5 numpy-groupies 0.9.17 numpyro 0.9.2 oauthlib 3.2.0 openpyxl 3.0.10 opt-einsum 3.3.0 optax 0.1.2 packaging 20.9 pandas 1.2.0 pandocfilters 1.5.0 parso 0.8.2 pathos 0.2.7 patsy 0.5.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.1.1 pip 21.1.1 pox 0.2.9 ppft 1.6.6.3 prometheus-client 0.14.1 prompt-toolkit 3.0.18 protobuf 3.19.0 protobuf3-to-dict 0.1.5 ptyprocess 0.7.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pycosat 0.6.3 pycparser 2.20 pyDeprecate 0.3.1 Pygments 2.9.0 pyOpenSSL 20.0.1 pyparsing 2.4.7 pyro-api 0.1.2 pyro-ppl 1.8.1 pyrsistent 0.18.1 PySocks 1.7.1 python-dateutil 2.8.1 python-igraph 0.9.1 pytorch-lightning 1.5.10 pytz 2021.1 PyWavelets 1.3.0 PyYAML 6.0 pyzmq 22.0.3 requests 2.25.1 requests-oauthlib 1.3.1 rich 12.4.4 rpy2 3.4.2 rsa 4.8 ruamel-yaml-conda 0.15.80 ruamel.yaml 0.17.21 ruamel.yaml.clib 0.2.6 s3transfer 0.4.2 sagemaker 2.39.0.post0 scanpy 1.6.1 scikit-image 0.19.2 scikit-learn 0.24.2 scikit-misc 0.1.4 scipy 1.6.0 scrublet 0.2.3 scvi-tools 0.16.2 seaborn 0.11.1 Send2Trash 1.8.0 setuptools 59.5.0 setuptools-scm 6.0.1 sinfo 0.3.1 six 1.15.0 smdebug-rulesconfig 1.0.1 soupsieve 2.3.2.post1 spectra 0.0.11 statsmodels 0.12.2 stdlib-list 0.8.0 tables 3.6.1 tensorboard 2.9.0 tensorboard-data-server 0.6.1 tensorboard-plugin-wit 1.8.1 terminado 0.15.0 texttable 1.6.3 threadpoolctl 2.1.0 tifffile 2021.11.2 tinycss2 1.1.1 toolz 0.11.2 torch 1.11.0 torchmetrics 0.9.0 tornado 6.1 tqdm 4.60.0 traitlets 5.2.2.post1 typing-extensions 4.2.0 tzlocal 2.1 umap-learn 0.4.6 urllib3 1.26.4 wcwidth 0.2.5 webencodings 0.5.1 Werkzeug 2.1.2 wheel 0.36.2 widgetsnbextension 3.6.0 yarl 1.7.2 zipp 3.4.1

Has anyone found any solution to work around this issue?
