Dataset slice reference - AttributeError: module 'h5py' has no attribute 'ref_dtype' - documentation outdated?


To assist reproducing bugs, please include the following:

  • Operating System: Ubuntu 18.04
  • Python version: 3.7.3
  • Where Python was acquired: Anaconda, conda install h5py (conda-forge)
  • h5py version: 2.9.0
  • HDF5 version: 1.10.4
  • The full traceback/stack trace shown (if it appears)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-16-3e6d0b14c796> in <module>
      9 
     10     # ref slice of both dataset
---> 11     ds_ref = h5_store.create_dataset("ref", (100,), dtype=h5py.ref_dtype)
     12     ds_ref[:50] = ds1[:50]

AttributeError: module 'h5py' has no attribute 'ref_dtype'

I have just started exploring how to link two dataset slices through references, so I am following this page: http://docs.h5py.org/en/stable/refs.html

To get the above error, I ran the following code (in a JupyterHub Notebook):

import h5py
import numpy as np

with h5py.File("sliceref.h5", 'w') as h5_store:
    # create 2 datasets
    ds1 = h5_store.create_dataset('wav1', (100,))
    ds1[...] = np.arange(100)
    print(ds1[:])
    ds2 = h5_store.create_dataset('wav2', (100,))
    ds2[...] = np.arange(100, 200, 1)
    print(ds2[:])
    
    # ref slice of both dataset
    ds_ref = h5_store.create_dataset("ref", (100,), dtype=h5py.ref_dtype)
    ds_ref[:50] = ds1.regionref[:50]

The print xxx statements in the documentation make me assume that it was written for Python 2.7 and has simply not been updated. Is that right?


What would be the correct way to do the following (combining the first half of two arrays through references, without duplicating data)?

ds_ref[:50] = ds1.regionref[:50]
ds_ref[50:] = ds2.regionref[:50]

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 16 (8 by maintainers)

Top GitHub Comments

takluyver commented, Oct 7, 2019 (1 reaction)

h5py.VirtualLayout … seems it’s not like something you can store in the .h5 itself?

Yes, you can store it in the file. You assemble the VirtualLayout, then pass it to f.create_virtual_dataset() to add it to the file. Have a look at the documentation about virtual datasets: http://docs.h5py.org/en/stable/vds.html
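As a rough illustration of the virtual dataset route described above, the sketch below maps the first half of two datasets into one virtual dataset stored in the same file. The file name, the dataset name "combined", and the float32 dtype are assumptions made for this example; they do not come from the thread.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this sketch.
    path = os.path.join(tmp_dir, "vds_sketch.h5")

    # Write the two source datasets first.
    with h5py.File(path, "w") as f:
        f.create_dataset("wav1", data=np.arange(100, dtype="f4"))
        f.create_dataset("wav2", data=np.arange(100, 200, dtype="f4"))

    # Map the first half of each source into a 100-element layout,
    # then store that layout in the file as a virtual dataset.
    with h5py.File(path, "a") as f:
        layout = h5py.VirtualLayout(shape=(100,), dtype="f4")
        layout[:50] = h5py.VirtualSource(f["wav1"])[:50]
        layout[50:] = h5py.VirtualSource(f["wav2"])[:50]
        f.create_virtual_dataset("combined", layout)

    # Reading the virtual dataset pulls the values from the sources.
    with h5py.File(path, "r") as f:
        combined = f["combined"][:]

print(combined[:3], combined[50:53])
```

The virtual dataset stores only the mapping, so the values of wav1 and wav2 are not duplicated in the file.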

Am I correct in thinking that a reference is just like e.g. [5:2] (for a 1-D array) index values? So if you want to store a complete reference, you need to store both a Reference to the correct dataset and a RegionReference to select the right region in that dataset?

I had to look this up (I haven’t used references before). A region reference is both a reference to the dataset and to a selection from that dataset, so you only need to store one thing.

To use it with h5py, you need to use it in two lookups: once to get the dataset, once to get the data from it:

f[regionref][regionref]

TypeError: int() argument must be a string, a bytes-like object or a number, not 'h5py.h5r.Reference'

I think you’re mixing up Reference and RegionReference. When you create the dataset to store references, use h5py.regionref_dtype instead of h5py.ref_dtype.
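A minimal sketch of the two points above, assuming h5py 2.10 or later (where h5py.regionref_dtype is available): the reference dataset holds one region reference per slice rather than one per element, and each reference is used in two lookups when reading. The file name and the two-element shape of "ref" are assumptions for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this sketch.
    path = os.path.join(tmp_dir, "regionref_sketch.h5")

    with h5py.File(path, "w") as f:
        ds1 = f.create_dataset("wav1", data=np.arange(100, dtype="f4"))
        ds2 = f.create_dataset("wav2", data=np.arange(100, 200, dtype="f4"))

        # One region reference per slice, so two entries are enough.
        refs = f.create_dataset("ref", (2,), dtype=h5py.regionref_dtype)
        refs[0] = ds1.regionref[:50]
        refs[1] = ds2.regionref[:50]

    with h5py.File(path, "r") as f:
        parts = []
        for i in range(f["ref"].shape[0]):
            region = f["ref"][i]
            # First lookup resolves the dataset, second selects the region.
            parts.append(f[region][region])
        combined = np.concatenate(parts)

print(combined[:3], combined[50:53])
```

Only the two references are stored in "ref"; the concatenated array exists in memory after reading, not in the file.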

NumesSanguis commented, Oct 3, 2019 (1 reaction)

I’ve updated my h5py to 2.10.0 with conda install -c conda-forge h5py (Anaconda channel is still on 2.9.0).
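For environments that cannot be upgraded, a possible workaround is sketched below. It assumes that the AttributeError comes from h5py.ref_dtype and h5py.regionref_dtype not existing in releases before 2.10, and falls back to the older h5py.special_dtype() spelling in that case.

```python
import h5py

# Use the module-level constants when present, otherwise build the
# same dtypes through the older special_dtype() helper.
# Note: compare against None explicitly; a NumPy dtype can be falsy.
ref_dtype = getattr(h5py, "ref_dtype", None)
if ref_dtype is None:
    ref_dtype = h5py.special_dtype(ref=h5py.Reference)

regionref_dtype = getattr(h5py, "regionref_dtype", None)
if regionref_dtype is None:
    regionref_dtype = h5py.special_dtype(ref=h5py.RegionReference)

print(h5py.__version__, ref_dtype, regionref_dtype)
```

The resulting ref_dtype and regionref_dtype can then be passed as the dtype argument of create_dataset().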

When the documentation is updated, it would be nice to also include how to use the references.

Reference usage

I first ran the above code, and then this:

with h5py.File("sliceref.h5", 'r') as h5_store:
    print(h5_store["ref"][:])

Expectation:

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.  18. 19. 20. 21. 22. 23. 24. 25. 26.
27. 28. 29. 30. 31. 32. 33. 34. 35.  36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 100. 101.
102. 103. 104. 105. 106. 107. 108. 109. 110. 111. 112. 113.  114. 115. 116. 117. 118. 119. 120.
121. 122. 123. 124. 125. 126. 127.  128. 129. 130. 131. 132. 133. 134. 135. 136. 137. 138. 139.
140. 141.  142. 143. 144. 145. 146. 147. 148. 149.]

Reality:

[<HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
 <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
 ...
 <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>]

Question

How do I use references to get the output of my expectation? Or should I use a different approach for that?
