zarr.open_array does not properly recognize NestedDirectoryStore
import numpy
import zarr
a = zarr.create((10, 100, 100), chunks = (1, 100, 100), dtype = 'f4',
store = zarr.NestedDirectoryStore('foo.zarr'), overwrite = True)
a[...] = 1.0
print('shape =', a.shape)
print('a has non-zero values =', numpy.any(a))
del a
a = zarr.open_array('foo.zarr', mode = 'r')
print('shape =', a.shape)
print('a has non-zero values =', numpy.any(a))
Problem description
A zarr ‘file’ created using NestedDirectoryStore will not load correctly when later opened using the path alone. The shape (and other metadata) is correct, but the values are not loaded. In the code sample above, the shape of the original and reloaded array is the same, but the contents don’t match. The second array contains only zeroes.
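A likely explanation, consistent with the maintainer comments further down, is that the two layouts name chunk files differently: the flat layout stores chunk (3, 0, 0) under the key "3.0.0", while the nested layout stores it under "3/0/0". The metadata file sits in the same place in both, so the shape reads back correctly, but a reader assuming the flat layout finds no chunk files and returns the fill value. The following is a minimal stdlib-only sketch of that mismatch; `chunk_path` is a hypothetical helper written for illustration, not part of zarr.

```python
import os
import tempfile


def chunk_path(root, chunk_index, nested):
    """Return the file path a chunk would have under a flat or nested layout.

    Hypothetical helper for illustration only; it mimics the key naming,
    not zarr's actual store classes.
    """
    key = ".".join(str(i) for i in chunk_index)  # flat key, e.g. "3.0.0"
    if nested:
        key = key.replace(".", "/")  # nested key, e.g. "3/0/0"
    return os.path.join(root, *key.split("/"))


with tempfile.TemporaryDirectory() as root:
    # Writer uses the nested layout.
    written = chunk_path(root, (3, 0, 0), nested=True)
    os.makedirs(os.path.dirname(written), exist_ok=True)
    with open(written, "wb") as f:
        f.write(b"chunk bytes")

    # Reader that assumes the flat layout looks in the wrong place.
    found_flat = os.path.exists(chunk_path(root, (3, 0, 0), nested=False))
    # Reader that knows the layout is nested finds the chunk.
    found_nested = os.path.exists(chunk_path(root, (3, 0, 0), nested=True))

print("found with flat lookup:", found_flat)
print("found with nested lookup:", found_nested)
```

If this is the cause, passing the store explicitly when reopening, for example zarr.open_array(zarr.NestedDirectoryStore('foo.zarr'), mode = 'r'), should let zarr 2.x locate the chunks. That workaround has not been verified here against 2.3.2.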
Version and installation information
- Value of zarr.__version__: 2.3.2
- Value of numcodecs.__version__: 0.6.4
- Version of Python interpreter: 3.6.7 (anaconda)
- Operating system (Linux/Windows/Mac): Mac OS Mojave (10.14.6)
- How Zarr was installed: using conda
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 3
- Comments: 11 (7 by maintainers)
+1 for this issue!
Instead of guessing the store from the file extension, can the store class name simply be added to the metadata in .zarray?
FWIW I agree that ideally the application should not need to know ahead of time which store path separator has been used for chunks, but should be able to discover that from the array metadata. This is something I have proposed to introduce in the v3 spec, although that is still just a draft.
For the current implementation based on the v2 spec, we could leave it as is and make clear in the documentation that anyone using a nested store needs to tell users of the data how to open it correctly. Alternatively, we could add a metadata field to .zarray that records which chunk path separator is used.
Given this will be fixed in the v3 spec, I’d be inclined to wait until then. Happy to discuss though.
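To make the second option concrete, here is a rough sketch of how a reader could pick the chunk key separator from the array metadata instead of being told out of band. The field name `chunk_separator` is purely hypothetical, chosen for illustration; nothing in this thread defines such a field, and `chunk_key` is not a zarr function.

```python
import json


def chunk_key(zarray_text, chunk_index, default_sep="."):
    """Build a chunk key using a separator read from .zarray-style metadata.

    "chunk_separator" is a hypothetical field; when it is absent the
    flat "." separator is assumed.
    """
    meta = json.loads(zarray_text)
    sep = meta.get("chunk_separator", default_sep)
    return sep.join(str(i) for i in chunk_index)


# Metadata without the hypothetical field: flat layout is assumed.
flat_meta = json.dumps({"shape": [10, 100, 100], "chunks": [1, 100, 100]})

# Metadata written by a nested store that records its separator.
nested_meta = json.dumps(
    {"shape": [10, 100, 100], "chunks": [1, 100, 100], "chunk_separator": "/"}
)

print(chunk_key(flat_meta, (3, 0, 0)))
print(chunk_key(nested_meta, (3, 0, 0)))
```

Defaulting to "." when the field is missing would keep existing flat arrays readable without rewriting their metadata.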
On Fri, 28 Feb 2020, 18:31 Stuart Berg, notifications@github.com wrote: