Nested storage detection in Zarr V2
See: https://github.com/ome/ngff/issues/29 and https://github.com/bcdev/jzarr/pull/17
In order to better handle Zarr arrays created with NestedDirectoryStore or FSStore(key_separator="/"), SabineEmbacher and I have been working on a “protocol heuristic” that V2 implementations can use to detect nested chunking rather than requiring the user to specify it correctly.
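For readers unfamiliar with the two layouts, here is a minimal sketch of how the same chunk ends up under different keys. The helper name chunk_key is hypothetical and is not part of any Zarr implementation; it only illustrates joining the chunk grid indices with the chosen separator.

```python
from typing import Sequence


def chunk_key(index: Sequence[int], separator: str) -> str:
    """Join the chunk grid indices with the given separator (hypothetical helper)."""
    return separator.join(str(i) for i in index)


# Flat layout: every chunk is a sibling of .zarray.
flat = chunk_key((1, 2, 3), ".")
# Nested layout: each dimension becomes a directory level.
nested = chunk_key((1, 2, 3), "/")
print(flat, nested)
```

A reader that assumes the wrong separator simply finds no chunks, which is the failure mode the heuristic tries to avoid.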
tl;dr: This proposes a new key for .zarray, which it would be good to have feedback on.
Proposal
When creating a zarr array:
- add to .zarray json: {"dimension_separator": "/"}
- always write a 0-position chunk
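The two writer-side steps could look roughly like the following sketch. It assumes a plain dict as a stand-in for a store, an uncompressed "<i4" array, and a hypothetical function name create_array; none of this is the zarr-python or JZarr API.

```python
import json
from typing import MutableMapping, Sequence


def create_array(
    store: MutableMapping[str, bytes],
    shape: Sequence[int],
    chunks: Sequence[int],
    separator: str = "/",
    prefix: str = "",
) -> str:
    """Write .zarray with the proposed key, then the 0-position chunk (sketch)."""
    meta = {
        "zarr_format": 2,
        "shape": list(shape),
        "chunks": list(chunks),
        "dtype": "<i4",
        "compressor": None,
        "fill_value": 0,
        "order": "C",
        "filters": None,
        "dimension_separator": separator,  # the proposed new key
    }
    store[prefix + ".zarray"] = json.dumps(meta).encode("ascii")

    # Always write the 0-position chunk so that readers which do not see
    # the metadata key can still probe for the layout.
    n_items = 1
    for c in chunks:
        n_items *= c
    zero_key = prefix + separator.join("0" for _ in shape)
    store[zero_key] = bytes(4 * n_items)  # zeros; "<i4" is 4 bytes per item
    return zero_key


store: dict = {}
print(create_array(store, shape=(4, 4), chunks=(2, 2)))
```

Writing the chunk eagerly costs one extra object per array, which is part of why the MAY/SHOULD/MUST question matters.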
When opening an array:
- try to read the separator character from the .zarray json
- if not available, try to find a 0-position chunk
- if not available, at every read action try to find chunks with both standard variants until the situation is clarified. (Standard separator list: ["/", "."])
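The reader-side fallback chain above can be sketched as follows. This is only an illustration under stated assumptions: the store is any mapping from key to bytes, the array has at least one dimension, and the names detect_separator and read_chunk are hypothetical rather than taken from an existing implementation.

```python
import json
from typing import Mapping, Optional, Sequence, Tuple

CANDIDATE_SEPARATORS = ("/", ".")  # the standard separator list from the proposal


def detect_separator(store: Mapping[str, bytes], prefix: str = "") -> Optional[str]:
    """Return the separator for the array at `prefix`, or None if still unknown."""
    meta = json.loads(store[prefix + ".zarray"])

    # Step 1: an explicit declaration in the metadata wins.
    declared = meta.get("dimension_separator")
    if declared is not None:
        return declared

    # Step 2: probe for the 0-position chunk under each candidate.
    # For 1-D arrays both candidates give the key "0", which is harmless
    # because no separator ever appears in the chunk keys.
    ndim = len(meta["shape"])
    for sep in CANDIDATE_SEPARATORS:
        if prefix + sep.join(["0"] * ndim) in store:
            return sep
    return None


def read_chunk(
    store: Mapping[str, bytes],
    index: Sequence[int],
    separator: Optional[str] = None,
    prefix: str = "",
) -> Tuple[Optional[str], Optional[bytes]]:
    """Step 3: while the separator is unknown, try both variants on every read."""
    candidates = (separator,) if separator is not None else CANDIDATE_SEPARATORS
    for sep in candidates:
        key = prefix + sep.join(str(i) for i in index)
        if key in store:
            return sep, store[key]  # the caller can cache `sep` from now on
    return None, None
```

Once read_chunk reports a separator, a caller can remember it and skip the double lookup on later reads.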
Points for discussion:
- The name dimension_separator differs from key_separator, the name used in the code implementation, to reduce confusion about whether every separator in the key name is affected.
- There has been some discussion (community call, gitter) about whether or not “/” could become the default.
- Is an addition to the .zarray metadata sufficiently low impact to be rolled into v2?
I consider the dimension_separator saga complete! whew
No worries, @SabineEmbacher. The suggestions were definitely useful. It’s more just a matter of https://martinfowler.com/bliki/TwoHardThings.html …
Since key_separator leads to confusion and dimensionSeparator/dimension_separator is at least used in some other implementation, I’ve updated the description with dimension_separator.
On the community call last night, there were no objections to moving forward with the .zarray addition, so I’ll open a v2 spec PR now.
I’m a bit more hesitant about defining the heuristic as part of the specification (cf. @jbms’s comment). I’ll leave this open for a discussion of where first-chunk writing falls on the MAY/SHOULD/MUST spectrum.