Tree option to omit array metadata (shape, dtype)
When using the tree() function/method, arrays are currently printed with shape and dtype. This is useful diagnostic information, but it requires that the .zarray resource is retrieved and read for every array in the tree. This is not an issue for data stored locally, but it can be an issue for remote storage, as retrieving each .zarray resource requires a network round-trip.

Proposal: add an option meta=True to the tree() function/method which, if set to meta=False, omits the array metadata from the output, so that building the tree representation requires only retrieving the list of keys from the store.
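
To illustrate why a key listing alone would be enough, here is a minimal sketch in plain Python. It is not zarr's implementation: `tree_from_keys` is a hypothetical helper, and the key layout (`.zgroup` and `.zarray` entries under each node path) is assumed to follow the zarr v2 store convention. The node kind is inferred from key names only, so no `.zarray` document is ever read.

```python
from collections import defaultdict


def tree_from_keys(keys):
    """Render a tree from store keys only (hypothetical helper, no .zarray reads)."""
    # Infer each node's kind purely from the name of its metadata key.
    kinds = {}
    for key in keys:
        path, _, name = key.rpartition("/")
        if name == ".zgroup":
            kinds[path] = "group"
        elif name == ".zarray":
            kinds[path] = "array"

    # Index child paths by their parent path; the root ("") has no parent.
    children = defaultdict(list)
    for path in kinds:
        if path:
            parent = path.rpartition("/")[0]
            children[parent].append(path)

    lines = []

    def walk(path, depth):
        label = path.rpartition("/")[2] or "/"
        lines.append("    " * depth + f"{label} ({kinds.get(path, 'group')})")
        for child in sorted(children[path]):
            walk(child, depth + 1)

    walk("", 0)
    return "\n".join(lines)


# Example key listing, as a store might return it in one request.
keys = [".zgroup", "foo/.zgroup", "foo/bar/.zarray", "foo/bar/0.0", "baz/.zarray"]
print(tree_from_keys(keys))
```

With this approach the cost is one key listing rather than one request per array, at the price of losing shape and dtype in the output.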
Issue Analytics
- State:
- Created 6 years ago
- Comments: 11 (10 by maintainers)
This may be a different issue. I would suggest looking into consolidated metadata.
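
For context, the idea behind consolidated metadata can be sketched in plain Python as follows. This is an illustration under assumptions, not zarr's code: `consolidate` and `read_array_meta` are hypothetical helpers, a dict stands in for a remote store, and the `.zmetadata` layout shown is my understanding of the zarr v2 consolidated format. In zarr 2.x the real entry points are, as far as I know, `zarr.consolidate_metadata` and `zarr.open_consolidated`.

```python
import json


def consolidate(store):
    """Copy every metadata document into a single '.zmetadata' entry."""
    meta_names = (".zgroup", ".zarray", ".zattrs")
    metadata = {
        key: json.loads(value)
        for key, value in store.items()
        if key.rpartition("/")[2] in meta_names
    }
    # Assumed layout of the consolidated document (zarr v2 style).
    doc = {"zarr_consolidated_format": 1, "metadata": metadata}
    store[".zmetadata"] = json.dumps(doc).encode()


def read_array_meta(store, path):
    """Look up an array's metadata with a single read of '.zmetadata'."""
    doc = json.loads(store[".zmetadata"])
    return doc["metadata"][f"{path}/.zarray"]


# A plain dict stands in for a remote store (keys -> bytes).
store = {
    ".zgroup": b'{"zarr_format": 2}',
    "foo/.zarray": b'{"shape": [100, 100], "dtype": "<i4"}',
    "foo/0.0": b"\x00\x01",
}
consolidate(store)
meta = read_array_meta(store, "foo")
print(meta["shape"], meta["dtype"])
```

Once the metadata is consolidated, a reader needs one round-trip for all array metadata instead of one per array, which is where the saving on remote storage would come from.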
Has there been any progress on this? I am noticing very large wall times (currently at ~6 min) with data stored on GCP. I am new to zarr in general, so any advice to reduce this would be great too!