Tree option to omit array metadata (shape, dtype)

When using the tree() function/method, currently arrays are printed with shape and dtype. This is useful diagnostic information but requires that the .zarray resource is retrieved and read for every array in the tree. This is not an issue for data stored locally, but can be an issue for remote storage as retrieving each .zarray resource will require a network round-trip.

The proposal is to add an option meta to the tree() function/method, defaulting to meta=True. Setting meta=False would omit the array metadata from the output, so building the tree representation would require only retrieving the list of keys from the store.
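
To make the idea concrete, here is a minimal sketch of how a tree could be rendered from store keys alone. It is not zarr's implementation: `tree_from_keys` is a hypothetical helper, and it assumes a zarr v2 style store in which every group holds a `.zgroup` key and every array holds a `.zarray` key. Only the key names are inspected; no value is ever read.

```python
from typing import Dict, Iterable


def tree_from_keys(keys: Iterable[str]) -> str:
    """Render a hierarchy from key names only (hypothetical helper, not zarr API)."""
    nested: Dict[str, dict] = {}
    for key in keys:
        parent, _, leaf = key.rpartition("/")
        # Only metadata markers reveal that a node exists; chunk keys are skipped.
        # An empty parent is the root itself, which is drawn as "/" below.
        if leaf not in (".zarray", ".zgroup") or not parent:
            continue
        node = nested
        for part in parent.split("/"):
            node = node.setdefault(part, {})

    lines = ["/"]

    def render(node: Dict[str, dict], prefix: str) -> None:
        names = sorted(node)
        for i, name in enumerate(names):
            last = i == len(names) - 1
            lines.append(prefix + ("└── " if last else "├── ") + name)
            render(node[name], prefix + ("    " if last else "│   "))

    render(nested, " ")
    return "\n".join(lines)


# Assumed example keys, shaped like a zarr v2 store listing.
example_keys = [
    ".zgroup",
    "2010/.zgroup",
    "2010/131A/.zarray",
    "2010/131A/0.0.0",
    "2010/94A/.zarray",
    "2011/.zgroup",
    "2011/131A/.zarray",
]
print(tree_from_keys(example_keys))
```

With a remote store, the cost of this approach is a single key listing rather than one request per array, at the price of losing shape and dtype in the output.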

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 11 (10 by maintainers)

Top GitHub Comments

1 reaction
jakirkham commented, Apr 12, 2021

This may be a different issue. Would suggest looking into consolidated metadata.
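
For readers unfamiliar with consolidated metadata, the following is a stdlib-only sketch of the idea, not zarr's own code. The helpers `consolidate` and `read_shapes` are hypothetical, the store is assumed to be a plain dict of bytes, and the `.zmetadata` layout is only loosely modelled on what zarr 2.x writes. In zarr 2.x itself the relevant functions are `zarr.consolidate_metadata` and `zarr.open_consolidated`.

```python
import json

METADATA_NAMES = (".zarray", ".zgroup", ".zattrs")


def consolidate(store: dict) -> None:
    """Copy every metadata document into one '.zmetadata' entry (hypothetical helper)."""
    meta = {
        key: json.loads(value)
        for key, value in store.items()
        if key.rsplit("/", 1)[-1] in METADATA_NAMES
    }
    doc = {"zarr_consolidated_format": 1, "metadata": meta}
    store[".zmetadata"] = json.dumps(doc).encode()


def read_shapes(store: dict) -> dict:
    """Recover shape and dtype of every nested array from a single store read."""
    meta = json.loads(store[".zmetadata"])["metadata"]
    suffix = "/.zarray"
    return {
        key[: -len(suffix)]: (tuple(value["shape"]), value["dtype"])
        for key, value in meta.items()
        if key.endswith(suffix)
    }


# Assumed example store: a dict standing in for a remote key-value store.
store = {
    ".zgroup": b'{"zarr_format": 2}',
    "2010/.zgroup": b'{"zarr_format": 2}',
    "2010/131A/.zarray": json.dumps(
        {"shape": [47116, 512, 512], "dtype": "<f4"}
    ).encode(),
}
consolidate(store)
print(read_shapes(store))
```

The consolidation step is paid once by whoever writes the data; afterwards a reader needs one round-trip for all shapes and dtypes instead of one per array.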

0 reactions
PaulJWright commented, Apr 12, 2021

Has there been any progress on this? I am noticing very large wall times (currently at ~6 min) with data stored on GCP. I am new to zarr in general, so any advice to reduce this would be great too!

import gcsfs
import zarr

gcs = gcsfs.GCSFileSystem(access='read_only')
store = gcsfs.GCSMap('file.zarr', gcs=gcs, check=False)
root = zarr.group(store)

%time print(root.tree())
/
 ├── 2010
 │   ├── 131A (47116, 512, 512) float32
 │   ├── 1600A (47972, 512, 512) float32
 │   ├── 1700A (46858, 512, 512) float32
 │   ├── 171A (47186, 512, 512) float32
 │   ├── 193A (47134, 512, 512) float32
 │   ├── 211A (47186, 512, 512) float32
 │   ├── 304A (47131, 512, 512) float32
 │   ├── 335A (47187, 512, 512) float32
 │   └── 94A (46930, 512, 512) float32
 ├── 2011
 │   ├── 131A (75200, 512, 512) float32
 │   ├── 1600A (75814, 512, 512) float32
 │   ├── 1700A (74839, 512, 512) float32
 │   ├── 171A (75660, 512, 512) float32
 │   ├── 193A (75664, 512, 512) float32
 │   ├── 211A (75678, 512, 512) float32
 │   ├── 304A (74199, 512, 512) float32
 │   ├── 335A (75624, 512, 512) float32
 │   └── 94A (75138, 512, 512) float32
 ├── 2012
 │   ├── 131A (76849, 512, 512) float32
 │   ├── 1600A (76630, 512, 512) float32
 │   ├── 1700A (69091, 512, 512) float32
 │   ├── 171A (76750, 512, 512) float32
 │   ├── 193A (76852, 512, 512) float32
 │   ├── 211A (76870, 512, 512) float32
 │   ├── 304A (76851, 512, 512) float32
 │   ├── 335A (76855, 512, 512) float32
 │   └── 94A (76878, 512, 512) float32
 ├── 2013
 │   ├── 131A (82719, 512, 512) float32
 │   ├── 1600A (83001, 512, 512) float32
 │   ├── 1700A (74989, 512, 512) float32
 │   ├── 171A (82633, 512, 512) float32
 │   ├── 193A (82716, 512, 512) float32
 │   ├── 211A (82746, 512, 512) float32
 │   ├── 304A (82715, 512, 512) float32
 │   ├── 335A (82723, 512, 512) float32
 │   └── 94A (82746, 512, 512) float32
 └── 2014
     ├── 131A (73605, 512, 512) float32
     ├── 1600A (73390, 512, 512) float32
     ├── 1700A (66326, 512, 512) float32
     ├── 171A (73487, 512, 512) float32
     ├── 193A (73603, 512, 512) float32
     ├── 211A (73617, 512, 512) float32
     ├── 304A (73602, 512, 512) float32
     ├── 335A (73604, 512, 512) float32
     └── 94A (73618, 512, 512) float32
CPU times: user 1min 11s, sys: 1.9 s, total: 1min 13s
Wall time: 6min 14s