Store global attributes as HDF5 Datasets

Global (file-level) attributes are currently stored as HDF5 Attributes. However, HDF5 Attributes are meant to be small (there is no hard limit, but the spec says 16 kB) and they cannot be sliced.

It would be useful to be able to store arbitrarily large amounts of data at the global level, such as pickled objects, images, or other supporting data.

Two options

  1. Add a new API for large global objects (say, LoomConnection.blobs) and store them as Datasets (e.g. under /global) in the file. This would retain backwards compatibility but would require maintaining two different APIs that do almost the same thing. New files would use a mixture of old-style and new-style attributes indefinitely. In new files, only the new-style global attributes would be invisible when opened using an older library implementation.

  2. Keep the current API but change the Loom file format spec to store global attributes as Datasets (e.g. under /global). Implementors would still need to look for attributes both as HDF5 attributes and as Datasets, to ensure old files would still be readable. New files will use a consistent API and consistent file format. For backwards-compatibility, implementors should write global attributes as HDF5 Attributes (in addition to writing them as Datasets) if they are smaller than 16 kB. Larger global attributes in new files would be invisible when opened using an older library implementation.

I think option 2 is nicer and should be compatible enough.
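
Option 2 could look roughly like the following with h5py. This is a minimal sketch, not loompy's actual implementation: the helper names `write_global_attr` and `read_global_attr` and the two constants are hypothetical, the `/global` group and the 16 kB threshold are taken from the proposal above, and values are assumed to be numeric or byte arrays (a pickled object would first need to be wrapped, for example as a `uint8` array).

```python
import h5py
import numpy as np

SMALL_ATTR_LIMIT = 16 * 1024  # bytes; the 16 kB threshold mentioned in the issue
GLOBAL_GROUP = "global"       # the issue suggests /global as an example location


def write_global_attr(f, name, value):
    """Store a file-level attribute as a Dataset; mirror small values as HDF5 Attributes."""
    data = np.asarray(value)
    grp = f.require_group(GLOBAL_GROUP)
    if name in grp:
        del grp[name]              # overwrite any previous value
    grp.create_dataset(name, data=data)
    if data.nbytes < SMALL_ATTR_LIMIT:
        f.attrs[name] = data       # keeps small values visible to older readers
    elif name in f.attrs:
        del f.attrs[name]          # avoid leaving a stale legacy copy behind


def read_global_attr(f, name):
    """Prefer the Dataset; fall back to the legacy HDF5 Attribute for old files."""
    if GLOBAL_GROUP in f and name in f[GLOBAL_GROUP]:
        return f[GLOBAL_GROUP][name][()]
    return f.attrs[name]


# Usage on an in-memory file (nothing is written to disk).
with h5py.File("demo.loom", "w", driver="core", backing_store=False) as f:
    write_global_attr(f, "small", np.arange(4))
    write_global_attr(f, "big", np.zeros(100_000))  # 800 kB, stored as a Dataset only
    first_ten = f[GLOBAL_GROUP]["big"][:10]         # Datasets can be sliced
```

Because large values live in a Dataset, a reader can slice them without loading the whole object, which HDF5 Attributes do not allow. The fallback in `read_global_attr` is what keeps old files readable.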

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
slinnarsson commented, Sep 19, 2019

Fixed in loompy3.0 branch

1 reaction
slinnarsson commented, Apr 18, 2019

Not yet, but I’ll work on it. I think it will soon be time for a loompy 3 release, which will make it possible to change the file spec.
