Representing schema in xarray
In order to get to feature parity with pymc3 plotting, we need a way to access sampler statistics (specifically, divergences) and ppc samples. @aseyboldt outlined a good way to think about the rest of the schema here.
At the same time, xarray can open a single netCDF group, but it doesn’t look like it supports a hierarchy of groups natively (yet? see the discussion at https://github.com/pydata/xarray/issues/1092).
I am proposing something like
import xarray as xr
import netCDF4 as nc


class Trace(object):
    def __init__(self, filename):
        self.filename = filename
        self.data = nc.Dataset(filename)
        self.groups = self.data.groups

    def __getattr__(self, name):
        if name in self.groups:
            return xr.open_dataset(self.filename, group=name)
        raise AttributeError("informative message")

    def __dir__(self):
        """Allows for tab completion on netCDF group names"""
        return super(Trace, self).__dir__() + list(self.groups.keys())
This is a pretty light wrapper around netCDF and xarray. Usage is something like
t = Trace('mytrace.nc')
t.posterior # this is an xarray.Dataset
t.posterior.mu.mean() # calculate the mean of a variable
I think this will have to change a little bit so that nested groups work fine. In particular, something like
t = Trace('mytrace.nc')
t.sampler_stats.divergences # should return an xarray.Dataset
t.sampler_stats # I think this would tend to return an empty xarray.Dataset
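One way to think about the nested case is to flatten the group tree into slash-separated paths, since xr.open_dataset accepts such a path in its group argument. The following is only a sketch under that assumption: iter_group_paths is a hypothetical helper, and fake_root stands in for nc.Dataset('mytrace.nc') with a made-up layout so the example runs without a file on disk.

```python
from types import SimpleNamespace


def iter_group_paths(node, prefix=""):
    """Yield slash-separated paths for every group nested under ``node``.

    ``node`` only needs a ``groups`` mapping of name -> child node, which is
    what netCDF4 Dataset and Group objects expose.
    """
    for name, child in node.groups.items():
        path = prefix + "/" + name if prefix else name
        yield path
        # Recurse so that groups inside groups are reported as well
        for sub_path in iter_group_paths(child, path):
            yield sub_path


# Hypothetical stand-in for nc.Dataset('mytrace.nc')
fake_root = SimpleNamespace(groups={
    "posterior": SimpleNamespace(groups={}),
    "sampler_stats": SimpleNamespace(groups={
        "divergences": SimpleNamespace(groups={}),
    }),
})

paths = list(iter_group_paths(fake_root))
```

Each resulting path, such as sampler_stats/divergences, could then be passed as group= to xr.open_dataset, which would let a wrapper resolve t.sampler_stats.divergences one level at a time.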
This seems more or less reasonable to me, but note that opening a netCDF file isn’t always cheap (this is also true, to a lesser extent, of creating an xarray.Dataset). I expect you will be happier with caching or eagerly creating Dataset objects rather than recreating them in __getattr__.
Closed by #173 and #176
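A minimal sketch of the caching suggestion, not the approach taken in the linked PRs: CachedTrace is a hypothetical variant of Trace that opens each group at most once. To keep the example runnable without a netCDF file, the opener is passed in as a callable; in practice it could be something like lambda name: xr.open_dataset(filename, group=name).

```python
class CachedTrace(object):
    """Hypothetical Trace variant that opens each group at most once."""

    def __init__(self, group_names, open_group):
        self._group_names = list(group_names)
        self._open_group = open_group  # callable: group name -> dataset
        self._cache = {}

    def __getattr__(self, name):
        # Only called when normal lookup fails. Internal state is read via
        # __dict__ so a missing attribute cannot recurse into __getattr__.
        if name not in self.__dict__.get("_group_names", ()):
            raise AttributeError(
                "{} has no group or attribute {!r}".format(
                    type(self).__name__, name))
        cache = self.__dict__["_cache"]
        if name not in cache:
            cache[name] = self.__dict__["_open_group"](name)
        return cache[name]

    def __dir__(self):
        """Allows for tab completion on group names"""
        return list(super(CachedTrace, self).__dir__()) + self._group_names


# Fake opener that records how often each group is opened
opened = []


def fake_open(name):
    opened.append(name)
    return {"group": name}


trace = CachedTrace(["posterior", "sampler_stats"], fake_open)
first = trace.posterior
second = trace.posterior  # served from the cache, fake_open is not called again
```

Eagerly creating every Dataset in __init__ would be the other option mentioned above; the lazy cache only pays the opening cost for groups that are actually accessed.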