Challenges running xarray wrapped netcdf files
This is a traceback from calling compute on an XArray computation on dask.distributed.
We’re able to use dask.array on a NetCDF4 object without locks if our workers have single threads. However, when computing on the .data attribute backed by a NetCDF object wrapped by a few XArray containers we run into the following error. It appears to be coming from computing the shape, which is odd. Traceback below:
In [168]: ds = xr.open_mfdataset(fname, lock=False)
In [169]: ds.yParticle.data.sum().compute()
/net/scratch3/pwolfram/miniconda2/lib/python2.7/site-packages/dask/array/core.pyc in getarray()
47 lock.acquire()
48 try:
---> 49 c = a[b]
50 if type(c) != np.ndarray:
51 c = np.asarray(c)
/users/pwolfram/lib/python2.7/site-packages/xarray/core/indexing.pyc in __getitem__()
396
397 def __getitem__(self, key):
--> 398 return type(self)(self.array, self._updated_key(key))
399
400 def __setitem__(self, key, value):
/users/pwolfram/lib/python2.7/site-packages/xarray/core/indexing.pyc in _updated_key()
372
373 def _updated_key(self, new_key):
--> 374 new_key = iter(canonicalize_indexer(new_key, self.ndim))
375 key = []
376 for size, k in zip(self.array.shape, self.key):
/users/pwolfram/lib/python2.7/site-packages/xarray/core/utils.pyc in ndim()
380 @property
381 def ndim(self):
--> 382 return len(self.shape)
383
384 @property
/users/pwolfram/lib/python2.7/site-packages/xarray/core/indexing.pyc in shape()
384 def shape(self):
385 shape = []
--> 386 for size, k in zip(self.array.shape, self.key):
387 if isinstance(k, slice):
388 shape.append(len(range(*k.indices(size))))
/users/pwolfram/lib/python2.7/site-packages/xarray/conventions.pyc in shape()
447 @property
448 def shape(self):
--> 449 return self.array.shape[:-1]
450
451 def __str__(self):
/users/pwolfram/lib/python2.7/site-packages/xarray/core/indexing.pyc in shape()
384 def shape(self):
385 shape = []
--> 386 for size, k in zip(self.array.shape, self.key):
387 if isinstance(k, slice):
388 shape.append(len(range(*k.indices(size))))
/users/pwolfram/lib/python2.7/site-packages/xarray/core/utils.pyc in shape()
407 @property
408 def shape(self):
--> 409 return self.array.shape
410
411 def __array__(self, dtype=None):
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.shape.__get__ (netCDF4/_netCDF4.c:32778)()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable._getdims (netCDF4/_netCDF4.c:31870)()
RuntimeError: NetCDF: Not a valid ID
Issue Analytics
- State:
- Created 7 years ago
- Comments:48 (20 by maintainers)
Top GitHub Comments
@shoyer and @mrocklin, this looks like it is working now using pydata/xarray#1095:
Would this naturally suggest that xarray-distributed is now a reality? If so, I should try something more complex when I get the time tomorrow.
Hi, I’m hitting the same error message:
RuntimeError: NetCDF: Not a valid ID
It occurs when I try to get values from a dask array after performing a computation. Although I see this issue was resolved by https://github.com/pydata/xarray/pull/1095, I can’t find the explicit solution there.
Could you please point me to it? Thanks!