Multidimensional dask coordinates unexpectedly computed
MCVE Code Sample
from dask.diagnostics import ProgressBar
import xarray as xr
import numpy as np
import dask.array as da
a = xr.DataArray(da.zeros((10, 10), chunks=2), dims=('y', 'x'), coords={'y': np.arange(10), 'x': np.arange(10), 'lons': (('y', 'x'), da.zeros((10, 10), chunks=2))})
b = xr.DataArray(da.zeros((10, 10), chunks=2), dims=('y', 'x'), coords={'y': np.arange(10), 'x': np.arange(10), 'lons': (('y', 'x'), da.zeros((10, 10), chunks=2))})
with ProgressBar():
    c = a + b
Output:
[########################################] | 100% Completed | 0.1s
Problem Description
Using arrays with 2D dask array coordinates results in the coordinates being computed for any binary operation (anything combining two or more DataArrays). I use ProgressBar in the above example to show when coordinates are being computed.
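A progress bar shows that something was computed, but not how often. As a rough sketch, a custom scheduler can count every compute that dask is asked to perform. The CountingScheduler class below is a hypothetical helper written for illustration (it is not part of dask or xarray); it relies only on dask.config.set accepting a callable scheduler and on dask.get, the synchronous scheduler.

```python
import dask
import dask.array as da


class CountingScheduler:
    """Hypothetical helper: counts how many times dask is asked to compute."""

    def __init__(self):
        self.total_computes = 0

    def __call__(self, dsk, keys, **kwargs):
        self.total_computes += 1
        # Delegate the actual work to dask's synchronous scheduler.
        return dask.get(dsk, keys, **kwargs)


scheduler = CountingScheduler()
with dask.config.set(scheduler=scheduler):
    x = da.zeros((10, 10), chunks=2)

    # Building a lazy expression should not trigger any computation.
    lazy_sum = x + x
    computes_after_lazy_op = scheduler.total_computes

    # An explicit compute goes through the counting scheduler once.
    result = lazy_sum.sum().compute()
    computes_after_compute = scheduler.total_computes
```

Wrapping c = a + b from the MCVE in the same with block and inspecting scheduler.total_computes afterwards would show whether the coordinates were computed on a given xarray version.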
In my own work, when I learned that 2D dask coordinates were possible, I started adding longitude and latitude coordinates. These are rather large and can take a while to load/compute, so I was surprised that simple operations (e.g. a.fillna(b)) were causing things to be computed and taking a long time.
Is this computation by design or a possible bug?
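One possible workaround, sketched here under assumptions rather than taken from the issue, is to drop the 2D coordinate before the binary operation and re-attach it afterwards, so that only the index coordinates take part in combining the operands. The sketch assumes an xarray version that provides drop_vars (newer than the 0.12.1 reported below, where drop plays that role); make_array is a hypothetical helper that rebuilds the arrays from the MCVE.

```python
import dask.array as da
import numpy as np
import xarray as xr


def make_array():
    """Hypothetical helper: rebuild one of the DataArrays from the MCVE."""
    return xr.DataArray(
        da.zeros((10, 10), chunks=2),
        dims=("y", "x"),
        coords={
            "y": np.arange(10),
            "x": np.arange(10),
            "lons": (("y", "x"), da.zeros((10, 10), chunks=2)),
        },
    )


a = make_array()
b = make_array()

# Keep a handle on the lazy 2D coordinate data before dropping it.
lons_data = a.coords["lons"].data

# Only the 1D index coordinates take part in the binary operation.
c = a.drop_vars("lons") + b.drop_vars("lons")

# Re-attach the still-lazy coordinate to the result.
c = c.assign_coords(lons=(("y", "x"), lons_data))
```

This assumes the caller already knows that both operands share the same lons values, since the equality check is skipped entirely.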
Expected Output
No output from the ProgressBar, hoping that no coordinates would be computed/loaded.
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 02:16:08) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.14.3
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.22
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.0.0
distributed: 2.0.0
matplotlib: 3.1.0
cartopy: 0.17.1.dev147+HEAD.detached.at.5e624fe
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: 4.6.3
IPython: 7.5.0
sphinx: 2.1.2
Issue Analytics
- Created 4 years ago
- Comments:8 (7 by maintainers)
Top GitHub Comments
FYI: @djhoese, you can inline code snippets using the permanent link to the source: https://github.com/pydata/xarray/blob/e5bb647637063153a7feb750793d6fd8fb58dda8/xarray/core/variable.py#L1223
Ah, good call. The transpose currently in xarray would still be a problem though.