dask.array.asarray should recognize dask duck-types
See original GitHub issueWith master for dask and xarray (with dask duck array support):
In [25]: import dask.array as da
In [26]: import xarray
In [27]: dask_obj = da.ones((3, 3), chunks=-1)
In [28]: xarray_obj = xarray.DataArray(x, dims=['x', 'y'])
In [29]: dict(da.asarray(xarray_obj).dask)
Out[29]:
{('array-5ce95599636c17df4c5746fb5f3c220b',
0,
0): (<function dask.array.core.getter_inline>, 'array-original-5ce95599636c17df4c5746fb5f3c220b', (slice(0, 3, None),
slice(0, 3, None))),
'array-original-5ce95599636c17df4c5746fb5f3c220b': <xarray.DataArray 'wrapped-8015d128b19387281f3d8bd3a663300a' (x: 3, y: 3)>
dask.array<shape=(3, 3), dtype=float64, chunksize=(3, 3)>
Dimensions without coordinates: x, y}
The dask graph from asarray()
has an xarray.DataArray
object in it, with its own nested dask graph. This is pretty messy: asarray()
should really recognize the dask duck-type and combine the argument’s dask graph into its own.
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (5 by maintainers)
Top Results From Across the Web
Array - Dask documentation
Dask arrays coordinate many NumPy arrays (or “duck arrays” that are sufficiently NumPy-like in API such as CuPy or Sparse arrays) arranged into...
Read more >Array creation and __array_function__ · Issue #4883 ... - GitHub
One of the issues that arise when introducing __array_function__ in a NumPy-like library, such as Dask, is array creation.
Read more >Parallel computing with Dask - Xarray
Dask divides arrays into many small pieces, called chunks, each of which is presumed to be small enough to fit into memory. Unlike...
Read more >NEP 22 — Duck typing for NumPy arrays – high level overview
If you are trying to implement a duck array, then you should strive to implement everything. You certainly need .shape , .ndim and...
Read more >duck_array_ops.py - "Compatibility module defining operations on ...
Currently, this means Dask or NumPy arrays. ... pandas_isnull(data)else:# Not reachable yet, but intended for use with other duck array# types.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Since
xarray.DataArray
does not inherit from daskArray
, bothasarray
andasanyarray
should convert it in a daskArray
. That part of the existing functionality is exactly right.But if a
xarray.DataArray
contains a dask graph (via the custom collections interface), it would be nice to merge its dask graph directly into the resulting dask Array, instead of having nested dask arrays. The later is pretty messy, and can result in various problems like the inability to optimize and resource contention.It’s not entirely clear to me that there is a good way to do this with the current custom collections interface. @mrocklin @jcrist any thoughts?
xarray.DataArray
does support the custom collections interface.But merely being a dask object isn’t enough here. We also need some way to indicate how xarray objects can be converted into dask arrays.
Coincidentally I’ve been working this past week with @njsmith on a proposal for how exactly this sort of thing should be done. We’ll hopefully be sending it out very soon…