dask.array.asarray should recognize dask duck-types

With master for dask and xarray (with dask duck array support):

In [25]: import dask.array as da

In [26]: import xarray

In [27]: dask_obj = da.ones((3, 3), chunks=-1)

In [28]: xarray_obj = xarray.DataArray(dask_obj, dims=['x', 'y'])

In [29]: dict(da.asarray(xarray_obj).dask)
Out[29]:
{('array-5ce95599636c17df4c5746fb5f3c220b',
  0,
  0): (<function dask.array.core.getter_inline>, 'array-original-5ce95599636c17df4c5746fb5f3c220b', (slice(0, 3, None),
   slice(0, 3, None))),
 'array-original-5ce95599636c17df4c5746fb5f3c220b': <xarray.DataArray 'wrapped-8015d128b19387281f3d8bd3a663300a' (x: 3, y: 3)>
 dask.array<shape=(3, 3), dtype=float64, chunksize=(3, 3)>
 Dimensions without coordinates: x, y}

The dask graph from asarray() has an xarray.DataArray object in it, with its own nested dask graph. This is pretty messy: asarray() should really recognize the dask duck-type and combine the argument’s dask graph into its own.
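One way to spot this kind of nesting is to scan a graph's values for objects that are themselves dask collections. The following is a minimal sketch using only the standard library: `find_nested_collections` and `FakeWrapper` are hypothetical names invented for illustration, and the graph is a hand-built dict that imitates the output above rather than a real dask graph. The check on `__dask_graph__` roughly mirrors what `dask.is_dask_collection` does.

```python
from collections.abc import Mapping


def find_nested_collections(graph: Mapping) -> dict:
    """Return the graph entries whose values are themselves dask collections.

    A value counts as a collection here if it defines ``__dask_graph__`` and
    calling it returns something other than None (an assumption that roughly
    follows the dask custom collections interface).
    """
    nested = {}
    for key, value in graph.items():
        get_graph = getattr(value, "__dask_graph__", None)
        if callable(get_graph) and get_graph() is not None:
            nested[key] = value
    return nested


class FakeWrapper:
    """Hypothetical stand-in for an xarray.DataArray backed by a dask array."""

    def __init__(self, inner_graph):
        self._inner_graph = inner_graph

    def __dask_graph__(self):
        return self._inner_graph


# Hand-built imitation of the graph shown above: one chunk task that reads
# from an "original" entry, and that entry holds a whole collection.
wrapped = FakeWrapper({("wrapped-abc", 0, 0): "chunk"})
outer_graph = {
    ("array-123", 0, 0): (getattr, "array-original-123", "values"),
    "array-original-123": wrapped,
}

nested = find_nested_collections(outer_graph)
print(list(nested))
```

With dask installed, the same scan could be run over `dict(da.asarray(xarray_obj).dask)`, using `dask.is_dask_collection` in place of the attribute check.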

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
shoyer commented, Nov 20, 2017

Since xarray.DataArray does not inherit from dask Array, both asarray and asanyarray should convert it into a dask Array. That part of the existing functionality is exactly right.

But if an xarray.DataArray contains a dask graph (via the custom collections interface), it would be nice to merge its dask graph directly into the resulting dask Array, instead of having nested dask arrays. The latter is pretty messy, and can result in various problems like the inability to optimize and resource contention.
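To make the idea of merging concrete, here is a rough sketch of what a flat graph could look like. It is not how dask or xarray implement anything: `FakeDuckArray` and `merged_graph` are hypothetical names, the chunk tasks are placeholder strings, and the sketch assumes the wrapped object's keys are already laid out like a dask array's `(name, i, j)` chunk keys, which is exactly the part the custom collections interface does not guarantee.

```python
from itertools import product


class FakeDuckArray:
    """Hypothetical wrapper whose chunks are keyed like a dask array's."""

    def __init__(self, name, numblocks):
        self.name = name
        self.numblocks = numblocks
        # One placeholder task per chunk, keyed as (name, i, j, ...).
        self._graph = {
            (name, *idx): f"chunk{idx}"
            for idx in product(*(range(n) for n in numblocks))
        }

    def __dask_graph__(self):
        return self._graph

    def __dask_keys__(self):
        # Flat list for brevity; real dask arrays return nested lists.
        return list(self._graph)


def merged_graph(obj, new_name):
    """Return a flat graph: the inner tasks plus one alias per chunk."""
    graph = dict(obj.__dask_graph__())
    for key in obj.__dask_keys__():
        # In a dask graph, a value that is itself a key acts as an alias.
        graph[(new_name, *key[1:])] = key
    return graph


inner = FakeDuckArray("wrapped-abc", numblocks=(1, 1))
flat = merged_graph(inner, "array-123")
print(flat)
```

In this flat form every value is either a task or a reference to another key, so no entry holds a whole collection with its own graph, and an optimizer can see all tasks at once.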

It’s not entirely clear to me that there is a good way to do this with the current custom collections interface. @mrocklin @jcrist any thoughts?

0 reactions
shoyer commented, Mar 31, 2018

xarray.DataArray does support the custom collections interface.

But merely being a dask object isn’t enough here. We also need some way to indicate how xarray objects can be converted into dask arrays.

Coincidentally I’ve been working this past week with @njsmith on a proposal for how exactly this sort of thing should be done. We’ll hopefully be sending it out very soon…
