Consider adding nonzero/flatnonzero to dask.array

Original GitHub issue: https://github.com/dask/dask/issues/1076

This would need to be eagerly evaluated, or perhaps return an imperative value, because the result shape is not known a priori.

This sort of thing could be useful for https://github.com/pydata/xarray/pull/815 if you want to mask out a small region of a very large array. For example, imagine you have a tiled, high-resolution digital elevation model dataset covering the globe, and you want to extract the region corresponding to California. There are certainly more intelligent indexing strategies for geospatial data, but I could see something like ds.sel_where(ds.region_enum == CALIFORNIA) being convenient.
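To make the shape problem concrete, here is a minimal pure-Python sketch of a block-wise flatnonzero over a chunked 1-D array. It is not dask's implementation; the function name chunked_flatnonzero and the list-of-lists chunk layout are assumptions made for illustration. It shows that the input chunk lengths are known up front, while the output chunk lengths depend on the values.

```python
from itertools import accumulate


def chunked_flatnonzero(chunks):
    """Return (flat indices of non-zero values, per-chunk output lengths).

    `chunks` is a list of 1-D lists standing in for the blocks of a
    chunked array (a hypothetical layout, for illustration only).
    """
    # Flat offset at which each chunk starts; derivable from chunk
    # lengths alone, without looking at any values.
    offsets = [0, *accumulate(len(c) for c in chunks)][:-1]

    # Each chunk is processed independently.
    per_chunk = [
        [offset + i for i, value in enumerate(chunk) if value]
        for offset, chunk in zip(offsets, chunks)
    ]

    # The size of each output chunk is only known after the values
    # have been inspected.
    output_lengths = [len(indices) for indices in per_chunk]
    flat = [i for indices in per_chunk for i in indices]
    return flat, output_lengths


chunks = [[0, 3, 0], [0, 0, 0], [7, 8, 0, 1]]
indices, lengths = chunked_flatnonzero(chunks)
print(indices)  # [1, 6, 7, 9]
print(lengths)  # [1, 0, 3]
```

The input chunks have lengths 3, 3 and 4, but the output chunks have lengths 1, 0 and 3, which could not have been written down before reading the data.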

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 2
  • Comments: 22 (22 by maintainers)

Top GitHub Comments

1 reaction
mrocklin commented, Mar 31, 2017

If you don’t want to evaluate the entire result, then you’ll need to know in advance which chunks to look at. Generally this isn’t doable. Dask.array isn’t a good fit when the structure of the computation depends on the values of the array.
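For contrast, the case where the chunks are known in advance can be sketched as follows. If the region of interest is given as an index range rather than as a value-dependent mask, the chunks to load follow from the chunk sizes alone. The helper overlapping_chunks and the example chunk sizes are hypothetical and not part of dask.

```python
def overlapping_chunks(chunk_sizes, start, stop):
    """Indices of chunks along one axis that intersect the range [start, stop)."""
    if start >= stop:
        return []  # empty range touches no chunk

    selected = []
    chunk_start = 0
    for index, size in enumerate(chunk_sizes):
        chunk_stop = chunk_start + size
        # Two half-open intervals overlap when each starts before the
        # other one ends.
        if chunk_start < stop and start < chunk_stop:
            selected.append(index)
        chunk_start = chunk_stop
    return selected


# Four row chunks of 1000 rows each (assumed sizes, for illustration).
row_chunks = (1000, 1000, 1000, 1000)
print(overlapping_chunks(row_chunks, 1500, 2200))  # [1, 2]
```

No array values are read here. A mask such as region_enum == CALIFORNIA gives no such bounds until it has been evaluated, which is the situation the comment describes as generally not doable.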

1 reaction
mrocklin commented, Mar 29, 2017

Unknown dimension lengths are fine

On Wed, Mar 29, 2017 at 1:10 PM, jakirkham wrote:

Is it only unknown chunks or are unknown dimension lengths for the overall array allowed too?



Top Results From Across the Web

dask.array.nonzero - Dask documentation
Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The...

dask.array.flatnonzero - Dask documentation
Return indices that are non-zero in the flattened version of a. This docstring was copied from numpy.flatnonzero. Some inconsistencies with the Dask version ...

Source code for dask.array.routines
Consider removing it in a future version of dask. import cupy xp = cupy ... adjust_chunks={0: 1}, # one row for each block...

Dask and the __array_function__ protocol
In short, the protocol allows a NumPy function call to dispatch the appropriate NumPy-like library implementation, depending on the array type ...

dask.array.count_nonzero - Dask documentation
Counts the number of non-zero values in the array a. ... For example, any number is considered truthful if it is nonzero,...
