Add dask array support


Stages:

  1. Basic support for dask.array
  2. Stream dask.array
  3. Use dask.delayed or dask.futures to help downsample images.

CC: @mrocklin @dani-lbnl xref #43 https://github.com/scisprints/2018_05_sklearn_skimage_dask/issues/12

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
mrocklin commented, May 26, 2018

The `coarsen` function might be relevant here:

http://dask.pydata.org/en/latest/array-api.html#dask.array.coarsen
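As a rough illustration of the suggestion above, here is a minimal sketch of block-mean downsampling with `dask.array.coarsen`. The array contents, chunk sizes, and variable names are made up for the example and are not taken from the issue; note that the chunk sizes are chosen to be divisible by the coarsening factors.

```python
import numpy as np
import dask.array as da

# Toy 4x4 "image" split into 2x2 chunks (illustrative values only).
image = da.from_array(np.arange(16.0).reshape(4, 4), chunks=(2, 2))

# Average every 2x2 neighborhood: axis 0 and axis 1 are each reduced by a factor of 2.
downsampled = da.coarsen(np.mean, image, {0: 2, 1: 2})

# Nothing is computed until .compute() is called.
result = downsampled.compute()
print(result.shape)  # (2, 2)
```

The mapping passed as the third argument gives the reduction factor per axis, so a 3D volume would use something like `{0: 10, 1: 10, 2: 10}`.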

(Sent in reply to the issue description above, posted by Matt McCormick on Fri, May 25, 2018, 4:45 PM as https://github.com/InsightSoftwareConsortium/itk-jupyter-widgets/issues/44.)

0 reactions
thewtex commented, Oct 30, 2018

> You are downsampling 2048, 2048, 1600 by 10, 10, 10, which causes the array to get padded. The input array should be contiguous and its shape should be divisible by the shrink factors.

For this use case, we have to accept all data as it comes. All data is welcome 😃.
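To make the divisibility point concrete, here is a small sketch of how a shape that is not a multiple of the shrink factors could be zero-padded up to the next multiple. The helper `pad_to_multiple` is hypothetical and only illustrates the arithmetic; the thread does not say how the widget actually pads its input.

```python
import numpy as np

def pad_to_multiple(image, factors):
    """Zero-pad the end of each axis up to the next multiple of its shrink factor.

    Hypothetical helper for illustration only.
    """
    # (-size) % factor is the number of extra elements needed on that axis.
    pad_width = [(0, (-size) % factor) for size, factor in zip(image.shape, factors)]
    return np.pad(image, pad_width, mode="constant")

# Small stand-in for a large volume (illustrative shape and factors).
small = np.ones((7, 8, 5), dtype=np.uint8)
padded = pad_to_multiple(small, (2, 4, 3))
print(padded.shape)  # (8, 8, 6)
```

By the same arithmetic, an axis of length 2048 shrunk by a factor of 10 would need 2 extra elements, since `(-2048) % 10 == 2`.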

> PS: you can probably speed up your large array creation by following some of the (not really) hacks suggested here: numpy/numpy#11919, though these shouldn’t be so necessary in numpy 1.16.

Nicely done!

> 2. Use `persist`. It is unlikely that you care about creating a final contiguous matrix (unless you do).
>
> 3. Remove the call to `astype`. This is pretty contentious in scikit-image. I've tried to see if using smaller dtypes would help; it just depends on how the compiler can optimize them. It isn't obvious that Cython is the right tool to ensure that fast operations occur on uint8.
>
> Finally, if you really want to hack, provide a dtype to `coarsen` and to `local_means` and feed in float32.

Good ideas, but these were unfortunately constraints from the use case.
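For readers unfamiliar with the `persist` suggestion quoted above, the following is a minimal sketch of what it does in Dask. The array shape, chunking, and the `* 2` operation are placeholders chosen for the example, not the code under review in the thread.

```python
import dask.array as da

# Placeholder volume: 8x8x8 float32 ones, split into 4x4x4 chunks.
volume = da.ones((8, 8, 8), chunks=(4, 4, 4), dtype="float32")

# persist() computes the chunks and keeps them in memory, but the result is
# still a chunked dask array rather than one contiguous NumPy array.
scaled = (volume * 2).persist()

# Later computations reuse the persisted chunks instead of recomputing them.
total = float(scaled.sum().compute())
mean = float(scaled.mean().compute())
print(scaled.chunks, total, mean)
```

This contrasts with calling `compute()` on the array itself, which would assemble a single in-memory NumPy array.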

Thanks for the reviews @hmaarrfk @mrocklin !
