Experiment with map_overlap and cupy arrays

It would be useful to try the da.map_overlap function with CuPy arrays on large 2d datasets.

A trivial example might look something like the following (untested):

import cupy
import dask.array as da

# Back the random state with CuPy so each chunk is a GPU array;
# swap cupy -> numpy here for comparison.
rs = da.random.RandomState(RandomState=cupy.random.RandomState)
x = rs.random((500000, 500000), chunks=(10000, 10000))

# map_overlap returns a new lazy array, so keep the result and reduce over it
y = x.map_overlap(lambda x: x, depth=1)
y.sum().compute()  # trigger computation, but don't ask for the entire array as a result

My guess is that we’ll be badly bound by communication. To verify this, I would probably run the computation under the dask distributed scheduler, started with dask-cuda’s LocalCUDACluster, and watch the dashboard.
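For a rough sense of the sizes involved, here is a back-of-envelope calculation for the example above. It assumes float64 elements (8 bytes) and counts only the halo of an interior chunk; it says nothing about per-message latency, serialization, or whether the scheduler ends up moving whole chunks instead of slices, which is what the dashboard would actually show.

```python
# Sizes for the example above; itemsize is an assumption (float64).
shape = (500_000, 500_000)
chunks = (10_000, 10_000)
depth = 1
itemsize = 8  # bytes per element, assuming float64

# Number of chunks and the raw data volume
n_chunks = (shape[0] // chunks[0]) * (shape[1] // chunks[1])
chunk_bytes = chunks[0] * chunks[1] * itemsize
total_bytes = shape[0] * shape[1] * itemsize

# Halo of an interior chunk: the padded block minus the chunk itself
halo_elems = (chunks[0] + 2 * depth) * (chunks[1] + 2 * depth) - chunks[0] * chunks[1]
halo_bytes = halo_elems * itemsize

print(f"chunks:          {n_chunks}")
print(f"bytes per chunk: {chunk_bytes:,}")
print(f"total bytes:     {total_bytes:,}")
print(f"halo per chunk:  {halo_elems:,} elements, {halo_bytes:,} bytes")
```

Under these assumptions the halo is tiny compared with the chunk, so any communication cost seen in practice would more likely come from the number of transfers and their overhead than from the halo volume alone.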

My hope is that once the UCX work finishes, this cost will go down considerably. It will be interesting to see by how much.

Additionally, we might try using numba.cuda.jit to build a simple nearest-neighbor kernel function and apply it with map_overlap over the array. This notebook from this blogpost might be an interesting starting point here (but there are probably more interesting operations).
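As a CPU-only starting point for the nearest-neighbor idea, here is a minimal sketch that uses a plain NumPy function as a stand-in for a numba.cuda.jit kernel. The function name neighbor_mean, the small array size, and the choice of boundary="none" are illustrative assumptions, not something from the notebook mentioned above.

```python
import numpy as np
import dask.array as da


def neighbor_mean(block):
    """Average each interior cell with its four nearest neighbors.

    Hypothetical stand-in for a numba.cuda.jit kernel; edge cells of the
    block are left unchanged.
    """
    out = block.copy()
    out[1:-1, 1:-1] = (
        block[1:-1, 1:-1]
        + block[:-2, 1:-1] + block[2:, 1:-1]
        + block[1:-1, :-2] + block[1:-1, 2:]
    ) / 5.0
    return out


# Small deterministic input so the result is easy to check
data = np.arange(400 * 400, dtype="float64").reshape(400, 400)
x = da.from_array(data, chunks=(100, 100))

# depth=1 gives each block one row/column from each neighbor;
# boundary="none" means the outer edges of the array are not padded.
y = x.map_overlap(neighbor_mean, depth=1, boundary="none")
result = y.compute()
```

With boundary="none" the chunked result should match applying neighbor_mean to the whole array at once, which makes it a convenient correctness check before swapping in CuPy-backed chunks and a GPU kernel.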

cc @madsbk

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 33 (33 by maintainers)

Top GitHub Comments

2 reactions
mrocklin commented, Aug 12, 2019

The easiest way I’ve found to do this is to push the HTML files to a gh-pages branch and then go to username.github.io/repo-name/path-to-file.html

http://mrocklin.github.io/raw-host/map-overlap/map_overlap_10k_tcp.html

http://mrocklin.github.io/raw-host/map-overlap/map_overlap_10k_ucx.html

https://github.com/mrocklin/raw-host/commit/72f0876b88c2f7d4e4bc0b5e845811a28fc220cc

1 reaction
jakirkham commented, Sep 14, 2020

Have gone ahead and put together a simple benchmark script in PR ( https://github.com/rapidsai/dask-cuda/pull/399 ). This should give us a way to track performance and measure improvements. Perhaps we can close this once that is in? New issues could follow up on more specific improvements as needed.
