Stream in the context-manager form is not used in `ElementwiseKernel` or `ReductionKernel`

This is actually a bug reported back in #1695 that unfortunately went unnoticed.

In examples/stream/map_reduce.py, a list of streams is created to execute cupy.matmul() in parallel, which is backed by a ReductionKernel in this case: https://github.com/cupy/cupy/blob/1af22f57fda92ae35bde806d0c4d110faf4fed52/cupy/core/core.pyx#L2513-L2516

However, inspecting the implementation, I found that ReductionKernel only accepts an explicit stream argument; it does not pick up the current stream: https://github.com/cupy/cupy/blob/32718607a7808ec6bc3a24cf9231a9351f8fc95e/cupy/core/reduction.pxi#L396

In other words, the example was misleading: the streams were not used at all, so all executions were serialized, as can be checked with nvprof + nvvp (see the red circle): [screenshot, 2019-10-03 11:24:27 AM]

The same bug also appears in ElementwiseKernel: https://github.com/cupy/cupy/blob/1af22f57fda92ae35bde806d0c4d110faf4fed52/cupy/core/_kernel.pyx#L537

In my opinion, unlike RawKernel, which is not used by any CuPy core functionality, ElementwiseKernel and ReductionKernel should honor the current stream by checking the current stream pointer when no stream argument is given explicitly, since many CuPy functions such as cupy.matmul() do not support passing in a stream. A similar approach is already used in the FFT module; see #2362.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

4 reactions
emcastillo commented, Oct 7, 2019

Actually, the previous code was fine; simply increasing the matrix sizes shows real overlap. [image]
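For readers who want to try this, here is a minimal sketch of the context-manager pattern being discussed. The helper name `matmul_in_streams`, the stream count, and the matrix size are assumptions chosen for illustration, not values from the issue or from examples/stream/map_reduce.py. It needs CuPy and a CUDA-capable GPU, so `cupy` is imported inside the function.

```python
def matmul_in_streams(n_streams=4, size=4096):
    """Run one cupy.matmul() per stream using the context-manager form.

    n_streams and size are illustrative defaults (assumptions), not
    values taken from the original example.
    """
    import cupy  # imported lazily: requires CuPy and a CUDA-capable GPU

    streams = [cupy.cuda.Stream() for _ in range(n_streams)]
    xs = [cupy.random.rand(size, size) for _ in range(n_streams)]
    ys = [cupy.random.rand(size, size) for _ in range(n_streams)]

    # Make sure input generation has finished before the matmuls are queued.
    cupy.cuda.Device().synchronize()

    results = []
    for stream, x, y in zip(streams, xs, ys):
        with stream:  # makes `stream` the current stream inside this block
            results.append(cupy.matmul(x, y))

    # Wait for the work queued on every stream to complete.
    for stream in streams:
        stream.synchronize()
    return results
```

Whether the kernels actually overlap depends on the problem size and the GPU, so it still has to be confirmed in a profiler timeline, as was done in this thread.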

1 reaction
emcastillo commented, Oct 7, 2019

Apparently, if the stream is set to None in those two functions, when the kernel is launched the current stream is retrieved: https://github.com/cupy/cupy/blob/32718607a7808ec6bc3a24cf9231a9351f8fc95e/cupy/cuda/function.pyx#L126

This is done in the linear_launch function: https://github.com/cupy/cupy/blob/32718607a7808ec6bc3a24cf9231a9351f8fc95e/cupy/cuda/function.pyx#L174
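The behavior described here can be illustrated with a toy model in pure Python. This is not CuPy's code: `Stream`, `get_current_stream`, `launch`, and `elementwise_call` below are hypothetical stand-ins, written only to show how a kernel wrapper that forwards `stream=None` can still end up on the current stream when the lookup happens at launch time.

```python
import threading

_state = threading.local()  # per-thread "current stream" (toy stand-in)


class Stream:
    """Toy stream: it only records the names of kernels launched on it."""

    def __init__(self, name):
        self.name = name
        self.launched = []
        self._prev = []  # stack of previously current streams

    def __enter__(self):
        # Save whatever was current so it can be restored on exit.
        self._prev.append(get_current_stream())
        _state.current = self
        return self

    def __exit__(self, *exc):
        _state.current = self._prev.pop()
        return False


DEFAULT_STREAM = Stream("default")


def get_current_stream():
    # Fall back to the default stream when no `with` block is active.
    return getattr(_state, "current", DEFAULT_STREAM)


def launch(kernel_name, stream=None):
    """Launch step: an explicit stream wins, otherwise use the current one."""
    if stream is None:
        stream = get_current_stream()
    stream.launched.append(kernel_name)
    return stream


def elementwise_call(kernel_name, stream=None):
    # The kernel wrapper never looks up the current stream itself;
    # it simply forwards `stream` (possibly None) to the launch step.
    return launch(kernel_name, stream)


s1 = Stream("s1")
with s1:
    used = elementwise_call("add")
print(used.name)  # s1
```

In this model the wrapper only exposes an explicit `stream` argument, which matches what the issue observed in the kernel classes, yet the `with` block still takes effect because the fallback lives one layer lower.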
