Cupy function doesn't utilize pinned memory inside stream

Conditions

CuPy Version : 7.2.0
CUDA Root : /usr/common/software/cuda/10.1.243
CUDA Build Version : 10010
CUDA Driver Version : 10020
CUDA Runtime Version : 10010
cuBLAS Version : 10202
cuFFT Version : 10102
cuRAND Version : 10102
cuSOLVER Version : (10, 3, 0)
cuSPARSE Version : 10301
NVRTC Version : (10, 1)
cuDNN Build Version : 7605
cuDNN Version : 7605
NCCL Build Version : 2506
NCCL Runtime Version : 2506

Code to reproduce

import numpy as np
import cupy as cp
import cupy.linalg
import cupyx.scipy.sparse  # csr_matrix is used below
import cupyx.scipy.special
import cupyx as cpx

stream_1 = cp.cuda.stream.Stream()
with stream_1:
    cp.random.seed(1)
    A = cp.random.rand(10000, 10000)
    u, v = cp.linalg.eigh(cpx.scipy.sparse.csr_matrix(A).todense())

Error messages, stack traces, or logs

By profiling the above code, I observe many small bursts of cudaMemcpy2DAsync calls inside eigh, even though I never explicitly ask CuPy to transfer data back to the host. The CuPy calls are issued inside a stream. How do I force CuPy to use pinned memory efficiently?

Attached profile: eigh_profile5.qdrep.zip

Issue Analytics

Created 4 years ago
Comments: 16 (11 by maintainers)

Top GitHub Comments

jakirkham commented, Mar 31, 2020 (4 reactions)

FYI this was opened as a bug internally in NVIDIA.

leofang commented, Mar 6, 2020 (4 reactions)

Looks like those data transfers are made outside of CuPy (likely in cuSPARSE or cuSOLVER). IIUC almost all CuPy internal kernels are prefixed with cupy_ (or cupyx_), but I don’t see any in those transfers.
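
The prefix check described in this comment can be sketched in a few lines of plain Python. This is only a rough heuristic and not part of CuPy or of any profiler: the helper name split_by_origin and the sample operation names are made up for illustration, and it assumes the profile has already been exported to a list of operation names.

```python
# Prefixes that, per the comment above, mark almost all CuPy-internal kernels.
CUPY_PREFIXES = ("cupy_", "cupyx_")


def split_by_origin(names):
    """Partition operation names into CuPy-prefixed ones and everything else.

    Hypothetical helper for illustration; a name without a CuPy prefix is
    only *likely* to come from another library (e.g. cuSPARSE or cuSOLVER).
    """
    from_cupy, external = [], []
    for name in names:
        if name.startswith(CUPY_PREFIXES):
            from_cupy.append(name)
        else:
            external.append(name)
    return from_cupy, external


# Illustrative names only, not taken from the attached profile.
sample = ["cupy_copy", "cudaMemcpy2DAsync", "cupyx_example_kernel"]
from_cupy, external = split_by_origin(sample)
print("CuPy-prefixed:", from_cupy)
print("Other:", external)
```

Anything that lands in the second list is a candidate for having been issued by a library CuPy calls into rather than by CuPy itself, which matches the reasoning in the comment.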