[BUG] `cp.dot` causes "an illegal memory access was encountered"
Conditions (you can just paste the output of `python -c 'import cupy; cupy.show_config()'`)

CuPy Version          : 7.3.0
CUDA Root             : /usr/local/cuda
CUDA Build Version    : 10000
CUDA Driver Version   : 10010
CUDA Runtime Version  : 10000
cuBLAS Version        : 10000
cuFFT Version         : 10000
cuRAND Version        : 10000
cuSOLVER Version      : (10, 0, 0)
cuSPARSE Version      : 10000
NVRTC Version         : (10, 0)
cuDNN Build Version   : 7605
cuDNN Version         : 7600
NCCL Build Version    : 2406
NCCL Runtime Version  : 2604
Code to reproduce
import cupy as cp
X = cp.random.rand(100000000*40, dtype='float32')
X = X.reshape((100000000, 40), order='F')
B = 2 * cp.random.rand(30, 2, dtype='float32') - 1
X[:, 30:32] = cp.dot(X[:, :30], B)
Error messages, stack traces, or logs
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "cupy/core/core.pyx", line 1248, in cupy.core.core.ndarray.__setitem__
File "cupy/core/_routines_indexing.pyx", line 49, in cupy.core._routines_indexing._ndarray_setitem
File "cupy/core/_routines_indexing.pyx", line 810, in cupy.core._routines_indexing._scatter_op
File "cupy/core/_kernel.pyx", line 951, in cupy.core._kernel.ufunc.__call__
File "cupy/core/_kernel.pyx", line 974, in cupy.core._kernel.ufunc._get_ufunc_kernel
File "cupy/core/_kernel.pyx", line 714, in cupy.core._kernel._get_ufunc_kernel
File "cupy/core/_kernel.pyx", line 61, in cupy.core._kernel._get_simple_elementwise_kernel
File "cupy/core/carray.pxi", line 194, in cupy.core.core.compile_with_cache
File "/home/dgala/miniconda3/envs/cuml_try/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 287, in compile_with_cache
extra_source, backend)
File "/home/dgala/miniconda3/envs/cuml_try/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 335, in _compile_with_cache_cuda
mod.load(cubin)
File "cupy/cuda/function.pyx", line 197, in cupy.cuda.function.Module.load
File "cupy/cuda/function.pyx", line 199, in cupy.cuda.function.Module.load
File "cupy/cuda/driver.pyx", line 240, in cupy.cuda.driver.moduleLoadData
File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Issue Analytics: created 3 years ago · 6 comments (5 by maintainers)
Top GitHub Comments
The cause of the problem has been nearly identified. There seems to be a bug in the GEMM implementation of cuBLAS in CUDA 10.2 or older: when at least one of the input matrices has more than 2 giga elements and that matrix is transposed inside cuBLAS, the results become incorrect or a segmentation fault occurs.
This bug is fixed in CUDA 11.
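For reference, a minimal sketch of how one might flag an operand that matches the case described above, on the assumption that a large 2-D array that is not C-contiguous is the one CuPy hands to cuBLAS with a transpose op; the helper name and the dispatch condition are illustrative, not CuPy API:

import cupy as cp

def may_hit_cublas_gemm_transpose_bug(a, cuda_runtime_version=10000):
    # Hypothetical helper: returns True for a 2-D cupy.ndarray with more
    # than 2 giga elements on CUDA <= 10.2 whose layout (not C-contiguous)
    # is assumed to make CuPy pass it to cuBLAS with a transpose op.
    return (cuda_runtime_version <= 10020
            and a.ndim == 2
            and a.size > 2 * 1024 ** 3
            and not a.flags.c_contiguous)

# A small F-ordered array is below the size threshold, so it is not flagged:
small = cp.zeros((1000, 30), dtype='float32', order='F')
assert not may_hit_cublas_gemm_transpose_bug(small)
# X[:, :30] from the reproducer (3e9 float32 elements, F-contiguous)
# would be flagged on CUDA 10.0.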
You might work around this problem by transposing the matrices in CuPy before calling the cuBLAS GEMM, since the problem does not occur if the matrices are not transposed inside cuBLAS. However, this will increase memory usage.
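A minimal sketch of that workaround applied to the reproducer above, assuming that a C-contiguous copy of the large operand (made with `cp.ascontiguousarray`) is enough to keep cuBLAS from transposing it; the extra copy is where the additional memory usage comes from:

import cupy as cp

X = cp.random.rand(100000000 * 40, dtype='float32')
X = X.reshape((100000000, 40), order='F')
B = 2 * cp.random.rand(30, 2, dtype='float32') - 1

# Materialize the large F-ordered slice as a C-contiguous copy before the
# dot, so the GEMM is assumed to run on a non-transposed operand. The copy
# needs roughly 12 GB of extra GPU memory (3e9 float32 elements).
A = cp.ascontiguousarray(X[:, :30])
X[:, 30:32] = cp.dot(A, B)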
Let me close this as the issue is fixed in the latest CUDA.