cupy + numba cuda error: [304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM
Reporting a bug
- I have tried using the latest released version of Numba (the most recent is listed in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG). I’m using the latest available via conda, 0.50.1.
- I have included below a minimal working reproducer (if you are unsure how to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).
I’m trying to use cupy and numba.cuda in the same application, but I’m running into an error that seems to depend on the order in which cupy and numba.cuda are first used. The application runs inside a container based on the nvidia/cuda:10.2-devel-ubuntu16.04 image. I haven’t run into this issue outside of a container environment, so I suspect it’s related to finding the right CUDA libraries from inside the container, but it seems odd that the ordering matters.
Example:
# numba before cupy seems to work fine
python -c "from numba import cuda; cuda.to_device(range(10)); import cupy; cupy.arange(10);"
# cupy before numba results in error
python -c "import cupy; cupy.arange(10); from numba import cuda; cuda.to_device(range(10));"
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 232, in initialize
self.cuInit(0)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 295, in safe_cuda_api_call
self._check_error(fname, retcode)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 330, in _check_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 223, in _require_cuda_context
with _runtime.ensure_context():
File "/opt/anaconda3/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 121, in ensure_context
with driver.get_active_context():
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 388, in __enter__
driver.cuCtxGetCurrent(byref(hctx))
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 275, in __getattr__
self.initialize()
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 235, in initialize
raise CudaSupportError("Error at driver init: \n%s:" % e)
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init:
[304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM:
Output from numba -s:
System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time) : 2020-08-13 16:54:44.763828
UTC start time : 2020-08-13 16:54:44.763831
Running time (s) : 0.57784
__Hardware Information__
Machine : x86_64
CPU Name : skylake-avx512
CPU Count : 80
Number of accessible CPUs : 5
List of accessible CPUs cores : 0-4,40-44
CFS Restrictions (CPUs worth of runtime) : None
CPU Features : 64bit adx aes avx avx2 avx512bw
avx512cd avx512dq avx512f avx512vl
bmi bmi2 clflushopt clwb cmov cx16
cx8 f16c fma fsgsbase fxsr invpcid
lzcnt mmx movbe mpx pclmul pku
popcnt prfchw rdrnd rdseed rtm
sahf sse sse2 sse3 sse4.1 sse4.2
ssse3 xsave xsavec xsaveopt xsaves
Memory Total (MB) : 384849
Memory Available (MB) : 346182
__OS Information__
Platform Name : Linux-4.12.14-lp150.12.82-default-x86_64-with-glibc2.10
Platform Release : 4.12.14-lp150.12.82-default
OS Name : Linux
OS Version : #1 SMP Tue Nov 12 16:32:38 UTC 2019 (c939e24)
OS Specific Version : ?
Libc Version : glibc 2.23
__Python Information__
Python Compiler : GCC 7.3.0
Python Implementation : CPython
Python Version : 3.8.3
Python Locale : en_US.UTF-8
__LLVM Information__
LLVM Version : 9.0.1
__CUDA Information__
CUDA Device Initialized : True
CUDA Driver Version : 10020
CUDA Detect Output:
Found 1 CUDA devices
id 0 b'Tesla V100-SXM2-16GB' [SUPPORTED]
compute capability: 7.0
pci device id: 0
pci bus id: 26
Summary:
1/1 devices are supported
CUDA Librairies Test Output:
Finding cublas from System
named libcublas.so
trying to open library... ok
Finding cusparse from System
named libcusparse.so.10.3.1.89
trying to open library... ok
Finding cufft from System
named libcufft.so.10.1.2.89
trying to open library... ok
Finding curand from System
named libcurand.so.10.1.2.89
trying to open library... ok
Finding nvvm from System
named libnvvm.so.3.3.0
trying to open library... ok
Finding libdevice from System
searching for compute_20... ok
searching for compute_30... ok
searching for compute_35... ok
searching for compute_50... ok
__ROC information__
ROC Available : False
ROC Toolchains : None
HSA Agents Count : 0
HSA Agents:
None
HSA Discrete GPUs Count : 0
HSA Discrete GPUs : None
__SVML Information__
SVML State, config.USING_SVML : False
SVML Library Loaded : False
llvmlite Using SVML Patched LLVM : True
SVML Operational : False
__Threading Layer Information__
TBB Threading Layer Available : True
+-->TBB imported successfully.
OpenMP Threading Layer Available : True
+-->Vendor: GNU
Workqueue Threading Layer Available : True
+-->Workqueue imported successfully.
__Numba Environment Variable Information__
None found.
__Conda Information__
Conda Build : not installed
Conda Env : 4.8.4
Conda Platform : linux-64
Conda Python Version : 3.8.3.final.0
Conda Root Writable : False
__Installed Packages__
_libgcc_mutex 0.1 main
blas 1.0 mkl
ca-certificates 2020.6.24 0
certifi 2020.6.20 py38_0
cffi 1.14.0 py38he30daa8_1
chardet 3.0.4 py38_1003
conda 4.8.4 py38_0
conda-package-handling 1.6.1 py38h7b6447c_0
cryptography 2.9.2 py38h1ba5d50_0
cupy-cuda102 7.7.0 pypi_0 pypi
fastrlock 0.5 pypi_0 pypi
idna 2.9 py_1
intel-openmp 2020.1 217
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20181209 hc058e9b_0
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libllvm9 9.0.1 h4a3c616_1
libstdcxx-ng 9.1.0 hdf63c60_0
llvmlite 0.33.0 py38hc6ec683_1
mkl 2020.1 217
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.1.0 py38h23d657b_0
mkl_random 1.1.1 py38h0573a6f_0
ncurses 6.2 he6710b0_1
numba 0.50.1 py38h0573a6f_1
numpy 1.19.1 py38hbc911f0_0
numpy-base 1.19.1 py38hfa32c7d_0
openssl 1.1.1g h7b6447c_0
pip 20.2.1 py38_0
pycosat 0.6.3 py38h7b6447c_1
pycparser 2.20 py_0
pyopenssl 19.1.0 py38_0
pysocks 1.7.1 py38_0
python 3.8.3 hcff3b4d_0
readline 8.0 h7b6447c_0
requests 2.23.0 py38_0
ruamel_yaml 0.15.87 py38h7b6447c_0
setuptools 46.4.0 py38_0
six 1.14.0 py38_0
sqlite 3.31.1 h62c20be_1
tbb 2020.0 hfd86e86_0
tk 8.6.8 hbc83047_0
tqdm 4.46.0 py_0
urllib3 1.25.8 py38_0
wheel 0.34.2 py38_0
xz 5.2.5 h7b6447c_0
yaml 0.1.7 had09818_2
zlib 1.2.11 h7b6447c_3
No errors reported.
__Warning log__
Warning (roc): Error initialising ROC: No ROC toolchains found.
Warning (roc): No HSA Agents found, encountered exception when searching: Error at driver init:
NUMBA_HSA_DRIVER /opt/rocm/lib/libhsa-runtime64.so is not a valid file path. Note it must be a filepath of the .so/.dll/.dylib or the driver:
Warning (psutil): psutil cannot be imported. For more accuracy, consider installing it.
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_period_us
--------------------------------------------------------------------------------
Output from nvidia-smi:
Thu Aug 13 16:56:16 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01 Driver Version: 440.95.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... Off | 00000000:1A:00.0 Off | 0 |
| N/A 32C P0 37W / 300W | 0MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
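Since I suspect the container is picking up the wrong CUDA libraries, a rough way to check how many copies of the driver library are actually visible inside the container is something like the following (illustrative commands only; paths will vary by image):
# List every libcuda entry the dynamic loader knows about inside the container.
ldconfig -p | grep libcuda
# Look for stray copies of the driver library and show whether each one is a
# real file or a symlink (errors from unreadable paths are discarded).
find / -name 'libcuda.so*' -exec ls -l {} + 2>/dev/null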
An update: we tracked this down to an issue with our custom container implementation, Shifter, which adapts Docker containers for an HPC system.
The problem was that the CUDA drivers were mounted into the Shifter container by a script that copied the files rather than preserving symlinks. This resulted in 3 different driver files being mounted:
You can compare this to what nvidia-docker does for the same container. It mounts a single driver and two symlinks:
We verified that changing our Shifter driver mounting script to use rsync rather than cp fixes the issue reported here. The driver versions were the same, but it seems that the multiple driver files were causing the problem.
Yes, I think so -- please close. Thank you for your help.