cupy + numba cuda error: [304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM
Reporting a bug
- I have tried using the latest released version of Numba (the most recent is listed in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG). I’m using the latest available via conda, 0.50.1.
- I have included below a minimal working reproducer (if you are unsure how to write one see http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).
I’m trying to use cupy and numba.cuda in the same application, but I’m running into an error that seems to depend on the order in which cupy and numba.cuda are first used. The application runs inside a container based on the nvidia/cuda:10.2-devel-ubuntu16.04 image. I haven’t run into this issue outside of a container environment, so I suspect it’s related to finding the right CUDA libraries from inside the container, but it seems odd that the ordering matters.
Example:
# numba before cupy seems to work fine
python -c "from numba import cuda; cuda.to_device(range(10)); import cupy; cupy.arange(10);"
# cupy before numba results in error
python -c "import cupy; cupy.arange(10); from numba import cuda; cuda.to_device(range(10));"
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 232, in initialize
self.cuInit(0)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 295, in safe_cuda_api_call
self._check_error(fname, retcode)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 330, in _check_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 223, in _require_cuda_context
with _runtime.ensure_context():
File "/opt/anaconda3/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/devices.py", line 121, in ensure_context
with driver.get_active_context():
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 388, in __enter__
driver.cuCtxGetCurrent(byref(hctx))
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 275, in __getattr__
self.initialize()
File "/opt/anaconda3/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 235, in initialize
raise CudaSupportError("Error at driver init: \n%s:" % e)
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init:
[304] Call to cuInit results in CUDA_ERROR_OPERATING_SYSTEM:
Output from numba -s:
System info:
--------------------------------------------------------------------------------
__Time Stamp__
Report started (local time) : 2020-08-13 16:54:44.763828
UTC start time : 2020-08-13 16:54:44.763831
Running time (s) : 0.57784
__Hardware Information__
Machine : x86_64
CPU Name : skylake-avx512
CPU Count : 80
Number of accessible CPUs : 5
List of accessible CPUs cores : 0-4,40-44
CFS Restrictions (CPUs worth of runtime) : None
CPU Features : 64bit adx aes avx avx2 avx512bw
avx512cd avx512dq avx512f avx512vl
bmi bmi2 clflushopt clwb cmov cx16
cx8 f16c fma fsgsbase fxsr invpcid
lzcnt mmx movbe mpx pclmul pku
popcnt prfchw rdrnd rdseed rtm
sahf sse sse2 sse3 sse4.1 sse4.2
ssse3 xsave xsavec xsaveopt xsaves
Memory Total (MB) : 384849
Memory Available (MB) : 346182
__OS Information__
Platform Name : Linux-4.12.14-lp150.12.82-default-x86_64-with-glibc2.10
Platform Release : 4.12.14-lp150.12.82-default
OS Name : Linux
OS Version : #1 SMP Tue Nov 12 16:32:38 UTC 2019 (c939e24)
OS Specific Version : ?
Libc Version : glibc 2.23
__Python Information__
Python Compiler : GCC 7.3.0
Python Implementation : CPython
Python Version : 3.8.3
Python Locale : en_US.UTF-8
__LLVM Information__
LLVM Version : 9.0.1
__CUDA Information__
CUDA Device Initialized : True
CUDA Driver Version : 10020
CUDA Detect Output:
Found 1 CUDA devices
id 0 b'Tesla V100-SXM2-16GB' [SUPPORTED]
compute capability: 7.0
pci device id: 0
pci bus id: 26
Summary:
1/1 devices are supported
CUDA Librairies Test Output:
Finding cublas from System
named libcublas.so
trying to open library... ok
Finding cusparse from System
named libcusparse.so.10.3.1.89
trying to open library... ok
Finding cufft from System
named libcufft.so.10.1.2.89
trying to open library... ok
Finding curand from System
named libcurand.so.10.1.2.89
trying to open library... ok
Finding nvvm from System
named libnvvm.so.3.3.0
trying to open library... ok
Finding libdevice from System
searching for compute_20... ok
searching for compute_30... ok
searching for compute_35... ok
searching for compute_50... ok
__ROC information__
ROC Available : False
ROC Toolchains : None
HSA Agents Count : 0
HSA Agents:
None
HSA Discrete GPUs Count : 0
HSA Discrete GPUs : None
__SVML Information__
SVML State, config.USING_SVML : False
SVML Library Loaded : False
llvmlite Using SVML Patched LLVM : True
SVML Operational : False
__Threading Layer Information__
TBB Threading Layer Available : True
+-->TBB imported successfully.
OpenMP Threading Layer Available : True
+-->Vendor: GNU
Workqueue Threading Layer Available : True
+-->Workqueue imported successfully.
__Numba Environment Variable Information__
None found.
__Conda Information__
Conda Build : not installed
Conda Env : 4.8.4
Conda Platform : linux-64
Conda Python Version : 3.8.3.final.0
Conda Root Writable : False
__Installed Packages__
_libgcc_mutex 0.1 main
blas 1.0 mkl
ca-certificates 2020.6.24 0
certifi 2020.6.20 py38_0
cffi 1.14.0 py38he30daa8_1
chardet 3.0.4 py38_1003
conda 4.8.4 py38_0
conda-package-handling 1.6.1 py38h7b6447c_0
cryptography 2.9.2 py38h1ba5d50_0
cupy-cuda102 7.7.0 pypi_0 pypi
fastrlock 0.5 pypi_0 pypi
idna 2.9 py_1
intel-openmp 2020.1 217
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20181209 hc058e9b_0
libffi 3.3 he6710b0_1
libgcc-ng 9.1.0 hdf63c60_0
libllvm9 9.0.1 h4a3c616_1
libstdcxx-ng 9.1.0 hdf63c60_0
llvmlite 0.33.0 py38hc6ec683_1
mkl 2020.1 217
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.1.0 py38h23d657b_0
mkl_random 1.1.1 py38h0573a6f_0
ncurses 6.2 he6710b0_1
numba 0.50.1 py38h0573a6f_1
numpy 1.19.1 py38hbc911f0_0
numpy-base 1.19.1 py38hfa32c7d_0
openssl 1.1.1g h7b6447c_0
pip 20.2.1 py38_0
pycosat 0.6.3 py38h7b6447c_1
pycparser 2.20 py_0
pyopenssl 19.1.0 py38_0
pysocks 1.7.1 py38_0
python 3.8.3 hcff3b4d_0
readline 8.0 h7b6447c_0
requests 2.23.0 py38_0
ruamel_yaml 0.15.87 py38h7b6447c_0
setuptools 46.4.0 py38_0
six 1.14.0 py38_0
sqlite 3.31.1 h62c20be_1
tbb 2020.0 hfd86e86_0
tk 8.6.8 hbc83047_0
tqdm 4.46.0 py_0
urllib3 1.25.8 py38_0
wheel 0.34.2 py38_0
xz 5.2.5 h7b6447c_0
yaml 0.1.7 had09818_2
zlib 1.2.11 h7b6447c_3
No errors reported.
__Warning log__
Warning (roc): Error initialising ROC: No ROC toolchains found.
Warning (roc): No HSA Agents found, encountered exception when searching: Error at driver init:
NUMBA_HSA_DRIVER /opt/rocm/lib/libhsa-runtime64.so is not a valid file path. Note it must be a filepath of the .so/.dll/.dylib or the driver:
Warning (psutil): psutil cannot be imported. For more accuracy, consider installing it.
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_quota_us
Warning (no file): /sys/fs/cgroup/cpuacct/cpu.cfs_period_us
--------------------------------------------------------------------------------
Output from nvidia-smi:
Thu Aug 13 16:56:16 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01 Driver Version: 440.95.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... Off | 00000000:1A:00.0 Off | 0 |
| N/A 32C P0 37W / 300W | 0MiB / 16160MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
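Since I suspect the container is picking up the wrong CUDA libraries, a rough way to check how many copies of the driver library are actually visible inside the container is something like the following (illustrative commands only; paths will vary by image):
# List every libcuda entry the dynamic loader knows about inside the container.
ldconfig -p | grep libcuda
# Look for stray copies of the driver library and show whether each one is a
# real file or a symlink (errors from unreadable paths are discarded).
find / -name 'libcuda.so*' -exec ls -l {} + 2>/dev/null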
An update: we tracked this down to an issue with our custom container implementation, Shifter, which adapts Docker containers for an HPC system.
The problem was that the CUDA drivers were mounted into the Shifter container by a script that copied the files rather than preserving symlinks. This resulted in 3 different driver files being mounted:
You can compare this to what nvidia-docker does for the same container. It mounts a single driver and two symlinks:
We verified that changing our Shifter driver mounting script to use rsync rather than cp fixes the issue reported here. The driver versions were the same, but it seems that the multiple driver files were causing the problem.
Yes, I think so -- please close. Thank you for your help.