CUDA tests fail when CUDA is available but not configured
I’m testing the build of the new release 3.1.1.
All tests accessing CUDA are failing. This is not entirely surprising in itself. My system has NVIDIA drivers available and a switchable NVIDIA card accessible via Bumblebee (primusrun), but I have not specifically configured the system to run CUDA, so hitting CUDA_ERROR_NO_DEVICE is expected. The NVIDIA card I have at hand is for experimentation, not routine operation; the main video card is Intel.
What’s the best way to handle this situation? How can a non-CUDA build be enforced when CUDA is otherwise “available”?
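For what it’s worth, since the failing allocations go through Numba, one approach is to gate GPU-backed tests on whether a usable device actually exists. The sketch below only illustrates that idea and is not mpi4py’s actual test code; `numba.cuda.is_available()` and the `NUMBA_DISABLE_CUDA` environment variable are genuine Numba features, while the test class is hypothetical.

```python
# Sketch (assumed layout, not mpi4py's real test suite): skip CUDA-backed
# tests when no usable device is present. numba.cuda.is_available() returns
# False instead of raising when the driver reports no device.
import os
import unittest

try:
    from numba import cuda
    HAVE_CUDA = (
        os.environ.get("NUMBA_DISABLE_CUDA", "0") == "0"
        and cuda.is_available()
    )
except ImportError:
    HAVE_CUDA = False

@unittest.skipUnless(HAVE_CUDA, "no usable CUDA device")
class TestDeviceArray(unittest.TestCase):
    def testAllocate(self):
        # Only reached when a CUDA context can actually be created.
        buf = cuda.device_array((4, 3), dtype="int32")
        self.assertEqual(buf.shape, (4, 3))
```

Exporting `NUMBA_DISABLE_CUDA=1` before running the tests would force the same skip even on machines where the driver is installed, which may be the simplest way to enforce a non-CUDA run.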
An example test log is:
ERROR: testAllgather (test_cco_buf.TestCCOBufInplaceSelf)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/projects/python/build/mpi4py/test/test_cco_buf.py", line 382, in testAllgather
    buf = array(-1, typecode, (size, count))
  File "/projects/python/build/mpi4py/test/arrayimpl.py", line 459, in __init__
    self.array = numba.cuda.device_array(shape, typecode)
  File "/usr/lib/python3/dist-packages/numba/cuda/cudadrv/devices.py", line 223, in _require_cuda_context
    with _runtime.ensure_context():
  File "/usr/lib/python3.9/contextlib.py", line 117, in __enter__
    return next(self.gen)
  File "/usr/lib/python3/dist-packages/numba/cuda/cudadrv/devices.py", line 121, in ensure_context
    with driver.get_active_context():
  File "/usr/lib/python3/dist-packages/numba/cuda/cudadrv/driver.py", line 393, in __enter__
    driver.cuCtxGetCurrent(byref(hctx))
  File "/usr/lib/python3/dist-packages/numba/cuda/cudadrv/driver.py", line 280, in __getattr__
    self.initialize()
  File "/usr/lib/python3/dist-packages/numba/cuda/cudadrv/driver.py", line 240, in initialize
    raise CudaSupportError("Error at driver init: \n%s:" % e)
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init:
[100] Call to cuInit results in CUDA_ERROR_NO_DEVICE:
-------------------- >> begin captured logging << --------------------
numba.cuda.cudadrv.driver: INFO: init
numba.cuda.cudadrv.driver: DEBUG: call driver api: cuInit
numba.cuda.cudadrv.driver: ERROR: Call to cuInit results in CUDA_ERROR_NO_DEVICE
--------------------- >> end captured logging << ---------------------
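The traceback shows the errors all originate in the test helper `arrayimpl.py`, which calls `numba.cuda.device_array` unconditionally. A hedged alternative (my sketch, not the project’s code) is to probe driver initialization once and turn a missing device into a clean skip rather than an ERROR per test:

```python
# Sketch: probe Numba's CUDA driver once; a failed cuInit becomes a skip
# condition instead of an error in every GPU-backed test case.
# CudaSupportError is the exact exception type seen in the traceback above.
from numba import cuda
from numba.cuda.cudadrv.error import CudaSupportError

def cuda_usable():
    try:
        cuda.current_context()   # forces cuInit, like device_array() does
        return True
    except CudaSupportError:     # e.g. CUDA_ERROR_NO_DEVICE on this machine
        return False
```

Each GPU-dependent case could then be decorated with `unittest.skipUnless(cuda_usable(), "no CUDA device")`, so the suite reports skips instead of errors on driver-only machines.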
It seems spawn trouble has been a long-running saga! I’ll deactivate those tests for now and check again later with future Open MPI releases.
@drew-parsons I guess you are using Open MPI, right? Dynamic process management has always been semi-broken there. I would suggest just disabling these tests if they are giving trouble or behave erratically. Hopefully, things will be much better in the upcoming Open MPI 5.x release, as the mpi4py tests are passing.
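A version-gated skip along those lines might look like the sketch below. This is my illustration rather than the project’s actual harness; `mpi4py.MPI.get_vendor()` is a real helper that reports the MPI implementation name and version tuple, while the test class is hypothetical.

```python
# Sketch: skip dynamic process management (spawn) tests on Open MPI < 5.x,
# where they have historically been unreliable. MPI.get_vendor() is real
# mpi4py API; the test class itself is hypothetical.
import unittest
from mpi4py import MPI

name, version = MPI.get_vendor()   # e.g. ('Open MPI', (4, 1, 2))
SPAWN_OK = not (name == "Open MPI" and version < (5, 0, 0))

@unittest.skipUnless(SPAWN_OK, "spawn is unreliable on this Open MPI")
class TestSpawn(unittest.TestCase):
    def testSpawnAttr(self):
        # A real test would call MPI.COMM_SELF.Spawn(...); here we only
        # check the entry point exists so the sketch stays self-contained.
        self.assertTrue(hasattr(MPI.COMM_SELF, "Spawn"))
```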