no CUDA-capable device is detected

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): docker Ubuntu 16.04 image
  • Ray installed from (source or binary): pip
  • Ray version: 0.5.3
  • Python version: Python 3.5.6 :: Anaconda, Inc.
  • Exact command to reproduce:

Describe the problem

I am trying to set up an RLlib PPO agent with husky_env from Gibson Env. The script I ran can be found here.

I am getting the following error when calling agent.train():

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected

Gibson renders the environment when the environment is created, and the RLlib agent seems to invoke env_creator every time train() is called. I originally thought that was the issue, but I no longer think that is the case. I also tried using gpu_fraction, which didn't work. I am not sure what is causing the problem.

nvidia-smi

root@e6b154065e88:~/mount/gibson/examples/train# nvidia-smi
Wed Nov  7 09:59:00 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.73       Driver Version: 410.73       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 00000000:04:00.0  On |                  N/A |
| 22%   42C    P8    20W / 250W |   2385MiB / 12198MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

torch.cuda.device_count()

root@e6b154065e88:~# python -c "import torch
print(torch.cuda.device_count())
print(torch.cuda.current_device())"
1
0
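The check above runs in the interactive shell, so it may not reflect what a Ray worker process sees at the moment the environment is created. A minimal sketch of a diagnostic that could be called from inside the env creator follows; the helper name `describe_cuda_env` is hypothetical and not part of Ray, Gibson, or PyTorch.

```python
import os


def describe_cuda_env():
    """Report what the current process can see of the GPU setup.

    Hypothetical helper for debugging; call it from the process that fails.
    """
    info = {
        "pid": os.getpid(),
        # None means the variable is absent; "" means it is set but empty,
        # which hides every GPU from the CUDA runtime.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    try:
        import torch  # imported lazily so the helper also works without PyTorch

        info["torch_device_count"] = torch.cuda.device_count()
    except ImportError:
        info["torch_device_count"] = None
    return info


if __name__ == "__main__":
    print(describe_cuda_env())
```

Comparing the output from the shell with the output printed inside the worker should show whether the two processes disagree about the visible devices.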

nvcc --version

root@e6b154065e88:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

To Reproduce

Get Nvidia-Docker2

https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)

# Ubuntu installation
sudo apt-get install nvidia-docker2
sudo pkill -SIGHUP dockerd

Download Gibson’s dataset

wget https://storage.googleapis.com/gibsonassets/dataset.tar.gz
tar -zxf dataset.tar.gz

Pull Gibson’s image

docker pull xf1280/gibson:0.3.1

Run it in Docker

Replace <dataset-absolute-path> with the absolute path to the Gibson dataset you’ve unzipped on your local machine.

docker run --runtime=nvidia -ti --name gibson -v <dataset-absolute-path>:/root/mount/gibson/gibson/assets/dataset -p 5001:5001 xf1280/gibson:0.3.1

Add in the ray_husky.py script

Copy the ray_husky.py found here to the ~/mount/gibson/examples/train/ directory in the Docker container.

Run: python ray_husky.py

Full Log

root@e6b154065e88:~/mount/gibson/examples/train# python test.py
Unexpected end of /proc/mounts line `overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/4IFU7EUC3V2BOPDL2NFLW6T7BY:/var/lib/docker/overlay2/l/3GWVT6ULAU6NJP6MLTBNN56WBQ:/var/lib/docker/overlay2/l/CLLJDJFTZ2FMCKCN6B3WMCSXKG:/var/lib/docker/overlay2/l/QCO5RAE5DXB7MGGYLTK3YULY2O:/var/lib/docker/overlay2/l/NFJ7MEC3G7XLHLZMZWKKHLIM5Y:/var/lib/docker/overlay2/l/3LGFVLYHAWSN7GNAOYGCWVQK3Y:/var/lib/docker/overlay2/l/Q2BQDGXUX3SFP3RQYQDXOPWPSD:/var/lib/docker/overlay2/l/O5I6APSGOJZV4RFU7EOXVT5BWD:/var/lib/docker/overlay2/l/E4DOAELV7FPI6'
Unexpected end of /proc/mounts line `7XTB5ASEF7ESL:/var/lib/docker/overlay2/l/4BPII7VWNXTHZDYHMZQQ47WVGK:/var/lib/docker/overlay2/l/5RZ3I4FBOEGIAACNUMNPNJIIMM:/var/lib/docker/overlay2/l/JUDMTQV6ZO3CYJ64OCHUEOIDS4:/var/lib/docker/overlay2/l/WXFZP4STEX7JZ5S5VQCQR2MTDB:/var/lib/docker/overlay2/l/MUODDE6AS2PD6QOD6BXFE5JWN4:/var/lib/docker/overlay2/l/NV2EHBVA5EICRKTEGR3F4NADEC:/var/lib/docker/overlay2/l/MZVP7SBXRC7X7IKJKYHYQK6YOK:/var/lib/docker/overlay2/l/SVE4WWKXOSQOO2O3QQDMHW5TVB:/var/lib/docker/overlay2/l/NDRFI4BJ3ZGXEYSVAABQB6Z2OQ:/var/lib/do'
Unexpected end of /proc/mounts line `cker/overlay2/l/YTU432I3FDCY7GE4NT5VVR47GN:/var/lib/docker/overlay2/l/VCTBKUJHFQQQTCZRSPPZQKDIDZ:/var/lib/docker/overlay2/l/TR4DD4VR545GC7WIKUS5UDNRSM:/var/lib/docker/overlay2/l/BFRVMK6XAWSUK4JFRBYEOWQA4B:/var/lib/docker/overlay2/l/DLRGX3CDMNWDK66CSZZNXMTRTP:/var/lib/docker/overlay2/l/IPOZCPD7GVR3P3ECGOTQWPJ737:/var/lib/docker/overlay2/l/X6WEEMZQY3LGKMQELCNCCWVVHH:/var/lib/docker/overlay2/l/7APKFGZZGMNJ7BXSRL7A3WFVI6:/var/lib/docker/overlay2/l/PE6OSOUQSWBVJMTELFCNCFEG7X:/var/lib/docker/overlay2/l/FHHGDNFDT'
Unexpected end of /proc/mounts line `A32ESWYKQJTKH77LR:/var/lib/docker/overlay2/l/VEP2IVXB7LSMARPAJOF2SGEWTA:/var/lib/docker/overlay2/l/EAPK6KKCRU7YHHL6QVKDLQKSAH:/var/lib/docker/overlay2/l/5SZECZZ64ECDDARDWCQ2QOH2PY:/var/lib/docker/overlay2/l/XAL23ADNRDHSDATFJJSD3HA5T2:/var/lib/docker/overlay2/l/V7MN4H5N26LKKYRY4JGORHE4PI:/var/lib/docker/overlay2/l/3E3ILIVYCBQ52OYJLKCSZXAYPD:/var/lib/docker/overlay2/l/B4GW3N34A6DMEUWEO24TKYCJIW:/var/lib/docker/overlay2/l/XM3K5GW7VB5HRODVU7CTK5HUGD:/var/lib/docker/overlay2/l/7QHY2DH3GUNNMTOYULZIOK6F6O:/var/li'
pybullet build time: Sep 27 2018 00:17:23
pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
Process STDOUT and STDERR is being redirected to /tmp/raylogs/.
Waiting for redis server at 127.0.0.1:46828 to respond...
Waiting for redis server at 127.0.0.1:15517 to respond...
Warning: Reducing object store memory because /dev/shm has only 67104768 bytes available. You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you may need to pass an argument with the flag '--shm-size' to 'docker run'.
Starting the Plasma object store with 0.00 GB memory.
Starting local scheduler with the following resources: {'CPU': 32, 'GPU': 1}.
Failed to start the UI, you may need to run 'pip install jupyter'.
Created LogSyncer for /root/ray_results/PPO_test_2018-11-07_09-49-37kxrhxuku -> None
/root/mount/gibson/examples/train/../configs/husky_navigate_rgb_train.yaml
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
Processing the data:
Total 1 scenes 0 train 1 test
Indexing
  0%|                                                                                                                                                                                 | 0/1 [00:00<?, ?it/s]number of devices found 1
Loaded EGL 1.5 after reload.
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce GTX TITAN X/PCIe/SSE2
GL_VERSION=4.6.0 NVIDIA 410.73
GL_SHADING_LANGUAGE_VERSION=4.60 NVIDIA
finish loading shaders
100%|#########################################################################################################################################################################| 1/1 [00:00<00:00,  1.99it/s]
  9%|###############7                                                                                                                                                      | 18/190 [00:01<02:14,  1.28it/s]terminate called after throwing an instance of 'zmq::error_t'
  what():  Address already in use
100%|#####################################################################################################################################################################| 190/190 [00:12<00:00, 16.75it/s]
/root/mount/gibson/gibson/core/render/pcrender.py:204: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.imgv = Variable(torch.zeros(1, 3 , self.showsz, self.showsz), volatile = True).cuda()
/root/mount/gibson/gibson/core/render/pcrender.py:205: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  self.maskv = Variable(torch.zeros(1,2, self.showsz, self.showsz), volatile = True).cuda()
Episode: steps:0 score:0
Episode count: 0
/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
Episode: steps:0 score:0
Episode count: 1
LocalMultiGPUOptimizer devices ['/gpu:0']
Unexpected end of /proc/mounts line `overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/4IFU7EUC3V2BOPDL2NFLW6T7BY:/var/lib/docker/overlay2/l/3GWVT6ULAU6NJP6MLTBNN56WBQ:/var/lib/docker/overlay2/l/CLLJDJFTZ2FMCKCN6B3WMCSXKG:/var/lib/docker/overlay2/l/QCO5RAE5DXB7MGGYLTK3YULY2O:/var/lib/docker/overlay2/l/NFJ7MEC3G7XLHLZMZWKKHLIM5Y:/var/lib/docker/overlay2/l/3LGFVLYHAWSN7GNAOYGCWVQK3Y:/var/lib/docker/overlay2/l/Q2BQDGXUX3SFP3RQYQDXOPWPSD:/var/lib/docker/overlay2/l/O5I6APSGOJZV4RFU7EOXVT5BWD:/var/lib/docker/overlay2/l/E4DOAELV7FPI6'
Unexpected end of /proc/mounts line `7XTB5ASEF7ESL:/var/lib/docker/overlay2/l/4BPII7VWNXTHZDYHMZQQ47WVGK:/var/lib/docker/overlay2/l/5RZ3I4FBOEGIAACNUMNPNJIIMM:/var/lib/docker/overlay2/l/JUDMTQV6ZO3CYJ64OCHUEOIDS4:/var/lib/docker/overlay2/l/WXFZP4STEX7JZ5S5VQCQR2MTDB:/var/lib/docker/overlay2/l/MUODDE6AS2PD6QOD6BXFE5JWN4:/var/lib/docker/overlay2/l/NV2EHBVA5EICRKTEGR3F4NADEC:/var/lib/docker/overlay2/l/MZVP7SBXRC7X7IKJKYHYQK6YOK:/var/lib/docker/overlay2/l/SVE4WWKXOSQOO2O3QQDMHW5TVB:/var/lib/docker/overlay2/l/NDRFI4BJ3ZGXEYSVAABQB6Z2OQ:/var/lib/do'
Unexpected end of /proc/mounts line `cker/overlay2/l/YTU432I3FDCY7GE4NT5VVR47GN:/var/lib/docker/overlay2/l/VCTBKUJHFQQQTCZRSPPZQKDIDZ:/var/lib/docker/overlay2/l/TR4DD4VR545GC7WIKUS5UDNRSM:/var/lib/docker/overlay2/l/BFRVMK6XAWSUK4JFRBYEOWQA4B:/var/lib/docker/overlay2/l/DLRGX3CDMNWDK66CSZZNXMTRTP:/var/lib/docker/overlay2/l/IPOZCPD7GVR3P3ECGOTQWPJ737:/var/lib/docker/overlay2/l/X6WEEMZQY3LGKMQELCNCCWVVHH:/var/lib/docker/overlay2/l/7APKFGZZGMNJ7BXSRL7A3WFVI6:/var/lib/docker/overlay2/l/PE6OSOUQSWBVJMTELFCNCFEG7X:/var/lib/docker/overlay2/l/FHHGDNFDT'
Unexpected end of /proc/mounts line `A32ESWYKQJTKH77LR:/var/lib/docker/overlay2/l/VEP2IVXB7LSMARPAJOF2SGEWTA:/var/lib/docker/overlay2/l/EAPK6KKCRU7YHHL6QVKDLQKSAH:/var/lib/docker/overlay2/l/5SZECZZ64ECDDARDWCQ2QOH2PY:/var/lib/docker/overlay2/l/XAL23ADNRDHSDATFJJSD3HA5T2:/var/lib/docker/overlay2/l/V7MN4H5N26LKKYRY4JGORHE4PI:/var/lib/docker/overlay2/l/3E3ILIVYCBQ52OYJLKCSZXAYPD:/var/lib/docker/overlay2/l/B4GW3N34A6DMEUWEO24TKYCJIW:/var/lib/docker/overlay2/l/XM3K5GW7VB5HRODVU7CTK5HUGD:/var/lib/docker/overlay2/l/7QHY2DH3GUNNMTOYULZIOK6F6O:/var/li'
pybullet build time: Sep 27 2018 00:17:23
pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
/root/mount/gibson/examples/train/../configs/husky_navigate_rgb_train.yaml
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
Processing the data:
Total 1 scenes 0 train 1 test
Indexing
  0%|                                                                                                                                                                                 | 0/1 [00:00<?, ?it/s]number of devices found 1
Loaded EGL 1.5 after reload.
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce GTX TITAN X/PCIe/SSE2
GL_VERSION=4.6.0 NVIDIA 410.73
GL_SHADING_LANGUAGE_VERSION=4.60 NVIDIA
finish loading shaders
100%|#########################################################################################################################################################################| 1/1 [00:00<00:00,  1.74it/s]
 11%|#################4                                                                                                                                                    | 20/190 [00:02<00:47,  3.56it/s]terminate called after throwing an instance of 'zmq::error_t'
  what():  Address already in use
100%|#####################################################################################################################################################################| 190/190 [00:12<00:00, 16.88it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected
Remote function __init__ failed with:

Traceback (most recent call last):
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 945, in _process_task
    *arguments)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/actor.py", line 261, in actor_method_executor
    method_returns = method(actor, *args)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 178, in __init__
    self.env = env_creator(env_context)
  File "w.py", line 36, in <lambda>
    register_env(env_name, lambda _ : getGibsonEnv())
  File "w.py", line 29, in getGibsonEnv
    config=config_file)
  File "/root/mount/gibson/gibson/envs/husky_env.py", line 40, in __init__
    self.robot_introduce(Husky(self.config, env=self))
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 349, in robot_introduce
    self.setup_rendering_camera()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 376, in setup_rendering_camera
    self.setup_camera_pc()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 636, in setup_camera_pc
    env = self)
  File "/root/mount/gibson/gibson/core/render/pcrender.py", line 172, in __init__
    comp = torch.nn.DataParallel(comp).cuda()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:74

Remote function set_global_vars failed with:

Traceback (most recent call last):
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 923, in _process_task
    self.reraise_actor_init_error()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 267, in reraise_actor_init_error
    raise self.actor_init_error
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 945, in _process_task
    *arguments)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/actor.py", line 261, in actor_method_executor
    method_returns = method(actor, *args)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 178, in __init__
    self.env = env_creator(env_context)
  File "w.py", line 36, in <lambda>
    register_env(env_name, lambda _ : getGibsonEnv())
  File "w.py", line 29, in getGibsonEnv
    config=config_file)
  File "/root/mount/gibson/gibson/envs/husky_env.py", line 40, in __init__
    self.robot_introduce(Husky(self.config, env=self))
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 349, in robot_introduce
    self.setup_rendering_camera()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 376, in setup_rendering_camera
    self.setup_camera_pc()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 636, in setup_camera_pc
    env = self)
  File "/root/mount/gibson/gibson/core/render/pcrender.py", line 172, in __init__
    comp = torch.nn.DataParallel(comp).cuda()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:74

killing <subprocess.Popen object at 0x7f97880d22b0>
   File "w.py", line 68, in <module>
   File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/agents/agent.py", line 233, in train
   File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/utils/filter_manager.py", line 25, in synchronize
Remote function get_filters failed with:

Traceback (most recent call last):
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 923, in _process_task
    self.reraise_actor_init_error()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 267, in reraise_actor_init_error
    raise self.actor_init_error
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 923, in _process_task
    self.reraise_actor_init_error()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 267, in reraise_actor_init_error
    raise self.actor_init_error
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 945, in _process_task
    *arguments)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/actor.py", line 261, in actor_method_executor
    method_returns = method(actor, *args)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 178, in __init__
    self.env = env_creator(env_context)
  File "w.py", line 36, in <lambda>
    register_env(env_name, lambda _ : getGibsonEnv())
  File "w.py", line 29, in getGibsonEnv
    config=config_file)
  File "/root/mount/gibson/gibson/envs/husky_env.py", line 40, in __init__
    self.robot_introduce(Husky(self.config, env=self))
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 349, in robot_introduce
    self.setup_rendering_camera()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 376, in setup_rendering_camera
    self.setup_camera_pc()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 636, in setup_camera_pc
    env = self)
  File "/root/mount/gibson/gibson/core/render/pcrender.py", line 172, in __init__
    comp = torch.nn.DataParallel(comp).cuda()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:74
   File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 2514, in get

 RayGetError: Could not get objectid ObjectID(4a7d420ef7de86cb813dcb59e2ebc4ece375f9d7). It was created by remote function get_filters which failed with:

Remote function get_filters failed with:

Traceback (most recent call last):
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 923, in _process_task
    self.reraise_actor_init_error()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 267, in reraise_actor_init_error
    raise self.actor_init_error
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 923, in _process_task
    self.reraise_actor_init_error()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 267, in reraise_actor_init_error
    raise self.actor_init_error
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/worker.py", line 945, in _process_task
    *arguments)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/actor.py", line 261, in actor_method_executor
    method_returns = method(actor, *args)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 178, in __init__
    self.env = env_creator(env_context)
  File "w.py", line 36, in <lambda>
    register_env(env_name, lambda _ : getGibsonEnv())
  File "w.py", line 29, in getGibsonEnv
    config=config_file)
  File "/root/mount/gibson/gibson/envs/husky_env.py", line 40, in __init__
    self.robot_introduce(Husky(self.config, env=self))
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 349, in robot_introduce
    self.setup_rendering_camera()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 376, in setup_rendering_camera
    self.setup_camera_pc()
  File "/root/mount/gibson/gibson/envs/env_modalities.py", line 636, in setup_camera_pc
    env = self)
  File "/root/mount/gibson/gibson/core/render/pcrender.py", line 172, in __init__
    comp = torch.nn.DataParallel(comp).cuda()
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/miniconda/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:74

I1107 09:50:22.214844  9899 local_scheduler.cc:178] Killed worker pid 13341 which hadn't started yet.


Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 25 (18 by maintainers)

Top GitHub Comments

7 reactions
bmazoure commented, Jul 10, 2019

Yes, so the issue was that CUDA_VISIBLE_DEVICES was being unset from the environment (somehow). Putting os.environ['CUDA_VISIBLE_DEVICES'] = '0' fixed the issue. Thanks everyone!
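The workaround described in this comment can be sketched as follows. This is an illustration under assumptions, not the commenter's actual code: the helper name `ensure_cuda_visible` is hypothetical, and the device index "0" is taken from the comment and matches the single GPU shown in the nvidia-smi output.

```python
import os


def ensure_cuda_visible(default_devices="0"):
    """Restore CUDA_VISIBLE_DEVICES if it is missing or empty.

    Hypothetical helper. It must run before the first CUDA call in the
    process, because the CUDA runtime reads the variable at initialization.
    """
    value = os.environ.get("CUDA_VISIBLE_DEVICES")
    if not value:  # covers both "not set" and "set to an empty string"
        os.environ["CUDA_VISIBLE_DEVICES"] = default_devices
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Assumed usage, based on the names in the traceback above:
#
#     def getGibsonEnv():
#         ensure_cuda_visible("0")  # before Gibson moves its model to the GPU
#         ...                       # build and return the Gibson environment
```

An existing non-empty value is left untouched, so a device assignment made elsewhere is not overwritten. If the variable was cleared on purpose to keep a worker off the GPU, forcing it back would defeat that, so this is best treated as a workaround rather than a general fix.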

4 reactions
richardliaw commented, Apr 13, 2020

Closing this issue because it seems like this is working. Please reopen if not.
