Ray is not finding GPUs, but TF, PyTorch and nvcc do
I have two NVIDIA TitanX GPUs, but Ray isn't seeing any:
ray.init(num_gpus=2)
print(ray.get_gpu_ids())
# prints []
Ray also prints the line below, indicating no GPUs:
2019-10-16 18:20:17,954 INFO multi_gpu_optimizer.py:93 -- LocalMultiGPUOptimizer devices ['/cpu:0']
But TensorFlow sees all devices:
import tensorflow
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
That prints:
[name: "/device:CPU:0"
device_type: "CPU"
...
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
...
, name: "/device:GPU:0"
device_type: "GPU"
...
, name: "/device:GPU:1"
device_type: "GPU"
...
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
...
, name: "/device:XLA_GPU:1"
device_type: "XLA_GPU"
...
]
Similarly,
/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Why doesn't Ray see my GPUs?
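One thing that may be worth checking (a possible explanation, not a diagnosis confirmed in this thread): as far as I know, ray.get_gpu_ids() reports the GPUs assigned to the current worker, so calling it from the driver script returns an empty list even when Ray has registered GPUs. The sketch below compares what the cluster reports with what a task that requests a GPU sees. The function name compare_driver_and_task_gpu_ids and the keys of the returned dict are made up for illustration; only ray.init, ray.remote, ray.get, ray.get_gpu_ids, ray.cluster_resources and ray.shutdown are Ray calls.

```python
def compare_driver_and_task_gpu_ids(num_gpus=2):
    """Hypothetical helper: compare GPU visibility in the driver and in a task."""
    # Imported lazily so this file can be loaded even where Ray is not installed.
    import ray

    # num_gpus tells Ray how many GPUs to register for this local cluster.
    ray.init(num_gpus=num_gpus, ignore_reinit_error=True)

    @ray.remote(num_gpus=1)
    def gpu_ids_in_task():
        # Inside a task that requested a GPU, this should list the assigned ID.
        return ray.get_gpu_ids()

    report = {
        # Total GPUs Ray has registered for scheduling.
        "cluster_gpus": ray.cluster_resources().get("GPU", 0),
        # Called from the driver, which has not been assigned any GPU.
        "driver_gpu_ids": ray.get_gpu_ids(),
        # Called from a worker that was assigned one GPU.
        "task_gpu_ids": ray.get(gpu_ids_in_task.remote()),
    }
    ray.shutdown()
    return report
```

Calling print(compare_driver_and_task_gpu_ids()) on the affected machine would show whether Ray has registered the GPUs at all ("cluster_gpus") separately from whether the driver was assigned any.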
Issue Analytics
- State:
- Created 4 years ago
- Reactions:8
- Comments:14 (7 by maintainers)
Top GitHub Comments
I am having the same issue as @Wormh0-le. This is preventing me from training a torch policy without ray.tune, which I do not wish to use. I just want to call .train() on my agent.
I explicitly set num_gpus=1, but Ray still can't get a GPU, and torch.cuda.is_available() is True. Why?
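A related check, offered as a sketch under assumptions rather than a confirmed fix: Ray's documentation describes it setting the CUDA_VISIBLE_DEVICES environment variable for its workers, so a process can have CUDA installed and still see no devices if that variable is empty. The helper below only reads the variable; visible_cuda_devices is a hypothetical name, not part of Ray or PyTorch.

```python
import os


def visible_cuda_devices(environ=None):
    """Hypothetical helper: list the device IDs in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (no restriction applied),
    and an empty list when it is set but empty (all GPUs hidden).
    """
    env = os.environ if environ is None else environ
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Entries are comma-separated; drop blanks left by stray commas or spaces.
    return [token.strip() for token in raw.split(",") if token.strip()]
```

Printing visible_cuda_devices() both in the driver script and inside the Ray task or actor that runs the training would show whether the two processes are given the same view of the GPUs.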