Unable to access GPU - cloud VMs and other machines
I'm having an issue running training/pretraining on an Azure VM. I found that there are two display devices on the VM when I run
sudo lshw -C video
  *-display
       description: VGA compatible controller
       product: Hyper-V virtual VGA
       vendor: Microsoft Corporation
       physical id: 8
       bus info: pci@0000:00:08.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: vga_controller bus_master rom
       configuration: driver=hyperv_fb latency=0
       resources: irq:11 memory:f8000000-fbffffff memory:c0000-dffff
  *-display
       description: 3D controller
       product: GK210GL [Tesla K80]
       vendor: NVIDIA Corporation
       physical id: 1
       bus info: pci@6d22:00:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list
       configuration: driver=nvidia latency=0
       resources: iomemory:100-ff iomemory:140-13f irq:0 memory:41000000-41ffffff memory:1000000000-13ffffffff memory:1400000000-1401ffffff
But spaCy will not detect the GPU at all, even though the virtual machine comes preconfigured with everything installed (using the Data Science Virtual Machine - Ubuntu 18).
I have a feeling it's because it's defaulting to device 0.
Is there a way to specify device 1 instead, like we can do with the spacy train command? I don't see any option in the CLI reference for specifying a GPU in spacy pretrain.
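For what it's worth, a sketch of how device selection can be approached (the device index and the environment-variable trick are assumptions based on standard CUDA behaviour, not anything confirmed in this thread): CUDA only enumerates NVIDIA hardware, so the Hyper-V virtual VGA adapter never receives a CUDA index, and the Tesla K80 is most likely CUDA device 0 despite being the second entry in lshw.

import os

# Sketch only: to pin a specific device, set this before anything
# initializes CUDA; set in the shell, it applies to spacy train and
# spacy pretrain alike.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import spacy

# prefer_gpu/require_gpu take a device index; require_gpu raises instead
# of silently returning False, which makes misconfiguration easier to spot.
spacy.require_gpu(0)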
Issue Analytics
- Created 3 years ago
- Comments: 7 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
No more issues.
Ok so I think there is something else going on, because I cannot get access to the GPU using spaCy at all. I tried multiple VMs on Azure (Data Science Windows 2019, Data Science Ubuntu; I even created my own VM from scratch and installed the CUDA toolkit myself, and still nothing). All tested on NC6-size machines with a Tesla GPU.
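One failure mode worth ruling out here (my assumption, not something confirmed in this thread): spaCy gets its GPU support from CuPy, and if no cupy-cudaXXX wheel matching the installed toolkit is present, spacy.prefer_gpu() simply returns False without raising. A quick check might look like:

import cupy  # fails here if no CuPy build matches the installed CUDA toolkit

print("cupy:", cupy.__version__)
print("CUDA devices:", cupy.cuda.runtime.getDeviceCount())
print(cupy.cuda.runtime.getDeviceProperties(0)["name"])  # e.g. b'Tesla K80'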
On the Ubuntu Data Science VM I am able to run pretraining, and spacy.prefer_gpu(0) returns True. However, when I run it, the wps is lower than on a 1060 card (which gets around 40,000 wps), even though the card is a Tesla K80. It seems to be defaulting to the correct card, and I can see activity on the card when I watch the GPU usage, but performance is abysmal. Furthermore, PyTorch is able to see and access the GPU when I run the following:
The only place I've been successful using a GPU is on a physical workstation.
Has anyone else had any luck training on a cloud GPU?