Unable to access GPU - cloud VMs and other machines


I’m having trouble running training/pretraining on an Azure VM. When I run the following command, I see two devices on the VM:

sudo lshw -C video

  *-display
       description: VGA compatible controller
       product: Hyper-V virtual VGA
       vendor: Microsoft Corporation
       physical id: 8
       bus info: pci@0000:00:08.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: vga_controller bus_master rom
       configuration: driver=hyperv_fb latency=0
       resources: irq:11 memory:f8000000-fbffffff memory:c0000-dffff
  *-display
       description: 3D controller
       product: GK210GL [Tesla K80]
       vendor: NVIDIA Corporation
       physical id: 1
       bus info: pci@6d22:00:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list
       configuration: driver=nvidia latency=0
       resources: iomemory:100-ff iomemory:140-13f irq:0 memory:41000000-41ffffff memory:1000000000-13ffffffff memory:1400000000-1401ffffff
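
To read a listing like the one above programmatically, the following is a minimal sketch that parses `lshw -C video` text output and keeps only the entries bound to the `nvidia` driver. The function names and the shortened sample are hypothetical, made up for illustration; the only assumption about the format is the `*-display` / `key: value` layout shown above.

```python
def parse_lshw_displays(text):
    """Parse `lshw -C video` text output into one dict per *-display entry."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("*-display"):
            # A new device section starts here.
            current = {}
            devices.append(current)
        elif current is not None and ": " in line:
            # Split on the first ": " only, since values may contain colons.
            key, value = line.split(": ", 1)
            current[key] = value
    return devices


def nvidia_devices(devices):
    """Keep only the devices whose configuration names the nvidia driver."""
    return [
        d for d in devices
        if "driver=nvidia" in d.get("configuration", "").split()
    ]


# Shortened copy of the listing above, used as sample input.
SAMPLE = """\
  *-display
       description: VGA compatible controller
       product: Hyper-V virtual VGA
       configuration: driver=hyperv_fb latency=0
  *-display
       description: 3D controller
       product: GK210GL [Tesla K80]
       configuration: driver=nvidia latency=0
"""

if __name__ == "__main__":
    for device in nvidia_devices(parse_lshw_displays(SAMPLE)):
        print(device["product"])
```

Only one of the two entries uses the NVIDIA driver. The Hyper-V virtual VGA adapter is not a CUDA device, so it should not take up a CUDA device index.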

But spaCy does not detect the GPU at all, even though the virtual machine comes preconfigured with everything installed (I’m using the Data Science Virtual Machine - Ubuntu 18).

I have a feeling it’s because it’s defaulting to device 0.

Is there a way to specify device 1 instead, like we can with the spacy train command? I don’t see any option for specifying the GPU in the CLI reference for pretrain.
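
One general way to control which GPU a process uses, independent of any CLI flag, is the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`. The sketch below is a hedged illustration, not a documented spaCy option; the helper name `restrict_visible_gpus` is hypothetical. Note that CUDA ordinals count only NVIDIA devices, so on a VM with a single K80 the GPU is most likely device 0, and exposing only device 1 would hide it.

```python
import os


def restrict_visible_gpus(device_ids):
    """Expose only the given CUDA device ordinals to this process.

    Hypothetical helper: it only sets CUDA_VISIBLE_DEVICES and returns
    the value it wrote.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Set this before any CUDA-using library initialises the driver.
# Doing it before importing cupy, spacy, or torch is the safe option.
restrict_visible_gpus([0])
```

The same effect can be had by setting the variable in the shell before launching the training or pretraining command.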

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
erotavlas commented, May 8, 2020

No more issues.

1 reaction
erotavlas commented, Apr 30, 2020

OK, so I think something else is going on, because I cannot get access to the GPU using spaCy at all. I tried multiple VMs on Azure: Data Science Windows 2019, Data Science Ubuntu, and even my own VM created from scratch with the CUDA toolkit installed by hand, and still nothing. All were tested on NC_6 level machines with a Tesla GPU.

On the Ubuntu Data Science VM I am able to run pretraining, and spacy.prefer_gpu(0) returns True. However, when I run it, the wps is lower than on a 1060 card, which gets around 40000 wps, even though this card is a Tesla K80. It seems to be defaulting to the correct card, and I can see activity on the card when I watch the GPU usage, but performance is abysmal.

Furthermore, PyTorch is able to see and access the GPU when I run the following, while spaCy is not:

Python 3.6.8 |Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> spacy.prefer_gpu()
False
>>> spacy.require_gpu()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Miniconda\envs\spacy\lib\site-packages\thinc\neural\util.py", line 87, in require_gpu
    raise ValueError("GPU is not accessible. Was the library installed correctly?")
ValueError: GPU is not accessible. Was the library installed correctly?
>>> import torch
>>> torch.cuda.current_device()
0
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'Tesla K80'
>>> torch.cuda.is_available()
True
>>>
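
A session like the one above can happen when PyTorch and spaCy rely on different GPU packages. As far as I know, spaCy (through Thinc) uses CuPy for GPU operations, and the "GPU is not accessible" error is raised when CuPy could not be imported, while PyTorch binaries typically bundle their own CUDA runtime. The following is a minimal sketch, assuming that explanation, which checks which of the relevant packages are installed in the active environment. The function names are hypothetical.

```python
import importlib.util


def gpu_stack_report(packages=("spacy", "thinc", "cupy", "torch")):
    """Map each package name to True if it can be found in this environment."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }


def likely_cause(report):
    """Give a rough hint based on the report. This is a guess, not a diagnosis."""
    if not report.get("cupy"):
        return "cupy not found: spaCy/Thinc probably cannot use the GPU without it"
    return "cupy is installed: check that its CUDA build matches the installed CUDA version"


if __name__ == "__main__":
    report = gpu_stack_report()
    for name, found in report.items():
        print(f"{name}: {'installed' if found else 'missing'}")
    print(likely_cause(report))
```

If `cupy` shows up as missing while `torch` is installed, that would be consistent with torch.cuda.is_available() returning True while spacy.require_gpu() fails.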

The only place I’ve successfully used the GPU is on a physical workstation.

Has anyone else had any luck training on a cloud GPU?


