
How to get available devices and set a specific device in Pytorch-DML?


Hi. To query the available devices in PyTorch we'd normally do:

    print(f'available devices: {torch.cuda.device_count()}')
    print(f'current device: {torch.cuda.current_device()}')

However, I noticed this fails with AssertionError: Torch not compiled with CUDA enabled.
I had assumed the transition would be minimal and that calls like these would work out of the box, especially after noticing that we can't write:

    print(f'available devices: {torch.dml.device_count()}')
    print(f'current device: {torch.dml.current_device()}')

as it fails with the error:

AttributeError: module 'torch.dml' has no attribute 'device_count'
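Since torch.dml exposes no device_count, a defensive pattern is to feature-detect the API instead of assuming it exists. The sketch below uses a stand-in namespace so it runs without PyTorch installed; with the real library, the same getattr pattern applies unchanged:

```python
from types import SimpleNamespace

# Stand-in for the real torch module so this sketch is self-contained:
# torch.cuda has device_count, but pytorch-directml's torch.dml does not.
torch = SimpleNamespace(
    cuda=SimpleNamespace(device_count=lambda: 0),
    dml=SimpleNamespace(),
)

def device_count(backend):
    """Return the backend's device count, or None if the API is missing."""
    mod = getattr(torch, backend, None)
    counter = getattr(mod, "device_count", None)
    return counter() if callable(counter) else None

print(device_count("cuda"))  # 0
print(device_count("dml"))   # None, the attribute simply isn't there
```

This turns the AttributeError into an explicit "unknown" result, which is easier to branch on in code that must run on both CUDA and DML builds.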

Apart from this, specifying a device with the form "dml:number" fails for any index other than 0; that is, this fails for "dml:1":

    import torch
    import time

    def bench(device='cpu'):
        print(f'running on {device}:')
        a = torch.randn(size=(2000, 2000)).to(device=device)
        b = torch.randn(size=(2000, 2000)).to(device=device)

        start = time.time()
        c = a + b
        end = time.time()

        # print(f'available devices: {torch.dml.device_count()}')
        # print(f'current device: {torch.dml.current_device()}')
        print(f'--took {end-start:.2f} seconds')

    bench('cpu')
    bench('dml')
    bench('dml:0')
    bench('dml:1')

It outputs:

running on cpu:
--took 0.00 seconds
running on dml:
--took 0.01 seconds
running on dml:0:
--took 0.00 seconds
running on dml:1:

and that's it; it never gets past "dml:1".
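As an aside on the timing methodology above: wall-clock deltas this small are mostly noise, and GPU backends typically queue kernels asynchronously, so an end-minus-start around a single add may not measure the kernel at all. A hedged sketch of a more robust loop follows, with a pure-Python stand-in workload so it runs without PyTorch; the sync hook is where torch.cuda.synchronize would go on CUDA, and pytorch-directml does not document an equivalent, so it stays None there:

```python
import time

def bench(op, *, warmup=3, repeats=20, sync=None):
    """Average the runtime of `op` over `repeats` calls after `warmup` calls.

    `sync` is an optional callable that blocks until queued device work has
    finished (e.g. torch.cuda.synchronize on CUDA); without it, times on an
    asynchronous backend only measure kernel submission, not execution.
    """
    for _ in range(warmup):
        op()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    if sync:
        sync()
    return (time.perf_counter() - start) / repeats

# Stand-in workload so the sketch runs without PyTorch installed.
avg = bench(lambda: sum(i * i for i in range(10_000)))
print(f'--took {avg * 1e3:.3f} ms per call')
```

With real tensors, the warm-up iterations also absorb one-time costs such as allocator growth and kernel compilation, which otherwise dominate a single-shot measurement.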

Also, running this:

    import torch
    import time

    def bench(device='cpu'):
        print(f'running on {device}:')
        a = torch.randn(size=(2000, 2000)).to(device=device)
        b = torch.randn_like(a).to(device=device)

        start = time.time()
        c = a + b
        end = time.time()

        # print(f'available devices: {torch.dml.device_count()}')
        # print(f'current device: {torch.dml.current_device()}')
        print(f'--took {end-start:.2f} seconds')

    bench('cpu')
    bench('dml')
    bench('dml:0')
    bench('dml:1')

fails with the following error:

running on cpu:
--took 0.00 seconds
running on dml:
Traceback (most recent call last):
  File "g:\tests.py", line 1246, in <module>
    bench('dml')
  File "g:\tests.py", line 1235, in bench
    b = torch.randn_like(a).to(device=device)
RuntimeError: Could not run 'aten::normal_' with arguments from the 'UNKNOWN_TENSOR_TYPE_ID' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom 
build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::normal_' is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradNestedTensor, UNKNOWN_TENSOR_TYPE_ID, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at D:\a\_work\1\s\build\aten\src\ATen\RegisterCPU.cpp:5926 [kernel]
BackendSelect: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\core\NamedRegistrations.cpp:11 [kernel]
AutogradOther: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradCPU: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradCUDA: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradXLA: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradNestedTensor: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
UNKNOWN_TENSOR_TYPE_ID: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse1: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse2: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse3: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
Tracer: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\TraceType_4.cpp:10612 [kernel]
Autocast: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\autocast_mode.cpp:250 [backend fallback]
Batched: registered at D:\a\_work\1\s\aten\src\ATen\BatchingRegistrations.cpp:1016 [backend fallback]
VmapMode: registered at D:\a\_work\1\s\aten\src\ATen\VmapModeRegistrations.cpp:37 [kernel]

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 4
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

xsacha commented, Oct 24, 2021 (6 reactions)

I got the same issue here, yet their examples work. It looks like you need to import more things from torch before .to("dml") works; otherwise it complains. You still can't do some things, like create a new tensor with the device set to "dml".

Once I import the same things as the examples, I can use DML, but none of my models appear to be supported. I usually have to freeze the model first so it can run, but I still get: RuntimeError: tensor.is_dml() INTERNAL ASSERT FAILED at "D:\a\_work\1\s\aten\src\ATen\native\dml\DMLTensor.cpp":422, please report a bug to PyTorch. unbox expects Dml tensor as inputs
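While operators are missing, a common stop-gap is a CPU-fallback wrapper: try the op on the device, and on RuntimeError move the inputs to CPU, rerun, and move the result back. The sketch below uses string stand-ins for tensors so it runs without PyTorch; in real code the to_cpu / to_device callables would be .to('cpu') / .to('dml') tensor moves, and the helper names are illustrative, not part of any library:

```python
def with_cpu_fallback(fn, to_cpu, to_device):
    """Wrap `fn` so that a backend RuntimeError triggers a CPU retry."""
    def wrapped(x):
        try:
            return fn(x)
        except RuntimeError:
            # Backend lacks the kernel: hop to CPU, compute, hop back.
            return to_device(fn(to_cpu(x)))
    return wrapped

# Stand-in "operator" that pretends the DML backend lacks a kernel.
def unsupported_op(x):
    if x.endswith("@dml"):
        raise RuntimeError("Could not run 'aten::normal_' on this backend")
    return x + ":done"

safe = with_cpu_fallback(
    unsupported_op,
    to_cpu=lambda x: x.replace("@dml", "@cpu"),
    to_device=lambda x: x.replace("@cpu", "@dml"),
)
print(safe("tensor@dml"))  # prints tensor@dml:done
```

The extra device-to-host round trips are slow, so this is only worthwhile for the few ops a backend rejects, not as a blanket policy.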

I decided to dive into their headers to figure out more, since they have exposed almost nothing to Python.

  • When you pick "dml", it defaults to "dml:0".
  • None of the operators I require appear to be supported. You can see the full list in include/ATen/DMLFunctions.h.
  • There is a HardwareAdapter class in the C++ that can enumerate the devices and return a list with vendor, driver version, and name. It's only used by the DmlBackend, which isn't visible to Python.
  • I noticed it responds to an environment variable, DML_VISIBLE_DEVICES, similar to CUDA.
  • They appear to have created it via the caffe2 headers, copying parts of their TensorFlow implementation and basing other parts on the torch CUDA implementation, which gives some idea of how it came about.
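Assuming DML_VISIBLE_DEVICES behaves like CUDA_VISIBLE_DEVICES (a reasonable guess from the observation above, not documented behaviour), it would have to be set before the import that reads it, for example:

```python
import os

# Process-level device filtering: restrict the process to adapter 1,
# which would then appear as "dml:0" inside PyTorch. Must happen before
# the library that reads the variable is imported.
os.environ["DML_VISIBLE_DEVICES"] = "1"

# import torch  # import only after the variable is in the environment

print(os.environ["DML_VISIBLE_DEVICES"])  # prints 1
```

This would be one way to target the second adapter even though "dml:1" hangs, since the filtered adapter becomes device 0 from the library's point of view.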
smk2007 commented, Jun 15, 2022 (0 reactions)

Is there a reason the latest pre-release is 4 versions behind the current torch pre-release? It makes it quite difficult to work out whether an issue is caused by that torch version or by a change in DML, for instance the masked_select one.

The current version of pytorch-directml is pinned to PyTorch 1.8, but we understand the pain here, given the drift caused by the rapid progress and updates made to Torch.

We are working on a solution to address this problem.
