
How to get available devices and set a specific device in Pytorch-DML?


Hi. To query the available devices in PyTorch we'd normally do:

    print(f'available devices: {torch.cuda.device_count()}')
    print(f'current device: {torch.cuda.current_device()}')

However, I noticed this fails with AssertionError: Torch not compiled with CUDA enabled.
I had assumed the transition would be minimal and that calls like these would work out of the box, especially after noticing that we can't write:

    print(f'available devices: {torch.dml.device_count()}')
    print(f'current device: {torch.dml.current_device()}')

as it fails with the error:

AttributeError: module 'torch.dml' has no attribute 'device_count'
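Since torch.dml exposes no device_count, a defensive pattern is to feature-detect the API instead of assuming it exists. The sketch below uses a stand-in namespace so it runs without PyTorch installed; with the real library, the same getattr pattern applies unchanged:

```python
from types import SimpleNamespace

# Stand-in for the real torch module so this sketch is self-contained:
# torch.cuda has device_count, but pytorch-directml's torch.dml does not.
torch = SimpleNamespace(
    cuda=SimpleNamespace(device_count=lambda: 0),
    dml=SimpleNamespace(),
)

def device_count(backend):
    """Return the backend's device count, or None if the API is missing."""
    mod = getattr(torch, backend, None)
    counter = getattr(mod, "device_count", None)
    return counter() if callable(counter) else None

print(device_count("cuda"))  # 0
print(device_count("dml"))   # None, the attribute simply isn't there
```

This turns the AttributeError into an explicit "unknown" result, which is easier to branch on in code that must run on both CUDA and DML builds.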

Apart from this, specifying a device with the form "dml:number" fails for any index other than 0; that is, this fails for "dml:1":

    import torch
    import time

    def bench(device='cpu'):
        print(f'running on {device}:')
        a = torch.randn(size=(2000, 2000)).to(device=device)
        b = torch.randn(size=(2000, 2000)).to(device=device)

        start = time.time()
        c = a + b
        end = time.time()

        # print(f'available devices: {torch.dml.device_count()}')
        # print(f'current device: {torch.dml.current_device()}')
        print(f'--took {end-start:.2f} seconds')

    bench('cpu')
    bench('dml')
    bench('dml:0')
    bench('dml:1')

It outputs:

running on cpu:
--took 0.00 seconds
running on dml:
--took 0.01 seconds
running on dml:0:
--took 0.00 seconds
running on dml:1:

and that's it; it never gets past "dml:1".
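As an aside on the timing methodology above: wall-clock deltas this small are mostly noise, and GPU backends typically queue kernels asynchronously, so an end-minus-start around a single add may not measure the kernel at all. A hedged sketch of a more robust loop follows, with a pure-Python stand-in workload so it runs without PyTorch; the sync hook is where torch.cuda.synchronize would go on CUDA, and pytorch-directml does not document an equivalent, so it stays None there:

```python
import time

def bench(op, *, warmup=3, repeats=20, sync=None):
    """Average the runtime of `op` over `repeats` calls after `warmup` calls.

    `sync` is an optional callable that blocks until queued device work has
    finished (e.g. torch.cuda.synchronize on CUDA); without it, times on an
    asynchronous backend only measure kernel submission, not execution.
    """
    for _ in range(warmup):
        op()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    if sync:
        sync()
    return (time.perf_counter() - start) / repeats

# Stand-in workload so the sketch runs without PyTorch installed.
avg = bench(lambda: sum(i * i for i in range(10_000)))
print(f'--took {avg * 1e3:.3f} ms per call')
```

With real tensors, the warm-up iterations also absorb one-time costs such as allocator growth and kernel compilation, which otherwise dominate a single-shot measurement.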

Also, running this:

    import torch
    import time

    def bench(device='cpu'):
        print(f'running on {device}:')
        a = torch.randn(size=(2000, 2000)).to(device=device)
        b = torch.randn_like(a).to(device=device)

        start = time.time()
        c = a + b
        end = time.time()

        # print(f'available devices: {torch.dml.device_count()}')
        # print(f'current device: {torch.dml.current_device()}')
        print(f'--took {end-start:.2f} seconds')

    bench('cpu')
    bench('dml')
    bench('dml:0')
    bench('dml:1')

fails with the following error:

running on cpu:
--took 0.00 seconds
running on dml:
Traceback (most recent call last):
  File "g:\tests.py", line 1246, in <module>
    bench('dml')
  File "g:\tests.py", line 1235, in bench
    b = torch.randn_like(a).to(device=device)
RuntimeError: Could not run 'aten::normal_' with arguments from the 'UNKNOWN_TENSOR_TYPE_ID' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom 
build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::normal_' is only available for these backends: [CPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradNestedTensor, UNKNOWN_TENSOR_TYPE_ID, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at D:\a\_work\1\s\build\aten\src\ATen\RegisterCPU.cpp:5926 [kernel]
BackendSelect: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\core\NamedRegistrations.cpp:11 [kernel]
AutogradOther: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradCPU: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradCUDA: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradXLA: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradNestedTensor: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
UNKNOWN_TENSOR_TYPE_ID: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse1: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse2: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
AutogradPrivateUse3: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\VariableType_4.cpp:8893 [autograd kernel]
Tracer: registered at D:\a\_work\1\s\torch\csrc\autograd\generated\TraceType_4.cpp:10612 [kernel]
Autocast: fallthrough registered at D:\a\_work\1\s\aten\src\ATen\autocast_mode.cpp:250 [backend fallback]
Batched: registered at D:\a\_work\1\s\aten\src\ATen\BatchingRegistrations.cpp:1016 [backend fallback]
VmapMode: registered at D:\a\_work\1\s\aten\src\ATen\VmapModeRegistrations.cpp:37 [kernel]

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 4
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

xsacha commented, Oct 24, 2021 (6 reactions)

I got the same issue here, yet their examples work. It looks like you need to import more things from torch before .to("dml") works; otherwise it complains. You still can't do some things, like create a new tensor with the device set to "dml".

Once I import the same things as the examples, I can use DML, but none of my models appear to be supported. I usually have to freeze the model first so it can run, but I still get: RuntimeError: tensor.is_dml() INTERNAL ASSERT FAILED at "D:\a\_work\1\s\aten\src\ATen\native\dml\DMLTensor.cpp":422, please report a bug to PyTorch. unbox expects Dml tensor as inputs
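While operators are missing, a common stop-gap is a CPU-fallback wrapper: try the op on the device, and on RuntimeError move the inputs to CPU, rerun, and move the result back. The sketch below uses string stand-ins for tensors so it runs without PyTorch; in real code the to_cpu / to_device callables would be .to('cpu') / .to('dml') tensor moves, and the helper names are illustrative, not part of any library:

```python
def with_cpu_fallback(fn, to_cpu, to_device):
    """Wrap `fn` so that a backend RuntimeError triggers a CPU retry."""
    def wrapped(x):
        try:
            return fn(x)
        except RuntimeError:
            # Backend lacks the kernel: hop to CPU, compute, hop back.
            return to_device(fn(to_cpu(x)))
    return wrapped

# Stand-in "operator" that pretends the DML backend lacks a kernel.
def unsupported_op(x):
    if x.endswith("@dml"):
        raise RuntimeError("Could not run 'aten::normal_' on this backend")
    return x + ":done"

safe = with_cpu_fallback(
    unsupported_op,
    to_cpu=lambda x: x.replace("@dml", "@cpu"),
    to_device=lambda x: x.replace("@cpu", "@dml"),
)
print(safe("tensor@dml"))  # prints tensor@dml:done
```

The extra device-to-host round trips are slow, so this is only worthwhile for the few ops a backend rejects, not as a blanket policy.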

I decided to dive into their headers to figure out more, since they have exposed almost nothing to Python.

  • When you pick "dml", it defaults to "dml:0".
  • None of the operators I require appear to be supported. You can see the full list in include/ATen/DMLFunctions.h.
  • There is a HardwareAdapter class in the C++ that can enumerate the devices and return a list with vendor, driver version, and name. It's only used by the DmlBackend, which isn't visible to Python.
  • I noticed it responds to an environment variable, DML_VISIBLE_DEVICES, similar to CUDA.
  • They appear to have created it via the caffe2 headers, copying parts of their TensorFlow implementation and basing other parts on the torch CUDA implementation, which gives some idea of how it came about.
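Assuming DML_VISIBLE_DEVICES behaves like CUDA_VISIBLE_DEVICES (a reasonable guess from the observation above, not documented behaviour), it would have to be set before the import that reads it, for example:

```python
import os

# Process-level device filtering: restrict the process to adapter 1,
# which would then appear as "dml:0" inside PyTorch. Must happen before
# the library that reads the variable is imported.
os.environ["DML_VISIBLE_DEVICES"] = "1"

# import torch  # import only after the variable is in the environment

print(os.environ["DML_VISIBLE_DEVICES"])  # prints 1
```

This would be one way to target the second adapter even though "dml:1" hangs, since the filtered adapter becomes device 0 from the library's point of view.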
smk2007 commented, Jun 15, 2022 (0 reactions)

Is there a reason the latest pre-release is 4 versions behind the current torch pre-release? It makes it quite difficult to work out whether an issue is caused by that torch version or by a change in DML, for instance the masked_select one.

The current version of pytorch-directml is pinned to PyTorch 1.8, but we understand the pain here, given the drift caused by the rapid progress and updates made to Torch.

We are working on a solution to address this problem.
