FP64 Emulation Support is Broken; Cannot Run Own Scripts
Original script (bench_mixed-precision_vs_single-precision_pytorch_ipex.py) I am running using IPEX:
import torch
import intel_extension_for_pytorch as ipex
import matplotlib.pyplot as plt
import time
import sys
torch.set_default_tensor_type(torch.FloatTensor)
# Comment out the following two lines if running this script in a ROCm environment
# import torch.backends.cudnn as cudnn
# cudnn.benchmark = True
def grid(width, height):
    hrange = torch.arange(width).unsqueeze(0).repeat([height, 1]).div(width)
    vrange = torch.arange(height).unsqueeze(1).repeat([1, width]).div(height)
    output = torch.stack([hrange, vrange], 0).float()
    return output

def checker(width, height, freq):
    hrange = torch.arange(width).reshape([1, width]).mul(freq / width / 2.0).fmod(1.0).gt(0.5)
    vrange = torch.arange(height).reshape([height, 1]).mul(freq / height / 2.0).fmod(1.0).gt(0.5)
    output = hrange.logical_xor(vrange).float()
    return output

if len(sys.argv) > 1 and sys.argv[1] != "bench_mixed_precision":
    print("\nUsage:", sys.argv[0], "[bench_mixed_precision]\n")
    quit()
# Note the inputs are grid coordinates and the target is a checkerboard
inputs = grid(384, 384).unsqueeze(0).to("xpu")
targets = checker(384, 384, 8).unsqueeze(0).unsqueeze(1).to("xpu")
class Net(torch.jit.ScriptModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(2, 256, 1),
            torch.nn.BatchNorm2d(256),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, 1),
            torch.nn.BatchNorm2d(256),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, 1),
            torch.nn.BatchNorm2d(256),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 1, 1))

    @torch.jit.script_method
    def forward(self, x):
        return self.net(x)
net = Net().to("xpu")
loss_fn = torch.nn.MSELoss().to("xpu")
opt = torch.optim.Adam(net.parameters(), 0.001)
net, opt = ipex.optimize(net, optimizer=opt, dtype=torch.float32)
print("Starting training loop, please be patient...")
start_time = time.time()
# for i in range(400):
for i in range(3):
    opt.zero_grad()
    if len(sys.argv) > 1 and sys.argv[1] == "bench_mixed_precision":
        with torch.xpu.amp.autocast(enabled=True, dtype=torch.float16):
            outputs = net(inputs)
            loss = loss_fn(outputs, targets)
    else:
        outputs = net(inputs)
        loss = loss_fn(outputs, targets)
    loss.backward()
    opt.step()
    # if (i + 1) % 50 == 0:
    if (i + 1) % 1 == 0:
        print(loss)
        print("Completed iteration %d/%d" % (i + 1, 400))
torch.xpu.synchronize()
print(f"Training completed in {time.time() - start_time} seconds :)")
print(loss)
Running the script without doing export OverrideDefaultFP64Settings=1 && export IGC_EnableDPEmulation=1
yields the following output:
root@d8b5bd7be0b9:/workspace# python3 bench_mixed-precision_vs_single-precision_pytorch_ipex.py | tee ipex_bench_output1.txt
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPU17dpcppMemoryScale1IffEEvPT_PKT0_mdENKUlRN2cl4sycl7handlerEE_clESA_EUlNS8_4itemILi1ELb1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
...
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
[CRITICAL ERROR] Kernel '_ZTSZZN2at15AtenIpexTypeXPUL24launch_vectorized_kernelINS0_13BUnaryFunctorIddbZZZNS0_4impl15gt_kernel_dpcppERNS_14TensorIteratorEENKUlvE_clEvENKUlvE1_clEvEUlddE_EEN3xpu5dpcpp5ArrayIPcLi2EEE23TrivialOffsetCalculatorILi1EjEEEvlRKT_T0_T1_iENKUlRN2cl4sycl7handlerEE0_clESP_EUlNSN_7nd_itemILi1EEEE_' removed due to usage of FP64 instructions unsupported by the targeted hardware. Running this kernel may result in unexpected results.
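The runs in this report differ only in whether the two environment variables named above are exported. A minimal sketch of a launcher that makes the comparison repeatable, assuming the variables only need to be present in the child process environment; the helper names build_env and run_benchmark are hypothetical and not part of IPEX:

```python
import os
import subprocess
import sys

# The two variable names are taken from this report; the helpers below
# are hypothetical and only illustrate one way to toggle them per run.
FP64_EMULATION_VARS = ("OverrideDefaultFP64Settings", "IGC_EnableDPEmulation")


def build_env(emulate_fp64, base_env=None):
    """Return a copy of the environment with FP64 emulation switched on or off."""
    env = dict(os.environ if base_env is None else base_env)
    for name in FP64_EMULATION_VARS:
        if emulate_fp64:
            env[name] = "1"
        else:
            # Drop the variable so an earlier export cannot leak into this run.
            env.pop(name, None)
    return env


def run_benchmark(script, emulate_fp64, extra_args=()):
    """Run the script in a child process and return its exit code."""
    cmd = [sys.executable, script, *extra_args]
    return subprocess.run(cmd, env=build_env(emulate_fp64)).returncode
```

Calling run_benchmark("bench_mixed-precision_vs_single-precision_pytorch_ipex.py", emulate_fp64=False) and then again with emulate_fp64=True would reproduce both cases without depending on what the current shell has exported.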
And if I enable FP64 emulation by doing export OverrideDefaultFP64Settings=1 && export IGC_EnableDPEmulation=1
I get the following output instead:
root@d8b5bd7be0b9:/workspace# python3 bench_mixed-precision_vs_single-precision_pytorch_ipex.py 2>&1 | tee ipex_bench_output2.txt
Starting training loop, please be patient...
Traceback (most recent call last):
  File "/workspace/bench_mixed-precision_vs_single-precision_pytorch_ipex.py", line 75, in <module>
    print(loss)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/site-packages/torch/_tensor.py", line 249, in __repr__
    return torch._tensor_str._str(self)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/site-packages/torch/_tensor_str.py", line 415, in _str
    return _str_intern(self)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/site-packages/torch/_tensor_str.py", line 390, in _str_intern
    tensor_str = _tensor_str(self, indent)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/site-packages/torch/_tensor_str.py", line 251, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/opt/intel/oneapi/intelpython/latest/lib/python3.9/site-packages/torch/_tensor_str.py", line 102, in __init__
    if value != torch.ceil(value):
RuntimeError: Native API failed. Native API returns: -1 (CL_DEVICE_NOT_FOUND) -1 (CL_DEVICE_NOT_FOUND)
As you can see, the script I am running (and probably many more scripts I may run in the future) requires FP64 instructions to be supported. Because support for those instructions is broken right now, as shown above, I cannot run the workloads I’d like to run using the IPEX binary wheel that you have distributed (the same issue occurs when I build my own wheels from source). Could you please fix this so that I can properly take advantage of FP64 emulation in IPEX, or at least give me advice on how to modify my script so that I can run it without needing to resort to FP64 emulation?
I’ll be more than happy to provide additional info about my system if needed to help solve this issue. 😄
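One possible script-side change, offered only as a sketch: the traceback above fails inside torch/_tensor_str.py while print(loss) formats the tensor, so copying the scalar to the host before printing may sidestep that particular failure. This is an assumption that has not been verified on this hardware, and it would not affect FP64 kernels launched elsewhere (for example the dpcppMemoryScale messages in the first log). The helper names loss_to_float and format_progress are hypothetical.

```python
def loss_to_float(loss):
    """Return a scalar loss as a plain Python float.

    Accepts anything exposing .item() (such as a one-element torch.Tensor)
    as well as ordinary numbers. The value is copied to the host, so the
    later string formatting is done by Python, not by the tensor printer.
    """
    item = getattr(loss, "item", None)
    return float(item()) if callable(item) else float(loss)


def format_progress(step, total, loss):
    """Build the progress line from a host-side copy of the loss."""
    return "Completed iteration %d/%d, loss = %.6f" % (step, total, loss_to_float(loss))
```

In the training loop this would replace print(loss) with print(format_progress(i + 1, 3, loss)), and the final print(loss) with print(loss_to_float(loss)).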
I also get these messages printed out despite having absolutely no FP64 code. The message does not appear if you disable profiling.
Reproducer:
Output (100 or so of these):
@tedliosu This is our first release, and it mainly targets data center GPUs. We are taking client dGPUs into account; please give us more time.
The Arc Alchemist dGPU has a similar architecture, so XPU should work on it as well. But without full verification, I cannot ask you to risk wasting your money. 😦