CUDA: Numerical differences between CPython / CPU target and the CUDA target in `math.pow`
Originally noticed on Discourse.
The following:
    from numba import cuda, njit
    import numpy as np
    import math

    def inner(x, y):
        return int(math.pow(x, y))

    cpu_jitted = njit(inner)

    def cuda_jitted(x, y):
        inner_device = cuda.jit(device=True)(inner)

        @cuda.jit
        def outer(r, x, y):
            r[()] = inner_device(x[()], y[()])

        r_arr = np.array(0, dtype=np.int64)
        x_arr = np.array(0, dtype=np.int64)
        y_arr = np.array(0, dtype=np.int64)
        x_arr[()] = x
        y_arr[()] = y
        outer[1, 1](r_arr, x_arr, y_arr)
        return r_arr[()]

    x = 3
    y = 1
    print(inner(x, y))
    print(cpu_jitted(x, y))
    print(cuda_jitted(x, y))
prints
3
3
2
The CUDA target returns a different integer from CPython and the CPU target for the same inputs, which can cause some surprise.
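One plausible reading of the output above, which is an assumption here and not something the report itself states, is that the CUDA `math.pow` returns a float marginally below the exact value and `int()` then truncates it toward zero. The sketch below only simulates that situation on the CPU with `math.nextafter` (Python 3.9+); it does not call the CUDA target.

```python
import math

# Assumption for illustration: a pow() result one ULP below the exact value 3.0.
just_below_three = math.nextafter(3.0, 0.0)

# int() truncates toward zero, so the tiny error becomes an off-by-one integer.
truncated = int(just_below_three)

# Rounding to nearest absorbs the same tiny error.
rounded = round(just_below_three)

print(just_below_three, truncated, rounded)
```

If this is indeed the mechanism, rounding would hide the discrepancy for small results, but it would not help once the true result exceeds what a float can represent exactly.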
Issue Analytics
- State:
- Created a year ago
- Comments:6 (5 by maintainers)
IIRC a similar issue affected cuDF: https://github.com/rapidsai/cudf/issues/10178
If the result is expected to be integral, it is possible to solve this in the same way as CuPy with an exponentiation-by-squaring algorithm, as I noted in the cuDF issue you linked. I think that might be the best solution for Numba as well as cuDF. https://github.com/rapidsai/cudf/issues/10178#issuecomment-1029200496
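As a rough illustration of the exponentiation-by-squaring idea mentioned above, here is a minimal pure-Python sketch. The name `int_pow` is hypothetical; this is not CuPy's, cuDF's, or Numba's actual implementation, and it has not been run on the CUDA target here. It stays in integer arithmetic throughout, so no float rounding is involved.

```python
def int_pow(base, exp):
    """Integer power via exponentiation by squaring (hypothetical helper)."""
    if exp < 0:
        # Negative exponents have no integer result in general; out of scope for this sketch.
        raise ValueError("negative exponents are not handled in this sketch")
    result = 1
    while exp > 0:
        # Multiply in the current power of base when the lowest exponent bit is set.
        if exp & 1:
            result *= base
        exp >>= 1
        # Square the base only if more bits remain, avoiding one needless multiply.
        if exp:
            base *= base
    return result


print(int_pow(3, 1))
print(int_pow(2, 10))
```

Because the loop uses only integer multiplication, shifts, and comparisons, a function of this shape could presumably be wrapped with `cuda.jit(device=True)` in the same way as `inner` in the reproducer, with the error branch adapted as needed. Note that fixed-width integers on the device would wrap on overflow, unlike Python's arbitrary-precision integers.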