Unexpected NaN when using big-endian arrays
When a big-endian array is loaded onto the GPU using cp.array(), NaNs appear at seemingly random positions in the data, and calculations on the array start returning NaN. No errors or warnings are given to the user.
Conditions
CuPy Version : 7.6.0
CUDA Root : /usr/local/cuda
CUDA Build Version : 9010
CUDA Driver Version : 10010
CUDA Runtime Version : 9010
cuBLAS Version : 9010
cuFFT Version : 9010
cuRAND Version : 9010
cuSOLVER Version : (9, 1, 0)
cuSPARSE Version : 9010
NVRTC Version : (9, 1)
cuDNN Build Version : 7102
cuDNN Version : 7102
NCCL Build Version : 2115
NCCL Runtime Version : (unknown)
CUB Version : None
cuTENSOR Version : None
Code to reproduce
import numpy as np
import cupy as cp
data = np.arange(1000*1000, dtype='>f4')/1e9
print(' numpy:', type(data), data.shape, data.dtype)
print(' nan:', np.where(np.isnan(data)))
print(' total:', np.sum(data))
arr = cp.array(data)
print('-----')
print(' cupy:' , type(arr), arr.shape, arr.dtype)
print(' nan:' , cp.where(cp.isnan(arr)))
print(' total:', cp.sum(arr))
Output of the above code:
numpy: <class 'numpy.ndarray'> (1000000,) >f4
nan: (array([], dtype=int64),)
total: 499.99963
-----
cupy: <class 'cupy.core.core.ndarray'> (1000000,) >f4
nan: (array([ 213, 385, 426, ..., 999227, 999242, 999391]),)
total: nan
The numpy array shows no NaNs as expected, while the cupy array on the GPU shows several NaNs and functions like sum() that operate on the whole array return NaN as well.
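One plausible explanation, which the report itself does not confirm, is that the raw bytes reach the GPU unchanged and are then read as little-endian values. The sketch below reproduces that effect on the CPU with NumPy alone: it stores the same values as big-endian and then reinterprets the bytes as little-endian without swapping them. The names values, big_endian and misread are illustrative only.

```python
import numpy as np

# Same values as in the report, stored explicitly as big-endian float32.
values = np.arange(1_000_000, dtype=np.float32) / 1e9
big_endian = values.astype('>f4')

# Reinterpret the identical bytes as little-endian WITHOUT swapping them.
# Assumption: this mimics what a byte-for-byte copy to a little-endian
# device would see; it is not taken from CuPy's source code.
misread = big_endian.view('<f4')

nan_idx = np.flatnonzero(np.isnan(misread))
print('NaNs in original :', int(np.isnan(big_endian).sum()))
print('NaNs when misread:', nan_idx.size)
print('first misread NaN indices:', nan_idx[:3])
```

If the indices printed here line up with those in the cupy output above, that would support the byte-order explanation, though it would not prove it.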
Scenarios
A fairly common occurrence in scientific environments is reading FITS files, which store data in big-endian byte order, for example with the astropy.io.fits module:
import cupy as cp
from astropy.io import fits
data = fits.getdata(filename)
gpu_data = cp.array(data) # Results in NaN
A workaround is to convert the array to native (little-endian) byte order before passing it to cupy:
data = data.astype(np.float32)
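The line above also forces the data to float32. Where the original element type should be kept (FITS data may also be float64 or integer), a small helper can convert only the byte order. This is a minimal sketch: to_native_byteorder is a hypothetical name, not part of NumPy or CuPy, and the code uses only NumPy so it runs without a GPU.

```python
import numpy as np

def to_native_byteorder(arr):
    """Return arr in the host's native byte order, keeping its element type.

    Hypothetical helper for illustration; not a NumPy or CuPy function.
    """
    arr = np.asarray(arr)
    if arr.dtype.isnative:
        return arr  # already native, returned without copying
    # newbyteorder('=') builds the matching native dtype;
    # astype() then converts the values, swapping bytes as needed.
    return arr.astype(arr.dtype.newbyteorder('='))

data = np.arange(5, dtype='>f8')      # big-endian float64 sample
native = to_native_byteorder(data)
print(data.dtype.isnative, native.dtype.isnative)
# gpu_data = cp.array(native)  # then load onto the GPU as before
```

On a little-endian host the sample array starts out non-native and comes back native with the same values. Arrays that are already native are passed through untouched, so the helper can be applied unconditionally before cp.array().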
Top GitHub Comments
I would never create big-endian arrays on purpose, but it can happen if you use third-party libraries that produce numpy arrays, like the astropy example I gave in the original report.
I triggered the bug just by reading files from disk using astropy and loading them on the GPU, three lines of code. Nothing naughty 😃 Since FITS files are very common where I work, I now need to wrap the cupy loading code with some checks, otherwise it will happen all the time.
Found a similar issue: #2744 ndarray.dot produces wrong values with non-native endian arrays