Sparse array created from a cupy ndarray has incorrect values
I'm trying to build a sparse array from a reasonably large cupy ndarray using cp.sparse.csr_matrix(x), but when I test the correctness of the result, some values mismatch between the input array and the output sparse array.
Code to reproduce
import cupy as cp
dense_ary = cp.ones(shape=(50000, 1003))
sones = cp.sparse.csr_matrix(dense_ary)
cp.testing.assert_array_almost_equal(dense_ary, sones.toarray(), decimal=1)
Error messages, stack traces, or logs
AssertionError:
Arrays are not almost equal to 1 decimals
Mismatched elements: 17973 / 50150000 (0.0358%)
Max absolute difference: 1.
Max relative difference: 0.
x: array([[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],...
y: array([[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],...
Conditions
- CuPy Version : 7.2.0
- CUDA Root : /usr/local/cuda
- CUDA Build Version : 10000
- CUDA Driver Version : 10010
- CUDA Runtime Version : 10000
- cuBLAS Version : 10000
- cuFFT Version : 10000
- cuRAND Version : 10000
- cuSOLVER Version : (10, 0, 0)
- cuSPARSE Version : 10000
- NVRTC Version : (10, 0)
- cuDNN Build Version : 7605
- cuDNN Version : 7600
- NCCL Build Version : 2406
- NCCL Runtime Version : 2507
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:7 (7 by maintainers)
Top GitHub Comments
I have reproduced the issue, and I think this might be a cusparse bug. cc @anaruse @jakirkham @pentschev @leofang
When creating a CSR matrix, the copy only calls the corresponding cusparseXdense2csr routine, and the data array returned from it has some entries set to zero. This can be seen in cupy/cusparse.py, line 690.
This is a reduced example
If we look at the elements that are set to 0 in the data array returned by the cusparse dense2csr routine, we see that from row 40960 to row 49999 all elements in column 8 are set to 0. Even if we increase the number of columns to 1003, the 0s always start at row 40960 and land in column 1002. This only happens when the column number is even.
Note that this error only happens when creating a CSR matrix, not a CSC one.
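A minimal sketch of how the affected rows and columns could be located, assuming the dense input and the sparse round trip have both been copied to the host first (for example with cp.asnumpy(dense_ary) and cp.asnumpy(sones.toarray())). The helper name summarize_mismatches is hypothetical and not part of CuPy, and the small array below is a synthetic stand-in for the corrupted output, not data from the actual bug.

```python
import numpy as np


def summarize_mismatches(expected, actual):
    """Report where two dense 2-D arrays of the same shape differ.

    Hypothetical helper for illustration; it expects host (NumPy) arrays.
    """
    expected = np.asarray(expected)
    actual = np.asarray(actual)
    if expected.shape != actual.shape:
        raise ValueError("arrays must have the same shape")

    # Row and column indices of every mismatched element.
    rows, cols = np.nonzero(expected != actual)
    if rows.size == 0:
        return {"count": 0, "first_row": None, "last_row": None, "columns": []}

    return {
        "count": int(rows.size),
        "first_row": int(rows.min()),
        "last_row": int(rows.max()),
        "columns": np.unique(cols).tolist(),
    }


# Synthetic stand-in: zero out the tail of the last column of an all-ones array.
dense = np.ones((6, 9))
corrupted = dense.copy()
corrupted[4:, 8] = 0.0

report = summarize_mismatches(dense, corrupted)
print(report)
```

Run against the real arrays from the reproduction script, a report like this would show whether the zeros are confined to a single column and at which row they begin.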
This issue should be left open, in my opinion, but it is an issue in cuSPARSE, more specifically in cusparse<t>dense2csr().