Sparse array created from a cupy ndarray has incorrect values

I’m trying to build a sparse array from a reasonably large cupy ndarray using cp.sparse.csr_matrix(x), but when I test the correctness of the result, some values mismatch between the input array and the output sparse array.

Code to reproduce

import cupy as cp

dense_ary = cp.ones(shape=(50000, 1003))
sones = cp.sparse.csr_matrix(dense_ary)
cp.testing.assert_array_almost_equal(dense_ary, sones.toarray(), decimal=1)

Error messages, stack traces, or logs

AssertionError: 
Arrays are not almost equal to 1 decimals

Mismatched elements: 17973 / 50150000 (0.0358%)
Max absolute difference: 1.
Max relative difference: 0.
 x: array([[1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],...
 y: array([[1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],...

Conditions

CuPy Version : 7.2.0
CUDA Root : /usr/local/cuda
CUDA Build Version : 10000
CUDA Driver Version : 10010
CUDA Runtime Version : 10000
cuBLAS Version : 10000
cuFFT Version : 10000
cuRAND Version : 10000
cuSOLVER Version : (10, 0, 0)
cuSPARSE Version : 10000
NVRTC Version : (10, 0)
cuDNN Build Version : 7605
cuDNN Version : 7600
NCCL Build Version : 2406
NCCL Runtime Version : 2507

Top GitHub Comments

3 reactions

emcastillo commented, Mar 26, 2020

I have reproduced the issue, and I think this might be a cusparse bug. cc @anaruse @jakirkham @pentschev @leofang

Creating a CSR matrix from a dense array only calls the corresponding cusparseXdense2csr routine, and the data array it returns has some entries set to zero. This can be seen in cupy/cusparse.py line 690.
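For reference, a dense-to-CSR conversion is expected to return three arrays: the nonzero values (data), their column indices (indices), and per-row offsets into those arrays (indptr). The following is a minimal pure-Python sketch of that expected behaviour, not the cuSPARSE implementation; the function names dense_to_csr and csr_to_dense are hypothetical and exist only for illustration.

```python
def dense_to_csr(dense):
    """Convert a list of rows into CSR arrays (data, indices, indptr).

    Illustrative reference only; this is not how cuSPARSE does it.
    """
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # stored nonzero value
                indices.append(col)   # column index of that value
        indptr.append(len(data))      # end offset of this row
    return data, indices, indptr


def csr_to_dense(data, indices, indptr, n_cols):
    """Rebuild the dense rows from CSR arrays."""
    dense = []
    for r in range(len(indptr) - 1):
        row = [0] * n_cols
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        dense.append(row)
    return dense


# Small all-ones matrix: every entry must survive the round trip.
dense = [[1.0] * 9 for _ in range(4)]
data, indices, indptr = dense_to_csr(dense)
assert len(data) == 4 * 9
assert csr_to_dense(data, indices, indptr, 9) == dense
```

For an all-ones input, a correct conversion stores one data entry per element, so any zero in the returned data array indicates a fault in the conversion rather than in the input.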

This is a reduced example

import cupy as cp

dense_ary = cp.ones(shape=(50000, 9))
sones = cp.sparse.csr_matrix(dense_ary)
print(dense_ary.sum(), sones.sum(), sones.toarray().sum())

Looking at the elements that are set to 0 in the data array returned by cuSPARSE dense2csr, all elements in column 8 are 0 from row 40960 through row 49999. Even if we increase the number of columns to 1003, the zeros still start at row 40960 and appear in column 1002. This only happens when the column number is even.

Note that this error only happens when creating a csr matrix and not a csc one.
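The pattern described above (zeros confined to a contiguous run of rows in a single column) can be checked with a small helper like the one below. This is a sketch under assumptions: summarize_zeros is a hypothetical name, it works on a plain list of rows rather than a GPU array, and the data used here is a tiny synthetic stand-in, not output from the actual bug.

```python
def summarize_zeros(rows):
    """Return (first_row, last_row, sorted_columns) of zero entries, or None."""
    zero_rows = []
    zero_cols = set()
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value == 0:
                zero_rows.append(r)
                zero_cols.add(c)
    if not zero_rows:
        return None
    return zero_rows[0], zero_rows[-1], sorted(zero_cols)


# Synthetic stand-in for a round-tripped all-ones matrix:
# zeros are planted in the last column of the trailing rows.
rows = [[1.0] * 9 for _ in range(10)]
for r in range(6, 10):
    rows[r][8] = 0.0

print(summarize_zeros(rows))
```

With the reduced example, the helper could be applied to a host copy of the result, for instance summarize_zeros(sones.toarray().tolist()), assuming the matrix is small enough to convert to nested Python lists.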

2 reactions

anaruse commented, Mar 31, 2020

This issue should be left open, in my opinion, but it is an issue in cuSPARSE, more specifically in cusparse<t>dense2csr().
