Sparse array created from a cupy ndarray has incorrect values

I’m trying to build a sparse array from a reasonably large CuPy ndarray using cp.sparse.csr_matrix(x), but when I check the result, some values differ between the input array and the resulting sparse array.

Code to reproduce

import cupy as cp

dense_ary = cp.ones(shape=(50000, 1003))
sones = cp.sparse.csr_matrix(dense_ary)
cp.testing.assert_array_almost_equal(dense_ary, sones.toarray(), decimal=1)

Error messages, stack traces, or logs

AssertionError: 
Arrays are not almost equal to 1 decimals

Mismatched elements: 17973 / 50150000 (0.0358%)
Max absolute difference: 1.
Max relative difference: 0.
 x: array([[1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],...
 y: array([[1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],
       [1., 1., 1., ..., 1., 1., 1.],...

Conditions

  • CuPy Version : 7.2.0
  • CUDA Root : /usr/local/cuda
  • CUDA Build Version : 10000
  • CUDA Driver Version : 10010
  • CUDA Runtime Version : 10000
  • cuBLAS Version : 10000
  • cuFFT Version : 10000
  • cuRAND Version : 10000
  • cuSOLVER Version : (10, 0, 0)
  • cuSPARSE Version : 10000
  • NVRTC Version : (10, 0)
  • cuDNN Build Version : 7605
  • cuDNN Version : 7600
  • NCCL Build Version : 2406
  • NCCL Runtime Version : 2507

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

3 reactions
emcastillo commented, Mar 26, 2020

I have reproduced the issue, and I think this might be a cusparse bug. cc @anaruse @jakirkham @pentschev @leofang

Creating a CSR matrix from a dense array only calls the corresponding cusparseXdense2csr routine, and the data array returned from that routine has some entries set to zero. This can be seen in cupy/cusparse.py, line 690.

This is a reduced example:

import cupy as cp

dense_ary = cp.ones(shape=(50000, 9))
sones = cp.sparse.csr_matrix(dense_ary)
print(dense_ary.sum(), sones.sum(), sones.toarray().sum())

If we look at the elements that are set to 0 in the data array returned by the cuSPARSE dense2csr routine, we see that from row 40960 through row 49999 all elements in column 8 are set to 0. Even if we increase the number of columns to 1003, the 0s always start at row 40960 and appear in column 1002. This only happens when the column number is even.

Note that this error only happens when creating a csr matrix and not a csc one.
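
One way to check these observations on your own setup is to compare the dense input with the round-tripped sparse matrix and summarize where they differ. The sketch below is illustrative only: summarize_mismatches and find_mismatches are hypothetical helper names, not part of CuPy, and the fallback between cupyx.scipy.sparse and cp.sparse is an assumption about which namespace your CuPy version provides. The fmt parameter lets you repeat the check with a CSC matrix for comparison.

```python
def summarize_mismatches(coords):
    """Summarize (row, col) pairs: how many there are, the row range,
    and the distinct columns affected. Pure Python, no GPU needed."""
    coords = list(coords)
    if not coords:
        return {"count": 0, "first_row": None, "last_row": None, "columns": []}
    rows = [r for r, _ in coords]
    cols = sorted({c for _, c in coords})
    return {
        "count": len(coords),
        "first_row": min(rows),
        "last_row": max(rows),
        "columns": cols,
    }


def find_mismatches(n_rows=50000, n_cols=9, fmt="csr"):
    """Hypothetical helper: build a dense array of ones, convert it to a
    sparse matrix and back, and report where the two differ.
    Requires CuPy and a CUDA GPU; imports are done lazily."""
    import cupy as cp
    try:
        from cupyx.scipy import sparse  # namespace in newer CuPy releases
    except ImportError:
        sparse = cp.sparse  # namespace used in the issue (CuPy 7.x)

    dense = cp.ones((n_rows, n_cols))
    build = sparse.csr_matrix if fmt == "csr" else sparse.csc_matrix
    roundtrip = build(dense).toarray()

    # Coordinates where the round-tripped values differ from the input.
    coords = cp.argwhere(dense != roundtrip)
    return summarize_mismatches(map(tuple, cp.asnumpy(coords).tolist()))
```

Calling find_mismatches(50000, 9, "csr") and find_mismatches(50000, 9, "csc") side by side should show whether only the CSR path is affected on your installation. A count of 0 means the round trip matched the input.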

2 reactions
anaruse commented, Mar 31, 2020

This issue should be left open, in my opinion, but it is an issue in cuSPARSE, more specifically in cusparse<t>dense2csr().
