cub.device_csrmv is corrupting sparse arrays
Here’s a reproducer of the behavior we are seeing on cuML. What’s really strange is that this doesn’t seem to be happening consistently across Linux versions & GPUs.
>>> a = cupy.sparse.random(10000, 1500, format='coo', density=0.5)
>>> b = a.tocsr()
>>> cupy.testing.assert_array_equal(a.col, b.indices)
>>> cupy.testing.assert_array_equal(a.data, b.data)
>>> a.sum(axis=0)
array([[0.00000000e+000, 0.00000000e+000, 4.24399158e-314, ...,
1.10383165e-307, 1.10512862e-307, 1.10636362e-307]])
>>> b.sum(axis=0)
array([[3056.30047519, 3058.09608272, 3009.4495842 , ..., 1505.96878088,
1460.47145535, 1505.26624145]])
We know the CSR (b) is correct because it’s the COO (a) that’s yielding incorrect results in one of our primitives.
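For reference, the column sums depend only on the (column, value) pairs, so the COO and CSR forms of the same matrix should agree up to floating-point rounding. Below is a minimal CPU-only sketch of that invariant in plain Python. It does not use CuPy or cub, and the helper names (make_random_coo, coo_to_csr, column_sums) are hypothetical, made up for this illustration.

```python
import math
import random


def make_random_coo(n_rows, n_cols, density, seed=0):
    """Build random COO triplets (hypothetical helper, not a CuPy API)."""
    rng = random.Random(seed)
    triplets = [
        (r, c, rng.random())
        for r in range(n_rows)
        for c in range(n_cols)
        if rng.random() < density
    ]
    rng.shuffle(triplets)  # COO entries need not be sorted by row
    rows = [t[0] for t in triplets]
    cols = [t[1] for t in triplets]
    vals = [t[2] for t in triplets]
    return rows, cols, vals


def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets to CSR (indptr, indices, data) with a counting sort by row."""
    indptr = [0] * (n_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]  # prefix sum gives each row's start offset
    next_slot = indptr[:-1].copy()  # next free position within each row
    indices = [0] * len(vals)
    data = [0.0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        indices[k] = c
        data[k] = v
        next_slot[r] += 1
    return indptr, indices, data


def column_sums(n_cols, cols, vals):
    """Sum values per column; works for COO cols or CSR indices alike."""
    out = [0.0] * n_cols
    for c, v in zip(cols, vals):
        out[c] += v
    return out


n_rows, n_cols = 200, 30
rows, cols, vals = make_random_coo(n_rows, n_cols, density=0.5)
indptr, indices, data = coo_to_csr(n_rows, rows, cols, vals)

coo_sums = column_sums(n_cols, cols, vals)
csr_sums = column_sums(n_cols, indices, data)

# The two formats store the same entries, so the sums should match closely.
sums_agree = all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(coo_sums, csr_sums))
print("column sums agree:", sums_agree)
```

In the reproducer above, this is the check that fails: a.sum(axis=0) returns denormal-sized values while b.sum(axis=0) returns the plausible sums.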
Original cuml issue: https://github.com/rapidsai/cuml/issues/2724
Top GitHub Comments
This seems to be the commit where it broke: https://github.com/NVlabs/cub/commit/b2e64cf0fb4ea7ace6c86ca6765ca7c1087ef82e
@leofang, the interesting thing is that the only code path that seems to use cub’s csrmv is when the dot product is taken between a COO matrix and a dense array. A different code path is used when a dot product is taken between a CSR and a dense vector. Is it possible that exact code path is not tested? We wouldn’t have caught this in our pytests on cuml had cupy.sparse.random not returned a COO by default (and the temporary fix for us was to just have cupy.sparse.random output a CSR).
I’m on my phone right now but I can look through the cupy pytests when I get in front of a computer next.