bug in sparse matrix conversion COO->CSR for some matrices
I am finding that conversion of COO->CSR or CSC does not agree with scipy when the underlying matrix is large (nnz > 50k elements or so). The problem does not seem to occur for small matrices such as those used in the test suite.
- Conditions (you can just paste the output of python -c 'import cupy; cupy.show_config()')
CuPy Version : 6.0.0b2
CUDA Root : /usr/local/cuda
CUDA Build Version : 10000
CUDA Driver Version : 10000
CUDA Runtime Version : 10000
- Code to reproduce
The data for the example below is ~1MB in size and is available via: https://drive.google.com/open?id=1BRaTNQoAYJPOjfGyio51OwwmCZdh3vdT
import numpy as np
import scipy.sparse
import cupy
import cupyx.scipy.sparse

data = np.load('/tmp/coo_to_csc_example_100k.npz')
sl = slice(None)  # select all elements; a shorter slice gives a smaller matrix
A_cpu = scipy.sparse.coo_matrix((data['data'][sl],
                                 (data['row'][sl], data['col'][sl])))
A_gpu = cupyx.scipy.sparse.coo_matrix(A_cpu)

# this comparison is okay
cupy.testing.assert_allclose(A_gpu.data, A_cpu.data)

# conversion on the GPU does not match for large matrices
A_csr_gpu = A_gpu.tocsr()
A_csr_cpu = A_cpu.tocsr()
cupy.testing.assert_allclose(A_csr_gpu.data,
                             A_csr_cpu.data)

# convert via round-trip to CPU works fine
A_csr_gpu2 = cupyx.scipy.sparse.coo_matrix(A_gpu.get().tocsr())
cupy.testing.assert_allclose(A_csr_gpu2.data,
                             A_csr_cpu.data)
And oddly, the data array after conversion does match at the beginning and then has huge mismatches in many of the later elements. I plot this below:
import matplotlib.pyplot as plt
plt.figure()
# difference between the CPU and GPU results (CSR arrays from the snippet above)
plt.plot(A_csr_cpu.data - A_csr_gpu.data.get())
plt.show()
The input data has values in the range [0, 1.0]. In this example the data is dtype np.float32, but the same thing occurs for float64 as well.
Top GitHub Comments
I believe the issue is that x.data is used both as input and output of cuSPARSE's gthr here: https://github.com/cupy/cupy/blob/d0dd06d5145f73da4c56d8678ddf64cea702106a/cupy/cusparse.py#L583
This can be fixed by creating another buffer. I will submit a PR!
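To illustrate why sharing one buffer between the input and output of a gather is a problem, here is a minimal pure-Python sketch. It is a sequential stand-in, not the cuSPARSE routine itself: the `gather` function, the sample `data`, and the permutation `perm` are all made up for illustration. On a GPU the order of reads and writes is not fixed, so the actual corruption pattern would likely differ, but the cause is the same: some elements are overwritten before they are read.

```python
def gather(src, dst, perm):
    """Sequential stand-in for a gather: dst[i] = src[perm[i]]."""
    for i, p in enumerate(perm):
        dst[i] = src[p]
    return dst


data = [10.0, 20.0, 30.0, 40.0]
perm = [2, 0, 3, 1]  # hypothetical permutation produced by sorting the rows

# Separate output buffer: every read sees the original values.
out = gather(data, [0.0] * len(data), perm)

# Same buffer as input and output: later reads see already-overwritten values.
aliased = list(data)
gather(aliased, aliased, perm)

print(out)      # [30.0, 10.0, 40.0, 20.0]
print(aliased)  # [30.0, 30.0, 40.0, 30.0]
```

In this toy run the first element still comes out right and later ones do not, which is at least consistent with the mismatch pattern described in the report.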
Many functions such as __mul__ first convert to CSR, so this is pretty problematic.
The only workaround I found so far is to transfer to the host, convert via scipy's tocsr(), and then transfer back to the GPU, but that is obviously not ideal.
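The host round-trip workaround can be wrapped in a small helper. This is only a sketch: the name `tocsr_via_host` is hypothetical and not part of CuPy, and it assumes `cupyx.scipy.sparse.csr_matrix` accepts a SciPy sparse matrix in the same way `coo_matrix` does in the reproduction code above.

```python
def tocsr_via_host(a_gpu):
    """Hypothetical helper: convert a GPU sparse matrix to CSR on the host."""
    # Imported lazily so that defining the helper does not require a GPU.
    import cupyx.scipy.sparse

    a_cpu = a_gpu.get()        # device -> host, gives a scipy.sparse matrix
    a_csr_cpu = a_cpu.tocsr()  # conversion performed by SciPy on the CPU
    # host -> device; assumes csr_matrix accepts a SciPy sparse matrix
    return cupyx.scipy.sparse.csr_matrix(a_csr_cpu)
```

Usage would be `A_csr_gpu = tocsr_via_host(A_gpu)` in place of `A_gpu.tocsr()`. Each call costs two host/device transfers, so it is a stopgap rather than a fix.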