Array indexing of sparse matrices

After talking to @quasiben I’ve been looking into using CuPy as a shim to support sparse matrices in PyTorch (the current support is not great).

The gist is that I can fit a sparse matrix into GPU memory but not a dense one. So, the idea is to use a CuPy sparse matrix and a custom DataLoader: I should be able to quickly train a model by indexing into the CuPy matrix, densifying a batch of data, and doing a zero-copy conversion to a tensor to feed into my model.
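A minimal sketch of the batching pattern described above, using SciPy on the CPU as a stand-in for the CuPy matrix. The `iter_dense_batches` helper and its parameters are hypothetical names for illustration; the GPU-side steps (keeping the matrix in device memory and converting each dense batch to a tensor) are not shown.

```python
import numpy as np
import scipy.sparse


def iter_dense_batches(matrix, batch_size, seed=0):
    """Yield (row_indices, dense_batch) pairs from a CSR matrix in shuffled order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(matrix.shape[0])
    for start in range(0, len(order), batch_size):
        rows = order[start:start + batch_size]
        # Array indexing picks the batch rows; only this small slice is densified.
        yield rows, matrix[rows].toarray().astype(np.float32)


# CPU stand-in for the sparse matrix that would live in GPU memory.
data = scipy.sparse.random(100, 20, density=0.1, format="csr")
batches = list(iter_dense_batches(data, batch_size=32))
```

The point of the design is that the full matrix stays sparse and only one batch at a time is ever dense, which is what makes array indexing on the sparse matrix the key operation.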

Unfortunately, it appears that CuPy’s sparse matrices don’t support array indexing, which would be the natural way to do this (PyTorch’s DataLoader class produces a list of integers to index into your data). There’s definitely a way to get this working with single indices and vstack, but I thought I’d raise an issue because SciPy does support this and it’d be a lot cleaner for me.
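For reference, the single-index-plus-vstack workaround mentioned above might look roughly like this. It is sketched with SciPy so it runs without a GPU; `take_rows` is a hypothetical helper name, and whether the same pattern performs acceptably on a CuPy matrix is an assumption, not something tested here.

```python
import numpy as np
import scipy.sparse


def take_rows(matrix, indices):
    """Select rows one at a time and stack them, avoiding array indexing."""
    # Indexing a CSR matrix with a single integer returns a 1 x n sparse row.
    rows = [matrix[int(i)] for i in indices]
    return scipy.sparse.vstack(rows, format="csr")


m = scipy.sparse.random(100, 100, density=0.05, format="csr")
batch = take_rows(m, [3, 0, 7])
```

This builds one small sparse matrix per requested row, so it does more work per batch than a native `m[[3, 0, 7], :]` would, which is why direct array indexing is the cleaner option.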

Thanks!

  • Conditions (you can just paste the output of python -c 'import cupy; cupy.show_config()')
CuPy Version : 6.2.0
CUDA Root : /usr/local/cuda
CUDA Build Version : 9000
CUDA Driver Version : 9000
CUDA Runtime Version : 9000
cuDNN Build Version : 7402
cuDNN Version : 7402
NCCL Build Version : 2307
NCCL Runtime Version : 2307
  • Code to reproduce
import cupy
import scipy.sparse

scipy_sparse = scipy.sparse.random(100, 100).tocsr()
scipy_sparse[[0, 1], :]  # this works for CSR matrices

cupy_sparse = cupy.sparse.csr_matrix(scipy_sparse)
cupy_sparse[[0, 1], :]  # error :(
  • Error messages, stack traces, or logs
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-70-425afa4195a2> in <module>
      3 
      4 cupy_sparse = cupy.sparse.csr_matrix(scipy_sparse)
----> 5 cupy_sparse[[0,1],:]

/opt/conda/lib/python3.7/site-packages/cupyx/scipy/sparse/compressed.py in __getitem__(self, slices)
    247                 return self._get_major_slice(major)
    248 
--> 249         raise ValueError('unsupported indexing')
    250 
    251     def _get_single(self, major, minor):

ValueError: unsupported indexing
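The traceback shows a `_get_major_slice` branch just before the error, which suggests (as an assumption, not confirmed in this thread) that contiguous row slices were handled even though integer lists were not. For contiguous rows the two forms select the same data, as this SciPy sketch illustrates:

```python
import numpy as np
import scipy.sparse

m = scipy.sparse.random(100, 100, density=0.05, format="csr")

by_list = m[[0, 1], :]  # integer-list indexing, as in the report
by_slice = m[0:2, :]    # contiguous slice covering the same rows

# True when both selections contain identical values.
same = np.array_equal(by_list.toarray(), by_slice.toarray())
```

Slices only cover contiguous rows, so they do not replace array indexing for shuffled batches.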

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

quasiben commented, Nov 5, 2020 (2 reactions)

Thanks for testing this out, @jamestwebber! Now for distributed GPUs 😃

jamestwebber commented, Oct 31, 2020 (1 reaction)

Thanks for fixing this, I started using it about a week ago. I see about a 2.5x speed increase in training my model when I can keep it entirely in GPU memory and convert to dense tensors on demand.

I wrote up my (very simple) solution in a gist in case anyone wants to use it.
