Sparse arrays do not support all of SciPy's slicing options
I encountered this behavior in the most recent branch (CuPy 8.0.0 at the time of writing) when attempting to call dask.array.from_array() with a csr_matrix. It appears that Dask is attempting some slicing that SciPy’s csr_matrix supports but CuPy’s does not.
The following is a reproducible example:
>>> import cupy as cp
>>> import dask.array
>>>
>>> a = cp.sparse.random(100, 100, format='csr')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/share/workspace/dask/dask/base.py", line 165, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/share/workspace/dask/dask/base.py", line 436, in compute
results = schedule(dsk, keys, **kwargs)
File "/share/workspace/dask/dask/threaded.py", line 81, in get
**kwargs
File "/share/workspace/dask/dask/local.py", line 486, in get_async
raise_exception(exc, tb)
File "/share/workspace/dask/dask/local.py", line 316, in reraise
raise exc
File "/share/workspace/dask/dask/local.py", line 222, in execute_task
result = _execute_task(task, data)
File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/share/workspace/dask/dask/array/core.py", line 104, in getter
c = a[b]
File "/share/workspace/cupy/cupyx/scipy/sparse/compressed.py", line 509, in __getitem__
raise ValueError('unsupported indexing')
ValueError: unsupported indexing
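The failing frame is the `c = a[b]` line in Dask's getter. The sketch below is a toy illustration and does not use Dask or CuPy: `IndexRecorder`, `is_full_slice`, and `chunk_key` are hypothetical names, and the shape of the key (a tuple of slices with explicit bounds, one per axis) is an assumption about what Dask passes for a single 100x100 chunk. It shows that such a key is not equal to `(slice(None), slice(None))` even though it selects the whole array, and one way an implementation could recognize it as a full slice.

```python
def getter(a, b):
    # Mirrors the `c = a[b]` line from the traceback; not Dask's full getter.
    return a[b]


class IndexRecorder:
    """Hypothetical stand-in for a sparse matrix that records its index key."""

    def __init__(self, shape):
        self.shape = shape
        self.last_key = None

    def __getitem__(self, key):
        self.last_key = key
        return self


def is_full_slice(s, length):
    """Return True if slice `s` selects every element of an axis of size `length`."""
    return isinstance(s, slice) and s.indices(length) == (0, length, 1)


# Assumed per-chunk key: explicit bounds rather than a bare `:`.
chunk_key = (slice(0, 100, None), slice(0, 100, None))

a = IndexRecorder((100, 100))
getter(a, chunk_key)

# The recorded key differs from (slice(None), slice(None)) by equality ...
print(a.last_key == (slice(None), slice(None)))
# ... but each component still covers its whole axis.
print(all(is_full_slice(s, n) for s, n in zip(a.last_key, a.shape)))
```

If an indexing routine only special-cases `slice(None)`, normalizing with `slice.indices()` first is one way to accept explicit full-range slices as well.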
Here’s the same code using SciPy:
>>> import scipy
>>>
>>> a = scipy.sparse.random(100, 100, format='csr')
>>> b = dask.array.from_array(a)
>>> b.compute()
array(<100x100 sparse matrix of type '<class 'numpy.float64'>'
with 100 stored elements in Compressed Sparse Row format>, dtype=object)
When using a CuPy coo_matrix:
>>> a = cp.sparse.random(100, 100, format='coo')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/share/workspace/dask/dask/base.py", line 165, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/share/workspace/dask/dask/base.py", line 436, in compute
results = schedule(dsk, keys, **kwargs)
File "/share/workspace/dask/dask/threaded.py", line 81, in get
**kwargs
File "/share/workspace/dask/dask/local.py", line 486, in get_async
raise_exception(exc, tb)
File "/share/workspace/dask/dask/local.py", line 316, in reraise
raise exc
File "/share/workspace/dask/dask/local.py", line 222, in execute_task
result = _execute_task(task, data)
File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/share/workspace/dask/dask/array/core.py", line 104, in getter
c = a[b]
TypeError: 'coo_matrix' object is not subscriptable
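The exception type here differs from the CSR case because Python raises this `TypeError` itself when a class defines no `__getitem__` at all, whereas the `ValueError` above comes from inside an existing `__getitem__`. The following is a minimal sketch with two hypothetical classes (`NoIndexing` and `PartialIndexing`, neither of which is CuPy code) that reproduces both kinds of failure:

```python
class NoIndexing:
    """Hypothetical container with no __getitem__ (analogous to the COO case)."""


class PartialIndexing:
    """Hypothetical container whose __getitem__ rejects the key (analogous to CSR)."""

    def __getitem__(self, key):
        raise ValueError('unsupported indexing')


def describe_failure(obj, key):
    """Try obj[key] and return (exception type name, message), or None on success."""
    try:
        obj[key]
    except (TypeError, ValueError) as exc:
        return type(exc).__name__, str(exc)
    return None


key = (slice(0, 100), slice(0, 100))
print(describe_failure(NoIndexing(), key))
print(describe_failure(PartialIndexing(), key))
```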
As expected, the CuPy csc_matrix yields the same exception as CSR:
>>> a = cp.sparse.random(100, 100, format='csc')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/share/workspace/dask/dask/base.py", line 165, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/share/workspace/dask/dask/base.py", line 436, in compute
results = schedule(dsk, keys, **kwargs)
File "/share/workspace/dask/dask/threaded.py", line 81, in get
**kwargs
File "/share/workspace/dask/dask/local.py", line 486, in get_async
raise_exception(exc, tb)
File "/share/workspace/dask/dask/local.py", line 316, in reraise
raise exc
File "/share/workspace/dask/dask/local.py", line 222, in execute_task
result = _execute_task(task, data)
File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
return func(*args2)
File "/share/workspace/dask/dask/array/core.py", line 104, in getter
c = a[b]
File "/share/workspace/cupy/cupyx/scipy/sparse/compressed.py", line 509, in __getitem__
raise ValueError('unsupported indexing')
ValueError: unsupported indexing
Top GitHub Comments
@crumleyc, we currently need this feature in Dask and RAPIDS. I think it would be great if you worked on this issue! Ideally, this change will be included in the 8.0.0 release.
As a first step, it will be helpful to familiarize yourself with CuPy’s contribution guide, if you have not done so already.
I tend to create a pull request early, so that I can solicit feedback as I work through changes. An additional benefit to this is that I am able to reference my changes directly when asking for help.
The current implementation raises a ValueError, but it should raise NotImplementedError. Sorry for the confusion.
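As a rough illustration of that distinction, here is a toy sketch (the `ToySparse` class and `safe_getitem` helper are hypothetical and are not CuPy's implementation). It implements only row slicing and raises `NotImplementedError` for valid keys whose code path does not exist yet, so callers can tell "not written yet" apart from a genuinely invalid index.

```python
class ToySparse:
    """Hypothetical matrix that only implements row slicing: m[a:b] or m[a:b, :]."""

    def __init__(self, shape):
        self.shape = shape

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key, slice(None))
        if len(key) != 2:
            raise IndexError('invalid number of indices')
        row, col = key
        if isinstance(row, slice) and isinstance(col, slice) and col == slice(None):
            start, stop, step = row.indices(self.shape[0])
            n_rows = len(range(start, stop, step))
            return ToySparse((n_rows, self.shape[1]))
        # The key is valid, but this code path has not been implemented.
        raise NotImplementedError(f'unsupported indexing: {key!r}')


def safe_getitem(m, key):
    """Return m[key], or None when the indexing mode is not implemented."""
    try:
        return m[key]
    except NotImplementedError:
        return None


m = ToySparse((100, 100))
print(m[10:20].shape)
print(safe_getitem(m, (slice(0, 100), slice(0, 50))))
```

Because `NotImplementedError` is not a subclass of `ValueError`, code that catches `ValueError` for bad user input would not silently swallow the missing feature.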