Sparse arrays do not support all of SciPy's slicing options

I encountered this behavior on the most recent CuPy branch (8.0.0 at the time of writing) when calling dask.array.from_array() with a csr_matrix. Dask appears to apply a slicing operation that SciPy’s csr_matrix supports but CuPy’s does not.

The following is a reproducible example:

>>> import cupy as cp
>>> import dask.array
>>> 
>>> a = cp.sparse.random(100, 100, format='csr')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/workspace/dask/dask/base.py", line 165, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/share/workspace/dask/dask/base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/share/workspace/dask/dask/threaded.py", line 81, in get
    **kwargs
  File "/share/workspace/dask/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/share/workspace/dask/dask/local.py", line 316, in reraise
    raise exc
  File "/share/workspace/dask/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/share/workspace/dask/dask/array/core.py", line 104, in getter
    c = a[b]
  File "/share/workspace/cupy/cupyx/scipy/sparse/compressed.py", line 509, in __getitem__
    raise ValueError('unsupported indexing')
ValueError: unsupported indexing

Here’s the same code using SciPy:

>>> import scipy.sparse
>>> 
>>> a = scipy.sparse.random(100, 100, format='csr')
>>> b = dask.array.from_array(a)
>>> b.compute()
array(<100x100 sparse matrix of type '<class 'numpy.float64'>'
	with 100 stored elements in Compressed Sparse Row format>, dtype=object)

When using a CuPy coo_matrix:

>>> a = cp.sparse.random(100, 100, format='coo')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/workspace/dask/dask/base.py", line 165, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/share/workspace/dask/dask/base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/share/workspace/dask/dask/threaded.py", line 81, in get
    **kwargs
  File "/share/workspace/dask/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/share/workspace/dask/dask/local.py", line 316, in reraise
    raise exc
  File "/share/workspace/dask/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/share/workspace/dask/dask/array/core.py", line 104, in getter
    c = a[b]
TypeError: 'coo_matrix' object is not subscriptable

As expected, the CuPy csc_matrix yields the same exception as CSR:

>>> a = cp.sparse.random(100, 100, format='csc')
>>> b = dask.array.from_array(a)
>>> b.compute()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/workspace/dask/dask/base.py", line 165, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/share/workspace/dask/dask/base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/share/workspace/dask/dask/threaded.py", line 81, in get
    **kwargs
  File "/share/workspace/dask/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/share/workspace/dask/dask/local.py", line 316, in reraise
    raise exc
  File "/share/workspace/dask/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/share/workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/share/workspace/dask/dask/array/core.py", line 104, in getter
    c = a[b]
  File "/share/workspace/cupy/cupyx/scipy/sparse/compressed.py", line 509, in __getitem__
    raise ValueError('unsupported indexing')
ValueError: unsupported indexing
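
To make the failing `c = a[b]` step from the tracebacks concrete, here is a minimal sketch that runs on the CPU with SciPy only. It assumes the index Dask passes for a single 100x100 chunk is a tuple of full slices, one per dimension; that index and the `getter_like` helper are illustrative assumptions, not code taken from Dask.

```python
import numpy as np
import scipy.sparse


def getter_like(a, index):
    """Simplified stand-in for the `c = a[b]` step seen in the tracebacks."""
    return a[index]


# Same shape and format as in the report, with a fixed seed for repeatability.
a = scipy.sparse.random(100, 100, density=0.01, format="csr", random_state=0)

# Assumed index for a single chunk covering the whole matrix.
full_index = (slice(0, 100, None), slice(0, 100, None))

# SciPy's csr_matrix accepts this tuple-of-slices index and returns a sparse result.
c = getter_like(a, full_index)
print(type(c).__name__, c.shape, c.nnz)
```

If that assumption holds, supporting this tuple-of-slices form in CuPy's `__getitem__` is what the report is asking for.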

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
cjnolet commented, Apr 25, 2020

@crumleyc, we currently need this feature in Dask and RAPIDS. I think it would be great if you worked on this issue! Ideally, this change will be included in the 8.0.0 release.

As a first step, it will be helpful to familiarize yourself with CuPy’s contribution guide, if you have not done so already.

I tend to create a pull request early, so that I can solicit feedback as I work through changes. An additional benefit to this is that I am able to reference my changes directly when asking for help.

1 reaction
asi1024 commented, Apr 23, 2020

The current implementation raises a ValueError, but it should raise NotImplementedError. Sorry for the confusion.
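
The distinction matters to callers: a NotImplementedError signals that the index form is valid but not supported yet, while other exceptions signal a bad index. A minimal sketch of that idea follows; `ToySparse` and `describe` are hypothetical names for illustration and do not reflect CuPy's actual implementation.

```python
class ToySparse:
    """Hypothetical container; not CuPy's actual implementation."""

    def __init__(self, shape):
        self.shape = shape

    def __getitem__(self, index):
        # Only a single integer row index is "implemented" in this toy.
        if isinstance(index, int):
            if not -self.shape[0] <= index < self.shape[0]:
                raise IndexError("row index out of range")
            return ("row", index % self.shape[0])
        if isinstance(index, (slice, tuple)):
            # A valid index form that is simply not supported yet.
            raise NotImplementedError("slicing is not supported yet")
        # Anything else is not a meaningful index at all.
        raise TypeError(f"invalid index type: {type(index).__name__}")


def describe(obj, index):
    """Report whether an index worked, is unsupported, or is invalid."""
    try:
        obj[index]
        return "ok"
    except NotImplementedError:
        return "not implemented"
    except (IndexError, TypeError) as exc:
        return f"invalid: {exc}"


m = ToySparse((100, 100))
print(describe(m, 3))
print(describe(m, (slice(0, 100), slice(0, 100))))
print(describe(m, 100))
```

With this split, a caller can catch NotImplementedError to fall back to another code path without also hiding genuine indexing mistakes.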
