
PyData/Sparse support

See original GitHub issue

Describe the workflow you want to enable

Replacing scipy.sparse arrays with PyData/Sparse in scikit-learn.
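As a rough sketch of that workflow (not something scikit-learn supports today, which is the point of this issue), the snippet below builds the same data as a scipy.sparse CSR matrix and as a PyData/Sparse COO array; only the scipy.sparse path currently works end-to-end with an estimator, so the PyData/Sparse call is left commented out.

```python
import numpy as np
import scipy.sparse as sp
import sparse  # PyData/Sparse
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_scipy = sp.random(100, 20, density=0.1, format="csr", random_state=0)
y = rng.integers(0, 2, size=100)

# Works today: scikit-learn estimators accept scipy.sparse input.
LogisticRegression().fit(X_scipy, y)

# The hoped-for drop-in: an n-dimensional PyData/Sparse array.
X_pydata = sparse.COO.from_scipy_sparse(X_scipy)
# LogisticRegression().fit(X_pydata, y)  # not supported yet; needs conversion
```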

Describe your proposed solution

Hello. I’m the maintainer of PyData/Sparse. So far, I feel the conversation between us and the scikit-learn developers has been mostly second-hand.

I’m opening this issue in hopes of opening up a channel, and creating a meta-issue over in the PyData/Sparse repo for any requirements that scikit-learn may have to bring this to fruition. Over time, my hope is we can resolve issues on both sides to enable this.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

2 reactions
rgommers commented, May 30, 2020

Realistically I don’t see us transitioning to pydata/sparse entirely without very extensive testing,

I agree. There’s also performance to worry about - pydata/sparse is not yet on par with scipy.sparse. And the sparse linalg support is still going through scipy.sparse.linalg anyway.
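To make the linear-algebra point concrete, here is a minimal sketch (not from the thread) of what that routing looks like in practice: the PyData/Sparse array is converted to a scipy.sparse CSC matrix before calling scipy.sparse.linalg, using the tocsc conversion that sparse.COO provides.

```python
import numpy as np
import scipy.sparse.linalg as spla
import sparse  # PyData/Sparse

# Build a diagonally dominant sparse system A x = b as a PyData/Sparse COO array.
A = sparse.random((500, 500), density=0.01, random_state=0) + 10 * sparse.eye(500)
b = np.ones(500)

A_csc = A.tocsc()            # hand off to scipy.sparse for the actual solve
x = spla.spsolve(A_csc, b)   # the linear algebra itself runs in scipy.sparse.linalg
```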

PyData/Sparse could be a soft, rather than a hard, dependency until Numba gains PyPy support.

I think there’s much more to it than PyPy support. Numba isn’t mature/stable enough to be a hard dependency yet. I don’t think much has fundamentally changed there since the discussions on making Numba a dependency of SciPy in 2018 and of Pandas in 2019. What Pandas is doing right now - accepting user-defined kernels written with Numba as input - seems fine. Numba is great for easily writing individual functions that are fast. But for a hard dependency, you’d expect things like non-super-scary exceptions, portability, stability, etc.
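For illustration only, here is a minimal sketch of the soft-dependency pattern under discussion: JIT-compile a kernel with Numba when it is importable, and fall back to a plain-Python path otherwise. The row_sums kernel below is hypothetical and is not scikit-learn or Pandas code.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    def njit(*args, **kwargs):
        # Fallback no-op decorator so the same code runs without Numba;
        # handles both the @njit and @njit(...) forms.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        def decorator(func):
            return func
        return decorator


@njit(cache=True)
def row_sums(data, indptr, n_rows):
    # Sum the stored values of each row of a CSR-like (data, indptr) structure.
    out = np.zeros(n_rows)
    for i in range(n_rows):
        for j in range(indptr[i], indptr[i + 1]):
            out[i] += data[j]
    return out
```

With Numba installed the loop is compiled; without it the same function still runs in pure Python, which is what keeps the dependency soft.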

1 reaction
amueller commented, Jun 3, 2020

Ah ok, that will probably work then, though I assume we’ll have to write some conversion tools.
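As a sketch of what such a conversion tool might look like: a hypothetical helper (the name as_scipy_csr is made up here) that lets input validation accept either scipy.sparse or 2-D PyData/Sparse input and hand a CSR matrix to the existing scipy.sparse code paths, assuming the as_coo/tocsr conversions available in recent pydata/sparse releases.

```python
import scipy.sparse as sp

try:
    import sparse as pydata_sparse  # PyData/Sparse, kept optional
except ImportError:
    pydata_sparse = None


def as_scipy_csr(X):
    """Return X as a scipy.sparse CSR matrix, converting PyData/Sparse input if needed."""
    if sp.issparse(X):
        return X.tocsr()
    if pydata_sparse is not None and isinstance(X, pydata_sparse.SparseArray):
        if X.ndim != 2:
            raise ValueError("scipy.sparse conversion requires 2-D input")
        # as_coo() accepts any pydata/sparse format; tocsr() yields a scipy CSR matrix.
        return pydata_sparse.as_coo(X).tocsr()
    return X  # dense input passes through unchanged
```

Estimators could then keep their existing scipy.sparse paths and call a helper like this during input validation.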

Read more comments on GitHub >

Top Results From Across the Web

Sparse — sparse 0.13.0+0.g0b7dfeb.dirty ... - PyData
This implements sparse arrays of arbitrary dimension on top of numpy and ... to two dimensions, pydata/sparse supports the GCXS sparse array format, ...
Read more >
Sparse multi-dimensional arrays for the PyData ecosystem
This library provides multi-dimensional sparse arrays. Documentation · Contributing · Bug Reports/Feature Requests ...
Read more >
pydata/sparse - Gitter
scipy.sparse arrays support np.shape, but sparse.COO does not. In [1]: import sparse In [2]: import numpy as np In [3]: ...
Read more >
Scientific python and sparse arrays (scipy summary + future ...
sparse.csgraph; that code would be a lot harder to replace. PyData Sparse array instances are understood and supported, via conversion to the ...
Read more >
6. Sparse Backend — TensorLy: Tensor Learning in Python
TensorLy supports sparse tensors for some backends and algorithms. ... For the NumPy backend, the PyData/Sparse project is used as the sparse representation ...
Read more >
