
PyData/Sparse support

See original GitHub issue

Describe the workflow you want to enable

Replacing scipy.sparse arrays with PyData/Sparse in scikit-learn.
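As a rough sketch of that workflow (not something scikit-learn supports today, which is the point of this issue), the snippet below builds the same data as a scipy.sparse CSR matrix and as a PyData/Sparse COO array; only the scipy.sparse path currently works end-to-end with an estimator, so the PyData/Sparse call is left commented out.

```python
import numpy as np
import scipy.sparse as sp
import sparse  # PyData/Sparse
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_scipy = sp.random(100, 20, density=0.1, format="csr", random_state=0)
y = rng.integers(0, 2, size=100)

# Works today: scikit-learn estimators accept scipy.sparse input.
LogisticRegression().fit(X_scipy, y)

# The hoped-for drop-in: an n-dimensional PyData/Sparse array.
X_pydata = sparse.COO.from_scipy_sparse(X_scipy)
# LogisticRegression().fit(X_pydata, y)  # not supported yet; needs conversion
```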

Describe your proposed solution

Hello. I’m the maintainer of PyData/Sparse. So far, I feel the conversation between us and the scikit-learn developers has been mostly second-hand.

I’m opening this issue in hopes of opening up a channel, and creating a meta-issue over in the PyData/Sparse repo for any requirements that scikit-learn may have to bring this to fruition. Over time, my hope is we can resolve issues on both sides to enable this.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

2 reactions
rgommers commented, May 30, 2020

Realistically I don’t see us transitioning to pydata/sparse entirely without very extensive testing,

I agree. There’s also performance to worry about - pydata/sparse is not yet on par with scipy.sparse. And the sparse linalg support is still going through scipy.sparse.linalg anyway.
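To make the linear-algebra point concrete, here is a minimal sketch (not from the thread) of what that routing looks like in practice: the PyData/Sparse array is converted to a scipy.sparse CSC matrix before calling scipy.sparse.linalg, using the tocsc conversion that sparse.COO provides.

```python
import numpy as np
import scipy.sparse.linalg as spla
import sparse  # PyData/Sparse

# Build a diagonally dominant sparse system A x = b as a PyData/Sparse COO array.
A = sparse.random((500, 500), density=0.01, random_state=0) + 10 * sparse.eye(500)
b = np.ones(500)

A_csc = A.tocsc()            # hand off to scipy.sparse for the actual solve
x = spla.spsolve(A_csc, b)   # the linear algebra itself runs in scipy.sparse.linalg
```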

PyData/Sparse could be a soft, rather than a hard, dependency until Numba gains PyPy support.

I think there’s much more to it than PyPy support. Numba isn’t mature/stable enough to be a hard dependency yet. I don’t think much has fundamentally changed there since the discussions on making Numba a dependency of SciPy in 2018 and of Pandas in 2019. What Pandas is doing right now - accepting user-defined kernels written with Numba as input - seems fine. Numba is great for easily writing individual functions that are fast. But for a hard dependency, you’d expect things like non-super-scary exceptions, portability, stability, etc.
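For illustration only, here is a minimal sketch of the soft-dependency pattern under discussion: JIT-compile a kernel with Numba when it is importable, and fall back to a plain-Python path otherwise. The row_sums kernel below is hypothetical and is not scikit-learn or Pandas code.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    def njit(*args, **kwargs):
        # Fallback no-op decorator so the same code runs without Numba;
        # handles both the @njit and @njit(...) forms.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        def decorator(func):
            return func
        return decorator


@njit(cache=True)
def row_sums(data, indptr, n_rows):
    # Sum the stored values of each row of a CSR-like (data, indptr) structure.
    out = np.zeros(n_rows)
    for i in range(n_rows):
        for j in range(indptr[i], indptr[i + 1]):
            out[i] += data[j]
    return out
```

With Numba installed the loop is compiled; without it the same function still runs in pure Python, which is what keeps the dependency soft.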

1 reaction
amueller commented, Jun 3, 2020

Ah ok, that will probably work then, though I assume we’ll have to write some conversion tools.
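As a sketch of what such a conversion tool might look like: a hypothetical helper (the name as_scipy_csr is made up here) that lets input validation accept either scipy.sparse or 2-D PyData/Sparse input and hand a CSR matrix to the existing scipy.sparse code paths, assuming the as_coo/tocsr conversions available in recent pydata/sparse releases.

```python
import scipy.sparse as sp

try:
    import sparse as pydata_sparse  # PyData/Sparse, kept optional
except ImportError:
    pydata_sparse = None


def as_scipy_csr(X):
    """Return X as a scipy.sparse CSR matrix, converting PyData/Sparse input if needed."""
    if sp.issparse(X):
        return X.tocsr()
    if pydata_sparse is not None and isinstance(X, pydata_sparse.SparseArray):
        if X.ndim != 2:
            raise ValueError("scipy.sparse conversion requires 2-D input")
        # as_coo() accepts any pydata/sparse format; tocsr() yields a scipy CSR matrix.
        return pydata_sparse.as_coo(X).tocsr()
    return X  # dense input passes through unchanged
```

Estimators could then keep their existing scipy.sparse paths and call a helper like this during input validation.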

Read more comments on GitHub >

Top Results From Across the Web

Sparse — sparse 0.13.0+0.g0b7dfeb.dirty ... - PyData
This implements sparse arrays of arbitrary dimension on top of numpy and ... to two dimensions, pydata/sparse supports the GCXS sparse array format, ...
Read more >
Sparse multi-dimensional arrays for the PyData ecosystem
This library provides multi-dimensional sparse arrays. Documentation · Contributing · Bug Reports/Feature Requests ...
Read more >
pydata/sparse - Gitter
scipy.sparse arrays support np.shape, but sparse.COO does not. In [1]: import sparse In [2]: import numpy as np In [3]: ...
Read more >
Scientific python and sparse arrays (scipy summary + future ...
sparse.csgraph; that code would be a lot harder to replace. PyData Sparse array instances are understood and supported, via conversion to the ...
Read more >
6. Sparse Backend — TensorLy: Tensor Learning in Python
TensorLy supports sparse tensors for some backends and algorithms. ... For the NumPy backend, the PyData/Sparse project is used as the sparse representation ...
Read more >
