Make SparseArray an ExtensionArray

We should make SparseArray a proper ExtensionArray.

It seems like this will be somewhat difficult to do properly while SparseArray subclasses ndarray. Basic things like np.asarray(sparse_array) don’t match the required ExtensionArray API (https://github.com/pandas-dev/pandas/issues/14167), and fixing that is hard precisely because of the subclassing: I can’t override the behavior of np.asarray(sparse_array) in Python.
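
To illustrate the np.asarray problem, here is a minimal sketch using a hypothetical toy class (ToySparse and its attributes are made up for illustration; this is not pandas' actual SparseArray). It assumes the ndarray buffer holds only the stored, non-fill values. NumPy treats anything that is already an ndarray as an array directly, so a Python-level `__array__` override on the subclass is not consulted:

```python
import numpy as np


class ToySparse(np.ndarray):
    """Hypothetical stand-in: the ndarray buffer holds only the stored values."""

    def __new__(cls, sp_values, positions, length, fill_value=0.0):
        obj = np.asarray(sp_values, dtype=float).view(cls)
        obj.positions = np.asarray(positions, dtype=np.intp)
        obj.length = length
        obj.fill_value = fill_value
        return obj

    def to_dense(self):
        # Scatter the stored values into a fill_value-initialised array.
        dense = np.full(self.length, self.fill_value, dtype=float)
        dense[self.positions] = self.view(np.ndarray)
        return dense

    def __array__(self, dtype=None, copy=None):
        # What we would like np.asarray to return. NumPy does not call
        # this hook for objects that are already ndarray instances.
        return self.to_dense()


arr = ToySparse([1.0, 2.0], positions=[1, 3], length=5)
print(np.asarray(arr))  # the raw buffer: only the two stored values
print(arr.to_dense())   # the five-element dense array we actually wanted
```

The mismatch between the two printed results is the kind of behavior that cannot be fixed from Python while the class remains an ndarray subclass.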

So, some questions:

  1. Do people rely on SparseArray being an ndarray subclass?
  2. Do we want to make a clean break, or introduce deprecations for things that will need changing (but with no clear upgrade path)?

My current preference is to just break things, but I don’t use sparse. SparseArray would compose an ndarray of dense values and a SparseIndex, but it would no longer subclass ndarray.
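
A rough sketch of what the composition approach could look like, using only NumPy. The class name ComposedSparse, its attributes, and the plain integer position array standing in for a SparseIndex are all assumptions made for illustration, not the eventual pandas implementation:

```python
import numpy as np


class ComposedSparse:
    """Hypothetical sketch: holds the stored values plus their positions."""

    def __init__(self, sp_values, positions, length, fill_value=0.0):
        self.sp_values = np.asarray(sp_values, dtype=float)
        # A plain integer array stands in for a real SparseIndex here.
        self.positions = np.asarray(positions, dtype=np.intp)
        self.length = length
        self.fill_value = fill_value

    def __len__(self):
        return self.length

    def __array__(self, dtype=None, copy=None):
        # Because this class is not an ndarray, NumPy calls this hook,
        # so np.asarray can return the full dense representation.
        dense = np.full(self.length, self.fill_value, dtype=float)
        dense[self.positions] = self.sp_values
        return dense if dtype is None else dense.astype(dtype)


composed = ComposedSparse([1.0, 2.0], positions=[1, 3], length=5)
dense_view = np.asarray(composed)
print(dense_view)
```

Since the object is no longer an ndarray, code that relied on isinstance(sparse_array, np.ndarray) would break, which is the trade-off the questions above are asking about.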

CCing some people who seem to use pandas’ sparse: @hexgnu @kernc @Licht-T

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 16 (16 by maintainers)

Top GitHub Comments

1 reaction
TomAugspurger commented, Aug 13, 2018

Yeah, they’re not in the official public API, but they were likely not considered in the pandas.core privatization shuffle.

So, I don’t really know. I suppose it depends on if people find them useful (@hexgnu @kernc @Licht-T), otherwise I would default to making them private implementation details of SparseArray.

1 reaction
jorisvandenbossche commented, Aug 13, 2018

That makes me wonder: should this be public? Or more generally, are the sparse index objects considered public? (You can currently pass one to the SparseSeries constructor, but the objects are never used in the docs and are not exposed at the top level.)
