ExtensionDtype API for preferred type when converting to NumPy array
This is coming out of SparseArray. Right now if you have a homogeneous DataFrame of EAs, then doing something like np.asarray(df) will always convert to object. https://github.com/pandas-dev/pandas/blob/32a74f19b655f4b0ef198d5ff4e15d409c3670d2/pandas/core/internals/managers.py#L787-L794.
Should we give ExtensionDtype some input on the dtype used here? This would allow for some consistency between np.asarray(series) and np.asarray(homogeneous_dataframe).
Issue Analytics
- State:
- Created 5 years ago
- Comments:13 (13 by maintainers)
Top GitHub Comments
It was indeed from the sparse discussion, my comment there: https://github.com/pandas-dev/pandas/pull/22325#discussion_r209979714
Basically, we want to know np.array(EA).dtype without doing the conversion to array. We could add a numpy_dtype attribute to the Dtype object?
This might also overlap with https://github.com/pandas-dev/pandas/issues/22224
Not sure if that was from the same discussion, but I recall a discussion where we said that we basically wanted to know the “numpy dtype if I would be converted to a numpy array” of an ExtensionArray, without actually converting it. That seems useful in general, I think, and could then be used to determine the best interleaved dtype in the use case above?
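To make the idea concrete, here is a rough sketch of how a numpy_dtype attribute could feed into choosing the interleaved dtype. It uses only NumPy. ToyExtensionDtype and interleaved_numpy_dtype are hypothetical names invented for illustration, not pandas APIs, and numpy_dtype is the attribute proposed in this thread, not something the sketch assumes pandas already exposes.

```python
import numpy as np


class ToyExtensionDtype:
    """Hypothetical stand-in for an ExtensionDtype (not the pandas class)."""

    def __init__(self, name, numpy_dtype):
        self.name = name
        # Assumed attribute from the proposal: the dtype that
        # np.asarray(extension_array) would produce, known up front.
        self.numpy_dtype = np.dtype(numpy_dtype)


def interleaved_numpy_dtype(dtypes):
    """Pick one NumPy dtype for several columns without converting any data."""
    candidates = []
    for dtype in dtypes:
        if isinstance(dtype, np.dtype):
            # Plain NumPy columns already carry their dtype.
            candidates.append(dtype)
        else:
            # Fall back to object when a dtype states no preference,
            # which mirrors the behavior described in the issue.
            candidates.append(getattr(dtype, "numpy_dtype", np.dtype(object)))
    if not candidates:
        return np.dtype(object)
    # Let NumPy's promotion rules pick the common dtype.
    return np.result_type(*candidates)


sparse_int = ToyExtensionDtype("sparse[int64]", "int64")
sparse_float = ToyExtensionDtype("sparse[float64]", "float64")

homogeneous = interleaved_numpy_dtype([sparse_int, sparse_int])
mixed = interleaved_numpy_dtype([sparse_int, sparse_float])
```

With something like this, a homogeneous frame would keep the dtype its columns advertise, and only dtypes that state no preference would fall back to object.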