Behavior of Series.values when dtype is "category" is surprising

Say I make a category-type Series:

import pandas as pd
s = pd.Series(["a", "b", "c"], dtype="category")

I want to pass this to a function that expects a numpy array, so I use the values property. According to the documentation:

s.values?
Type:        property
String form: <property object at 0x106d84050>
Docstring:
Return Series as ndarray

Returns
-------
arr : numpy.ndarray

However:

s.values
[a, b, c]
Categories (3, object): [a < b < c]
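
For readers who hit the same surprise, here is a minimal sketch of ways to get a real numpy array out of a category-type Series. It assumes a reasonably recent pandas (Series.to_numpy was added in 0.24); the variable names are illustrative and not part of the original issue.

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "b", "c"], dtype="category")

# .values returns the pandas Categorical itself, not a numpy array.
values_type = type(s.values).__name__

# np.asarray converts the categorical to a plain object-dtype ndarray.
arr = np.asarray(s)

# Series.to_numpy() (assumes pandas >= 0.24) gives the same kind of result.
arr2 = s.to_numpy()

# If the integer category codes are what the function needs, use .cat.codes.
codes = s.cat.codes.to_numpy()

print(values_type, type(arr).__name__, arr.dtype, codes.dtype)
```

Either np.asarray(s) or s.to_numpy() should be safe to pass to code that expects an ndarray; which one reads better is a matter of taste.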

Issue Analytics

  • State: closed
  • Created 9 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

jorisvandenbossche commented, Feb 17, 2017

The docstring of .values has been updated in the meantime (https://github.com/pandas-dev/pandas/pull/10477/files#diff-150fd3c5a732ae915ec47bc54a933c41R328), so closing this. The different output type is of course still confusing, but that is not something we are going to solve before pandas 2.

jreback commented, Mar 3, 2015

I'll clarify: I meant unadvertised. It's in the public API.
