BUG: unstack() does not always sort index in 0.23
Code Sample
import pandas as pd
index = pd.MultiIndex(levels=[['A','B','C','D','E']] * 2, labels=[[4,4,4,3], [4,2,0,1]])
pd.Series(0, index).unstack()
Problem description
In Pandas 0.20, 0.21, and 0.22, this gave the expected result:
     A    B    C    E
D  NaN  0.0  NaN  NaN
E  0.0  NaN  0.0  0.0
But in Pandas 0.23, the result is not sorted:
     E    C    A    B
E  0.0  0.0  0.0  NaN
D  NaN  NaN  NaN  0.0
The documentation says “The level involved will automatically get sorted”, and while I’ve seen the explanation of confusing implementation details leaking out in #15105 and some other outright bugs in #9514, this seems to be a different bug, and a regression.
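Until the sorted behavior is restored, one possible workaround is to sort both axes explicitly after unstacking. The snippet below is a minimal sketch, not part of the original report: it assumes the same four label pairs as the code sample, but builds the index with MultiIndex.from_tuples so it does not depend on the labels= keyword, and the variable names are illustrative.

```python
import pandas as pd

# Same four (row, column) label pairs as the code sample in this report.
# from_tuples is an assumption made here for portability across versions.
index = pd.MultiIndex.from_tuples(
    [("E", "E"), ("E", "C"), ("E", "A"), ("D", "B")]
)
series = pd.Series(0, index=index)

# Sort both axes explicitly instead of relying on unstack() to do it.
result = series.unstack().sort_index(axis=0).sort_index(axis=1)
print(result)
```

With the explicit sort_index calls, the row labels come out as D, E and the column labels as A, B, C, E, whether or not unstack() sorted them itself.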
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Linux
machine: x86_64
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.21.1
numpy: 1.13.1
Issue Analytics
- State:
- Created 5 years ago
- Comments:10 (7 by maintainers)
Top GitHub Comments
@WillAyd, #15105 asks for a new sort option to be added to unstack. The documented, specified, and actual behaviors have for many years been that it sorts the index. Suddenly in Pandas 0.23, the behavior changed, without a FutureWarning, without a documentation change, and without a mention in #15105, which suggests the change was an accident. Significant functional changes should be made deliberately, with discussion, not slipped in without notice and then documented a few releases later.
I suggest that the longstanding and clearly documented behavior (sorted unstack) should be restored, and then #15105 can continue to explore new ideas such as adding an option to not sort. If the default behavior is to be changed, a FutureWarning could be used to help users transition.
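To illustrate what such a transition could look like, here is a rough sketch written as a standalone wrapper. The name unstack_compat, its sort parameter, and the warning text are all hypothetical; nothing here is an existing pandas API or the maintainers' plan.

```python
import warnings

import pandas as pd

_NO_DEFAULT = object()  # sentinel: the caller did not pass `sort`


def unstack_compat(obj, level=-1, sort=_NO_DEFAULT):
    """Hypothetical wrapper around unstack(); not part of pandas."""
    if sort is _NO_DEFAULT:
        # Keep the long-documented sorted result, but tell callers that
        # they should state their preference explicitly.
        warnings.warn(
            "unstack_compat sorts the result by default; pass sort=True "
            "or sort=False explicitly to silence this warning.",
            FutureWarning,
            stacklevel=2,
        )
        sort = True
    result = obj.unstack(level)
    if sort:
        result = result.sort_index(axis=0)
        if isinstance(result, pd.DataFrame):
            result = result.sort_index(axis=1)
    return result
```

Callers who pass sort explicitly see no warning, so a default could later be changed without silently altering anyone's output.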
You have marked this as a Docs issue. But it is a regression in Pandas 0.23, and a functional bug in the code, not a cosmetic one in the docs.
Hi @deisdenis, I have also looked into this issue. It seems that the function description of remove_unused_levels specifies that the MultiIndex order needs to be preserved, so no sorting should be done in this function. Maybe an alternative is to add a sort argument for unstack. Are you still interested in this issue? What do you think?