index.levels not being updated by groupby
Summary:
Input:
df.D.ix['c1','d1']
t1 0
t2 0
t3 1
t4 1
t5 1
Name: D
Operation:
grouped = df.groupby('D')
for i, j in grouped:
    print('D:', i)
    print('Actual index[2]:', j.index[0][2])
    print('First element of levels[2]:', j.index.levels[2][0])
Output:
D: 0.0
Actual index[2]: t1
First element of levels[2]: t1
D: 1.0
Actual index[2]: t3
First element of levels[2]: t1
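The report does not include the DataFrame itself, so the following is only a sketch of how the behaviour can be reproduced. The index names (`c`, `d`, `t`) and the construction via `MultiIndex.from_product` are assumptions made for illustration; only the `D` column values and the `t1`..`t5` labels come from the input shown above.

```python
import pandas as pd

# Hypothetical data: the original DataFrame was not part of the report.
index = pd.MultiIndex.from_product(
    [["c1"], ["d1"], ["t1", "t2", "t3", "t4", "t5"]],
    names=["c", "d", "t"],  # assumed level names
)
df = pd.DataFrame({"D": [0.0, 0.0, 1.0, 1.0, 1.0]}, index=index)

report = {}
for key, group in df.groupby("D"):
    first_label = group.index[0][2]         # label actually present in the first row
    first_level = group.index.levels[2][0]  # first entry of the stored level
    report[key] = (first_label, first_level)
    print("D:", key)
    print("Actual index[2]:", first_label)
    print("First element of levels[2]:", first_level)
```

For the `D == 1.0` group the first row is labelled `t3`, while `levels[2][0]` is still `t1`, because each group keeps the full level set of the parent index.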
Details:
Issue Analytics
- State:
- Created 11 years ago
- Comments:6 (5 by maintainers)
Right now, I don’t consider this a bug. Can you help me understand why an end user needs to care about what is actually in the levels?
To be clear, if we don’t update them, we can share the levels indexes between all the views and copies of this MI, instead of allocating new ndarrays (and hash tables?) for each.
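As a rough illustration of this point (a sketch with made-up data, not code from the thread): a slice of a MultiIndex keeps the parent's full level set and only narrows the integer codes that point into it. The labels actually present can be read with `get_level_values` instead of `levels`.

```python
import pandas as pd

# Hypothetical index mirroring the labels in the report.
mi = pd.MultiIndex.from_product(
    [["c1"], ["d1"], ["t1", "t2", "t3", "t4", "t5"]],
    names=["c", "d", "t"],  # assumed level names
)
subset = mi[2:]  # keep only the rows labelled t3, t4, t5

# The stored level still lists every label of the parent index.
stored = list(subset.levels[2])

# The labels that are really present in the subset.
present = list(subset.get_level_values("t").unique())

# Codes are positions into the stored level, so they start at 2 here.
codes = [int(c) for c in subset.codes[2]]

print(stored)
print(present)
print(codes)
```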
I could see adding a method to allow consolidation of a MultiIndex, but you can get the same thing now by doing:
See https://github.com/pydata/pandas/issues/2770#issuecomment-29551251. That’s not what levels is for. Not a bug. Closing.
Edit: It should be
Which does an extra copy (or two).
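The snippets the comments above refer to did not survive on this page, and the following is not a reconstruction of them. It is a minimal sketch of two ways to consolidate a MultiIndex so that its levels hold only the labels in use, assuming a pandas release that provides `MultiIndex.remove_unused_levels` (added after this issue was closed). The second variant rebuilds the index from its tuples, which copies the data.

```python
import pandas as pd

# Hypothetical index mirroring the labels in the report.
mi = pd.MultiIndex.from_product(
    [["c1"], ["d1"], ["t1", "t2", "t3", "t4", "t5"]],
    names=["c", "d", "t"],  # assumed level names
)
subset = mi[2:]  # levels still contain t1 and t2

# Option 1: drop the unused labels from the levels.
trimmed = subset.remove_unused_levels()

# Option 2: rebuild the index from its tuples (allocates new levels and codes).
rebuilt = pd.MultiIndex.from_tuples(subset.tolist(), names=subset.names)

print(list(trimmed.levels[2]))
print(list(rebuilt.levels[2]))
print(trimmed.equals(subset))
```

Both results compare equal to the original subset; only the internal level storage changes.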