DOC: Order of groups in groupby and head method
When sort=True is passed to groupby (which is the default), the groups are in sorted order. If you loop through them, they are in sorted order; if you compute the mean, std, and so on, the results are in sorted order; but if you use the head method, they are NOT in sorted order.
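import pandas as pd

df = pd.DataFrame([[2, 100], [2, 200], [2, 300], [1, 400], [1, 500], [1, 600]], columns=['A', 'B'])
grouped = df.groupby(df['A'], sort=True)

# Iterating yields the groups in sorted key order (1, then 2)
for name, group in grouped:
    print(group)

# Aggregations are also indexed by the sorted keys
print(grouped.mean())

# head(1) returns rows in the original DataFrame order (key 2 first, then key 1)
print(grouped.head(1))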
Is this expected? I have not found this behaviour documented.
I think it is confusing, and it caused me a headache: I was combining the output of the mean and head methods in the same DataFrame, and since the data was not ordered beforehand, the results were getting mixed up because of these ordering issues. I have pandas 0.20.3.
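One way to avoid the mix-up described above is to combine the two results by group key rather than by position. The following is only a minimal sketch under that assumption, not an official recommendation; the names means, firsts, combined, B_mean and B_first are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, 100], [2, 200], [2, 300], [1, 400], [1, 500], [1, 600]],
    columns=["A", "B"],
)
grouped = df.groupby("A", sort=True)

# Reducer output: indexed by the group key, in sorted key order.
means = grouped["B"].mean().rename("B_mean")

# head(1) keeps the original row order, so re-index by the key
# and sort explicitly before combining.
firsts = grouped.head(1).set_index("A")["B"].sort_index().rename("B_first")

# concat with axis=1 aligns on the index labels (the group keys),
# so the values cannot be paired with the wrong group.
combined = pd.concat([means, firsts], axis=1)
print(combined)
```

Because the alignment is done on the key A, the result does not depend on the order in which head happens to return the rows.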
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
There is a big problem with the docstrings here for DataFrameGroupBy.head. They say:

DataFrameGroupBy.head keeps the original ordering of the DataFrame. It doesn't even order by the keys. Using .apply(lambda x: x.head(n)) puts the group keys in the index and sorts them.

Edit: I see that @Ifnister already pointed this out. I think it would make a lot more sense to actually do .apply(lambda x: x.head(n)).

head is a filter; sort is only applied to reducers within groupby. To my knowledge this isn't documented, but I haven't checked. I think documentation on this should be added to the Series.groupby and DataFrame.groupby API docs as well as the User Guide.
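The filter/reducer distinction from the comments can be seen side by side in a small sketch. It reuses the data from the issue; the variable names reduced, filtered and applied are illustrative, and the behaviour shown is what I would expect on reasonably recent pandas versions rather than something I have checked against 0.20.3.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, 100], [2, 200], [2, 300], [1, 400], [1, 500], [1, 600]],
    columns=["A", "B"],
)
grouped = df.groupby("A", sort=True)

# Reducer: one value per group, indexed by the sorted group keys.
reduced = grouped["B"].mean()

# Filter: a subset of the original rows; original index and row order are kept.
filtered = grouped.head(1)

# apply + head: the group keys become the outer index level, in sorted order.
applied = grouped["B"].apply(lambda s: s.head(1))

print(reduced.index.tolist())
print(filtered.index.tolist())
print(applied.index.tolist())
```

Here the reducer and the apply variant both come back ordered by key, while head returns the rows in the order they appear in df, which matches the behaviour reported in the issue.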