ENH: add groupby(..).sort_values()
Might be useful to expose this as a top-level groupby method for intra-group sorting. Anyone have a good use case for this? We could make this much faster than the current impl (not that this is an issue, though potentially it could be).
In [10]: df = DataFrame({'A': [1, 1, 2, 2, 2], 'B': [2, 1, 3, 3, 1], 'C':[1, 1, 2, 1, 1]})
In [11]: df
Out[11]:
   A  B  C
0  1  2  1
1  1  1  1
2  2  3  2
3  2  3  1
4  2  1  1
In [12]: df.groupby('A').apply(lambda x: x.sort_values(['B', 'C']))
Out[12]:
     A  B  C
A
1 1  1  1  1
  0  1  2  1
2 4  2  1  1
  3  2  3  1
  2  2  3  2
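For comparison, here is a minimal sketch of getting the same within-group row order without `apply()`, by sorting on the group key first and the intra-group keys after it. It reuses the frame from the session above; the name `flat` is only illustrative, and unlike the `apply()` result it keeps the original flat index rather than a group-level MultiIndex.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2, 2], 'B': [2, 1, 3, 3, 1], 'C': [1, 1, 2, 1, 1]})

# Sort by the group key 'A' first, then by 'B' and 'C' inside each group.
# Rows of the same group end up adjacent and ordered, with no apply() call.
flat = df.sort_values(['A', 'B', 'C'])

print(flat)
```

This is not the proposed `groupby(..).sort_values()` method, only an existing call that yields the same row ordering for this example.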
Issue Analytics
- State:
- Created 6 years ago
- Reactions:2
- Comments:11 (6 by maintainers)
Top GitHub Comments
Because the sum of the sorting complexities inside the groups is lower than the complexity of sorting the whole dataset. The groupby complexity doesn't change.
There is a huge performance difference when the dataset is big and the groups are small. It is slower right now not because of the complexity, but because of apply().
A good way to compare the complexity of both operations would be:
And again, why don't you use df.sort_values(['story_id', 'relevance'])?
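One way to compare the two approaches is sketched below. This is not the benchmark from the thread (that snippet is not preserved here). It is a rough timing harness on synthetic data. The column names `story_id` and `relevance` come from the comment above; the row count, group count, and random data are assumptions chosen only to give many small groups.

```python
import time

import numpy as np
import pandas as pd

# Synthetic data (assumed sizes): many small groups, as discussed above.
rng = np.random.default_rng(0)
n_rows, n_groups = 20_000, 5_000
df = pd.DataFrame({
    'story_id': rng.integers(0, n_groups, size=n_rows),
    'relevance': rng.random(n_rows),
})

# Approach 1: sort inside each group through apply().
start = time.perf_counter()
per_group = df.groupby('story_id')[['relevance']].apply(
    lambda g: g.sort_values('relevance')
)
apply_seconds = time.perf_counter() - start

# Approach 2: a single sort on the group key followed by the value column.
start = time.perf_counter()
flat = df.sort_values(['story_id', 'relevance'])
flat_seconds = time.perf_counter() - start

# Both approaches should put the relevance values in the same order.
same_order = np.array_equal(
    per_group['relevance'].to_numpy(), flat['relevance'].to_numpy()
)

print(f"groupby + apply: {apply_seconds:.4f} s")
print(f"flat sort_values: {flat_seconds:.4f} s")
print(f"same ordering of values: {same_order}")
```

Timings depend on the machine and the pandas version, so the numbers are only indicative. A single run like this measures wall-clock time, not asymptotic complexity.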