ENH: add groupby(..).sort_values()

It might be useful to expose this as a top-level groupby method for intra-group sorting. Does anyone have a good use case for this? We could make this much faster than the current implementation (not that this is an issue, though potentially it could be).

In [10]: df = DataFrame({'A': [1, 1, 2, 2, 2], 'B': [2, 1, 3, 3, 1], 'C':[1, 1, 2, 1, 1]})

In [11]: df
Out[11]: 
   A  B  C
0  1  2  1
1  1  1  1
2  2  3  2
3  2  3  1
4  2  1  1

In [12]: df.groupby('A').apply(lambda x: x.sort_values(['B', 'C']))
Out[12]: 
     A  B  C
A           
1 1  1  1  1
  0  1  2  1
2 4  2  1  1
  3  2  3  1
  2  2  3  2

Issue Analytics

Created 6 years ago
Reactions: 2
Comments: 11 (6 by maintainers)

Top GitHub Comments

10 reactions

jacquespeeters commented, May 15, 2018

Because the sum of the sorting complexities inside the groups is lower than the complexity of sorting the whole dataset. The groupby complexity doesn’t change.
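
To make the complexity claim concrete, here is a short worked bound. It assumes a comparison sort that costs O(m log m) on m rows, and k groups of sizes n_1, ..., n_k that sum to n (these symbols are illustrative and do not appear in the original comment):

```latex
\sum_{i=1}^{k} n_i \log n_i \;\le\; \sum_{i=1}^{k} n_i \log n \;=\; n \log n
```

With equal groups of size m = n/k, the left-hand side is n log m, so the saving grows as the groups get smaller.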

There is a huge performance difference when the dataset is big and the groups are small. Right now it is slower, not because of the complexity, but because of apply().

A good way to compare the complexity of both operations would be:

df.groupby('story_id').apply(lambda x: x.sort_values(by='relevance', ascending=False))
df.sort_values(by='relevance', ascending=False).groupby('story_id')
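
A rough timing sketch of this comparison might look like the following. The synthetic data, the sizes, and the variable names are assumptions made for illustration; only the `story_id` and `relevance` column names come from the comment. Timings will vary by machine and pandas version, so none are quoted here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration): many small groups.
rng = np.random.default_rng(0)
n_rows, n_groups = 10_000, 2_000
df = pd.DataFrame({
    "story_id": rng.integers(0, n_groups, size=n_rows),
    "relevance": rng.random(n_rows),
})

# Per-group sort through apply(): one Python-level call per group.
# Selecting [["relevance"]] keeps the grouping column out of apply(),
# which avoids version-specific behaviour around grouping columns.
start = time.perf_counter()
by_apply = df.groupby("story_id", group_keys=False)[["relevance"]].apply(
    lambda g: g.sort_values("relevance", ascending=False)
)
apply_seconds = time.perf_counter() - start

# Single sort with the group key as the leading sort column.
start = time.perf_counter()
by_sort = df.sort_values(["story_id", "relevance"], ascending=[True, False])
sort_seconds = time.perf_counter() - start

# Both approaches should visit the rows in the same order.
same_order = by_apply.index.equals(by_sort.index)
print(f"apply: {apply_seconds:.3f}s  sort_values: {sort_seconds:.3f}s  same order: {same_order}")
```

The sketch only measures wall-clock time for one run; for a more careful comparison, something like `timeit` with several repetitions would be more appropriate.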

2 reactions

jorisvandenbossche commented, Oct 20, 2017

And again, why don’t you use df.sort_values(['story_id', 'relevance'])?
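
As a minimal sketch of this suggestion, the snippet below reuses the small frame from the top of the issue, with column `A` standing in for `story_id` and `B`, `C` for the sort columns (that mapping is an assumption for illustration). It sorts each group separately, then checks that a single `sort_values` call with the group key as the leading column yields the same row order.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 2, 2], "B": [2, 1, 3, 3, 1], "C": [1, 1, 2, 1, 1]})

# Intra-group sort done one group at a time, then stitched back together.
per_group = pd.concat([g.sort_values(["B", "C"]) for _, g in df.groupby("A")])

# Single sort with the group key as the leading sort column.
flat = df.sort_values(["A", "B", "C"])

same_order = per_group.index.tolist() == flat.index.tolist()
print(flat.index.tolist())  # [1, 0, 4, 3, 2]
print(same_order)           # True
```

The row order matches the groupby/apply output shown in the issue; the difference is that the flat result keeps the original single-level index rather than a (group key, row) MultiIndex.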
