ENH: add groupby(..).sort_values()

It might be useful to expose this as a top-level groupby method for intra-group sorting. Does anyone have a good use case for this? We could make this much faster than the current implementation (not that this is an issue, though potentially it could be).

In [10]: df = DataFrame({'A': [1, 1, 2, 2, 2], 'B': [2, 1, 3, 3, 1], 'C':[1, 1, 2, 1, 1]})

In [11]: df
Out[11]: 
   A  B  C
0  1  2  1
1  1  1  1
2  2  3  2
3  2  3  1
4  2  1  1

In [12]: df.groupby('A').apply(lambda x: x.sort_values(['B', 'C']))
Out[12]: 
     A  B  C
A           
1 1  1  1  1
  0  1  2  1
2 4  2  1  1
  3  2  3  1
  2  2  3  2

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 2
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

10 reactions
jacquespeeters commented, May 15, 2018

Because the combined cost of sorting within each group is lower than the cost of sorting the whole dataset, while the cost of the groupby itself doesn’t change.

The performance difference is large when the dataset is big and the groups are small. Right now the groupby version is slower, not because of its complexity but because of apply().
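As a rough illustration of the complexity argument above, the sketch below compares an n·log2(n) comparison estimate for one sort of the whole dataset against the sum of the same estimate over many small groups. The dataset size and group size are hypothetical, and the model ignores constant factors, the cost of the grouping step, and the per-group overhead of apply(), so it says nothing about actual pandas timings.

```python
import math


def comparison_estimate(n):
    """Rough n * log2(n) comparison count for sorting n items (0 for n <= 1)."""
    return n * math.log2(n) if n > 1 else 0.0


n_rows = 1_000_000   # hypothetical dataset size
group_size = 10      # hypothetical number of rows per group
n_groups = n_rows // group_size

# One sort over every row.
whole = comparison_estimate(n_rows)
# Many independent small sorts, one per group.
per_group_total = n_groups * comparison_estimate(group_size)

ratio = whole / per_group_total

print(f"whole-dataset sort: ~{whole:,.0f} comparisons")
print(f"sum over groups:    ~{per_group_total:,.0f} comparisons")
print(f"ratio:              {ratio:.1f}x")
```

Under this model the gap grows as the groups get smaller relative to the dataset, which matches the "big dataset, small groups" case described in the comment.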

A good way to compare the complexity of the two operations would be:

df.groupby('story_id').apply(lambda x: x.sort_values(by='relevance', ascending=False))
df.sort_values(by='relevance', ascending=False).groupby('story_id')
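
The comparison suggested above can be tried on synthetic data. This is only a sketch: the column names story_id and relevance come from the comment, while the data sizes, the random data, and the timing method are assumptions made here. Timings will vary by machine and pandas version. The relevance column is selected before apply() so that the code does not depend on how a given pandas version treats the grouping column inside apply().

```python
import time

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows, n_groups = 10_000, 1_000  # hypothetical sizes: many small groups
df = pd.DataFrame({
    "story_id": rng.integers(0, n_groups, size=n_rows),
    "relevance": rng.permutation(n_rows),  # unique values, so no ties
})

# Approach 1: sort inside each group via apply (one Python-level call per group).
t0 = time.perf_counter()
per_group = df.groupby("story_id", group_keys=False)[["relevance"]].apply(
    lambda g: g.sort_values(by="relevance", ascending=False)
)
t_apply = time.perf_counter() - t0

# Approach 2: sort the whole frame once, then group.
# Rows keep their sorted order within each group.
t0 = time.perf_counter()
sorted_first = df.sort_values(by="relevance", ascending=False)
grouped = sorted_first.groupby("story_id")
within_group_sorted = bool(
    grouped["relevance"].apply(lambda s: s.is_monotonic_decreasing).all()
)
t_sort = time.perf_counter() - t0

print(f"groupby + apply(sort_values): {t_apply:.3f}s")
print(f"sort_values + groupby:        {t_sort:.3f}s")
print(f"groups sorted after approach 2: {within_group_sorted}")
```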

2 reactions
jorisvandenbossche commented, Oct 20, 2017

And again, why don’t you use df.sort_values(['story_id', 'relevance'])?
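
A minimal sketch of this suggestion, applied to the small frame from the issue rather than to the story_id/relevance data (which is not shown in the thread). It assumes ascending order for every key and that it is acceptable for the groups themselves to come out ordered by the group key. Only the resulting row order is compared, because the columns and index returned by apply() can differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 2, 2], "B": [2, 1, 3, 3, 1], "C": [1, 1, 2, 1, 1]})

# Intra-group sort via apply, as in the issue. group_keys=False keeps the
# original row labels as the index instead of building a MultiIndex.
via_apply = df.groupby("A", group_keys=False)[["B", "C"]].apply(
    lambda g: g.sort_values(["B", "C"])
)

# A single sort with the group key as the leading sort key.
via_sort = df.sort_values(["A", "B", "C"])

print(via_apply.index.tolist())
print(via_sort.index.tolist())
```

Both approaches visit the rows in the same order as Out[12] in the issue, which suggests the single multi-key sort is enough whenever the groups only need to be sorted internally and then read in key order.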
