`groupby` Update and Use Case

See original GitHub issue

groupby still seems to produce an itertools._grouper object, which appears to be a type slated for deprecation by 2.0.

I also wonder how the keyfunc parameter works and whether it’s intended for pipe end users. I tried to pass something to it, and I got a "multiple values" error. Does it, by chance, allow recasting the type of the returned object? If not, a feature like that would be wonderful for quickly generating the outputs of pipe operations (maybe pipes have something like that already?)

Finally, in the documentation x%2 and "Even" will produce unexpected results 😃

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
JulienPalard commented, Jan 14, 2022

Wait, so the lambda is the key function?

Yes!

It’s used in itertools.groupby and in sorted, whose keyfunc I thought had to be something that told sorted how some elements should be sorted relative to one another (like a before b, b before c, whatever).

Yes, the same key function is used both to sort and then to group. Read it as “sort by group before grouping”.

But it seems the pipe makes the first argument (the lambda) the second and passes the iterable prior to the pipe to the first?

Exactly \o/
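
To make the argument shuffling concrete, here is a minimal sketch of how such a pipe could work. `MiniPipe` is a hypothetical stand-in written for this illustration, not the library's actual `Pipe` class, and the `groupby` below only mirrors the sort-then-group behaviour described in this thread.

```python
import itertools


class MiniPipe:
    """Hypothetical, simplified stand-in for a Pipe-like decorator."""

    def __init__(self, function, *args, **kwargs):
        self.function = function
        self.args = args
        self.kwargs = kwargs

    def __call__(self, *args, **kwargs):
        # groupby(keyfunc) only stores the extra arguments for later.
        return MiniPipe(self.function, *args, **kwargs)

    def __ror__(self, iterable):
        # `iterable | pipe` lands here: the left-hand side becomes the
        # first argument, the stored arguments follow it.
        return self.function(iterable, *self.args, **self.kwargs)


@MiniPipe
def groupby(iterable, keyfunc):
    # Sort by the key first, then group, as described in this thread.
    return itertools.groupby(sorted(iterable, key=keyfunc), keyfunc)


result = [1, 2, 3, 4] | groupby(lambda x: x % 2)

# Each group is a lazy itertools grouper; list() materialises it.
groups = {key: list(group) for key, group in result}
print(groups)  # {0: [2, 4], 1: [1, 3]}
```

Wrapping each group in `list()` (or `tuple()`, `set()`, and so on) is one way to turn the lazy grouper objects into ordinary containers.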

If that’s true, I think I get it now. Just surprised that the keyfunc works for sorted as well as for the itertools.groupby.

Python uses keyfuncs in other places too, like max and min. As I said a few lines earlier, sorting by the same keyfunc is a “sort by groups”, so for example feeding:

A A B B A A

won’t give three groups (one of two A’s, one of two B’s and one of two A’s). Instead, the sort will first reorganise the input as:

A A A A B B

then the itertools.groupby will give only two groups, one of 4 A’s and one of two B’s.
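
The difference can be checked with the standard library alone. This is a small sketch: `sorted_groupby` is a hypothetical helper name used here for illustration, not a function from the Pipe library.

```python
import itertools

data = ["A", "A", "B", "B", "A", "A"]

# Plain itertools.groupby only merges *consecutive* items with equal keys.
plain = [(key, len(list(group))) for key, group in itertools.groupby(data)]


def sorted_groupby(iterable, keyfunc):
    # Hypothetical helper: sort by the key first so equal keys become adjacent.
    return itertools.groupby(sorted(iterable, key=keyfunc), keyfunc)


merged = [
    (key, len(list(group)))
    for key, group in sorted_groupby(data, lambda x: x)
]

print(plain)   # [('A', 2), ('B', 2), ('A', 2)]
print(merged)  # [('A', 4), ('B', 2)]
```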

Stuffing a sorted inside a groupby was probably a bold move: it distances the Pipe semantics from the expected itertools semantics, which is probably not that good. But I wrote this 10+ years ago, and I don’t think changing it now would be any better.

Anyway, if you really need a groupby that does not sort, feel free to implement it yourself. It should be as simple as:

import itertools

from pipe import Pipe

@Pipe
def groupby(iterable, keyfunc):
    # Group consecutive items only; no sorting beforehand.
    return itertools.groupby(iterable, keyfunc)

(Nothing forces you to use only the pipes declared in pipe.py.)

0 reactions
Servinjesus1 commented, Jan 14, 2022

Wonderful, thank you for walking me through this. I know this was only a very random example of what Pipe can do, but it’s the first time I’ve seen something like this, so I appreciate your patience.
