With an external grouper, there is no way to access the grouped value in a DataFrame(...).groupby(...).apply(...) workflow

groupby-apply workflows are important pandas idioms. Here’s a brief example grouping on a named DataFrame column:

>>> import pandas as pd
>>> df = pd.DataFrame({'key': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'value': range(9)})
>>> result = df.groupby('key').apply(lambda x: x['key'])
>>> result
key   
1    0    1
     1    1
     2    1
2    3    2
     4    2
     5    2
3    6    3
     7    3
     8    3
Name: key, dtype: int64

An important highlight of this example is the ability to reference the grouped value (e.g., x['key']) inside the applied function.

pandas also supports grouping on arbitrary mapping functions, iterables, and lots of other objects. In these cases, the grouped value is not represented as a named column in the DataFrame. Thus, when using apply(…), there is no apparent way to access the group key value. The only alternative is to use a (slow) for-loop solution as in:

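foo = lambda _k, _g: ...  # placeholder: builds one record from (key, group)
grouped = df.groupby(grouper)  # grouper: any external function, mapping, or iterable
# Iterate over (key, group) pairs explicitly so that foo can see the key.
result_iter = (foo(key, group) for key, group in grouped)
# Materialize the keys in a list: from_records expects an array-like index, not a generator.
keys = [key for key, _group in grouped]
pd.DataFrame.from_records(result_iter, index=keys)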
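To make the workaround above concrete, here is a minimal runnable sketch. The grouper function and `summarize` are hypothetical names chosen for illustration, and the plain `pd.DataFrame` constructor is used in place of `from_records` to keep the index handling simple.

```python
import pandas as pd

df = pd.DataFrame({"value": range(9)})

# External grouper: a function applied to each index label.
# The resulting key (0, 1, or 2) is not a column of df.
def grouper(index_label):
    return index_label // 3

# Hypothetical per-group function that needs the group key.
def summarize(key, group):
    return {"key_times_ten": key * 10, "total": group["value"].sum()}

keys = []
records = []
# Explicit loop: the only place where key and group are both visible.
for key, group in df.groupby(grouper):
    keys.append(key)
    records.append(summarize(key, group))

result = pd.DataFrame(records, index=keys)
print(result)
```

The loop works, but it bypasses apply entirely, which is the ergonomic gap this issue describes.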

IMHO, the ability to access the grouped value in an idiomatic way from within the applied function is ergonomically important; the groupby-apply idiom is at best partially realized without it.

Issue Analytics

  • State: open
  • Created 9 years ago
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

1 reaction
ericabrauer commented, Oct 22, 2021

Is there any risk of clobbering an attribute on someone’s Series?

@TomAugspurger yes 😦 It just happened to me. I'm not sure how I feel about an attribute with such a common name being added to the dataframe when it wasn't there in the ungrouped dataframe. It was a really hard bug to pinpoint. EDIT: the problem for me was that I had a column named "name" and tried accessing it with the dot-access syntactic sugar. The same would happen with "key".

#25457

Could you change it to something like .name_ or .key_ for the official (non-column) attribute referencing?
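One way to sidestep the clash described in this comment is to use bracket access for columns inside the applied function. The sketch below is only an illustration under that assumption; the data and the `first_name` helper are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c", "d"], "value": [1, 2, 3, 4]})

def first_name(group):
    # Bracket access always looks up the column called "name".
    # Dot access (group.name) may instead return the group key,
    # as reported in the comment above.
    return group["name"].iloc[0]

# External grouper: a function of the index label, so the key is not a column.
result = df.groupby(lambda i: i // 2).apply(first_name)
print(result)
```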

1 reaction
brianthelion commented, Feb 25, 2015

It appears that .name is the attribute that I’ve been looking for! Is this the intended use-case for the attribute? If so, my feeling is that “name” is somewhat nondescript.
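For readers landing here, a small sketch of what using `.name` looks like with an external grouper. It assumes the behavior described in this thread, namely that pandas attaches the group key to each group as `.name` before calling the applied function; the `summarize` helper and the grouper are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"value": range(9)})

def summarize(group):
    # group.name holds the key of the group currently being processed
    # (assumed behavior, as discussed in this thread).
    return pd.Series({"key": group.name, "total": group["value"].sum()})

# External grouper: a function of the index label; the key is not a column.
result = df.groupby(lambda i: i // 3).apply(summarize)
print(result)
```

Because every group returns a Series with the same labels, the combined result is a DataFrame indexed by group key.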
