GroupBy using TimeGrouper does not work

BUG: TimeGrouper not too friendly with other groups, e.g.

df.set_index('Date').groupby([pd.TimeGrouper('6M'),'Branch']).sum() should work


Hi everybody,

I found two issues with TimeGrouper:

  1. TimeGrouper does not work at all:

Let’s take the following example:

import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 10, 1, 20, 0),
        DT.datetime(2013, 10, 3, 10, 0),
        DT.datetime(2013, 12, 2, 12, 0),
        DT.datetime(2013, 12, 2, 14, 0),
    ],
})

# Group directly on the frame (no DatetimeIndex set yet)
gr = df.groupby(pd.TimeGrouper(freq='6M'))

def testgr(df):
    print(df)

gr.apply(testgr)

This will raise the Exception: “Exception: All objects passed were None”

  2. In previous pandas versions it was not possible to combine TimeGrouper with another criterion, such as “Branch” in my case.

Thank you very much

Andy

Issue Analytics

  • State: closed
  • Created 10 years ago
  • Comments: 20 (12 by maintainers)

Top GitHub Comments

3 reactions
jreback commented, Jun 7, 2013

You need to call set_index first, as TimeGrouper operates on the index:

In [15]: df
Out[15]: 
  Branch Buyer                Date  Quantity
0      A  Carl 2013-01-01 13:00:00         1
1      A  Mark 2013-01-01 13:05:00         3
2      A  Carl 2013-10-01 20:00:00         5
3      A   Joe 2013-10-03 10:00:00         8
4      A   Joe 2013-12-02 12:00:00         9
5      B  Carl 2013-12-02 14:00:00         3

In [16]: df.set_index('Date').groupby(pd.TimeGrouper('6M')).sum()
Out[16]: 
            Quantity
2013-01-31         4
2013-07-31       NaN
2014-01-31        25
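
For readers on a current pandas release: pd.TimeGrouper was later deprecated in favor of pd.Grouper and is no longer available. The following is a minimal sketch of the same aggregation, not the maintainers' own code. It assumes pd.Grouper with its key argument (which groups on a column, so set_index is not needed) and uses pd.offsets.MonthEnd(6) in place of the '6M' alias, since the spelling of that alias has changed across versions. Only Quantity is summed so the string columns are left out.

```python
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 10, 1, 20, 0),
        DT.datetime(2013, 10, 3, 10, 0),
        DT.datetime(2013, 12, 2, 12, 0),
        DT.datetime(2013, 12, 2, 14, 0),
    ],
})

# Assumption: pd.Grouper stands in for pd.TimeGrouper; key='Date' tells it
# to bin on the Date column rather than on the index.
six_months = pd.offsets.MonthEnd(6)
totals = df.groupby(pd.Grouper(key='Date', freq=six_months))['Quantity'].sum()
print(totals)
```

The first and last bins should match the 4 and 25 shown above; how the empty middle bin is displayed may differ between pandas versions.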
1 reaction
jreback commented, Jun 7, 2013

If you apply a custom function, then you need to handle the string columns yourself, but you can return pretty much anything you want (make it a Series) to get this kind of functionality. Your function is passed a slice of the original frame:

In [55]: def testf(df):
   ....:     if (df['Buyer'] == 'Mark').sum() > 0:
   ....:         return Series(dict(quantity = df['Quantity'].sum(), buyer = 'mark'))
   ....:     return Series(dict(quantity = df['Quantity'].sum()*100, buyer = 'other'))
   ....: 

In [56]: df.set_index('Date').groupby(pd.TimeGrouper('6M')).apply(lambda x: x.groupby('Branch').apply(testf))
Out[56]: 
                   buyer quantity
           Branch                
2013-01-31 A        mark        4
2014-01-31 A       other     2200
           B       other      300
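
The nested apply above works around TimeGrouper not combining with other keys. On pandas versions that provide pd.Grouper, a time grouper can be passed in a list alongside a column name. This is a sketch under that assumption, reusing the example data; pd.offsets.MonthEnd(6) is used here in place of the '6M' alias, and only Quantity is aggregated so the string columns need no special handling.

```python
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Branch': 'A A A A A B'.split(),
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1, 3, 5, 8, 9, 3],
    'Date': [
        DT.datetime(2013, 1, 1, 13, 0),
        DT.datetime(2013, 1, 1, 13, 5),
        DT.datetime(2013, 10, 1, 20, 0),
        DT.datetime(2013, 10, 3, 10, 0),
        DT.datetime(2013, 12, 2, 12, 0),
        DT.datetime(2013, 12, 2, 14, 0),
    ],
})

# Group by a 6-month-end bin on the Date column and by Branch in one call.
# Assumption: pd.Grouper (not pd.TimeGrouper) is the available API.
by_period_branch = df.groupby(
    [pd.Grouper(key='Date', freq=pd.offsets.MonthEnd(6)), 'Branch']
)['Quantity'].sum()
print(by_period_branch)
```

The result is a Series with a two-level index (period end, Branch), comparable in shape to the nested-apply output but without the custom per-group function.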

Top Results From Across the Web

group by - TimeGrouper, pandas - Stack Overflow
The problem can be solved by adding closed = 'left' df.groupby(pd.TimeGrouper('6M', closed = 'left')).aggregate(numpy.sum).

GroupBy — pandas 1.5.2 documentation
GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby() , pandas. ... Provide resampling when using a TimeGrouper. DataFrameGroupBy.sample ...

Python Examples of pandas.TimeGrouper - ProgramCreek.com
You can vote up the ones you like or vote down the ones you don't like, ... b = TimeGrouper('M') g = df.groupby(b)...

[Code]-Pandas: group with TimeGrouper-pandas
If you wish to use TimeGrouper , you should first set a Datetimeindex and then you can use any aggregating function - e.g....

Python – pandas: where is the documentation for TimeGrouper
The best use of pd.Grouper() is within groupby() when you're also grouping on non-datetime-columns. If you just need to group on a frequency,...
