Issue when shifting with Timedelta in a groupby

Hello the awesome Pandas team!

Consider the example below:

import pandas as pd

data = pd.DataFrame({'mydate' : [pd.to_datetime('2016-06-06'),
                                 pd.to_datetime('2016-06-08'),
                                 pd.to_datetime('2016-06-09'),
                                 pd.to_datetime('2016-06-10'),
                                 pd.to_datetime('2016-06-12'),
                                 pd.to_datetime('2016-06-13')],
                     'myvalue' : [1, 2, 3, 4, 5, 6],
                     'group' : ['A', 'A', 'A', 'B', 'B', 'B']})

data.set_index('mydate', inplace = True)

data
Out[58]: 
           group  myvalue
mydate                   
2016-06-06     A        1
2016-06-08     A        2
2016-06-09     A        3
2016-06-10     B        4
2016-06-12     B        5
2016-06-13     B        6

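Now I need to compute the difference between the current value of myvalue and its lagged value, where by lagged I mean lagged by exactly 1 calendar day (if possible), not by one row.

This returns a result, but it's not what I need: shift(1) lags by one row within each group, regardless of the gap between dates.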

data['delta_one'] = data.groupby('group').myvalue.transform(lambda x: x - x.shift(1))

data
Out[56]: 
           group  myvalue  delta_one
mydate                              
2016-06-06     A        1        nan
2016-06-08     A        2     1.0000
2016-06-09     A        3     1.0000
2016-06-10     B        4        nan
2016-06-12     B        5     1.0000
2016-06-13     B        6     1.0000

This is what I need, but it raises an error:

data['delta_two'] = data.groupby('group').myvalue.transform(lambda x: x - x.shift(1, pd.Timedelta('1 days')))

  File "C:\Users\john\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 120, in __init__
    len(self.mgr_locs)))

ValueError: Wrong number of items passed 4, placement implies 3

Any ideas? Is this a bug? I think I am using the correct pandonic syntax here.

Thanks!
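
For readers following along, here is a minimal sketch of one likely contributor to the error. It is an illustration under stated assumptions, not a confirmed diagnosis of the traceback above. When `freq` is passed, `shift` moves the index labels rather than the values, so subtracting the shifted series aligns on the union of both indexes and can return more rows than the group contained, whereas `transform` expects a result the same length as its input. The series `x` below is a standalone reconstruction of group A, not code from the original issue.

```python
import pandas as pd

# Group A from the example above, rebuilt as a standalone Series
# (reconstructed for illustration only).
x = pd.Series(
    [1, 2, 3],
    index=pd.to_datetime(['2016-06-06', '2016-06-08', '2016-06-09']),
    name='myvalue',
)

# With freq given, the values stay put and every index label moves one day forward.
shifted = x.shift(1, freq=pd.Timedelta('1 days'))

# Subtraction aligns on the union of the two indexes, so the result
# has more rows than the 3-row group that went in.
diff = x - shifted

print(shifted)
print(diff)
print(len(x), len(diff))
```

Only dates that have an observation exactly one day earlier end up with a non-missing difference.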

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

1 reaction
PyJay commented, May 14, 2018

I will add this example to the documentation

1 reaction
mroeschke commented, Mar 26, 2018

You can do some reshaping and merge the result of the groupby.apply back into your original data:

In [5]: res = data.groupby('group').myvalue.apply(lambda x: x - x.shift(1, pd.Timedelta('1 days')))

In [6]: res.name = 'delta_one'

In [7]: data.reset_index().merge(res.reset_index(), how='left', on=['mydate', 'group']).set_index('mydate')
Out[7]:
           group  myvalue  delta_one
mydate
2016-06-06     A        1        NaN
2016-06-08     A        2        NaN
2016-06-09     A        3        1.0
2016-06-10     B        4        NaN
2016-06-12     B        5        NaN
2016-06-13     B        6        1.0
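
A related variant, sketched here as an assumption rather than something proposed in the thread: the same one-day lag can be expressed without `groupby` at all, by merging the data with a copy of itself whose dates are moved forward one day. The helper column `myvalue_prev` and the frame name `previous` are hypothetical names chosen for this illustration; `delta_two` is the column name the question was aiming for.

```python
import pandas as pd

data = pd.DataFrame({
    'mydate': pd.to_datetime(['2016-06-06', '2016-06-08', '2016-06-09',
                              '2016-06-10', '2016-06-12', '2016-06-13']),
    'myvalue': [1, 2, 3, 4, 5, 6],
    'group': ['A', 'A', 'A', 'B', 'B', 'B'],
})

# Copy of the data with every date moved forward one day, so a row dated D
# in `previous` carries the value that was observed on D - 1.
previous = data.rename(columns={'myvalue': 'myvalue_prev'})
previous['mydate'] = previous['mydate'] + pd.Timedelta('1 days')

# A left merge keeps every original row; rows with no observation exactly
# one day earlier in the same group get NaN for myvalue_prev.
result = data.merge(previous, how='left', on=['group', 'mydate'])
result['delta_two'] = result['myvalue'] - result['myvalue_prev']
result = result.drop(columns='myvalue_prev').set_index('mydate')

print(result)
```

Merging on both `group` and `mydate` keeps the lag from crossing group boundaries, so 2016-06-10 in group B is not matched with 2016-06-09 in group A.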