Handling a CustomBusinessDay in time-based .rolling()


Starting with 10 business days following Christmas Eve:

import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from pandas.tseries.holiday import USFederalHolidayCalendar

n = 10  # number of business days in the example
days = CustomBusinessDay(calendar=USFederalHolidayCalendar())

df = pd.DataFrame({'value': np.arange(n)},
                  index=pd.date_range('2015-12-24', periods=n, freq=days))

I can compute the three-day sum of the values with just:

In [21]: df.rolling('3d').sum()
Out[21]:
            value
2015-12-24    0.0
2015-12-28    1.0
2015-12-29    3.0
2015-12-30    6.0
2015-12-31    9.0
2016-01-04    5.0
2016-01-05   11.0
2016-01-06   18.0
2016-01-07   21.0
2016-01-08   24.0

But this is purely in terms of Gregorian calendar days, not the business calendar days that I had created the DataFrame with.

I can easily compute

df.index - 3*days

though I get a performance warning:

Non-vectorized DateOffset being applied to Series or DatetimeIndex

But I can’t just pass this offset directly:

In [23]: df.rolling(3*days).sum()
...
ValueError: <3 * CustomBusinessDays> is a non-fixed frequency

I would like to be able to handle a CustomBusinessDay in .rolling(). (Because the DataFrame may come from any source, it would be easier to just pass the DateOffset object to .rolling() instead of using anything specific to the DataFrame’s index.)

I know that the freq parameter was deprecated in 0.18, though 0.19 kinda brings this back in window. Is there an intrinsic reason window can’t handle a CustomBusinessDay?

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
chris-b1 commented, Aug 11, 2016

You could probably build a second version of this class that handled an arbitrary offset object rather than a fixed window, although it wouldn’t be performant, as there would have to be a ton of calls back into python space. Unless there’s some trick I’m not thinking of. https://github.com/pydata/pandas/blob/master/pandas/window.pyx#L252

There was an issue about cythonized offsets, #11214, and I started a proof of concept, but never went any further with it. https://github.com/chris-b1/pandas/tree/cythonize-offset
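To make the "calls back into Python space" point concrete, here is a minimal sketch of a rolling sum over an arbitrary offset. It is not the pandas implementation: `rolling_sum_by_offset` is a hypothetical helper, it assumes a sorted `DatetimeIndex`, and it uses a right-closed window `(ts - offset, ts]` to mirror what `'3d'` does. The offset is applied once per row as a scalar operation, which is the slow part.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay


def rolling_sum_by_offset(series, offset):
    """Hypothetical helper: sum each window (ts - offset, ts] on a sorted index."""
    index = series.index
    # One scalar Timestamp - DateOffset call per row, all in Python space.
    lower = pd.DatetimeIndex([ts - offset for ts in index])
    # Position of the first row strictly after each lower bound (open left edge).
    starts = index.to_numpy().searchsorted(lower.to_numpy(), side="right")
    values = series.to_numpy(dtype=float)
    # Slice from the window start up to and including the current row.
    sums = [values[start:stop + 1].sum() for stop, start in enumerate(starts)]
    return pd.Series(sums, index=index, name=series.name)


days = CustomBusinessDay(calendar=USFederalHolidayCalendar())
n = 10
df = pd.DataFrame({'value': np.arange(n)},
                  index=pd.date_range('2015-12-24', periods=n, freq=days))

business_sum = rolling_sum_by_offset(df['value'], 3 * days)
print(business_sum)
```

Because every row of this example frame falls on a business day, the result should agree with a plain three-row window (`df['value'].rolling(3, min_periods=1).sum()`). The helper only matters when the index has gaps or extra rows.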

0 reactions
mroeschke commented, Oct 5, 2020

I think this issue can be closed as the VariableOffsetWindowIndexer was added to handle non-fixed offsets. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.api.indexers.VariableOffsetWindowIndexer.html
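As a rough sketch of how the indexer mentioned above might be applied to the original example (assuming a pandas version that ships `pandas.api.indexers.VariableOffsetWindowIndexer`, and the same 10-day frame from the question), the offset is handed to the indexer together with the index, and the indexer is then passed to `.rolling()`. `min_periods=1` is set explicitly here so the first rows are not masked.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import VariableOffsetWindowIndexer
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

days = CustomBusinessDay(calendar=USFederalHolidayCalendar())
n = 10
df = pd.DataFrame({'value': np.arange(n)},
                  index=pd.date_range('2015-12-24', periods=n, freq=days))

# The indexer needs the index itself, so it is tied to this particular frame.
indexer = VariableOffsetWindowIndexer(index=df.index, offset=3 * days)
result = df.rolling(indexer, min_periods=1).sum()
print(result)
```

Note that this does not quite meet the original wish of passing only the offset to `.rolling()`: the indexer is built from `df.index`, so it has to be recreated for each DataFrame.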


