Time-based rolling apply inconsistent behavior on NaN rows
See original GitHub issueCode Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
s = pd.Series([1, 2, 3, np.nan, 5], index=pd.date_range('2017-01-01', '2017-01-05'))
print(s.rolling('3d', min_periods=1).apply(lambda x: 42))
Problem description
Output of the above code sample:
2017-01-01 42.0
2017-01-02 42.0
2017-01-03 42.0
2017-01-04 NaN
2017-01-05 42.0
It seems that the user function did not get applied to the window corresponding to the original NaN row, resulting in NaN as the result for that row. Why is that? The more reasonable output would be all 42 because by the min_periods
constraint, all of the windows are valid.
Compare this to the equivalent fixed window version:
s = pd.Series([1, 2, 3, np.nan, 5])
print(s.rolling(3, min_periods=1).apply(lambda x: 42))
which gives the following output:
0 42.0
1 42.0
2 42.0
3 42.0
4 42.0
Is this inconsistency between fixed and variable size window a desired behavior?
Expected Output
2017-01-01 42.0
2017-01-02 42.0
2017-01-03 42.0
2017-01-04 42.0
2017-01-05 42.0
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None python: 2.7.13.final.0 python-bits: 64 OS: Linux OS-release: 4.11.9-1-ARCH machine: x86_64 processor: byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None
pandas: 0.21.0.dev+316.gf2b0bdc9b pytest: None pip: 9.0.1 setuptools: 36.2.5 Cython: 0.26 numpy: 1.13.1 scipy: None pyarrow: None xarray: None IPython: 5.4.1 sphinx: None patsy: None dateutil: 2.6.1 pytz: 2017.2 blosc: None bottleneck: None tables: None numexpr: None feather: None matplotlib: None openpyxl: None xlrd: None xlwt: None xlsxwriter: None lxml: None bs4: None html5lib: None sqlalchemy: None pymysql: None psycopg2: None jinja2: None s3fs: None pandas_gbq: None pandas_datareader: None
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
@YS-L : No worries! We don’t expect everyone to pull
master
and buildpandas
from source, though since you’re an Arch user, I guess I would expect that from you 😉But actually, thanks for taking the time to do this. We’re happy to see that we anticipated your problem 😄
Yes, pulled
master
to confirm.Thought I found a fix after some digging into
window.pyx
, just to discover that it had been applied there on master. Note to self: should always test on latest master.