Methods min and max give NaN in time-aware rolling window even if min_periods=1
Code Sample, a copy-pastable example if possible
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3]}, index=pd.to_datetime(['20170403', '20170404', '20170405']))
df.rolling('3d', min_periods=1)['a'].sum()
df.rolling('3d', min_periods=1)['a'].min()
df.rolling('3d', min_periods=1)['a'].max()
Problem description
Even if we set min_periods=1, the functions min and max give NaN if there is one NaN value inside the time-aware rolling window.
However, there is no bug when the window width is fixed (not a time period):
In [397]: df.rolling(3, min_periods=1)['a'].min()
Out[397]:
2017-04-03 NaN
2017-04-04 2.0
2017-04-05 2.0
Name: a, dtype: float64
Expected Output
The expected output, analogous to the one given by the function sum, should be a non-NaN value whenever there is at least one non-NaN value inside the rolling window.
In [397]: df.rolling('3d', min_periods=1)['a'].min()
Out[397]:
2017-04-03 NaN
2017-04-04 2.0
2017-04-05 2.0
Name: a, dtype: float64
In [397]: df.rolling('3d', min_periods=1)['a'].max()
Out[397]:
2017-04-03 NaN
2017-04-04 2.0
2017-04-05 3.0
Name: a, dtype: float64
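The expected values above can be cross-checked with a brute-force reference that does not rely on the rolling implementation at all. This is only a sketch: reference_rolling is a hypothetical helper, not part of pandas, and it assumes the time-aware window for a timestamp t covers the half-open interval (t - 3 days, t] and that NaN values do not count towards min_periods.

```python
import numpy as np
import pandas as pd


def reference_rolling(series, offset, func, min_periods=1):
    """Hypothetical brute-force helper (not a pandas API).

    For each timestamp t, reduce the non-NaN values whose index lies in
    the assumed window (t - offset, t]; emit NaN when fewer than
    min_periods non-NaN values are available.
    """
    out = []
    for t in series.index:
        mask = (series.index > t - offset) & (series.index <= t)
        in_window = series[mask].dropna()
        if len(in_window) >= min_periods:
            out.append(func(in_window.values))
        else:
            out.append(np.nan)
    return pd.Series(out, index=series.index, name=series.name)


df = pd.DataFrame(
    {'a': [None, 2, 3]},
    index=pd.to_datetime(['20170403', '20170404', '20170405']),
)

ref_min = reference_rolling(df['a'], pd.Timedelta(days=3), np.min)
ref_max = reference_rolling(df['a'], pd.Timedelta(days=3), np.max)
print(ref_min)
print(ref_max)
```

Comparing ref_min and ref_max against df.rolling('3d', min_periods=1)['a'].min() and .max() on a given pandas version shows whether that version is affected.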
Output of pd.show_versions()
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: None
Issue Analytics
- Created 6 years ago
- Comments:16 (15 by maintainers)
Top GitHub Comments
thanks for checking @ihsansecer
ok if someone wants to take a crack at this, have at it.