BUG: "min_period" and "min_periods" problem in df.groupby().rolling
- [y] I have checked that this issue has not already been reported.
- [y] I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import numpy as np
import pandas as pd

name_l = ["Alice"] * 5 + ["Bob"] * 5
val_l = [np.nan, np.nan, 1, 2, 3] + [np.nan, 1, 2, 3, 4]
test_df = pd.DataFrame([name_l, val_l]).T
test_df.columns = ["name", "val"]
# correct one: "min_periods" is the documented parameter name
test_df.groupby("name")["val"].rolling(window=2, min_periods=1).sum()
# wrong one with the misspelled "min_period" parameter
test_df.groupby("name")["val"].rolling(window=2, min_period=1).sum()
Problem description
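The call with the misspelled min_period keyword is accepted without an error, but it does not behave the same as the call with min_periods. A misspelled keyword like this should not be allowed.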
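To make the difference concrete, here is a minimal sketch that compares min_periods=1 with the default. It assumes the misspelled keyword is simply ignored so that the default applies, which the report suggests but does not state outright. The names df, grouped, with_min, default and only_with_min are illustrative and not part of the original report.

```python
import numpy as np
import pandas as pd

# Same data as in the report, built column-wise so "val" is a float column.
df = pd.DataFrame({
    "name": ["Alice"] * 5 + ["Bob"] * 5,
    "val": [np.nan, np.nan, 1, 2, 3] + [np.nan, 1, 2, 3, 4],
})
grouped = df.groupby("name")["val"]

# Explicit min_periods=1: one valid observation in the window is enough.
with_min = grouped.rolling(window=2, min_periods=1).sum()

# Default: for an integer window, min_periods falls back to the window size (2).
default = grouped.rolling(window=2).sum()

# Rows where min_periods=1 yields a value but the default yields NaN.
only_with_min = with_min.notna() & default.isna()
print(with_min[only_with_min])
```

With this data the two results should differ only at the first valid observation of each group, rows ("Alice", 2) and ("Bob", 6), which is exactly where a silently ignored typo would change the output.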
Expected Output
The call that passes min_period as a parameter should raise an error instead of returning a result.
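As a rough illustration of the requested behaviour, the sketch below contrasts a function that swallows unknown keywords through **kwargs with one that has a strict signature. Both rolling_lenient and rolling_strict are hypothetical stand-ins written for this example. They are not pandas functions and make no claim about how pandas implements rolling internally.

```python
def rolling_lenient(window, min_periods=None, **kwargs):
    """Hypothetical stand-in: unknown keywords land in kwargs and are dropped."""
    return {"window": window, "min_periods": min_periods}


def rolling_strict(window, min_periods=None):
    """Hypothetical stand-in: the signature rejects unknown keywords."""
    return {"window": window, "min_periods": min_periods}


# The typo goes unnoticed and min_periods keeps its default of None.
print(rolling_lenient(2, min_period=1))

# Python itself raises TypeError for the unexpected keyword.
try:
    rolling_strict(2, min_period=1)
except TypeError as err:
    print("rejected:", err)
```

The strict variant surfaces the typo at the call site, which is the outcome this report asks for.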
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-7629-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.3
numpy : 1.16.4
pytz : 2019.2
dateutil : 2.7.3
pip : 20.1
setuptools : 46.1.3
Cython : None
pytest : 5.4.1
hypothesis : None
sphinx : 2.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : 7.11.1
pandas_datareader : None
bs4 : 4.8.2
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.3.1
sqlalchemy : 1.3.12
tables : None
tabulate : None
xarray : 0.15.1
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : 0.46.0
Issue Analytics
- Created 3 years ago
- Reactions:1
- Comments:5 (4 by maintainers)
Top GitHub Comments
I agree with you, but let's see what @charlesdong1991 thinks, just to check.
take