BUG: "min_period" and "min_periods" problem in df.groupby().rolling

  • [y] I have checked that this issue has not already been reported.

  • [y] I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import numpy as np
import pandas as pd

name_l = ["Alice"] * 5 + ["Bob"] * 5
val_l = [np.nan, np.nan, 1, 2, 3] + [np.nan, 1, 2, 3, 4]
test_df = pd.DataFrame([name_l,val_l]).T
test_df.columns = ["name","val"]

# Correct: the keyword is spelled "min_periods"
test_df.groupby("name")["val"].rolling(window=2, min_periods=1).sum()

# Wrong: the misspelled "min_period" keyword is accepted without an error
test_df.groupby("name")["val"].rolling(window=2, min_period=1).sum()

Problem description

The call with the misspelled keyword min_period runs without an error, but it does not behave like the call with min_periods. A misspelled keyword like this should not be allowed.
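To make the difference concrete, here is a minimal pure-Python sketch of a rolling sum with a min_periods threshold, applied to the "Alice" values from the sample. It is not pandas' implementation. The helper name rolling_sum is hypothetical, and the sketch assumes that the misspelled keyword is simply ignored, so min_periods falls back to the window size (the documented default for an integer window).

```python
import math


def rolling_sum(values, window, min_periods=None):
    """Toy rolling sum: skip NaN, emit NaN unless enough valid points exist."""
    if min_periods is None:
        # Assumed fallback: for an integer window the default is the window size.
        min_periods = window
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        valid = [v for v in chunk if not math.isnan(v)]
        out.append(sum(valid) if len(valid) >= min_periods else math.nan)
    return out


alice = [math.nan, math.nan, 1, 2, 3]

# What min_periods=1 asks for: one valid observation is enough.
with_min_periods = rolling_sum(alice, window=2, min_periods=1)

# What happens if the typo is ignored: two valid observations are required.
typo_ignored = rolling_sum(alice, window=2)

print(with_min_periods)
print(typo_ignored)
```

Under these assumptions the two results differ at the third position: the first call produces a value there, while the second still produces NaN.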


Expected Output

Passing min_period as a parameter should raise an error.
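As an illustration of the requested behaviour, the sketch below shows one way a caller could reject unknown keywords before they reach a function that swallows them through a catch-all **kwargs. Both strict_call and lenient_rolling are hypothetical names invented for this example; lenient_rolling is only a stand-in and does not reproduce the pandas rolling signature.

```python
import inspect


def strict_call(func, *args, **kwargs):
    """Call func, but raise TypeError for keywords it does not declare by name."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    named = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in named_kinds
    }
    unknown = sorted(k for k in kwargs if k not in named)
    if unknown:
        raise TypeError(f"unexpected keyword argument(s): {unknown}")
    return func(*args, **kwargs)


def lenient_rolling(window, min_periods=None, **kwargs):
    # Hypothetical stand-in: extra keywords are silently dropped.
    return {"window": window, "min_periods": min_periods}


# The typo passes through unnoticed and min_periods stays unset.
silent = lenient_rolling(2, min_period=1)

# The strict wrapper turns the same typo into an error.
try:
    strict_call(lenient_rolling, 2, min_period=1)
    typo_rejected = False
except TypeError:
    typo_rejected = True

# The correctly spelled keyword still works.
ok = strict_call(lenient_rolling, 2, min_periods=1)
```

A fix inside the library itself would more likely remove the catch-all parameter so that Python raises the TypeError on its own; the wrapper is only a caller-side workaround.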

Output of pd.show_versions()

INSTALLED VERSIONS

commit           : None
python           : 3.7.5.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.3.0-7629-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.3
numpy            : 1.16.4
pytz             : 2019.2
dateutil         : 2.7.3
pip              : 20.1
setuptools       : 46.1.3
Cython           : None
pytest           : 5.4.1
hypothesis       : None
sphinx           : 2.2.1
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.4.2
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.10
IPython          : 7.11.1
pandas_datareader: None
bs4              : 4.8.2
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.4.2
matplotlib       : 3.1.1
numexpr          : None
odfpy            : None
openpyxl         : 3.0.3
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : 5.4.1
pyxlsb           : None
s3fs             : None
scipy            : 1.3.1
sqlalchemy       : 1.3.12
tables           : None
tabulate         : None
xarray           : 0.15.1
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None
numba            : 0.46.0

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions:1
  • Comments:5 (4 by maintainers)

Top GitHub Comments

gabrielvf1 commented, May 27, 2020 (1 reaction)

I agree with you, but let's see what @charlesdong1991 thinks, just to check.

horaceklai commented, Aug 7, 2021 (0 reactions)

take
