BUG: groupby behavior changed since 1.4.x
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
# After 1.4.x
>>> import pandas as pd
>>> pdf = pd.DataFrame({"timestamp": [0.0], "car_id": ["A"]})
>>> pdf.groupby("car_id").apply(lambda _: pd.DataFrame({"column": [0.0]})).index
Int64Index([0], dtype='int64')
>>> pdf.groupby("car_id").apply(lambda _: pd.DataFrame({"column": [0.0]}))
   column
0     0.0
Issue Description
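groupby behavior changed since 1.4.x: in the example above, the result of groupby("car_id").apply(...) now has a plain integer index, where it previously had a MultiIndex with the group key (car_id) as the outer level.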
Expected Behavior
# Before 1.4.x
>>> import pandas as pd
>>> pdf = pd.DataFrame({"timestamp": [0.0], "car_id": ["A"]})
>>> pdf.groupby("car_id").apply(lambda _: pd.DataFrame({"column": [0.0]})).index
MultiIndex([('A', 0)],
           names=['car_id', None])
>>> pdf.groupby("car_id").apply(lambda _: pd.DataFrame({"column": [0.0]}))
          column
car_id
A      0     0.0
Installed Versions
1.4.x
Issue Analytics
- State:
- Created a year ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
The bisected commit removed apply_frame_axis0, which was a fastpath that behaved subtly differently from the non-fastpath in ways that we were never really able to consistently debug. The non-fastpath (new and current) behavior is the canonically correct behavior.
closing as duplicate and resolved by calling out this breaking change in the 1.4 release notes https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.4.0.html#groupby-apply-consistent-transform-detection
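For code that depended on the old MultiIndex result, one option that does not rely on how apply classifies the returned frame is to build the result per group and let pd.concat attach the group key. This is only a sketch and was not suggested in the thread; make_frame is a hypothetical stand-in for the lambda in the report.

```python
import pandas as pd

pdf = pd.DataFrame({"timestamp": [0.0], "car_id": ["A"]})


def make_frame(_group):
    # Hypothetical stand-in for the lambda used in the report.
    return pd.DataFrame({"column": [0.0]})


# Iterate over the groups and pass a dict to pd.concat: the dict keys
# (the car_id values) become the outer index level, and the index of
# each returned frame becomes the inner level.
result = pd.concat(
    {key: make_frame(group) for key, group in pdf.groupby("car_id")},
    names=["car_id"],
)

print(result.index)
```

Because the group key is added explicitly by pd.concat, the index shape does not depend on whether pandas treats the applied function as a transform.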