BUG: DataFrame.groupby drops timedelta column in v1.3.0
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd

df = pd.DataFrame(
    {
        "duration": [pd.Timedelta(i, unit="days") for i in range(1, 6)],
        "value": [1, 0, 1, 2, 1],
    }
)
df.groupby("value").sum()
results in:
Empty DataFrame
Columns: []
Index: [0, 1, 2]
The same behaviour occurs whether the aggregation is sum, mean, median, etc.
Note that selecting the column first gives close to the expected result (a Series rather than a DataFrame):
df.groupby("value")["duration"].sum()
value
0 2 days
1 9 days
2 4 days
Name: duration, dtype: timedelta64[ns]
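If a DataFrame is needed, the Series result above can be converted back to one. The following is a minimal sketch of that workaround, not a fix for the underlying bug; the variable name `workaround` is only illustrative, and it relies on nothing beyond the column selection shown above plus `Series.to_frame()`.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "duration": [pd.Timedelta(i, unit="days") for i in range(1, 6)],
        "value": [1, 0, 1, 2, 1],
    }
)

# Aggregate the selected column as a Series (this path keeps the timedeltas),
# then promote the result back to a one-column DataFrame.
workaround = df.groupby("value")["duration"].sum().to_frame()
print(workaround)
```

The result is indexed by the groupby keys and has a single "duration" column, matching the shape of the expected output.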
Also, this problem may be unique to Timedeltas, as the following similar example with an integer column has no problem:
df = pd.DataFrame(
    {
        "duration": [i for i in range(1, 6)],
        "value": [1, 0, 1, 2, 1],
    }
)
df.groupby("value").sum()
gives:
       duration
value
0             2
1             9
2             4
Problem description
In v1.3.0 (and master, commit dad3e7), a timedelta column goes missing when performing a groupby operation on a DataFrame. This behaviour does not occur in pandas 1.2.5.
Expected Output
The expected output is given by pandas 1.2.5:
      duration
value
0       2 days
1       9 days
2       4 days
It is a DataFrame with one column, “duration”, indexed by the groupby keys.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : f00ed8f47020034e752baf0250483053340971b0
python : 3.7.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.3.0
numpy : 1.21.0
pytz : 2021.1
dateutil : 2.8.1
pip : 19.2.3
setuptools : 41.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
+1 on defaulting numeric_only to False. To have code not result in summing b is surprising.
I’m fine with that.
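To make the numeric_only discussion concrete, here is a small sketch that passes the flag explicitly instead of relying on the default. It assumes the installed pandas accepts `numeric_only` on the groupby `sum`; whether this avoids the dropped column on 1.3.0 specifically is not confirmed here, so treat it as something to try rather than a verified fix.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "duration": [pd.Timedelta(i, unit="days") for i in range(1, 6)],
        "value": [1, 0, 1, 2, 1],
    }
)

# Ask explicitly for non-numeric columns to be included in the aggregation,
# rather than depending on the version-specific default of numeric_only.
result = df.groupby("value").sum(numeric_only=False)

# Report whether the timedelta column survived the aggregation.
print("duration" in result.columns)
```

Being explicit about the flag keeps the call's intent the same across pandas versions, whatever the default happens to be.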