BUG: SeriesGroupBy.value_counts() throws IndexError if there is only one group
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
pd.DataFrame([[0, 1]], columns=('a', 'b')).groupby('a')['b'].value_counts()
IndexError                                Traceback (most recent call last)
<ipython-input-53-ab16326ac16f> in <module>
----> 1 pd.DataFrame([[0,1]], columns=('a', 'b')).groupby('a')['b'].value_counts()

~/.venvs/python3/lib/python3.9/site-packages/pandas/core/groupby/generic.py in value_counts(self, normalize, sort, ascending, bins, dropna)
    762         if not len(lchanges):
    763             inc = lchanges
--> 764         inc[idx] = True  # group boundaries are also new values
    765         out = np.diff(np.nonzero(np.r_[inc, True])[0])  # value counts
    766

IndexError: index 0 is out of bounds for axis 0 with size 0
Problem description
Calling value_counts() on a SeriesGroupBy throws an IndexError when there is only one group.
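The failing assignment in the traceback can be reproduced with NumPy alone. The following is a minimal sketch, not the pandas implementation: the names `lchanges`, `inc`, and `idx` are taken from the traceback, but their values here are assumptions chosen to match the one-row example (an empty `lchanges`, and a single group boundary at position 0).

```python
import numpy as np

# Assumed values for the locals named in the traceback (hypothetical stand-ins).
lchanges = np.array([], dtype=bool)  # assumed empty for a one-row input
idx = np.array([0])                  # assumed: one group boundary at position 0

# Mirrors lines 762-763 of the traceback: with an empty lchanges, inc is empty too.
if not len(lchanges):
    inc = lchanges

# Mirrors line 764: indexing position 0 of an empty array fails.
try:
    inc[idx] = True
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

Under these assumptions `inc` has size 0 while `idx` still points at position 0, which is consistent with the reported "index 0 is out of bounds for axis 0 with size 0".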
Expected Output
a  b
0  1    1
Name: b, dtype: int64
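Until a fix is available, the same counts can be obtained another way. This is a possible workaround sketch and is not taken from the issue: it groups by both columns and uses `size()`, then renames the result to `b` to match the expected output above. Note that for larger inputs the row ordering may differ from `value_counts()`, which sorts by count within each group by default.

```python
import pandas as pd

df = pd.DataFrame([[0, 1]], columns=("a", "b"))

# Count rows per (a, b) pair instead of calling value_counts() on the grouped column.
counts = df.groupby(["a", "b"]).size().rename("b")

print(counts)
```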
Output of pd.show_versions()
INSTALLED VERSIONS
commit : f00ed8f47020034e752baf0250483053340971b0
python : 3.9.4.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Apr 12 20:57:45 PDT 2021; root:xnu-6153.141.28.1~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.0
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 54.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.23.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 4.0.0
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
Issue Analytics
- Created 2 years ago
- Comments: 5 (4 by maintainers)
@rhshadrach mind also taking a look at the PR when you get the chance? thanks!
regression from 1.2.5
first bad commit: [fc08415436b0f5c3f8482e89c50fe3c5e8f2d381] BUG: SeriesGroupBy.value_counts() raising error on an empty Series (#39326)