BUG: SeriesGroupBy.value_counts() throws IndexError if there is only one group

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd
pd.DataFrame([[0, 1]], columns=('a', 'b')).groupby('a')['b'].value_counts()
IndexError                                Traceback (most recent call last)
<ipython-input-53-ab16326ac16f> in <module>
----> 1 pd.DataFrame([[0,1]], columns=('a', 'b')).groupby('a')['b'].value_counts()

~/.venvs/python3/lib/python3.9/site-packages/pandas/core/groupby/generic.py in value_counts(self, normalize, sort, ascending, bins, dropna)
    762         if not len(lchanges):
    763             inc = lchanges
--> 764         inc[idx] = True  # group boundaries are also new values
    765         out = np.diff(np.nonzero(np.r_[inc, True])[0])  # value counts
    766

IndexError: index 0 is out of bounds for axis 0 with size 0

Problem description

Calling value_counts() on a SeriesGroupBy throws an IndexError when there is only one group.
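
The traceback above points at the line `inc[idx] = True`. The following is a minimal NumPy sketch of how that line can fail, not the pandas source itself: the array names mirror the traceback, but the way `lchanges` and `idx` are built here is a simplified assumption based on the one-row example. With a single row there are no adjacent labels to compare, so `lchanges` is empty, `inc` is replaced by that empty array, and the group-boundary index 0 no longer fits.

```python
import numpy as np

# Sorted value labels for the one-row example (assumed shape: one label).
labels = np.array([0])

# "Changes" between adjacent labels: empty when there is only one row.
lchanges = labels[1:] != labels[:-1]

# Mirrors the lines shown in the traceback.
inc = np.r_[True, lchanges]
if not len(lchanges):
    inc = lchanges  # inc is now an empty array

# The first (and only) group starts at position 0.
idx = np.array([0])

try:
    inc[idx] = True  # group boundaries are also new values
    message = None
except IndexError as exc:
    message = str(exc)

print(message)
```

If this reading is right, the trigger is a frame with a single row rather than a single group as such, since one group with two or more rows gives a non-empty `lchanges`.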

Expected Output

a  b
0  1    1
Name: b, dtype: int64
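
Until a fixed release is available, one possible workaround is to count rows per (group, value) pair with `groupby(...).size()`. This is a sketch under assumptions, not the fix adopted by pandas: `value_counts_via_size` is a hypothetical helper name, the result is ordered by key rather than by count, and the resulting Series is not named `b`.

```python
import pandas as pd


def value_counts_via_size(df, by, column):
    """Hypothetical helper: count rows for each (by, column) pair.

    Avoids SeriesGroupBy.value_counts(); ordering follows the keys,
    not the counts, so it is not a drop-in replacement.
    """
    return df.groupby([by, column]).size()


df = pd.DataFrame([[0, 1]], columns=("a", "b"))
counts = value_counts_via_size(df, "a", "b")
print(counts)
```

The index levels are still `a` and `b`, so the single expected count can be read with `counts.loc[(0, 1)]`.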

Output of pd.show_versions()

INSTALLED VERSIONS

commit : f00ed8f47020034e752baf0250483053340971b0
python : 3.9.4.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Apr 12 20:57:45 PDT 2021; root:xnu-6153.141.28.1~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 54.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.23.0
pandas_datareader : None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 4.0.0
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

neelmraman commented, Jul 23, 2021 (1 reaction)

@rhshadrach mind also taking a look at the PR when you get the chance? thanks!

simonjayhawkins commented, Jul 23, 2021 (0 reactions)

regression from 1.2.5

first bad commit: [fc08415436b0f5c3f8482e89c50fe3c5e8f2d381] BUG: SeriesGroupBy.value_counts() raising error on an empty Series (#39326)


