BUG: scipy.stats.multivariate_hypergeom.rvs raises ValueError when at least the last two populations are 0

Describe your issue.

I ran into a strange issue when sampling from SciPy’s multivariate hypergeometric distribution. Calling scipy.stats.multivariate_hypergeom.rvs on a list of populations raises ValueError: ngood + nbad < nsample, while calling the same function on the sorted array does not. Specifically, the error seems to be raised when at least the last two values of the population are 0.

Reproducing Code Example

import scipy.stats
r = scipy.stats.multivariate_hypergeom.rvs
res1 = r([0, 20], 5) # Returns [0, 5]
res2 = r([0, 0, 20], 5) # Returns [0, 0, 5]
res3 = r([0, 0, 20, 0], 5) # Returns [0, 0, 5, 0]
res4 = r([0, 0, 20, 0, 0], 5) # Raises error

Error message

Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "/home/alexander/miniconda3/envs/py37/lib/python3.7/site-packages/scipy/stats/_multivariate.py", line 4726, in rvs
    size=size))
  File "mtrand.pyx", line 3839, in numpy.random.mtrand.RandomState.hypergeometric
ValueError: ngood + nbad < nsample

SciPy/NumPy/Python version information

1.7.3 1.21.5 sys.version_info(major=3, minor=10, micro=4, releaselevel='final', serial=0)

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

WarrenWeckesser commented, May 13, 2022

My comment about the arguments being invalid needs some qualification.

If we could rely on random_state being a Generator instance (i.e. if we could ensure that we were using the new NumPy random API instead of the legacy API), we could replace that line with:

            rvs[..., c] = random_state.hypergeometric(m[..., c], rem, n,
                                                      size=size)

n might be 0, but Generator.hypergeometric will do the right thing. For example,

In [8]: rng = np.random.default_rng()

In [9]: rng.hypergeometric(0, 1, 0)
Out[9]: 0

In [10]: rng.hypergeometric(1, 0, 0)
Out[10]: 0

In [11]: rng.hypergeometric(0, 0, 0)
Out[11]: 0

The reason for the shenanigans with n != 0 and n == 0 is that we have to support the legacy random API (and we have to support broadcasting, so we can’t use a simple if statement), and np.random.hypergeometric doesn’t allow n == 0:

In [16]: np.random.hypergeometric(0, 1, 0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [16], in <module>
----> 1 np.random.hypergeometric(0, 1, 0)

File mtrand.pyx:3871, in numpy.random.mtrand.RandomState.hypergeometric()

File _common.pyx:919, in numpy.random._common.disc()

File _common.pyx:436, in numpy.random._common.check_constraint()

ValueError: nsample < 1 or nsample is NaN

The code as written fixes that problem by multiplying the result of random_state.hypergeometric() by (n != 0); this ensures that wherever n is 0, the result is 0. It passes the argument n + (n == 0) to random_state.hypergeometric() to prevent the ValueError from being raised in that case. However, we missed the case where both of the first two arguments to hypergeometric() are 0. In that case, our code does, in effect:

In [18]: np.random.hypergeometric(0, 0, 1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [18], in <module>
----> 1 np.random.hypergeometric(0, 0, 1)

File mtrand.pyx:3870, in numpy.random.mtrand.RandomState.hypergeometric()

ValueError: ngood + nbad < nsample
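
The masking trick described above can be tried in isolation with the legacy API. This is a small sketch with made-up input arrays (ngood, nbad, n and the second set of arrays are illustrative values, not taken from SciPy): it shows the padded sample size working when some population remains, and then computes, without calling the sampler, which entries would violate the legacy constraint.

```python
import numpy as np

rs = np.random.RandomState(0)  # legacy random API

# Two broadcast entries: the first has n == 0, the second draws everything.
ngood = np.array([3, 3])
nbad = np.array([2, 2])
n = np.array([0, 5])

# Pad n to 1 where it is 0 so the legacy call is valid,
# then zero out those entries afterwards.
padded_n = n + (n == 0)
draws = (n != 0) * rs.hypergeometric(ngood, nbad, padded_n)
print(draws)

# The missed case: an entry where ngood, nbad and n are all 0.
# After padding, nsample is 1 but ngood + nbad is still 0.
ngood2 = np.array([3, 0])
nbad2 = np.array([2, 0])
n2 = np.array([5, 0])
violates = (ngood2 + nbad2) < (n2 + (n2 == 0))
print(violates)
```

The first entry of draws is always 0 because of the mask, and the second is always 3 because all five items are drawn. The second entry of violates marks the input that triggers the ValueError.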

One way to fix this is to add (n == 0) to rem in addition to n:

            n0mask = n == 0
            rvs[..., c] = (~n0mask *
                           random_state.hypergeometric(m[..., c],
                                                       rem + n0mask,
                                                       n + n0mask,
                                                       size=size))
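
As a rough check of this idea, the sketch below applies the same padding to a simplified 1-D sampler built on the legacy RandomState. The function sample_mvhg_legacy is a hypothetical stand-in, not the SciPy source, and it uses an integer pad as the scalar analogue of n0mask.

```python
import numpy as np


def sample_mvhg_legacy(m, n, random_state):
    """Hypothetical 1-D sketch of the proposed fix using the legacy API."""
    m = np.asarray(m, dtype=np.int64)
    out = np.zeros_like(m)
    rem = int(m.sum())  # population not yet considered
    n = int(n)          # draws still to be assigned
    for c in range(m.size - 1):
        rem -= int(m[c])
        pad = int(n == 0)  # scalar analogue of n0mask
        # Pad both nbad and nsample so that ngood + nbad >= nsample >= 1
        # always holds, then discard the dummy draw when n was 0.
        draw = (1 - pad) * int(
            random_state.hypergeometric(int(m[c]), rem + pad, n + pad)
        )
        out[c] = draw
        n -= draw
    out[-1] = n  # whatever is left belongs to the last category
    return out


rs = np.random.RandomState(0)
patched = sample_mvhg_legacy([0, 0, 20, 0, 0], 5, rs)
print(patched)
```

For the input from the report, the fourth category is reached with n == 0 and rem == 0, so the padded call becomes hypergeometric(0, 1, 1), which is valid under the legacy constraints.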

WarrenWeckesser commented, May 13, 2022

@tirthasheshpatel, I probably wouldn’t get to a PR right away, so if you have time now, go for it!

