Bug in np.histogram when min == max and values are >= 2**53

The following call to histogram fails:

In [5]: np.histogram([2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:566: RuntimeWarning: divide by zero encountered in double_scalars
  norm = bins / (mx - mn)
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:591: RuntimeWarning: invalid value encountered in multiply
  tmp_a *= norm
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-41f77d7d86af> in <module>()
----> 1 np.histogram([2**53])

/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
    598             # The index computation is not guaranteed to give exactly
    599             # consistent results within ~1 ULP of the bin edges.
--> 600             decrement = tmp_a_data < bin_edges[indices]
    601             indices[decrement] -= 1
    602             # The last bin includes the right edge. The other bins do not.

IndexError: index -9223372036854775808 is out of bounds for axis 1 with size 11

whereas this succeeds:

In [6]: np.histogram([2**53-1])
Out[6]: 
(array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0]),
 array([  9.00719925e+15,   9.00719925e+15,   9.00719925e+15,
          9.00719925e+15,   9.00719925e+15,   9.00719925e+15,
          9.00719925e+15,   9.00719925e+15,   9.00719925e+15,
          9.00719925e+15,   9.00719925e+15]))

The threshold for triggering the bug appears to be 2**53, but this only happens when the min of the array is the same as the max:

In [7]: np.histogram([2**53, 2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:566: RuntimeWarning: divide by zero encountered in double_scalars
  norm = bins / (mx - mn)
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:591: RuntimeWarning: invalid value encountered in multiply
  tmp_a *= norm
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-f74f7a6cf401> in <module>()
----> 1 np.histogram([2**53, 2**53])

/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
    598             # The index computation is not guaranteed to give exactly
    599             # consistent results within ~1 ULP of the bin edges.
--> 600             decrement = tmp_a_data < bin_edges[indices]
    601             indices[decrement] -= 1
    602             # The last bin includes the right edge. The other bins do not.

IndexError: index -9223372036854775808 is out of bounds for axis 1 with size 11

When the min and the max are different, things are fine:

In [8]: np.histogram([2**53, 2**54])
Out[8]: 
(array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1]),
 array([  9.00719925e+15,   9.90791918e+15,   1.08086391e+16,
          1.17093590e+16,   1.26100790e+16,   1.35107989e+16,
          1.44115188e+16,   1.53122387e+16,   1.62129587e+16,
          1.71136786e+16,   1.80143985e+16]))
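
The 2**53 threshold is consistent with float64 rounding. The sketch below assumes that, when min == max, the histogram code widens the range by 0.5 on each side before computing `bins / (mx - mn)`; that adjustment is not shown in the traceback above, and `widen_by_half` is a hypothetical stand-in for it, not a NumPy function. Under that assumption, the widened range still has nonzero width at 2**53 - 1 but collapses to zero width at 2**53, where neighbouring floats are too far apart for 0.5 to register.

```python
def widen_by_half(mn, mx):
    """Hypothetical stand-in for the assumed min == max adjustment."""
    return mn - 0.5, mx + 0.5


def widened_width(value):
    """Width of the range after widening a single-valued input."""
    mn, mx = widen_by_half(float(value), float(value))
    return mx - mn


# Below 2**53 the widening changes both edges, so the width is nonzero.
print(2**53 - 1, widened_width(2**53 - 1))

# At 2**53 both edges round back to 2**53, so the width is 0.0 and
# bins / (mx - mn) becomes a division by zero.
print(2**53, widened_width(2**53))
```

This only illustrates the rounding; it says nothing about how the resulting infinite or NaN values later turn into the out-of-bounds index.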

Issue Analytics

  • State: open
  • Created 7 years ago
  • Reactions: 2
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

sanjok-bless commented, Nov 1, 2018 (1 reaction)

I have the same issue with a big double number, reproducible code:

double_numbers = np.array([1e20] * 20)
np.histogram(double_numbers, bins=100)

I get an error:

IndexError                                Traceback (most recent call last)
<ipython-input-2-51958daa5f2c> in <module>()
      1 double_numbers = np.array([1e20] * 20)
----> 2 np.histogram(double_numbers, bins=100)

/home/oleksandr/.pyenv/versions/2.7.8/envs/dr2.7.8/lib/python2.7/site-packages/numpy/lib/histograms.pyc in histogram(a, bins, range, normed, weights, density)
    764             # The index computation is not guaranteed to give exactly
    765             # consistent results within ~1 ULP of the bin edges.
--> 766             decrement = tmp_a < bin_edges[indices]
    767             indices[decrement] -= 1
    768             # The last bin includes the right edge. The other bins do not.

IndexError: index -9223372036854775808 is out of bounds for axis 1 with size 101

I use NumPy 1.15.1, and the bug seems to have existed since at least version 1.10. Are there any plans to fix this?

collinmccarthy commented, May 14, 2020 (0 reactions)

I’m getting this same error with the following: np.histogram(np.nan_to_num(np.inf))

This makes it difficult to create a histogram when some values are +/- infinity, which happens a lot when logging gradient information in deep learning frameworks like PyTorch and TensorFlow.

Are there still plans to fix this issue? If not, what should I set these large values to (sometimes they’re in 32- or 64-bit arrays) to avoid this? Thank you.
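
One possible workaround, sketched here as an assumption rather than a fix confirmed in the thread, is to compute a non-degenerate range yourself and pass it through the `range` argument of `np.histogram`. The helper `widen_until_distinct` is hypothetical and uses only the standard library; it doubles the padding until both edges actually differ from the data value in floating point.

```python
import math


def widen_until_distinct(lo, hi, pad=0.5):
    """Return a (lo, hi) range with nonzero width (hypothetical helper).

    If lo < hi the range is returned unchanged. Otherwise the padding is
    doubled until subtracting and adding it changes both edges.
    """
    if not (math.isfinite(lo) and math.isfinite(hi)):
        raise ValueError("range limits must be finite")
    if lo < hi:
        return lo, hi
    while lo - pad == lo or hi + pad == hi:
        pad *= 2.0
    lo, hi = lo - pad, hi + pad
    if not (math.isfinite(lo) and math.isfinite(hi)):
        raise ValueError("value too close to the float limit to widen")
    return lo, hi


# Intended use (assumes NumPy is installed and `a` is a non-empty array):
#   lo, hi = widen_until_distinct(float(a.min()), float(a.max()))
#   counts, edges = np.histogram(a, bins=100, range=(lo, hi))
print(widen_until_distinct(1e20, 1e20))
```

The helper deliberately refuses values at the edge of the float range, such as the result of `np.nan_to_num(np.inf)`, because widening them overflows to infinity; those would need to be clipped to a smaller magnitude first. The `np.histogram` call in the comment has not been checked against every NumPy version.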
