Bug in np.histogram when min == max and values are >= 2**53
The following call to histogram fails:
In [5]: np.histogram([2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:566: RuntimeWarning: divide by zero encountered in double_scalars
norm = bins / (mx - mn)
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:591: RuntimeWarning: invalid value encountered in multiply
tmp_a *= norm
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-5-41f77d7d86af> in <module>()
----> 1 np.histogram([2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
598 # The index computation is not guaranteed to give exactly
599 # consistent results within ~1 ULP of the bin edges.
--> 600 decrement = tmp_a_data < bin_edges[indices]
601 indices[decrement] -= 1
602 # The last bin includes the right edge. The other bins do not.
IndexError: index -9223372036854775808 is out of bounds for axis 1 with size 11
whereas this succeeds:
In [6]: np.histogram([2**53-1])
Out[6]:
(array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0]),
array([ 9.00719925e+15, 9.00719925e+15, 9.00719925e+15,
9.00719925e+15, 9.00719925e+15, 9.00719925e+15,
9.00719925e+15, 9.00719925e+15, 9.00719925e+15,
9.00719925e+15, 9.00719925e+15]))
The threshold for triggering the bug appears to be 2**53, but this only happens when the min of the array is the same as the max:
In [7]: np.histogram([2**53, 2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:566: RuntimeWarning: divide by zero encountered in double_scalars
norm = bins / (mx - mn)
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py:591: RuntimeWarning: invalid value encountered in multiply
tmp_a *= norm
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-7-f74f7a6cf401> in <module>()
----> 1 np.histogram([2**53, 2**53])
/Users/tom/miniconda3/envs/dev/lib/python3.6/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
598 # The index computation is not guaranteed to give exactly
599 # consistent results within ~1 ULP of the bin edges.
--> 600 decrement = tmp_a_data < bin_edges[indices]
601 indices[decrement] -= 1
602 # The last bin includes the right edge. The other bins do not.
IndexError: index -9223372036854775808 is out of bounds for axis 1 with size 11
When the min and the max are different, things are fine:
In [8]: np.histogram([2**53, 2**54])
Out[8]:
(array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1]),
array([ 9.00719925e+15, 9.90791918e+15, 1.08086391e+16,
1.17093590e+16, 1.26100790e+16, 1.35107989e+16,
1.44115188e+16, 1.53122387e+16, 1.62129587e+16,
1.71136786e+16, 1.80143985e+16]))
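A possible explanation for the 2**53 threshold, sketched in plain Python floats (which are IEEE 754 doubles, like float64). It assumes that when min == max the range is widened by 0.5 on each side before `norm = bins / (mx - mn)` is computed; the helper `widened` is hypothetical and only mimics that step. At 2**53 the spacing between adjacent doubles becomes 2, so both `x - 0.5` and `x + 0.5` round back to `x` and the width collapses to zero.

```python
import math

def widened(value, pad=0.5):
    """Pad a degenerate range [value, value] by `pad` on each side (hypothetical helper)."""
    x = float(value)
    return x - pad, x + pad

mn_ok, mx_ok = widened(2**53 - 1)   # just below the threshold
mn_bad, mx_bad = widened(2**53)     # at the threshold

width_ok = mx_ok - mn_ok     # padding survives rounding
width_bad = mx_bad - mn_bad  # padding is rounded away

# NumPy evaluates bins / 0.0 to inf (with the "divide by zero" warning);
# plain Python would raise ZeroDivisionError, so inf is written out here.
norm = math.inf
scaled = (float(2**53) - mn_bad) * norm  # 0.0 * inf

print(width_ok, width_bad, scaled)
```

If this reading is right, the NaN produced by `0.0 * inf` matches the "invalid value encountered in multiply" warning, and casting that NaN to an integer index would account for the huge negative index in the traceback.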
Issue Analytics
- State:
- Created 7 years ago
- Reactions:2
- Comments:10 (7 by maintainers)
I have the same issue with a big double number, reproducible code:
I get an error:
I'm using numpy 1.15.1, and the bug appears to have existed since at least version 1.10. Are there any plans to fix this?
I’m getting this same error with the following:
np.histogram(np.nan_to_num(np.inf))
This makes it difficult to create a histogram when some values are +/- infinity, which happens a lot when logging gradient information in deep learning frameworks like PyTorch and TensorFlow.
Are there still plans to fix this issue? If not, what should I set these large values to (sometimes they're in 32-bit or 64-bit arrays) to avoid this? Thank you.
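One possible workaround, as a rough sketch rather than a confirmed fix: clip the values to a finite bound and compute an explicit range that is guaranteed to have non-zero width in float64. The helper name `clip_and_range` and the bound `limit=1e300` are assumptions made up for illustration, `math.ulp` needs Python 3.9 or later, and the input is assumed to be non-empty and free of NaN.

```python
import math

def clip_and_range(values, limit=1e300):
    """Clip values to [-limit, limit] and return (clipped, (lo, hi)) with lo < hi.

    Hypothetical helper; assumes a non-empty input without NaN.
    """
    clipped = [min(max(float(v), -limit), limit) for v in values]
    lo, hi = min(clipped), max(clipped)
    if lo == hi:
        # Pad by at least a few ULPs so the padding is not rounded away.
        pad = max(0.5, 4 * math.ulp(lo))
        lo, hi = lo - pad, hi + pad
    return clipped, (lo, hi)

big, big_range = clip_and_range([2**53])
inf_vals, inf_range = clip_and_range([math.inf])
small, small_range = clip_and_range([1.0])

print(big_range, inf_range, small_range)
```

The clipped values and the range could then be passed as `np.histogram(clipped, range=(lo, hi))`. I have not checked this against every NumPy version, so treat it as something to try.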