hist breaks when range of input data is very large

When hist is given data spanning a very large range and the 'freedman' bin method is chosen, memory use grows rapidly (15 GB when I killed the kernel after a few minutes), CPU usage jumps to 100%, and the histogram is never plotted, even if the number of data points is small.

The case below reproduces the behavior on my computer. I don’t think there is anything special about this set of numbers, just that the range needs to be large. Note that only 10 points need to be histogrammed.

import numpy as np
from astropy.visualization import hist

# Nine values near zero plus one outlier near 1e6 give a very large range.
data = [ 9.99999914e+05, -8.31312483e-03,  6.52755852e-02,  1.43104653e-03,
        -2.26311017e-02,  2.82660007e-03,  1.80307521e-02,  9.26294279e-03,
         5.06606026e-02,  2.05418011e-03]
hist(data, bins='freedman')

I’ve seen this issue in both astropy 2.0.8 and 3.0.4, each under Python 3.6.
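To see why so few points can cause trouble, it may help to estimate the bin count without plotting anything. The sketch below assumes that the 'freedman' option follows the standard Freedman-Diaconis rule (bin width = 2 * IQR / n^(1/3)); the helper name estimate_fd_bins is hypothetical and is not part of astropy. It uses only NumPy and the data from the report above.

```python
import numpy as np


def estimate_fd_bins(values):
    """Return (bin_width, bin_count) under the Freedman-Diaconis rule.

    Hypothetical helper for illustration; it is not an astropy function.
    """
    values = np.asarray(values, dtype=float)
    q25, q75 = np.percentile(values, [25, 75])
    # Freedman-Diaconis: width = 2 * IQR / n^(1/3)
    width = 2.0 * (q75 - q25) / values.size ** (1.0 / 3.0)
    # Number of bins needed to cover the full data range at that width
    n_bins = int(np.ceil((values.max() - values.min()) / width))
    return width, n_bins


data = [9.99999914e+05, -8.31312483e-03, 6.52755852e-02, 1.43104653e-03,
        -2.26311017e-02, 2.82660007e-03, 1.80307521e-02, 9.26294279e-03,
        5.06606026e-02, 2.05418011e-03]

width, n_bins = estimate_fd_bins(data)
print(f"bin width: {width:.4g}, estimated bins: {n_bins:,}")
```

Under that assumption the width is set by the tightly clustered points (a few hundredths), while the single outlier stretches the range to about 1e6, so the estimated bin count comes out in the tens of millions even though only 10 points are histogrammed.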

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 21 (21 by maintainers)

Top GitHub Comments

mwcraig commented, Aug 22, 2018 (2 reactions)

Oops, thanks for letting me know. Accidentally tagged the wrong sub package initially too 🙄

bsipocz commented, Sep 14, 2018 (1 reaction)

@abhinuvpitale - The bins are computed by astropy and passed as those large arrays to mpl.
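Since the bin edges are computed before anything is drawn, one possible workaround is to build the edges yourself and cap their number before passing them on. The following is only a sketch, not the fix adopted in astropy: the helper capped_fd_edges and its max_bins default are hypothetical, and the bin width again assumes the standard Freedman-Diaconis rule.

```python
import numpy as np


def capped_fd_edges(values, max_bins=1000):
    """Freedman-Diaconis bin edges, limited to at most max_bins bins.

    Hypothetical helper for illustration; it is not an astropy function.
    """
    values = np.asarray(values, dtype=float)
    q25, q75 = np.percentile(values, [25, 75])
    width = 2.0 * (q75 - q25) / values.size ** (1.0 / 3.0)
    lo, hi = values.min(), values.max()
    if width > 0 and hi > lo:
        n_bins = int(np.ceil((hi - lo) / width))
    else:
        # Degenerate data (zero IQR or zero range): fall back to one bin
        n_bins = 1
    # Cap the bin count so the edge array stays small
    n_bins = max(1, min(n_bins, max_bins))
    return np.linspace(lo, hi, n_bins + 1)


data = [9.99999914e+05, -8.31312483e-03, 6.52755852e-02, 1.43104653e-03,
        -2.26311017e-02, 2.82660007e-03, 1.80307521e-02, 9.26294279e-03,
        5.06606026e-02, 2.05418011e-03]

edges = capped_fd_edges(data)
counts, _ = np.histogram(data, bins=edges)
print(len(edges) - 1, "bins;", int(counts.sum()), "points counted")
```

The resulting edges array can be passed as the bins argument to a histogram function that accepts explicit edges. With the cap in place the bins are much wider than the rule suggests, so the clustered points fall into a single bin; clipping or masking the outlier before binning may be a better choice when the fine structure matters.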
