histogram returns bad data when setting start and end parameters


When calling the histogram URL for a numeric column with start and end parameters, the endpoint returns incorrect data for the first and last buckets.

Without start and end:

{
	"bin_width": 46.6666666666667,
	"bins_count": 3,
	"bins_start": 9,
	"bins": [{
		"bin": 0,
		"min": 9,
		"max": 55,
		"avg": 35.405126849894295,
		"freq": 3784
	}, {
		"bin": 1,
		"min": 56,
		"max": 102,
		"avg": 75.28977482465855,
		"freq": 2709
	}, {
		"bin": 2,
		"min": 103,
		"max": 149,
		"avg": 123.25338491295938,
		"freq": 517
	}],
	"type": "histogram"
}

We can see that this histogram has a range from 9 to 149.

If we set start: 50 and end: 100, we get:

{
	"bin_width": 16.6666666666667,
	"bins_count": 3,
	"bins_start": 50,
	"bins": [{
		"bin": 0,
		"min": 9,
		"max": 66,
		"avg": 40.28936078936079,
		"freq": 4662
	}, {
		"bin": 1,
		"min": 67,
		"max": 83,
		"avg": 74.27996254681648,
		"freq": 1068
	}, {
		"bin": 2,
		"min": 84,
		"max": 149,
		"avg": 105.07421875,
		"freq": 1280
	}],
	"type": "histogram"
}

There are 3 buckets, starting at 50, each 16.67 wide, so the boundaries are:

|------|------|------|
50    66.7   83.3   100

This is correct. We asked for data between 50 and 100 so these are the proper buckets.
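
The boundaries can be reproduced with a few lines of arithmetic. This is only a sketch of equal-width bucketing; `bucket_edges` is a hypothetical helper written for this illustration and is not taken from the endpoint's code.

```python
def bucket_edges(start, end, bins):
    """Return the bins + 1 boundaries of equal-width buckets over [start, end]."""
    width = (end - start) / bins
    return [start + i * width for i in range(bins + 1)]


# Requested range from the report: start=50, end=100, 3 bins.
edges = bucket_edges(50, 100, 3)
print([round(e, 1) for e in edges])  # [50.0, 66.7, 83.3, 100.0]

# Without start/end the data itself spans 9..149, which matches the
# reported bin_width of 46.67.
print(round((149 - 9) / 3, 2))  # 46.67
```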

But if we study the bins data, we get:

  • Bin 0: min 9, max 66 (min should be above 50)
  • Bin 1: min 67, max 83 (this is ok)
  • Bin 2: min 84, max 149 (max should be below 100)
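
The same check can be scripted against the two responses above. The sketch below copies the min, max and freq values from the report and flags every bin whose values fall outside its bucket; `out_of_bounds` is a hypothetical helper. It also shows that both responses count the same number of rows, which suggests (as an inference from the numbers, not something the issue confirms) that out-of-range rows end up in the edge bins instead of being filtered out.

```python
# Bins copied from the two responses in the report: (min, max, freq).
no_range = [(9, 55, 3784), (56, 102, 2709), (103, 149, 517)]
with_range = [(9, 66, 4662), (67, 83, 1068), (84, 149, 1280)]


def out_of_bounds(bins, start, end):
    """List (bin index, field, value) for every min/max outside its bucket."""
    width = (end - start) / len(bins)
    problems = []
    for i, (lo, hi, _freq) in enumerate(bins):
        left, right = start + i * width, start + (i + 1) * width
        if lo < left:
            problems.append((i, "min", lo))
        if hi > right:
            problems.append((i, "max", hi))
    return problems


print(out_of_bounds(with_range, 50, 100))
# [(0, 'min', 9), (2, 'max', 149)]

# Both responses account for the same total number of rows.
print(sum(f for *_, f in no_range), sum(f for *_, f in with_range))
# 7010 7010
```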

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

ivanmalagon commented, Jul 5, 2018 (3 reactions)

We already send them to the endpoint only if both are defined, so we’ll state this in the CARTO.js documentation and add the proper validations to make sure that both are defined.

Algunenano commented, Jul 5, 2018 (0 reactions)

E.g., as per the code, the max and min are calculated when you pass start >= end.

That specific case is handled in the base view so they will be flipped before getting there, but it’s true that unexpected things will happen if you only pass one of them.
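
The two rules discussed in the comments (send start and end only together, and flip them when they arrive reversed) could look roughly like the sketch below. It is illustrative only: `normalize_range` is a hypothetical name, and this is not the actual CARTO.js validation or base view code.

```python
def normalize_range(start=None, end=None):
    """Hypothetical helper: return (start, end), or None when no range is given."""
    if start is None and end is None:
        return None  # no range requested, use the full data range
    if start is None or end is None:
        # Passing only one of them is the case that gives unexpected results.
        raise ValueError("start and end must be provided together")
    if start > end:
        start, end = end, start  # flip a reversed range
    return start, end


print(normalize_range(100, 50))  # (50, 100)
print(normalize_range())         # None
```

Rejecting a half-specified range early keeps the ambiguous case from ever reaching the endpoint.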
