Time series widget not intuitively giving correct binning

Context

While working with a client yesterday, we had a dataset of thousands of points, each with a year attribute: 2000, 2001, …, 2012. She wanted to create a simple Time Series widget to filter her map by year, but the result was the following (presumably the values are being read as milliseconds after the epoch?): [Screenshot, 2017-04-12 08:44:28]

Converting this column to a time type through the dataset interface turns these values into timestamps in 1970, so the values are being interpreted as milliseconds. Converting manually in SQL with to_date(time_column, 'yyyy') produces values like 2000-01-01T00:00:00Z. Creating a time series widget from this column gives: [Screenshot, 2017-04-12 09:23:17]
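A minimal Python sketch of the two interpretations described above, for illustration only. The helper names are hypothetical, and the millisecond reading is the assumption made in this issue, not a confirmed detail of the dataset interface:

```python
from datetime import datetime, timezone


def as_epoch_millis(value: int) -> datetime:
    """Read a bare number as milliseconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


def year_to_date(year: int) -> datetime:
    """Read a bare number as a calendar year, similar in spirit to to_date(col, 'yyyy')."""
    return datetime(year, 1, 1, tzinfo=timezone.utc)


naive = as_epoch_millis(2000)   # two seconds after the epoch
explicit = year_to_date(2000)   # midnight on 1 January 2000

print(naive.isoformat())     # 1970-01-01T00:00:02+00:00
print(explicit.isoformat())  # 2000-01-01T00:00:00+00:00
```

Every year from 2000 to 2012 lands within about two seconds of the epoch under the first reading, which matches the 1970 timestamps seen after the type conversion.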

This is an improvement, but any filtering quickly produces ranges that do not line up with whole years: [Screenshot, 2017-04-12 08:50:00]

The expectation here would be that the first bin would be 2000-01-01T00:00:00Z to 2001-01-01T00:00:00Z. I understand that I could jump through some more hoops in the data prep with SQL to manually create the ranges I want, but this seems like a lot of work for a very common use case.
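One plausible explanation for the misaligned edges, sketched in Python. This assumes the widget slices the range between the minimum and maximum timestamp into equal-width bins; the issue does not confirm how the widget computes its bins, and `equal_width_edges` is a hypothetical helper:

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def equal_width_edges(start: datetime, end: datetime, bins: int) -> list[datetime]:
    """Split [start, end] into `bins` intervals of identical duration."""
    step = (end - start) / bins
    return [start + i * step for i in range(bins + 1)]


# The data holds 13 distinct years, but min to max spans only 12 years.
start = datetime(2000, 1, 1, tzinfo=timezone.utc)
end = datetime(2012, 1, 1, tzinfo=timezone.utc)
edges = equal_width_edges(start, end, 13)

for lo, hi in zip(edges, edges[1:]):
    print(lo.date(), "to", hi.date())
```

Under that assumption each bin is shorter than a year, so the first bin already ends before 2001-01-01 and every later edge drifts further from a year boundary.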

We next tried creating a histogram widget, but the x-axis displayed the values as follows: [Screenshot, 2017-04-12 08:54:09]

Here, the lack of precision in displaying the numbers is a problem.
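To illustrate how this kind of label can arise, here is a sketch of a hypothetical axis formatter that scales by 1,000 and keeps a fixed number of decimals. It is not the widget's actual formatter, which the issue does not describe:

```python
def si_label(value: float, decimals: int = 0) -> str:
    """Hypothetical tick formatter: scale thousands to a 'k' suffix, keep `decimals` places."""
    if abs(value) >= 1000:
        return f"{value / 1000:.{decimals}f}k"
    return f"{value:.{decimals}f}"


# With no decimals, every year in the dataset collapses to the same label.
labels = [si_label(year) for year in range(2000, 2013)]
print(set(labels))                # {'2k'}

# Keeping three decimals makes the years distinguishable again.
print(si_label(2012, decimals=3))  # 2.012k
```

For year-like values, a formatter that skips the suffix entirely and prints the plain integer would probably be the more readable choice.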

Finally, we tried a category widget: [Screenshot, 2017-04-12 08:55:30]

This is the closest, but the user needed the full range of data to be displayed on the widget like in the time-series widget. She also needed it to be ordered from 2000 upward like on the time series widget.
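The difference between the two orderings can be sketched as follows. The sample values are made up for illustration, and ordering categories by count is an assumption about the category widget's default behavior:

```python
from collections import Counter

# Hypothetical sample of year values, not taken from the mock dataset.
years = [2003, 2000, 2003, 2012, 2001, 2003, 2012]
counts = Counter(years)

# Assumed category-widget behavior: most frequent value first.
by_count = [year for year, _ in counts.most_common()]

# What the user wanted: every year in the range, ascending, including empty years.
by_value = [(year, counts[year]) for year in range(min(years), max(years) + 1)]

print(by_count)      # [2003, 2012, 2000, 2001]
print(by_value[:4])  # [(2000, 1), (2001, 1), (2002, 0), (2003, 3)]
```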

Steps to Reproduce

  1. Import my mock dataset and create a map: https://team.carto.com/u/eschbacher/tables/mock_time_data/public
  2. Create a time series widget using the year column to see the unix timestamp conversion
  3. Create a time series widget using the year_date column and choose 13 as the number of bins
  4. Filter by time ranges to see that the bin edges do not line up based on expectations
  5. Create a histogram widget to see the x-axis precision (2k for values 2000 through 2012)

Current Result

  1. Time series does not give intuitive ranges for input values
  2. Histogram x-axis rounds too much to give user a sense of the range in values
  3. Category widget does not extend to show all values, and does not let you order them correctly

Expected result

A time series widget created from a column that holds only a numeric year should automatically use year-long bins that do not bleed into adjacent years.
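A minimal sketch of the expected binning, with edges pinned to 1 January of each year. `year_edges` and `bin_index` are hypothetical helpers written for this illustration, not part of any CARTO API:

```python
from bisect import bisect_right
from datetime import datetime, timezone


def year_edges(first_year: int, last_year: int) -> list:
    """Return bin edges at 1 January (UTC) covering first_year through last_year."""
    return [
        datetime(year, 1, 1, tzinfo=timezone.utc)
        for year in range(first_year, last_year + 2)
    ]


edges = year_edges(2000, 2012)  # 14 edges, so 13 year-long bins


def bin_index(timestamp: datetime) -> int:
    """Index of the half-open bin [edge_i, edge_i+1) that contains the timestamp."""
    return bisect_right(edges, timestamp) - 1


print(bin_index(datetime(2000, 1, 1, tzinfo=timezone.utc)))               # 0
print(bin_index(datetime(2000, 12, 31, 23, 59, 59, tzinfo=timezone.utc)))  # 0
print(bin_index(datetime(2001, 1, 1, tzinfo=timezone.utc)))               # 1
print(bin_index(datetime(2012, 1, 1, tzinfo=timezone.utc)))               # 12
```

Because the edges follow the calendar instead of a fixed duration, leap years do not shift any boundary, and the first bin runs from 2000-01-01T00:00:00Z to 2001-01-01T00:00:00Z as described above.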

Browser and version

Chrome Version 56.0.2924.87 (64-bit), macOS 10.12.4

.carto file

Just this dataset: https://team.carto.com/u/eschbacher/tables/mock_time_data/public

Additional info

Work with a client

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 12 (12 by maintainers)

Top GitHub Comments

xavijam commented, Aug 8, 2017 (1 reaction)

Good news! Aggregations have been released. Look at the results with your dataset, @andy-esch ❤️ 💋.

[Six screenshots, 2017-08-08 11:19:45 to 11:21:51]

xavijam commented, Aug 8, 2017 (0 reactions)

What @noguerol commented (about timezones) will be implemented, and is close to being released, in https://github.com/CartoDB/cartodb/issues/12088. Closing this one for the moment 💃.
