Optimization: valuesAtQuantiles qdigest functions

I noticed that the valuesAtQuantiles functions on a qdigest compute the percentile values by looping over the provided percentiles array argument and calling the underlying airlift qdigest method getQuantile (singular) once per element.

However, the airlift qdigest object also has a plural version of this method, getQuantiles, which seems to be more efficient because it traverses the qdigest tree only once for a given array of quantiles.
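
To make the difference concrete, here is a minimal sketch in Python. It is not the airlift implementation: the digest is replaced by a hypothetical stand-in, a non-empty list of (value, count) buckets sorted by value, and the names get_quantile and get_quantiles only mirror the method names mentioned above. The singular lookup rescans the buckets on every call, while the plural lookup answers every requested quantile in one pass, which is why it needs its input in ascending order.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a digest: (value, count) buckets sorted by value.
Buckets = Sequence[Tuple[int, int]]


def get_quantile(buckets: Buckets, q: float) -> int:
    """Singular lookup: scan from the first bucket on every call."""
    total = sum(count for _, count in buckets)
    seen = 0
    for value, count in buckets:
        seen += count
        if seen >= q * total:
            return value
    return buckets[-1][0]


def get_quantiles(buckets: Buckets, quantiles: Sequence[float]) -> List[int]:
    """Plural lookup: one pass over the buckets for all quantiles."""
    if any(a > b for a, b in zip(quantiles, quantiles[1:])):
        raise ValueError("quantiles must be sorted in ascending order")
    total = sum(count for _, count in buckets)
    results: List[int] = []
    pending = iter(quantiles)
    q = next(pending, None)
    seen = 0
    for value, count in buckets:
        seen += count
        # Emit every pending quantile that this bucket already covers.
        while q is not None and seen >= q * total:
            results.append(value)
            q = next(pending, None)
    # Any quantile still pending (e.g. due to rounding) maps to the last bucket.
    while q is not None:
        results.append(buckets[-1][0])
        q = next(pending, None)
    return results
```

For n buckets and k quantiles, the loop over get_quantile does on the order of n * k bucket visits, while get_quantiles does on the order of n + k. Whether that matters in practice depends on how the lookup cost compares with the cost of building the digest.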

I haven’t had a chance to quantify this optimisation yet, but wanted to gauge whether there’d be any opposition to a pull request on this topic?

Thanks

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 2
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

2 reactions
blrnw3 commented, Feb 21, 2019

I did some perf testing on this and, interestingly, found no measurable difference between the two approaches. I tested on TPC-H data at several scale factors, evaluating 20 percentiles with each of the two approaches outlined in the issue summary. It seems that building the qdigest is so dominant that extracting the percentiles has trivial cost, regardless of how it is done.

2 reactions
tdcmeehan commented, Jan 24, 2019

I agree with @mehrdad-honarkhah, and we’ll be happy to review a pull request for this optimization.

Please be aware that the plural version of the function validates that the quantiles are sorted in ascending order. To preserve compatibility with approx_percentile and existing users of values_at_quantiles, we don’t want to introduce that validation to the values_at_quantiles function. I think it would also be worthwhile to port this over to approx_percentile, as it currently does the same thing.
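
One way to use a batch lookup that insists on ascending input without exposing that requirement to callers is to sort the quantiles, call the batch lookup once, and then put the results back in the caller's order. The Python sketch below illustrates that idea only; it is not the Presto code, and strict_batch_lookup is a hypothetical stand-in for a plural method that rejects unsorted input.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in data: a sorted sample the lookup reads from.
DATA = list(range(100))


def strict_batch_lookup(quantiles: Sequence[float]) -> List[int]:
    """Stand-in for a plural lookup that requires ascending quantiles."""
    if any(a > b for a, b in zip(quantiles, quantiles[1:])):
        raise ValueError("quantiles must be sorted in ascending order")
    return [DATA[min(int(q * len(DATA)), len(DATA) - 1)] for q in quantiles]


def values_at_quantiles(
    batch_lookup: Callable[[Sequence[float]], List[int]],
    quantiles: Sequence[float],
) -> List[int]:
    """Accept quantiles in any order and return values in that same order."""
    # Indices of the quantiles, arranged so the quantiles are ascending.
    order = sorted(range(len(quantiles)), key=lambda i: quantiles[i])
    sorted_values = batch_lookup([quantiles[i] for i in order])
    # Scatter each value back to the position its quantile came from.
    results = [0] * len(quantiles)
    for position, value in zip(order, sorted_values):
        results[position] = value
    return results
```

A call such as values_at_quantiles(strict_batch_lookup, [0.9, 0.1, 0.5]) succeeds even though the input is unsorted, and the first returned value corresponds to 0.9. The sort adds an O(k log k) step for k quantiles, which is small next to a tree traversal when k is modest.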
