
Start bisecting after getting min/max from first database

Some databases are awfully slow at computing min(id) and max(id) for a column once a WHERE clause is added to the mix. For example, this query takes 50s in Snowflake on a dataset with millions of rows:

Running SQL (Snowflake): SELECT min(id), max(id) FROM TRANSFERS WHERE ('2022-06-04T14:27:44.096619' <= created_at) AND (created_at < '2022-06-14T13:57:44.096658')

In Postgres, the same query takes a few milliseconds.

The two databases will rarely return different min/max values, so we can take the result from whichever one answers first and start bisecting right away.

When the second, slower database returns its min and max, we compare them with the faster database's values. If they differ, we warn, and could restart the bisection. Alternatively, if the slower result extends the range, we can just start bisecting the newly exposed ranges at the extremes, which handles the mismatch gracefully.
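The "bisect at the extremes" option can be sketched as follows. This is only an illustration, not the tool's actual implementation: the function name `extra_ranges` is hypothetical, and it assumes integer ids with inclusive bounds.

```python
import logging
from typing import List, Tuple

# (min_id, max_id), both inclusive. Integer ids are an assumption of this sketch.
Bounds = Tuple[int, int]

logger = logging.getLogger(__name__)


def extra_ranges(first: Bounds, second: Bounds) -> List[Bounds]:
    """Return the inclusive id ranges covered by `second` but not by `first`.

    `first` is the min/max from the database that answered first (bisection
    has already started on it); `second` is the late result.
    """
    first_min, first_max = first
    second_min, second_max = second

    if first != second:
        logger.warning("min/max mismatch: first=%s, second=%s", first, second)

    ranges: List[Bounds] = []
    if second_min < first_min:
        # The late result extends the range downwards.
        ranges.append((second_min, first_min - 1))
    if second_max > first_max:
        # The late result extends the range upwards.
        ranges.append((first_max + 1, second_max))
    return ranges
```

If the late result is identical to, or narrower than, the first one, the list is empty and the running bisection already covers everything; otherwise each returned range can be bisected in addition to the original one.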

E.g. it should look like this:

Thread1, Time 00:00:00: Postgres: select min(id), max(id) from table
Thread2, Time 00:00:00: Snowflake: select min(id), max(id) from table
Thread1, Time 00:00:01: Postgres returns min=1, max=1000
... start bisecting ...
Thread2, Time 00:00:10: Snowflake returns min=1, max=1000
... continues bisecting because min + max are the same...
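A minimal sketch of that timeline using Python's `concurrent.futures`. The helper names `query_bounds` and `run` are hypothetical, the database calls are simulated with `time.sleep`, and the point where bisection would actually run is marked only by a comment:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, Tuple

Bounds = Tuple[int, int]


def query_bounds(name: str, delay: float, bounds: Bounds) -> Tuple[str, Bounds]:
    """Stand-in for `SELECT min(id), max(id) ...`; sleeps to simulate latency."""
    time.sleep(delay)
    return name, bounds


def run(jobs: List[Tuple[str, float, Bounds]]) -> List[str]:
    """Issue all min/max queries at once and react to results as they arrive."""
    events: List[str] = []
    first_bounds = None
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(query_bounds, *job) for job in jobs]
        for future in as_completed(futures):
            name, bounds = future.result()
            if first_bounds is None:
                # First answer: a real tool would kick off bisection here,
                # without waiting for the slower database.
                first_bounds = bounds
                events.append(f"{name} returned {bounds}: start bisecting")
            elif bounds == first_bounds:
                events.append(f"{name} returned {bounds}: continue bisecting")
            else:
                events.append(f"{name} returned {bounds}: mismatch, warn")
    return events


if __name__ == "__main__":
    for line in run([("postgres", 0.01, (1, 1000)), ("snowflake", 0.3, (1, 1000))]):
        print(line)
```

Because `as_completed` yields futures in completion order, the faster database's bounds are always handled first, whichever one that turns out to be.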

This should improve performance substantially on platforms where the filtered min/max query is slow, as in the Snowflake example above.

Issue Analytics

State:
Created a year ago
Comments:5

Top GitHub Comments

1 reaction

sirupsen commented, Jul 27, 2022

Looks awesome @erezsh

0 reactions

erezsh commented, Jul 27, 2022

@sirupsen Actually turned out to be pretty elegant!
