Huge time for aggregation in postgresql
Hello,
I'm pretty new to PrestoDB and could not find an answer in the docs or in other issues.
When I run a fairly basic query through Presto, such as
select field1,max(field2) from table group by 1
on a table with several million rows,
the query takes a few seconds when run directly on PostgreSQL but several minutes through PrestoDB. Checking the Presto live plan, I see that Presto fetches all the data from the table and only then performs the max aggregation itself. Can't we force Presto to request the aggregated data directly from PostgreSQL?
Thanks! PS: I'm using PrestoDB 0.222 and didn't find any related bug fix in more recent versions.
Issue Analytics
- State:
- Created 4 years ago
- Reactions:3
- Comments:5 (4 by maintainers)
Top GitHub Comments
@sachdevs will work on plan pushdown for JDBC connectors
@thomasLeclaire Your observations are correct. At the moment, Presto is not able to push down complex operations such as aggregations or joins into the data source. Hence, it reads all the data and then aggregates on its own. @highker added infrastructure to enable pushdown of any part of the plan, but we are still missing support from the connectors. Specifically, the Postgres connector needs to be modified to add support for pushing down operations. Let us know if you are interested in working on that.
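
To make the difference concrete, here is a minimal Python sketch of the two strategies described above. It is only an illustration, not Presto's actual code: an in-memory SQLite database stands in for PostgreSQL, and the table name `t`, the sample rows, and both function names are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; table "t" and its rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field1 TEXT, field2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 5), ("b", 3), ("b", 2), ("c", 7)],
)


def aggregate_without_pushdown(conn):
    """Fetch every row from the source, then compute max(field2) per field1 locally."""
    rows = conn.execute("SELECT field1, field2 FROM t").fetchall()
    result = {}
    for key, value in rows:
        if key not in result or value > result[key]:
            result[key] = value
    # Return the aggregate plus the number of rows that crossed the connection.
    return result, len(rows)


def aggregate_with_pushdown(conn):
    """Let the source run the GROUP BY, so only one row per group is transferred."""
    rows = conn.execute(
        "SELECT field1, max(field2) FROM t GROUP BY field1"
    ).fetchall()
    return dict(rows), len(rows)


no_pushdown_result, no_pushdown_rows = aggregate_without_pushdown(conn)
pushdown_result, pushdown_rows = aggregate_with_pushdown(conn)
print(no_pushdown_result == pushdown_result, no_pushdown_rows, pushdown_rows)
```

Both functions return the same aggregate, but the first one transfers every row of the table while the second transfers one row per group. On a table with several million rows, that transfer is likely where most of the extra time goes.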