Huge time for aggregation in postgresql

Hello,

I’m pretty new to PrestoDB, but I was not able to find an answer in the docs or in other issues.

When I run a fairly basic query through Presto, such as select field1, max(field2) from table group by 1, on a table with several million rows, it is much slower than running it on PostgreSQL directly.

Some queries take a few seconds directly on PostgreSQL but several minutes through Presto. Checking the Presto live plan, I see that Presto fetches all the rows from the table and only then performs the max aggregation itself. Can’t we force Presto to request the aggregated data directly from PostgreSQL?

Thanks! PS: I’m using PrestoDB 0.222 and didn’t find any related bug fix in more recent versions.

Issue Analytics

Created 4 years ago
Reactions:3
Comments:5 (4 by maintainers)

Top GitHub Comments

2 reactions

highker commented, Sep 24, 2019

@sachdevs will work on plan pushdown for JDBC connectors

1 reaction

mbasmanova commented, Sep 24, 2019

@thomasLeclaire Your observations are correct. At the moment, Presto is not able to push down complex operations such as aggregations or joins into the data source. Hence, it reads all the data and then aggregates on its own. @highker added infrastructure to enable pushdown of any part of the plan, but we are still missing support from the connectors. Specifically, the Postgres connector needs to be modified to add support for pushing down operations. Let us know if you are interested in working on that.
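
To make the difference concrete, here is a minimal sketch of the two strategies described above. It is not Presto code: an in-memory SQLite database stands in for PostgreSQL, and the table name t, the sample rows, and the helper function names are all made up for illustration. The first function mimics what happens without pushdown (every row is transferred and the max is computed by the caller), while the second sends the aggregation to the database so that only one row per group comes back.

```python
import sqlite3


def build_demo_db():
    # Assumption: in-memory SQLite is only a stand-in for PostgreSQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (field1 TEXT, field2 INTEGER)")
    rows = [("a", 1), ("a", 5), ("b", 3), ("b", 2), ("c", 9)]
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


def aggregate_without_pushdown(conn):
    # No pushdown: fetch every row, then aggregate on the caller's side.
    transferred = 0
    result = {}
    for field1, field2 in conn.execute("SELECT field1, field2 FROM t"):
        transferred += 1
        if field1 not in result or field2 > result[field1]:
            result[field1] = field2
    return result, transferred


def aggregate_with_pushdown(conn):
    # Pushdown: the database runs the GROUP BY and returns one row per group.
    rows = conn.execute(
        "SELECT field1, max(field2) FROM t GROUP BY field1"
    ).fetchall()
    return dict(rows), len(rows)


conn = build_demo_db()
no_pushdown_result, rows_without = aggregate_without_pushdown(conn)
pushdown_result, rows_with = aggregate_with_pushdown(conn)

# Both strategies give the same answer; only the number of rows moved differs.
print(no_pushdown_result == pushdown_result, rows_without, rows_with)
```

With the five sample rows the gap is trivial, but on a table with several million rows the first approach moves every row over the connection, which is consistent with the seconds-versus-minutes difference reported in the question.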