Distinct is not pushed down to connectors

Hi,

I’m trying to upgrade to a recent Presto version, and the query below is no longer getting the correct plan pushed down to the connector:

SELECT distinct A FROM myTable LIMIT 10

In the PlanOptimizer, the maxSubPlan we get is now only the TableScanNode.

And below is the query plan generated:

presto:default> explain select distinct teamid from baseballstats limit 10;

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 - Output[teamid] => [teamid:varchar]
     - Project[] => [teamid:varchar]
         - DistinctLimit[10][$hashvalue] => [teamid:varchar, $hashvalue:bigint]
             - LocalExchange[SINGLE] () => [teamid:varchar, $hashvalue:bigint]
                 - RemoteStreamingExchange[GATHER] => [teamid:varchar, $hashvalue_3:bigint]
                     - DistinctLimitPartial[10][$hashvalue_4] => [teamid:varchar, $hashvalue_4:bigint]
                         - ScanProject[table = TableHandle {connectorId='pinot', connectorHandle='PinotTableHandle{connectorId=pinot, schemaName=default, tableName=baseballStats, isQueryShort=Optional[false], expectedColumnHandles=Optio
                                 Estimates: {rows: ? (?), cpu: ?, memory: 0.00, network: 0.00}/{rows: ? (?), cpu: ?, memory: 0.00, network: 0.00}
                                 $hashvalue_4 := combine_hash(BIGINT 0, COALESCE($operator$hash_code(teamid), BIGINT 0))
                                 teamid := PinotColumnHandle{columnName=teamID, dataType=varchar, type=REGULAR}
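
For anyone following along, here is a toy sketch of what "the maxSubPlan is the TableScanNode" means. It is not Presto or Pinot connector code: `PlanNode`, `chain`, and `max_pushable_subplan` are hypothetical names, the plan is simplified to a single chain of nodes, and the `ScanProject` from the plan above is modeled (as an assumption) as a `Project` sitting on top of a `TableScan`. The idea is to walk up from the scan and stop at the first node kind the connector does not handle.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class PlanNode:
    """Hypothetical, simplified plan node with at most one source (child)."""
    kind: str
    source: Optional["PlanNode"] = None


def chain(*kinds: str) -> Optional[PlanNode]:
    """Build a single-child plan from root kind to leaf kind."""
    node = None
    for kind in reversed(kinds):
        node = PlanNode(kind, node)
    return node


def max_pushable_subplan(root: Optional[PlanNode], supported: Set[str]) -> Optional[PlanNode]:
    """Return the topmost node whose whole subtree consists of supported kinds.

    Returns None if even the leaf is not supported.
    """
    # Collect the path from root down to the leaf.
    path = []
    node = root
    while node is not None:
        path.append(node)
        node = node.source

    # Walk from the leaf upward and stop at the first unsupported node.
    best = None
    for node in reversed(path):
        if node.kind not in supported:
            break
        best = node
    return best


# Simplified version of the plan shown above (root first, leaf last).
plan = chain(
    "Output",
    "Project",
    "DistinctLimit",
    "LocalExchange",
    "RemoteStreamingExchange",
    "DistinctLimitPartial",
    "Project",      # assumption: stands in for the hash projection in ScanProject
    "TableScan",
)

# If the node directly above the scan is not handled, only the scan is pushable.
only_scan = max_pushable_subplan(plan, {"TableScan", "DistinctLimitPartial"})

# If that node is handled too, the pushable subplan reaches the partial distinct-limit.
with_project = max_pushable_subplan(
    plan, {"TableScan", "Project", "DistinctLimitPartial"}
)
```

In this toy model, a single unsupported node between the scan and the distinct-limit is enough to cut the pushable subplan down to the bare table scan, which matches the symptom described in this issue. Whether that is the actual cause here is not something the sketch can show.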

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

1 reaction
xiangfu0 commented, Jul 14, 2020

@fx19880617 I see that James already reviewed and merged #14831. Will there be a follow-up PR? If not, let’s close this issue.

No more changes for now. I also think the SortNode could be pushed down, but that is not relevant to this issue. I’ll make the connector changes! Many thanks to you and @kaikalur for the advice, and thanks to @highker for reviewing it!

0 reactions
mbasmanova commented, Jul 14, 2020

@fx19880617 I see that James already reviewed and merged #14831. Will there be a follow-up PR? If not, let’s close this issue.

Top Results From Across the Web

SELECT DISTINCT Cassandra in Spark - Stack Overflow
I need a query that lists out the unique Composite Partition Keys inside of spark. The query in CASSANDRA: SELECT DISTINCT key1,...

Use distinct on SQL connector query
Solved: For my get rows from SQL connector, I would like to use a: SELECT DISTINCT([Customer P O No]) ,[Order no] ,[Sold To]...

DISTINCT | InterSystems SQL Reference
The DISTINCT clause is applied to the result set of the SELECT statement. It limits the rows returned to one arbitrary row for...

Debugging why query is not pushed down - Dremio Community
For some reason my dremio query for bigquery (custom ARP connector) is not pushed down. Even simple query with sum() is not pushed...

MySQL 8.0 Reference Manual :: 8.8.2 EXPLAIN Output Format
The EXPLAIN statement provides information about how MySQL executes statements. EXPLAIN works with SELECT , DELETE , INSERT , REPLACE , and UPDATE ......
