Explain plan can be misleading
I have a query as follows:
select count(*) from githubEvents where dateTrunc('YEAR', event_time) = '2016-01-01 00:00:00.0'
It produces a nonzero result:
count(*) |
---|
323650488 |
However, when I try an explain plan:
explain plan for select count(*) from githubEvents where dateTrunc('YEAR', event_time) = '2016-01-01 00:00:00.0'
The query plan is built from a segment picked at random, one which should have been pruned, so it doesn't reflect the way the query is actually evaluated:
Operator | Operator_Id | Parent_Id |
---|---|---|
BROKER_REDUCE(limit:10) | 0 | -1 |
COMBINE_AGGREGATE | 1 | 0 |
FAST_FILTERED_COUNT | 2 | 1 |
FILTER_EMPTY | 3 | 2 |
It would be helpful if the plan chose a segment which has data, or queried all segments and merged operators when the operator varies according to segment.
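To make the second suggestion concrete, here is a minimal sketch of merging per-segment plans: collect the operator chain for every segment, group identical chains, and report each distinct chain with the number of segments that use it. This is not Pinot's implementation. The segment names, the `merge_segment_plans` helper, and the `FILTER_PLACEHOLDER` operator are all hypothetical; only `FAST_FILTERED_COUNT` and `FILTER_EMPTY` come from the plan above.

```python
from collections import Counter

# Hypothetical per-segment operator chains (parent first). These are
# illustrative values, not real Pinot output; FILTER_PLACEHOLDER stands in
# for whatever filter operator a segment with matching rows would use.
segment_plans = {
    "segment_2015": ("FAST_FILTERED_COUNT", "FILTER_EMPTY"),
    "segment_2016_a": ("FAST_FILTERED_COUNT", "FILTER_PLACEHOLDER"),
    "segment_2016_b": ("FAST_FILTERED_COUNT", "FILTER_PLACEHOLDER"),
}


def merge_segment_plans(plans):
    """Group identical operator chains and count the segments using each.

    Returns a list of (chain, segment_count) pairs, most common chain first.
    """
    return Counter(plans.values()).most_common()


merged = merge_segment_plans(segment_plans)
for chain, count in merged:
    print(f"{count} segment(s): {' -> '.join(chain)}")
```

With output like this, a plan that only applies to pruned or empty segments would show up next to the plan that does the real work, instead of replacing it.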
Issue Analytics
- State:
- Created a year ago
- Comments: 8 (8 by maintainers)
Top GitHub Comments
I think the deepest-child heuristic might be the right trade-off between accuracy and whatever it is that prevents considering the entire query execution.
Fixed by https://github.com/apache/pinot/pull/8738
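One possible reading of the deepest-child heuristic mentioned above is: given one plan tree per segment, show the tree with the greatest depth, on the assumption that a pruned or empty segment yields a shallower plan. The sketch below illustrates that reading only. It is not taken from the linked pull request; the `PlanNode` type and the `FILTER_PLACEHOLDER` and `LEAF_PLACEHOLDER` operators are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """A node in a hypothetical per-segment plan tree."""
    operator: str
    children: List["PlanNode"] = field(default_factory=list)


def depth(node: PlanNode) -> int:
    """Number of operators on the longest root-to-leaf path."""
    return 1 + max((depth(child) for child in node.children), default=0)


def deepest_plan(plans: List[PlanNode]) -> PlanNode:
    """Return the plan tree with the greatest depth (first one on ties)."""
    return max(plans, key=depth)


# Plan for a segment whose filter matches nothing (operators from the issue).
empty_segment = PlanNode("FAST_FILTERED_COUNT", [PlanNode("FILTER_EMPTY")])

# Plan for a segment with matching rows; operator names are placeholders.
data_segment = PlanNode(
    "FAST_FILTERED_COUNT",
    [PlanNode("FILTER_PLACEHOLDER", [PlanNode("LEAF_PLACEHOLDER")])],
)

chosen = deepest_plan([empty_segment, data_segment])
print(chosen.operator, depth(chosen))
```

The heuristic is cheap because it compares only tree depths, but it can still pick an unrepresentative plan when segments with and without data produce trees of equal depth.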