Broaden the join types that can be used with UNNEST.

This is the first time I’ve raised an issue in this repo, so please let me know if it should be raised elsewhere, stated differently, or needs any other prerequisites.

TL;DR: When using UNNEST, joins other than CROSS JOIN should be available to limit the rows being returned.

When expanding an array (using UNNEST and CROSS JOIN), it is often necessary to limit the result set for each row with a subsequent WHERE clause. This can become a performance issue when the arrays have many elements and many rows are being unnested: every element is expanded into a row and returned, only for most of them to be discarded.

Currently this has to be done as follows:

-- we only want items below 50
select  name, item
from    (
        values
          ('Sarah', sequence(1, 10000000, 1)),
          ('Jose', Array[1, 10000000, 1])
        ) as results (name, items)
cross join unnest(items) as t (item)
where   item < 50;

I want to be able to write:

-- we only want items below 50
select  name, item
from    (
        values
          ('Sarah', sequence(1, 10000000, 1)),
          ('Jose', Array[1, 10000000, 1])
        ) as results (name, items)
left join unnest(items) as t (item) on item < 50;

or the same with an inner join.

Executing the above statement produces the following error: UNNEST on other than the right side of CROSS JOIN is not supported.

Is there a good reason for this restriction? I can’t find any discussion of this in the issues or on other forums and would suggest that the expected behaviour of using different joins is clear and should be supported.
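
One reason the join type matters is that the variants are not all equivalent. The following is a minimal Python model of the semantics, not Presto code. It assumes the requested LEFT JOIN UNNEST would behave like a standard left outer join: a row whose array has no element matching the ON condition is kept with a NULL item, whereas CROSS JOIN plus WHERE (and, equivalently, an inner join) drops that row. The function names, the shortened arrays and the extra 'Ana' row are illustrative assumptions.

```python
def cross_join_unnest_where(rows, pred):
    # Models CROSS JOIN UNNEST ... WHERE pred (same result as an inner join ON pred):
    # expand every element, keep only the ones that satisfy the predicate.
    return [(name, item) for name, items in rows for item in items if pred(item)]


def left_join_unnest_on(rows, pred):
    # Models the requested LEFT JOIN UNNEST ... ON pred (assumed semantics):
    # every left-hand row survives; rows without a match get None (NULL).
    out = []
    for name, items in rows:
        matches = [item for item in items if pred(item)]
        if matches:
            out.extend((name, item) for item in matches)
        else:
            out.append((name, None))
    return out


# Shortened stand-ins for the arrays in the issue; 'Ana' is a hypothetical
# row added to show a case where no element is below 50.
rows = [
    ("Sarah", list(range(1, 101))),
    ("Jose", [1, 10000000, 1]),
    ("Ana", [60, 70]),
]


def below_50(item):
    return item < 50


cross_rows = cross_join_unnest_where(rows, below_50)
left_rows = left_join_unnest_on(rows, below_50)
```

In this model `cross_rows` has 51 rows and nothing for Ana, while `left_rows` has 52 rows and includes `('Ana', None)`.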

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 16
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

CONSULTANTSF commented, May 27, 2020 (6 reactions)

Any chance that LEFT JOIN UNNEST will be added to Presto?

rongrong commented, Oct 15, 2019 (2 reactions)

@JonNorman The ON part of a JOIN defines the join condition. I don’t think the SQL spec necessarily dictates the implementation. “Instead of expanding every row and then filtering it out, we would only be joining to specific values in the array and then returning those.” is an implementation optimization, and we can potentially do it without changing semantics. Theoretically, the query engine can decide to push item < 50 down into UNNEST in

select  name, item
from    (
        values
          ('Sarah', sequence(1, 10000000, 1)),
          ('Jose', Array[1, 10000000, 1])
        ) as results (name, items)
cross join unnest(items) as t (item)
where   item < 50;

as long as the query engine is smart enough. Presto currently has a quite naive (and inefficient) implementation of UNNEST. Whether we should optimize its performance is orthogonal to whether we should support more join types.
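
The pushdown idea can be illustrated with a toy Python model. This is a sketch under stated assumptions, not how Presto's UNNEST operator is implemented: both functions (hypothetical names) return the same rows, but the second applies the predicate while expanding, so it never materializes the joined rows that would later be discarded. The predicate is still evaluated once per element in both versions; the saving in this model is only in the number of intermediate rows produced. The Sarah array is shortened to 1,000 elements to keep the example small.

```python
def expand_then_filter(rows, pred):
    # Naive plan: UNNEST emits a joined row for every element, WHERE filters afterwards.
    produced = 0
    result = []
    for name, items in rows:
        for item in items:
            joined = (name, item)  # intermediate row materialized by the unnest step
            produced += 1
            if pred(joined[1]):
                result.append(joined)
    return result, produced


def filter_while_expanding(rows, pred):
    # Pushed-down plan: test each element first, build a joined row only on a match.
    produced = 0
    result = []
    for name, items in rows:
        for item in items:
            if pred(item):
                result.append((name, item))
                produced += 1
    return result, produced


rows = [
    ("Sarah", list(range(1, 1001))),  # shortened stand-in for sequence(1, 10000000, 1)
    ("Jose", [1, 10000000, 1]),
]


def below_50(item):
    return item < 50


naive_rows, naive_count = expand_then_filter(rows, below_50)
pushed_rows, pushed_count = filter_while_expanding(rows, below_50)
```

In this model the naive plan materializes 1003 intermediate rows and the pushed-down plan 51, while the two result sets are identical, which is the sense in which the optimization leaves the query's semantics unchanged.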
