Add support for select distinct on(a, b, c) ...

The issue in Presto is that, on the one hand, one can't use

select distinct on (a, b)
  c
from d

but one also cannot use:

select
  c
from d
group by a, b

Combined, these two limitations make deduplicating rows relatively cumbersome: one has to resort either to subqueries with window functions that retrieve the row number, or to array aggregations. Both workarounds carry a lot of context, add complexity that grows quickly as more columns get involved, and are much more error-prone than either of the cleaner solutions above.
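As a rough illustration of the array-aggregation workaround mentioned above, here is a minimal sketch in Presto SQL. The sample rows and the inline definition of d are hypothetical and only there to make the query self-contained; the column names a, b, c and e follow the examples in this issue.

```sql
-- Hypothetical sample data standing in for table d.
WITH d AS (
  SELECT *
  FROM (
    VALUES
      (1, 'x', 'first',  10),
      (1, 'x', 'second', 20),
      (2, 'y', 'third',   5)
  ) AS t (a, b, c, e)
)
SELECT
  a,
  b,
  -- Collect c per (a, b) group, highest e first, then keep the first element.
  element_at(array_agg(c ORDER BY e DESC), 1) AS c
FROM d
GROUP BY a, b
```

With this sample data the query should keep 'second' for (1, 'x') and 'third' for (2, 'y'). Every additional column that has to survive deduplication needs its own array_agg expression with the same ordering, which is the kind of repetition the issue complains about.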

The Postgres implementation of select distinct on is straightforward and even allows for custom sorting, e.g.:

select distinct on (a, b)
  c
from d
-- Postgres requires the distinct on expressions to lead the order by
order by
  a,
  b,
  e desc,
  f asc

Issue Analytics

State:
Created 4 years ago
Reactions: 9
Comments: 9 (2 by maintainers)

Top GitHub Comments

9 reactions

Israel-Kli commented, Apr 22, 2020

@JivanRoquet The solution from here works for me on Athena:

SELECT Name, MAX(Address), MAX(other field)...
FROM MyTable
GROUP BY Name

This will give you one row per Name.

https://stackoverflow.com/a/6792357/9225626
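One caveat with this approach: each MAX is computed independently, so the resulting row can combine values that come from different source rows. The sketch below tries to show that with made-up data. The Email column and the sample rows are hypothetical and not part of the original comment.

```sql
-- Hypothetical sample data: two rows for the same Name.
WITH MyTable AS (
  SELECT *
  FROM (
    VALUES
      ('Ann', '1 Zebra Road',   'z@example.com'),
      ('Ann', '9 Apple Street', 'a@example.com')
  ) AS t (Name, Address, Email)
)
SELECT
  Name,
  MAX(Address) AS Address,  -- largest Address in the group (second row here)
  MAX(Email)   AS Email     -- largest Email in the group (first row here)
FROM MyTable
GROUP BY Name
```

Here the single output row pairs the address of the second row with the email of the first, so this pattern is only safe when any value per column is acceptable. If the values must all come from the same row, Presto's max_by(x, y) aggregate or the ROW_NUMBER approach may be a better fit.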

4 reactions

NicolasGuary commented, Apr 21, 2020

Found a solution from https://redshift-support.matillion.com/s/article/2822021

ROW_NUMBER() OVER ( PARTITION BY <<unique columns>> ORDER BY <<sort columns>>) as counts

Then select only the rows where counts = 1.

Hope this can help
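Put together, the ROW_NUMBER approach could look roughly like the sketch below, which approximates the Postgres example from the issue (distinct on (a, b), ordered by e desc, f asc). The sample rows and the inline definition of d are hypothetical; the counts alias is taken from the comment above.

```sql
-- Hypothetical sample data standing in for table d.
WITH d AS (
  SELECT *
  FROM (
    VALUES
      (1, 'x', 'first',  10, 1),
      (1, 'x', 'second', 20, 2),
      (2, 'y', 'third',   5, 3)
  ) AS t (a, b, c, e, f)
),
ranked AS (
  SELECT
    c,
    -- Number the rows within each (a, b) group, best row first.
    ROW_NUMBER() OVER (PARTITION BY a, b ORDER BY e DESC, f ASC) AS counts
  FROM d
)
-- Keep only the first row of each group.
SELECT c
FROM ranked
WHERE counts = 1
```

With this sample data the query should return 'second' and 'third', one row per (a, b) pair.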
