
Add support for select distinct on(a, b, c) ...


The issue in Presto is that, on the one hand, one cannot use

select distinct on (a, b)
  c
from d

but one also cannot use:

select
  c
from d
group by a, b

Together, these two limitations make deduplicating rows relatively cumbersome: it requires either a subquery with a window function that filters on the row number, or array aggregations. Both carry a lot of context, add complexity that grows quickly as more columns are involved, and are much more error-prone than either of the cleaner solutions above.
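
For illustration, the array-aggregation workaround mentioned above might look roughly like the sketch below. It reuses the placeholder names a, b, c, d, e and f from this issue, and it assumes a Presto version that accepts order by inside array_agg; treat it as a sketch rather than a tested query.

```sql
-- Sketch only: one value of c per (a, b) group, picked by e desc, f asc.
-- Assumes array_agg(... order by ...) is supported by the Presto version in use.
select
  a,
  b,
  element_at(array_agg(c order by e desc, f asc), 1) as c
from d
group by a, b
```

Every extra column that should come from the same "winning" row needs its own aggregation with the same ordering, which is where the added context and the room for error come from.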

The Postgres implementation of select distinct on is very straightforward and even allows for custom sorting, e.g.:

-- Postgres requires the distinct on expressions to lead the order by list
select distinct on (a, b)
  c
from d
order by
  a,
  b,
  e desc,
  f asc

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Reactions: 9
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

9 reactions
Israel-Kli commented, Apr 22, 2020

@JivanRoquet The solution from here works for me on Athena:

SELECT Name, MAX(Address), MAX(other field)...
FROM MyTable
GROUP BY Name

This will give you one row per Name.

https://stackoverflow.com/a/6792357/9225626
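
One caveat with the MAX approach: each MAX is computed independently, so the values in a result row can come from different source rows. If the fields need to come from a single row, Presto's max_by aggregate is one possible alternative. The sketch below assumes a hypothetical updated_at column that decides which row wins and that is unique within each Name; that column is not part of the original comment.

```sql
-- Sketch only: updated_at is a hypothetical column used to pick the row.
-- max_by(x, y) returns the value of x from the row with the largest y.
SELECT
  Name,
  max_by(Address, updated_at) AS Address
FROM MyTable
GROUP BY Name
```

If updated_at can tie within a Name, several max_by calls are not guaranteed to pick the same row, so the row-number approach is safer in that case.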

4 reactions
NicolasGuary commented, Apr 21, 2020

Found a solution from https://redshift-support.matillion.com/s/article/2822021

ROW_NUMBER() OVER ( PARTITION BY <<unique columns>> ORDER BY <<sort columns>>) as counts

Then select only the rows where counts = 1.

Hope this can help
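
Put together, the row-number approach from this comment could look like the sketch below. It uses the placeholder names from the original issue (a, b as the unique columns, e and f as the sort columns, c as the selected column, d as the table); the subquery alias ranked is only illustrative.

```sql
-- Sketch only: emulates
--   select distinct on (a, b) c from d order by a, b, e desc, f asc
select c
from (
  select
    c,
    row_number() over (
      partition by a, b        -- the "distinct on" columns
      order by e desc, f asc   -- decides which row is kept per group
    ) as counts
  from d
) ranked
where counts = 1
```

All selected columns come from the same source row, which is the main difference from aggregating each column separately.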


