DISTINCT function broken since 3.1.3
CrateDB version: 3.1.3
Environment description: any environment
Problem description: DISTINCT in SELECT statements has not worked since 3.1.3. Results are not unique whenever more than one field is selected in the query. Wrapping a field in parentheses, i.e., DISTINCT(field), does not help.
Steps to reproduce:
- Import sample data (tweets).
- Run queries:
select distinct created_at, id from tweets limit 100;
yields all tweets.
select distinct created_at from tweets limit 100;
yields only a few tweets.
- Run the same queries on 3.1.2: both queries yield the same results.
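For comparison, here is a minimal sketch of how DISTINCT behaves over several columns in standard SQL. It uses Python's built-in sqlite3 module as a stand-in, not CrateDB, and the four-row tweets table is made up for illustration (it is not the sample data set from the steps above). It only shows that DISTINCT de-duplicates whole rows, so including a unique column such as id returns every row, and that parentheses around one field do not narrow it.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT CrateDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at TEXT)")

# Hypothetical data: four tweets, two distinct creation dates.
conn.executemany(
    "INSERT INTO tweets (id, created_at) VALUES (?, ?)",
    [(1, "2018-10-01"), (2, "2018-10-01"), (3, "2018-10-02"), (4, "2018-10-02")],
)

# DISTINCT on one column collapses the repeated dates.
one_col = conn.execute("SELECT DISTINCT created_at FROM tweets").fetchall()

# DISTINCT on (created_at, id) compares whole rows; id is unique, so nothing collapses.
two_cols = conn.execute("SELECT DISTINCT created_at, id FROM tweets").fetchall()

# Parentheses only group an expression; DISTINCT still applies to the whole row.
parens = conn.execute("SELECT DISTINCT (created_at), id FROM tweets").fetchall()

print(len(one_col), len(two_cols), len(parens))
```

In this sketch the single-column query returns fewer rows than the two-column queries, which matches the pattern reported for 3.1.3.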
Issue Analytics
- State:
- Created 5 years ago
- Comments:9 (5 by maintainers)
Top GitHub Comments
@hnstbndr Ok thanks, we’ll look into this asap.
I did some digging this morning. Below are the tables and data from my experiment.
The query is:
If c_comment.id is included in the select, c_user.id is not unique in the returned set (independent of the CrateDB version). In my real query I use _score in the select statement, and I think this is the problem. With CrateDB 3.1.2, _score was not unique in the returned set, so CrateDB was able to consolidate the rows. When I switched to CrateDB 3.1.3, _score was unique and the data was not consolidated. I have the same data in both versions… Thus, I thought there was an issue 😃 Turns out there is no issue. Closing this issue!
Apologies for the delay…
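The _score explanation above can be illustrated with a small sketch. It again uses Python's sqlite3 module as a stand-in rather than CrateDB, and the hits table, the score column (an ordinary column imitating _score), and the distinct_rows helper are all hypothetical. It shows that rows for the same user collapse under DISTINCT only while their scores tie, and that grouping by the user column is one possible way to get one row per user regardless of the scores.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT CrateDB.
conn = sqlite3.connect(":memory:")
# "score" is an ordinary column imitating a per-row relevance value such as _score.
conn.execute("CREATE TABLE hits (user_id INTEGER, score REAL)")


def distinct_rows(scores):
    """Hypothetical helper: store one hit per score for user 1, then run DISTINCT."""
    conn.execute("DELETE FROM hits")
    conn.executemany("INSERT INTO hits VALUES (1, ?)", [(s,) for s in scores])
    return conn.execute("SELECT DISTINCT user_id, score FROM hits").fetchall()


# Identical scores: the rows are equal as a whole, so DISTINCT consolidates them.
tied = distinct_rows([0.5, 0.5, 0.5])

# Different scores: every row is unique, so DISTINCT returns all of them.
varied = distinct_rows([0.5, 0.6, 0.7])

# Grouping by user and aggregating the score yields one row per user either way.
grouped = conn.execute(
    "SELECT user_id, MAX(score) FROM hits GROUP BY user_id"
).fetchall()

print(len(tied), len(varied), grouped)
```

Whether MAX is the right aggregate depends on the query; it is used here only to keep one score per user.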