map_top_n UDF proposal
Description: This function takes a map as input and returns a map containing the n entries with the largest values. This operation comes up in many scenarios, so a UDF would make such queries simpler and quicker to write.
map_top_n(map(K, V), integer n) -> map(K, V)
SELECT map_top_n(MAP(ARRAY[1, 2, 3, 4], ARRAY[120, 930, 200, 301]), 2); -- {2 -> 930, 4 -> 301}
SELECT map_top_n(MAP(ARRAY[1, 2, 3, 4], ARRAY['abc', 'abcd', 'efg', 'aaa']), 2); -- {2 -> 'abcd', 3 -> 'efg'}
SELECT map_top_n(null, 2); -- null
SELECT map_top_n(MAP(ARRAY[1, 2, 3], ARRAY[200, 100, 100]), 2); -- {1 -> 200, 2 -> 100}
SELECT map_top_n(MAP(ARRAY[1, 2], ARRAY[200, 100]), 4); -- {1 -> 200, 2 -> 100}
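The examples above can be modeled outside SQL to make the intended semantics concrete. The following is a minimal Python sketch, not the proposed implementation. It assumes that ties are broken by the order in which entries appear in the map (which reproduces the {1 -> 200, 2 -> 100} example) and that a negative n is an error; the proposal itself does not specify either point.

```python
from typing import Dict, Mapping, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def map_top_n(m: Optional[Mapping[K, V]], n: int) -> Optional[Dict[K, V]]:
    """Reference model of the proposed map_top_n(map(K, V), n)."""
    # A null map yields null, as in the `map_top_n(null, 2)` example.
    if m is None:
        return None
    # Assumption: the proposal does not say what a negative n should do.
    if n < 0:
        raise ValueError("n must be non-negative")
    # Sort entries by value, largest first. Python's sort is stable even
    # with reverse=True, so entries with equal values keep their original
    # order (an assumed tie-breaking rule).
    ranked = sorted(m.items(), key=lambda entry: entry[1], reverse=True)
    # Slicing past the end is safe, so n larger than the map size
    # simply returns every entry.
    return dict(ranked[:n])
```

This follows the same entries, sort, slice, rebuild shape as the array-based workaround discussed in the comments.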
Issue Analytics
- State:
- Created 4 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
map_filter_n is a very strange concept. The lambda does not really define a filter, but a sort order. So what you really want is map_first_n(map, (k1, v1, k2, v2) -> ..., n), which is basically massaging all the parameters of map_from_entries(slice(array_sort(map_entries(map), v -> ...), 1, n)) into one function. I'm not sure how appealing that is. Once we support SQL expression functions (#9613), this can be added as a SQL expression function instead. A more generic need for maps with a lambda function is probably a map_reduce, similar to the array reduce function. It allows you to keep state while looping through the entries, though in this case the state would be quite complicated. An implementation with a lambda is unlikely to be more readable or performant than converting to an array, sorting, slicing, and converting back.

Yes, that definitely works! But I'm just curious why not have a UDF similar to map_filter exclusively for handling maps.