argsort/sort with O(n) algorithms: counting-sort, radix-sort
You can already sort with counting sort using NumPy: as stated in an SO answer, it is simply:
np.repeat(np.arange(1+x.max()), np.bincount(x))
But there is no way (as far as I can tell) to do an argsort.
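For illustration, here is a minimal sketch of what a counting-sort argsort could look like. The name counting_argsort is hypothetical and is not part of NumPy; the placement loop is written in pure Python to show the algorithm, so it is not meant as a fast implementation. It assumes a 1-D array of small non-negative integers.

```python
import numpy as np

def counting_argsort(x):
    """Stable argsort for a 1-D array of small non-negative integers.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    x = np.asarray(x)
    counts = np.bincount(x)  # counts[v] = number of occurrences of v
    # starts[v] = first output slot reserved for value v (exclusive prefix sum)
    starts = np.zeros(counts.size, dtype=np.intp)
    np.cumsum(counts[:-1], out=starts[1:])
    order = np.empty(x.size, dtype=np.intp)
    # Walk the input left to right so equal values keep their original order.
    for i, v in enumerate(x):
        order[starts[v]] = i
        starts[v] += 1
    return order

x = np.array([3, 1, 2, 1, 0, 3])
order = counting_argsort(x)
sorted_x = x[order]
```

Indexing with the result, as in x[order], should give the same array as the np.repeat one-liner above, and the result should match np.argsort(x, kind="stable").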
I think it would be fairly easy to implement and slot into the argsort/sort documentation. The benefit of counting sort is that it's O(n) (more precisely O(n + k), where k is the largest value), although it only works for not-too-large non-negative integers. However, those are actually quite common in the real world! Pandas apparently uses it behind the scenes.
Issue Analytics
- State:
- Created 8 years ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
Everyone loves a super fast sort!
In terms of API, why not keep the old kind kwarg for backward compatibility, but then add other keywords to control each requirement/hint? Any/all hint kwargs could be ignored in initial/future implementations:
- stable - True/False
- nearly_sorted - True/False
- small_int_range - True/False
- small_float_set - True/False
- etc.
There are good implementations of radix/counting sort/argsort in this repo: https://github.com/eloj/radix-sorting
O(n) integer sort is important for PyData/Sparse as a lot of time is spent on integer sorting in many operations. See pydata/sparse#52. It’s also important to Zarr (zarr-developers/zarr#236)
Ideally, radix sort should be the default sort for integer arrays. The reason is that it’s O(n) instead of O(n log n). Benchmarks in the repository I linked show as much as a 7x speed up for moderately sized arrays.
Radix sort is also stable, which means it doesn’t change the relative ordering of any equal elements.
The only downside to radix sort is that it isn’t in-place: it needs a temporary array to get the work done.
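To make the stability and temporary-array points concrete, here is a rough sketch of a least-significant-digit radix argsort built from 16-bit passes. The name radix_argsort is hypothetical and not part of NumPy. Each pass leans on np.argsort with kind="stable" applied to a uint16 digit array as the stable per-digit step; that is a shortcut for brevity, not how a native implementation would do it. It assumes a 1-D array of non-negative signed integers.

```python
import numpy as np

BITS = 16                 # digit width per pass
MASK = (1 << BITS) - 1

def radix_argsort(x):
    """LSD radix argsort for a 1-D array of non-negative integers.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    x = np.asarray(x)
    if x.size == 0:
        return np.empty(0, dtype=np.intp)
    # Temporary index array: this is the extra memory radix sort needs.
    order = np.arange(x.size, dtype=np.intp)
    max_val = int(x.max())
    shift = 0
    while shift == 0 or (max_val >> shift) > 0:
        # Extract the current 16-bit digit of each key, in the current order.
        digit = ((x[order] >> shift) & MASK).astype(np.uint16)
        # A stable per-digit sort preserves the ordering from earlier passes,
        # which is what makes the overall result correct and stable.
        order = order[np.argsort(digit, kind="stable")]
        shift += BITS
    return order

rng = np.random.default_rng(0)
keys = rng.integers(0, 1 << 40, size=1000)
result = radix_argsort(keys)
```

The number of passes depends only on the key width (three passes for 40-bit keys here), not on the array length, which is where the linear-time behaviour comes from.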