argsort/sort with O(n) algorithms: counting-sort, radix-sort

You can already sort with counting sort using NumPy; as stated in an SO answer, it is simply:

np.repeat(np.arange(1+x.max()), np.bincount(x))

But there is no way (as far as I can tell) to do an argsort.

I think it would be fairly easy to implement and slot into the argsort/sort documentation. The benefit of counting sort is that it’s O(n), although it only works for non-negative integers that are not too large. Those are actually quite common in the real world, however! Pandas apparently uses it behind the scenes.
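To make the request concrete, here is a minimal sketch of what a counting-sort argsort could look like. The name `counting_argsort` is hypothetical and not part of NumPy, and the sketch assumes a 1-D array of non-negative integers with a signed integer dtype and a modest maximum value. The placement loop is written in plain Python for readability, so this illustrates the algorithm rather than its speed.

```python
import numpy as np

def counting_argsort(x):
    """Stable argsort sketch for a 1-D array of small non-negative ints.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    x = np.asarray(x)
    counts = np.bincount(x)
    # starts[v] is the output slot of the first element equal to v.
    starts = np.cumsum(counts) - counts
    order = np.empty(x.size, dtype=np.intp)
    # Walk the input left to right so equal values keep their original order.
    for i, v in enumerate(x):
        order[starts[v]] = i
        starts[v] += 1
    return order

x = np.array([3, 1, 2, 1, 0, 3])
order = counting_argsort(x)
# x[order] is sorted, and order agrees with np.argsort(x, kind="stable").
```

The extra memory is one counts array of length `1 + x.max()`, which is why the approach only pays off when the value range is small relative to the array length.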

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
d1manson commented, Jul 6, 2015

Everyone loves a super fast sort!

In terms of API, why not keep the old kind kwarg for backward compatibility, but add other keywords to control each requirement/hint? Any or all hint kwargs could be ignored in initial or future implementations:

  • stable - True/False.
  • nearly_sorted - True/False
  • small_int_range - True/False
  • small_float_set - True/False etc.
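A rough sketch of how such hint keywords might behave, written as a standalone wrapper. Everything here is an assumption for illustration: `sort_with_hints` is a hypothetical name, the keyword names are taken from the list above, and none of this is NumPy's actual API. Hints that the sketch does not know how to use are simply ignored, as the proposal allows.

```python
import numpy as np

def sort_with_hints(a, kind=None, stable=False, nearly_sorted=False,
                    small_int_range=False, small_float_set=False):
    """Hypothetical wrapper around np.sort; not part of NumPy."""
    a = np.asarray(a)
    # An explicit kind wins, which keeps old call sites working unchanged.
    if kind is not None:
        return np.sort(a, kind=kind)
    # Use counting sort only when the hint is given AND the data qualifies.
    if (small_int_range and a.ndim == 1 and a.size > 0
            and np.issubdtype(a.dtype, np.signedinteger) and a.min() >= 0):
        return np.repeat(np.arange(1 + a.max(), dtype=a.dtype), np.bincount(a))
    # nearly_sorted and small_float_set are accepted but ignored in this sketch.
    return np.sort(a, kind="stable" if stable else None)

a = np.array([5, 2, 2, 0, 7])
result = sort_with_hints(a, small_int_range=True)
```

Because a hint is only advice, a wrong hint (for example `small_int_range=True` on a float array) falls back to the ordinary sort instead of raising.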

0 reactions
hameerabbasi commented, Nov 9, 2018

There are good implementations of radix/counting sort/argsort in this repo: https://github.com/eloj/radix-sorting

O(n) integer sort is important for PyData/Sparse, as a lot of time is spent on integer sorting in many operations. See pydata/sparse#52. It’s also important to Zarr (zarr-developers/zarr#236).

Ideally, radix sort should be the default sort for integer arrays, because it’s O(n) instead of O(n log n). Benchmarks in the repository I linked show as much as a 7x speedup for moderately sized arrays.

Radix sort is also stable, which means it doesn’t change the relative ordering of any equal elements.

The only downside to radix sort is that it isn’t in-place: it needs a temporary array to get the work done.
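The two properties mentioned above, stability and the need for a temporary array, both show up in a minimal least-significant-digit radix argsort. This is a sketch under stated assumptions, not the implementation from the linked repository: `radix_argsort` and `bits_per_pass` are hypothetical names, the input is assumed to be a 1-D array of non-negative integers with a signed dtype, and the inner loop is plain Python for clarity.

```python
import numpy as np

def radix_argsort(x, bits_per_pass=8):
    """LSD radix argsort sketch for non-negative ints; not a NumPy function."""
    x = np.asarray(x)
    order = np.arange(x.size, dtype=np.intp)
    if x.size == 0:
        return order
    mask = (1 << bits_per_pass) - 1
    max_val = int(x.max())
    shift = 0
    # One stable counting pass per digit, least significant digit first.
    while shift == 0 or (max_val >> shift) > 0:
        digits = ((x[order] >> shift) & mask).astype(np.intp)
        counts = np.bincount(digits, minlength=mask + 1)
        starts = np.cumsum(counts) - counts
        # The temporary array: each pass writes into a fresh buffer.
        new_order = np.empty_like(order)
        for pos, d in zip(order, digits):
            new_order[starts[d]] = pos
            starts[d] += 1
        order = new_order
        shift += bits_per_pass
    return order

x = np.array([170, 45, 75, 90, 802, 24, 2, 66, 45])
order = radix_argsort(x, bits_per_pass=4)
# order agrees with np.argsort(x, kind="stable"), ties included.
```

Each pass visits the elements in the order left by the previous pass, so ties on the current digit preserve the earlier ordering; that is what makes the overall result stable.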
