Non-comparison sorting

Counting sort

  • Be careful. not x is True when x is 0, not just when x is None.

  • count must have length (max value + 1) not max value

  • Be careful about arr and count and check the indices are matched up and in range

  • You can extend this implementation to negative numbers by finding the min and the max and mapping the values into the positive range.

  • Decrementing from the count array as you overwrite the input array is a little nicer than creating another counter integer.

  • Counting sort with negative numbers take some additional care (https://stackoverflow.com/questions/40476521/using-counting-sort-with-negative-values-descending-order)

def counting_sort(arr):
    max_value = None
    for i in range(len(arr)):
        if max_value is None or arr[i] > max_value:
            max_value = arr[i]

    count = [0]*(max_value + 1)

    # Be careful about arr and count
    # and what i is an index to.
    for i in range(len(arr)):
        count[arr[i]] += 1

    k = 0
    for i in range(len(count)):
        while count[i] > 0:
            arr[k] = i
            count[i] -= 1
            k += 1

The last part that converts the count array to the sorted array doesn’t work if

  • The elements have associated data that we want to copy along with the key.

  • The key (what we’re sorting by) is not the same as the data (as in radix sort).

An alternative for that last part that does work with radix sort is:

for i in range(1, len(count)):
    count[i] += count[i-1]

output[0]*len(arr)
# Don't forget to go in reverse
for i in range(len(arr)-1, -1, -1):
    # count[arr[i]-1 gives the index of output not arr
    output[count[arr[i]]-1] = arr[i]
    count[arr[i]] -= 1

for i in range(len(arr)):
  arr[i] = output[i]

Bucket sort

  • arr.sort() is in-place. sorted(arr) returns sorted array but doesn’t change arr.

  • Best for uniformly sorted data

  • Either no. of buckets is given or pick len(arr) buckets. If the latter, won’t that mean 1 element in each bucket? No, because what element goes in each bucket depends on its value and we may get unlucky and only get very small values.

def bucket_sort(arr, k):
    buckets = []
    for _ in range(k):
        buckets.append([])

    max_value = max(arr)
    for i in range(len(arr)):
        j = int((arr[i] / max_value) * (k - 1))
        buckets[j].append(arr[i])

    for j in range(len(buckets)):
        buckets[j].sort()

    k = 0
    for i in range(len(buckets)):
        for j in range(len(buckets[i])):
            arr[k] = buckets[i][j]
            k += 1

Radix sort

When you have a very large number in your array, you can’t use counting sort, because of how large the array of counts would be. But you can still use radix sort.

In radix sort, you find the largest number in the array and you determine the place of its largest digit (1s or 10s or 100s, etc). Then you sort the whole array by the digit of each number in the 1s place. Then you sort the whole array by the digit of each number in the 10s place. And so on until you’ve sorted by the digit of each number in the place of the largest digit of the largest number in the array.

For sorting, you have to use a stable array, because otherwise the work you did to sort by digits in lower places would be erased.

Why not sort by the digits in the 1s place last? Then the array would not be sorted at the end.\newline

What stable sorting to use? A stable counting sort is one option. First, you populate the count array like in vanilla counting sort. Then, you replace the count array with the cumulative sum of the counts. The property of this array is that if \(c_i \ne 0\), then \(c_i - 1\) gives you an index where you should place \(a_i\) and \(c_i - 2\) gives you another index and so on until \(c_i = 0\).

0 0 1 0 2 0
0 0 1 1 3 3
x x   x   x
def counting_sort(arr, place):
    count = [0]*10
    for i in range(len(arr)):
        # Example:
        # 1024 = (1*1000 + 0*100 + 2*10 + 4*1)
        # 1024 // 100 = (1*100 + 0*1 + 0 + 0)
        # (arr[i] // place) zeroes out the lower placeholders
        # and makes the target placeholder the 1s place.
        # We just need to get the coefficient on the 1s place.
        # We can do that by taking the number mod 10.
        digit = (arr[i] // place) % 10

        count[digit] += 1

    for i in range(1, len(count)):
        count[i] += count[i-1]

    output = [0]*len(arr)
    for i in range(len(arr)-1, -1, -1):
        digit = (arr[i] // place) % 10
      # remember the index of count is a digit
        output[count[digit]-1] = arr[i]
        count[digit] -= 1

    for i in range(len(output)):
        arr[i] = output[i]

def radix(arr):
    # The max place across each number in the array
    # is equal to the max place of the max number
    # in the array.
    m = max(arr)

    place = 1

    # Integer division divides and drops the remainder.
    # Example:
    # 1024 // 1 = 1024
    # 1024 // 10 = 102
    # 1024 // 100 = 10
    # 1024 // 1000 = 1
    # 1024 // 10000 = 0
    while m // place != 0:
        counting_sort(arr, place)
        place *= 10