Sorting Algorithms
3.1 Merge Sort
Algorithm. Divide the array in half, recursively sort each half, then merge.
MergeSort(A, l, r): if l < r: m = (l + r) / 2 MergeSort(A, l, m) MergeSort(A, m+1, r) Merge(A, l, m, r)Theorem 3.1. Merge sort runs in time in all cases (best, average, worst).
Proof. The recurrence is . By the Master theorem (case 2): , , So .
Theorem 3.2. Merge sort is stable and requires auxiliary space.
Proof of stability. In MergeWhen elements from the left and right halves are equal, we always take from the left half first. This preserves the relative order of equal elements.
Theorem 3.3 (Correctness of Merge). The Merge procedure produces a sorted array from two sorted subarrays.
Proof. Let and be sorted subarrays. The merge procedure maintains two indices and pointing to the next unmerged element in each subarray. At each step, the minimum of and is appended to the output. By induction on the number of elements placed: if elements have been placed and they are the smallest among the remaining elements, then the -th element placed is the minimum of the remaining elements. Since each subarray is sorted, the smallest remaining element is .
Theorem 3.4 (Optimality of Merge Sort). Merge sort achieves the optimal comparison-based sorting bound.
Proof. By Theorem 3.7, no comparison sort can do better than . Merge sort achieves in all cases, so it is asymptotically optimal among comparison sorts.
Worked Example: Merge Sort Trace
Sort the array :
Split: and
Split: and Split: and → Merge: Split: and → Merge: Merge:
Split: and Split: and → Merge: Merge:
Final merge:
Total comparisons: 13.
3.2 Quicksort
Algorithm. Choose a pivot, partition the array into elements pivot and pivot, recursively Sort each partition.
QuickSort(A, lo, hi): if lo < hi: p = Partition(A, lo, hi) QuickSort(A, lo, p - 1) QuickSort(A, p + 1, hi)Partition (Lomuto): Select the last element as pivot. Iterate through the array, maintaining that elements are pivot and are pivot.
Theorem 3.5. Quicksort runs in expected time and worst-case time.
Proof (expected case). If the pivot is chosen uniformly at random, the expected number of Comparisons is by an indicator random variable argument.
Let be the indicator random variable that and are compared, where are the sorted elements. Since elements are compared only if one is an ancestor of the other in the recursion tree, and the pivot is chosen uniformly at random:
The total number of comparisons is So:
Worst case occurs when the pivot is always the smallest or largest element (e.g., already sorted Array with first-element pivot): .
Theorem 3.6. Quicksort is not stable but uses expected stack space.
Worked Example: Quicksort with Median-of-Three Pivot
Sort using median-of-three (first, middle, last element).
Pivot selection: elements at positions 0, 4, 8 are . Median is .
Partition around :
Recurse on (pivot ): Recurse on (median of is ):
Recurse on (pivot ): Recurse on (pivot ):
Result:
3.3 Heapsort
Algorithm. Build a max-heap in time, then repeatedly extract the maximum.
Theorem 3.7. Heapsort runs in time in all cases, uses auxiliary space, and Is not stable.
Proof. Building the heap takes by Theorem 2.11. Each of the extract-max operations takes For a total of . Space is since the heap is stored in-place.
3.4 Lower Bound on Comparison Sorting
Theorem 3.8. Any comparison-based sorting algorithm requires comparisons in the Worst case.
Proof. A comparison-based sort can be modelled as a decision tree. The tree must have at least Leaves (one for each permutation). A binary tree with leaves has height at least .
More precisely, using Stirling”s approximation: .
3.5 Counting Sort
Counting sort sorts integers in time where is the range of values.
Algorithm:
- Count occurrences: number of elements equal to .
- Compute prefix sums: number of elements .
- Place each element in its correct position (iterating backwards for stability).
Theorem 3.9. Counting sort runs in time and space, and is stable.
Proof. Step 1 takes time. Step 2 takes time. Step 3 takes time. The output array uses space and the count array uses space. Stability follows from iterating backwards in step 3: the last occurrence of each value is placed first, preserving the order of equal elements.
3.6 Radix Sort
Radix sort processes digits from least significant to most significant (LSD radix sort) using a stable sort (e.g., counting sort) as a subroutine.
Theorem 3.10. LSD radix sort sorts integers with digits in base in time.
Proof. We perform passes of counting sort, each taking time. After the -th pass, the array is sorted by the least significant digits. By induction on After passes the array is fully sorted.
Corollary 3.11. For -digit integers where (e.g., 32-bit integers), radix sort runs in time.
Worked Example: Counting Sort Trace
Sort the array using counting sort.
Range of values: So .
Step 1 — Count: (indices 1 through 8).
Step 2 — Prefix sums: .
Step 3 — Place (iterate backwards):
- : Place at position 0. .
- : Place at position 4. .
- : Place at position 3. .
- : Place at position 5. .
- : Place at position 2. .
- : Place at position 1. .
- : Place at position 5. .
Result: .
Worked Example: Radix Sort Trace
Sort the array using LSD radix sort with base 10.
Pass 1 (ones digit): Sort by . After stable counting sort:
Pass 2 (tens digit): Sort by . After stable counting sort:
Pass 3 (hundreds digit): Sort by . After stable counting sort:
Total: 3 passes, each . Total: .
3.7 External Sorting
Problem. Sort data that is too large to fit in main memory.
Algorithm (External Merge Sort):
- Read chunks that fit in memory, sort each in memory, write sorted runs to disk.
- Repeatedly merge runs using a -way merge, reading/writing from disk.
Theorem 3.12. External merge sort with bytes of memory and bytes of data performs passes, where is the merge fan-in.
Proof. The initial pass creates sorted runs of size . Each merge pass combines runs into 1, reducing the number of runs by a factor of . After passes, a single sorted run remains. Each pass reads and writes all bytes, so total I/O is .
:::caution Common Pitfall The lower bound applies only to comparison-based sorting. Non-comparison sorts Like radix sort can achieve time for integers in a bounded range. However, non-comparison sorts sacrifice generality: they depend on the structure of the keys and cannot sort arbitrary objects.
3.8 Comparison of Sorting Algorithms
| Algorithm | Best | Average | Worst | Space | Stable |
|---|---|---|---|---|---|
| Merge Sort | Yes | ||||
| Quicksort | No | ||||
| Heapsort | No | ||||
| Counting Sort | Yes | ||||
| Radix Sort | Yes |
:::