Skip to content

Sorting Algorithms

3.1 Merge Sort

Algorithm. Divide the array in half, recursively sort each half, then merge.

MergeSort(A, l, r):
if l < r:
m = (l + r) / 2
MergeSort(A, l, m)
MergeSort(A, m+1, r)
Merge(A, l, m, r)

Theorem 3.1. Merge sort runs in O(nlogn)O(n \log n) time in all cases (best, average, worst).

Proof. The recurrence is T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n). By the Master theorem (case 2): a=2a = 2, b=2b = 2, f(n)=O(n)=O(nlogba)f(n) = O(n) = O(n^{\log_b a})So T(n)=O(nlogn)T(n) = O(n \log n). \blacksquare

Theorem 3.2. Merge sort is stable and requires O(n)O(n) auxiliary space.

Proof of stability. In MergeWhen elements from the left and right halves are equal, we always take from the left half first. This preserves the relative order of equal elements. \blacksquare

Theorem 3.3 (Correctness of Merge). The Merge procedure produces a sorted array from two sorted subarrays.

Proof. Let A[l..m]A[l..m] and A[m+1..r]A[m+1..r] be sorted subarrays. The merge procedure maintains two indices ii and jj pointing to the next unmerged element in each subarray. At each step, the minimum of A[i]A[i] and A[j]A[j] is appended to the output. By induction on the number of elements placed: if kk elements have been placed and they are the kk smallest among the remaining elements, then the (k+1)(k+1)-th element placed is the minimum of the remaining elements. Since each subarray is sorted, the smallest remaining element is min(A[i],A[j])\min(A[i], A[j]). \blacksquare

Theorem 3.4 (Optimality of Merge Sort). Merge sort achieves the optimal O(nlogn)O(n \log n) comparison-based sorting bound.

Proof. By Theorem 3.7, no comparison sort can do better than Ω(nlogn)\Omega(n \log n). Merge sort achieves O(nlogn)O(n \log n) in all cases, so it is asymptotically optimal among comparison sorts. \blacksquare

Worked Example: Merge Sort Trace

Sort the array [38,27,43,3,9,82,10][38, 27, 43, 3, 9, 82, 10]:

Split: [38,27,43,3][38, 27, 43, 3] and [9,82,10][9, 82, 10]

Split: [38,27][38, 27] and [43,3][43, 3] Split: [38][38] and [27][27] → Merge: [27,38][27, 38] Split: [43][43] and [3][3] → Merge: [3,43][3, 43] Merge: [3,27,38,43][3, 27, 38, 43]

Split: [9,82][9, 82] and [10][10] Split: [9][9] and [82][82] → Merge: [9,82][9, 82] Merge: [9,10,82][9, 10, 82]

Final merge: [3,9,10,27,38,43,82][3, 9, 10, 27, 38, 43, 82]

Total comparisons: 13.

3.2 Quicksort

Algorithm. Choose a pivot, partition the array into elements \leq pivot and >> pivot, recursively Sort each partition.

QuickSort(A, lo, hi):
if lo < hi:
p = Partition(A, lo, hi)
QuickSort(A, lo, p - 1)
QuickSort(A, p + 1, hi)

Partition (Lomuto): Select the last element as pivot. Iterate through the array, maintaining that elements A[lo..i]A[\mathrm{lo}..i] are \leq pivot and A[i+1..j1]A[i+1..j-1] are >> pivot.

Theorem 3.5. Quicksort runs in O(nlogn)O(n \log n) expected time and O(n2)O(n^2) worst-case time.

Proof (expected case). If the pivot is chosen uniformly at random, the expected number of Comparisons is O(nlogn)O(n \log n) by an indicator random variable argument.

Let XijX_{ij} be the indicator random variable that ziz_i and zjz_j are compared, where z1,,znz_1, \ldots, z_n are the sorted elements. Since elements are compared only if one is an ancestor of the other in the recursion tree, and the pivot is chosen uniformly at random:

E[Xij]=Pr(zi and zj are compared)=2ji+1\mathrm{E}[X_{ij}] = \Pr(z_i \mathrm{~and~} z_j \mathrm{~are~compared}) = \frac{2}{j - i + 1}

The total number of comparisons is X=i<jXijX = \sum_{i < j} X_{ij}So:

E[X]=i=1n1j=i+1n2ji+1k=1nn2k+1=O(nlogn)\mathrm{E}[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1} \leq \sum_{k=1}^{n} n \cdot \frac{2}{k+1} = O(n \log n)

Worst case occurs when the pivot is always the smallest or largest element (e.g., already sorted Array with first-element pivot): T(n)=T(n1)+O(n)=O(n2)T(n) = T(n-1) + O(n) = O(n^2). \blacksquare

Theorem 3.6. Quicksort is not stable but uses O(logn)O(\log n) expected stack space.

Worked Example: Quicksort with Median-of-Three Pivot

Sort [3,7,8,5,2,1,9,5,4][3, 7, 8, 5, 2, 1, 9, 5, 4] using median-of-three (first, middle, last element).

Pivot selection: elements at positions 0, 4, 8 are 3,2,43, 2, 4. Median is 33.

Partition around 33: [2,1,3,8,5,7,9,5,4][2, 1, 3, 8, 5, 7, 9, 5, 4]

Recurse on [2,1][2, 1] (pivot 22): [1,2][1, 2] Recurse on [8,5,7,9,5,4][8, 5, 7, 9, 5, 4] (median of 8,7,48, 7, 4 is 77): [4,5,5,7,9,8][4, 5, 5, 7, 9, 8]

Recurse on [4,5,5][4, 5, 5] (pivot 55): [4,5,5][4, 5, 5] Recurse on [9,8][9, 8] (pivot 99): [8,9][8, 9]

Result: [1,2,3,4,5,5,7,8,9][1, 2, 3, 4, 5, 5, 7, 8, 9]

3.3 Heapsort

Algorithm. Build a max-heap in O(n)O(n) time, then repeatedly extract the maximum.

Theorem 3.7. Heapsort runs in O(nlogn)O(n \log n) time in all cases, uses O(1)O(1) auxiliary space, and Is not stable.

Proof. Building the heap takes O(n)O(n) by Theorem 2.11. Each of the nn extract-max operations takes O(logn)O(\log n)For a total of O(nlogn)O(n \log n). Space is O(1)O(1) since the heap is stored in-place. \blacksquare

3.4 Lower Bound on Comparison Sorting

Theorem 3.8. Any comparison-based sorting algorithm requires Ω(nlogn)\Omega(n \log n) comparisons in the Worst case.

Proof. A comparison-based sort can be modelled as a decision tree. The tree must have at least n!n! Leaves (one for each permutation). A binary tree with n!n! leaves has height at least log2(n!)log2((n/2)n/2)=(n/2)log2(n/2)=Ω(nlogn)\log_2(n!) \geq \log_2((n/2)^{n/2}) = (n/2)\log_2(n/2) = \Omega(n \log n).

More precisely, using Stirling”s approximation: log2(n!)=nlog2nnlog2e+O(logn)=Ω(nlogn)\log_2(n!) = n \log_2 n - n \log_2 e + O(\log n) = \Omega(n \log n). \blacksquare

3.5 Counting Sort

Counting sort sorts integers in O(n+k)O(n + k) time where kk is the range of values.

Algorithm:

  1. Count occurrences: C[i]=C[i] = number of elements equal to ii.
  2. Compute prefix sums: C[i]=C[i] = number of elements i\leq i.
  3. Place each element in its correct position (iterating backwards for stability).

Theorem 3.9. Counting sort runs in O(n+k)O(n + k) time and O(n+k)O(n + k) space, and is stable.

Proof. Step 1 takes O(n)O(n) time. Step 2 takes O(k)O(k) time. Step 3 takes O(n)O(n) time. The output array uses O(n)O(n) space and the count array uses O(k)O(k) space. Stability follows from iterating backwards in step 3: the last occurrence of each value is placed first, preserving the order of equal elements. \blacksquare

3.6 Radix Sort

Radix sort processes digits from least significant to most significant (LSD radix sort) using a stable sort (e.g., counting sort) as a subroutine.

Theorem 3.10. LSD radix sort sorts nn integers with dd digits in base bb in O(d(n+b))O(d(n + b)) time.

Proof. We perform dd passes of counting sort, each taking O(n+b)O(n + b) time. After the ii-th pass, the array is sorted by the ii least significant digits. By induction on iiAfter dd passes the array is fully sorted. \blacksquare

Corollary 3.11. For dd-digit integers where d=O(1)d = O(1) (e.g., 32-bit integers), radix sort runs in O(n)O(n) time.

Worked Example: Counting Sort Trace

Sort the array [4,2,2,8,3,3,1][4, 2, 2, 8, 3, 3, 1] using counting sort.

Range of values: [1,8][1, 8]So k=8k = 8.

Step 1 — Count: C=[0,1,2,2,1,0,0,0]C = [0, 1, 2, 2, 1, 0, 0, 0] (indices 1 through 8).

Step 2 — Prefix sums: C=[0,1,3,5,6,6,6,6]C = [0, 1, 3, 5, 6, 6, 6, 6].

Step 3 — Place (iterate backwards):

  • A[6]=1A[6] = 1: C[1]=1C[1] = 1Place at position 0. C[1]=0C[1] = 0.
  • A[5]=3A[5] = 3: C[3]=5C[3] = 5Place at position 4. C[3]=4C[3] = 4.
  • A[4]=3A[4] = 3: C[3]=4C[3] = 4Place at position 3. C[3]=3C[3] = 3.
  • A[3]=8A[3] = 8: C[8]=6C[8] = 6Place at position 5. C[8]=5C[8] = 5.
  • A[2]=2A[2] = 2: C[2]=3C[2] = 3Place at position 2. C[2]=2C[2] = 2.
  • A[1]=2A[1] = 2: C[2]=2C[2] = 2Place at position 1. C[2]=1C[2] = 1.
  • A[0]=4A[0] = 4: C[4]=6C[4] = 6Place at position 5. C[4]=5C[4] = 5.

Result: [1,2,2,3,3,4,8][1, 2, 2, 3, 3, 4, 8].

Worked Example: Radix Sort Trace

Sort the array [170,45,75,90,802,24,2,66][170, 45, 75, 90, 802, 24, 2, 66] using LSD radix sort with base 10.

Pass 1 (ones digit): Sort by 0,5,5,0,2,4,2,60, 5, 5, 0, 2, 4, 2, 6. After stable counting sort: [170,90,802,2,24,45,75,66][170, 90, 802, 2, 24, 45, 75, 66]

Pass 2 (tens digit): Sort by 7,9,0,0,2,4,7,67, 9, 0, 0, 2, 4, 7, 6. After stable counting sort: [802,2,24,45,66,170,75,90][802, 2, 24, 45, 66, 170, 75, 90]

Pass 3 (hundreds digit): Sort by 8,0,0,0,0,1,0,08, 0, 0, 0, 0, 1, 0, 0. After stable counting sort: [2,24,45,66,75,90,170,802][2, 24, 45, 66, 75, 90, 170, 802]

Total: 3 passes, each O(n+10)=O(n)O(n + 10) = O(n). Total: O(3n)=O(n)O(3n) = O(n).

3.7 External Sorting

Problem. Sort data that is too large to fit in main memory.

Algorithm (External Merge Sort):

  1. Read chunks that fit in memory, sort each in memory, write sorted runs to disk.
  2. Repeatedly merge runs using a kk-way merge, reading/writing from disk.

Theorem 3.12. External merge sort with MM bytes of memory and NN bytes of data performs O(NMlogk(NM))O\left(\frac{N}{M} \log_{k}\left(\frac{N}{M}\right)\right) passes, where kk is the merge fan-in.

Proof. The initial pass creates N/MN/M sorted runs of size MM. Each merge pass combines kk runs into 1, reducing the number of runs by a factor of kk. After logk(N/M)\log_k(N/M) passes, a single sorted run remains. Each pass reads and writes all NN bytes, so total I/O is O(Nlogk(N/M))O(N \log_k(N/M)). \blacksquare

:::caution Common Pitfall The O(nlogn)O(n \log n) lower bound applies only to comparison-based sorting. Non-comparison sorts Like radix sort can achieve O(n)O(n) time for integers in a bounded range. However, non-comparison sorts sacrifice generality: they depend on the structure of the keys and cannot sort arbitrary objects.

3.8 Comparison of Sorting Algorithms

AlgorithmBestAverageWorstSpaceStable
Merge SortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(nlogn)O(n \log n)O(n)O(n)Yes
QuicksortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(n2)O(n^2)O(logn)O(\log n)No
HeapsortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(nlogn)O(n \log n)O(1)O(1)No
Counting SortO(n+k)O(n + k)O(n+k)O(n + k)O(n+k)O(n + k)O(n+k)O(n + k)Yes
Radix SortO(d(n+b))O(d(n+b))O(d(n+b))O(d(n+b))O(d(n+b))O(d(n+b))O(n+b)O(n + b)Yes

:::