Algorithm Design
1. Divide and Conquer
1.1 Strategy
- Divide the problem into smaller subproblems.
- Conquer each subproblem recursively.
- Combine solutions to form the final answer.
1.2 Merge Sort
MERGE_SORT(A, l, r): if l < r: m = floor((l + r) / 2) MERGE_SORT(A, l, m) MERGE_SORT(A, m+1, r) MERGE(A, l, m, r)
MERGE(A, l, m, r): n1 = m - l + 1; n2 = r - m L = A[l..m]; R = A[m+1..r] i = j = 0; k = l while i < n1 and j < n2: if L[i] <= R[j]: A[k++] = L[i++] else: A[k++] = R[j++] while i < n1: A[k++] = L[i++] while j < n2: A[k++] = R[j++]Recurrence: . By Master theorem, .
Stable sort (preserves relative order of equal elements). Requires extra space.
1.3 Quicksort
QUICKSORT(A, l, r): if l < r: p = PARTITION(A, l, r) QUICKSORT(A, l, p-1) QUICKSORT(A, p+1, r)
PARTITION(A, l, r): pivot = A[r] i = l - 1 for j = l to r-1: if A[j] <= pivot: i += 1 swap A[i], A[j] swap A[i+1], A[r] return i + 1Best/Average case: . Worst case: (already sorted, bad pivot).
Randomized pivot: Pick pivot randomly to get expected .
Not stable. In-place (no extra memory beyond recursion stack).
1.4 Strassen”s Matrix Multiplication
Multiply two matrices faster than the naive .
Idea: Divide each matrix into four blocks. Compute 7 products (instead of 8) and combine.
Recurrence: . Solution: .
1.5 Master Theorem
For recurrences of the form where , :
Compare with :
| Case | Condition | Solution |
|---|---|---|
| 1 | for | |
| 2 | , | |
| 3 | and regularity |
Regularity condition (Case 3): for some .
Examples:
| Recurrence | Case | Result | ||||
|---|---|---|---|---|---|---|
| 2 | 2 | 2 | ||||
| 4 | 2 | 1 | ||||
| 4 | 2 | 2 (k=1) | ||||
| 2 | 2 | 3 |
2. Greedy Algorithms
2.1 Strategy
Make the locally optimal choice at each step. Does not always produce a globally optimal solution; must prove correctness.
Greedy choice property: A globally optimal solution can be constructed by making locally optimal choices.
Optimal substructure: An optimal solution contains optimal solutions to subproblems.
2.2 Activity Selection
Given activities with start times and finish times , select the maximum number of non-overlapping activities.
ACTIVITY_SELECT(s, f): sort activities by increasing finish time A = {1} // select first activity k = 1 for i = 2 to n: if s[i] >= f[k]: A = A ∪ {i} k = i return ATime: for sorting, for selection. Optimal.
2.3 Huffman Coding
Build an optimal prefix-free code for minimizing expected code length.
HUFFMAN(C): n = |C| Q = MIN_PRIORITY_QUEUE(C) // keyed by frequency for i = 1 to n-1: x = EXTRACT_MIN(Q) y = EXTRACT_MIN(Q) z = new node with children x, y z.freq = x.freq + y.freq INSERT(Q, z) return EXTRACT_MIN(Q)Cost: if using a binary heap.
Optimality proof: Greedy choice: merge the two lowest-frequency trees.
2.4 Kruskal’s Algorithm (MST)
KRUSKAL(G): A = {} for each v in G.V: MAKE_SET(v) sort G.E by weight w for each (u, v) in G.E in sorted order: if FIND_SET(u) != FIND_SET(v): A = A ∪ {(u, v)} UNION(u, v) return ATime: with Union-Find. Optimal.
2.5 Prim’s Algorithm (MST)
PRIM(G, r): for each u in G.V: u.key = INF; u.parent = NIL r.key = 0 Q = MIN_PRIORITY_QUEUE(G.V) // keyed by u.key while Q is not empty: u = EXTRACT_MIN(Q) for each v in G.Adj[u]: if v in Q and w(u,v) < v.key: v.parent = u DECREASE_KEY(Q, v, w(u,v))Time: with binary heap, with Fibonacci heap.
2.6 Dijkstra’s Algorithm (Shortest Paths)
DIJKSTRA(G, w, s): for each v in G.V: v.dist = INF; v.parent = NIL s.dist = 0 S = {} // processed vertices Q = MIN_PRIORITY_QUEUE(G.V) // keyed by v.dist while Q is not empty: u = EXTRACT_MIN(Q) S = S ∪ {u} for each v in G.Adj[u]: if v.dist > u.dist + w(u, v): v.dist = u.dist + w(u, v) v.parent = u DECREASE_KEY(Q, v, v.dist)Time: with binary heap. with Fibonacci heap.
Requires: Non-negative edge weights.
3. Dynamic Programming
3.1 Strategy
Solve problems by combining solutions to overlapping subproblems. Store subproblem solutions to avoid recomputation.
Two approaches:
- Top-down (memoization): Recursion with caching.
- Bottom-up (tabulation): Fill a table from smallest subproblems upward.
3.2 Memoization vs. Tabulation
Memoization:
FIB(n, memo={}): if n in memo: return memo[n] if n <= 1: return n memo[n] = FIB(n-1, memo) + FIB(n-2, memo) return memo[n]Tabulation:
FIB(n): dp[0] = 0; dp[1] = 1 for i = 2 to n: dp[i] = dp[i-1] + dp[i-2] return dp[n]When to use which:
- Memoization: When not all subproblems are needed.
- Tabulation: When all subproblems are needed (often more space-efficient with table optimization).
3.3 Longest Common Subsequence (LCS)
Given sequences and , find the longest common subsequence.
Recurrence:
LCS(X, Y): m = |X|; n = |Y| c = 2D array [0..m][0..n] for i = 1 to m: for j = 1 to n: if X[i] == Y[j]: c[i][j] = c[i-1][j-1] + 1 else: c[i][j] = max(c[i-1][j], c[i][j-1]) return c[m][n] // length of LCSTime: . Space: (optimizable to ).
3.4 0/1 Knapsack
Given items with weight and value , and a knapsack of capacity , maximize total value.
Recurrence:
KNAPSACK(W, weights, values, n): dp = 2D array [0..n][0..W] for i = 1 to n: for w = 1 to W: dp[i][w] = dp[i-1][w] if weights[i] <= w: dp[i][w] = max(dp[i][w], dp[i-1][w-weights[i]] + values[i]) return dp[n][W]Time: . Pseudo-polynomial.
3.5 Coin Change (Minimum Coins)
Given coin denominations and amount , find the minimum number of coins.
Recurrence:
COIN_CHANGE(coins, A): dp = array [0..A], initialized to INF dp[0] = 0 for a = 1 to A: for each c in coins: if c <= a: dp[a] = min(dp[a], dp[a - c] + 1) return dp[A] if dp[A] != INF else -1Time: . Space: .
3.6 Matrix Chain Multiplication
Given matrices with dimensions , parenthesize to minimize scalar multiplications.
Recurrence:
Time: . Space: .
3.7 Edit Distance (Levenshtein)
Minimum number of insertions, deletions, and substitutions to transform into .
Time: . Space: with optimization.
3.8 DP Problem Identification Checklist
- Overlapping subproblems? The same subproblem is solved multiple times.
- Optimal substructure? The optimal solution contains optimal solutions to subproblems.
- Can you define a recurrence? Express the answer in terms of smaller subproblems.
- What are the subproblem parameters? Define the DP state variables.
- What is the base case? The smallest subproblem that can be answered directly.
4. Backtracking
4.1 Strategy
Build solutions incrementally. When a partial solution cannot be extended to a valid complete solution, backtrack (undo the last choice and try another).
4.2 N-Queens Problem
Place queens on an board so no two queens attack each other.
N_QUEENS(board, row, n): if row == n: print board return true for col = 1 to n: if is_safe(board, row, col): board[row][col] = 1 if N_QUEENS(board, row+1, n): return true board[row][col] = 0 // backtrack return false
IS_SAFE(board, row, col): for i = 0 to row-1: if board[i][col] == 1: return false if col - (row-i) >= 0 and board[i][col-(row-i)] == 1: return false if col + (row-i) < N and board[i][col+(row-i)] == 1: return false return trueTime: in worst case. Backtracking prunes most invalid branches.
4.3 Subset Sum
Given a set of integers and target , determine if any subset sums to .
SUBSET_SUM(S, T, idx=0, current=0): if current == T: return true if idx >= len(S) or current > T: return false // Include S[idx] if SUBSET_SUM(S, T, idx+1, current + S[idx]): return true // Exclude S[idx] if SUBSET_SUM(S, T, idx+1, current): return true return falseTime: worst case. Pruned by current > T.
4.4 Sudoku Solver
SUDOKU(board): find empty cell (row, col) if no empty cell: return true // solved for num = 1 to 9: if is_valid(board, row, col, num): board[row][col] = num if SUDOKU(board): return true board[row][col] = 0 return false5. Branch and Bound
5.1 Strategy
Extend backtracking with bounds on the best possible solution from a given partial solution. Prune branches that cannot improve on the best solution found so far.
5.2 Branch and Bound for 0/1 Knapsack
Use the fractional knapsack solution as an upper bound for each node.
KNAPSACK_BB(items, W, idx=0, current_value=0, current_weight=0): if current_weight <= W: best = max(best, current_value) if idx >= n: return // Bound: fractional relaxation bound = current_value + fractional_knapsack_value(items[idx..n-1], W - current_weight) if bound <= best: return // prune // Branch: include or exclude items[idx] if current_weight + items[idx].weight <= W: KNAPSACK_BB(items, W, idx+1, current_value + items[idx].value, current_weight + items[idx].weight) KNAPSACK_BB(items, W, idx+1, current_value, current_weight)5.3 Traveling Salesman (Branch and Bound)
TSP_BB(cost_matrix): best = INF // Start from city 0 path = [0] visited = {0} TSP_RECURSE(cost_matrix, path, visited, 0, 0, best) return best
TSP_RECURSE(cost, path, visited, curr, curr_cost, best): if len(path) == n: total = curr_cost + cost[curr][0] // return to start best = min(best, total) return best for city = 1 to n-1: if city not in visited: new_cost = curr_cost + cost[curr][city] // Lower bound: current cost + MST of remaining cities + min edges if new_cost + lower_bound(path, visited) < best: visited.add(city) path.append(city) best = TSP_RECURSE(cost, path, visited, city, new_cost, best) path.pop() visited.remove(city) return best6. Common Pitfalls
Applying greedy when DP is needed. The activity selection problem admits a greedy solution, but 0/1 knapsack does not (unlike fractional knapsack). Always verify the greedy choice property.
Forgetting base cases in DP. Every recurrence needs a base case. Missing or incorrect base cases produce wrong answers. For LCS, .
Incorrect DP state definition. The state must capture all information needed to transition. In knapsack,
dp[i][w]works because we only need item index and remaining capacity. In harder problems, you may need additional dimensions.Assuming Dijkstra works with negative weights. Dijkstra fails with negative edges because it assumes a vertex’s distance is finalized when extracted from the priority queue. Use Bellman-Ford for negative weights.
Inefficient memoization without tabulation. Recursion with memoization has overhead (function call stack, hash table lookups). For problems where all subproblems are needed, tabulation is faster and uses less memory.
Not pruning in backtracking/branch-and-bound. Without pruning, backtracking degenerates to brute force ( or ). Effective bounding is essential for practical performance.
Wrong Master theorem case application. Ensure is polynomially different from . If , this does not cleanly fit any case and requires other methods.
Worked Examples
Example 1: Applying the Master Theorem
Problem: Solve the recurrence T(n) = 4T(n/2) + n^2 log n. Solution: a=4, b=2, f(n)=n^2 log n. n^{log_b a} = n^{log_2 4} = n^2. f(n) = n^2 log n = n^{log_b a} * log n. This fits Case 2 of the Master theorem (polylogarithmic factor). T(n) = Theta(n^2 log^2 n).
Example 2: Greedy vs Dynamic Programming
Problem: Given a set of coin denominations {1, 3, 4} and a target amount of 6, find the minimum number of coins. Does the greedy approach work? Solution: Greedy: take largest coin first: 4 + 1 + 1 = 3 coins. Optimal: 3 + 3 = 2 coins. Greedy fails here because the coin denominations are not canonical. Dynamic programming: dp[0]=0, dp[1]=1, dp[2]=2, dp[3]=1, dp[4]=1, dp[5]=2, dp[6]=2. Optimal solution: 2 coins (two 3s).
Summary
- Divide and conquer splits problems into independent subproblems: merge sort, quicksort, Strassen. The Master theorem solves recurrences of the form .
- Greedy algorithms make locally optimal choices. Correctness requires proving the greedy choice property and optimal substructure. Examples: Huffman, Kruskal, Prim, Dijkstra.
- Dynamic programming solves overlapping subproblems via memoization or tabulation. Key problems: LCS, knapsack, coin change, edit distance, matrix chain multiplication.
- Backtracking incrementally builds solutions and prunes invalid paths. Used for N-queens, subset sum, Sudoku.
- Branch and bound enhances backtracking with bounds to prune provably suboptimal branches.
Cross-References
| Topic | Link |
|---|---|
| Data Structures | View |
| Algorithms Overview | View |
| Complexity Theory | View |