Skip to content

Algorithm Analysis

1.1 Asymptotic Notation

Definition. f(n)=O(g(n))f(n) = O(g(n)) if there exist constants c>0c > 0 and n0n_0 such that f(n)cg(n)f(n) \leq c \cdot g(n) for all nn0n \geq n_0.

Definition. f(n)=Ω(g(n))f(n) = \Omega(g(n)) if there exist constants c>0c > 0 and n0n_0 such that f(n)cg(n)f(n) \geq c \cdot g(n) for all nn0n \geq n_0.

Definition. f(n)=Θ(g(n))f(n) = \Theta(g(n)) if f(n)=O(g(n))f(n) = O(g(n)) and f(n)=Ω(g(n))f(n) = \Omega(g(n)).

Definition. f(n)=o(g(n))f(n) = o(g(n)) if limnf(n)/g(n)=0\lim_{n \to \infty} f(n)/g(n) = 0.

Definition. f(n)=ω(g(n))f(n) = \omega(g(n)) if limng(n)/f(n)=0\lim_{n \to \infty} g(n)/f(n) = 0.

Theorem 1.1. f(n)=O(g(n))f(n) = O(g(n)) if and only if g(n)=Ω(f(n))g(n) = \Omega(f(n)).

Proof. Suppose f(n)=O(g(n))f(n) = O(g(n)). Then there exist c,n0c, n_0 such that f(n)cg(n)f(n) \leq c \cdot g(n) for all nn0n \geq n_0Hence g(n)(1/c)f(n)g(n) \geq (1/c) \cdot f(n) for all nn0n \geq n_0So g(n)=Ω(f(n))g(n) = \Omega(f(n)). The converse follows by symmetry. \blacksquare

Theorem 1.2. f(n)=Θ(g(n))f(n) = \Theta(g(n)) if and only if there exist constants c1,c2>0c_1, c_2 > 0 and n0n_0 such that c1g(n)f(n)c2g(n)c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n) for all nn0n \geq n_0.

Proof. By definition, f(n)=Θ(g(n))f(n) = \Theta(g(n)) means f(n)=O(g(n))f(n) = O(g(n)) and f(n)=Ω(g(n))f(n) = \Omega(g(n)). The former gives f(n)c2g(n)f(n) \leq c_2 \cdot g(n) for some c2>0c_2 > 0And the latter gives f(n)c1g(n)f(n) \geq c_1 \cdot g(n) for some c1>0c_1 > 0. Combining yields the stated inequality. \blacksquare

Theorem 1.3 (Limit Rule). If limnf(n)/g(n)=c\lim_{n \to \infty} f(n)/g(n) = c where 0<c<0 < c < \inftyThen f(n)=Θ(g(n))f(n) = \Theta(g(n)). If c=0c = 0Then f(n)=O(g(n))f(n) = O(g(n)). If c=c = \inftyThen g(n)=O(f(n))g(n) = O(f(n)).

Proof. If c=0c = 0Then for any ε>0\varepsilon > 0There exists n0n_0 such that f(n)/g(n)<εf(n)/g(n) < \varepsilon for all nn0n \geq n_0So f(n)εg(n)f(n) \leq \varepsilon \cdot g(n)Establishing f(n)=O(g(n))f(n) = O(g(n)). If 0<c<0 < c < \inftyTake ε=c/2\varepsilon = c/2; then (c/2)g(n)f(n)(3c/2)g(n)(c/2) \cdot g(n) \leq f(n) \leq (3c/2) \cdot g(n) for sufficiently large nnGiving Θ\Theta. The c=c = \infty case is symmetric. \blacksquare

Proposition 1.4. Asymptotic notation is transitive: if f=O(g)f = O(g) and g=O(h)g = O(h)Then f=O(h)f = O(h).

Proof. There exist c1,n1c_1, n_1 with f(n)c1g(n)f(n) \leq c_1 g(n) for nn1n \geq n_1And c2,n2c_2, n_2 with g(n)c2h(n)g(n) \leq c_2 h(n) for nn2n \geq n_2. Then f(n)c1c2h(n)f(n) \leq c_1 c_2 h(n) for nmax(n1,n2)n \geq \max(n_1, n_2). \blacksquare

Proposition 1.5. OO-notation is reflexive: f=O(f)f = O(f) for all ff.

Proof. Take c=1c = 1 and n0=1n_0 = 1. Then f(n)1f(n)f(n) \leq 1 \cdot f(n) for all n1n \geq 1. \blacksquare

Worked Example: Proving $n^2 + 3n + 1 = O(n^2)$

We must find c>0c > 0 and n0n_0 such that n2+3n+1cn2n^2 + 3n + 1 \leq c \cdot n^2 for all nn0n \geq n_0.

Note that n2+3n+1n2+3n2+n2=5n2n^2 + 3n + 1 \leq n^2 + 3n^2 + n^2 = 5n^2 for all n1n \geq 1 (since n1n \geq 1 implies 3n3n23n \leq 3n^2 and 1n21 \leq n^2).

So take c=5c = 5 and n0=1n_0 = 1.

To show tightness, note n2+3n+1n2n^2 + 3n + 1 \geq n^2 for all n0n \geq 0So n2+3n+1=Θ(n2)n^2 + 3n + 1 = \Theta(n^2).

Worked Example: Proving $2^n \neq O(n^k)$ for any constant $k$

By the limit rule: limn2n/nk=\lim_{n \to \infty} 2^n / n^k = \infty for any fixed kk (this follows from repeated application of L”Hôpital’s rule, or from the fact that log(2n)=nlog2\log(2^n) = n \log 2 grows faster than log(nk)=klogn\log(n^k) = k \log n). Therefore 2n=ω(nk)2^n = \omega(n^k) for all kkAnd in particular 2nO(nk)2^n \neq O(n^k).

Worked Example: Sum of Harmonic Series and Its Use in Analysis

The kk-th harmonic number is Hk=i=1k1/iH_k = \sum_{i=1}^{k} 1/i. We show Hk=Θ(logk)H_k = \Theta(\log k).

Upper bound: Hk=i=1k1/i1+1k1xdx=1+lnk=O(logk)H_k = \sum_{i=1}^{k} 1/i \leq 1 + \int_1^k \frac{1}{x}\, dx = 1 + \ln k = O(\log k).

Lower bound: Hk1k+11xdx=ln(k+1)=Ω(logk)H_k \geq \int_1^{k+1} \frac{1}{x}\, dx = \ln(k+1) = \Omega(\log k).

Therefore Hk=Θ(logk)H_k = \Theta(\log k). This is used extensively in analysing algorithms like binary search on unbounded domains and the expected depth of elements in a random BST.

Worked Example: Analysing a Nested Loop Algorithm

Consider the algorithm:

for i = 1 to n:
for j = i to n:
constant-time work

The total number of iterations is i=1n(ni+1)=k=1nk=n(n+1)/2=Θ(n2)\sum_{i=1}^{n}(n - i + 1) = \sum_{k=1}^{n} k = n(n+1)/2 = \Theta(n^2).

Now consider:

for i = 1 to n:
j = 1
while j < n:
j = j * 2
constant-time work

The inner loop runs log2n+1=O(logn)\lfloor \log_2 n \rfloor + 1 = O(\log n) times, and the outer loop runs nn times. Total: O(nlogn)O(n \log n).

Now consider the triple loop:

for i = 1 to n:
for j = 1 to i:
for k = 1 to j:
constant-time work

Total iterations: i=1nj=1ij=i=1ni(i+1)2=Θ(n3)\sum_{i=1}^{n} \sum_{j=1}^{i} j = \sum_{i=1}^{n} \frac{i(i+1)}{2} = \Theta(n^3).

1.2 Common Complexity Classes

ClassNameExample
O(1)O(1)ConstantArray access by index
O(logn)O(\log n)LogarithmicBinary search
O(n)O(n)LinearLinear search
O(nlogn)O(n \log n)Log-linearMerge sort, heap sort
O(n2)O(n^2)QuadraticBubble sort, insertion sort
O(n3)O(n^3)CubicNaive matrix multiplication
O(2n)O(2^n)ExponentialNaive Fibonacci, subset problems
O(n!)O(n!)FactorialGenerating all permutations

Proposition 1.6. These classes form a strict hierarchy: O(1)O(logn)O(n)O(nlogn)O(n2)O(n3)O(2n)O(n!)O(1) \subsetneq O(\log n) \subsetneq O(n) \subsetneq O(n \log n) \subsetneq O(n^2) \subsetneq O(n^3) \subsetneq O(2^n) \subsetneq O(n!)

1.3 Recurrences and the Master Theorem

Many divide-and-conquer algorithms yield recurrences of the form:

T(n)=aT(n/b)+f(n)T(n) = aT(n/b) + f(n)

Where a1a \geq 1 is the number of subproblems, b>1b > 1 is the factor by which the input is divided, and f(n)f(n) is the cost of dividing and combining.

Theorem 1.7 (Master Theorem). Let a1a \geq 1 and b>1b > 1 be constants, let f(n)f(n) be a function, and let T(n)T(n) be defined on the nonnegative integers by the recurrence T(n)=aT(n/b)+f(n)T(n) = aT(n/b) + f(n)Where we interpret n/bn/b to mean either n/b\lfloor n/b \rfloor or n/b\lceil n/b \rceil. Let c=logbac = \log_b a. Then:

  1. If f(n)=O(ncε)f(n) = O(n^{c - \varepsilon}) for some ε>0\varepsilon > 0Then T(n)=Θ(nc)T(n) = \Theta(n^c).
  2. If f(n)=Θ(nclogkn)f(n) = \Theta(n^c \log^k n) for some k0k \geq 0Then T(n)=Θ(nclogk+1n)T(n) = \Theta(n^c \log^{k+1} n).
  3. If f(n)=Ω(nc+ε)f(n) = \Omega(n^{c + \varepsilon}) for some ε>0\varepsilon > 0And if af(n/b)qf(n)a f(n/b) \leq q f(n) for some constant q<1q < 1 and all sufficiently large nn (the regularity condition), then T(n)=Θ(f(n))T(n) = \Theta(f(n)).

Proof (Case 1 sketch). The recursion tree has depth logbn\log_b n. At level iiThere are aia^i nodes, each costing f(n/bi)f(n/b^i). Since f(n)=O(ncε)f(n) = O(n^{c - \varepsilon})The cost at level ii is ai(n/bi)cε=ncε(a/bcε)ia^i \cdot (n/b^i)^{c - \varepsilon} = n^{c - \varepsilon} \cdot (a / b^{c - \varepsilon})^i. The total cost is dominated by the leaves (level logbn\log_b n), which contribute alogbn=nca^{\log_b n} = n^c. The internal levels contribute a geometric series with ratio a/bcε=bε>1a / b^{c - \varepsilon} = b^\varepsilon > 1So the leaf level dominates. \blacksquare

Worked Example: Merge Sort Recurrence

Merge sort divides into 2 subproblems of size n/2n/2 and combines in O(n)O(n) time.

T(n)=2T(n/2)+Θ(n)T(n) = 2T(n/2) + \Theta(n)

Here a=2a = 2, b=2b = 2So c=log22=1c = \log_2 2 = 1. We have f(n)=Θ(n)=Θ(nclog0n)f(n) = \Theta(n) = \Theta(n^c \log^0 n)Which is Case 2 with k=0k = 0.

Therefore T(n)=Θ(n1log1n)=Θ(nlogn)T(n) = \Theta(n^1 \log^1 n) = \Theta(n \log n).

Worked Example: Binary Search Recurrence

T(n)=T(n/2)+O(1)T(n) = T(n/2) + O(1)

Here a=1a = 1, b=2b = 2So c=log21=0c = \log_2 1 = 0. We have f(n)=O(1)=O(n0)f(n) = O(1) = O(n^0). This matches Case 2 with k=0k = 0.

Therefore T(n)=Θ(logn)T(n) = \Theta(\log n).

Worked Example: Strassen's Matrix Multiplication

Strassen’s algorithm divides into 7 subproblems of size n/2n/2 and combines in O(n2)O(n^2) time.

T(n)=7T(n/2)+O(n2)T(n) = 7T(n/2) + O(n^2)

Here a=7a = 7, b=2b = 2So c=log272.807c = \log_2 7 \approx 2.807. We have f(n)=O(n2)=O(ncε)f(n) = O(n^2) = O(n^{c - \varepsilon}) with ε=c20.807\varepsilon = c - 2 \approx 0.807Which is Case 1.

Therefore T(n)=Θ(nlog27)=Θ(n2.807)T(n) = \Theta(n^{\log_2 7}) = \Theta(n^{2.807}).

1.4 Amortised Analysis

Amortised analysis gives the average cost per operation over a sequence of operations, even when Individual operations may be expensive.

Methods:

  1. Aggregate analysis: Compute total cost of nn operations, divide by nn.
  2. Accounting method: Assign an amortised cost to each operation; the sum of amortised costs must be an upper bound on the total actual cost.
  3. Potential method: Define a potential function Φ\Phi; the amortised cost of the ii-th operation is c^i=ci+Φ(Di)Φ(Di1)\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}).

Theorem 1.8 (Potential Method). If Φ(Di)Φ(D0)\Phi(D_i) \geq \Phi(D_0) for all i1i \geq 1Then the total amortised cost i=1nc^i\sum_{i=1}^{n} \hat{c}_i is an upper bound on the total actual cost i=1nci\sum_{i=1}^{n} c_i.

Proof. i=1nc^i=i=1n(ci+Φ(Di)Φ(Di1))=i=1nci+Φ(Dn)Φ(D0)i=1nci\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} (c_i + \Phi(D_i) - \Phi(D_{i-1})) = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0) \geq \sum_{i=1}^{n} c_i. \blacksquare

Example (Dynamic Array). A dynamic array doubles in size when full. Insertion is O(1)O(1) Amortised: most insertions cost O(1)O(1); occasional resizing costs O(n)O(n)But is paid for by the Surplus from previous O(1)O(1) insertions.

Worked Example: Dynamic Array Amortised Analysis (Accounting Method)

Assign an amortised cost of 3 per insertion. The actual cost of insertion without resizing is 1. We store the surplus of 2 as “credit” on each element in the array.

When the array of capacity kk is full and we must resize to capacity 2k2k:

  • The resizing cost is kk (copy all kk elements).
  • We have 2k2k credits stored (2 per element).
  • We spend kk credits on the copy, leaving kk credits.
  • The new array has kk elements, so we need 2k2k more credits, but we already have kk.
  • Subsequent insertions build up credits again.

Since the total credits are always non-negative, the amortised cost of 3 per insertion covers all actual costs.

Worked Example: Dynamic Array Amortised Analysis (Potential Method)

Define the potential function Φ(D)=2nm\Phi(D) = 2n - m where nn is the number of elements and mm is the capacity. We require mnm \geq nSo Φ0\Phi \geq 0 (since 2nn=n02n - n = n \geq 0).

Case 1: No resize. c^=1+(2(n+1)m)(2nm)=1+2=3\hat{c} = 1 + (2(n+1) - m) - (2n - m) = 1 + 2 = 3.

Case 2: Resize from m=nm = n to m=2nm = 2n. c^=(1+n)+(2(n+1)2n)(2nn)=(n+1)+2n=3\hat{c} = (1 + n) + (2(n+1) - 2n) - (2n - n) = (n + 1) + 2 - n = 3.

In both cases, the amortised cost is O(1)O(1).

:::caution Common Pitfall Amortised analysis gives a guarantee over a sequence of operations, not a per-operation worst-case bound. A single operation can still be expensive (e.g., a resize in a dynamic array costs O(n)O(n)). Amortised bounds are meaningful only when the sequence length is not bounded by a constant.

:::