Skip to content

Fundamental Data Structures

2.1 Arrays and Linked Lists

Array. Contiguous memory, O(1)O(1) access by index, O(n)O(n) insertion/deletion (shift required).

OperationArraySingly Linked ListDoubly Linked List
Access by indexO(1)O(1)O(n)O(n)O(n)O(n)
Search (by value)O(n)O(n)O(n)O(n)O(n)O(n)
Insert at headO(n)O(n)O(1)O(1)O(1)O(1)
Insert at tailO(1)O(1)O(n)O(n)O(1)O(1)*
Delete at headO(n)O(n)O(1)O(1)O(1)O(1)
Delete at tailO(n)O(n)O(n)O(n)O(1)O(1)*
Insert at positionO(n)O(n)O(n)O(n)O(1)O(1)*
Space overhead00O(n)O(n)O(n)O(n)

*Given a pointer to the node or its predecessor.

Linked List. Each node stores data and a pointer to the next node. O(1)O(1) insertion/deletion at Head (given pointer), O(n)O(n) access by position.

Doubly Linked List. Each node has pointers to both next and previous nodes. O(1)O(1) insertion And deletion at any position (given pointers to the node and its neighbours).

Worked Example: Reversing a Singly Linked List

To reverse a singly linked list in O(n)O(n) time and O(1)O(1) space:

prev = null
curr = head
while curr != null:
next = curr.next
curr.next = prev
prev = curr
curr = next
head = prev

At each iteration, we move one node to the front of the reversed list. After nn iterations, all nn nodes are reversed.

2.2 Stacks and Queues

Stack. Last-in, first-out (LIFO). Operations: push, pop, peek, all O(1)O(1).

Queue. First-in, first-out (FIFO). Operations: enqueue, dequeue, peek, all O(1)O(1).

Implementation. Both can be implemented with arrays (with circular buffer) or linked lists.

Worked Example: Implementing a Queue with Two Stacks

A queue can be implemented using two stacks in_stack and out_stack.

  • enqueue(x): push xx onto in_stack. O(1)O(1).
  • dequeue(): if out_stack is empty, pop all elements from in_stack and push onto out_stack. Then pop from out_stack.

The dequeue operation is O(1)O(1) amortised: each element is moved from in_stack to out_stack at most once across all operations. Over nn operations, the total number of element moves is at most nnGiving O(1)O(1) amortised per dequeue.

2.3 Hash Tables

A hash table maps keys to values using a hash function h:K0,1,,m1h : K \to \\{0, 1, \ldots, m - 1\\}.

Collision resolution:

  • Chaining: Each bucket is a linked list. Average case: O(1+α)O(1 + \alpha) where α=n/m\alpha = n/m is the load factor.
  • Open addressing (linear probing): Insert into the next available slot. Average case: O(1/(1α))O(1/(1-\alpha)).

Theorem 2.1 (Uniform Hashing). Under the assumption of simple uniform hashing, the expected Length of a chain in a hash table with chaining is α=n/m\alpha = n/m.

Proof. Under simple uniform hashing, each of the nn keys is equally likely to hash to any of the mm slots. For any given key kkThe expected number of other keys that hash to the same slot as kk is (n1)/m<α(n - 1)/m < \alpha. Including kk itself, the expected chain length is 1+(n1)/m1+α1 + (n-1)/m \leq 1 + \alpha. \blacksquare

Theorem 2.2 (Successful Search with Chaining). Under simple uniform hashing, the expected time for a successful search in a hash table with chaining is O(1+α/2)O(1 + \alpha/2).

Proof. The expected length of the list containing the searched element is 1+(n1)/m1 + (n - 1)/mSince n1n - 1 other elements are distributed uniformly. On average, the searched element is halfway through its list, giving (1+(n1)/m)/2+1/2(1 + (n-1)/m)/2 + 1/2 comparisons (we examine elements before the target plus the target itself). This equals 1+α/21/(2m)=O(1+α)1 + \alpha/2 - 1/(2m) = O(1 + \alpha). \blacksquare

Double Hashing. Uses two hash functions: h(k,i)=(h1(k)+ih2(k))modmh(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod mWhere h2(k)h_2(k) is relatively prime to mm. This avoids the clustering problems of linear probing.

Worked Example: Choosing a Good Hash Function

For integer keys, a common choice is the multiplication method: h(k)=m(kAmod1)h(k) = \lfloor m \cdot (k \cdot A \bmod 1) \rfloor where A=(51)/20.618A = (\sqrt{5} - 1)/2 \approx 0.618 (the inverse golden ratio).

For string keys, the polynomial rolling hash: h(s)=(i=0k1s[i]pk1i)modmh(s) = \left(\sum_{i=0}^{k-1} s[i] \cdot p^{k-1-i}\right) \bmod m where pp is a prime (e.g., 31 or 127).

Both avoid clustering better than simple modulo hashing when the input distribution is adversarial.

2.4 Trees

Binary Search Tree (BST). For each node: left subtree values <\lt node value, right subtree values >\gt node value.

  • Search, insert, delete: O(h)O(h) where hh is the height.
  • For a balanced BST, h=O(logn)h = O(\log n).

Theorem 2.3. The expected height of a randomly built BST with nn distinct keys is O(logn)O(\log n).

Proof. Let XnX_n be the height of a BST built from nn random keys. Let Yn=2XnY_n = 2^{X_n}. Then E[Yn]14i=0n1(ni)E[Yi]E[Yn1i]/n\mathrm{E}[Y_n] \leq \frac{1}{4} \sum_{i=0}^{n-1} \binom{n}{i} \mathrm{E}[Y_i] \mathrm{E}[Y_{n-1-i}] / n. Using the indicator random variable technique, E[Yn]cn+1n3/2i=0n1i3/2(n1i)3/2cicn1icn3/2\mathrm{E}[Y_n] \leq \frac{c^{n+1}}{n^{3/2}} \sum_{i=0}^{n-1} \frac{i^{3/2}(n-1-i)^{3/2}}{c^i c^{n-1-i}} \leq c \cdot n^{3/2} for some constant cc. Taking logs gives E[Xn]=E[logYn]logE[Yn]=O(logn)\mathrm{E}[X_n] = \mathrm{E}[\log Y_n] \leq \log \mathrm{E}[Y_n] = O(\log n) by Jensen”s inequality. \blacksquare

2.4.1 AVL Trees

Definition. An AVL tree is a BST where for every node, the heights of its left and right subtrees differ by at most 1. The balance factor of a node is the height of its left subtree minus the height of its right subtree; valid balance factors are 1,0,1\\{-1, 0, 1\\}.

Rotations. After insertion or deletion, the balance factor may become ±2\pm 2. This is fixed by rotations:

  • Right rotation (LL case): Balance factor +2+2 at node AABalance factor +1+1 at left child BB.

    A B
    / \ / \
    B C → D A
    / \ / \
    D E E C
  • Left rotation (RR case): Mirror of right rotation.

  • Left-Right rotation (LR case): Balance factor +2+2 at AABalance factor 1-1 at left child BB. First left-rotate BBThen right-rotate AA.

  • Right-Left rotation (RL case): Mirror of LR case.

Theorem 2.4. An AVL tree with nn nodes has height h1.4404log2(n+2)1.3277h \leq 1.4404 \cdot \log_2(n + 2) - 1.3277.

Proof. Let N(h)N(h) be the minimum number of nodes in an AVL tree of height hh. We have N(0)=1N(0) = 1, N(1)=2N(1) = 2And N(h)=1+N(h1)+N(h2)N(h) = 1 + N(h-1) + N(h-2) for h2h \geq 2. This is the Fibonacci recurrence, giving N(h)=Fh+31N(h) = F_{h+3} - 1. Using Fh=ϕhϕ^h5F_h = \frac{\phi^h - \hat{\phi}^h}{\sqrt{5}} where ϕ=1+52\phi = \frac{1+\sqrt{5}}{2}We get N(h)>ϕh/51N(h) > \phi^h / \sqrt{5} - 1So h<logϕ(5(n+1))1.4404log2(n+1)h \lt \log_\phi(\sqrt{5}(n + 1)) \approx 1.4404 \log_2(n + 1). \blacksquare

Corollary 2.5. All AVL tree operations (search, insert, delete) run in O(logn)O(\log n) time.

Proof. Each operation visits O(h)=O(logn)O(h) = O(\log n) nodes along a root-to-leaf path, and each rotation is O(1)O(1). Since at most O(logn)O(\log n) rotations are needed (at most 2 per level), the total cost is O(logn)O(\log n). \blacksquare

2.4.2 Red-Black Trees

Definition. A red-black tree is a BST satisfying:

  1. Every node is either red or black.
  2. The root is black.
  3. Every leaf (NIL sentinel) is black.
  4. If a node is red, both its children are black.
  5. For every node, all simple paths from the node to descendant leaves contain the same number of black nodes (the black-height).

Theorem 2.6. A red-black tree with nn internal nodes has height h2log2(n+1)h \leq 2 \log_2(n + 1).

Proof. Let rr be the root. By property 5, at least half the nodes on any root-to-leaf path are black (by property 4, no two reds are adjacent), so the black-height bh(r)h/2bh(r) \geq h/2. Let S(h,bh)S(h, bh) be the minimum number of internal nodes in a subtree of height hh and black-height bhbh. Then S(h,bh)2bh1S(h, bh) \geq 2^{bh} - 1 (proved by induction on bhbhUsing the fact that each child has black-height at least bh1bh - 1 and the root is black). Since bh(r)h/2bh(r) \geq h/2 and n2bh(r)12h/21n \geq 2^{bh(r)} - 1 \geq 2^{h/2} - 1We get h2log2(n+1)h \leq 2 \log_2(n + 1). \blacksquare

Insertion. Insert as in a standard BST (colour the new node red), then fix violations by recolouring and rotating up to O(logn)O(\log n) times.

Deletion. More complex than insertion. When a black node is removed, we may need to fix the black-height property by recolouring and rotating. The fix-up procedure takes O(logn)O(\log n) time.

2.4.3 B-Trees

Definition. A B-tree of minimum degree t2t \geq 2 is a rooted tree satisfying:

  1. Every node has at most 2t12t - 1 keys.
  2. Every non-root node has at least t1t - 1 keys (hence at least tt children).
  3. The root has at least 1 key.
  4. All leaves appear at the same depth.

Theorem 2.7. A B-tree with nn keys and minimum degree tt has height hlogtn+12h \leq \log_t \frac{n+1}{2}.

Proof. The root has at least 1 key and 2 children. At depth 1, each node has at least t1t - 1 keys and tt children. At depth d1d \geq 1The number of nodes is at least 2td12t^{d-1}Each with at least t1t - 1 keys. So the total number of keys is at least 1+2(t1)i=0h1ti=1+2(t1)th1t1=2th11 + 2(t-1) \sum_{i=0}^{h-1} t^i = 1 + 2(t-1) \cdot \frac{t^h - 1}{t - 1} = 2t^h - 1. Setting n2th1n \geq 2t^h - 1 gives hlogt((n+1)/2)h \leq \log_t((n+1)/2). \blacksquare

Worked Example: Red-Black Tree Insertion Trace

Insert keys 7, 3, 18, 10, 22, 8, 11, 26 into an initially empty red-black tree.

Insert 7: Root is coloured black. T=[7(B)]T = [7(B)]

Insert 3: Left child of 7, coloured red. T=[7(B)3(R)]T = [7(B) \leftarrow 3(R)]. Valid (root is black, no red-red violation).

Insert 18: Right child of 7, coloured red. T=[3(R)7(B)18(R)]T = [3(R) \leftarrow 7(B) \rightarrow 18(R)]. Valid.

Insert 10: Insert as left child of 18, coloured red. Now 18 has red child 10. Uncle of 10 is 3(R) — Case 1 (uncle is red): recolour uncle 3 black, parent 18 black, grandparent 7 red. Grandparent 7 is root, so recolour 7 black.

Result: T=[3(B)7(B)18(B)]T = [3(B) \leftarrow 7(B) \rightarrow 18(B)] with 10(R)10(R) under 1818.

Insert 22: Right child of 18, coloured red. Uncle of 22 is 10(R) — Case 1 again: recolour 10 black, 22 black, 18 red. Parent 18 is not root, grandparent is 7(B). Now 7 has right child 18(R). Check 7’s left child: 3(B). No violation.

Insert 8: Left child of 10, coloured red. Uncle of 8 is 22(R) — Case 1: recolour 8 black, 22 black, 10 red. Now 10(R) is left child of 18(B). Check grandparent 7: right child 18(B) with left child 10(R). No red-red violation.

Insert 11: Right child of 10, coloured red. Now 10(R) has both children red — wait, 10 is red and 11 is red: violation. Uncle of 11 is 8(R) — Case 1: recolour 8 black, 11 black, 10 red. Grandparent 18(B): 10(R) is left child, 22(B) is right child. No violation.

Insert 26: Right child of 22, coloured red. Uncle of 26 is 10(R) — Case 1: recolour 10 black, 26 black, 22 red. Now 22(R) is right child of 18(B). Check left child of 18: 10(B). Valid.

2.5 Skip Lists

A skip list is a probabilistic data structure that provides O(logn)O(\log n) expected-time search, insertion, and deletion. It consists of multiple levels of linked lists.

Structure:

  • Level 0 is a standard linked list containing all elements in sorted order.
  • Each element is promoted to level 1 with probability pp ( p=1/2p = 1/2).
  • Each level-ii element is promoted to level i+1i+1 independently with probability pp.
  • A sentinel head node links to the first element at every level.

Theorem 2.8. The expected number of levels in a skip list with nn elements is O(logn)O(\log n).

Proof. An element reaches level ii with probability pip^i. The expected number of elements at level ii is npinp^i. The highest non-empty level is approximately log1/pn\log_{1/p} n. \blacksquare

Theorem 2.9. Search, insert, and delete in a skip list take O(logn)O(\log n) expected time.

Proof. The search path at each level ii moves right to find an element whose successor at level ii is greater than the target, then drops to level i1i - 1. At each level, the expected number of rightward steps is at most 1/p1/p (a geometric distribution argument). With O(log1/pn)O(\log_{1/p} n) levels, the total expected work is O(logn/p)=O(logn)O(\log n / p) = O(\log n). \blacksquare

Worked Example: Skip List Search

Search for key 25 in a skip list containing keys [3,7,12,19,25,31,42][3, 7, 12, 19, 25, 31, 42] with the following level structure (probability p=1/2p = 1/2):

Level 3: [HEAD] -------------------------> [31] ---------->
Level 2: [HEAD] ---------> [12] -------------------------->
Level 1: [HEAD] -> [7] -> [12] -> [19] -> [25] -> [31] -> [42]
Level 0: [HEAD] -> [7] -> [12] -> [19] -> [25] -> [31] -> [42]

Search starts at HEAD, level 3:

  • Move right to 31. 31>2531 > 25Drop to level 2.
  • At level 2, move right to 12. 122512 \leq 25Move right — next is NIL. Drop to level 1.
  • At level 1, move right to 19. 192519 \leq 25Move right to 25. 25=2525 = 25. Found!

The search examined keys 31, 12, 19, 25 — 4 comparisons across 3 levels.

Worked Example: Skip List Insertion

Insert key 15 into the skip list above.

  1. Search for insertion position: traverse levels 3, 2, 1 to find that 15 goes between 12 and 19.
  2. Insert 15 at level 0 (always). Randomly promote: suppose coin flip says promote to level 1. Insert at level 1. Flip again: promote to level 2. Insert at level 2. Flip again: don’t promote. Stop.
Level 3: [HEAD] -------------------------> [31] ---------->
Level 2: [HEAD] ---------> [12] -> [15] ------------------>
Level 1: [HEAD] -> [7] -> [12] -> [15] -> [19] -> [25] -> [31] -> [42]
Level 0: [HEAD] -> [7] -> [12] -> [15] -> [19] -> [25] -> [31] -> [42]

Insertion takes O(logn)O(\log n) expected time (search path + insertions at promoted levels).

2.6 Bloom Filters

A Bloom filter is a space-efficient probabilistic data structure for set membership testing. It may produce false positives but never false negatives.

Structure: A bit array of mm bits (initially all 0) and kk independent hash functions h1,,hkh_1, \ldots, h_k.

Operations:

  • Insert xx: set bits h1(x),h2(x),,hk(x)h_1(x), h_2(x), \ldots, h_k(x) to 1.
  • Query xx: return true if all h1(x),,hk(x)h_1(x), \ldots, h_k(x) are set to 1.

Theorem 2.10. After inserting nn elements into a Bloom filter of size mm with kk hash functions, the probability of a false positive is approximately (1ekn/m)k\left(1 - e^{-kn/m}\right)^k.

Proof. After inserting nn elements, each of the mm bits is set to 0 with probability (11/m)knekn/m(1 - 1/m)^{kn} \approx e^{-kn/m}. The probability that all kk bits for a query element are set is (1ekn/m)k(1 - e^{-kn/m})^k. \blacksquare

Optimal number of hash functions. The false positive rate is minimised when k=(m/n)ln2k = (m/n) \ln 2Giving a minimum rate of approximately (1/2)k=0.6185m/n(1/2)^k = 0.6185^{m/n}.

Worked Example: Bloom Filter Configuration

We want to store n=10000n = 10000 URLs with a false positive rate of at most 1%.

From the formula: (1ekn/m)k0.01(1 - e^{-kn/m})^k \leq 0.01.

Optimal k=(m/n)ln2k = (m/n) \ln 2. At the optimal point, the false positive rate is (1/2)k0.6185m/n(1/2)^k \approx 0.6185^{m/n}.

We need 0.6185m/n0.010.6185^{m/n} \leq 0.01So m/nln(0.01)/ln(0.6185)9.585m/n \geq \ln(0.01) / \ln(0.6185) \approx 9.585.

Take m=95850m = 95850 bits 11.5\approx 11.5 KB. Then k=(95850/10000)ln26.64k = (95850 / 10000) \ln 2 \approx 6.64So use k=7k = 7 hash functions.

This requires only 11.5 KB of memory versus approximately 600 KB for storing the URLs as 60-byte strings.

Worked Example: Bloom Filter Operations Trace

A Bloom filter has m=10m = 10 bits, k=2k = 2 hash functions: h1(x)=xmod10h_1(x) = x \bmod 10, h2(x)=(x3)mod10h_2(x) = (x \cdot 3) \bmod 10.

Insert 15: h1(15)=5h_1(15) = 5, h2(15)=53mod10=5h_2(15) = 5 \cdot 3 \bmod 10 = 5. Set bit 5. Bit array: 0000010000

Insert 22: h1(22)=2h_1(22) = 2, h2(22)=223mod10=6h_2(22) = 22 \cdot 3 \bmod 10 = 6. Set bits 2, 6. Bit array: 0010010100

Insert 37: h1(37)=7h_1(37) = 7, h2(37)=373mod10=1h_2(37) = 37 \cdot 3 \bmod 10 = 1. Set bits 1, 7. Bit array: 0110010110

Query 22: h1(22)=2h_1(22) = 2 (set), h2(22)=6h_2(22) = 6 (set). Result: possibly in set (true positive).

Query 42: h1(42)=2h_1(42) = 2 (set), h2(42)=423mod10=6h_2(42) = 42 \cdot 3 \bmod 10 = 6 (set). Result: possibly in set (false positive!).

Query 50: h1(50)=0h_1(50) = 0 (not set). Result: definitely not in set (true negative).

2.7 Heaps

A binary heap is a complete binary tree satisfying the heap property:

  • Max-heap: parent \geq children.
  • Min-heap: parent \leq children.

Implemented as an array: parent of node ii is at (i1)/2\lfloor(i-1)/2\rfloor; children at 2i+12i+1 and 2i+22i+2.

Operations: insert O(logn)O(\log n)Extract-max/min O(logn)O(\log n)Peek O(1)O(1).

Theorem 2.11 (Build-Heap). buildHeap on an array of nn elements runs in O(n)O(n) time.

Proof. buildHeap calls heapify on each of the n/2\lfloor n/2 \rfloor non-leaf nodes. The cost of heapify at a node of height hh is O(h)O(h). The total cost is h=0lognn/2h+1O(h)=O(nh=0h/2h)=O(n)\sum_{h=0}^{\lfloor \log n \rfloor} \lceil n / 2^{h+1} \rceil \cdot O(h) = O\left(n \sum_{h=0}^{\infty} h / 2^h\right) = O(n)Since h=0h/2h=2\sum_{h=0}^{\infty} h / 2^h = 2. \blacksquare

Worked Example: Build-Heap Step by Step

Build a max-heap from the array [4,10,3,5,1,8,2][4, 10, 3, 5, 1, 8, 2].

The array represents the tree:

4
/ \
10 3
/ \ / \
5 1 8 2

Non-leaf nodes (indices 7/21=2\lfloor 7/2 \rfloor - 1 = 2 down to 0): nodes 3, 10, 4.

Heapify index 2 (value 3): Children are 8 and 2. Swap 3 with 8. Array: [4,10,8,5,1,3,2][4, 10, 8, 5, 1, 3, 2].

4
/ \
10 8
/ \ / \
5 1 3 2

Heapify index 1 (value 10): Children are 5 and 1. 10 is largest. No swap.

Heapify index 0 (value 4): Children are 10 and 8. Swap with 10. Array: [10,4,8,5,1,3,2][10, 4, 8, 5, 1, 3, 2].

10
/ \
4 8
/ \ / \
5 1 3 2

Now heapify index 1 (value 4): Children are 5 and 1. Swap with 5. Array: [10,5,8,4,1,3,2][10, 5, 8, 4, 1, 3, 2].

10
/ \
5 8
/ \ / \
4 1 3 2

This is a valid max-heap. Total operations: 3 swaps = O(n)O(n).

2.8 Union-Find (Disjoint Set Union)

The Union-Find data structure maintains a partition of a set into disjoint subsets, supporting:

  • MakeSet(xx): Create a singleton set x\\{x\\}.
  • Find(xx): Return the representative of the set containing xx.
  • Union(x,yx, y): Merge the sets containing xx and yy.

Implementation with path compression and union by rank:

  • Each set is represented as a rooted tree.
  • Find follows parent pointers and compresses the path (makes every node on the path point directly to the root).
  • Union attaches the shorter tree under the root of the taller tree (by rank/size).

Theorem 2.12. With both path compression and union by rank, the amortised time per operation is O(α(n))O(\alpha(n))Where α(n)\alpha(n) is the inverse Ackermann function.

Proof (outline). The inverse Ackermann function α(n)\alpha(n) grows so slowly that α(n)4\alpha(n) \leq 4 for all practical values of nn (n22265536n \leq 2^{2^{2^{65536}}}). The …/1-number-and-algebra/3proof-and-logic uses a potential function argument on the levels of the tree. Each node is assigned a _level based on its rank. The key insight is that path compression prevents the number of nodes at high levels from growing. The total number of Find operations that can charge to any level-\ell node is bounded by O(nF1())O(n \cdot F^{-1}(\ell)) where FF is a fast-growing function. Summing over all levels gives O(nα(n))O(n \alpha(n)) total, hence O(α(n))O(\alpha(n)) amortised per operation. \blacksquare

Worked Example: Union-Find for Connected Components

Given an undirected graph with nn vertices and mm edges, determine the connected components.

for each vertex v:
MakeSet(v)
for each edge (u, v):
if Find(u) != Find(v):
Union(u, v)

After processing all edges, vertices in the same connected component have the same representative. Total time: O(n+mα(n))O(n+m)O(n + m \alpha(n)) \approx O(n + m).

2.9 Graphs

A graph G=(V,E)G = (V, E) can be represented by:

  • Adjacency matrix: Aij=1A_{ij} = 1 if (i,j)E(i,j) \in E. Space: O(V2)O(V^2). Edge lookup: O(1)O(1).
  • Adjacency list: For each vertex, store a list of neighbours. Space: O(V+E)O(V + E). Iterating over neighbours: O(deg(v))O(\mathrm{deg}(v)).
OperationAdjacency MatrixAdjacency List
SpaceO(V2)O(V^2)O(V+E)O(V + E)
Check edge (u,v)(u,v)O(1)O(1)O(deg(u))O(\mathrm{deg}(u))
Iterate neighboursO(V)O(V)O(deg(v))O(\mathrm{deg}(v))
Add edgeO(1)O(1)O(1)O(1)
Remove edgeO(1)O(1)O(deg(u))O(\mathrm{deg}(u))

:::caution Common Pitfall Choosing the wrong graph representation can make an algorithm asymptotically slower. Use adjacency matrices for dense graphs (EV2E \approx V^2) and adjacency lists for sparse graphs (EV2E \ll V^2). For example, BFS with an adjacency matrix takes O(V2)O(V^2) but with adjacency lists takes O(V+E)O(V + E).

:::