Skip to content

Memory Management

5.1 Contiguous Memory Allocation

Fixed partitioning. Memory divided into fixed-size partitions at boot. Internal fragmentation.

Variable partitioning. Partitions created dynamically. External fragmentation.

Allocation strategies:

StrategyDescription
First-fitAllocate the first hole large enough
Best-fitAllocate the smallest hole large enough
Worst-fitAllocate the largest hole
Next-fitContinue searching from the last allocation

Theorem 5.1. First-fit and best-fit have roughly equivalent utilisation; first-fit is faster.

5.2 Paging

Divide physical memory into fixed-size frames and logical memory into same-size pages. A page table maps page numbers to frame numbers.

physicaladdress=PT[p]×F+f\mathrm{physical} address = \mathrm{PT}[p] \times F + f

Where PT[p]\mathrm{PT}[p] is the frame number for page pp and FF is the frame size.

Page table entry fields:

FieldPurpose
Frame numberPhysical frame containing this page
Valid bitPage is in memory (1) or not (0)
Dirty bitPage has been modified
Reference bitPage accessed recently
ProtectionRead/write/execute permissions

Multi-level page tables. A single page table for a 64-bit address space is infeasible. A Two-level scheme uses a page directory indexed by the outer page number, pointing to inner Page tables:

Address:p1outerp2innerdoffset\mathrm{Address}: \underbrace{p_1}_{\mathrm{outer} \mid \underbrace{p_2}_{\mathrm{inner} \mid \underbrace{d}_{\mathrm{offset}}}}

X86-64 with 48-bit virtual addresses uses four-level page tables.

Inverted page table. One entry per physical frame (not per virtual page). Reduces memory Overhead but makes searching for a specific virtual page expensive; solved by hashing.

5.3 Segmentation

Segments divide the address space into logical units (code, data, stack, heap). Each segment has a base and limit:

physicaladdress=base+offset,ifoffset<limit\mathrm{physical} address = \mathrm{base} + \mathrm{offset}, \quad \mathrm{if} \mathrm{offset} \lt \mathrm{limit}

Advantages: Reflects program structure; supports sharing individual segments. Disadvantages: External fragmentation (variable-size segments).

Worked Example 5.1 — Segmented Address Translation

A process has three segments with the following segment table:

SegmentBaseLimit
Code40962048
Data122884096
Stack204802048

Translate logical address (segment = 1, offset = 1500):

  1. Check: offset 1500<1500 \lt limit 40964096. Valid.
  2. Physical address = base + offset = 12288+1500=1378812288 + 1500 = 13788.

Translate (segment = 2, offset = 2500):

  1. Check: offset 2500<2500 \lt limit 20482048. Invalid — segmentation fault.

5.4 Segmented Paging

Combine segmentation and paging: the segment offset is divided into page number and page offset.

Address:ssegmentppagedoffset\mathrm{Address}: \underbrace{s}_{\mathrm{segment} \mid \underbrace{p}_{\mathrm{page} \mid \underbrace{d}_{\mathrm{offset}}}}

Used by x86 (segmentation + paging).

5.5 Virtual Memory

Virtual memory allows processes to use more memory than physically available by keeping only active Pages in RAM; the rest reside on disk (swap space).

Demand paging. Pages loaded on access, not at process start.

Page fault handling:

  1. CPU generates logical address; page table entry has valid bit = 0.
  2. Trap to OS page fault handler.
  3. Find a free frame (possibly evicting a victim page).
  4. Read the page from disk into the frame.
  5. Update page table entry (valid bit = 1, frame number).
  6. Restart the faulting instruction.

Effective access time (EAT):

EAT=(1p)×ma+p×pf\mathrm{EAT} = (1 - p) \times \mathrm{ma} + p \times \mathrm{pf}

Where pp = page fault rate, ma\mathrm{ma} = memory access time, pf\mathrm{pf} = page fault service Time.

For p=0.001p = 0.001, ma=100\mathrm{ma} = 100 ns, pf=8\mathrm{pf} = 8 ms:

EAT=0.999×100+0.001×8000000=8.1  μs\mathrm{EAT} = 0.999 \times 100 + 0.001 \times 8\,000\,000 = 8.1 \; \mu\mathrm{s}

This is roughly 80×80\times slower than pure memory access, illustrating why a low page fault rate Is critical.

5.6 Translation Lookaside Buffer (TLB)

A TLB is a hardware cache of recently used page table entries, avoiding an extra memory access Per translation.

EAT=h×(TLB+ma)+(1h)×(TLB+ma+ma)\mathrm{EAT} = h \times (\mathrm{TLB} + \mathrm{ma}) + (1 - h) \times (\mathrm{TLB} + \mathrm{ma} + \mathrm{ma})

Where hh is the TLB hit ratio.

For h=0.99h = 0.99, TLB=2\mathrm{TLB} = 2 ns, ma=100\mathrm{ma} = 100 ns:

EAT=0.99×102+0.01×202=103  ns\mathrm{EAT} = 0.99 \times 102 + 0.01 \times 202 = 103 \; \mathrm{ns}

TLB coherence. When the OS modifies a page table entry, it must invalidate the corresponding TLB entry. On x86-64: invlpg for single-entry invalidation, or reload CR3 to flush the entire TLB.

Worked Example 5.2 — TLB + Page Fault EAT Calculation

A system with TLB access time = 2 ns, memory access time = 100 ns, page fault service time = 8 ms, TLB hit ratio = 0.80, and page fault rate = 0.0005.

Case 1: TLB hit, no page fault (0.80×0.9995=0.79960.80 \times 0.9995 = 0.7996): Time = TLB+ma=2+100=102\mathrm{TLB} + \mathrm{ma} = 2 + 100 = 102 ns.

Case 2: TLB miss, no page fault (0.20×0.9995=0.19990.20 \times 0.9995 = 0.1999): Time = TLB+2×ma=2+200=202\mathrm{TLB} + 2 \times \mathrm{ma} = 2 + 200 = 202 ns.

Case 3: Page fault (0.00050.0005): Time = TLB+ma+pf=2+100+8×106=8000002\mathrm{TLB} + \mathrm{ma} + \mathrm{pf} = 2 + 100 + 8 \times 10^6 = 8000002 ns.

EAT=0.7996×102+0.1999×202+0.0005×8000102\mathrm{EAT} = 0.7996 \times 102 + 0.1999 \times 202 + 0.0005 \times 8\,000\,102 EAT=81.56+40.38+4000.05=4121.99  ns4.12  μs\mathrm{EAT} = 81.56 + 40.38 + 4000.05 = 4121.99 \;\mathrm{ns} \approx 4.12 \;\mu\mathrm{s}

This is roughly 41×41\times slower than pure memory access, driven almost entirely by the page fault Component.

5.7 Page Replacement Algorithms

When physical memory is full, a victim page must be selected for eviction.

Optimal (OPT/MIN). Replace the page not used for the longest time in the future. Provably Optimal but requires future knowledge; used only as a benchmark.

Theorem 5.2 (Belady”s Optimality). OPT yields the fewest page faults for any reference string.

Proof. If OPT replaces page xx (used furthest in future) and another algorithm replaces page yy (used sooner), the other algorithm faults at least once more by the time yy is next Referenced. \blacksquare

First-In, First-Out (FIFO). Replace the oldest page. Simple but may evict heavily used pages. Suffers from Belady’s anomaly: increasing frames can increase faults.

Example of Belady’s anomaly. Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5. With 3 Frames: 9 faults. With 4 frames: 10 faults.

Least Recently Used (LRU). Replace the page not used for the longest time. Approximates OPT Well. Does not suffer from Belady’s anomaly.

Theorem 5.3. LRU is a stack algorithm: the set of pages in n+1n+1 frames is a superset of the Set in nn frames, for any reference string.

Proof. When adding a frame, LRU keeps the nn most recently used pages (same as before) plus the New frame. No page in the nn-frame set can be more recently used than a page outside it, so the n+1n+1 set must contain the nn-frame set. \blacksquare

Clock (Second Chance). Pages in a circular list with a reference bit. On replacement:

  1. If reference bit is 0, replace the page.
  2. If reference bit is 1, clear it and advance.

Approximates LRU with O(1)O(1) per operation.

LFU (Least Frequently Used). Replace the page with the smallest access count. May fail to adapt To changing access patterns.

Approximating LRU in practice. Most OSes use a variant of Clock. Linux uses an LRU-like Approximation with active and inactive lists: pages on the active list are protected; pages Not accessed are demoted to the inactive list; eviction targets the inactive list.

Worked Example 5.3 — Optimal Page Replacement

Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1. Three frames.

RefFrame 1Frame 2Frame 3Fault?Victim
77Yes
070Yes
1701Yes
2201Yes7 (used at 18)
0201No
3231Yes0 (used at 10)
0230Yes1 (used at 14)
4240Yes3 (used at 11)
2240No
3340Yes2 (used at 13)
0340No
3340No
2240Yes3 (used at 11, already past)
1210Yes4 (used at \infty)
2210No
0210No
1210No
7710Yes2 (used at 13, already past)
0710No
1710No

Total page faults: 9. This is the theoretical minimum.

Worked Example 5.4 — LRU Page Replacement

Same reference string, three frames. LRU replaces the page whose last use was furthest in the past.

RefF1F2F3Fault?Victim (least recent)
77Yes
070Yes
1701Yes
2201Yes7
0201No
3231Yes0
0031Yes2
4041Yes3
2042Yes1
3342Yes0
0302Yes4
3302No
2302No
1102Yes3
2102No
0102No
1102No
7107Yes2
0107No
1107No

Total page faults: 12. LRU produces 33% more faults than optimal, but does not require future Knowledge.

Worked Example 5.5 — Clock (Second Chance) Replacement

Same reference string, three frames. Clock hand starts at frame 0. R = reference bit.

RefState (R bits)Fault?Action
7[7,R] [-] [-]YesLoad 7 into F0
0[7,R] [0,R] [-]YesLoad 0 into F1
1[7,R] [0,R] [1,R]YesLoad 1 into F2
2[2,R] [0,R] [1,R]YesF0: R=1, clear, advance; F1: R=1, clear, advance; F2: R=1, clear, advance; F0: R=0, replace with 2
0[2,R] [0,R] [1,R]NoSet R on F1
3[2,R] [3,R] [1,R]YesF0: R=1, clear, advance; F1: R=1, clear, advance; F2: R=1, clear, advance; F0: R=0, replace with 3
0[0,R] [3,R] [1,R]YesF0: R=0, replace with 0
4[0,R] [4,R] [1,R]YesF0: R=1, clear, advance; F1: R=0, replace with 4
2[2,R] [4,R] [1,R]YesF0: R=0, replace with 2
3[2,R] [3,R] [1,R]YesF0: R=1, clear, advance; F1: R=0, replace with 3
0[0,R] [3,R] [1,R]YesF0: R=0, replace with 0
3[0,R] [3,R] [1,R]NoSet R on F1
2[2,R] [3,R] [1,R]YesF0: R=0, replace with 2
1[2,R] [3,R] [1,R]NoSet R on F2
2[2,R] [3,R] [1,R]NoSet R on F0
0[2,R] [0,R] [1,R]YesF0: R=1, clear, advance; F1: R=1, clear, advance; F2: R=1, clear, advance; F0: R=0, replace with 0
1[0,R] [1,R] [1,R]NoSet R on F2
7[7,R] [1,R] [1,R]YesF0: R=0, replace with 7
0[7,R] [0,R] [1,R]YesF0: R=1, clear, advance; F1: R=0, replace with 0
1[7,R] [0,R] [1,R]NoSet R on F2

Total page faults: 15. Clock performs worse than LRU here but requires only O(1)O(1) per operation And no global ordering of references.

:::caution Common Pitfall Belady’s anomaly applies to FIFO but not to LRU or Optimal. Adding more memory does not always Reduce page faults for non-stack algorithms.

5.8 Thrashing

Thrashing occurs when a process spends more time paging than executing. This happens when the Total working set of all active processes exceeds physical memory.

Working set model. W(t,Δ)W(t, \Delta) is the set of pages referenced in the last Δ\Delta references. If Wi>\sum W_i \gt available frames, thrashing occurs.

Solutions:

  1. Working set strategy: Admit a process only if its working set fits.
  2. Page fault frequency (PFF): Adjust resident set size based on observed fault rate.
  3. Local replacement: Restrict eviction to the process’s own frames.
Worked Example 5.6 — Thrashing Analysis

A system has 64 frames of physical memory. Four processes with the following working set sizes (Δ=5\Delta = 5 references):

ProcessWorking Set Size
P1P_115
P2P_220
P3P_318
P4P_425
Total78

Total working set = 78 frames, but only 64 available. Thrashing occurs.

Solution 1 (working set admission): Suspend P4P_4 (largest working set). Remaining: 15+20+18=536415 + 20 + 18 = 53 \leq 64. Thrashing eliminated.

Solution 2 (PFF): If P3P_3‘s page fault rate exceeds the upper threshold, increase its resident Set size by stealing frames from P4P_4. If P4P_4‘s fault rate drops below the lower threshold, Decrease its allocation.

Theorem 5.4. If the sum of all working sets exceeds the number of physical frames, at least one Process must thrash.

Proof. By the pigeonhole principle, if i=1nWi>F\sum_{i=1}^{n} \lvert W_i \rvert \gt F where FF is the Total number of frames, at least one process cannot hold its entire working set in memory. That Process will repeatedly evict pages it needs, causing its page fault rate to dominate its execution Time. \blacksquare

5.9 Copy-on-Write (COW)

Copy-on-Write is an optimisation for fork(). Instead of copying all pages, parent and child Share physical frames (marked read-only). On a write to a shared page, a fault triggers a copy of Just that page.

  • fork() becomes nearly O(1)O(1) instead of O(n)O(n) where nn is the number of pages.
  • If the child immediately calls exec()No copies are ever made.

:::