Skip to content

Transaction Management

6.1 ACID Properties

A transaction is a logical unit of work that must satisfy:

PropertyMeaning
AtomicityAll or nothing: either all operations commit or none
ConsistencyTransforms the database from one consistent state to another
IsolationConcurrent transactions do not interfere with each other
DurabilityOnce committed, the effects survive failures

6.2 Isolation Levels

SQL defines four isolation levels with increasing strictness:

LevelDirty ReadNon-repeatable ReadPhantom Read
Read UncommittedPossiblePossiblePossible
Read CommittedPreventedPossiblePossible
Repeatable ReadPreventedPreventedPossible
SerializablePreventedPreventedPrevented
  • Dirty read: Reading uncommitted data from another transaction.
  • Non-repeatable read: Same query returns different results within the same transaction due to updates by another transaction.
  • Phantom read: Same range query returns new rows due to inserts by another transaction.

Default levels: PostgreSQL: Read Committed. MySQL InnoDB: Repeatable Read. SQL Server: Read Committed. Oracle: Read Committed.

6.3 Serialisability

A schedule is serialisable if its effect is equivalent to some serial schedule (where Transactions execute one after another).

Conflict serialisability. Two operations conflict if they belong to different transactions, Access the same data item, and at least one is a write. A schedule is conflict-serialisable if its precedence graph (also called serialisation graph) is acyclic.

  • Precedence graph: Nodes are transactions. Draw a directed edge TiTjT_i \to T_j if an operation of TiT_i conflicts with and precedes an operation of TjT_j.

Theorem 6.1. A schedule is conflict-serialisable if and only if its precedence graph is acyclic.

Proof sketch. If the precedence graph has a cycle, no serial ordering can respect all the Precedence constraints, so the schedule is not conflict-serialisable. Conversely, a topological sort Of an acyclic graph gives a serial order equivalent to the schedule. \blacksquare

View serialisability. A schedule SS is view-serialisable if it is view-equivalent to a Serial schedule S"S". View equivalence requires:

  1. Initial read: If TiT_i reads the initial value of QQ in SSIt does so in SS'.
  2. Updated read: If TiT_i reads the value of QQ written by TjT_j in SSIt does so in SS'.
  3. Final write: If TiT_i performs the final write of QQ in SSIt does so in SS'.

Every conflict-serialisable schedule is view-serialisable, but the converse does not hold. Testing For view serialisability is NP-complete.

Worked Example 6.1: Testing Conflict Serialisability

Schedule: S:r1(A),w1(A),r2(A),w2(A),r1(B),w1(B),r2(B),w2(B)S: r_1(A), w_1(A), r_2(A), w_2(A), r_1(B), w_1(B), r_2(B), w_2(B).

Build precedence graph:

  • w1(A)w_1(A) before r2(A)r_2(A): edge T1T2T_1 \to T_2.
  • r2(A)r_2(A) before w1(A)w_1(A): No — r2(A)r_2(A) is after w1(A)w_1(A).
  • w2(A)w_2(A) before… Nothing comes after w2(A)w_2(A) on AA.
  • w1(B)w_1(B) before r2(B)r_2(B): edge T1T2T_1 \to T_2.
  • w2(B)w_2(B) before… Nothing comes after w2(B)w_2(B) on BB.

Precedence graph: T1T2T_1 \to T_2 (acyclic).

Equivalent serial schedule: T1,T2T_1, T_2 (topological order).

Now consider: S:r1(A),w1(A),r2(B),w2(B),r1(B),w1(B),r2(A),w2(A)S': r_1(A), w_1(A), r_2(B), w_2(B), r_1(B), w_1(B), r_2(A), w_2(A).

  • w1(A)w_1(A) before r2(A)r_2(A): T1T2T_1 \to T_2.
  • w2(B)w_2(B) before r1(B)r_1(B): T2T1T_2 \to T_1.

Precedence graph has cycle: T1T2T1T_1 \to T_2 \to T_1. Not conflict-serialisable.

6.4 Concurrency Control

Two-Phase Locking (2PL).

Protocol:

  1. Growing phase: Acquire locks; do not release any.
  2. Shrinking phase: Release locks; do not acquire any.

Theorem 6.2. 2PL guarantees conflict serialisability.

Proof. By the protocol, transaction TiT_i acquires all its locks before releasing any. If TjT_j Requests a lock held by TiT_i, TjT_j waits. This creates a total ordering on conflicting Operations, which corresponds to a serial schedule. \blacksquare

Variants:

  • Strict 2PL: All locks held until commit/abort. Prevents cascading aborts.
  • Rigorous 2PL: All locks held until commit/abort (same as strict 2PL for exclusive locks).

Problem with 2PL: Deadlocks. Detection (wait-for graph) or prevention (timestamp ordering).

Timestamp Ordering (TO). Each transaction receives a timestamp TS(T)TS(T). For conflicting Operations:

  • TiT_i reads QQ: if QQ was last written by TjT_j with TS(Tj)>TS(Ti)TS(T_j) \gt TS(T_i)Abort TiT_i.
  • TiT_i writes QQ: if QQ was last read by TjT_j with TS(Tj)>TS(Ti)TS(T_j) \gt TS(T_i)Abort TiT_i.

No deadlocks (no waiting), but may abort transactions unnecessarily.

Optimistic Concurrency Control (OCC). Assumes conflicts are rare. Three phases:

  1. Read phase: Transaction reads data and writes to private workspace. No locks acquired.
  2. Validation phase: Before commit, check that no data read by this transaction has been modified by a committed transaction since the read phase began.
  3. Write phase: If validation passes, apply private writes to the database. Otherwise, abort and restart.

Forward validation: Check against committed transactions. For each data item QQ read by TiT_iVerify that the transaction that last wrote QQ committed before TiT_i‘s read phase began. Backward validation: Check against active transactions.

OCC performs well when conflicts are rare but degrades under high contention (many restarts).

Worked Example 6.2: Optimistic Concurrency Control

Scenario: Two transactions T1T_1 and T2T_2 both read and update account balances.

  • T1T_1: Read A=100A = 100Read B=200B = 200Write A=150A = 150Write B=150B = 150 (transfer 50 from BB to AA).
  • T2T_2: Read A=100A = 100Write A=75A = 75 (withdraw 25).

Execution:

  1. T1T_1 read phase: Reads A=100A = 100, B=200B = 200 into private workspace.
  2. T2T_2 read phase: Reads A=100A = 100 into private workspace.
  3. T2T_2 validation phase: Checks if AA was modified by any committed transaction since T2T_2 started. No committed transactions modified AA. Validation passes.
  4. T2T_2 write phase: Writes A=75A = 75. T2T_2 commits.
  5. T1T_1 validation phase: Checks if AA or BB was modified by any committed transaction since T1T_1 started. T2T_2 committed and modified AA. Validation fails.
  6. T1T_1 is aborted and restarted. On restart, T1T_1 reads A=75A = 75, B=200B = 200And correctly computes A=125A = 125, B=150B = 150.

Multiversion Concurrency Control (MVCC). Maintain multiple versions of each data item.

  • Read operations read a consistent snapshot (a version visible at the start of the transaction).
  • Write operations create a new version; the old version remains for concurrent readers.
  • Conflicts between readers and writers are avoided (no read-write contention).

MVCC pseudocode (simplified):

def read(txn, item):
version = find_version(item, txn.start_ts)
return version.value
def write(txn, item, value):
if has_conflicting_write(item, txn.start_ts, txn.id):
abort(txn)
create_version(item, value, txn.id, txn.start_ts)

MVCC implementations:

  • PostgreSQL: Snapshot Isolation. Each transaction sees a snapshot of committed data at transaction start. Uses xmin/xmax on each tuple.
  • MySQL InnoDB: MVCC with undo logs. Each row has hidden columns for transaction ID and rollback pointer.
  • Oracle: Snapshot Isolation at the statement level by default.

MVCC anomaly: Write Skew. Two concurrent transactions read overlapping data and update Non-overlapping parts based on what they read. Serializable Snapshot Isolation (SSI) in PostgreSQL 9.1+ Detects and prevents this.

6.5 Log-Based Recovery

Write-Ahead Logging (WAL). Before writing a modified page to disk, write the log record describing The modification to stable storage.

WAL protocol:

  1. Before a data page is flushed to disk, all log records pertaining to that page must be on stable storage.
  2. Before a transaction commits, all its log records must be on stable storage.

Log records: [LSN, TxnID, PageID, BeforeImage, AfterImage, Commit/Abort].

ARIES (Algorithm for Recovery and Isolation Exploiting Semantics). The standard recovery Algorithm used in IBM DB2, PostgreSQL, SQL Server, and others:

  1. Analysis phase: Scan the log from the last checkpoint. Identify dirty pages and active transactions. Reconstruct the transaction table (which transactions were active at crash time) and the dirty page table (which pages may not have been written to disk).
  2. Redo phase: Scan forward from the earliest LSN in the dirty page table. Redo all logged updates (from both committed and uncommitted transactions) to restore the database to the exact state at the time of the crash. A record is redone only if the page’s on-disk LSN is less than the record’s LSN (ensuring idempotency).
  3. Undo phase: Scan backward from the end of the log. Undo all updates from transactions that were active at crash time (did not commit). Write compensation log records (CLRs) so that undo operations themselves are logged.

Key ARIES properties:

  • Idempotent: Recovery can be restarted safely (re-processing log records has no additional effect).
  • Steal/no-force: Uncommitted pages can be flushed to disk (steal), and committed pages need not be flushed (no-force). This improves performance at the cost of more recovery work.
  • Fine-grained logging: Log records track individual page modifications, enabling precise recovery.

Checkpoints. Periodically flush dirty pages to disk and write a checkpoint record. Reduces the Amount of log to scan during recovery.

// Simplified checkpoint pseudocode
void checkpoint() {
flush_all_dirty_pages();
write_log_record(CHECKPOINT, dirty_page_list, active_txns);
flush_log();
}

6.6 Buffer Management

The buffer pool caches frequently accessed disk pages in memory. Replacement policies include LRU, Clock, and variants. PostgreSQL uses a Clock sweep algorithm; MySQL InnoDB uses an LRU variant with a Midpoint insertion strategy to avoid scan pollution.