Skip to content

Database Systems

1. Introduction to Database Systems

1.1 What is a Database

A database is an organised collection of structured data, managed by a database management System (DBMS). A DBMS provides:

  • Data definition (schema creation and modification).
  • Data manipulation (query, insert, update, delete).
  • Concurrency control (multiple users accessing data simultaneously).
  • Recovery (restoring data after failures).
  • Security and access control.

1.2 Database Architecture

Three-schema architecture (ANSI-SPARC):

The ANSI-SPARC (American National Standards Institute, Standards Planning and Requirements Committee) Architecture defines three levels of abstraction:

  1. External schema (view level): How different users/applications see the data. Each user group may have a different view. A view may hide or rename columns, join tables, or aggregate data.
  2. Conceptual schema (logical level): The logical structure of the entire database (tables, relationships, constraints). Describes entities, attributes, relationships, integrity constraints, and security information. This level is database-wide and is designed by the DBA.
  3. Internal schema (physical level): How data is stored on disk (indexes, file organisation, compression, encryption). Includes data structures, access paths, and storage allocation.

The DBMS maps between levels via the external/conceptual mapping (translates external views to The conceptual schema) and the conceptual/internal mapping (translates the conceptual schema to Internal storage).

Data independence:

  • Logical data independence: The conceptual schema can change without affecting external views. Example: adding a new column to a table does not require modifying existing views that do not reference it.
  • Physical data independence: The internal schema can change without affecting the conceptual schema. Example: changing from a B+ tree index to a hash index does not require modifying queries or table definitions.

1.3 Data Models

  • Relational model: Data organised into tables (relations) with rows (tuples) and columns (attributes). Based on relational algebra and calculus.
  • Entity-Relationship (ER) model: Conceptual design tool using entities, relationships, and attributes.
  • Object-relational model: Extends the relational model with object-oriented features (inheritance, complex types, methods). Example: PostgreSQL supports table inheritance and array types.
  • NoSQL models: Document, key-value, graph, column-family (covered in Section 8).

1.4 Comparison of Data Models

FeatureRelationalObject-RelationalDocumentKey-ValueGraphColumn-Family
Data structureTablesTables + objectsJSON/BSONKey-value pairsNodes + edgesColumn families
SchemaFixedFixed + extensibleFlexibleSchemalessFlexibleFlexible per row
Query languageSQLSQL + extensionsVariesGet/Set/ScanCypher/GremlinCQL/SQL-like
ACID supportFullFullLimitedLimitedPer-nodeTunable
ScalingVerticalVerticalHorizontalHorizontalHorizontalHorizontal
Best forStructured data, complex queriesComplex types, inheritanceContent management, APIsCaching, sessionsSocial networks, recommendationsTime series, logs, analytics

Choosing a model. Relational databases remain the default for structured data with complex Queries and strong consistency requirements. NoSQL databases excel when horizontal scalability or Flexible schemas are paramount. The choice depends on the workload, not on a blanket preference.

2. Relational Model

2.1 Relations, Tuples, Attributes

A relation RR over attributes A1,,AnA_1, \ldots, A_n is a set of tuples (a1,,an)(a_1, \ldots, a_n) where Each aia_i is drawn from the domain of AiA_i. A relation is a subset of D1×D2××DnD_1 \times D_2 \times \cdots \times D_n.

Properties of relations:

  • Each relation has a unique name.
  • Each attribute has a unique name within a relation.
  • Tuples are unordered (a relation is a set).
  • All tuples are distinct (no duplicates in a mathematical relation, though SQL tables allow them).
  • Attributes are unordered.

2.2 Keys

Let RR be a relation schema.

  • Superkey: A set of attributes KRK \subseteq R such that no two distinct tuples agree on all attributes in KK.
  • Candidate key: A minimal superkey (no proper subset is also a superkey).
  • Primary key: One of the candidate keys, chosen by the designer.
  • Foreign key: An attribute (or set) FKFK in relation R1R_1 that references the primary key PKPK of relation R2R_2. Enforces referential integrity: every value of FKFK must appear as a value of PKPK in R2R_2Or be NULL.

Example. Student(ID, Name, Email). ID is a candidate key (unique). If Email is also unique, It is another candidate key. One is chosen as the primary key.

Theorem 2.1. Every relation has at least one candidate key (possibly the entire set of attributes).

2.3 Relational Algebra

Relational algebra provides a formal query language based on operations on relations.

Selection σθ(R)\sigma_{\theta}(R): Return tuples from RR satisfying condition θ\theta.

σdept="CS(Student)\sigma_{\mathrm{dept} = \mathrm{"CS'}(\mathrm{Student})}

Projection πA1,,Ak(R)\pi_{A_1, \ldots, A_k}(R): Return a relation containing only attributes A1,,AkA_1, \ldots, A_k.

πname,gpa(Student)\pi_{\mathrm{name}, \mathrm{gpa}(\mathrm{Student})}

Union RSR \cup S: All tuples in RR or SS (both must be union-compatible: same arity and Attribute domains).

Difference RSR - S: Tuples in RR but not in SS.

Cartesian product R×SR \times S: All pairs (r,s)(r, s) where rRr \in R and sSs \in S.

Rename ρx(R)\rho_{x}(R): Rename relation RR to xx.

Natural join RSR \bowtie S: Combine tuples from RR and SS that agree on all common attributes.

RS=πRS(σR.common=S.common(R×S))R \bowtie S = \pi_{R \cup S}(\sigma_{R.\mathrm{common} = S.\mathrm{common}(R \times S))}

Theta join RθSR \bowtie_{\theta} S: σθ(R×S)\sigma_{\theta}(R \times S).

Equi-join RR.A=S.BSR \bowtie_{R.A = S.B} S: A theta join where θ\theta is an equality on specific Attributes. Keeps both join columns.

Left outer join RleftSR \bowtie_{\mathrm{left} S}: All tuples from RRMatched with SS where possible; NULL-padded otherwise.

Right outer join RrightSR \bowtie_{\mathrm{right} S}: All tuples from SSMatched with RR.

Full outer join RfullSR \bowtie_{\mathrm{full} S}: All tuples from both RR and SS.

Division R÷SR \div S: Tuples tt in πRS(R)\pi_{R-S}(R) such that for every tuple sSs \in S, (t,s)R(t, s) \in R.

R÷S=πRS(R)πRS((πRS(R)×S)R)R \div S = \pi_{R-S}(R) - \pi_{R-S}\Bigl(\bigl(\pi_{R-S}(R) \times S\bigr) - R\Bigr)

Example. Find students who have taken all courses:

πsid,cid(Takes)÷πcid(Course)\pi_{\mathrm{sid}, \mathrm{cid}(\mathrm{Takes}) \div \pi_{\mathrm{cid}(\mathrm{Course})}}

Worked Example 2.1: Complex Relational Algebra Query

Query: Find the names of students who have enrolled in every course offered by the CS department.

Schema: Student(sid, name, dept)``Course(cid, title, dept)``Takes(sid, cid, semester).

Step 1. Get CS course IDs:

CCS=πcid(σdept=CS(Course))C_{\mathrm{CS} = \pi_{\mathrm{cid}\bigl(\sigma_{\mathrm{dept} = \mathrm{'CS'}(\mathrm{Course})\bigr)}}}

Step 2. Get student-course pairs from enrolments:

T=πsid,cid(Takes)T = \pi_{\mathrm{sid}, \mathrm{cid}(\mathrm{Takes})}

Step 3. Students who have taken all CS courses (division):

Sall=T÷CCSS_{\mathrm{all} = T \div C_{\mathrm{CS}}}

Step 4. Get names:

πname(SallStudent)\pi_{\mathrm{name}(S_{\mathrm{all} \bowtie \mathrm{Student})}}

Combined:

πname((πsid,cid(Takes)÷πcid(σdept=CS(Course)))Student)\pi_{\mathrm{name}\Bigl(\bigl(\pi_{\mathrm{sid}, \mathrm{cid}(\mathrm{Takes}) \div \pi_{\mathrm{cid}(\sigma_{\mathrm{dept}=\mathrm{'CS'}(\mathrm{Course}))\bigr) \bowtie \mathrm{Student}\Bigr)}}}}

Worked Example 2.2: Relational Algebra with Outer Joins

Query: Find all students and the number of CS courses they have taken, including students who have Taken no CS courses (count should be 0).

Step 1. Filter enrolments to CS courses:

ECS=πsid,cid(Takesσdept=CS(Course))E_{\mathrm{CS} = \pi_{\mathrm{sid}, \mathrm{cid}\bigl(\mathrm{Takes} \bowtie \sigma_{\mathrm{dept}=\mathrm{'CS'}(\mathrm{Course})\bigr)}}}

Step 2. Left outer join with Student to include those with no CS courses:

Result=StudentleftECS\mathrm{Result} = \mathrm{Student} \bowtie_{\mathrm{left} E_{\mathrm{CS}}}

Note: aggregation over outer join results handles NULL values (they are excluded from COUNT).

In SQL, this translates directly to:

SELECT S.sid, S.name, COUNT(E.cid) AS num_cs_courses
FROM Student S
LEFT JOIN Takes T ON S.sid = T.sid
LEFT JOIN Course C ON T.cid = C.cid AND C.dept = 'CS'
GROUP BY S.sid, S.name;

2.4 Relational Calculus

Tuple relational calculus. A query has the form tP(t)\\{t \mid P(t)\\} where tt is a tuple variable And PP is a well-formed formula. The formula is built from:

  • Atoms: tRt \in R (tuple tt is in relation RR), t[A]ops[A]t[A] \mathbin{\mathrm{op} s[A]} (comparison), t[A]opct[A] \mathbin{\mathrm{op} c} (comparison with constant), where op=,,<,>,,\mathrm{op} \in \\{=, \neq, \lt, \gt, \le, \ge\\}.
  • Logical connectives: \land (and), \lor (or), ¬\lnot (not).
  • Quantifiers: t\exists t (there exists), t\forall t (for all).

{tsTakes(t[name]=s[name]s[grade]=A)}\{t \mid \exists s \in \mathrm{Takes}(t[\mathrm{name}] = s[\mathrm{name}] \land s[\mathrm{grade}] = \mathrm{'A'})\}

Domain relational calculus. Variables range over individual attribute domains (not entire tuples). A query has the form x1,,xkP(x1,,xk)\\{\langle x_1, \ldots, x_k \rangle \mid P(x_1, \ldots, x_k)\\}.

{ns,g  (Takes(s,CS101,g)Student(s,n,)g=A)}\{ \langle n \rangle \mid \exists s, g \;(\mathrm{Takes}(s, \mathrm{'CS101'}, g) \land \mathrm{Student}(s, n, \ldots) \land g = \mathrm{'A'})\}

Safety. A calculus expression is safe if it yields a finite relation. The expression t¬(tR)\\{t \mid \lnot(t \in R)\\} is unsafe (it includes every tuple not in RRAn infinite set). We Restrict variables to domains of the relations appearing in the query.

Theorem 2.2 (Codd). Relational algebra and relational calculus (both tuple and domain) are Equally expressive: every query expressible in one is expressible in the other.

Worked Example 2.3: Tuple Relational Calculus

Query: Find students who have taken no CS course.

Using tuple relational calculus:

{ttStudent¬sTakes(s[sid]=t[sid]cCourse(c[cid]=s[cid]c[dept]=CS)}\{t \mid t \in \mathrm{Student} \land \lnot \exists s \in \mathrm{Takes}\bigl(s[\mathrm{sid}] = t[\mathrm{sid}] \land \exists c \in \mathrm{Course}(c[\mathrm{cid}] = s[\mathrm{cid}] \land c[\mathrm{dept}] = \mathrm{'CS'}\bigr)\}

Translation to relational algebra:

πname(Student)πname(StudentTakesσdept=CS(Course))\pi_{\mathrm{name}(\mathrm{Student}) - \pi_{\mathrm{name}(\mathrm{Student} \bowtie \mathrm{Takes} \bowtie \sigma_{\mathrm{dept}=\mathrm{'CS'}(\mathrm{Course}))}}}

2.5 Equivalence Rules for Relational Algebra

The following rules allow the optimiser to transform queries without changing their meaning. Each rule States that two expressions produce the same relation for any input relations.

Rule 1 (Cascade of selections). σθ1θ2(R)σθ1(σθ2(R))\sigma_{\theta_1 \land \theta_2}(R) \equiv \sigma_{\theta_1}(\sigma_{\theta_2}(R)).

Proof. A tuple satisfies θ1θ2\theta_1 \land \theta_2 if and only if it satisfies both θ1\theta_1 and θ2\theta_2. Applying σθ2\sigma_{\theta_2} first removes tuples failing θ2\theta_2; then σθ1\sigma_{\theta_1} removes tuples failing θ1\theta_1 among the remainder. The result is exactly the Tuples satisfying both conditions. \blacksquare

Rule 2 (Commutativity of selections). σθ1(σθ2(R))σθ2(σθ1(R))\sigma_{\theta_1}(\sigma_{\theta_2}(R)) \equiv \sigma_{\theta_2}(\sigma_{\theta_1}(R)).

Proof. Immediate from the commutativity of logical conjunction (θ1θ2θ2θ1\theta_1 \land \theta_2 \equiv \theta_2 \land \theta_1). \blacksquare

Rule 3 (Selection pushdown through cross product). If θ\theta involves only attributes of RR Then σθ(R×S)σθ(R)×S\sigma_{\theta}(R \times S) \equiv \sigma_{\theta}(R) \times S.

Proof. For each pair (r,s)(r, s) with rRr \in R and sSs \in SThe condition θ\theta depends only on rr. Filtering (r,s)(r, s) by θ\theta on R×SR \times S is equivalent to first filtering RR by θ\theta And then forming the cross product, since ss does not affect the result of θ\theta. \blacksquare

Rule 4 (Selection pushdown through join). If θ\theta involves only attributes of RRThen σθ(RS)σθ(R)S\sigma_{\theta}(R \bowtie S) \equiv \sigma_{\theta}(R) \bowtie S.

Proof. The join RSR \bowtie S combines matching pairs from RR and SS. Applying σθ\sigma_{\theta} After the join filters these pairs by θ\theta on RR‘s attributes. Filtering RR first removes Non-matching RR-tuples before the join, yielding the same final set of pairs. \blacksquare

Rule 5 (Commutativity of joins). RSSRR \bowtie S \equiv S \bowtie R.

Proof. The natural join combines tuples agreeing on common attributes. This relation is symmetric In RR and SS. \blacksquare

Rule 6 (Associativity of joins). (RS)TR(ST)(R \bowtie S) \bowtie T \equiv R \bowtie (S \bowtie T).

Proof. Both expressions produce tuples from R×S×TR \times S \times T that agree on all attributes Shared by any pair of the three relations. \blacksquare

Rule 7 (Projection pushdown). If L1L2L_1 \subseteq L_2 and L2L_2 includes all attributes used in Join conditions, then πL1(RS)πL1(πL2(R)πL2(S))\pi_{L_1}(R \bowtie S) \equiv \pi_{L_1}(\pi_{L_2}(R) \bowtie \pi_{L_2}(S)).

Proof. Projecting after the join retains only attributes in L1L_1. Since L2L_2 includes all Attributes needed for the join, performing the join on projected versions of RR and SS yields the Same join result, from which L1L_1 is then projected. \blacksquare

These rules form the foundation of heuristic query optimisation (see Section 7).

3. SQL

3.1 Data Definition Language (DDL)

CREATE TABLE Student (
ID INT PRIMARY KEY,
Name VARCHAR(100) NOT NULL,
Email VARCHAR(255) UNIQUE,
GPA DECIMAL(3,2) CHECK (GPA >= 0.0 AND GPA <= 4.0),
Dept VARCHAR(50) DEFAULT 'Undeclared'
);
CREATE TABLE Enrolment (
StudentID INT REFERENCES Student(ID) ON DELETE CASCADE,
CourseID VARCHAR(10) REFERENCES Course(CourseID),
Grade CHAR(2),
Semester VARCHAR(6),
PRIMARY KEY (StudentID, CourseID, Semester)
);
CREATE INDEX idx_student_dept ON Student(Dept);
CREATE INDEX idx_enrolment_student ON Enrolment(StudentID);

Constraints:

ConstraintEnforcement
PRIMARY KEYUnique, not null
FOREIGN KEYReferential integrity
UNIQUENo duplicate values
NOT NULLNo null values
CHECKArbitrary boolean condition
DEFAULTDefault value if none specified

3.2 Data Manipulation Language (DML)

INSERT INTO Student (ID, Name, Email, GPA)
VALUES (1, 'Alice', 'alice@univ.edu', 3.9);
UPDATE Student SET GPA = 3.95 WHERE ID = 1;
DELETE FROM Student WHERE GPA < 1.0;

3.3 Queries

SELECT S.Name, S.GPA, C.Title
FROM Student S
JOIN Enrolment E ON S.ID = E.StudentID
JOIN Course C ON E.CourseID = C.CourseID
WHERE E.Grade IN ('A', 'A-')
AND C.Dept = 'CS'
AND S.GPA > 3.5
ORDER BY S.GPA DESC;

Aggregate functions: COUNT``SUM``AVG``MIN``MAX.

SELECT Dept, COUNT(*) AS NumStudents, AVG(GPA) AS AvgGPA
FROM Student
GROUP BY Dept
HAVING COUNT(*) > 10
ORDER BY AvgGPA DESC;

WHERE vs. HAVING: WHERE filters rows before grouping; HAVING filters groups after.

3.4 Joins

-- Inner join: only matching rows
SELECT * FROM Student S JOIN Enrolment E ON S.ID = E.StudentID;
-- Left join: all students, with enrolment data where available
SELECT * FROM Student S LEFT JOIN Enrolment E ON S.ID = E.StudentID;
-- Self-join: find students with the same name
SELECT S1.Name, S1.ID, S2.ID
FROM Student S1 JOIN Student S2 ON S1.Name = S2.Name AND S1.ID < S2.ID;
-- Cross join: Cartesian product
SELECT * FROM Student CROSS JOIN Course;

3.5 Subqueries

-- Scalar subquery
SELECT Name FROM Student
WHERE GPA > (SELECT AVG(GPA) FROM Student);
-- Correlated subquery
SELECT S.Name, S.Dept
FROM Student S
WHERE GPA = (
SELECT MAX(GPA) FROM Student S2 WHERE S2.Dept = S.Dept
);
-- EXISTS / NOT EXISTS
SELECT Name FROM Student S
WHERE EXISTS (
SELECT 1 FROM Enrolment E
WHERE E.StudentID = S.ID AND E.Grade = 'A+'
);
-- IN subquery
SELECT Name FROM Student
WHERE ID IN (SELECT StudentID FROM Enrolment WHERE CourseID = 'CS101');
-- WITH (Common Table Expression)
WITH DeptAvg AS (
SELECT Dept, AVG(GPA) AS AvgGPA FROM Student GROUP BY Dept
)
SELECT S.Name, S.GPA, D.AvgGPA
FROM Student S JOIN DeptAvg D ON S.Dept = D.Dept
WHERE S.GPA > D.AvgGPA;

3.6 Window Functions

Window functions compute values across a set of rows related to the current row, without collapsing Rows (unlike GROUP BY).

SELECT Name, Dept, GPA,
RANK() OVER (PARTITION BY Dept ORDER BY GPA DESC) AS DeptRank,
DENSE_RANK() OVER (ORDER BY GPA DESC) AS OverallRank,
AVG(GPA) OVER (PARTITION BY Dept) AS DeptAvgGPA,
LAG(GPA, 1) OVER (PARTITION BY Dept ORDER BY GPA DESC) AS PrevGPA
FROM Student;

RANK() vs. DENSE_RANK(): RANK() leaves gaps after ties; DENSE_RANK() does not.

Other window functions: ROW_NUMBER()``LEAD()``FIRST_VALUE()``LAST_VALUE() NTILE(n)``SUM() OVER(...)``COUNT() OVER(...).

Worked Example 3.1: Running Totals and Percentiles

Running total of enrolments per department over time:

SELECT Dept, Semester, COUNT(*) AS Enrolments,
SUM(COUNT(*)) OVER (PARTITION BY Dept ORDER BY Semester
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM Enrolment E JOIN Course C ON E.CourseID = C.CourseID
GROUP BY Dept, Semester
ORDER BY Dept, Semester;

Percentile rank of each student’s GPA within their department:

SELECT Name, Dept, GPA,
PERCENT_RANK() OVER (PARTITION BY Dept ORDER BY GPA) AS Percentile,
NTILE(4) OVER (PARTITION BY Dept ORDER BY GPA) AS Quartile
FROM Student;

PERCENT_RANK() returns a value in [0,1][0, 1] representing the fractional rank. NTILE(4) divides The partition into 4 approximately equal groups (quartiles).

3.7 Views

CREATE VIEW CSStudents AS
SELECT ID, Name, GPA FROM Student WHERE Dept = 'CS';
CREATE VIEW DeptStats AS
SELECT Dept, COUNT(*) AS N, AVG(GPA) AS AvgGPA
FROM Student GROUP BY Dept;

Views are virtual tables. Updatable views (single-table, no aggregation) allow INSERT UPDATE``DELETE through the view. Materialised views store the result physically and must be Refreshed.

3.8 Recursive Common Table Expressions

Recursive CTEs allow queries on hierarchical or graph-structured data. A recursive CTE consists of A base case (non-recursive term) and a recursive step (referencing the CTE itself), joined By UNION ALL.

Worked Example 3.2: Employee-Manager Hierarchy

Schema: Employee(emp_id, name, manager_id).

WITH RECURSIVE OrgChart AS (
SELECT emp_id, name, manager_id, 1 AS level
FROM Employee
WHERE manager_id IS NULL
UNION ALL
SELECT e.emp_id, e.name, e.manager_id, o.level + 1
FROM Employee e
JOIN OrgChart o ON e.manager_id = o.emp_id
)
SELECT level, name, emp_id
FROM OrgChart
ORDER BY level, name;

The base case selects the CEO (no manager). Each recursive step finds all employees whose manager Appears in the previous iteration. The recursion terminates when no new employees are found.

Worked Example 3.3: Bill of Materials (Parts Explosion)

Schema: Part(part_id, name) and BOM(assembly_id, component_id, quantity).

Find all components (direct and indirect) of assembly A100 and the total quantity needed:

WITH RECURSIVE PartsList AS (
SELECT component_id AS part_id, quantity
FROM BOM
WHERE assembly_id = 'A100'
UNION ALL
SELECT b.component_id, p.quantity * b.quantity
FROM PartsList p
JOIN BOM b ON p.part_id = b.assembly_id
)
SELECT part_id, SUM(quantity) AS total_qty
FROM PartsList
GROUP BY part_id
ORDER BY total_qty DESC;

The recursive step multiplies the quantity at each level to accumulate the total needed quantity for Each sub-component.

3.9 SQL Injection Prevention

SQL injection occurs when user input is concatenated directly into a SQL string, allowing an Attacker to manipulate the query structure.

Vulnerable code (Python):

cursor.execute(f"SELECT * FROM Student WHERE name = '{user_input}'")

If user_input is ' OR '1'='1The query becomes SELECT * FROM Student WHERE name = '' OR '1'='1' Returning all rows.

Types of SQL injection:

TypeDescription
In-band (error-based)Attacker extracts data from error messages
In-band (UNION-based)Attacker appends UNION SELECT to extract other tables
Blind (boolean)Attacker infers data by asking true/false questions
Blind (time-based)Attacker infers data by measuring response delays
Out-of-bandAttacker triggers the DBMS to send data externally

Prevention:

  1. Parameterised queries (prepared statements): The DBMS treats parameters as data, never as SQL code.
cursor.execute("SELECT * FROM Student WHERE name = %s", (user_input,))
  1. Input validation: Reject or sanitise input that does not match expected patterns (e.g., email format, numeric ID).
  2. Least privilege: Application accounts should have only the permissions they need.
  3. ORM frameworks: Use an ORM (e.g., SQLAlchemy, Django ORM) that generates parameterised queries by default.

:::caution Common Pitfall Even if you escape single quotes in user input, this is not a reliable defence against SQL injection. Use parameterised queries instead. Escape-based defences are fragile and have been bypassed by Encoding tricks (e.g., multibyte character exploits). :::

3.10 Query Optimisation Basics

The SQL query optimiser automatically selects an execution plan, but understanding the basics helps Developers write efficient queries.

Using EXPLAIN:

EXPLAIN ANALYZE
SELECT S.Name FROM Student S
JOIN Enrolment E ON S.ID = E.StudentID
WHERE E.CourseID = 'CS101';

EXPLAIN shows the estimated plan; EXPLAIN ANALYZE also runs the query and shows actual timings. Key information: sequential scan vs. Index scan, join algorithm used, estimated vs. Actual row counts, And execution time.

Common optimisation guidelines:

  • Create appropriate indexes on columns used in WHERE``JOIN``ORDER BYAnd GROUP BY clauses.
  • Avoid SELECT *; retrieve only the columns needed.
  • Avoid N+1 queries: Fetching related data row-by-row instead of with a single join or batch query.
  • Use LIMIT during development and testing to avoid processing large result sets.
  • Prefer EXISTS over IN for subqueries when checking for existence (often more efficient).
  • Keep statistics updated: ANALYZE (PostgreSQL) or ANALYZE TABLE (MySQL) updates the statistics the optimiser uses for cost estimation.
Worked Example 3.4: Identifying and Fixing a Slow Query

Slow query:

SELECT S.Name, S.Dept, C.Title
FROM Student S, Enrolment E, Course C
WHERE S.ID = E.StudentID
AND E.CourseID = C.CourseID
AND C.Dept = 'CS'
AND S.GPA > 3.5
ORDER BY S.Name;

Problems identified via EXPLAIN ANALYZE:

  1. Implicit cross-join syntax (FROM S, E, C) instead of explicit JOIN. While the optimiser handles this, explicit joins are clearer and less error-prone.
  2. No index on C.Dept. The optimiser performs a full scan on Course.
  3. No index on E.StudentID or E.CourseID. Nested-loop joins scan Enrolment for each row.

Improved version:

SELECT S.Name, S.Dept, C.Title
FROM Student S
JOIN Enrolment E ON S.ID = E.StudentID
JOIN Course C ON E.CourseID = C.CourseID
WHERE C.Dept = 'CS' AND S.GPA > 3.5
ORDER BY S.Name;
CREATE INDEX idx_course_dept ON Course(Dept);
CREATE INDEX idx_enrolment_student ON Enrolment(StudentID);
CREATE INDEX idx_enrolment_course ON Enrolment(CourseID);

With these indexes, the optimiser can use an index scan on Course(Dept) to find CS courses, then Use nested-loop index joins to find matching enrolments and students, avoiding full table scans.

4. Normalisation

4.1 Functional Dependencies

A functional dependency XYX \to Y holds on relation RR if for any two tuples t1,t2t_1, t_2 in RR t1[X]=t2[X]t_1[X] = t_2[X] implies t1[Y]=t2[Y]t_1[Y] = t_2[Y].

Armstrong’s axioms:

  1. Reflexivity: If YXY \subseteq XThen XYX \to Y.
  2. Augmentation: If XYX \to YThen XZYZXZ \to YZ.
  3. Transitivity: If XYX \to Y and YZY \to ZThen XZX \to Z.

Derived rules:

  1. Union: If XYX \to Y and XZX \to ZThen XYZX \to YZ.
  2. Decomposition: If XYZX \to YZThen XYX \to Y and XZX \to Z.
  3. Pseudotransitivity: If XYX \to Y and YWZYW \to ZThen XWZXW \to Z.

Attribute closure. X+X^+ is the set of all attributes functionally determined by XX. Computed By starting with XX and repeatedly applying Armstrong’s axioms until no new attributes can be added.

Theorem 4.1. XYX \to Y holds if and only if YX+Y \subseteq X^+.

4.2 Normal Forms

First Normal Form (1NF). All attribute values are atomic (indivisible). No repeating groups or Nested relations.

Second Normal Form (2NF). In 1NF, and no non-prime attribute is partially dependent on any Candidate key. Equivalently: every non-prime attribute is fully functionally dependent on every Candidate key.

A partial dependency is ABA \to B where AA is a proper subset of a candidate key and BB is Non-prime.

Third Normal Form (3NF). In 2NF, and for every non-trivial FD XAX \to A in RREither XX is a Superkey or AA is a prime attribute.

A prime attribute is an attribute that belongs to some candidate key.

Boyce-Codd Normal Form (BCNF). For every non-trivial FD XAX \to A in RR, XX is a superkey.

Theorem 4.2. Every relation in BCNF is in 3NF, but the converse does not hold.

Proof. BCNF requires XX to be a superkey for every non-trivial FD XAX \to A. 3NF allows XX to be A superkey or AA to be prime. Since the BCNF condition is stricter, every BCNF relation is in 3NF. For the converse, consider R(A,B,C)R(A, B, C) with FDs ABCAB \to C and CBC \to B. Candidate keys: AB\\{AB\\}, AC\\{AC\\}. CBC \to B does not violate 3NF (BB is prime) but violates BCNF (CC is not a Superkey). \blacksquare

4.3 Decomposition

A decomposition of RR into R1,R2,,RkR_1, R_2, \ldots, R_k must satisfy:

  1. Lossless-join: R=R1R2RkR = R_1 \bowtie R_2 \bowtie \cdots \bowtie R_k. No information is lost.
  2. Dependency preservation: Every FD in the original schema is implied by the FDs in the decomposed schemas.

Theorem 4.3 (Lossless-join test). A decomposition of RR into R1,R2R_1, R_2 is lossless if and Only if R1R2R1R_1 \cap R_2 \to R_1 or R1R2R2R_1 \cap R_2 \to R_2.

Proof. Let rr be an instance of RR and let r1=πR1(r)r_1 = \pi_{R_1}(r), r2=πR2(r)r_2 = \pi_{R_2}(r). We must Show r=r1r2r = r_1 \bowtie r_2 under the given condition. Since r1r_1 and r2r_2 are projections of rr Every tuple in r1r2r_1 \bowtie r_2 agrees with some tuple of rr on every attribute. It suffices to show That no spurious tuple is produced. Suppose (t1,t2)r1r2(t_1, t_2) \in r_1 \bowtie r_2 where t1r1t_1 \in r_1 and t2r2t_2 \in r_2. Since t1t_1 and t2t_2 agree on R1R2R_1 \cap R_2And by the condition R1R2R1R_1 \cap R_2 \to R_1 (WLOG), there exists TrT' \in r with T[R1R2]=t1[R1R2]=t2[R1R2]T'[R_1 \cap R_2] = t_1[R_1 \cap R_2] = t_2[R_1 \cap R_2]. Since T[R1R2]=t1[R1R2]T'[R_1 \cap R_2] = t_1[R_1 \cap R_2] and R1R2R1R_1 \cap R_2 \to R_1, we have t[R1]=t1[R1]t'[R_1] = t_1[R_1]. Similarly t[R2]=t2[R2]t'[R_2] = t_2[R_2]. Thus (t1,t2)(t_1, t_2) corresponds to a real tuple trt' \in rSo no spurious tuple exists. \blacksquare

Dependency preservation. Let FF be the set of FDs for RR. The decomposition R1,,RkR_1, \ldots, R_k Preserves dependencies if the closure of i=1kFi+\bigcup_{i=1}^{k} F_i^+ (where FiF_i is the restriction Of FF to RiR_i) equals F+F^+. In practice, we check that every FD in a minimal cover of FF can be Tested within a single RiR_i.

Dependency preservation is important for efficient constraint checking: when updating RiR_iThe DBMS can verify relevant FDs locally without joining all decomposed relations.

4.4 Normalisation Examples

Example 1. Enrolment(StudentID, CourseID, StudentName, Dept, Grade) with FDs: StudentIDStudentName\mathrm{StudentID} \to \mathrm{StudentName}, StudentIDDept\mathrm{StudentID} \to \mathrm{Dept} StudentID,CourseIDGrade\\{\mathrm{StudentID}, \mathrm{CourseID}\\} \to \mathrm{Grade}.

  • Candidate key: StudentID,CourseID\\{\mathrm{StudentID}, \mathrm{CourseID}\\}.
  • 1NF: satisfied (atomic values).
  • 2NF violation: StudentIDStudentName\mathrm{StudentID} \to \mathrm{StudentName} is a partial dependency (StudentID is a proper subset of the key).
  • Decompose: Student(StudentID, StudentName, Dept) and Enrolment(StudentID, CourseID, Grade). Both are in 3NF and BCNF.

Example 2 (3NF but not BCNF). CourseOffering(Course, Instructor, Room, Time) with FDs: Course,TimeInstructor,Room\\{\mathrm{Course}, \mathrm{Time}\\} \to \\{\mathrm{Instructor}, \mathrm{Room}\\} InstructorRoom\mathrm{Instructor} \to \mathrm{Room}.

  • Candidate key: Course,Time\\{\mathrm{Course}, \mathrm{Time}\\}.
  • 3NF: InstructorRoom\mathrm{Instructor} \to \mathrm{Room} — Instructor is not a superkey, but Room is not prime. Wait — Room is not prime (not in any candidate key). So this actually violates 3NF too.

Let us correct: CourseOffering(Course, Instructor, Textbook) with FDs: CourseInstructor\mathrm{Course} \to \mathrm{Instructor} InstructorTextbook\mathrm{Instructor} \to \mathrm{Textbook}.

  • Candidate key: Course\\{\mathrm{Course}\\} (since Course determines everything transitively).
  • 3NF: InstructorTextbook\mathrm{Instructor} \to \mathrm{Textbook}. Instructor is not a superkey. Textbook is not prime. Violates 3NF.

Better example. Class(Course, Instructor, Student) with FDs: Course,StudentInstructor\\{\mathrm{Course}, \mathrm{Student}\\} \to \mathrm{Instructor} InstructorCourse\mathrm{Instructor} \to \mathrm{Course}.

  • Candidate keys: Course,Student\\{\mathrm{Course}, \mathrm{Student}\\}, Instructor,Student\\{\mathrm{Instructor}, \mathrm{Student}\\}.
  • 3NF check for InstructorCourse\mathrm{Instructor} \to \mathrm{Course}: Instructor is not a superkey. But Course is prime (in candidate key Course,Student\\{\mathrm{Course}, \mathrm{Student}\\}). So 3NF is satisfied.
  • BCNF check: InstructorCourse\mathrm{Instructor} \to \mathrm{Course} violates BCNF (Instructor is not a superkey).

Decompose: Teaches(Instructor, Course) and Attends(Instructor, Student). This is lossless (Instructor\mathrm{Instructor} is common and InstructorCourse\mathrm{Instructor} \to \mathrm{Course} holds in Teaches). But the dependency Course,StudentInstructor\\{\mathrm{Course}, \mathrm{Student}\\} \to \mathrm{Instructor} is Not preserved.

Theorem 4.4. Not every relation can be decomposed into BCNF while preserving dependencies. 3NF Is the strongest normal form guaranteeing dependency-preserving, lossless-join decomposition.

Worked Example 4.1: BCNF Decomposition Algorithm

Relation: R(A,B,C,D,E)R(A, B, C, D, E) with FDs F=AB,BCE,EDAF = \\{A \to B, BC \to E, ED \to A\\}.

Step 1. Find candidate keys. Compute A+=A,BA^+ = \\{A, B\\}. BC+=B,C,E,D,A=A,B,C,D,EBC^+ = \\{B, C, E, D, A\\} = \\{A, B, C, D, E\\}. So BCBC is a candidate key. Check EDED: ED+=E,D,A,B,C=A,B,C,D,EED^+ = \\{E, D, A, B, C\\} = \\{A, B, C, D, E\\}. So EDED is Also a candidate key. Prime attributes: B,C,D,EB, C, D, E.

Step 2. Check BCNF. ABA \to B: AA is not a superkey (since A+RA^+ \neq R). Violates BCNF. Decompose on ABA \to B: R1=A,BR_1 = \\{A, B\\} and R2=A,C,D,ER_2 = \\{A, C, D, E\\}.

Step 3. R1=A,BR_1 = \\{A, B\\} is in BCNF (FD ABA \to B with AA as superkey).

For R2=A,C,D,ER_2 = \\{A, C, D, E\\} with FDs restricted to R2R_2: BCEBC \to E involves BB which is not in R2R_2 So it is dropped. EDAED \to A holds. Also, ABA \to B involves BB not in R2R_2So it is dropped. Check for implied FDs: ABA \to B in RR implies ACBCAC \to BC (augmentation), so ACEAC \to E in RR (since BCEBC \to E). But ACAC is not a superkey of R2R_2… Wait, let us recheck.

Actually, from BCEBC \to E and the fact that BB is not in R2R_2We cannot directly use BCEBC \to E In R2R_2. We should compute the projection of FF onto R2R_2: F2=EDA,AF_2 = \\{ED \to A, A \to \varnothing\\}. So the only non-trivial FD in R2R_2 is EDAED \to A. Is EDED a superkey of R2R_2? ED+=E,D,AA,C,D,EED^+ = \\{E, D, A\\} \neq \\{A, C, D, E\\} (missing CC). So EDED is not a superkey of R2R_2.

Hmm, we need to be more careful. Let us check if CC is determined by EDED in the original RR. ED+ED^+ in R=E,D,A,BR = \\{E, D, A, B\\}Which does not include CC. So CC is not determined.

This means R2=A,C,D,ER_2 = \\{A, C, D, E\\} has no non-trivial FDs that hold (other than keys determining all Attributes). Check: candidate keys of R2R_2 must be superkeys. Since EDAED \to A holds but EDED does Not determine CCWe need EDCEDC as a key: EDC+=E,D,C,A,B=REDC^+ = \\{E, D, C, A, B\\} = R (all of RR). So in R2R_2, EDCEDC is a candidate key (since it determines all attributes of R2R_2: EDCAEDC \to A and AA \to \varnothing in R2R_2, so EDC+=A,C,D,EEDC^+ = \\{A, C, D, E\\}). R2R_2 is in BCNF since the only non-trivial FD is EDAED \to A and we need to check if EDED is a superkey of R2R_2. Since ED+R2=A,D,ER2ED^+ \cap R_2 = \\{A, D, E\\} \neq R_2, EDED is not a superkey.

But wait — there are no other non-trivial FDs in R2R_2. The only one is EDAED \to AWhich violates BCNF. Decompose: R2a=E,D,AR_{2a} = \\{E, D, A\\} and R2b=A,C,D,EE,D,A=CR_{2b} = \\{A, C, D, E\\} \setminus \\{E, D, A\\} = \\{C\\}.

Hmm, C\\{C\\} alone is a relation with no non-trivial FDs, so it is in BCNF. R2a=E,D,AR_{2a} = \\{E, D, A\\} With EDAED \to A: EDED is a superkey (it determines all three attributes). BCNF.

Final decomposition: R1=A,BR_1 = \\{A, B\\}, R2a=A,D,ER_{2a} = \\{A, D, E\\}, R2b=CR_{2b} = \\{C\\}.

Lossless check: R1R2a=AR_1 \cap R_{2a} = \\{A\\}, ABA \to B (from FF), so R1R2aR_1 \bowtie R_{2a} is lossless. R2aR2b=R_{2a} \cap R_{2b} = \varnothing… This is problematic. R2b=CR_{2b} = \\{C\\} shares no attributes with The others.

The issue is that CC is a “dangling” attribute. This is correct — CC appears only in the key BCBC Of the original relation but is not functionally determined by anything except the full key. The Decomposition is technically correct but R2b=CR_{2b} = \\{C\\} by itself cannot be joined losslessly with The others.

This illustrates a limitation of BCNF decomposition: it may produce a relation with no common Attributes, making lossless join impossible to verify with the pairwise test. In practice, the 3NF Synthesis algorithm avoids this issue.

:::caution Common Pitfall Do not confuse partial dependency (2NF violation) with transitive dependency (3NF violation). A Partial dependency involves a proper subset of a candidate key determining a non-prime attribute. A transitive dependency involves a non-key attribute determining another non-prime attribute. :::

4.5 Multivalued Dependencies and 4NF

A multivalued dependency (MVD) XYX \twoheadrightarrow Y holds on relation RR if for any two Tuples t1,t2Rt_1, t_2 \in R with t1[X]=t2[X]t_1[X] = t_2[X]There exists a tuple t3Rt_3 \in R such that:

  • t3[X]=t1[X]t_3[X] = t_1[X]
  • t3[Y]=t1[Y]t_3[Y] = t_1[Y]
  • t3[RXY]=t2[RXY]t_3[R - X - Y] = t_2[R - X - Y]

: XX determines YY independently of RXYR - X - Y.

Trivial MVDs: XYX \twoheadrightarrow Y where YXY \subseteq X or XY=RX \cup Y = R.

Fourth Normal Form (4NF). RR is in 4NF if for every non-trivial MVD XYX \twoheadrightarrow Y That holds on RR, XX is a superkey.

Theorem 4.5. Every relation in 4NF is in BCNF.

Proof. Every FD XYX \to Y implies the MVD XYX \twoheadrightarrow Y. In 4NF, every non-trivial MVD XYX \twoheadrightarrow Y requires XX to be a superkey. Therefore, every non-trivial FD XYX \to Y Also requires XX to be a superkey, which is the BCNF condition. \blacksquare

4NF decomposition. Given RR with MVD XYX \twoheadrightarrow Y where XX is not a superkey, Decompose into R1=XYR_1 = X \cup Y and R2=RYR_2 = R - Y. The decomposition is lossless-join.

Worked Example 4.2: Decomposition to 4NF

Relation: CourseInstructor(Course, Instructor, Textbook) where each course can have multiple Instructors and multiple textbooks, independently.

MVDs: CourseInstructor\mathrm{Course} \twoheadrightarrow \mathrm{Instructor} CourseTextbook\mathrm{Course} \twoheadrightarrow \mathrm{Textbook}.

Sample data:

CourseInstructorTextbook
CS101SmithDB Systems
CS101LeeDB Systems
CS101SmithAlgorithm Design
CS101LeeAlgorithm Design

The redundancy is clear: each instructor-textbook pair is repeated for each course.

4NF check: CourseInstructor\mathrm{Course} \twoheadrightarrow \mathrm{Instructor} is non-trivial, and Course\mathrm{Course} is not a superkey. Violates 4NF.

Decompose:

  • CI(Course, Instructor) with MVD CourseInstructor\mathrm{Course} \twoheadrightarrow \mathrm{Instructor}
  • CT(Course, Textbook) with MVD CourseTextbook\mathrm{Course} \twoheadrightarrow \mathrm{Textbook}

Both are in 4NF (the determining attribute Course is a candidate key in each).

5. Indexing

5.1 B-Trees and B+ Trees

B-Tree of order mm. A balanced search tree where each internal node has between m/2\lceil m/2 \rceil and MM children. All leaves are at the same depth.

Properties:

  • Root: at least 2 children (unless it is a leaf).
  • Internal node with kk children has k1k - 1 keys.
  • All leaves at the same depth hh.
  • Height: h=O(logmn)h = O(\log_m n).

Operations:

  • Search: O(logmn)O(\log_m n). At each node, binary search the keys and follow the appropriate child pointer.
  • Insert: Find the correct leaf. If the leaf has room, insert. If full, split the leaf and propagate the median key up. May cascade splits to the root.
  • Delete: Find the key. If in an internal node, replace with predecessor/successor and delete from the leaf. If a node underflows (fewer than m/21\lceil m/2 \rceil - 1 keys), redistribute from a sibling or merge with a sibling.

B+ Tree. Variant where:

  • All data records are stored only in leaf nodes (internal nodes store only keys for routing).
  • Leaf nodes are linked in a linked list for efficient range queries.

This makes B+ trees the standard for database indexing. The linked leaf list allows range scans in O(logmn+k)O(\log_m n + k) where kk is the number of records in the range.

B+ Tree insert pseudocode:

def bplus_insert(root, key, value):
leaf = find_leaf(root, key)
insert_into_leaf(leaf, key, value)
if len(leaf.keys) > MAX_KEYS:
new_leaf, median = split_leaf(leaf)
insert_into_parent(leaf.parent, median, new_leaf)
Worked Example 5.1: B+ Tree Insertion

Insert keys 10, 20, 5, 15, 25, 30 into a B+ tree with maximum 2 keys per leaf node and maximum 2 Keys per internal node (order m=3m = 3).

Insert 10. Leaf: [10][10]

Insert 20. Leaf: [10,20][10, 20]

Insert 5. Leaf would be [5,10,20][5, 10, 20] — overflow (max 2 keys). Split: left leaf [5][5]Right Leaf [10,20][10, 20]. Push median (10) to new internal node.

Internal: [10]
├── Leaf: [5]
└── Leaf: [10, 20]

Insert 15. Goes to right leaf. Leaf would be [10,15,20][10, 15, 20] — overflow. Split: [10,15][10, 15] and [20][20]. Push 15 up.

Internal: [10, 15]
├── Leaf: [5]
├── Leaf: [10, 15]
└── Leaf: [20]

Insert 25. Goes to rightmost leaf. Leaf: [20,25][20, 25].

Insert 30. Leaf would be [20,25,30][20, 25, 30] — overflow. Split: [20,25][20, 25] and [30][30]. Push 25 up. Internal would be [10,15,25][10, 15, 25] — overflow (max 2 keys). Split internal: push 15 to new root.

Root: [15]
├── Internal: [10]
│ ├── Leaf: [5]
│ └── Leaf: [10, 15]
└── Internal: [25]
├── Leaf: [20, 25]
└── Leaf: [30]
Worked Example 5.2: B+ Tree Deletion

Starting from the tree in Example 5.1, delete key 25.

Delete 25. Found in rightmost leaf of the right internal node. Leaf becomes [30][30] (1 key, Minimum is 2/2=1\lceil 2/2 \rceil = 1). No underflow. Update internal node: key 25 changes to 30 (to maintain correct routing).

Root: [15]
├── Internal: [10]
│ ├── Leaf: [5]
│ └── Leaf: [10, 15]
└── Internal: [30]
├── Leaf: [20]
└── Leaf: [30]

Now delete 15. Leaf [10,15][10, 15] becomes [10][10]. No underflow. Internal node key 15 changes to 10. But wait — the internal node [10][10] would need to distinguish between leaves [5][5] and [10][10]. Since the left child contains keys <10\lt 10 and the right child contains keys 10\geq 10This Is still correct.

Now delete 10 from the left subtree’s right leaf. Leaf becomes empty — underflow.

Redistribution: Sibling leaf [5][5] has 1 key (at minimum). Cannot redistribute. Merge: Merge empty leaf with [5][5]. The merged leaf is [5][5]. The internal node [10][10] now has Only one child (the merged leaf), so it underflows.

Since the internal node is a child of the root, and the root has two children, we can merge: remove The internal node and promote its remaining child to be a direct child of the root.

Root: [30]
├── Leaf: [5]
├── Leaf: [10] (originally contained 10 before deletion)
└── Leaf: [30]

Wait — after deleting 10, the leaf [10][10] should become [][]And merging with [5][5] gives [5][5]. Let me retrace. The point is that deletion can trigger cascading merges up the tree.

5.2 Hash Indexes

Static hashing. A hash function h:K0,,B1h : K \to \\{0, \ldots, B - 1\\} maps keys to BB buckets. Average lookup: O(1)O(1) under uniform hashing. No support for range queries.

Collision handling. When two keys hash to the same bucket, the DBMS must resolve the collision:

  1. Separate chaining: Each bucket contains a linked list of entries. Lookup requires traversing the chain. Average chain length under uniform hashing is n/Bn / B.
  2. Open addressing (linear probing): If bucket h(k)h(k) is full, try h(k)+1h(k)+1, h(k)+2h(k)+2Etc. (mod BB). Prone to primary clustering: consecutive occupied slots increase the average probe length.
  3. Open addressing (quadratic probing): Try h(k)+12,h(k)+22,h(k)+32h(k) + 1^2, h(k) + 2^2, h(k) + 3^2Etc. Reduces clustering but may not probe all buckets.
  4. Open addressing (double hashing): Use a second hash function h2h_2: probe sequence is h(k)+ih2(k)h(k) + i \cdot h_2(k) for i=0,1,2,i = 0, 1, 2, \ldots. Minimises clustering.

Extendible hashing. Dynamically grows the number of buckets. Uses a global depth dd and Per-bucket local depth dd'.

  • Hash key to dd bits.
  • If the bucket is full and its local depth equals the global depth, double the directory (global depth increases by 1) and split the bucket.
  • If the bucket’s local depth is less than the global depth, split the bucket without doubling the directory.

Linear hashing. Gradually grows one bucket at a time using a pointer to the next bucket to split. Simpler than extendible hashing but may have slightly higher overflow probability.

Comparison:

PropertyB+ TreeHash Index
Point queryO(logmn)O(\log_m n)O(1)O(1) average
Range queryO(logmn+k)O(\log_m n + k)Not supported
Insert/DeleteO(logmn)O(\log_m n)O(1)O(1) average
SpaceNodes may be half-emptyMay have overflow chains
OrderMaintains key orderNo ordering

5.3 Bitmap Indexes

A bitmap index creates one bitmap per distinct value of an attribute. For a table with nn rows And attribute AA with values v1,,vk\\{v_1, \ldots, v_k\\}Store kk bitmaps of nn bits each, where Bitmap ii has a 1 in position jj if row jj has A=viA = v_i.

Use case: Low-cardinality columns (gender, status, country). Bitmap indexes support fast bitwise AND/OR for multi-criteria queries.

Bitwise operations for query evaluation:

QueryBitmap operation
A=v1A = v_1 AND B=v2B = v_2\mathrm{bitmap_}{A,v_1} AND \mathrm{bitmap_}{B,v_2}
A=v1A = v_1 OR A=v2A = v_2\mathrm{bitmap_}{A,v_1} OR \mathrm{bitmap_}{A,v_2}
Av1A \neq v_1NOT \mathrm{bitmap_}{A,v_1}

Compression. For columns with many distinct values, run-length encoding (WAH or BBC) compresses Bitmaps effectively while still supporting bitwise operations.

5.4 Query Processing Cost Estimation

The cost model estimates the number of disk I/O operations for a query plan. We assume the buffer Pool has BB pages and each disk page access costs one I/O.

Selection cost estimation:

Access methodCost (I/Os)
Full table scannR/B\lceil n_R / B \rceil (or nRn_R if BB pages available)
B+ tree equality searchlogf(nR)\log_f(n_R) leaf + 1 data page
B+ tree range searchlogf(nR)\log_f(n_R) leaf + rangepages\lvert\mathrm{range} pages\rvert
Hash equality search1 (ideal)

Where ff is the fanout (average number of children per internal node).

Join cost estimation (from Section 7):

AlgorithmCost
Block nested-loopnR+nR/(B2)nSn_R + \lceil n_R/(B-2)\rceil \cdot n_S
Sort-merge2nRlogB1(nR)+2nSlogB1(nS)+nR+nS2n_R \log_{B-1}(n_R) + 2n_S \log_{B-1}(n_S) + n_R + n_S
Hash join3(nR+nS)3(n_R + n_S) (if build relation fits in BB pages)
Worked Example 5.3: Comparing Access Methods

Scenario: Student table with 50000 tuples, 2500 pages (20 tuples per page). Buffer pool has 52 Pages. Attribute ID has a B+ tree index of height 3 (fanout 100). Query: SELECT * FROM Student WHERE ID = 12345.

Method 1: Full table scan. Cost = 2500 I/Os.

Method 2: B+ tree index on ID. Follow root \to internal \to internal \to leaf: 3 I/Os For the index traversal. Then 1 I/O for the data page. Total: 4 I/Os.

The B+ tree index is 625×625\times faster for a single equality lookup. However, for a query selecting 50% of the table, a full scan (2500 I/Os) would be faster than 25000 individual B+ tree lookups (25000×4=10000025000 \times 4 = 100000 I/Os). This is why the optimiser chooses between index and scan based on Selectivity estimates.

6. Transaction Management

6.1 ACID Properties

A transaction is a logical unit of work that must satisfy:

PropertyMeaning
AtomicityAll or nothing: either all operations commit or none
ConsistencyTransforms the database from one consistent state to another
IsolationConcurrent transactions do not interfere with each other
DurabilityOnce committed, the effects survive failures

6.2 Isolation Levels

SQL defines four isolation levels with increasing strictness:

LevelDirty ReadNon-repeatable ReadPhantom Read
Read UncommittedPossiblePossiblePossible
Read CommittedPreventedPossiblePossible
Repeatable ReadPreventedPreventedPossible
SerializablePreventedPreventedPrevented
  • Dirty read: Reading uncommitted data from another transaction.
  • Non-repeatable read: Same query returns different results within the same transaction due to updates by another transaction.
  • Phantom read: Same range query returns new rows due to inserts by another transaction.

Default levels: PostgreSQL: Read Committed. MySQL InnoDB: Repeatable Read. SQL Server: Read Committed. Oracle: Read Committed.

6.3 Serialisability

A schedule is serialisable if its effect is equivalent to some serial schedule (where Transactions execute one after another).

Conflict serialisability. Two operations conflict if they belong to different transactions, Access the same data item, and at least one is a write. A schedule is conflict-serialisable if its precedence graph (also called serialisation graph) is acyclic.

  • Precedence graph: Nodes are transactions. Draw a directed edge TiTjT_i \to T_j if an operation of TiT_i conflicts with and precedes an operation of TjT_j.

Theorem 6.1. A schedule is conflict-serialisable if and only if its precedence graph is acyclic.

Proof sketch. If the precedence graph has a cycle, no serial ordering can respect all the Precedence constraints, so the schedule is not conflict-serialisable. Conversely, a topological sort Of an acyclic graph gives a serial order equivalent to the schedule. \blacksquare

View serialisability. A schedule SS is view-serialisable if it is view-equivalent to a Serial schedule SS'. View equivalence requires:

  1. Initial read: If TiT_i reads the initial value of QQ in SSIt does so in SS'.
  2. Updated read: If TiT_i reads the value of QQ written by TjT_j in SSIt does so in SS'.
  3. Final write: If TiT_i performs the final write of QQ in SSIt does so in SS'.

Every conflict-serialisable schedule is view-serialisable, but the converse does not hold. Testing For view serialisability is NP-complete.

Worked Example 6.1: Testing Conflict Serialisability

Schedule: S:r1(A),w1(A),r2(A),w2(A),r1(B),w1(B),r2(B),w2(B)S: r_1(A), w_1(A), r_2(A), w_2(A), r_1(B), w_1(B), r_2(B), w_2(B).

Build precedence graph:

  • w1(A)w_1(A) before r2(A)r_2(A): edge T1T2T_1 \to T_2.
  • r2(A)r_2(A) before w1(A)w_1(A): No — r2(A)r_2(A) is after w1(A)w_1(A).
  • w2(A)w_2(A) before… Nothing comes after w2(A)w_2(A) on AA.
  • w1(B)w_1(B) before r2(B)r_2(B): edge T1T2T_1 \to T_2.
  • w2(B)w_2(B) before… Nothing comes after w2(B)w_2(B) on BB.

Precedence graph: T1T2T_1 \to T_2 (acyclic).

Equivalent serial schedule: T1,T2T_1, T_2 (topological order).

Now consider: S:r1(A),w1(A),r2(B),w2(B),r1(B),w1(B),r2(A),w2(A)S': r_1(A), w_1(A), r_2(B), w_2(B), r_1(B), w_1(B), r_2(A), w_2(A).

  • w1(A)w_1(A) before r2(A)r_2(A): T1T2T_1 \to T_2.
  • w2(B)w_2(B) before r1(B)r_1(B): T2T1T_2 \to T_1.

Precedence graph has cycle: T1T2T1T_1 \to T_2 \to T_1. Not conflict-serialisable.

6.4 Concurrency Control

Two-Phase Locking (2PL).

Protocol:

  1. Growing phase: Acquire locks; do not release any.
  2. Shrinking phase: Release locks; do not acquire any.

Theorem 6.2. 2PL guarantees conflict serialisability.

Proof. By the protocol, transaction TiT_i acquires all its locks before releasing any. If TjT_j Requests a lock held by TiT_i, TjT_j waits. This creates a total ordering on conflicting Operations, which corresponds to a serial schedule. \blacksquare

Variants:

  • Strict 2PL: All locks held until commit/abort. Prevents cascading aborts.
  • Rigorous 2PL: All locks held until commit/abort (same as strict 2PL for exclusive locks).

Problem with 2PL: Deadlocks. Detection (wait-for graph) or prevention (timestamp ordering).

Timestamp Ordering (TO). Each transaction receives a timestamp TS(T)TS(T). For conflicting Operations:

  • TiT_i reads QQ: if QQ was last written by TjT_j with TS(Tj)>TS(Ti)TS(T_j) \gt TS(T_i)Abort TiT_i.
  • TiT_i writes QQ: if QQ was last read by TjT_j with TS(Tj)>TS(Ti)TS(T_j) \gt TS(T_i)Abort TiT_i.

No deadlocks (no waiting), but may abort transactions unnecessarily.

Optimistic Concurrency Control (OCC). Assumes conflicts are rare. Three phases:

  1. Read phase: Transaction reads data and writes to private workspace. No locks acquired.
  2. Validation phase: Before commit, check that no data read by this transaction has been modified by a committed transaction since the read phase began.
  3. Write phase: If validation passes, apply private writes to the database. Otherwise, abort and restart.

Forward validation: Check against committed transactions. For each data item QQ read by TiT_iVerify that the transaction that last wrote QQ committed before TiT_i‘s read phase began. Backward validation: Check against active transactions.

OCC performs well when conflicts are rare but degrades under high contention (many restarts).

Worked Example 6.2: Optimistic Concurrency Control

Scenario: Two transactions T1T_1 and T2T_2 both read and update account balances.

  • T1T_1: Read A=100A = 100Read B=200B = 200Write A=150A = 150Write B=150B = 150 (transfer 50 from BB to AA).
  • T2T_2: Read A=100A = 100Write A=75A = 75 (withdraw 25).

Execution:

  1. T1T_1 read phase: Reads A=100A = 100, B=200B = 200 into private workspace.
  2. T2T_2 read phase: Reads A=100A = 100 into private workspace.
  3. T2T_2 validation phase: Checks if AA was modified by any committed transaction since T2T_2 started. No committed transactions modified AA. Validation passes.
  4. T2T_2 write phase: Writes A=75A = 75. T2T_2 commits.
  5. T1T_1 validation phase: Checks if AA or BB was modified by any committed transaction since T1T_1 started. T2T_2 committed and modified AA. Validation fails.
  6. T1T_1 is aborted and restarted. On restart, T1T_1 reads A=75A = 75, B=200B = 200And correctly computes A=125A = 125, B=150B = 150.

Multiversion Concurrency Control (MVCC). Maintain multiple versions of each data item.

  • Read operations read a consistent snapshot (a version visible at the start of the transaction).
  • Write operations create a new version; the old version remains for concurrent readers.
  • Conflicts between readers and writers are avoided (no read-write contention).

MVCC pseudocode (simplified):

def read(txn, item):
version = find_version(item, txn.start_ts)
return version.value
def write(txn, item, value):
if has_conflicting_write(item, txn.start_ts, txn.id):
abort(txn)
create_version(item, value, txn.id, txn.start_ts)

MVCC implementations:

  • PostgreSQL: Snapshot Isolation. Each transaction sees a snapshot of committed data at transaction start. Uses xmin/xmax on each tuple.
  • MySQL InnoDB: MVCC with undo logs. Each row has hidden columns for transaction ID and rollback pointer.
  • Oracle: Snapshot Isolation at the statement level by default.

MVCC anomaly: Write Skew. Two concurrent transactions read overlapping data and update Non-overlapping parts based on what they read. Serializable Snapshot Isolation (SSI) in PostgreSQL 9.1+ Detects and prevents this.

6.5 Log-Based Recovery

Write-Ahead Logging (WAL). Before writing a modified page to disk, write the log record describing The modification to stable storage.

WAL protocol:

  1. Before a data page is flushed to disk, all log records pertaining to that page must be on stable storage.
  2. Before a transaction commits, all its log records must be on stable storage.

Log records: [LSN, TxnID, PageID, BeforeImage, AfterImage, Commit/Abort].

ARIES (Algorithm for Recovery and Isolation Exploiting Semantics). The standard recovery Algorithm used in IBM DB2, PostgreSQL, SQL Server, and others:

  1. Analysis phase: Scan the log from the last checkpoint. Identify dirty pages and active transactions. Reconstruct the transaction table (which transactions were active at crash time) and the dirty page table (which pages may not have been written to disk).
  2. Redo phase: Scan forward from the earliest LSN in the dirty page table. Redo all logged updates (from both committed and uncommitted transactions) to restore the database to the exact state at the time of the crash. A record is redone only if the page’s on-disk LSN is less than the record’s LSN (ensuring idempotency).
  3. Undo phase: Scan backward from the end of the log. Undo all updates from transactions that were active at crash time (did not commit). Write compensation log records (CLRs) so that undo operations themselves are logged.

Key ARIES properties:

  • Idempotent: Recovery can be restarted safely (re-processing log records has no additional effect).
  • Steal/no-force: Uncommitted pages can be flushed to disk (steal), and committed pages need not be flushed (no-force). This improves performance at the cost of more recovery work.
  • Fine-grained logging: Log records track individual page modifications, enabling precise recovery.

Checkpoints. Periodically flush dirty pages to disk and write a checkpoint record. Reduces the Amount of log to scan during recovery.

// Simplified checkpoint pseudocode
void checkpoint() {
flush_all_dirty_pages();
write_log_record(CHECKPOINT, dirty_page_list, active_txns);
flush_log();
}

6.6 Buffer Management

The buffer pool caches frequently accessed disk pages in memory. Replacement policies include LRU, Clock, and variants. PostgreSQL uses a Clock sweep algorithm; MySQL InnoDB uses an LRU variant with a Midpoint insertion strategy to avoid scan pollution.

7. Query Optimisation

7.1 Query Processing Pipeline

SQLparseASTrewriteLogicalplanoptimisePhysicalplanexecuteResult\mathrm{SQL} \xrightarrow{\mathrm{parse} \mathrm{AST} \xrightarrow{\mathrm{rewrite} \mathrm{Logical} plan \xrightarrow{\mathrm{optimise} \mathrm{Physical} plan \xrightarrow{\mathrm{execute} \mathrm{Result}}}}}

7.2 Cost-Based Optimisation

The optimiser estimates the cost of alternative execution plans and chooses the cheapest.

Cost model. Cost = I/O cost (disk page accesses) + CPU cost. For disk-bound queries, I/O Dominates.

Catalog statistics: Table cardinality (nn), attribute value cardinality, number of distinct Values, histogram of value distribution, index information.

Selectivity estimation. For a predicate σA=v(R)\sigma_{A = v}(R)The selectivity is approximately 1/V(A,R)1 / V(A, R) where V(A,R)V(A, R) is the number of distinct values of AA in RR.

Predicate typeSelectivity estimate
A=vA = v1/V(A,R)1 / V(A, R)
A>vA \gt v(max(A)v)/(max(A)min(A))(\max(A) - v) / (\max(A) - \min(A))
A1=v1A2=v2A_1 = v_1 \land A_2 = v_21/V(A1)×1/V(A2)1 / V(A_1) \times 1 / V(A_2)
A1=v1A2=v2A_1 = v_1 \lor A_2 = v_21/V(A1)+1/V(A2)1/(V(A1)×V(A2))1/V(A_1) + 1/V(A_2) - 1/(V(A_1) \times V(A_2))

7.3 Join Algorithms

Nested-loop join. For each tuple in RRScan all of SS.

Cost=nRnSpageaccesses(worstcase)\mathrm{Cost} = n_R \cdot n_S \mathrm{ page} accesses (worst case)

If one relation fits in memory, buffer it and scan the other: cost = nR+nSn_R + n_S.

Block nested-loop join. Use BB buffer pages. Load blocks of RR into B2B - 2 buffers, scan SS With the remaining buffer.

Cost=nR+nR/(B2)nS\mathrm{Cost} = n_R + \lceil n_R / (B - 2) \rceil \cdot n_S

Sort-merge join. Sort both relations on the join attribute, then merge.

Cost=2nRlogB1(nR)+2nSlogB1(nS)+nR+nS\mathrm{Cost} = 2 \cdot n_R \cdot \log_{B-1}(n_R) + 2 \cdot n_S \cdot \log_{B-1}(n_S) + n_R + n_S

Efficient for large relations, especially when both are already sorted.

Hash join. Build a hash table on the smaller relation (build phase), then probe with the larger (probe phase).

Cost=3(nR+nS)(ifbuildrelationfitsinmemory)\mathrm{Cost} = 3 \cdot (n_R + n_S) \mathrm{ (if build relation fits in memory)}

Best for equi-joins when one relation fits in memory.

Index nested-loop join. For each tuple in RRUse an index on SS to find matching tuples.

Cost=nR(indexlookupcost)\mathrm{Cost} = n_R \cdot (\mathrm{index} lookup cost)

Efficient if SS has an index on the join attribute and nRn_R is small.

7.4 Query Plan Selection

The optimiser explores the space of equivalent logical plans and physical implementations. For kk Joins, the number of join orderings is O(k!)O(k!) (left-deep trees) or O(3k)O(3^k) (bushy trees). Practical optimisers use dynamic programming with pruning.

Heuristic transformations:

  • Push selections down (reduce intermediate result sizes).
  • Push projections down (reduce column widths).
  • Convert cross products to joins when possible.
  • Reorder joins based on estimated cardinalities.

8. NoSQL Overview

8.1 Motivation

NoSQL databases address limitations of relational databases for certain workloads:

  • Horizontal scalability across commodity hardware.
  • Flexible schemas (semi-structured data).
  • High write throughput for simple access patterns.
  • Handling unstructured or polymorphic data.

8.2 Document Stores

Store data as JSON/BSON documents. Each document can have a different structure.

Example (MongoDB):

{
"_id": 1,
"name": "Alice",
"courses": ["CS101", "MATH201"],
"address": { "city": "Cambridge", "zip": "02139" }
}

Strengths: Flexible schema, nested data, good for content management. Weaknesses: No joins (requires embedding or application-level joins), eventual consistency in Distributed mode, limited transaction support.

8.3 Key-Value Stores

Simplest model: each item is a key-value pair. Values are opaque blobs.

Example (Redis):

SET user:1001 '{"name":"Alice","email":"alice@univ.edu"}'
GET user:1001
EXPIRE user:1001 3600

Strengths: Extremely fast (in-memory), simple API, caching, session management. Weaknesses: No queries beyond key lookup, limited data modelling.

8.4 Graph Databases

Store nodes and edges with properties. Optimised for traversing relationships.

Example (Neo4j Cypher):

CREATE (a:Student {name: "Alice"})-[:ENROLLED_IN {grade: "A"}]->(c:Course {title: "Databases"})
MATCH (s:Student)-[:ENROLLED_IN]->(c:Course {title: "Databases"})
RETURN s.name, c.title

Strengths: Efficient for complex relationship queries, social networks, recommendation engines, Knowledge graphs. Weaknesses: Not suitable for simple CRUD, horizontal scaling is harder.

8.5 Column-Family Stores

Store data in column families (groups of related columns). Each row can have different columns.

Example (Apache Cassandra):

CREATE TABLE grades (
student_id text,
course_id text,
grade text,
semester text,
PRIMARY KEY ((student_id), course_id)
);

Strengths: High write throughput, efficient column-level reads, horizontal scalability, Time-series data. Weaknesses: No joins, limited query flexibility, eventual consistency.

8.6 CAP Theorem

Theorem 8.1 (Brewer’s CAP Theorem). A distributed data store can provide at most two of the Following three guarantees simultaneously:

  • Consistency (C): Every read receives the most recent write.
  • Availability (A): Every request receives a non-error response.
  • Partition tolerance (P): The system continues to operate despite network partitions.

Since network partitions are inevitable in distributed systems, the real trade-off is between consistency and availability during a partition.

SystemChoiceNotes
Redis ClusterCPPartition: some nodes unreachable
CassandraAPTunable consistency per operation
MongoDBCPPrimary unavailable during partition
RDBMS (with replication)CANot partition-tolerant (single node)

PACELC. Extension of CAP: in the absence of partitions, the trade-off is between latency And consistency.

:::caution Common Pitfall “NoSQL” does not mean “no SQL.” It means “Not Only SQL.” Many NoSQL databases now support SQL-like Query languages (e.g., Cassandra CQL). The choice between relational and NoSQL depends on the Workload, not on a blanket preference. Relational databases remain the best choice for strongly Structured data with complex queries and transactional requirements.

9. Distributed Databases

9.1 Architecture and Motivation

A distributed database stores data across multiple nodes connected by a network. Motivations Include:

  • Performance: Data locality reduces latency; parallel query processing increases throughput.
  • Availability: Replication allows the system to survive node failures.
  • Scalability: Horizontal scaling by adding more nodes.

Shared-nothing architecture. Each node has its own CPU, memory, and disk. Nodes communicate Only via the network. This is the dominant architecture for distributed databases.

Data fragmentation:

  • Horizontal fragmentation: Each fragment is a subset of rows (tuples) defined by a selection predicate. Example: partition Student by department.
  • Vertical fragmentation: Each fragment is a subset of columns. Example: store frequently accessed columns on fast nodes.
  • Hybrid fragmentation: A combination of horizontal and vertical.

9.2 Distributed Transactions

A distributed transaction involves operations on multiple nodes. The challenge is ensuring atomicity Across nodes.

Two-Phase Commit (2PC).

  • Phase 1 (Voting): The coordinator sends a PREPARE message to all participants. Each participant writes its changes to a local log, votes YES (ready to commit) or NO (abort).
  • Phase 2 (Decision): If all participants voted YESThe coordinator sends COMMIT; otherwise it sends ABORT. Participants apply the decision and acknowledge.

Theorem 9.1. 2PC guarantees atomicity of distributed transactions if no participant crashes Permanently and the log is on stable storage.

Proof. If the coordinator crashes after phase 1, participants that voted YES are blocked — they Cannot decide without knowing the coordinator’s decision. Upon recovery, the coordinator reads its Log to determine the decision and notifies participants. Since each participant wrote its vote to Stable storage before responding, no vote is lost. \blacksquare

Blocking problem. If the coordinator crashes after phase 1, participants that voted YES must Wait indefinitely (they are blocked). This is a fundamental limitation of 2PC.

Three-Phase Commit (3PC). Adds a PRE-COMMIT phase to eliminate blocking under the fail-stop Model:

  1. CanCommit: Coordinator asks if participants can commit. Participants reply YES or NO.
  2. PreCommit: If all YESCoordinator sends PRE-COMMIT. Participants acknowledge.
  3. DoCommit: Coordinator sends COMMIT. Participants commit and acknowledge.

If the coordinator crashes after PRE-COMMITAny participant can contact others to determine the Outcome (since at least one participant must have received PRE-COMMIT and can coordinate).

Limitation of 3PC: Assumes reliable communication (no lost messages), which is unrealistic in Practice. 3PC is rarely used in production systems.

Worked Example 9.2: Two-Phase Commit Walkthrough

Scenario: A bank transfer deducts 100 from Account A (on Node 1) and adds 100 to Account B (on Node 2). The coordinator manages the distributed transaction.

Normal execution:

  1. Coordinator sends PREPARE to Node 1 and Node 2.
  2. Node 1: locks Account A, writes -100 to log, votes YES.
  3. Node 2: locks Account B, writes +100 to log, votes YES.
  4. All votes are YES. Coordinator sends COMMIT to both nodes.
  5. Node 1 applies the write (Account A -= 100), unlocks, sends ACK.
  6. Node 2 applies the write (Account B += 100), unlocks, sends ACK.
  7. Coordinator writes COMMIT to its log. Transaction complete.

Node 1 votes NO (insufficient funds):

  1. Coordinator sends PREPARE to Node 1 and Node 2.
  2. Node 1: locks Account A, discovers balance is 50, votes NO.
  3. Coordinator receives NOSends ABORT to both nodes.
  4. Node 1 unlocks Account A (no changes applied).
  5. Node 2 unlocks Account B (no changes applied).
  6. Coordinator writes ABORT to its log. Transaction aborted.

Coordinator crashes after Phase 1:

  1. Both nodes voted YES and are waiting for the decision.
  2. Nodes are blocked: they hold locks and cannot proceed.
  3. When the coordinator recovers, it reads its log, sees that all votes were YES but no decision was recorded, and sends COMMIT to all participants.

9.3 Replication

Synchronous replication. A write is not acknowledged until all replicas have applied it.

  • Guarantees strong consistency (all replicas identical).
  • High latency (wait for the slowest replica).
  • Reduced availability during partitions (cannot acknowledge writes if a replica is unreachable).

Asynchronous replication. A write is acknowledged after the primary applies it; replicas are Updated in the background.

  • Low latency (acknowledged immediately).
  • Eventual consistency (replicas may lag behind).
  • Higher availability during partitions.

Replication architectures:

ArchitectureWritesReadsConsistency
Single-leaderPrimary onlyAny replicaStrong (reads from primary) or eventual (from replicas)
Multi-leaderAny nodeAny nodeEventual (conflict resolution required)
LeaderlessAny nodeAny nodeTunable (quorum reads/writes)

Quorum-based consistency. For a system with NN replicas, define write quorum WW and read quorum RR such that W+R>NW + R \gt N. This guarantees that any read sees at least one replica with the Latest write.

9.4 Consistency Models

Consistency models define the guarantees a distributed system provides about the order and visibility Of updates.

Strong consistency models:

  • Linearisability: Every operation appears to execute atomically at a single point in real time. The strongest consistency model. Equivalent to “serialisable + real-time ordering.”
  • Sequential consistency: All operations appear in some total order that is consistent with the program order of each individual process. Weaker than linearisability (no real-time requirement).

Weak consistency models:

  • Causal consistency: Operations that are causally related are seen by all processes in the same order. Concurrent (non-causally-related) operations may be seen in different orders.
  • Read-your-writes consistency: A process always sees its own writes.
  • Eventual consistency: If no new updates are made, all replicas will eventually converge to the same state. The weakest model; provides no ordering guarantees.
Worked Example 9.1: Quorum Read/Write

Scenario: N=5N = 5 replicas. Choose W=3W = 3, R=3R = 3. Note W+R=6>5=NW + R = 6 \gt 5 = N.

Write: Client writes value vv. Primary sends write to all 5 replicas. At least 3 acknowledge (W=3W = 3). Write is considered successful.

Read: Client reads from 3 replicas (R=3R = 3). Since W+R>NW + R \gt NAny read quorum overlaps With the write quorum, so the reader is guaranteed to see at least one replica with the latest Value. The reader returns the most recent version among the 3 responses.

During a partition: If 2 replicas are unreachable, writes require W=3W = 3 available replicas (OK, 3 are available). Reads require R=3R = 3 (OK). If 3 replicas are unreachable, writes fail (only 2 Available, need 3). This is the consistency-availability trade-off from CAP.

10. Problem Set

Problems 1—3: Introduction and Data Models

  1. Explain the three levels of the ANSI-SPARC architecture with an example. Show how logical data independence allows a new column to be added to the conceptual schema without modifying existing external views.

  2. Compare the relational and graph data models. Give two concrete scenarios where a graph database would be preferable to a relational database, explaining why.

  3. A university uses a relational database for student records and a document store for course materials. Discuss the benefits and challenges of this polyglot persistence approach.

Problems 4—7: Relational Model and Algebra

  1. Given relation R(A, B, C, D, E) with FDs F=ABC,CD,DEF = \\{AB \to C, C \to D, D \to E\\}: (a) Find all candidate keys. (b) Compute the attribute closure A,B+\\{A, B\\}^+. (c) Compute C+\\{C\\}^+ and D+\\{D\\}^+.

  2. Given R(A, B, C) with tuples (1,2,3),(1,2,4),(1,3,5),(2,2,3),(2,3,4)\\{(1,2,3), (1,2,4), (1,3,5), (2,2,3), (2,3,4)\\} and S(B, C) with tuples (2,3),(3,5)\\{(2,3), (3,5)\\}: (a) Compute σB=2(R)\sigma_{B=2}(R). (b) Compute RSR \bowtie S. (c) Compute R÷SR \div S.

  3. Express the following query in relational algebra: “Find the names of students who have enrolled in at least two courses taught by the same instructor.”

  4. Prove that the cross product is commutative: R×SS×RR \times S \equiv S \times R. Prove that the cross product is associative: (R×S)×TR×(S×T)(R \times S) \times T \equiv R \times (S \times T).

Problems 8—10: SQL

  1. Write a recursive SQL query that computes the total cost of all parts (direct and indirect) required to assemble product P100Given a BOM(parent_part, child_part, quantity) table.

  2. Write a SQL query that, for each department, returns the student with the highest GPA, their GPA, and the difference between their GPA and the department average. Use window functions only (no subqueries).

  3. Consider the Python code:

    query = f"DELETE FROM Student WHERE dept = '{dept}' AND gpa < {min_gpa}"

    (a) Identify the SQL injection vulnerability. (b) Show an input that exploits it. (c) Rewrite using parameterised queries.

Problems 11—14: Normalisation

  1. Given R(A, B, C, D, E) with FDs F=ABC,CDE,BD,EAF = \\{A \to BC, CD \to E, B \to D, E \to A\\}: (a) Find all candidate keys. (b) Determine the highest normal form of RR. Justify your answer. (c) If RR is not in BCNF, decompose it into BCNF relations. Verify each decomposition is lossless.

  2. Given R(A, B, C, D) with FDs F=AB,BC,CD,DAF = \\{A \to B, B \to C, C \to D, D \to A\\}: (a) Find all candidate keys. (b) Decompose into BCNF. Is the decomposition dependency-preserving? (c) Show that RR is actually already in BCNF.

  3. Prove Theorem 4.5: every relation in 4NF is in BCNF.

  4. Given `R(A, B, C, D)withMVDswith MVDsA \twoheadrightarrow BandandA \twoheadrightarrow C,andFD, and FDA \to D:(a)Findallcandidatekeys.(b)Is: (a) Find all candidate keys. (b) IsR$ in 4NF? If not, decompose into 4NF.

Problems 15—16: Indexing

  1. Insert the keys 8, 5, 1, 7, 3, 12, 9, 6 into a B+ tree of order 3 (maximum 2 keys per node). Show the tree after each operation that causes a split. How many leaf-level and internal-level splits occur in total?

  2. A table has 100000 tuples stored in 5000 pages. The buffer pool has 101 pages. Compare the estimated I/O cost of answering SELECT * FROM T WHERE A = 42 using: (a) A full table scan. (b) A B+ tree index on AA of height 3. (c) A static hash index on AA with 500 buckets and average chain length 0.4. State any assumptions you make.

Problems 17—18: Transaction Management

  1. For each schedule below, determine if it is conflict-serialisable by drawing the precedence graph. If serialisable, give the equivalent serial order. (a) r1(A),r2(A),w1(A),w2(A),r1(B),w2(B),w1(B)r_1(A), r_2(A), w_1(A), w_2(A), r_1(B), w_2(B), w_1(B) (b) r1(A),w1(A),r2(A),r2(B),w2(A),w2(B),r1(B),w1(B)r_1(A), w_1(A), r_2(A), r_2(B), w_2(A), w_2(B), r_1(B), w_1(B) (c) r1(A),w1(A),r2(B),w2(B),r3(A),w3(A),r3(B),w3(B)r_1(A), w_1(A), r_2(B), w_2(B), r_3(A), w_3(A), r_3(B), w_3(B)

  2. Explain the difference between strict 2PL and rigorous 2PL. Give a schedule that is allowed by basic 2PL but not by strict 2PL. Explain why strict 2PL is preferred in practice.

Problems 19—20: Distributed Databases

  1. Explain the two-phase commit protocol. Describe what happens in each of the following failure scenarios: (a) A participant crashes after voting YES but before receiving the coordinator’s decision. (b) The coordinator crashes after sending COMMIT to some but not all participants. (c) The coordinator crashes after phase 1 (all votes received) but before sending any decision.

  2. A distributed key-value store uses N=7N = 7 replicas with quorum settings W=4W = 4 and R=4R = 4. (a) Show that any read is guaranteed to see the latest write. (b) What is the maximum number of node failures the system can tolerate while still serving both reads and writes? (c) How does the system behave during a network partition that isolates 3 nodes from the remaining 4?

11. Further Reading

  • Silberschatz, Korth, Sudarshan: Database System Concepts (7th ed.).
  • Ramakrishnan, Gehrke: Database Management Systems (3rd ed.).
  • Elmasri, Navathe: Fundamentals of Database Systems (7th ed.).
  • Kleppmann: Designing Data-Intensive Applications (2017).

Common Pitfalls

  1. Forgetting to convert between units (e.g., cm3\text{cm}^3 to dm3\text{dm}^3) when calculating concentrations.

  2. Confusing enthalpy of formation with enthalpy of combustion, or using the wrong sign convention.

  3. Forgetting to balance equations before performing calculations — always check that atoms and charges balance on both sides.

  4. Assuming that a strong acid always has a lower pH than a weak acid without considering concentration.

Worked Examples

Example 1: SQL Joins and Aggregation

Problem. Given tables students(id, name, major) and grades(student_id, course, score), find the average score per major for courses with more than 5 enrolled students.

Solution.

SELECT s.major, AVG(g.score) AS avg_score
FROM students s
JOIN grades g ON s.id = g.student_id
GROUP BY s.major
HAVING COUNT(DISTINCT g.student_id) > 5
ORDER BY avg_score DESC;

The HAVING clause filters groups after aggregation, unlike WHERE which filters rows before.

\blacksquare

Example 2: ER to Relational Mapping

Problem. Map an ER diagram with entity Student (sid, name), entity Course (cid, title), and M:N relationship Enrolls with attribute grade to relational tables.

Solution. Three tables:

Student(sid PK, name)
Course(cid PK, title)
Enrolls(sid FK, cid FK, grade, PRIMARY KEY (sid, cid))

The M:N relationship becomes its own table with foreign keys to both entities and a composite primary key.

\blacksquare

Summary

  • Relational model: data stored in tables (relations) with attributes; keys (primary, foreign) enforce integrity.
  • SQL fundamentals: SELECT, JOIN (inner, left, right, full), GROUP BY, HAVING, subqueries.
  • Normalisation: 1NF (atomic values), 2NF (no partial dependencies), 3NF (no transitive dependencies), BCNF.
  • ACID properties: Atomicity, Consistency, Isolation, Durability — guarantee reliable transactions.
  • Indexing: B+ tree indexes speed up point and range queries; trade-off between query speed and update overhead.

Cross-References

TopicSiteLink
Advanced DatabasesWyattsNotesView
Advanced SQLWyattsNotesView
Algorithms and Data StructuresWyattsNotesView
Database Systems — CMU 15-445CMUView

:::