Database Systems
1. Introduction to Database Systems
1.1 What is a Database
A database is an organised collection of structured data, managed by a database management System (DBMS). A DBMS provides:
- Data definition (schema creation and modification).
- Data manipulation (query, insert, update, delete).
- Concurrency control (multiple users accessing data simultaneously).
- Recovery (restoring data after failures).
- Security and access control.
1.2 Database Architecture
Three-schema architecture (ANSI-SPARC):
The ANSI-SPARC (American National Standards Institute, Standards Planning and Requirements Committee) Architecture defines three levels of abstraction:
- External schema (view level): How different users/applications see the data. Each user group may have a different view. A view may hide or rename columns, join tables, or aggregate data.
- Conceptual schema (logical level): The logical structure of the entire database (tables, relationships, constraints). Describes entities, attributes, relationships, integrity constraints, and security information. This level is database-wide and is designed by the DBA.
- Internal schema (physical level): How data is stored on disk (indexes, file organisation, compression, encryption). Includes data structures, access paths, and storage allocation.
The DBMS maps between levels via the external/conceptual mapping (translates external views to The conceptual schema) and the conceptual/internal mapping (translates the conceptual schema to Internal storage).
Data independence:
- Logical data independence: The conceptual schema can change without affecting external views. Example: adding a new column to a table does not require modifying existing views that do not reference it.
- Physical data independence: The internal schema can change without affecting the conceptual schema. Example: changing from a B+ tree index to a hash index does not require modifying queries or table definitions.
1.3 Data Models
- Relational model: Data organised into tables (relations) with rows (tuples) and columns (attributes). Based on relational algebra and calculus.
- Entity-Relationship (ER) model: Conceptual design tool using entities, relationships, and attributes.
- Object-relational model: Extends the relational model with object-oriented features (inheritance, complex types, methods). Example: PostgreSQL supports table inheritance and array types.
- NoSQL models: Document, key-value, graph, column-family (covered in Section 8).
1.4 Comparison of Data Models
| Feature | Relational | Object-Relational | Document | Key-Value | Graph | Column-Family |
|---|---|---|---|---|---|---|
| Data structure | Tables | Tables + objects | JSON/BSON | Key-value pairs | Nodes + edges | Column families |
| Schema | Fixed | Fixed + extensible | Flexible | Schemaless | Flexible | Flexible per row |
| Query language | SQL | SQL + extensions | Varies | Get/Set/Scan | Cypher/Gremlin | CQL/SQL-like |
| ACID support | Full | Full | Limited | Limited | Per-node | Tunable |
| Scaling | Vertical | Vertical | Horizontal | Horizontal | Horizontal | Horizontal |
| Best for | Structured data, complex queries | Complex types, inheritance | Content management, APIs | Caching, sessions | Social networks, recommendations | Time series, logs, analytics |
Choosing a model. Relational databases remain the default for structured data with complex Queries and strong consistency requirements. NoSQL databases excel when horizontal scalability or Flexible schemas are paramount. The choice depends on the workload, not on a blanket preference.
2. Relational Model
2.1 Relations, Tuples, Attributes
A relation over attributes is a set of tuples where Each is drawn from the domain of . A relation is a subset of .
Properties of relations:
- Each relation has a unique name.
- Each attribute has a unique name within a relation.
- Tuples are unordered (a relation is a set).
- All tuples are distinct (no duplicates in a mathematical relation, though SQL tables allow them).
- Attributes are unordered.
2.2 Keys
Let be a relation schema.
- Superkey: A set of attributes such that no two distinct tuples agree on all attributes in .
- Candidate key: A minimal superkey (no proper subset is also a superkey).
- Primary key: One of the candidate keys, chosen by the designer.
- Foreign key: An attribute (or set) in relation that references the primary key of relation . Enforces referential integrity: every value of must appear as a value of in Or be
NULL.
Example. Student(ID, Name, Email). ID is a candidate key (unique). If Email is also unique, It is another candidate key. One is chosen as the primary key.
Theorem 2.1. Every relation has at least one candidate key (possibly the entire set of attributes).
2.3 Relational Algebra
Relational algebra provides a formal query language based on operations on relations.
Selection : Return tuples from satisfying condition .
Projection : Return a relation containing only attributes .
Union : All tuples in or (both must be union-compatible: same arity and Attribute domains).
Difference : Tuples in but not in .
Cartesian product : All pairs where and .
Rename : Rename relation to .
Natural join : Combine tuples from and that agree on all common attributes.
Theta join : .
Equi-join : A theta join where is an equality on specific Attributes. Keeps both join columns.
Left outer join : All tuples from Matched with where possible; NULL-padded otherwise.
Right outer join : All tuples from Matched with .
Full outer join : All tuples from both and .
Division : Tuples in such that for every tuple , .
Example. Find students who have taken all courses:
Worked Example 2.1: Complex Relational Algebra Query
Query: Find the names of students who have enrolled in every course offered by the CS department.
Schema: Student(sid, name, dept)``Course(cid, title, dept)``Takes(sid, cid, semester).
Step 1. Get CS course IDs:
Step 2. Get student-course pairs from enrolments:
Step 3. Students who have taken all CS courses (division):
Step 4. Get names:
Combined:
Worked Example 2.2: Relational Algebra with Outer Joins
Query: Find all students and the number of CS courses they have taken, including students who have Taken no CS courses (count should be 0).
Step 1. Filter enrolments to CS courses:
Step 2. Left outer join with Student to include those with no CS courses:
Note: aggregation over outer join results handles NULL values (they are excluded from COUNT).
In SQL, this translates directly to:
SELECT S.sid, S.name, COUNT(E.cid) AS num_cs_coursesFROM Student SLEFT JOIN Takes T ON S.sid = T.sidLEFT JOIN Course C ON T.cid = C.cid AND C.dept = 'CS'GROUP BY S.sid, S.name;2.4 Relational Calculus
Tuple relational calculus. A query has the form where is a tuple variable And is a well-formed formula. The formula is built from:
- Atoms: (tuple is in relation ), (comparison), (comparison with constant), where .
- Logical connectives: (and), (or), (not).
- Quantifiers: (there exists), (for all).
Domain relational calculus. Variables range over individual attribute domains (not entire tuples). A query has the form .
Safety. A calculus expression is safe if it yields a finite relation. The expression is unsafe (it includes every tuple not in An infinite set). We Restrict variables to domains of the relations appearing in the query.
Theorem 2.2 (Codd). Relational algebra and relational calculus (both tuple and domain) are Equally expressive: every query expressible in one is expressible in the other.
Worked Example 2.3: Tuple Relational Calculus
Query: Find students who have taken no CS course.
Using tuple relational calculus:
Translation to relational algebra:
2.5 Equivalence Rules for Relational Algebra
The following rules allow the optimiser to transform queries without changing their meaning. Each rule States that two expressions produce the same relation for any input relations.
Rule 1 (Cascade of selections). .
Proof. A tuple satisfies if and only if it satisfies both and . Applying first removes tuples failing ; then removes tuples failing among the remainder. The result is exactly the Tuples satisfying both conditions.
Rule 2 (Commutativity of selections). .
Proof. Immediate from the commutativity of logical conjunction ().
Rule 3 (Selection pushdown through cross product). If involves only attributes of Then .
Proof. For each pair with and The condition depends only on . Filtering by on is equivalent to first filtering by And then forming the cross product, since does not affect the result of .
Rule 4 (Selection pushdown through join). If involves only attributes of Then .
Proof. The join combines matching pairs from and . Applying After the join filters these pairs by on ‘s attributes. Filtering first removes Non-matching -tuples before the join, yielding the same final set of pairs.
Rule 5 (Commutativity of joins). .
Proof. The natural join combines tuples agreeing on common attributes. This relation is symmetric In and .
Rule 6 (Associativity of joins). .
Proof. Both expressions produce tuples from that agree on all attributes Shared by any pair of the three relations.
Rule 7 (Projection pushdown). If and includes all attributes used in Join conditions, then .
Proof. Projecting after the join retains only attributes in . Since includes all Attributes needed for the join, performing the join on projected versions of and yields the Same join result, from which is then projected.
These rules form the foundation of heuristic query optimisation (see Section 7).
3. SQL
3.1 Data Definition Language (DDL)
CREATE TABLE Student ( ID INT PRIMARY KEY, Name VARCHAR(100) NOT NULL, Email VARCHAR(255) UNIQUE, GPA DECIMAL(3,2) CHECK (GPA >= 0.0 AND GPA <= 4.0), Dept VARCHAR(50) DEFAULT 'Undeclared');
CREATE TABLE Enrolment ( StudentID INT REFERENCES Student(ID) ON DELETE CASCADE, CourseID VARCHAR(10) REFERENCES Course(CourseID), Grade CHAR(2), Semester VARCHAR(6), PRIMARY KEY (StudentID, CourseID, Semester));
CREATE INDEX idx_student_dept ON Student(Dept);CREATE INDEX idx_enrolment_student ON Enrolment(StudentID);Constraints:
| Constraint | Enforcement |
|---|---|
PRIMARY KEY | Unique, not null |
FOREIGN KEY | Referential integrity |
UNIQUE | No duplicate values |
NOT NULL | No null values |
CHECK | Arbitrary boolean condition |
DEFAULT | Default value if none specified |
3.2 Data Manipulation Language (DML)
INSERT INTO Student (ID, Name, Email, GPA)VALUES (1, 'Alice', 'alice@univ.edu', 3.9);
UPDATE Student SET GPA = 3.95 WHERE ID = 1;
DELETE FROM Student WHERE GPA < 1.0;3.3 Queries
SELECT S.Name, S.GPA, C.TitleFROM Student SJOIN Enrolment E ON S.ID = E.StudentIDJOIN Course C ON E.CourseID = C.CourseIDWHERE E.Grade IN ('A', 'A-') AND C.Dept = 'CS' AND S.GPA > 3.5ORDER BY S.GPA DESC;Aggregate functions: COUNT``SUM``AVG``MIN``MAX.
SELECT Dept, COUNT(*) AS NumStudents, AVG(GPA) AS AvgGPAFROM StudentGROUP BY DeptHAVING COUNT(*) > 10ORDER BY AvgGPA DESC;WHERE vs. HAVING: WHERE filters rows before grouping; HAVING filters groups after.
3.4 Joins
-- Inner join: only matching rowsSELECT * FROM Student S JOIN Enrolment E ON S.ID = E.StudentID;
-- Left join: all students, with enrolment data where availableSELECT * FROM Student S LEFT JOIN Enrolment E ON S.ID = E.StudentID;
-- Self-join: find students with the same nameSELECT S1.Name, S1.ID, S2.IDFROM Student S1 JOIN Student S2 ON S1.Name = S2.Name AND S1.ID < S2.ID;
-- Cross join: Cartesian productSELECT * FROM Student CROSS JOIN Course;3.5 Subqueries
-- Scalar subquerySELECT Name FROM StudentWHERE GPA > (SELECT AVG(GPA) FROM Student);
-- Correlated subquerySELECT S.Name, S.DeptFROM Student SWHERE GPA = ( SELECT MAX(GPA) FROM Student S2 WHERE S2.Dept = S.Dept);
-- EXISTS / NOT EXISTSSELECT Name FROM Student SWHERE EXISTS ( SELECT 1 FROM Enrolment E WHERE E.StudentID = S.ID AND E.Grade = 'A+');
-- IN subquerySELECT Name FROM StudentWHERE ID IN (SELECT StudentID FROM Enrolment WHERE CourseID = 'CS101');
-- WITH (Common Table Expression)WITH DeptAvg AS ( SELECT Dept, AVG(GPA) AS AvgGPA FROM Student GROUP BY Dept)SELECT S.Name, S.GPA, D.AvgGPAFROM Student S JOIN DeptAvg D ON S.Dept = D.DeptWHERE S.GPA > D.AvgGPA;3.6 Window Functions
Window functions compute values across a set of rows related to the current row, without collapsing Rows (unlike GROUP BY).
SELECT Name, Dept, GPA, RANK() OVER (PARTITION BY Dept ORDER BY GPA DESC) AS DeptRank, DENSE_RANK() OVER (ORDER BY GPA DESC) AS OverallRank, AVG(GPA) OVER (PARTITION BY Dept) AS DeptAvgGPA, LAG(GPA, 1) OVER (PARTITION BY Dept ORDER BY GPA DESC) AS PrevGPAFROM Student;RANK() vs. DENSE_RANK(): RANK() leaves gaps after ties; DENSE_RANK() does not.
Other window functions: ROW_NUMBER()``LEAD()``FIRST_VALUE()``LAST_VALUE() NTILE(n)``SUM() OVER(...)``COUNT() OVER(...).
Worked Example 3.1: Running Totals and Percentiles
Running total of enrolments per department over time:
SELECT Dept, Semester, COUNT(*) AS Enrolments, SUM(COUNT(*)) OVER (PARTITION BY Dept ORDER BY Semester ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotalFROM Enrolment E JOIN Course C ON E.CourseID = C.CourseIDGROUP BY Dept, SemesterORDER BY Dept, Semester;Percentile rank of each student’s GPA within their department:
SELECT Name, Dept, GPA, PERCENT_RANK() OVER (PARTITION BY Dept ORDER BY GPA) AS Percentile, NTILE(4) OVER (PARTITION BY Dept ORDER BY GPA) AS QuartileFROM Student;PERCENT_RANK() returns a value in representing the fractional rank. NTILE(4) divides The partition into 4 approximately equal groups (quartiles).
3.7 Views
CREATE VIEW CSStudents ASSELECT ID, Name, GPA FROM Student WHERE Dept = 'CS';
CREATE VIEW DeptStats ASSELECT Dept, COUNT(*) AS N, AVG(GPA) AS AvgGPAFROM Student GROUP BY Dept;Views are virtual tables. Updatable views (single-table, no aggregation) allow INSERT UPDATE``DELETE through the view. Materialised views store the result physically and must be Refreshed.
3.8 Recursive Common Table Expressions
Recursive CTEs allow queries on hierarchical or graph-structured data. A recursive CTE consists of A base case (non-recursive term) and a recursive step (referencing the CTE itself), joined By UNION ALL.
Worked Example 3.2: Employee-Manager Hierarchy
Schema: Employee(emp_id, name, manager_id).
WITH RECURSIVE OrgChart AS ( SELECT emp_id, name, manager_id, 1 AS level FROM Employee WHERE manager_id IS NULL
UNION ALL
SELECT e.emp_id, e.name, e.manager_id, o.level + 1 FROM Employee e JOIN OrgChart o ON e.manager_id = o.emp_id)SELECT level, name, emp_idFROM OrgChartORDER BY level, name;The base case selects the CEO (no manager). Each recursive step finds all employees whose manager Appears in the previous iteration. The recursion terminates when no new employees are found.
Worked Example 3.3: Bill of Materials (Parts Explosion)
Schema: Part(part_id, name) and BOM(assembly_id, component_id, quantity).
Find all components (direct and indirect) of assembly A100 and the total quantity needed:
WITH RECURSIVE PartsList AS ( SELECT component_id AS part_id, quantity FROM BOM WHERE assembly_id = 'A100'
UNION ALL
SELECT b.component_id, p.quantity * b.quantity FROM PartsList p JOIN BOM b ON p.part_id = b.assembly_id)SELECT part_id, SUM(quantity) AS total_qtyFROM PartsListGROUP BY part_idORDER BY total_qty DESC;The recursive step multiplies the quantity at each level to accumulate the total needed quantity for Each sub-component.
3.9 SQL Injection Prevention
SQL injection occurs when user input is concatenated directly into a SQL string, allowing an Attacker to manipulate the query structure.
Vulnerable code (Python):
cursor.execute(f"SELECT * FROM Student WHERE name = '{user_input}'")If user_input is ' OR '1'='1The query becomes SELECT * FROM Student WHERE name = '' OR '1'='1' Returning all rows.
Types of SQL injection:
| Type | Description |
|---|---|
| In-band (error-based) | Attacker extracts data from error messages |
| In-band (UNION-based) | Attacker appends UNION SELECT to extract other tables |
| Blind (boolean) | Attacker infers data by asking true/false questions |
| Blind (time-based) | Attacker infers data by measuring response delays |
| Out-of-band | Attacker triggers the DBMS to send data externally |
Prevention:
- Parameterised queries (prepared statements): The DBMS treats parameters as data, never as SQL code.
cursor.execute("SELECT * FROM Student WHERE name = %s", (user_input,))- Input validation: Reject or sanitise input that does not match expected patterns (e.g., email format, numeric ID).
- Least privilege: Application accounts should have only the permissions they need.
- ORM frameworks: Use an ORM (e.g., SQLAlchemy, Django ORM) that generates parameterised queries by default.
:::caution Common Pitfall Even if you escape single quotes in user input, this is not a reliable defence against SQL injection. Use parameterised queries instead. Escape-based defences are fragile and have been bypassed by Encoding tricks (e.g., multibyte character exploits). :::
3.10 Query Optimisation Basics
The SQL query optimiser automatically selects an execution plan, but understanding the basics helps Developers write efficient queries.
Using EXPLAIN:
EXPLAIN ANALYZESELECT S.Name FROM Student SJOIN Enrolment E ON S.ID = E.StudentIDWHERE E.CourseID = 'CS101';EXPLAIN shows the estimated plan; EXPLAIN ANALYZE also runs the query and shows actual timings. Key information: sequential scan vs. Index scan, join algorithm used, estimated vs. Actual row counts, And execution time.
Common optimisation guidelines:
- Create appropriate indexes on columns used in
WHERE``JOIN``ORDER BYAndGROUP BYclauses. - Avoid
SELECT *; retrieve only the columns needed. - Avoid N+1 queries: Fetching related data row-by-row instead of with a single join or batch query.
- Use
LIMITduring development and testing to avoid processing large result sets. - Prefer
EXISTSoverINfor subqueries when checking for existence (often more efficient). - Keep statistics updated:
ANALYZE(PostgreSQL) orANALYZE TABLE(MySQL) updates the statistics the optimiser uses for cost estimation.
Worked Example 3.4: Identifying and Fixing a Slow Query
Slow query:
SELECT S.Name, S.Dept, C.TitleFROM Student S, Enrolment E, Course CWHERE S.ID = E.StudentID AND E.CourseID = C.CourseID AND C.Dept = 'CS' AND S.GPA > 3.5ORDER BY S.Name;Problems identified via EXPLAIN ANALYZE:
- Implicit cross-join syntax (
FROM S, E, C) instead of explicitJOIN. While the optimiser handles this, explicit joins are clearer and less error-prone. - No index on
C.Dept. The optimiser performs a full scan onCourse. - No index on
E.StudentIDorE.CourseID. Nested-loop joins scanEnrolmentfor each row.
Improved version:
SELECT S.Name, S.Dept, C.TitleFROM Student SJOIN Enrolment E ON S.ID = E.StudentIDJOIN Course C ON E.CourseID = C.CourseIDWHERE C.Dept = 'CS' AND S.GPA > 3.5ORDER BY S.Name;
CREATE INDEX idx_course_dept ON Course(Dept);CREATE INDEX idx_enrolment_student ON Enrolment(StudentID);CREATE INDEX idx_enrolment_course ON Enrolment(CourseID);With these indexes, the optimiser can use an index scan on Course(Dept) to find CS courses, then Use nested-loop index joins to find matching enrolments and students, avoiding full table scans.
4. Normalisation
4.1 Functional Dependencies
A functional dependency holds on relation if for any two tuples in implies .
Armstrong’s axioms:
- Reflexivity: If Then .
- Augmentation: If Then .
- Transitivity: If and Then .
Derived rules:
- Union: If and Then .
- Decomposition: If Then and .
- Pseudotransitivity: If and Then .
Attribute closure. is the set of all attributes functionally determined by . Computed By starting with and repeatedly applying Armstrong’s axioms until no new attributes can be added.
Theorem 4.1. holds if and only if .
4.2 Normal Forms
First Normal Form (1NF). All attribute values are atomic (indivisible). No repeating groups or Nested relations.
Second Normal Form (2NF). In 1NF, and no non-prime attribute is partially dependent on any Candidate key. Equivalently: every non-prime attribute is fully functionally dependent on every Candidate key.
A partial dependency is where is a proper subset of a candidate key and is Non-prime.
Third Normal Form (3NF). In 2NF, and for every non-trivial FD in Either is a Superkey or is a prime attribute.
A prime attribute is an attribute that belongs to some candidate key.
Boyce-Codd Normal Form (BCNF). For every non-trivial FD in , is a superkey.
Theorem 4.2. Every relation in BCNF is in 3NF, but the converse does not hold.
Proof. BCNF requires to be a superkey for every non-trivial FD . 3NF allows to be A superkey or to be prime. Since the BCNF condition is stricter, every BCNF relation is in 3NF. For the converse, consider with FDs and . Candidate keys: , . does not violate 3NF ( is prime) but violates BCNF ( is not a Superkey).
4.3 Decomposition
A decomposition of into must satisfy:
- Lossless-join: . No information is lost.
- Dependency preservation: Every FD in the original schema is implied by the FDs in the decomposed schemas.
Theorem 4.3 (Lossless-join test). A decomposition of into is lossless if and Only if or .
Proof. Let be an instance of and let , . We must Show under the given condition. Since and are projections of Every tuple in agrees with some tuple of on every attribute. It suffices to show That no spurious tuple is produced. Suppose where and . Since and agree on And by the condition (WLOG), there exists with . Since and , we have . Similarly . Thus corresponds to a real tuple So no spurious tuple exists.
Dependency preservation. Let be the set of FDs for . The decomposition Preserves dependencies if the closure of (where is the restriction Of to ) equals . In practice, we check that every FD in a minimal cover of can be Tested within a single .
Dependency preservation is important for efficient constraint checking: when updating The DBMS can verify relevant FDs locally without joining all decomposed relations.
4.4 Normalisation Examples
Example 1. Enrolment(StudentID, CourseID, StudentName, Dept, Grade) with FDs: , .
- Candidate key: .
- 1NF: satisfied (atomic values).
- 2NF violation: is a partial dependency (StudentID is a proper subset of the key).
- Decompose:
Student(StudentID, StudentName, Dept)andEnrolment(StudentID, CourseID, Grade). Both are in 3NF and BCNF.
Example 2 (3NF but not BCNF). CourseOffering(Course, Instructor, Room, Time) with FDs: .
- Candidate key: .
- 3NF: — Instructor is not a superkey, but Room is not prime. Wait — Room is not prime (not in any candidate key). So this actually violates 3NF too.
Let us correct: CourseOffering(Course, Instructor, Textbook) with FDs: .
- Candidate key: (since Course determines everything transitively).
- 3NF: . Instructor is not a superkey. Textbook is not prime. Violates 3NF.
Better example. Class(Course, Instructor, Student) with FDs: .
- Candidate keys: , .
- 3NF check for : Instructor is not a superkey. But Course is prime (in candidate key ). So 3NF is satisfied.
- BCNF check: violates BCNF (Instructor is not a superkey).
Decompose: Teaches(Instructor, Course) and Attends(Instructor, Student). This is lossless ( is common and holds in Teaches). But the dependency is Not preserved.
Theorem 4.4. Not every relation can be decomposed into BCNF while preserving dependencies. 3NF Is the strongest normal form guaranteeing dependency-preserving, lossless-join decomposition.
Worked Example 4.1: BCNF Decomposition Algorithm
Relation: with FDs .
Step 1. Find candidate keys. Compute . . So is a candidate key. Check : . So is Also a candidate key. Prime attributes: .
Step 2. Check BCNF. : is not a superkey (since ). Violates BCNF. Decompose on : and .
Step 3. is in BCNF (FD with as superkey).
For with FDs restricted to : involves which is not in So it is dropped. holds. Also, involves not in So it is dropped. Check for implied FDs: in implies (augmentation), so in (since ). But is not a superkey of … Wait, let us recheck.
Actually, from and the fact that is not in We cannot directly use In . We should compute the projection of onto : . So the only non-trivial FD in is . Is a superkey of ? (missing ). So is not a superkey of .
Hmm, we need to be more careful. Let us check if is determined by in the original . in Which does not include . So is not determined.
This means has no non-trivial FDs that hold (other than keys determining all Attributes). Check: candidate keys of must be superkeys. Since holds but does Not determine We need as a key: (all of ). So in , is a candidate key (since it determines all attributes of : and in , so ). is in BCNF since the only non-trivial FD is and we need to check if is a superkey of . Since , is not a superkey.
But wait — there are no other non-trivial FDs in . The only one is Which violates BCNF. Decompose: and .
Hmm, alone is a relation with no non-trivial FDs, so it is in BCNF. With : is a superkey (it determines all three attributes). BCNF.
Final decomposition: , , .
Lossless check: , (from ), so is lossless. … This is problematic. shares no attributes with The others.
The issue is that is a “dangling” attribute. This is correct — appears only in the key Of the original relation but is not functionally determined by anything except the full key. The Decomposition is technically correct but by itself cannot be joined losslessly with The others.
This illustrates a limitation of BCNF decomposition: it may produce a relation with no common Attributes, making lossless join impossible to verify with the pairwise test. In practice, the 3NF Synthesis algorithm avoids this issue.
:::caution Common Pitfall Do not confuse partial dependency (2NF violation) with transitive dependency (3NF violation). A Partial dependency involves a proper subset of a candidate key determining a non-prime attribute. A transitive dependency involves a non-key attribute determining another non-prime attribute. :::
4.5 Multivalued Dependencies and 4NF
A multivalued dependency (MVD) holds on relation if for any two Tuples with There exists a tuple such that:
: determines independently of .
Trivial MVDs: where or .
Fourth Normal Form (4NF). is in 4NF if for every non-trivial MVD That holds on , is a superkey.
Theorem 4.5. Every relation in 4NF is in BCNF.
Proof. Every FD implies the MVD . In 4NF, every non-trivial MVD requires to be a superkey. Therefore, every non-trivial FD Also requires to be a superkey, which is the BCNF condition.
4NF decomposition. Given with MVD where is not a superkey, Decompose into and . The decomposition is lossless-join.
Worked Example 4.2: Decomposition to 4NF
Relation: CourseInstructor(Course, Instructor, Textbook) where each course can have multiple Instructors and multiple textbooks, independently.
MVDs: .
Sample data:
| Course | Instructor | Textbook |
|---|---|---|
| CS101 | Smith | DB Systems |
| CS101 | Lee | DB Systems |
| CS101 | Smith | Algorithm Design |
| CS101 | Lee | Algorithm Design |
The redundancy is clear: each instructor-textbook pair is repeated for each course.
4NF check: is non-trivial, and is not a superkey. Violates 4NF.
Decompose:
CI(Course, Instructor)with MVDCT(Course, Textbook)with MVD
Both are in 4NF (the determining attribute Course is a candidate key in each).
5. Indexing
5.1 B-Trees and B+ Trees
B-Tree of order . A balanced search tree where each internal node has between and children. All leaves are at the same depth.
Properties:
- Root: at least 2 children (unless it is a leaf).
- Internal node with children has keys.
- All leaves at the same depth .
- Height: .
Operations:
- Search: . At each node, binary search the keys and follow the appropriate child pointer.
- Insert: Find the correct leaf. If the leaf has room, insert. If full, split the leaf and propagate the median key up. May cascade splits to the root.
- Delete: Find the key. If in an internal node, replace with predecessor/successor and delete from the leaf. If a node underflows (fewer than keys), redistribute from a sibling or merge with a sibling.
B+ Tree. Variant where:
- All data records are stored only in leaf nodes (internal nodes store only keys for routing).
- Leaf nodes are linked in a linked list for efficient range queries.
This makes B+ trees the standard for database indexing. The linked leaf list allows range scans in where is the number of records in the range.
B+ Tree insert pseudocode:
def bplus_insert(root, key, value): leaf = find_leaf(root, key) insert_into_leaf(leaf, key, value) if len(leaf.keys) > MAX_KEYS: new_leaf, median = split_leaf(leaf) insert_into_parent(leaf.parent, median, new_leaf)Worked Example 5.1: B+ Tree Insertion
Insert keys 10, 20, 5, 15, 25, 30 into a B+ tree with maximum 2 keys per leaf node and maximum 2 Keys per internal node (order ).
Insert 10. Leaf:
Insert 20. Leaf:
Insert 5. Leaf would be — overflow (max 2 keys). Split: left leaf Right Leaf . Push median (10) to new internal node.
Internal: [10] ├── Leaf: [5] └── Leaf: [10, 20]Insert 15. Goes to right leaf. Leaf would be — overflow. Split: and . Push 15 up.
Internal: [10, 15] ├── Leaf: [5] ├── Leaf: [10, 15] └── Leaf: [20]Insert 25. Goes to rightmost leaf. Leaf: .
Insert 30. Leaf would be — overflow. Split: and . Push 25 up. Internal would be — overflow (max 2 keys). Split internal: push 15 to new root.
Root: [15] ├── Internal: [10] │ ├── Leaf: [5] │ └── Leaf: [10, 15] └── Internal: [25] ├── Leaf: [20, 25] └── Leaf: [30]Worked Example 5.2: B+ Tree Deletion
Starting from the tree in Example 5.1, delete key 25.
Delete 25. Found in rightmost leaf of the right internal node. Leaf becomes (1 key, Minimum is ). No underflow. Update internal node: key 25 changes to 30 (to maintain correct routing).
Root: [15] ├── Internal: [10] │ ├── Leaf: [5] │ └── Leaf: [10, 15] └── Internal: [30] ├── Leaf: [20] └── Leaf: [30]Now delete 15. Leaf becomes . No underflow. Internal node key 15 changes to 10. But wait — the internal node would need to distinguish between leaves and . Since the left child contains keys and the right child contains keys This Is still correct.
Now delete 10 from the left subtree’s right leaf. Leaf becomes empty — underflow.
Redistribution: Sibling leaf has 1 key (at minimum). Cannot redistribute. Merge: Merge empty leaf with . The merged leaf is . The internal node now has Only one child (the merged leaf), so it underflows.
Since the internal node is a child of the root, and the root has two children, we can merge: remove The internal node and promote its remaining child to be a direct child of the root.
Root: [30] ├── Leaf: [5] ├── Leaf: [10] (originally contained 10 before deletion) └── Leaf: [30]Wait — after deleting 10, the leaf should become And merging with gives . Let me retrace. The point is that deletion can trigger cascading merges up the tree.
5.2 Hash Indexes
Static hashing. A hash function maps keys to buckets. Average lookup: under uniform hashing. No support for range queries.
Collision handling. When two keys hash to the same bucket, the DBMS must resolve the collision:
- Separate chaining: Each bucket contains a linked list of entries. Lookup requires traversing the chain. Average chain length under uniform hashing is .
- Open addressing (linear probing): If bucket is full, try , Etc. (mod ). Prone to primary clustering: consecutive occupied slots increase the average probe length.
- Open addressing (quadratic probing): Try Etc. Reduces clustering but may not probe all buckets.
- Open addressing (double hashing): Use a second hash function : probe sequence is for . Minimises clustering.
Extendible hashing. Dynamically grows the number of buckets. Uses a global depth and Per-bucket local depth .
- Hash key to bits.
- If the bucket is full and its local depth equals the global depth, double the directory (global depth increases by 1) and split the bucket.
- If the bucket’s local depth is less than the global depth, split the bucket without doubling the directory.
Linear hashing. Gradually grows one bucket at a time using a pointer to the next bucket to split. Simpler than extendible hashing but may have slightly higher overflow probability.
Comparison:
| Property | B+ Tree | Hash Index |
|---|---|---|
| Point query | average | |
| Range query | Not supported | |
| Insert/Delete | average | |
| Space | Nodes may be half-empty | May have overflow chains |
| Order | Maintains key order | No ordering |
5.3 Bitmap Indexes
A bitmap index creates one bitmap per distinct value of an attribute. For a table with rows And attribute with values Store bitmaps of bits each, where Bitmap has a 1 in position if row has .
Use case: Low-cardinality columns (gender, status, country). Bitmap indexes support fast bitwise AND/OR for multi-criteria queries.
Bitwise operations for query evaluation:
| Query | Bitmap operation |
|---|---|
| AND | \mathrm{bitmap_}{A,v_1} AND \mathrm{bitmap_}{B,v_2} |
| OR | \mathrm{bitmap_}{A,v_1} OR \mathrm{bitmap_}{A,v_2} |
| NOT \mathrm{bitmap_}{A,v_1} |
Compression. For columns with many distinct values, run-length encoding (WAH or BBC) compresses Bitmaps effectively while still supporting bitwise operations.
5.4 Query Processing Cost Estimation
The cost model estimates the number of disk I/O operations for a query plan. We assume the buffer Pool has pages and each disk page access costs one I/O.
Selection cost estimation:
| Access method | Cost (I/Os) |
|---|---|
| Full table scan | (or if pages available) |
| B+ tree equality search | leaf + 1 data page |
| B+ tree range search | leaf + |
| Hash equality search | 1 (ideal) |
Where is the fanout (average number of children per internal node).
Join cost estimation (from Section 7):
| Algorithm | Cost |
|---|---|
| Block nested-loop | |
| Sort-merge | |
| Hash join | (if build relation fits in pages) |
Worked Example 5.3: Comparing Access Methods
Scenario: Student table with 50000 tuples, 2500 pages (20 tuples per page). Buffer pool has 52 Pages. Attribute ID has a B+ tree index of height 3 (fanout 100). Query: SELECT * FROM Student WHERE ID = 12345.
Method 1: Full table scan. Cost = 2500 I/Os.
Method 2: B+ tree index on ID. Follow root internal internal leaf: 3 I/Os For the index traversal. Then 1 I/O for the data page. Total: 4 I/Os.
The B+ tree index is faster for a single equality lookup. However, for a query selecting 50% of the table, a full scan (2500 I/Os) would be faster than 25000 individual B+ tree lookups ( I/Os). This is why the optimiser chooses between index and scan based on Selectivity estimates.
6. Transaction Management
6.1 ACID Properties
A transaction is a logical unit of work that must satisfy:
| Property | Meaning |
|---|---|
| Atomicity | All or nothing: either all operations commit or none |
| Consistency | Transforms the database from one consistent state to another |
| Isolation | Concurrent transactions do not interfere with each other |
| Durability | Once committed, the effects survive failures |
6.2 Isolation Levels
SQL defines four isolation levels with increasing strictness:
| Level | Dirty Read | Non-repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible |
| Serializable | Prevented | Prevented | Prevented |
- Dirty read: Reading uncommitted data from another transaction.
- Non-repeatable read: Same query returns different results within the same transaction due to updates by another transaction.
- Phantom read: Same range query returns new rows due to inserts by another transaction.
Default levels: PostgreSQL: Read Committed. MySQL InnoDB: Repeatable Read. SQL Server: Read Committed. Oracle: Read Committed.
6.3 Serialisability
A schedule is serialisable if its effect is equivalent to some serial schedule (where Transactions execute one after another).
Conflict serialisability. Two operations conflict if they belong to different transactions, Access the same data item, and at least one is a write. A schedule is conflict-serialisable if its precedence graph (also called serialisation graph) is acyclic.
- Precedence graph: Nodes are transactions. Draw a directed edge if an operation of conflicts with and precedes an operation of .
Theorem 6.1. A schedule is conflict-serialisable if and only if its precedence graph is acyclic.
Proof sketch. If the precedence graph has a cycle, no serial ordering can respect all the Precedence constraints, so the schedule is not conflict-serialisable. Conversely, a topological sort Of an acyclic graph gives a serial order equivalent to the schedule.
View serialisability. A schedule is view-serialisable if it is view-equivalent to a Serial schedule . View equivalence requires:
- Initial read: If reads the initial value of in It does so in .
- Updated read: If reads the value of written by in It does so in .
- Final write: If performs the final write of in It does so in .
Every conflict-serialisable schedule is view-serialisable, but the converse does not hold. Testing For view serialisability is NP-complete.
Worked Example 6.1: Testing Conflict Serialisability
Schedule: .
Build precedence graph:
- before : edge .
- before : No — is after .
- before… Nothing comes after on .
- before : edge .
- before… Nothing comes after on .
Precedence graph: (acyclic).
Equivalent serial schedule: (topological order).
Now consider: .
- before : .
- before : .
Precedence graph has cycle: . Not conflict-serialisable.
6.4 Concurrency Control
Two-Phase Locking (2PL).
Protocol:
- Growing phase: Acquire locks; do not release any.
- Shrinking phase: Release locks; do not acquire any.
Theorem 6.2. 2PL guarantees conflict serialisability.
Proof. By the protocol, transaction acquires all its locks before releasing any. If Requests a lock held by , waits. This creates a total ordering on conflicting Operations, which corresponds to a serial schedule.
Variants:
- Strict 2PL: All locks held until commit/abort. Prevents cascading aborts.
- Rigorous 2PL: All locks held until commit/abort (same as strict 2PL for exclusive locks).
Problem with 2PL: Deadlocks. Detection (wait-for graph) or prevention (timestamp ordering).
Timestamp Ordering (TO). Each transaction receives a timestamp . For conflicting Operations:
- reads : if was last written by with Abort .
- writes : if was last read by with Abort .
No deadlocks (no waiting), but may abort transactions unnecessarily.
Optimistic Concurrency Control (OCC). Assumes conflicts are rare. Three phases:
- Read phase: Transaction reads data and writes to private workspace. No locks acquired.
- Validation phase: Before commit, check that no data read by this transaction has been modified by a committed transaction since the read phase began.
- Write phase: If validation passes, apply private writes to the database. Otherwise, abort and restart.
Forward validation: Check against committed transactions. For each data item read by Verify that the transaction that last wrote committed before ‘s read phase began. Backward validation: Check against active transactions.
OCC performs well when conflicts are rare but degrades under high contention (many restarts).
Worked Example 6.2: Optimistic Concurrency Control
Scenario: Two transactions and both read and update account balances.
- : Read Read Write Write (transfer 50 from to ).
- : Read Write (withdraw 25).
Execution:
- read phase: Reads , into private workspace.
- read phase: Reads into private workspace.
- validation phase: Checks if was modified by any committed transaction since started. No committed transactions modified . Validation passes.
- write phase: Writes . commits.
- validation phase: Checks if or was modified by any committed transaction since started. committed and modified . Validation fails.
- is aborted and restarted. On restart, reads , And correctly computes , .
Multiversion Concurrency Control (MVCC). Maintain multiple versions of each data item.
- Read operations read a consistent snapshot (a version visible at the start of the transaction).
- Write operations create a new version; the old version remains for concurrent readers.
- Conflicts between readers and writers are avoided (no read-write contention).
MVCC pseudocode (simplified):
def read(txn, item): version = find_version(item, txn.start_ts) return version.value
def write(txn, item, value): if has_conflicting_write(item, txn.start_ts, txn.id): abort(txn) create_version(item, value, txn.id, txn.start_ts)MVCC implementations:
- PostgreSQL: Snapshot Isolation. Each transaction sees a snapshot of committed data at transaction start. Uses
xmin/xmaxon each tuple. - MySQL InnoDB: MVCC with undo logs. Each row has hidden columns for transaction ID and rollback pointer.
- Oracle: Snapshot Isolation at the statement level by default.
MVCC anomaly: Write Skew. Two concurrent transactions read overlapping data and update Non-overlapping parts based on what they read. Serializable Snapshot Isolation (SSI) in PostgreSQL 9.1+ Detects and prevents this.
6.5 Log-Based Recovery
Write-Ahead Logging (WAL). Before writing a modified page to disk, write the log record describing The modification to stable storage.
WAL protocol:
- Before a data page is flushed to disk, all log records pertaining to that page must be on stable storage.
- Before a transaction commits, all its log records must be on stable storage.
Log records: [LSN, TxnID, PageID, BeforeImage, AfterImage, Commit/Abort].
ARIES (Algorithm for Recovery and Isolation Exploiting Semantics). The standard recovery Algorithm used in IBM DB2, PostgreSQL, SQL Server, and others:
- Analysis phase: Scan the log from the last checkpoint. Identify dirty pages and active transactions. Reconstruct the transaction table (which transactions were active at crash time) and the dirty page table (which pages may not have been written to disk).
- Redo phase: Scan forward from the earliest LSN in the dirty page table. Redo all logged updates (from both committed and uncommitted transactions) to restore the database to the exact state at the time of the crash. A record is redone only if the page’s on-disk LSN is less than the record’s LSN (ensuring idempotency).
- Undo phase: Scan backward from the end of the log. Undo all updates from transactions that were active at crash time (did not commit). Write compensation log records (CLRs) so that undo operations themselves are logged.
Key ARIES properties:
- Idempotent: Recovery can be restarted safely (re-processing log records has no additional effect).
- Steal/no-force: Uncommitted pages can be flushed to disk (steal), and committed pages need not be flushed (no-force). This improves performance at the cost of more recovery work.
- Fine-grained logging: Log records track individual page modifications, enabling precise recovery.
Checkpoints. Periodically flush dirty pages to disk and write a checkpoint record. Reduces the Amount of log to scan during recovery.
// Simplified checkpoint pseudocodevoid checkpoint() { flush_all_dirty_pages(); write_log_record(CHECKPOINT, dirty_page_list, active_txns); flush_log();}6.6 Buffer Management
The buffer pool caches frequently accessed disk pages in memory. Replacement policies include LRU, Clock, and variants. PostgreSQL uses a Clock sweep algorithm; MySQL InnoDB uses an LRU variant with a Midpoint insertion strategy to avoid scan pollution.
7. Query Optimisation
7.1 Query Processing Pipeline
7.2 Cost-Based Optimisation
The optimiser estimates the cost of alternative execution plans and chooses the cheapest.
Cost model. Cost = I/O cost (disk page accesses) + CPU cost. For disk-bound queries, I/O Dominates.
Catalog statistics: Table cardinality (), attribute value cardinality, number of distinct Values, histogram of value distribution, index information.
Selectivity estimation. For a predicate The selectivity is approximately where is the number of distinct values of in .
| Predicate type | Selectivity estimate |
|---|---|
7.3 Join Algorithms
Nested-loop join. For each tuple in Scan all of .
If one relation fits in memory, buffer it and scan the other: cost = .
Block nested-loop join. Use buffer pages. Load blocks of into buffers, scan With the remaining buffer.
Sort-merge join. Sort both relations on the join attribute, then merge.
Efficient for large relations, especially when both are already sorted.
Hash join. Build a hash table on the smaller relation (build phase), then probe with the larger (probe phase).
Best for equi-joins when one relation fits in memory.
Index nested-loop join. For each tuple in Use an index on to find matching tuples.
Efficient if has an index on the join attribute and is small.
7.4 Query Plan Selection
The optimiser explores the space of equivalent logical plans and physical implementations. For Joins, the number of join orderings is (left-deep trees) or (bushy trees). Practical optimisers use dynamic programming with pruning.
Heuristic transformations:
- Push selections down (reduce intermediate result sizes).
- Push projections down (reduce column widths).
- Convert cross products to joins when possible.
- Reorder joins based on estimated cardinalities.
8. NoSQL Overview
8.1 Motivation
NoSQL databases address limitations of relational databases for certain workloads:
- Horizontal scalability across commodity hardware.
- Flexible schemas (semi-structured data).
- High write throughput for simple access patterns.
- Handling unstructured or polymorphic data.
8.2 Document Stores
Store data as JSON/BSON documents. Each document can have a different structure.
Example (MongoDB):
{ "_id": 1, "name": "Alice", "courses": ["CS101", "MATH201"], "address": { "city": "Cambridge", "zip": "02139" }}Strengths: Flexible schema, nested data, good for content management. Weaknesses: No joins (requires embedding or application-level joins), eventual consistency in Distributed mode, limited transaction support.
8.3 Key-Value Stores
Simplest model: each item is a key-value pair. Values are opaque blobs.
Example (Redis):
SET user:1001 '{"name":"Alice","email":"alice@univ.edu"}'GET user:1001EXPIRE user:1001 3600Strengths: Extremely fast (in-memory), simple API, caching, session management. Weaknesses: No queries beyond key lookup, limited data modelling.
8.4 Graph Databases
Store nodes and edges with properties. Optimised for traversing relationships.
Example (Neo4j Cypher):
CREATE (a:Student {name: "Alice"})-[:ENROLLED_IN {grade: "A"}]->(c:Course {title: "Databases"})MATCH (s:Student)-[:ENROLLED_IN]->(c:Course {title: "Databases"})RETURN s.name, c.titleStrengths: Efficient for complex relationship queries, social networks, recommendation engines, Knowledge graphs. Weaknesses: Not suitable for simple CRUD, horizontal scaling is harder.
8.5 Column-Family Stores
Store data in column families (groups of related columns). Each row can have different columns.
Example (Apache Cassandra):
CREATE TABLE grades ( student_id text, course_id text, grade text, semester text, PRIMARY KEY ((student_id), course_id));Strengths: High write throughput, efficient column-level reads, horizontal scalability, Time-series data. Weaknesses: No joins, limited query flexibility, eventual consistency.
8.6 CAP Theorem
Theorem 8.1 (Brewer’s CAP Theorem). A distributed data store can provide at most two of the Following three guarantees simultaneously:
- Consistency (C): Every read receives the most recent write.
- Availability (A): Every request receives a non-error response.
- Partition tolerance (P): The system continues to operate despite network partitions.
Since network partitions are inevitable in distributed systems, the real trade-off is between consistency and availability during a partition.
| System | Choice | Notes |
|---|---|---|
| Redis Cluster | CP | Partition: some nodes unreachable |
| Cassandra | AP | Tunable consistency per operation |
| MongoDB | CP | Primary unavailable during partition |
| RDBMS (with replication) | CA | Not partition-tolerant (single node) |
PACELC. Extension of CAP: in the absence of partitions, the trade-off is between latency And consistency.
:::caution Common Pitfall “NoSQL” does not mean “no SQL.” It means “Not Only SQL.” Many NoSQL databases now support SQL-like Query languages (e.g., Cassandra CQL). The choice between relational and NoSQL depends on the Workload, not on a blanket preference. Relational databases remain the best choice for strongly Structured data with complex queries and transactional requirements.
9. Distributed Databases
9.1 Architecture and Motivation
A distributed database stores data across multiple nodes connected by a network. Motivations Include:
- Performance: Data locality reduces latency; parallel query processing increases throughput.
- Availability: Replication allows the system to survive node failures.
- Scalability: Horizontal scaling by adding more nodes.
Shared-nothing architecture. Each node has its own CPU, memory, and disk. Nodes communicate Only via the network. This is the dominant architecture for distributed databases.
Data fragmentation:
- Horizontal fragmentation: Each fragment is a subset of rows (tuples) defined by a selection predicate. Example: partition
Studentby department. - Vertical fragmentation: Each fragment is a subset of columns. Example: store frequently accessed columns on fast nodes.
- Hybrid fragmentation: A combination of horizontal and vertical.
9.2 Distributed Transactions
A distributed transaction involves operations on multiple nodes. The challenge is ensuring atomicity Across nodes.
Two-Phase Commit (2PC).
- Phase 1 (Voting): The coordinator sends a
PREPAREmessage to all participants. Each participant writes its changes to a local log, votesYES(ready to commit) orNO(abort). - Phase 2 (Decision): If all participants voted
YESThe coordinator sendsCOMMIT; otherwise it sendsABORT. Participants apply the decision and acknowledge.
Theorem 9.1. 2PC guarantees atomicity of distributed transactions if no participant crashes Permanently and the log is on stable storage.
Proof. If the coordinator crashes after phase 1, participants that voted YES are blocked — they Cannot decide without knowing the coordinator’s decision. Upon recovery, the coordinator reads its Log to determine the decision and notifies participants. Since each participant wrote its vote to Stable storage before responding, no vote is lost.
Blocking problem. If the coordinator crashes after phase 1, participants that voted YES must Wait indefinitely (they are blocked). This is a fundamental limitation of 2PC.
Three-Phase Commit (3PC). Adds a PRE-COMMIT phase to eliminate blocking under the fail-stop Model:
- CanCommit: Coordinator asks if participants can commit. Participants reply
YESorNO. - PreCommit: If all
YESCoordinator sendsPRE-COMMIT. Participants acknowledge. - DoCommit: Coordinator sends
COMMIT. Participants commit and acknowledge.
If the coordinator crashes after PRE-COMMITAny participant can contact others to determine the Outcome (since at least one participant must have received PRE-COMMIT and can coordinate).
Limitation of 3PC: Assumes reliable communication (no lost messages), which is unrealistic in Practice. 3PC is rarely used in production systems.
Worked Example 9.2: Two-Phase Commit Walkthrough
Scenario: A bank transfer deducts 100 from Account A (on Node 1) and adds 100 to Account B (on Node 2). The coordinator manages the distributed transaction.
Normal execution:
- Coordinator sends
PREPAREto Node 1 and Node 2. - Node 1: locks Account A, writes -100 to log, votes
YES. - Node 2: locks Account B, writes +100 to log, votes
YES. - All votes are
YES. Coordinator sendsCOMMITto both nodes. - Node 1 applies the write (Account A -= 100), unlocks, sends ACK.
- Node 2 applies the write (Account B += 100), unlocks, sends ACK.
- Coordinator writes
COMMITto its log. Transaction complete.
Node 1 votes NO (insufficient funds):
- Coordinator sends
PREPAREto Node 1 and Node 2. - Node 1: locks Account A, discovers balance is 50, votes
NO. - Coordinator receives
NOSendsABORTto both nodes. - Node 1 unlocks Account A (no changes applied).
- Node 2 unlocks Account B (no changes applied).
- Coordinator writes
ABORTto its log. Transaction aborted.
Coordinator crashes after Phase 1:
- Both nodes voted
YESand are waiting for the decision. - Nodes are blocked: they hold locks and cannot proceed.
- When the coordinator recovers, it reads its log, sees that all votes were
YESbut no decision was recorded, and sendsCOMMITto all participants.
9.3 Replication
Synchronous replication. A write is not acknowledged until all replicas have applied it.
- Guarantees strong consistency (all replicas identical).
- High latency (wait for the slowest replica).
- Reduced availability during partitions (cannot acknowledge writes if a replica is unreachable).
Asynchronous replication. A write is acknowledged after the primary applies it; replicas are Updated in the background.
- Low latency (acknowledged immediately).
- Eventual consistency (replicas may lag behind).
- Higher availability during partitions.
Replication architectures:
| Architecture | Writes | Reads | Consistency |
|---|---|---|---|
| Single-leader | Primary only | Any replica | Strong (reads from primary) or eventual (from replicas) |
| Multi-leader | Any node | Any node | Eventual (conflict resolution required) |
| Leaderless | Any node | Any node | Tunable (quorum reads/writes) |
Quorum-based consistency. For a system with replicas, define write quorum and read quorum such that . This guarantees that any read sees at least one replica with the Latest write.
9.4 Consistency Models
Consistency models define the guarantees a distributed system provides about the order and visibility Of updates.
Strong consistency models:
- Linearisability: Every operation appears to execute atomically at a single point in real time. The strongest consistency model. Equivalent to “serialisable + real-time ordering.”
- Sequential consistency: All operations appear in some total order that is consistent with the program order of each individual process. Weaker than linearisability (no real-time requirement).
Weak consistency models:
- Causal consistency: Operations that are causally related are seen by all processes in the same order. Concurrent (non-causally-related) operations may be seen in different orders.
- Read-your-writes consistency: A process always sees its own writes.
- Eventual consistency: If no new updates are made, all replicas will eventually converge to the same state. The weakest model; provides no ordering guarantees.
Worked Example 9.1: Quorum Read/Write
Scenario: replicas. Choose , . Note .
Write: Client writes value . Primary sends write to all 5 replicas. At least 3 acknowledge (). Write is considered successful.
Read: Client reads from 3 replicas (). Since Any read quorum overlaps With the write quorum, so the reader is guaranteed to see at least one replica with the latest Value. The reader returns the most recent version among the 3 responses.
During a partition: If 2 replicas are unreachable, writes require available replicas (OK, 3 are available). Reads require (OK). If 3 replicas are unreachable, writes fail (only 2 Available, need 3). This is the consistency-availability trade-off from CAP.
10. Problem Set
Problems 1—3: Introduction and Data Models
Explain the three levels of the ANSI-SPARC architecture with an example. Show how logical data independence allows a new column to be added to the conceptual schema without modifying existing external views.
Compare the relational and graph data models. Give two concrete scenarios where a graph database would be preferable to a relational database, explaining why.
A university uses a relational database for student records and a document store for course materials. Discuss the benefits and challenges of this polyglot persistence approach.
Problems 4—7: Relational Model and Algebra
Given relation
R(A, B, C, D, E)with FDs : (a) Find all candidate keys. (b) Compute the attribute closure . (c) Compute and .Given
R(A, B, C)with tuples andS(B, C)with tuples : (a) Compute . (b) Compute . (c) Compute .Express the following query in relational algebra: “Find the names of students who have enrolled in at least two courses taught by the same instructor.”
Prove that the cross product is commutative: . Prove that the cross product is associative: .
Problems 8—10: SQL
Write a recursive SQL query that computes the total cost of all parts (direct and indirect) required to assemble product
P100Given aBOM(parent_part, child_part, quantity)table.Write a SQL query that, for each department, returns the student with the highest GPA, their GPA, and the difference between their GPA and the department average. Use window functions only (no subqueries).
Consider the Python code:
query = f"DELETE FROM Student WHERE dept = '{dept}' AND gpa < {min_gpa}"(a) Identify the SQL injection vulnerability. (b) Show an input that exploits it. (c) Rewrite using parameterised queries.
Problems 11—14: Normalisation
Given
R(A, B, C, D, E)with FDs : (a) Find all candidate keys. (b) Determine the highest normal form of . Justify your answer. (c) If is not in BCNF, decompose it into BCNF relations. Verify each decomposition is lossless.Given
R(A, B, C, D)with FDs : (a) Find all candidate keys. (b) Decompose into BCNF. Is the decomposition dependency-preserving? (c) Show that is actually already in BCNF.Prove Theorem 4.5: every relation in 4NF is in BCNF.
Given `R(A, B, C, D)A \twoheadrightarrow BA \twoheadrightarrow CA \to DR$ in 4NF? If not, decompose into 4NF.
Problems 15—16: Indexing
Insert the keys 8, 5, 1, 7, 3, 12, 9, 6 into a B+ tree of order 3 (maximum 2 keys per node). Show the tree after each operation that causes a split. How many leaf-level and internal-level splits occur in total?
A table has 100000 tuples stored in 5000 pages. The buffer pool has 101 pages. Compare the estimated I/O cost of answering
SELECT * FROM T WHERE A = 42using: (a) A full table scan. (b) A B+ tree index on of height 3. (c) A static hash index on with 500 buckets and average chain length 0.4. State any assumptions you make.
Problems 17—18: Transaction Management
For each schedule below, determine if it is conflict-serialisable by drawing the precedence graph. If serialisable, give the equivalent serial order. (a) (b) (c)
Explain the difference between strict 2PL and rigorous 2PL. Give a schedule that is allowed by basic 2PL but not by strict 2PL. Explain why strict 2PL is preferred in practice.
Problems 19—20: Distributed Databases
Explain the two-phase commit protocol. Describe what happens in each of the following failure scenarios: (a) A participant crashes after voting
YESbut before receiving the coordinator’s decision. (b) The coordinator crashes after sendingCOMMITto some but not all participants. (c) The coordinator crashes after phase 1 (all votes received) but before sending any decision.A distributed key-value store uses replicas with quorum settings and . (a) Show that any read is guaranteed to see the latest write. (b) What is the maximum number of node failures the system can tolerate while still serving both reads and writes? (c) How does the system behave during a network partition that isolates 3 nodes from the remaining 4?
11. Further Reading
- Silberschatz, Korth, Sudarshan: Database System Concepts (7th ed.).
- Ramakrishnan, Gehrke: Database Management Systems (3rd ed.).
- Elmasri, Navathe: Fundamentals of Database Systems (7th ed.).
- Kleppmann: Designing Data-Intensive Applications (2017).
Common Pitfalls
Forgetting to convert between units (e.g., to ) when calculating concentrations.
Confusing enthalpy of formation with enthalpy of combustion, or using the wrong sign convention.
Forgetting to balance equations before performing calculations — always check that atoms and charges balance on both sides.
Assuming that a strong acid always has a lower pH than a weak acid without considering concentration.
Worked Examples
Example 1: SQL Joins and Aggregation
Problem. Given tables students(id, name, major) and grades(student_id, course, score), find the average score per major for courses with more than 5 enrolled students.
Solution.
SELECT s.major, AVG(g.score) AS avg_scoreFROM students sJOIN grades g ON s.id = g.student_idGROUP BY s.majorHAVING COUNT(DISTINCT g.student_id) > 5ORDER BY avg_score DESC;The HAVING clause filters groups after aggregation, unlike WHERE which filters rows before.
Example 2: ER to Relational Mapping
Problem. Map an ER diagram with entity Student (sid, name), entity Course (cid, title), and M:N relationship Enrolls with attribute grade to relational tables.
Solution. Three tables:
Student(sid PK, name)Course(cid PK, title)Enrolls(sid FK, cid FK, grade, PRIMARY KEY (sid, cid))The M:N relationship becomes its own table with foreign keys to both entities and a composite primary key.
Summary
- Relational model: data stored in tables (relations) with attributes; keys (primary, foreign) enforce integrity.
- SQL fundamentals:
SELECT,JOIN(inner, left, right, full),GROUP BY,HAVING, subqueries. - Normalisation: 1NF (atomic values), 2NF (no partial dependencies), 3NF (no transitive dependencies), BCNF.
- ACID properties: Atomicity, Consistency, Isolation, Durability — guarantee reliable transactions.
- Indexing: B+ tree indexes speed up point and range queries; trade-off between query speed and update overhead.
Cross-References
| Topic | Site | Link |
|---|---|---|
| Advanced Databases | WyattsNotes | View |
| Advanced SQL | WyattsNotes | View |
| Algorithms and Data Structures | WyattsNotes | View |
| Database Systems — CMU 15-445 | CMU | View |
:::