Relational Model
2.1 Relations, Tuples, Attributes
A relation over attributes is a set of tuples where Each is drawn from the domain of . A relation is a subset of .
Properties of relations:
- Each relation has a unique name.
- Each attribute has a unique name within a relation.
- Tuples are unordered (a relation is a set).
- All tuples are distinct (no duplicates in a mathematical relation, though SQL tables allow them).
- Attributes are unordered.
2.2 Keys
Let be a relation schema.
- Superkey: A set of attributes such that no two distinct tuples agree on all attributes in .
- Candidate key: A minimal superkey (no proper subset is also a superkey).
- Primary key: One of the candidate keys, chosen by the designer.
- Foreign key: An attribute (or set) in relation that references the primary key of relation . Enforces referential integrity: every value of must appear as a value of in Or be
NULL.
Example. Student(ID, Name, Email). ID is a candidate key (unique). If Email is also unique, It is another candidate key. One is chosen as the primary key.
Theorem 2.1. Every relation has at least one candidate key (possibly the entire set of attributes).
2.3 Relational Algebra
Relational algebra provides a formal query language based on operations on relations.
Selection : Return tuples from satisfying condition .
Projection : Return a relation containing only attributes .
Union : All tuples in or (both must be union-compatible: same arity and Attribute domains).
Difference : Tuples in but not in .
Cartesian product : All pairs where and .
Rename : Rename relation to .
Natural join : Combine tuples from and that agree on all common attributes.
Theta join : .
Equi-join : A theta join where is an equality on specific Attributes. Keeps both join columns.
Left outer join : All tuples from Matched with where possible; NULL-padded otherwise.
Right outer join : All tuples from Matched with .
Full outer join : All tuples from both and .
Division : Tuples in such that for every tuple , .
Example. Find students who have taken all courses:
Worked Example 2.1: Complex Relational Algebra Query
Query: Find the names of students who have enrolled in every course offered by the CS department.
Schema: Student(sid, name, dept)``Course(cid, title, dept)``Takes(sid, cid, semester).
Step 1. Get CS course IDs:
Step 2. Get student-course pairs from enrolments:
Step 3. Students who have taken all CS courses (division):
Step 4. Get names:
Combined:
Worked Example 2.2: Relational Algebra with Outer Joins
Query: Find all students and the number of CS courses they have taken, including students who have Taken no CS courses (count should be 0).
Step 1. Filter enrolments to CS courses:
Step 2. Left outer join with Student to include those with no CS courses:
Note: aggregation over outer join results handles NULL values (they are excluded from COUNT).
In SQL, this translates directly to:
SELECT S.sid, S.name, COUNT(E.cid) AS num_cs_coursesFROM Student SLEFT JOIN Takes T ON S.sid = T.sidLEFT JOIN Course C ON T.cid = C.cid AND C.dept = 'CS'GROUP BY S.sid, S.name;2.4 Relational Calculus
Tuple relational calculus. A query has the form where is a tuple variable And is a well-formed formula. The formula is built from:
- Atoms: (tuple is in relation ), (comparison), (comparison with constant), where .
- Logical connectives: (and), (or), (not).
- Quantifiers: (there exists), (for all).
Domain relational calculus. Variables range over individual attribute domains (not entire tuples). A query has the form .
Safety. A calculus expression is safe if it yields a finite relation. The expression is unsafe (it includes every tuple not in An infinite set). We Restrict variables to domains of the relations appearing in the query.
Theorem 2.2 (Codd). Relational algebra and relational calculus (both tuple and domain) are Equally expressive: every query expressible in one is expressible in the other.
Worked Example 2.3: Tuple Relational Calculus
Query: Find students who have taken no CS course.
Using tuple relational calculus:
Translation to relational algebra:
2.5 Equivalence Rules for Relational Algebra
The following rules allow the optimiser to transform queries without changing their meaning. Each rule States that two expressions produce the same relation for any input relations.
Rule 1 (Cascade of selections). .
Proof. A tuple satisfies if and only if it satisfies both and . Applying first removes tuples failing ; then removes tuples failing among the remainder. The result is exactly the Tuples satisfying both conditions.
Rule 2 (Commutativity of selections). .
Proof. Immediate from the commutativity of logical conjunction ().
Rule 3 (Selection pushdown through cross product). If involves only attributes of Then .
Proof. For each pair with and The condition depends only on . Filtering by on is equivalent to first filtering by And then forming the cross product, since does not affect the result of .
Rule 4 (Selection pushdown through join). If involves only attributes of Then .
Proof. The join combines matching pairs from and . Applying After the join filters these pairs by on ‘s attributes. Filtering first removes Non-matching -tuples before the join, yielding the same final set of pairs.
Rule 5 (Commutativity of joins). .
Proof. The natural join combines tuples agreeing on common attributes. This relation is symmetric In and .
Rule 6 (Associativity of joins). .
Proof. Both expressions produce tuples from that agree on all attributes Shared by any pair of the three relations.
Rule 7 (Projection pushdown). If and includes all attributes used in Join conditions, then .
Proof. Projecting after the join retains only attributes in . Since includes all Attributes needed for the join, performing the join on projected versions of and yields the Same join result, from which is then projected.
These rules form the foundation of heuristic query optimisation (see Section 7).