Database Normalization — The Kitchen Brigade
Database Normalization
The Kitchen Brigade
A disorganized kitchen burns food. A denormalized database burns queries.
1NF — Mise en Place
Every ingredient in its own container
1NF — Mise en Place
Every ingredient in its own container
2NF — Station Assignment
Every cook assigned to ONE station
2NF — Station Assignment
Every cook assigned to ONE station
3NF — Chain of Command
Orders go through the head chef only
3NF — Chain of Command
Orders go through the head chef only
BCNF — Michelin Standard
Every determinant is a candidate key
BCNF — Michelin Standard
Every determinant is a candidate key
| Normal Form | Rule | Kitchen Analogy | Eliminates |
|---|---|---|---|
| 1NF | Atomic values, no repeating groups | One ingredient per container | Repeating groups |
| 2NF | 1NF + no partial dependencies | Each cook → one station | Partial dependency |
| 3NF | 2NF + no transitive dependencies | No passing orders through sous-chef | Transitive dependency |
| BCNF | Every determinant is a candidate key | Michelin-star discipline | All remaining FD anomalies |
Unnormalized
Raw ingredients dumped on counter. Repeating groups, multi-valued cells.
1NF
Mise en place done. Atomic values, primary key identified.
2NF
Stations assigned. No partial dependencies on composite keys.
3NF
Chain of command clean. No transitive dependencies.
BCNF
Michelin standard. Every determinant is a candidate key.
Identify Normal Form Common
Given a relation and FDs, determine highest NF
1. Check 1NF (atomic values?) 2. Find candidate keys 3. Check partial deps (2NF) 4. Check transitive deps (3NF) 5. Check all determinants are superkeys (BCNF)
Assuming composite key when a single attribute is sufficient. Always compute closure first.
Is this relation in BCNF?
1. List all non-trivial FDs 2. For each FD A→B, check if A is a superkey 3. If ANY fails, not in BCNF
Forgetting to check ALL FDs — one violation is enough to fail BCNF.
Decomposition Common
Decompose into 3NF
1. Find minimal cover of FDs 2. Create relation for each FD 3. If no relation contains candidate key, add one 4. Remove redundant relations
Not computing minimal cover first. Skipping step 3 (candidate key check).
Lossless-join BCNF decomposition
1. Find FD A→B where A is not superkey 2. Decompose into R1(A,B) and R2(R-B) 3. Repeat until all in BCNF
BCNF decomposition may NOT preserve all FDs. This is the trade-off.
Closure & Keys Common
Find attribute closure
1. Start with X⁺ = X 2. For each FD A→B where A ⊆ X⁺, add B to X⁺ 3. Repeat until no change
Stopping too early. Must iterate until fixed point.
Find all candidate keys
1. Attributes only on LHS → must be in every key 2. Attributes only on RHS → never in any key 3. Try combinations from remaining
Missing candidate keys by not trying all combinations of 'both-side' attributes.
Common Exam Mistakes
1. Confusing partial dependency with transitive dependency
2. Saying 'every table should be in BCNF' — sometimes 3NF is preferred
3. Decomposing without checking lossless-join property
How to Avoid
1. Partial = depends on PART of composite key. Transitive = A→B→C where B is not a key
2. BCNF may not preserve FDs. If FD preservation matters, stop at 3NF
3. Always verify: R1 ∩ R2 → R1 or R1 ∩ R2 → R2
3NF guarantees both lossless-join AND dependency preservation. BCNF guarantees lossless-join but may sacrifice dependency preservation.
Exam favourite: 'When would you prefer 3NF over BCNF?' Answer: when FD preservation is critical for integrity constraints.
What is partial dependency?
4 cards — tap to flip
R(A,B,C,D) with FDs: A→B, B→C. What is the highest normal form?
A is the candidate key. A→B (fine, A is key). But B→C is transitive (A→B→C, B is not a key). So it's in 2NF but not 3NF.