Skip to main content
Study Guide published

Database Normalization — The Kitchen Brigade

Database Normalization

The Kitchen Brigade

A disorganized kitchen burns food. A denormalized database burns queries.

— Your AI Professor

1NF — Mise en Place

Every ingredient in its own container

No repeating groups, no multi-valued columns. Each cell holds exactly one value. Like chopping everything before you cook.

2NF — Station Assignment

Every cook assigned to ONE station

Remove partial dependencies. Non-key attributes must depend on the WHOLE primary key, not just part of it.

3NF — Chain of Command

Orders go through the head chef only

Remove transitive dependencies. Non-key attributes must depend on the key directly — no middleman.

BCNF — Michelin Standard

Every determinant is a candidate key

Stricter than 3NF. If A → B, then A must be a superkey. Eliminates all remaining anomalies.
Normal Form Rule Kitchen Analogy Eliminates
1NF Atomic values, no repeating groups One ingredient per container Repeating groups
2NF 1NF + no partial dependencies Each cook → one station Partial dependency
3NF 2NF + no transitive dependencies No passing orders through sous-chef Transitive dependency
BCNF Every determinant is a candidate key Michelin-star discipline All remaining FD anomalies

Unnormalized

Raw ingredients dumped on counter. Repeating groups, multi-valued cells.

1NF

Mise en place done. Atomic values, primary key identified.

2NF

Stations assigned. No partial dependencies on composite keys.

3NF

Chain of command clean. No transitive dependencies.

BCNF

Michelin standard. Every determinant is a candidate key.

Identify Normal Form Common

Given a relation and FDs, determine highest NF

1. Check 1NF (atomic values?) 2. Find candidate keys 3. Check partial deps (2NF) 4. Check transitive deps (3NF) 5. Check all determinants are superkeys (BCNF)

Assuming composite key when a single attribute is sufficient. Always compute closure first.

Is this relation in BCNF?

1. List all non-trivial FDs 2. For each FD A→B, check if A is a superkey 3. If ANY fails, not in BCNF

Forgetting to check ALL FDs — one violation is enough to fail BCNF.

Decomposition Common

Decompose into 3NF

1. Find minimal cover of FDs 2. Create relation for each FD 3. If no relation contains candidate key, add one 4. Remove redundant relations

Not computing minimal cover first. Skipping step 3 (candidate key check).

Lossless-join BCNF decomposition

1. Find FD A→B where A is not superkey 2. Decompose into R1(A,B) and R2(R-B) 3. Repeat until all in BCNF

BCNF decomposition may NOT preserve all FDs. This is the trade-off.

Closure & Keys Common

Find attribute closure

1. Start with X⁺ = X 2. For each FD A→B where A ⊆ X⁺, add B to X⁺ 3. Repeat until no change

Stopping too early. Must iterate until fixed point.

Find all candidate keys

1. Attributes only on LHS → must be in every key 2. Attributes only on RHS → never in any key 3. Try combinations from remaining

Missing candidate keys by not trying all combinations of 'both-side' attributes.

Common Exam Mistakes

1. Confusing partial dependency with transitive dependency

2. Saying 'every table should be in BCNF' — sometimes 3NF is preferred

3. Decomposing without checking lossless-join property

How to Avoid

1. Partial = depends on PART of composite key. Transitive = A→B→C where B is not a key

2. BCNF may not preserve FDs. If FD preservation matters, stop at 3NF

3. Always verify: R1 ∩ R2 → R1 or R1 ∩ R2 → R2

The BCNF vs 3NF Trade-off

3NF guarantees both lossless-join AND dependency preservation. BCNF guarantees lossless-join but may sacrifice dependency preservation.

Exam favourite: 'When would you prefer 3NF over BCNF?' Answer: when FD preservation is critical for integrity constraints.

What is partial dependency?

4 cards — tap to flip

R(A,B,C,D) with FDs: A→B, B→C. What is the highest normal form?

A 1NF
B 2NF
C 3NF
D BCNF

A is the candidate key. A→B (fine, A is key). But B→C is transitive (A→B→C, B is not a key). So it's in 2NF but not 3NF.