Chapter 7 Normalization
Functional Dependency
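f : α -> β
Example 1 (valid)    Example 2 (invalid)    Example 3 (valid)
α β                  α β                    α β
a 1                  a 1                    a 1
b 2                  a 2                    b 1
c 3                  c 3                    c 3
d 4                  d 4                    d 4
In Example 2 the same α value (a) maps to two different β values, so α -> β does not hold.
Trivial FD     : β ⊆ α, e.g. AB -> A
Non-trivial FD : β ⊄ α, e.g. AB -> ABC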
Functional Dependency
R
A B C D E
a 2 3 4 5
2 a 3 4 5
a 2 3 6 5
a 2 3 6 6
α -> β
If α is unique, no need to check β
If β is same, no need to check α
R(ABC)
A-> B
B -> C
Find (A) +
(A) + = A
= AB
= ABC
Closure of Attribute Set
R(ABCDEF)
A -> B
C -> DE
AC -> F
D -> AF
E -> CF
Find (D) +
(D) + = D
= DAF
= DABF
Find (DE) +
(DE) + = DE
= DAFE
= DAFCE
= DABFCE
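The closure steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function name closure and the encoding of each dependency as a pair of attribute strings are assumptions made for this example, not part of the slides.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of attribute strings,
    e.g. ("AC", "F") stands for AC -> F (an encoding assumed here).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already inside the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# R(ABCDEF) from the slide above
fds = [("A", "B"), ("C", "DE"), ("AC", "F"), ("D", "AF"), ("E", "CF")]
print("".join(sorted(closure("D", fds))))   # ABDF
print("".join(sorted(closure("DE", fds))))  # ABCDEF
```

The loop repeats until no dependency adds a new attribute, which mirrors the step-by-step expansion (D, DAF, DABF) shown in the worked example.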
Closure of Attribute Set
R(ABCDEFG)
A -> B
BC -> DE
AEG -> G
Find (AC) + = ABCDE
R(ABCDE)
A -> BC
CD -> E
B -> D
E -> A
Find (B) + = BD
R(ABCDEF)
AB -> C
BC -> AD
D -> E
CF -> B
Find (AB) + = ABCDE
R(ABCDEFGH)
A -> BC
CD -> E
E -> C
ED -> AEH
ABH -> BD
DH -> BC
Find whether BCD -> H
= yes, since (BCD) + = ABCDEH contains H
Equivalence of Functional Dependency
R(ACDEH)
F: A -> C        G: A -> CD
   AC -> D          E -> AH
   E -> AD
   E -> H
Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
Soln :- To check F ⊆ G, take closures under G: A+ = ACD, AC+ = ACD, E+ = ACDEH, so F ⊆ G
        To check G ⊆ F, take closures under F: A+ = ACD, E+ = ACDEH, so G ⊆ F
Therefore, c. F=G is true
Equivalence of Functional Dependency
R(PQRS)
X: P -> Q        Y: P -> QR
   Q -> R           R -> S
   R -> S
Find which one holds true
a. X ⊆ Y
b. Y ⊆ X
c. X = Y
d. X ≠ Y
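Soln :- To check X ⊆ Y, take closures under Y: P+ = PQRS, Q+ = Q, R+ = RS
        Q -> R cannot be derived from Y, so X ⊄ Y, therefore X ≠ Y
        To check Y ⊆ X, take closures under X: P+ = PQRS, R+ = RS, so Y ⊆ X
Therefore, b. Y ⊆ X is correct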
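The equivalence test used in the two solutions above can be sketched as follows. The helper names covers and equivalent are hypothetical, and the first dependency of each left-hand set is read as A -> C and P -> Q respectively, which is an assumption about the slide.

```python
def closure(attrs, fds):
    """Closure of an attribute set under FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every dependency in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each one covers the other."""
    return covers(f, g) and covers(g, f)


# R(ACDEH)
F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(equivalent(F, G))  # True

# R(PQRS)
X = [("P", "Q"), ("Q", "R"), ("R", "S")]
Y = [("P", "QR"), ("R", "S")]
print(covers(X, Y))  # True:  every FD of Y follows from X
print(covers(Y, X))  # False: Q -> R does not follow from Y
```

Here covers(X, Y) corresponds to what the slides write as Y ⊆ X: each left side of Y is closed under X and the result must contain the right side.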
Equivalence of Functional Dependency
R(VWXYZ)
F: W -> X        G: W -> XY
   WX -> Y          Z -> WX
   Z -> WY
   Z -> V
Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
Soln :- ?
Armstrong's Axioms
Primary Rules (RAT)
1. Reflexivity :- if y ⊆ x
then, x -> y
e.g. AB ⊆ ABC, so ABC -> AB
2. Augmentation :- if x -> y
then, xz -> yz
3. Transitivity :- if x -> y && y -> z
then, x -> z
Secondary Rules
1. Union :- if x -> y && x -> z
then, x -> yz
2. Decomposition :- if x -> yz
then, x -> y && x -> z
Keys
R(ABCD)
A -> BC
A + = ABC
A + ≠ ABCD, so 'A' cannot be a key
A key must satisfy (Key) + = R
A super key is a set of one or more attributes (columns), which can
uniquely identify a row in a table
R(ABCD)
ABC -> D (ABC) + = ABCD Super Key
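The rule (Key)+ = R can be turned into a small brute-force search. The sketch below is an illustration only: the names is_superkey and candidate_keys are hypothetical, attributes are assumed to be single letters, and trying every subset is practical only for small schemas like the ones in these slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    """A super key is any attribute set whose closure is the whole relation."""
    return closure(attrs, fds) == set(relation)


def candidate_keys(relation, fds):
    """Minimal super keys, found by trying subsets from smallest to largest."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(candidate, relation, fds):
                keys.append(candidate)
    return keys


print(is_superkey("A", "ABCD", [("A", "BC")]))       # False: A+ = ABC
print(is_superkey("ABC", "ABCD", [("ABC", "D")]))    # True
print(candidate_keys("ABCD", [("ABC", "D")]))        # one key: A, B, C
```

Every candidate key is a super key, but a super key such as ABCD is not a candidate key because it contains the smaller key ABC.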
Example attributes: Stu_id, Stud Course, Stu Name
Normalization - 1NF
Emp ID   Emp Name   Contact No
101      ABC        8412332142, 8456321111
102      PQR        9676543212, 9698765412
B -> E
C -> D Partial Dependency
Find whether it is in 2NF, if not then convert it into 2NF
CK = AC
Prime attributes = AC
Non prime attributes = BDE
Soln :-
R1(ABE)
R2(CD)
R3(AC)
Normalization - 2NF
R(ABCDE)
AB -> C
D -> E
Find whether it is in 2NF, if not then convert it into 2NF
CK = ABD, so AB -> C and D -> E are partial dependencies
Soln = R1(ABC)
R2(DE)
R3(ABD)
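Spotting partial dependencies, as done by hand above, can be sketched in Python. The function partial_dependencies is a hypothetical helper; it inspects only the dependencies that are listed (not every implied one) and assumes single-letter attribute names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets whose closure is R."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys


def partial_dependencies(relation, fds):
    """Listed FDs whose left side is a proper part of a candidate key
    and whose right side contains a non-prime attribute."""
    keys = candidate_keys(relation, fds)
    prime = set().union(*keys)
    found = []
    for lhs, rhs in fds:
        non_prime = set(rhs) - prime
        if non_prime and any(set(lhs) < key for key in keys):
            found.append((lhs, "".join(sorted(non_prime))))
    return found


fds = [("AB", "C"), ("D", "E")]
print(partial_dependencies("ABCDE", fds))  # [('AB', 'C'), ('D', 'E')]
```

Each reported dependency becomes its own relation (R1(ABC), R2(DE)), and one more relation holds the candidate key itself (R3(ABD)).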
Normalization - 2NF
R:
Stud_No   Course_No   Course_Fee
1         C1          1000
2         C2          1500
1         C4          2000
4         C3          1000
4         C1          1000
2         C5          2000
CK = Stud_No, Course_No
Course_No -> Course_Fee    Partial Dependency
R1(Stud_No, Course_No)
R2(Course_No, Course_Fee)
R1:
Stud_No   Course_No
1         C1
2         C2
1         C4
4         C3
4         C1
2         C5
R2:
Course_No   Course_Fee
C1          1000
C2          1500
C3          1000
C4          2000
C5          2000
Normalization - 2NF
Stud_Id Stud_Name P_Id Project_Name
1 Abc 101 HMS
2 Def 102 LMS
3 Pqr 101 HMS
4 Xyz 103 BMS
• R(ABC)
  A -> B
  B -> C
• Find CK
• CK = A
• Soln : R1(AB)
         R2(BC)
R2:
B C
1 x
2 y
3 z
Normalization - 3NF
For every non-trivial dependency x -> y, a transitive dependency
does not exist if
Either x is a super key
Or y is a prime attribute
Partial dependency :- Prime (part of a candidate key) -> non prime
Transitive dependency :- non Prime -> non prime
Normalization - 3NF
Q. R(ABCDE)
A -> B
B -> E
C -> D
Not in 3rd NF
CK = AC
Prime Attributes = AC
Non Prime Attributes = BDE
Soln = R1(ABE) => R1(AB), R2(BE)
R3(CD)
R4(AC)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C
A -> DE
B -> F
F -> GH
D -> IJ
Not in 3rd NF
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDEFGHIJ
Soln = R1(ABC)
R2(ADE)
R3(DIJ)
R4(BF)
R5(FGH)
Normalization - 3NF
Q. R(ABCDE)
AB -> C
B -> D
D -> E
Not in 3rd NF
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDE
Soln = R1(ABC)
R2(BD)
R3(DE)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C
AD -> GH
BD -> EF
A -> I
H -> J
CK = ABD
Prime Attributes = ABD
Non Prime Attributes = CEFGHIJ
Soln = R1(ABC)
R2(ADGH)
R3(HJ)
R4(BDEF)
R5(AI)
R6(ABD)
Normalization - 3NF
Q. Cust(CID, Cname, Accno, BankCode, Branch)
Find dependency
CID -> Cname, Accno, BankCode
BankCode -> Branch
CK = CID
Prime Attributes = CID
Non Prime Attributes = Cname, Accno, BankCode, Branch
Soln = R1(CID, Cname, Accno, BankCode)
R2(BankCode, Branch)
Normalization – BCNF or 3.5NF
Boyce-Codd Normal Form
Advanced version of 3NF or stricter version of 3NF
Every relation in BCNF is also in 3NF
Partial dependency :- Prime -> non prime
Transitive dependency :- non Prime -> non prime
Dependency :- prime / non prime -> prime
R(ABC)
AB -> C
C -> B
CK = AB, AC
Prime Attributes = ABC
Non Prime Attributes = none
No partial dependency
No Transitive dependency
Normalization – BCNF or 3.5NF
R(ABC)
AB -> C
C -> B
It is in 3NF, but not in BCNF
A relation is in BCNF if and only if there are no non-trivial
functional dependencies of attributes on anything other
than a superset of a candidate key.
A relation is in BCNF if
the relation is in 3NF, and
for every functional dependency x -> y, x is a super key
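[Example tables: R(A B C) with rows (d, 3, w), (e, 3, w); R2(AC) with rows (d, w), (e, w); the remaining rows and R1 are not recoverable]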
Normalization – BCNF or 3.5NF
R(ABCD)
For x -> y:
BCNF => x should be a super key
3NF => x is super key or y is prime
2NF => x is not a proper subset of a candidate key, or y is prime
A -> BCD     BCNF
BC -> AD     BCNF
D -> B       3NF
Find CK = A, BC, CD
Prime Attributes = A, B, C, D
Non Prime Attributes = none
Soln :- R1(DB)
        R2(ABCD) => R2(ACD)
Normalization – BCNF or 3.5NF
R:
Stu_id   Sub    Prof
101      DBMS   ANV
101      CJT    HBP
102      CJT    HBP
102      TAFL   NPD
103      DBMS   ANV
104      DBMS   VBS
Stu_id, Sub -> Prof
Prof -> Sub    3NF (Prof is not a super key, so not BCNF)
CK = (Stu_id, Sub), (Stu_id, Prof)
Prime Attributes = Stu_id, Sub, Prof
Non Prime Attributes = none
Soln : R1(Prof, Sub)
       R2(Stu_id, Sub, Prof) => R2(Stu_id, Prof)
R1:
Prof   Sub
ANV    DBMS
HBP    CJT
NPD    TAFL
VBS    DBMS
R2:
Stu_id   Prof
101      ANV
101      HBP
102      HBP
102      NPD
103      ANV
104      VBS
Find Normal Forms
BCNF(3 NF, 2NF, 1NF)
3NF(2NF, 1NF)
2NF(1NF)
1NF
Find Normal Forms
For x -> y:
BCNF => x should be a super key
3NF => x is super key or y is prime
2NF => x is not a proper subset of a candidate key, or y is prime
R(ABCDEF)
CE -> D     BCNF
D -> B      Not in BCNF, Not in 3NF
E -> F
CK = ABD , BCD
Prime attribute = ABCD
Non Prime Attribute = EF
Conclusion is :- Relation is in 1NF
Find Normal Forms
For x -> y:
BCNF => x should be a super key
3NF => x is super key or y is prime
2NF => x is not a proper subset of a candidate key, or y is prime
R(ABCDEFGHI)
AB -> C
BD -> EF
AD -> GH
A -> I
Not in BCNF
Not in 3NF
Not in 2NF
CK = ABD
Prime attribute = ABD
Non Prime Attribute = CEFGHI
Conclusion is :- Relation is in 1NF
Find Normal Forms
For x -> y:
BCNF => x should be a super key
3NF => x is super key or y is prime
2NF => x is not a proper subset of a candidate key, or y is prime
R(ABCDE)
AB -> CD    BCNF
D -> A      Not in BCNF, 3NF
BC -> DE    BCNF
CK = AB, BD, BC
Prime attribute = ABCD
Non Prime Attribute = E
Conclusion is :- Relation is in 3NF
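The x -> y rules used in these exercises can be applied mechanically. The following is a rough sketch, not a definitive classifier: highest_normal_form is a hypothetical name, only the listed dependencies are inspected (implied ones are ignored), attributes are assumed to be single letters, and candidate keys are found by brute force.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets whose closure is R."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys


def highest_normal_form(relation, fds):
    keys = candidate_keys(relation, fds)
    prime = set().union(*keys)
    bcnf = third = second = True
    for lhs, rhs in fds:
        x, y = set(lhs), set(rhs) - set(lhs)  # drop the trivial part of y
        if not y or closure(x, fds) == set(relation):
            continue  # trivial, or x is a super key: fine for every form
        bcnf = False
        if y - prime:  # a non-prime attribute depends on a non-super-key
            third = False
            if any(x < key for key in keys):  # x is part of a key
                second = False
    if bcnf:
        return "BCNF"
    if third:
        return "3NF"
    return "2NF" if second else "1NF"


print(highest_normal_form("ABCDE", [("AB", "CD"), ("D", "A"), ("BC", "DE")]))  # 3NF
print(highest_normal_form("ABCDEFGHI",
                          [("AB", "C"), ("BD", "EF"), ("AD", "GH"), ("A", "I")]))  # 1NF
```

The checks are ordered from strictest to weakest, matching the containment BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF listed earlier.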
Find Normal Forms
For x -> y:
BCNF => x should be a super key
3NF => x is super key or y is prime
2NF => x is not a proper subset of a candidate key, or y is prime
R(ABCDEF)
A -> BCDEF     BCNF
BC -> ADEF     BCNF
R:
A B C D
1 a p X
2 b q y
R => R1, R2
R1, R2 => R
R1:
A B
1 a
2 b
R2:
D
X
y
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅
R:
A B C D
1 a p X
2 b q y
R1:      R2:
A B      C D
1 a      p X
2 b      q y
R1 X R2 (no common attribute, so the join is a cross product):
A B C D
1 a p X
1 a q y
2 b p X
2 b q y
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅
3. Attribute(R1) ∩ Attribute(R2) -> Attribute(R1) or
Attribute(R1) ∩ Attribute(R2) -> Attribute(R2)
R:
A B C
1 a p
2 b q
3 a r
R1:      R2:
A B      B C
1 a      a p
2 b      b q
3 a      a r
R1 ⋈ R2:
A B C
1 a p
1 a r
2 b q
3 a p
3 a r
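The lossy join in this example can be reproduced with a short Python sketch. The helpers project and natural_join are hypothetical names written for this illustration, and rows are assumed to be stored as dictionaries keyed by attribute name.

```python
def project(rows, attrs):
    """Project rows (dicts) onto `attrs`, dropping duplicate rows."""
    seen = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left, right):
    """Join two row lists on all attribute names they have in common."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in joined:
                    joined.append(merged)
    return joined


R = [
    {"A": 1, "B": "a", "C": "p"},
    {"A": 2, "B": "b", "C": "q"},
    {"A": 3, "B": "a", "C": "r"},
]
R1 = project(R, ["A", "B"])
R2 = project(R, ["B", "C"])
joined = natural_join(R1, R2)
spurious = [row for row in joined if row not in R]

print(len(joined))  # 5 rows, although R has only 3
print(spurious)     # the extra rows (1, a, r) and (3, a, p)
```

The common attribute B determines neither R1 nor R2 (the value a appears with two different A values and two different C values), so condition 3 fails and the join produces spurious tuples.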
Lossless or Lossy decomposition
Find whether the decomposition is Lossless or Lossy
R:
A B   C D E
a 122 1 p w
b 234 2 q x
a 568 1 r y
c 347 3 s z
R1 OS Shaw
Normalization – 4NF
A relation schema R is in 4NF with respect to a set D
of functional and multivalued dependencies if for all
multivalued dependencies in D+ of the form α ->> β,
where α ⊆ R and β ⊆ R, at least one of the following
hold:
α ->> β is trivial (i.e., β ⊆ α or α ∪ β = R)
α is a superkey for schema R
If a relation is in 4NF it is in BCNF
Normalization – 4NF
R = (A, B, C, G, H, I)
F = { A ->> B
      B ->> HI
      CG ->> H }
• Is the dependency preserved in the following decomposition?
R is not in 4NF since A ->> B and A is not a superkey for R
Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF)
c) R21 = (C, G, H) (R21 is in 4NF)
d) R3 = (A, C, G, I) (R3 is not in 4NF)
   Since A ->> B and B ->> HI, A ->> HI, A ->> I
e) R31 = (A, I) (R31 is in 4NF)
f) R4 = (A, C, G) (R4 is in 4NF)
Normalization – 5NF
Must be in 4NF
It should have no join dependency and also the joining must be lossless
A relation in 5NF cannot be decomposed further without any kind of
modification in the meaning or facts.
5NF is also known as Project Join Normal Form (PJNF).
R:
SID   MNo          Activity
101   9876543210   Dancing
101   9876543210   Singing
102   9812345678   Cricket
102   7777733333   Cricket
103   9865432176   Singing
R1:
MNo          Activity
9876543210   Dancing
9876543210   Singing
9812345678   Cricket
7777733333   Cricket
9865432176   Singing
R2:
SID   MNo
101   9876543210
102   9812345678
102   7777733333
103   9865432176
R3:
SID   Activity
101   Dancing
101   Singing
102   Cricket
103   Singing
R = R1 ⋈ R2 ⋈ R3
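The claim R = R1 ⋈ R2 ⋈ R3 can be checked on this sample data with a small sketch. The helpers project and natural_join are hypothetical names introduced here, and the check covers only these particular rows, not every possible instance of the schema.

```python
def project(rows, attrs):
    """Project rows (dicts) onto `attrs`, dropping duplicate rows."""
    seen = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left, right):
    """Join two row lists on all attribute names they have in common."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in joined:
                    joined.append(merged)
    return joined


R = [
    {"SID": 101, "MNo": "9876543210", "Activity": "Dancing"},
    {"SID": 101, "MNo": "9876543210", "Activity": "Singing"},
    {"SID": 102, "MNo": "9812345678", "Activity": "Cricket"},
    {"SID": 102, "MNo": "7777733333", "Activity": "Cricket"},
    {"SID": 103, "MNo": "9865432176", "Activity": "Singing"},
]
R1 = project(R, ["MNo", "Activity"])
R2 = project(R, ["SID", "MNo"])
R3 = project(R, ["SID", "Activity"])

joined = natural_join(natural_join(R1, R2), R3)
lossless = len(joined) == len(R) and all(row in R for row in joined)
print(lossless)  # True: the three projections join back to exactly R
```

Joining all three projections returns the original five rows with no spurious tuples, which is what the lossless requirement for this decomposition asks for.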
Normalization – 5NF
Sub Lecturer Sem