0% found this document useful (0 votes)
68 views

Chapter 7 Normalization

The document discusses functional dependencies and how they are used to determine relationships between attributes in a relational database. It provides examples of functional dependencies and how to determine if attributes are functionally dependent based on example tables. It also covers concepts like closure of attribute sets, Armstrong's axioms, canonical covers, keys, and data normalization.

Uploaded by

Atman Hapani
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
68 views

Chapter 7 Normalization

The document discusses functional dependencies and how they are used to determine relationships between attributes in a relational database. It provides examples of functional dependencies and how to determine if attributes are functionally dependent based on example tables. It also covers concepts like closure of attribute sets, Armstrong's axioms, canonical covers, keys, and data normalization.

Uploaded by

Atman Hapani
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 77

Chapter 7 - Relational Database Design

Functional Dependency
f : α -> β
α β α β α β
a 1 a 1 a 1
b 2 a 2 b 1
c 3 c 3 c 3
d 4 d 4 d 4

Given the value of α, one can search value of β


Functional Dependency
α -> β

Trivial Non Trivial

AB -> A β ⊄α
β⊆α AB -> ABC
Functional Dependency
R
A B C D E
a 2 3 4 5 α -> β
2 a 3 4 5
If α is unique, no need to check β
a 2 3 6 5 If β is same, no need to check α

a 2 3 6 6

Find which of the following is true


1. A -> BC
2. DE -> C
3. C -> DE
4. BC -> A
Functional Dependency
R
A B C
1 2 4
3 5 4
3 7 2
1 4 2

Find which of the following is true


1. A -> B && BC -> A
2. C -> B && CA -> B
3. B -> C && AB -> C
4. A -> C && BC -> A
Closure of Attribute Set
Attribute closure of an attribute set ‘A’ can be defined
as a set of attributes which can be functionally
determined from it denoted by F+

R(ABC)
A-> B
B -> C
Find (A) +
(A) + = A
= AB
= ABC
Closure of Attribute Set
R(ABCDEF)
A -> B
C -> DE
AC -> F
D -> AF
E -> CF
Find (D) +
(D) + = D
= DAF
= DABF
Find (DE) +
(DE) + = DE
= DAFE
= DAFCE
= DABFCE
Closure of Attribute Set
 R(ABCDEFG) Find (AC) + = ABCDE
A -> B
BC -> DE
AEG -> G
 R(ABCDE)
A -> BC Find (B) + = BD
CD -> E
B -> D
E -> A
 R(ABCDEF)
AB-> C Find (AB) + = ABCDE
BC -> AD
D -> E
CF -> B
 R(ABCDEFGH)
A -> BC Find whether BCD-> H
= yes
CD -> E
E -> C
ED-> AEH
ABH -> BD
DH -> BC
Equivalence of Functional Dependency
 R(ACDEH)
 F: A C G: A -> CD
AC -> D E -> AH
E -> AD
E -> H
 Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
 Soln :- A+ = ACD A+ = ACD
AC + = ACD E + = ACDEH
E + = ACDEH
F⊆G G⊆F
Therefore, c. F=G is true
Equivalence of Functional Dependency
 R(PQRS)
 X: P  Q Y: P -> QR
Q -> R R -> S
R -> S
 Find which one holds true
a. X ⊆ Y
b. Y ⊆ X
c. X = Y
d. X ≠ Y
 Soln :- P+ = PQRS P+ = PQRS
Q+ = Q R + = RS
R + = RS
X ⊄ Y therefore X≠Y
Therefore, b. Y ⊆ X is correct
Equivalence of Functional Dependency
 R(VWXYZ)
 F: WX G: W -> XY
WX -> Y Z -> WX
Z -> WY
Z -> V
 Find which one holds true
a. F ⊆ G
b. G ⊆ F
c. F = G
d. F ≠ G
 Soln :- ?
Armstrong’s Axiom
Primary Rules(RAT)
1. Reflexivity :- if y ⊆ x AB ⊆ ABC
then, x-> y ABC -> AB
2. Augmentation :- if x -> y
then, xz -> yz
3. Transitivity :- if x-> y && y-> z
then x -> z
Secondary Rules
1. Union:- if x-> y && x-> z
then, x -> yz
2. Decomposition :- if x-> yz
then, x-> y && x-> z

Eg:- AB -> CD UnionAB -> C


AB -> C A -> C Decomposition
AB -> D B -> C
Armstrong’s Axiom
Secondary Rules
3. Pseudo transitivity:- if x-> y && wy-> z
then, wx -> z
Proof:- x -> y
wx -> wy (Augmentation)
wy -> z (given)
therefore, wx -> z (Transitivity)
4. Composition :- if x-> y && z-> w
then, xz -> yw
Proof :- x -> y xz -> yz (Augmentation)
z -> w xz -> xw (Augmentation)
therefore, xz -> yzxw
xz -> yw
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X
WZ -> Y
Y -> W
Y -> X
Y -> Z
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X
WZ -> Y
Y -> W
Y -> X
Y -> Z
Canonical Cover
Minimal Set / Irreducible set of Functional Dependency
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
X -> W X + = XW X+ = X
WZ -> X WZ + = WZXY WZ + = WZXY
WZ -> Y WZ + = WZYX WZ + = WZ
Y -> WY + = YWXZ Y + = YWXZ
Y -> X Y + = YXZW Y + = YZ
Y -> Z Y + = YXZW Y + = YWX
Canonical Cover
R(WXYZ)
X -> W
WZ -> XY
Y -> WXZ
STEP 3: Consider remaining dependencies
X -> W
WZ -> Y
Y -> X
Y -> Z
STEP 4: Decompose LHS
WZ + = WZYX
W+ = W
Z+ = Z
OUTPUT:
X -> W
WZ -> Y
Y -> XZ
Find Canonical Cover
R(ABCD)
A -> B
C -> B
D -> ABC
AC -> D
STEP 1 : Decompose RHS STEP 2: Find Closure with and without statement
A -> B A + = AB A + = A
C -> B C + = CB C + = C
D -> A D + = DABC D + = DBC
D -> B D + = DABC D + = DABC
D -> C D + = DACB D + = DAB
AC -> D AC + = ACDB AC + = ACB
Find Canonical Cover
 R(ABCD)
A -> B
C -> B
D -> ABC
AC -> D
STEP 3: Consider remaining dependencies
A -> B
C -> B
D -> A
D -> C
AC -> D
STEP 4: Decompose LHS
AC + = ACDB
A + = AB
C + = CB
OUTPUT:
A -> B
C -> B
D -> AC
AC -> D
Find Canonical Cover
R(vwxyz)
V -> W
VW -> X
Y -> VXZ

 Soln :- ?
Keys
R(ABCD)
A -> BC
A + = ABC
A + ≠ ABCD ‘A’ CANNOT BE A KEY
(Key) + = R
A super key is a set of one or more attributes (columns), which can
uniquely identify a row in a table
R(ABCD)
ABC -> D (ABC) + = ABCD Super Key

AB -> CD (AB) + = ABCD Super Key


Super Key Candidate
A -> BCD (A) + = ABCD
Key
A candidate key is a minimal set of super key
A primary key can be any candidate key selected by DBA.
Keys
R(ABCD)
B -> ACD
ACD -> B
Find Super key, candidate key and primary key
(B) + = ABCD Super Key Candidate
Key
(ACD) = ABCD
+
Super Key Candidate
Find Super key, candidate key and primaryKey
key
R(ABCD)
AB -> C (AB) + = ABCD
C -> BD (C) + = ABCD
D -> A (D) + = AD
Compute Candidate keys
R(ABCD)
A -> B
B -> C
C -> A
Find essential attribute, and then combine it with LHS
attributes
Here D is essential attribute
(AD) + = ABCD
(BD) + = ABCD
(CD) + = ABCD
Here candidate keys are AD, BD, CD
Compute Candidate keys
R(ABCD)
AB -> CD
D -> A
Find essential attribute, and then combine it with LHS
attributes
Here B is essential attribute
(AB) + = ABCD
(BC) + = BC
(BD) + = ABCD
Here candidate keys are AB and BD.
Compute Candidate keys
R(ABCDEF)
AB -> C
C -> D
B -> AE
Here B and F are essential attributes
(BF) + = BFAECD
 Here candidate key is BF.
Compute Candidate keys
Q) R(ABCD)
AB -> CD
C -> A
D -> B
Here, no essential attribute present.
Check individual attributes
A+ = A AB + = ABCD
B+ = B AC + = AC
C + = CA AD + = ADBC
D + = DB BC + = BCAD
CK are AB,AD,BC and CD. BD +
= BD
CD + = CDBA
Compute Candidate keys
Q) R(ABCDE)
AB -> CD
D -> A
BC -> DE
Here, no essential attribute present.
Check individual attributes
A+ = A AB + = ABCD
B+ = B AC + = AC
C + = CA AD + = ADBC
D + = DB BC + = BCAD
CK are AB,AD,BC and CD. BD +
= BD
CD + = CDBA
Compute Candidate keys
Q) R(ABCDEF)
AB -> C
DC -> AE
E -> F
CK = ABD,BCD
Q) R(ABCDEF)
AB -> C
C -> D
D -> BE
E -> F
F -> A
CK = C,D,AB,BE,BF
Data Redundancy
Stu ID Name Age Sem Branch ID Branch HOD
Name
1 A 17 1 101 IT XY
2 B 17 1 101 IT XY
3 C 18 1 101 IT XY
4 D 18 1 102 CE PQ
5 E 19 1 102 CE PQ
6 F 17 1 103 IC AB

• Data Redundancy :- when same data is stored multiple times


unnecessarily in a database
• Disadvantages
• Insertion Anomalies, Deletion Anomalies and Updation Anomalies
• Inconsistent Data
• Database size increases, time of search also increases.
Normalization
Stu ID Name Age Sem Branch ID Branch HOD
Name
1 A 17 1 101 IT XY
2 B 17 1 101 IT XY
3 C 18 1 101 IT XY
4 D 18 1 102 CE PQ
5 E 19 1 102 CE PQ
6 F 17 1 103 IC AB

Stu ID Name Age Sem Branch Branch ID Branch Name HOD


ID
101 IT XY
1 A 17 1 101
102 CE PQ
2 B 17 1 101
103 IC AB
3 C 18 1 101
4 D 18 1 102
5 E 19 1 102
6 F 17 1 103
Normalization - 1NF
A relation is in first normal form if and only if the
domain of each attribute contains only atomic
(indivisible) values, and the value of each attribute
contains only a single value from that domain.
Stu ID Stu Name Course Stu ID Stu Name Course
101 ABC DBMS, 101 ABC DBMS
TAFL
101 ABC TAFL
102 PQR CJT,
AMP 102 PQR CJT
102 PQR AMP

Stu_id
Stud Course

Stu Name
Normalization - 1NF
Emp ID Emp Name Contact No
101 ABC 8412332142,
8456321111
102 PQR 9676543212,
9698765412

Emp ID Emp Name Contact No


101 ABC 8412332142
101 ABC 8456321111
102 PQR 9676543212
102 PQR 9698765412
Normalization - 2NF
Must be in 1NF
No partial dependency should be present
R(A B C D)
AB->D
B -> C Partial Dependency exists
Find candidate key
ans :- AB
Prime Attributes = A ,B
Non Prime Attributes = C,D
Normalization - 2NF
A B
R(A B C D)
-- 1 AB->D
2 -- B -> C B -> C
-- -- CK = AB
3 4
C is dependent on B
How to decompose
R1(ABD)
R2(BC)
Normalization - 2NF
A B C R(ABC)
a 1 X B -> C
b 2 Y
CK = ?
a 3 z
c 3 z A B B C
d 3 z a 1 1 X
e 3 z b 2 2 Y
a 3 3 z
• CK = AB c 3
• Prime Attributes = AB d 3
• Non Prime Attributes = C e 3
• Partial dependency exists as C is only
dependent on B instead of AB
R1 R2
• Soln:- ?
• R1(AB)
• R2(BC)
Normalization - 2NF
R(ABCDE)
A -> B Partial Dependency

B -> E
C -> D Partial Dependency
Find whether it is in 2NF, if not then convert it into 2NF
CK = AC
Prime attributes = AC
Non prime attributes = BDE
Soln :-
R1(ABE)
R2(CD)
R3(AC)
Normalization - 2NF
R(ABCDE)
AB -> C
D -> E
Find whether is it in 2NF, if not then convert it into 2NF
Sol = R1(ABC)
R2(DE)
R3(ABD)
Normalization - 2NF
Stud_No Course_No Course_Fee Stud_No Course_No
1 C1 1000 1 C1
2 C2 1500 2 C2
1 C4 2000
1 C4 R1
4 C3 1000
4 C3
4 C1 1000
4 C1
2 c5 2000
2 c5
CK = Stud_No, Course_No
Course_No -> Course_Fee Partial Dependency
R1(Stud_No, Course_No) Course_No Course_Fee
C1 1000
R2(Course_No, Course_Fee)
C2 1500
C3 1000 R2
C4 2000
C5 2000
Normalization - 2NF
Stud_Id Stud_Name P_Id Project_Name
1 Abc 101 HMS
2 Def 102 LMS
3 Pqr 101 HMS
4 Xyz 103 BMS

Find whether it is in 2NF, if not then convert it into 2NF


Find dependency
PD
Stud_Id -> Stud_Name
PD
P_Id -> Project Name
CK = Stud_Id, P_Id
Sol = R1(Stud_Id, Stud_Name)
R2(P_Id, Project_Name)
R3(Stud_Id, P_Id)
Normalization - 3NF
• Must be in 2NF
• No Transitive dependency should be present
• R(ABC)
A -> B
B -> C TD
• Find CK
• CK = A
• Transitive dependency exists if non prime attribute
depends on another non prime attribute
x -> y
• x,y both are non prime
Normalization - 3NF
A B C A B
a 1 x a 1
b 1 x b 1
c 1 x c 1 R1
d 2 y d 2
e 2 y e 2
f 3 z
f 3
g 3 z
g 3

• R(ABC) B C
A -> B 1 x R2
B -> C
2 y
• Find CK
3 z
• CK = A
• Soln : R1(AB)
R2(BC)
Normalization - 3NF
For every dependency from x -> y, transitive dependency
does not exists if
Either x is super key
Or y is a prime attribute
Partial dependency :- Prime -> non prime
Transitive dependency :- non Prime -> non prime
Normalization - 3NF
Q. R(ABCDE)
A -> B
B -> E Not in 3rd NF

C -> D
CK = AC
Prime Attributes = AC
Non Prime Attributes = BDE
 Soln = R1(ABE) => R1(AB), R2(BE)
R3(CD)
R4(AC)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C
A -> DE
Not in 3rd NF
B -> F
F -> GH
D -> IJ
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDEFGHIJ
 Soln = R1(ABC)
R2(ADE)
R3(DIJ)
R4(BF)
R5(FGH)
Normalization - 3NF
Q. R(ABCDE)
AB -> C
B -> D Not in 3rd NF

D -> E
CK = AB
Prime Attributes = AB
Non Prime Attributes = CDE
 Soln = R1(ABC)
R2(BD)
R3(DE)
Normalization - 3NF
Q. R(ABCDEFGHIJ)
AB -> C A -> I
AD -> GH H -> J
BD -> EF
CK = ABD
Prime Attributes = ABD
Non Prime Attributes = CEFGHIJ
 Soln = R1(ABC)
R2(ADGH)
R3(HJ)
R4(BDEF)
R5(AI)
R6(ABD)
Normalization - 3NF
Q. Cust(CID, Cname, Accno, BankCode, Branch)
Find dependency
CID -> Cname, Accno, BankCode
BankCode -> Branch
CK = CID
Prime Attributes = CID
Non Prime Attributes = Cname, Accno, BankCode,
Branch
 Soln = R1(CID, Cname, Accno, BankCode)
R2(BankCode, Branch)
Normalization – BCNF or 3.5NF
Boyce-Codd Normal Form
Advanced version of 3NF or stricter version of 3NF
Every relation in BCNF is also in 3NF
Partial dependency :- Prime -> non prime
Transitive dependency :- non Prime -> non prime
Dependency :- prime / non prime -> prime
R(ABC)
CK = AB, AC
AB -> C Prime Attributes = ABC
C -> B Non Prime Attributes = none
No partial dependency
No Transitive dependency
Normalization – BCNF or 3.5NF
R(ABC)
Not in BCNF
AB -> C
C -> B
It is in 3NF,
A relation is in BCNF if and only if there are no non-trivial
functional dependencies of attributes on anything other
than a superset of a candidate key.
A relation is in BCNF if
relation is in 3NF, and
every functional dependency

X → Y, X is the super key of the table.


Normalization – BCNF or 3.5NF
R(ABC)
AB -> C Not in BCNF
C -> B CK = AB, AC
A B C A Soln:-
C R1(CB)
C B
a 1 x a x
x R2(AB) 1
Not considered
b 2 y b y y 2
R3(AC)
z 2
c 2 z c z
c 3 w
Final Soln:-R1(CB)
c w w 3

d 3 w d w R2(AC)
R2

e 3 w e w

R R1
Normalization – BCNF or 3.5NF
R(ABCD) x -> y
BCNF => x should be a super key
A -> BCD BCNF
3NF => x is super key or y is prime
BC -> AD BCNF 2NF => y is non prime and x is super key

D -> B 3NF
Find CK = A , BC
Prime Attributes = A, B, C
Non Prime Attributes = D
Soln :- R1(DB)
R2(ABCD)=>R(ACD)
Normalization – BCNF or 3.5NF
Stu_id Sub
Prof -> Sub Prof 3 NF
101Stu_id,
DBMSSub ->ANV
Prof
101 CJT HBP Prof Sub Stu_id Prof
102 CJT HBP ANV DBMS 101 ANV
102 TAFL NPD HBP CJT 101 HBP
103 DBMS ANV NPD TAFL 102 HBP
104 DBMS VBS VBS DBMS 102 NPD
103 ANV
R1
CK = Stu_id, Sub
104 VBS
Prime Attributes = stu_id, Sub
Non Prime Attributes = Prof R2
Soln : R1(Prof, Sub)
R2(Stu_id, Sub, Prof) => R2(Stu_id, Prof)
Find Normal Forms
BCNF(3 NF, 2NF, 1NF)
3NF(2NF, 1NF)
2NF(1NF)
1NF
Find Normal Forms x -> y
R(ABCDE) BCNF => x should be a super key
3NF => x is super key or y is prime
CE -> D BCNF 2NF => x is super key and y is non prime
or prime
D -> B Not in BCNF Not in 3NF

C -> A Not in 2NF


CK = CE
Prime attribute = CE
Non Prime Attribute = ABD
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDEF) BCNF => x should be a super key
3NF => x is super key or y is prime
AB -> C Not in BCNF 3NF 2NF => x is super key and y is non prime
or prime
DC -> AE Not in 3NF Not in 2NF

E -> F
CK = ABD , BCD
Prime attribute = ABCD
Non Prime Attribute = EF
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDEFGHI) BCNF => x should be a super key
Not in BCNF 3NF => x is super key or y is prime
AB -> C 2NF => x is super key and y is non prime
Not in 3NF
or prime
Not in 2NF
BD -> EF
AD -> GH
A -> I
CK = ABD
Prime attribute = ABD
Non Prime Attribute = CEFGHI
Conclusion is :- Relation is in 1NF
Find Normal Forms x -> y
R(ABCDE) BCNF => x should be a super key
BCNF 3NF => x is super key or y is prime
AB -> CD 2NF => x is super key and y is non prime
or prime
D -> A Not in BCNF 3NF

BC -> DE
CK = AB, BD, BC
Prime attribute = ABCD
Non Prime Attribute = E
Conclusion is :- Relation is in 3NF
Find Normal Forms x -> y
R(ABCDEF) BCNF => x should be a super key
3NF => x is super key or y is prime
A -> BCDEF BCNF 2NF => x is super key and y is non prime
or prime
BC -> ADEF BCNF

DEF -> ABC BCNF


CK = ?
Prime attribute =
Non Prime Attribute =
Conclusion is :- Relation is in BCNF
Lossless Join Decomposition
Non Additive property
This property guarantees that the extra or loss tuple
generation problem does not occur after
decomposition
It is a mandatory property that must always hold true
A B C D
1 a p X
2 b q y R1, R2
R =>
A B
R1 , R2 => R
C D
1 a p X
2 b q y
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)

A B C D
1 a p R =>
X R1, R2
2 b q R1 y, R2 => R
R
A B D
1 a X
2 b y

R1 R2
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅

A B C D A B C D
1 a p X 1 a p X
2 b q y 1 a q Y

R 2 b p x

A B C D 2 b q y

1 a p X R1 X R2
2 b q y
R1 R2
Lossless Join Decomposition
If a relation R is decomposed into two relations R1 and
R2, then it will be lossless if
1. Attribute(R1) U Attribute(R2) = Attribute(R)
2. Attribute(R1) ∩ Attribute(R2) ≠ ∅
3. Attribute(R1) ∩ Attribute(R2) -> Attribute(R1) or
Attribute(R1) ∩ Attribute(R2) -> Attribute(R1)
A B C A B C
A B B C
1 a p 1 a p
R1 1 a a p R2
R 1 a r R1 ⋈ R2
2 b q 2 b b q
3 a r 2 b q
3 a a r
3 a p
3 a r
Lossless or Lossy decomposition
Find whether the decomposition is Lossless or Lossy

A B C D E
a 122 1 p w
R
b 234 2 q x
a 568 1 r y
c 347 3 s z

1. R1(AB) , R2(CD) Lossy

2. R1(ABC) , R2(DE) Lossy


3. R1(ABCD) , R2(ACDE) Lossless
4. R1(ABCD) , R2(DE) Lossless
5. R1(ABC) , R2(BCD), R3(DE) Lossless
R12(ABCD)
Lossless or Lossy decomposition
Find whether the decomposition is Lossless or Lossy
R(VWXYZ)
Z -> Y
Y -> Z
X -> YV
VW -> X
1. R1(VW) , R2(WXYZ) Nothing can be said
2. R1(VWX) , R2(YZ) Lossy
3. R1(VW) , R2(YZ) Lossy
4. R1(VWX) , R2(XYZ) Lossless
Dependency Preservation
 If a Table R is having functional dependency set F, is decomposed into two
tables R1 and R2 having functional dependency set F1 and F2, then
F1 ⊆ F
F2 ⊆ F
(F1 U F2)+ = F+
 R(ABC)
A -> B
B -> C
C -> A
 R1(AB), R2(BC)
R1(AB) R2(BC)
A -> B B -> C
B -> A C-> B
(C)+ = CBA
There fore, dependency is preserved for given relation
Dependency Preservation
 If a Table R is having functional dependency set F, is decomposed into two tables R1 and
R2 having functional dependency set F1 and F2, then
F1 ⊆ F
F2 ⊆ F
(F1 U F2)+ = F+
 R(ABCD)
AB -> CD
D -> A
R1(AD), R2(BCD)
R1(AD) R2(BCD)
A -> A B -> B
D -> A C -> C
D -> D
BC -> BC
BD -> BDC
CD-> CD
(AB)+ = AB No Dependency Preservation
BCNF VS 3NF
It is always possible to decompose a relation into
relations in 3NF and
the decomposition is lossless
the dependencies are preserved
It is always possible to decompose a relation into
relations in BCNF and
the decomposition is lossless
it may not be possible to preserve dependencies.
Normalize Relation Table
 R(ABCDEFGHIJ)
AB -> C
A -> DE
B -> F
F -> GH
D -> IJ
 Find CK = AB
 Find Prime and Non Prime Attributes = AB , = CDEFGHIJ
 Find in which Normal form relation is = 1NF
 Decompose = R1(ABC)
R2(ADEIJ) =>R21(ADE), R22(DIJ)
R3(BFGH) =>R31(BF), R32(FGH)
 Find decomposition is lossy or lossless
 Find whether dependency is preserved or not.
Design Goals
 Goal for a relational database design is:
 BCNF.
 Lossless join.
 Dependency preservation.
 If we cannot achieve this, we accept one of
 Lack of dependency preservation
 Redundancy due to use of 3NF
 Interestingly, SQL does not provide a direct way of specifying
functional dependencies other than super keys.
 Can specify FDs using assertions, but they are expensive to test
 Even if we had a dependency preserving decomposition, using SQL
we would not be able to efficiently test a functional dependency
whose left hand side is not a key.
Normalization – 4NF
 Must be in BCNF
 Not more than one Multivalued dependency should be present.
 Multivalued Dependency :-
 A table is said to have multi-valued dependency, if the following
conditions are true,
 For a dependency A ->-> B, if for a single value of A, multiple
value of B exists, then the table may have multi-valued
dependency.
 A table should have at-least 3 columns for it to have a multi-
valued dependency.
 And, for a relation R(A,B,C), if there is a multi-valued
dependency between, A and B, then B and C should be
independent of each other.
 Symbol is ->-> or ->>
Normalization – 4NF
Course Teacher Book
DBMS ANV
R(Course,
Korth
Teacher, Book)
Course ->-> Teacher
DBMS ANV Ullman
Course ->-> Book
DBMS VBC Korth
This relation is not in 4NF
DBMS VBC Ullman
Soln :- R1(Course, Teacher)
OS VKP William
R2(Course, Book)
OS VKP Shaw

Course Teacher Course Book


DBMS ANV DBMS Korth
DBMS VBC DBMS Ullman R2
OS VKP OS William

R1 OS Shaw
Normalization – 4NF
A relation schema R is in 4NF with respect to a set D
of functional and multivalued dependencies if for all
multivalued dependencies in D+ of the form   ,
where   R and   R, at least one of the following
hold:
   is trivial (i.e.,    or    = R)
 is a superkey for schema R
If a relation is in 4NF it is in BCNF
Normalization – 4NF
 R =(A, B, C, G, H, I)
• Is dependency preserved in
F ={ A  B following decomposition ?
B  HI
CG  H }
 R is not in 4NF since A  B and A is not a superkey for R
 Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF)
c) R21 = (C, G, H) (R21 is in 4NF)
d) R3 = (A, C, G, I) (R3 is not in 4NF)
 Since A  B and B  HI, A  HI, A  I
e) R31 = (A, I) (R31 is in 4NF)
f )R4 = (A, C, G) (R4 is in 4NF)
Normalization – 5NF
 Must be in 4NF
 It should have no join dependency and also the joining must be
lossless R2
 A relation in 5NF cannot be decomposed further without any kind of
SID MNo
modification in the meaning or facts.
101 9876543210
 5NF is also known as Project Join Normal Form (PJNF).
102 9812345678
SID MNo Activity MNo Activity
102 7777733333
101 9876543210 Dancing 9876543210 Dancing
103 9865432176
101 9876543210 Singing 9876543210 Singing
102 9812345678 Cricket 9812345678 Cricket SID Activity
102 7777733333 Cricket 7777733333 Cricket 101 Dancing
103 9865432176 Singing 9865432176 Singing 101 Singing
R R1 102 Cricket
R3
103 Singing
 R = R1 R2 R3
Normalization – 5NF
Sub Lecturer Sem

Comp Adam Sem1


Comp John Sem1
Math John Sem1
Math Smith Sem2
Chemistry Ely Sem1
Denormalization for performance
 May want to use non-normalized schema for performance
 E.g. displaying customer-name along with account-number and
balance requires join of account with depositor
 Alternative 1: Use denormalized relation containing attributes of
account as well as depositor with all above attributes
 faster lookup
 Extra space and extra execution time for updates
 extra coding work for programmer and possibility of error in
extra code
 Alternative 2: use a materialized view defined as
account depositor
 Benefits and drawbacks same as above, except no extra coding
work for programmer and avoids possible errors
Some design issues
 Some aspects of database design are not caught by
normalization
 Examples of bad database design, to be avoided:
Instead of earnings(company-id, year, amount), use
 earnings-2000, earnings-2001, earnings-2002, etc., all on the
schema (company-id, earnings).
 Above are in BCNF, but make querying across years difficult
and needs new table each year
 company-year(company-id, earnings-2000, earnings-2001,
earnings-2002)
 Also in BCNF, but also makes querying across years difficult
and requires new attribute each year.
 Is an example of a crosstab, where values for one attribute
become column names
 Used in spreadsheets, and in data analysis tools

You might also like