Normalization

Uploaded by

jaishreemane73
Copyright
© © All Rights Reserved
Outline
• Functional Dependency
• Definition and types of FD
• Armstrong's axioms (inference rules)
• Closure of FD set
• Closure of attribute set
• Canonical cover
• Decomposition and its types
• Anomaly in database design and its types
• Normalization and normal forms
• 1NF
• 2NF
• 3NF
• BCNF
• 4NF
• 5NF
Functional Dependency (FD) and its types
Section – 1.1
What is Functional Dependency (FD)?
 Let R be a relation schema having n attributes A1, A2, A3, …, An.
 Let X and Y be two subsets of the attributes of R.
 If the values of the X component of a tuple uniquely (or functionally) determine the values of the Y component, then there is a functional dependency from X to Y.
 This is denoted by X → Y (e.g. RollNo → {Name, SPI, BL} in the Student relation below).
 It is read as: Y is functionally dependent on X, or X functionally determines Y.

Student
RollNo  Name    SPI  BL
101     Raju    8    0
102     Mitesh  7    1
103     Jay     7    0
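The definition can be checked mechanically against a relation instance. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper name `fd_holds` is illustrative and not part of the slides. Note that an instance can only refute an FD: an FD is a constraint on the schema, so a passing check on three rows does not prove it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


student = [
    {"RollNo": 101, "Name": "Raju", "SPI": 8, "BL": 0},
    {"RollNo": 102, "Name": "Mitesh", "SPI": 7, "BL": 1},
    {"RollNo": 103, "Name": "Jay", "SPI": 7, "BL": 0},
]

print(fd_holds(student, ["RollNo"], ["Name", "SPI", "BL"]))  # True
print(fd_holds(student, ["SPI"], ["BL"]))  # False: SPI 7 appears with BL 1 and BL 0
```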
Diagrammatic representation of Functional Dependency (FD)
[Diagram: three arrow diagrams showing X → Y, {X1, X2} → Y and X → {Y1, Y2}]

 Example
 Consider the relation Account(account_no, balance, branch).
 account_no can determine balance and branch.
 So, there is a functional dependency from account_no to balance and
branch.
 This can be denoted by account_no → {balance, branch}.
Types of Functional Dependency (FD)
 Full Functional Dependency
 In a relation, attribute B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.
 E.g. {Roll_No, Semester, Department_Name} → SPI
 We need all three of {Roll_No, Semester, Department_Name} to find SPI.
 Partial Functional Dependency
 In a relation, attribute B is partially functionally dependent on A if B is functionally dependent on A and also on some proper subset of A.
 If some attribute can be removed from A and the dependency still holds, then it is a partial functional dependency.
 E.g. {Enrollment_No, Department_Name} → SPI
 Enrollment_No alone is sufficient to find SPI; Department_Name is not required.

Types of Functional Dependency (FD)
 Transitive Functional Dependency
 In a relation, if A → B and B → C, then A → C (C is transitively dependent on A via B).

Sub_Fac
Subject  Faculty  Age
DS       Shah     35
DBMS     Patel    32
DF       Shah     35

 E.g. Subject → Faculty and Faculty → Age, so Subject → Age.
 By the rule of transitive dependency, Subject → Age holds: if we know the subject, we know its faculty and hence the faculty’s age.

Types of Functional Dependency (FD)
 Trivial Functional Dependency
 X → Y is trivial FD if Y is a subset of X
 Eg. {Roll_No, Department_Name, Semester} → Roll_No
 Nontrivial Functional Dependency
 X → Y is nontrivial FD if Y is not a subset of X
 Eg. {Roll_No, Department_Name, Semester} → Student_Name

Armstrong's axioms OR Inference rules
Section – 1.2
Armstrong's axioms OR Inference rules
 Armstrong's axioms are a set of rules used to infer (derive) all the functional dependencies that hold on a relational database.
 Reflexivity: if B is a subset of A, then A → B
 Augmentation: if A → B, then AC → BC
 Self-determination: A → A
 Transitivity: if A → B and B → C, then A → C
 Pseudo-transitivity: if A → B and BD → C, then AD → C
 Decomposition: if A → BC, then A → B and A → C
 Union: if A → B and A → C, then A → BC
 Composition: if A → B and C → D, then AC → BD
 Reflexivity, augmentation and transitivity are the axioms proper; the remaining rules can be derived from them.

Closure of a set of FDs
Section – 2
What is closure of a set of FDs?
 Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
 E.g. if F = {A → B, B → C}, then we can infer that A → C (by the transitivity rule).
 The set of all functional dependencies (FDs) that are logically implied by F is called the closure of F.
 It is denoted by F+.

Closure of a set of FDs [Example]
 Suppose we are given a relation schema R(A, B, C, G, H, I) and the set of functional dependencies:
 F = {A → B, A → C, CG → H, CG → I, B → H}
 The functional dependency A → H is logically implied.
 We have A → B and B → H, so by the transitivity rule A → H.

 With the same R and F, the functional dependency CG → HI is logically implied.
 We have CG → H and CG → I, so by the union rule CG → HI.

 With the same R and F, the functional dependency AG → I is logically implied.
 We have A → C and CG → I, so by the pseudo-transitivity rule AG → I.

 AG → I can also be derived in two steps without pseudo-transitivity.
 We have A → C, so by the augmentation rule AG → CG.
 We have AG → CG and CG → I, so by the transitivity rule AG → I.

 Find the closure of F (same R and F as above).
 Several members of F+ are: A → H, CG → HI, AG → I.

Closure of a set of FDs [Example]
 Compute the closure of the following set F of functional dependencies for relational schema R = (A, B, C, D, E, F):
 F = {A → B, A → C, CD → E, CD → F, B → E}

Given FDs          Rule                 Derived FD
A → B & A → C      Union                A → BC
CD → E & CD → F    Union                CD → EF
A → B & B → E      Transitivity         A → E
A → C & CD → E     Pseudo-transitivity  AD → E
A → C & CD → F     Pseudo-transitivity  AD → F

 Among others, F+ contains A → BC, CD → EF, A → E, AD → E, AD → F.

Closure of a set of FDs [Example]
 Compute the closure of the following set F of functional dependencies for relational schema R = (A, B, C, D, E):
 F = {AB → C, D → AC, D → E}

Given FDs         Rule           Derived FD
D → AC            Decomposition  D → A & D → C
D → AC & D → E    Union          D → ACE

 Among others, F+ contains D → A, D → C, D → ACE.

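Enumerating F+ by hand quickly becomes tedious. A minimal sketch of a membership test, assuming the attribute-closure method that the next section introduces: X → Y belongs to F+ exactly when Y is contained in the closure of X. The names `closure` and `implies` are illustrative, and single-letter attributes are written as strings such as "CD".

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is a member of the closure of fds."""
    return set(rhs) <= closure(lhs, fds)


F1 = [("A", "B"), ("A", "C"), ("CD", "E"), ("CD", "F"), ("B", "E")]
F2 = [("AB", "C"), ("D", "AC"), ("D", "E")]

print(implies(F1, "AD", "F"))   # True
print(implies(F1, "A", "F"))    # False: D is needed as well
print(implies(F2, "D", "ACE"))  # True
```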
Closure of attribute sets
Section – 3
What is a closure of attribute sets?
 Given a set of attributes α, the closure of α under F is the set of attributes that are functionally determined by α under F.
 It is denoted by α+.
Algorithm
 Algorithm to compute α+, the closure of α under F
 Steps
1. result = α
2. while (changes to result) do
     for each β → γ in F do
       if β ⊆ result then result = result ∪ γ

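The steps above can be sketched in Python as follows. This is one possible rendering, not the only one: FDs are assumed to be (left, right) pairs of strings with single-letter attribute names, and the function name `attribute_closure` is illustrative.

```python
def attribute_closure(alpha, fds):
    """Compute alpha+ under fds, where fds is a list of (beta, gamma) pairs."""
    result = set(alpha)                      # step 1: result = alpha
    changed = True
    while changed:                           # step 2: repeat while result changes
        changed = False
        for beta, gamma in fds:
            # if beta is contained in result, add gamma to result
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print("".join(sorted(attribute_closure("AG", F))))  # ABCGHI
```

The loop always terminates because `result` only grows and is bounded by the attributes that appear in α and F.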
Closure of attribute sets [Example]
 Consider the relation schema R = (A, B, C, G, H, I).
 For this relation, a set of functional dependencies F can be given as F = {A → B, A → C, CG → H, CG → I, B → H}
 Find the closure (AG)+.

Step 1: result = α, so result = AG
Step 2: scan each FD β → γ in F
FD       Check        result
A → B    A ⊆ AG       ABG
A → C    A ⊆ ABG      ABCG
CG → H   CG ⊆ ABCG    ABCGH
CG → I   CG ⊆ ABCGH   ABCGHI
B → H    B ⊆ ABCGHI   ABCGHI

A second pass over F changes nothing, so the loop stops.
(AG)+ = ABCGHI
Closure of attribute sets [Exercise]
 Given functional dependencies (FDs) for relational schema R =
(A,B,C,D,E):
 F = {A → BC, CD → E, B → D, E → A}
 Find Closure for A
 Find Closure for CD
 Find Closure for B
 Find Closure for BC
 Find Closure for E
Answer

A+ = ABCDE
CD+ = ABCDE
B+ = BD
BC+ = ABCDE
E+ = ABCDE



Decomposition
Section – 5
What is decomposition?
 Decomposition is the process of breaking down a given relation into two or more relations.
 Relation R is replaced by two or more relations in such a way that:
 Each new relation contains a subset of the attributes of R
 Together, they all include all tuples and attributes of R
 Types of decomposition
 Lossy decomposition
 Lossless decomposition (non-loss decomposition)

25
Lossy decomposition
 The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does not yield the same relation as R.
 This is also referred to as a lossy-join decomposition.
 The disadvantage of such a decomposition is that some information is lost when the original relation is reconstructed.
 From a practical point of view, a decomposition should not be lossy.

Customer
Ano  Balance  Bname
A01  5000     Rajkot
A02  5000     Surat

Table-1          Table-2
Ano  Balance     Balance  Bname
A01  5000        5000     Rajkot
A02  5000        5000     Surat

Join of Table-1 and Table-2 (not the same as Customer)
Ano  Balance  Bname
A01  5000     Rajkot
A01  5000     Surat
A02  5000     Rajkot
A02  5000     Surat

Lossless decomposition
 The decomposition of relation R into R1 and R2 is lossless when the join of R1 and R2 produces the same relation as R.
 This is also referred to as a non-additive (non-loss) decomposition.
 Every decomposition should be lossless.

Customer
Ano  Balance  Bname
A01  5000     Rajkot
A02  5000     Surat

Table-1          Table-2
Ano  Balance     Ano  Bname
A01  5000        A01  Rajkot
A02  5000        A02  Surat

Join of Table-1 and Table-2 (same as Customer)
Ano  Balance  Bname
A01  5000     Rajkot
A02  5000     Surat

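The difference between the two decompositions of Customer can be reproduced with a small natural-join sketch. The helpers `project` and `natural_join` are illustrative names written for this example, and relations are assumed to be sets of rows with set semantics (duplicates removed).

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate rows."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(r1, r2):
    """Join two projected relations on their common attribute names."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(tuple(sorted({**d1, **d2}.items())))
    return out


customer = [
    {"Ano": "A01", "Balance": 5000, "Bname": "Rajkot"},
    {"Ano": "A02", "Balance": 5000, "Bname": "Surat"},
]
original = {tuple(sorted(r.items())) for r in customer}

# Split on Balance (lossy) versus split on Ano (lossless)
lossy = natural_join(project(customer, ["Ano", "Balance"]),
                     project(customer, ["Balance", "Bname"]))
lossless = natural_join(project(customer, ["Ano", "Balance"]),
                        project(customer, ["Ano", "Bname"]))

print(len(lossy), lossy == original)        # 4 False
print(len(lossless), lossless == original)  # 2 True
```

The lossy join produces extra (spurious) tuples rather than fewer, which is why the original Customer relation can no longer be recovered from it.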
Anomaly and its types
Section – 6
What is an anomaly in database design?
 Anomalies are problems that can occur in a poorly planned, un-normalized database where all the data are stored in one table.
 Three types of anomalies can arise in a database because of redundancy:
 Insert anomaly
 Delete anomaly
 Update / Modification anomaly

29
Insert anomaly
 Consider a relation Emp_Dept(EID, Ename, City, DID, Dname, Manager) with EID as the primary key.

Emp_Dept
EID   Ename  City    DID  Dname  Manager
1     Raj    Rajkot  1    CE     Shah
2     Meet   Surat   1    CE     Shah
NULL  NULL   NULL    2    IT     NULL     (want to insert the new department detail, IT)

 An insert anomaly occurs when certain attributes cannot be inserted into the database without the presence of another attribute.
 Suppose a new department (IT) has been started by the organization, but initially no employee has been appointed to it.
 We want to insert that department's details into the Emp_Dept table.
 But the tuple for this department cannot be inserted, because EID would be NULL, which is not allowed since EID is the primary key.
 This kind of problem, where some tuple cannot be inserted into the relation, is known as an insert anomaly.
Delete anomaly
 Consider a relation Emp_Dept(EID, Ename, City, DID, Dname, Manager) with EID as the primary key.

Emp_Dept
EID  Ename  City    DID  Dname  Manager
1    Raj    Rajkot  1    CE     Shah
2    Meet   Surat   1    CE     Shah
3    Jay    Baroda  2    IT     Dave     (want to delete employee Jay's detail)

 A delete anomaly exists when certain attributes are lost because of the deletion of another attribute.
 Now suppose there is only one employee in some department (IT) and that employee leaves the organization.
 So we need to delete the tuple of that employee (Jay).
 But in doing so, the information about the department is deleted as well.
 This kind of problem, where deleting some tuples leads to the loss of other data not intended to be removed, is known as a delete anomaly.
Update anomaly
 Consider a relation Emp_Dept(EID, Ename, City, Dname, Manager) with EID as the primary key.

Emp_Dept
EID  Ename  City    Dname     Manager
1    Raj    Rajkot  CE        Sah
2    Meet   Surat   C.E       Shah
3    Jay    Baroda  Computer  Shaah
4    Hari   Rajkot  IT        Dave
(want to update the manager of the CE department)

 An update anomaly exists when one or more records (instances) of duplicated data are updated, but not all.
 Suppose the manager of the CE department has changed. This requires that the Manager value in all the tuples corresponding to that department be changed to reflect the new status.
 If we fail to update all the tuples of the given department, then two records of employees working in the same department might show different managers, which leads to inconsistency in the database.
How to deal with insert, delete and update anomaly

Emp_Dept
EID   Ename  City    DID  Dname  Manager
1     Raj    Rajkot  1    CE     Shah
2     Meet   Surat   1    C.E    Shah
3     Jay    Baroda  2    IT     Dave
NULL  NULL   NULL    3    EC     NULL

Emp                           Dept
EID  Ename  City    DID       DID  Dname  Manager
1    Raj    Rajkot  1         1    CE     Shah
2    Meet   Surat   1         2    IT     Dave
3    Jay    Baroda  2         3    EC     NULL

 Such anomalies in the database design can be solved by using normalization.

Normalization and normal forms
Section – 7
What is normalization?
 Normalization is the process of removing redundant data from tables to improve data integrity, scalability and storage efficiency.
 data integrity (completeness, accuracy and consistency of data)
 scalability (ability of a system to continue to function well under a growing amount of work)
 storage efficiency (ability to store and manage data using the least amount of space)
 What do we do in normalization?
 Normalization generally involves splitting an existing table into multiple tables, which can be re-joined or linked each time a query is issued (executed).

How many normal forms are there?
 Normal forms:
 1NF (First normal form)
 2NF (Second normal form)
 3NF (Third normal form)
 BCNF (Boyce–Codd normal form)
 4NF (Fourth normal form)
 5NF (Fifth normal form)
 As we move from 1NF to 5NF, the number of tables and the complexity increase, but redundancy decreases.

Normal forms
1NF (First Normal Form)
Section – 7.1
1NF (First Normal Form)
 Conditions for 1NF
 Each cell of a table should contain a single value.
 A relation R is in first normal form (1NF) if and only if it does not contain any composite attributes, multi-valued attributes, or combinations of them.
OR
 A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.

1NF (First Normal Form) [Example - Composite attribute]

Customer
CID  Name    Address
C01  Raju    Jamnagar Road, Rajkot
C02  Mitesh  Nehru Road, Jamnagar
C03  Jay     C.G Road, Ahmedabad

• In the Customer relation, Address is a composite attribute that can be further divided into the sub-attributes “Road” and “City”.
• So the Customer relation is not in 1NF.
 Problem: It is difficult to retrieve the list of customers living in ’Jamnagar’ city from the Customer table.
 The reason is that Address is a composite attribute that contains the road name as well as the city name in a single cell.
 It is possible that the city name also appears in a road name.
 In our example, the word ’Jamnagar’ occurs in two records: in the first it is part of the road name and in the second it is the name of the city.
1NF (First Normal Form) [Example - Composite attribute]

Customer (before)
CID  Name    Address
C01  Raju    Jamnagar Road, Rajkot
C02  Mitesh  Nehru Road, Jamnagar
C03  Jay     C.G Road, Ahmedabad

Customer (after)
CID  Name    Road           City
C01  Raju    Jamnagar Road  Rajkot
C02  Mitesh  Nehru Road     Jamnagar
C03  Jay     C.G Road       Ahmedabad

 Solution: Divide each composite attribute into its sub-attributes and insert each value into the proper sub-attribute.
 Exercise: Convert the relation below into 1NF (First Normal Form).

Person
PID  Full_Name               City
P01  Raju Maheshbhai Patel   Rajkot
1NF (First Normal Form) [Example - Multivalued attribute]

Student
Rno  Name    FailedinSubjects
101  Raju    DS, DBMS
102  Mitesh  DBMS, DS
103  Jay     DS, DBMS, DE
104  Jeet    DBMS, DE, DS
105  Harsh   DE, DBMS, DS
106  Neel    DE, DBMS

• In the Student relation, FailedinSubjects is a multi-valued attribute, which can store more than one value.
• So the above relation is not in 1NF.
 Problem: It is difficult to retrieve from the Student table the list of students who failed in ’DBMS’ as well as ’DS’ but not in other subjects.
 The reason is that FailedinSubjects is a multi-valued attribute, so it contains more than one value.

1NF (First Normal Form) [Example - Multivalued attribute]

Student              Result
Rno  Name            RID  Rno  Subject
101  Raju            1    101  DS
102  Mitesh          2    101  DBMS
103  Jay             3    102  DBMS
104  Jeet            4    102  DS
105  Harsh           5    103  DS
106  Neel            …    …    …

 Solution: Split the table into two tables in such a way that
 the first table contains all attributes except the multi-valued attribute, with the same primary key, and
 the second table contains the multi-valued attribute and is given a primary key of its own.
 Insert the primary key of the first table into the second table as a foreign key.

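The split described above can be sketched in a few lines. This assumes the original rows are available as (Rno, Name, FailedinSubjects) tuples with comma-separated subjects; the variable names and the sequential RID numbering are illustrative choices.

```python
student_unf = [
    (101, "Raju", "DS, DBMS"),
    (102, "Mitesh", "DBMS, DS"),
    (103, "Jay", "DS, DBMS, DE"),
    (104, "Jeet", "DBMS, DE, DS"),
    (105, "Harsh", "DE, DBMS, DS"),
    (106, "Neel", "DE, DBMS"),
]

# First table: everything except the multi-valued attribute
students = [(rno, name) for rno, name, _ in student_unf]

# Second table: one row per (student, subject), with its own key RID
results = []
for rno, _, failed in student_unf:
    for subject in failed.split(","):
        results.append((len(results) + 1, rno, subject.strip()))

# The query that was awkward before: failed in DBMS and DS but nothing else
failed_by_student = {}
for _, rno, subject in results:
    failed_by_student.setdefault(rno, set()).add(subject)
only_dbms_and_ds = sorted(
    rno for rno, subjects in failed_by_student.items() if subjects == {"DBMS", "DS"}
)
print(only_dbms_and_ds)  # [101, 102]
```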
Normal forms
2NF (Second Normal Form)
Section – 7.2
2NF (Second Normal Form)
 Conditions for 2NF
 It is in 1NF and each table should contain a single primary key.
 A relation R is in second normal form (2NF)
 if and only if it is in 1NF and
 every non-primary-key attribute is fully dependent on the primary key
OR
 A relation R is in second normal form (2NF)
 if and only if it is in 1NF and
 no non-primary-key attribute is partially dependent on the primary key

2NF (Second Normal Form) [Example]

Customer
CID  ANO  AccessDate  Balance  BranchName
C01  A01  01-01-2017  50000    Rajkot
C02  A01  01-03-2017  50000    Rajkot
C01  A02  01-05-2017  25000    Surat
C03  A02  01-07-2017  25000    Surat

 FD1: {CID, ANO} → {AccessDate, Balance, BranchName}
 FD2: ANO → {Balance, BranchName}
 Balance and BranchName are partially dependent on the primary key (CID + ANO). So the Customer relation is not in 2NF.

 Problem: In the case of a joint account, more than one customer shares a single account.
 If account ’A01’ is operated jointly by two customers, say ’C01’ and ’C02’, then the values of Balance and BranchName are duplicated in the two tuples for customers ’C01’ and ’C02’.

2NF (Second Normal Form) [Example]

Table-1                      Table-2
ANO  Balance  BranchName     CID  ANO  AccessDate
A01  50000    Rajkot         C01  A01  01-01-2017
A02  25000    Surat          C02  A01  01-03-2017
                             C01  A02  01-05-2017
                             C03  A02  01-07-2017

 Solution: Decompose the relation in such a way that the resultant relations do not have any partial FD.
 Remove the partially dependent attributes from the relation that violates 2NF.
 Place them in a separate relation along with the prime attribute on which they are fully dependent.
 The primary key of the new relation will be the attribute on which they are fully dependent.
 Keep the other attributes in the original table with the same primary key.
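Partial dependencies can be found mechanically by taking the closure of every proper subset of the key. The sketch below assumes the relation has a single candidate key (so every attribute outside the key is non-prime); the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found


fds = [
    ({"CID", "ANO"}, {"AccessDate", "Balance", "BranchName"}),  # FD1
    ({"ANO"}, {"Balance", "BranchName"}),                       # FD2
]
violations = partial_dependencies({"CID", "ANO"}, fds)
print(violations)  # ANO alone determines Balance and BranchName
```

Each entry in the result names the key of a new table (here ANO) and the attributes that move into it, which matches Table-1 above.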
Normal forms
3NF (Third Normal Form)
Section – 7.3
3NF (Third Normal Form)
 Conditions for 3NF
 It is in 2NF and there is no transitive dependency.
 (Transitive dependency: if A → B and B → C, then A → C.)
 A relation R is in third normal form (3NF)
 if and only if it is in 2NF and
 every non-key attribute is non-transitively dependent on the primary key
OR
 A relation R is in third normal form (3NF)
 if and only if it is in 2NF and
 no non-key attribute is transitively dependent on the primary key

3NF (Third Normal Form) [Example]

Customer
ANO  Balance  BranchName  BranchAddress
A01  50000    Rajkot      Kalawad Road
A02  40000    Rajkot      Kalawad Road
A03  35000    Surat       C.G Road
A04  25000    Surat       C.G Road

 FD1: ANO → {Balance, BranchName, BranchAddress}
 FD2: BranchName → BranchAddress
 So ANO → BranchAddress (using the transitivity rule)
 BranchAddress is transitively dependent on the primary key (ANO). So the Customer relation is not in 3NF.

 Problem: In this relation, the branch address is stored repeatedly for each account of the same branch, which occupies more space.

3NF (Third Normal Form) [Example]

Table-1                       Table-2
BranchName  BranchAddress     ANO  Balance  BranchName
Rajkot      Kalawad Road      A01  50000    Rajkot
Surat       C.G Road          A02  40000    Rajkot
                              A03  35000    Surat
                              A04  25000    Surat

 Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.
 Remove the transitively dependent attributes from the relation that violates 3NF.
 Place them in a new relation along with the non-prime attributes due to which the transitive dependency occurred.
 The primary key of the new relation will be the non-prime attributes due to which the transitive dependency occurred.
 Keep the other attributes in the original table with the same primary key, and add the prime attributes of the new relation to it as a foreign key.
Normal forms
BCNF (Boyce-Codd Normal Form)
Section – 7.4
BCNF (Boyce-Codd Normal Form)
 Conditions for BCNF
 BCNF is based on the concept of a determinant. In AccountNO → {Balance, Branch}, AccountNO is the determinant (here also the primary key) and {Balance, Branch} is the dependent.
 It is in 3NF and every determinant should be a candidate key.
 A relation R is in Boyce-Codd normal form (BCNF)
 if and only if it is in 3NF and
 for every non-trivial functional dependency X → Y, X is a candidate key (more precisely, a superkey) of the table.
OR
 A relation R is in Boyce-Codd normal form (BCNF)
 if and only if it is in 3NF and
 every prime attribute is non-transitively dependent on the primary key
OR
 A relation R is in Boyce-Codd normal form (BCNF)
 if and only if it is in 3NF and
BCNF (Boyce-Codd Normal Form) [Example]

Student
RNO  Subject  Faculty
101  DS       Patel
102  DBMS     Shah
103  DS       Jadeja
104  DBMS     Dave
105  DBMS     Shah
102  DS       Patel
101  DBMS     Dave
105  DS       Jadeja

• FD1: {RNO, Subject} → Faculty
• FD2: Faculty → Subject
• So {RNO, Subject} → Subject (transitivity rule)
• Here, one faculty teaches only one subject, but a subject may be taught by more than one faculty.
• A student can learn a subject from only one faculty.
 In FD2, the determinant is Faculty, which is not a primary key. So the Student table is not in BCNF.
 Problem: In this relation a student can learn more than one subject with different faculty, so records are stored repeatedly for each student, subject and faculty combination, which occupies more space.
BCNF (Boyce-Codd Normal Form) [Example]

Table-1              Table-2
Faculty  Subject     RNO  Faculty
Patel    DS          101  Patel
Shah     DBMS        102  Shah
Jadeja   DS          103  Jadeja
Dave     DBMS        104  Dave
                     105  Shah
                     102  Patel
                     101  Dave
                     105  Jadeja

• Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.
• Remove the transitively dependent prime attribute from the relation that violates BCNF.
• Place it in a separate new relation along with the non-prime attribute due to which the transitive dependency occurred.
• The primary key of the new relation will be this non-prime attribute due to which the transitive dependency occurred.
• Keep the other attributes in the original table with the same primary key, and add a prime attribute of the other relation into it as a foreign key.
Multivalued dependency (MVD)
 For a dependency X → Y, if for a single value of X multiple values of Y exist, then the table may have a multi-valued dependency.

Student
RNO  Subject  Faculty
101  DS       Patel
101  DBMS     Patel
101  DS       Shah
101  DBMS     Shah

 A multivalued dependency (MVD) is denoted by →→
 A multivalued dependency (MVD) is written as X →→ Y

Normal forms
4NF (Fourth Normal Form)
Section – 7.5
4NF (Fourth Normal Form)
 Conditions for 4NF
 A relation R is in fourth normal form (4NF)
 if and only if it is in BCNF and
 it has no non-trivial multivalued dependencies

Student
RNO  Subject  Faculty
101  DS       Patel
101  DBMS     Patel
101  DS       Shah
101  DBMS     Shah

Subject           Faculty
RNO  Subject      RNO  Faculty
101  DS           101  Patel
101  DBMS         101  Shah

 The Student table above has multivalued dependencies. So the Student table is not in 4NF; the Subject and Faculty tables are its 4NF decomposition.

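Whether an MVD X →→ Y holds in a given instance can be tested directly: for every X value, the (Y, rest) combinations present must be the full cross product of the Y values and the remaining values seen with that X. This is a minimal sketch with an illustrative function name, assuming rows are dictionaries that all share the same keys.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Check X ->> Y on an instance; z is every attribute not in x or y."""
    z = [a for a in rows[0] if a not in x and a not in y]
    groups = {}
    for r in rows:
        xv = tuple(r[a] for a in x)
        yv = tuple(r[a] for a in y)
        zv = tuple(r[a] for a in z)
        ys, zs, pairs = groups.setdefault(xv, (set(), set(), set()))
        ys.add(yv)
        zs.add(zv)
        pairs.add((yv, zv))
    # Y and Z must vary independently of each other within each X group
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


student = [
    {"RNO": 101, "Subject": "DS", "Faculty": "Patel"},
    {"RNO": 101, "Subject": "DBMS", "Faculty": "Patel"},
    {"RNO": 101, "Subject": "DS", "Faculty": "Shah"},
    {"RNO": 101, "Subject": "DBMS", "Faculty": "Shah"},
]

print(mvd_holds(student, ["RNO"], ["Subject"]))      # True
print(mvd_holds(student[:3], ["RNO"], ["Subject"]))  # False: (DBMS, Shah) is missing
```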
Functional dependency & Multivalued dependency
 A table can have both functional dependencies and multi-valued dependencies together.
 RNO → Address
 RNO →→ Subject
 RNO →→ Faculty

Student
RNO  Address             Subject  Faculty
101  C. G. Road, Rajkot  DS       Patel
101  C. G. Road, Rajkot  DBMS     Patel
101  C. G. Road, Rajkot  DS       Shah
101  C. G. Road, Rajkot  DBMS     Shah

Subject           Faculty           Address
RNO  Subject      RNO  Faculty      RNO  Address
101  DS           101  Patel        101  C. G. Road, Rajkot
101  DBMS         101  Shah

Normal forms
5NF (Fifth Normal Form)
Section – 7.6
5NF (Fifth Normal Form)
 Conditions for 5NF
 A relation R is in fifth normal form (5NF)
 if and only if it is in 4NF and
 it cannot have a lossless decomposition into any number of smaller tables (relations).

Student_Result
RID  RNO  Name    Subject  Result
1    101  Raj     DBMS     Pass
2    101  Raj     DS       Pass
3    101  Raj     DF       Pass
4    102  Meet    DBMS     Pass
5    102  Meet    DS       Fail
6    102  Meet    DF       Pass
7    103  Suresh  DBMS     Fail
8    103  Suresh  DS       Pass

 The Student_Result relation can be further decomposed into sub-relations. So the above relation is not in 5NF.
5NF (Fifth Normal Form)

Student           Subject        Result
RNO  Name         SID  Name      RID  RNO  SID  Result
101  Raj          1    DBMS      1    101  1    Pass
102  Meet         2    DS        2    101  2    Pass
103  Suresh       3    DF        3    101  3    Pass
                                 4    102  1    Pass
                                 5    102  2    Fail
                                 6    102  3    Pass
                                 7    103  1    Fail
                                 8    103  2    Pass

 None of the above relations can be further decomposed into sub-relations. So the above database is in 5NF.
How to find key?
 Conditions to find a key
 An attribute is part of every key if it does not occur on either side of any FD.
 An attribute is part of every key if it occurs on the left-hand side of an FD but never on the right-hand side.
 An attribute is not part of any key if it occurs on the right-hand side of an FD but never on the left-hand side.
 An attribute may or may not be part of a key if it occurs on both sides of the FDs.

How to find key? [Example]
 Let R be a relation with attributes ABCD and FDs C → A, B → C. Find the keys of relation R.
 attribute that does not occur on any side of the FDs (D) √
 attribute that occurs only on the left-hand side of the FDs (B) √
 attribute that occurs only on the right-hand side of the FDs (A) X
 attribute that occurs on both sides of the FDs (C) ?
 The core is BD.
 B determines C and C determines A, so by the transitivity rule B also determines A.
 So BD determines every attribute, and BD is the key.

How to find key? [Exercise]
 Let R be a relation with attributes ABCD and FDs C → D, C → A and B → C. Find the keys of relation R.
 The core is B. B determines C, which determines A and D. Therefore B is the key.

 Let R be a relation with attributes ABCD and FDs B → C, D → A. Find the keys of relation R.
 The core is BD. B determines C and D determines A. Therefore BD is the key.

 Let R be a relation with attributes ABCD and FDs A → B, BC → D and A → C. Find the keys of relation R.
 The core is A. A determines B and C, which together determine D. Therefore A is the key.

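The "core" heuristic used in these exercises can be sketched as follows: collect the attributes that never appear on a right-hand side, then test whether their closure covers the whole schema. The function names are illustrative, and FDs are assumed to be pairs of strings with single-letter attributes. When the core's closure falls short, attributes that occur on both sides have to be added and tested, which this sketch does not attempt.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def core(attributes, fds):
    """Attributes that never occur on the right-hand side of any FD."""
    on_rhs = set()
    for _, rhs in fds:
        on_rhs |= set(rhs)
    return set(attributes) - on_rhs


def core_is_key(attributes, fds):
    """Return the core as a sorted string and whether it determines everything."""
    c = core(attributes, fds)
    return "".join(sorted(c)), closure(c, fds) == set(attributes)


print(core_is_key("ABCD", [("C", "A"), ("B", "C")]))               # ('BD', True)
print(core_is_key("ABCD", [("C", "D"), ("C", "A"), ("B", "C")]))   # ('B', True)
print(core_is_key("ABCD", [("B", "C"), ("D", "A")]))               # ('BD', True)
print(core_is_key("ABCD", [("A", "B"), ("BC", "D"), ("A", "C")]))  # ('A', True)
print(core_is_key("ABCD", [("ABC", "D"), ("D", "A")]))             # ('BC', False)
```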
Find (candidate) key & check for normal forms [Example]
 Suppose you are given a relation R with four attributes ABCD and the set of FDs F = {B → C, D → A}. Do the following:
 Identify the candidate key(s) for R.
 Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

 The candidate key is BD.
 Relation R is in 1NF but not 2NF, because the FDs contain partial dependencies:
 By FD B → C, C depends only on B, but the key is BD, so C is partially dependent on the key (BD).
 By FD D → A, A depends only on D, but the key is BD, so A is partially dependent on the key (BD).

Find (candidate) key & check for normal forms [Example]
 Suppose you are given a relation R with four attributes ABCD and the set of FDs F = {C → D, C → A, B → C}. Do the following:
 Identify the candidate key(s) for R.
 Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

 The candidate key is B.
 Relation R is in 2NF but not 3NF, because the FDs contain transitive dependencies:
 By FDs B → C and C → D, B → D, so D is transitively dependent on the key (B).
 By FDs B → C and C → A, B → A, so A is transitively dependent on the key (B).

Find (candidate) key & check for normal forms [Example]
 Suppose you are given a relation R with four attributes ABCD and the set of FDs F = {A → B, BC → D, A → C}. Do the following:
 Identify the candidate key(s) for R.
 Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

 The candidate key is A.
 Relation R is in 2NF but not 3NF, because the FDs contain a transitive dependency:
 By FDs A → B and A → C, A → BC (union rule).
 By FDs A → BC and BC → D, A → D, so D is transitively dependent on the key (A).

Find (candidate) key & check for normal forms [Example]
 Suppose you are given a relation R with four attributes ABCD and the set of FDs F = {ABC → D, D → A}. Do the following:
 Identify the candidate key(s) for R.
 Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

 The candidate keys are ABC and BCD.
 Relation R is in 3NF but not BCNF.
 It is in 3NF because both FDs have a prime attribute (D and A) on the dependent (right) side.
 It is not in BCNF because in D → A the determinant D is not a candidate key.

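The four worked examples above can be cross-checked with a small brute-force sketch. It assumes single-letter attributes, enumerates candidate keys by trying every attribute subset (fine for four attributes, exponential in general), and uses the textbook tests: BCNF if every non-trivial FD has a superkey on the left, 3NF if additionally a prime attribute on the right is allowed, and 2NF if no proper subset of a candidate key determines a non-prime attribute. All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) == attributes:
                keys.append(s)
    return keys


def highest_normal_form(attributes, fds):
    attributes = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)

    def is_superkey(x):
        return closure(x, fds) == attributes

    # Split right-hand sides into single attributes and drop trivial FDs
    simple = [(set(l), a) for l, r in fds for a in r if a not in l]
    if all(is_superkey(l) for l, _ in simple):
        return "BCNF"
    if all(is_superkey(l) or a in prime for l, a in simple):
        return "3NF"
    for key in keys:
        for size in range(1, len(key)):
            for sub in combinations(sorted(key), size):
                if closure(sub, fds) - prime:
                    return "1NF"  # a non-prime attribute depends on part of a key
    return "2NF"


print(highest_normal_form("ABCD", [("B", "C"), ("D", "A")]))               # 1NF
print(highest_normal_form("ABCD", [("C", "D"), ("C", "A"), ("B", "C")]))   # 2NF
print(highest_normal_form("ABCD", [("A", "B"), ("BC", "D"), ("A", "C")]))  # 2NF
print(highest_normal_form("ABCD", [("ABC", "D"), ("D", "A")]))             # 3NF
```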
How to normalize database?
 A software contract and consultancy firm maintains details of all the
various projects in which its employees are currently involved. These
details comprise: Employee Number, Employee Name, Date of Birth,
Department Code, Department Name, Project Code, Project Description,
Project Supervisor.
 Assume the following:
 Each employee number is unique.
 Each department has a single department code.
 Each project has a single code and supervisor.
 Each employee may work on one or more projects.
 Employee names need not necessarily be unique.
 Project Code, Project Description and Project Supervisor are repeating fields.
 Normalize this data to Third Normal Form.

71
How to normalize database?

UNF
Employee  Employee  Date of  Department  Department  Project  Project      Project
Number    Name      Birth    Code        Name        Code     Description  Supervisor
1         Raj       1-1-85   1           CE          1        IOT          Patel
2         Meet      4-4-86   2           EC          2        PHP          Shah
3         Suresh    2-2-85   1           CE          1        IOT          Patel
1         Raj       1-1-85   1           CE          2        PHP          Shah

How to normalize database?

1NF
Employee  Employee  Date of  Department  Department
Number    Name      Birth    Code        Name
1         Raj       1-1-85   1           CE
2         Meet      4-4-86   2           EC
3         Suresh    2-2-85   1           CE

Employee  Project  Project      Project
Number    Code     Description  Supervisor
1         1        IOT          Patel
2         2        PHP          Shah
3         1        IOT          Patel
1         2        PHP          Shah
How to normalize database?

2NF
Employee  Employee  Date of  Department  Department
Number    Name      Birth    Code        Name
1         Raj       1-1-85   1           CE
2         Meet      4-4-86   2           EC
3         Suresh    2-2-85   1           CE

Project  Project      Project
Code     Description  Supervisor
1        IOT          Patel
2        PHP          Shah

Employee  Project
Number    Code
1         1
2         2
3         1
1         2
How to normalize database?

3NF
Employee  Employee  Date of  Department
Number    Name      Birth    Code
1         Raj       1-1-85   1
2         Meet      4-4-86   2
3         Suresh    2-2-85   1

Department  Department
Code        Name
1           CE
2           EC

Project  Project      Project
Code     Description  Supervisor
1        IOT          Patel
2        PHP          Shah

Employee  Project
Number    Code
1         1
2         2
3         1
1         2
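As a check on the 3NF result, the four tables can be created in an in-memory SQLite database and joined back into the original unnormalized rows. This is a sketch only: the table and column names (`employee`, `emp_project`, and so on) are hypothetical choices for this example, not names given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE department (dept_code INTEGER PRIMARY KEY, dept_name TEXT NOT NULL);
CREATE TABLE employee (
    emp_no    INTEGER PRIMARY KEY,
    emp_name  TEXT NOT NULL,
    dob       TEXT,
    dept_code INTEGER NOT NULL REFERENCES department(dept_code)
);
CREATE TABLE project (proj_code INTEGER PRIMARY KEY, description TEXT, supervisor TEXT);
CREATE TABLE emp_project (
    emp_no    INTEGER NOT NULL REFERENCES employee(emp_no),
    proj_code INTEGER NOT NULL REFERENCES project(proj_code),
    PRIMARY KEY (emp_no, proj_code)
);
""")
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "CE"), (2, "EC")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [(1, "Raj", "1-1-85", 1), (2, "Meet", "4-4-86", 2), (3, "Suresh", "2-2-85", 1)])
conn.executemany("INSERT INTO project VALUES (?, ?, ?)", [(1, "IOT", "Patel"), (2, "PHP", "Shah")])
conn.executemany("INSERT INTO emp_project VALUES (?, ?)", [(1, 1), (2, 2), (3, 1), (1, 2)])

# Re-join the 3NF tables to recover the rows of the unnormalized table
rows = conn.execute("""
    SELECT e.emp_no, e.emp_name, e.dob, d.dept_code, d.dept_name,
           p.proj_code, p.description, p.supervisor
    FROM emp_project ep
    JOIN employee e   ON e.emp_no = ep.emp_no
    JOIN department d ON d.dept_code = e.dept_code
    JOIN project p    ON p.proj_code = ep.proj_code
    ORDER BY e.emp_no, p.proj_code
""").fetchall()
for row in rows:
    print(row)
```

The join returns four rows, one per employee and project pair, so the decomposition loses no information while each department name and project description is stored only once.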
Questions asked in GTU
1. What is meant by normalization? Write its need. List and discuss various
normalization forms.
2. Consider schema EMPLOYEE(E-ID,E-NAME,E-CITY,E-STATE) and
FD = {E-ID → E-NAME, E-ID → E-CITY, E-ID → E-STATE, E-CITY → E-
STATE}
 Find attribute closure for: (E-ID)+
3. Compute the closure of the following set F of functional dependencies for
relation schema R(A, B, C, D, E).
F = { A → BC, CD → E, B → D, E → A}
 List the candidate keys for R.
4. Consider schema R = (A, B, C, G, H, I) and the set F of functional
dependencies {A → B, A → C, CG → H, CG → I, B → H}. ( Use F+ )
 Prove that AG → I Holds.

76
Questions asked in GTU
5. In the BCNF decomposition algorithm, suppose you use a functional
dependency α → β to decompose a relation schema r (α, β, γ) into r1 (α,
β) and r2 (α, γ).
 What primary and foreign-key constraint do you expect to hold on the decomposed
relations?
 Give an example of an inconsistency that can arise due to an erroneous update, if
the foreign-key constraint were not enforced on the decomposed relations above.
 When a relation is decomposed into 3NF, what primary and foreign key
dependencies would you expect will hold on the decomposed schema?
6. A college maintains details of its lecturers' subject area skills. These
details comprise: Lecturer Number, Lecturer Name, Lecturer Grade,
Department Code, Department Name, Subject Code, Subject Name,
Subject Level. Assume that each lecturer may teach many subjects but
may not belong to more than one department. Subject Code, Subject
Name and Subject Level are repeating fields. Normalize this data to Third
Normal Form.
