ch 5
Relational-Database Design
Basic Concept
• NORMALIZATION
• is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update and deletion anomalies.
• Normalization rules divide larger tables into smaller tables and link them using relationships.
• The purpose of normalization is to eliminate redundant (repetitive) data and ensure data is stored logically.
• Normalization is the process of converting a poorly designed database into a well-designed one.
• Database Normal Forms
• 1NF (First Normal Form)
• 2NF (Second Normal Form)
• 3NF (Third Normal Form)
• BCNF (Boyce-Codd Normal Form)
• 4NF (Fourth Normal Form)
• 5NF (Fifth Normal Form)
• Pitfalls in Relational-Database Design:
• Example: Suppose a bank database stores all data about Branch, Customer and Loan in a single table…
Branch-name  Branch-city  Assets  Customer-name  Loan-no  Loan-amount
D            B            90      J              L-17     1000
R            P            21      S              L-23     2000
P            H            17      H              L-15     1500
D            B            90      JA             L-14     1500
...          ...          ...     ...            ...      ...
P            H            17      G              L-25     2500
Terminologies:
• Before starting 1NF, 2NF, 3NF…, let us understand the following terminologies:
• Functional Dependencies
• Armstrong Axioms (Rules of Inference)
• Decomposition
• Identification of Candidate Key
• Partial Functional Dependencies
• Full Functional Dependencies
• Functional Dependencies:
• Denoted by the symbol →
• A → B
• Meaning: attribute A determines attribute B, or
• attribute B is dependent on attribute A
• Armstrong's Axioms:
• If F is a set of functional dependencies, then the closure of F, denoted F+, is the set of all functional dependencies logically implied by F.
• Armstrong's Axioms are a set of rules that, when applied repeatedly, generate the closure of a set of functional dependencies.
• Reflexive rule − If α is a set of attributes and β is a subset of α, then α → β holds.
• Augmentation rule − If α → β holds and γ is a set of attributes, then αγ → βγ also holds.
• Union rule − If α → β and α → γ hold, then α → βγ holds.
• Decomposition rule − If α → βγ holds, then α → β and α → γ hold.
• Pseudotransitivity rule − If α → β and γβ → δ hold, then αγ → δ holds.
• Transitivity rule − If α → β and β → γ hold, then α → γ holds.
• Example:
• Suppose R = {A, B, C, G, H, I}
• The given FDs are as follows:
A → B, A → C, CG → H, CG → I, B → H
• Find the closure set F+
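In practice, membership in F+ is usually tested through attribute closure: α → β is in F+ exactly when β is contained in the closure of α. The following is a minimal sketch of that idea, not part of the original slides. The function names `closure` and `implies` are illustrative, attributes are assumed to be single letters, and the dependency list is read here as including CG → H, which is an interpretation of the slide.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, i.e. rhs lies inside the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


# Dependencies from the example (CG -> H is an assumed reading of the slide).
FDS = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(closure("AG", FDS))))  # ABCGHI
print("".join(sorted(closure("A", FDS))))   # ABCH
print(implies(FDS, "A", "H"))               # True, via A -> B and B -> H
```

Since the closure of AG contains every attribute of R, AG determines the whole relation.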
Decomposition:
• Decomposition divides the original relation into smaller relations.
• A decomposition should be lossless, not lossy.
• A decomposition of R into R1 and R2 is lossless if at least one of the following holds:
R1 ⋂ R2 → R1 or R1 ⋂ R2 → R2
Example:
Let R = {A,B,C} and FD = A → B
The decomposed relations are R1 = {A,B} and R2 = {A,C}. Is the decomposition into R1 and R2 lossless or lossy?
Solution: R1 ⋂ R2 = {A}.
Since A → B is given, B depends on A, so A determines every attribute of R1 = {A,B}.
Therefore R1 ⋂ R2 → R1 holds.
Hence this is a lossless decomposition.
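The test used in the worked example can be sketched in code: compute the closure of the common attributes and check whether it covers R1 or R2. This is an illustrative sketch for a decomposition into exactly two relations; the names `closure` and `is_lossless` are assumptions, and attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) must determine R1 or R2."""
    common = set(r1) & set(r2)
    reachable = closure(common, fds)
    return set(r1) <= reachable or set(r2) <= reachable


# Worked example: R = {A,B,C}, FD = A -> B, R1 = {A,B}, R2 = {A,C}.
print(is_lossless("AB", "AC", [("A", "B")]))  # True
```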
Decomposition: (Exercise)
1. Let R = {A,B,C} and FD = A → B
The decomposed relations are R1 = {A,B} and R2 = {B,C}. Is the decomposition into R1 and R2 lossless or lossy?
2. Let R = {A,B,C,D,E} and FD = {A → BC, CD → E, B → D, E → A}
The decomposed relations are R1 = {A,B,C} and R2 = {A,D,E}. Is the decomposition into R1 and R2 lossless or lossy?
Determine Candidate Key
Example: Suppose R = {A,B,C,D,E,F} with FD = {A → B, AE → F, CE → D, BC → D, CD → A}.
Determine the candidate key for R.
1. F+(A) = {A}
A → B gives {A,B}
A determines only 2 of the 6 attributes, hence A cannot be a candidate key.
2. F+(AE) = {A,E}
AE → F gives {A,E,F}
A → B gives {A,B,E,F}
Therefore AE determines at most 4 attributes, hence AE cannot be a candidate key.
3. F+(CD) = {C,D}
CD → A gives {A,C,D}
A → B gives {A,B,C,D}
Therefore CD determines at most 4 attributes, hence CD cannot be a candidate key.
4. F+(CE) = {C,E}
CE → D gives {C,D,E}
CD → A gives {A,C,D,E}
A → B gives {A,B,C,D,E}
AE → F gives {A,B,C,D,E,F}
CE determines all attributes of R, and neither C nor E alone does, hence CE is a candidate key for R.
5. F+(BC) = {B,C}
BC → D gives {B,C,D}
CD → A gives {A,B,C,D}
Therefore BC determines at most 4 attributes, hence BC cannot be a candidate key.
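The manual procedure above checks one attribute set at a time. A brute-force sketch of the same idea tries every subset in order of size and keeps the minimal ones whose closure is all of R. It is only practical for small schemas; the function names are illustrative and attributes are assumed to be single letters.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attrs = set(attrs)
    keys = []
    # Smaller subsets are tried first, so any key found is minimal.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


# The example relation and dependencies from the slides.
FDS = [("A", "B"), ("AE", "F"), ("CE", "D"), ("BC", "D"), ("CD", "A")]
print(["".join(sorted(k)) for k in candidate_keys("ABCDEF", FDS)])  # ['CE']
```

The result agrees with step 4: CE is the only candidate key, because C and E never appear on the right-hand side of any dependency and so must be part of every key.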
Determine Candidate Key (Exercise)
1. Suppose R = {A,B,C,D,E,F} with FD = {A → BC, E → A, CD → E, B → D}.
Determine the candidate key for R.
• Full Functional Dependency:
A non-key attribute is fully functionally dependent on the key when it depends on the whole key and not on any proper part of it.
• Partial Functional Dependency:
A non-key attribute is partially functionally dependent on the key when it depends on only part of the key attributes.
Example: Suppose R = {A,B,C,D} and FD = {AB → C, B → D}, and AB is a candidate key. Then:
AB → C: AB is the key and C is a non-key attribute that depends on the whole of AB. Hence AB → C is a full functional dependency.
B → D: B is only a part of the key AB and D is a non-key attribute that depends on B alone. Hence B → D is a partial functional dependency.
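The distinction can be checked mechanically for a listed set of dependencies. The sketch below is a simplification: it assumes a single candidate key, inspects only the dependencies as written (not everything they imply), and uses the illustrative name `partial_dependencies`.

```python
def partial_dependencies(fds, key):
    """Return listed FDs whose left side is a proper part of the key
    and whose right side contains at least one non-key attribute."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        if set(lhs) < key and non_key:  # '<' means proper subset
            found.append((lhs, "".join(sorted(non_key))))
    return found


# Example from the slide: R = {A,B,C,D}, key AB.
FDS = [("AB", "C"), ("B", "D")]
print(partial_dependencies(FDS, "AB"))  # [('B', 'D')]
```

AB → C is not reported because its left side is the whole key, so it is a full dependency.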
NORMALIZATION
• 1NF (First Normal Form)
• R is in first normal form if and only if all attributes of R are atomic, i.e. each holds a single, indivisible value.
• NORMALIZATION (Example of Bad Database Design)
S#   S-name  DOB         C#  C-Name            Pre-requisite    Duration  DOE         Marks  Grade
101  Davis   11/04/1986  M4  Applied Maths     Basic Maths      7         11/11/2004  82     A
102  Daniel  11/06/1987  M4  Applied Maths     Basic Maths      7         11/11/2004  62     C
101  Davis   11/04/1986  H6  American History  -                4         22/11/2004  79     B
103  Sandra  10/02/1988  C3  Bio Chemistry     Basic Chemistry  11        16/11/2004  65     B
104  Evelyn  22/02/1986  B3  Botany            -                8         26/11/2004  77     B
102  Daniel  11/6/1987   P3  Nuclear Physics   Basic Physics    13        12/11/2004  68     B
105  Susan   31/8/1985   P3  Nuclear Physics   Basic Physics    13        12/11/2004  89     A
103  Sandra  10/02/1988  B4  Zoology           -                5         27/11/2004  54     D
105  Susan   31/8/1985   H6  American History  -                4         22/11/2004  87     A
104  Evelyn  22/02/1986  M4  Applied Maths     Basic Maths      7         11/11/2004  65     B
The above table exhibits all three anomalies (insert, update and delete); hence it is a bad database design.
Therefore we will apply the normalization process, i.e. 1NF, 2NF and 3NF…
• NORMALIZATION (Applying 1NF on Bad Database Design)
• The table is the same as in the bad-design example above: every attribute already holds a single atomic value, so it satisfies 1NF.
• 3NF (Third Normal Form)
• R is said to be in third normal form if and only if:
• It is in 2NF.
• No transitive dependency exists between non-key attributes and the key attribute.
• Group such transitively dependent attributes into a separate table and identify the key and name of the newly created table.
• Transitive dependency: S#, C# → Marks → Grade
• Solution: College (T1) and Grades (T4)
College (T1)
S#   C#  Marks
101  M4  82
102  M4  62
101  H6  79
103  C3  65
104  B3  77
102  P3  68
105  P3  89
103  B4  54
105  H6  87
104  M4  65
Grades (T4)
Upper-bound  Lower-bound  Grade
100          95           A+
94           85           A
84           70           B
69           65           B-
64           55           C
54           45           D
44           40           E
• Final solution after 3NF with no anomalies:
College (T1)
S#   C#  Marks
101  M4  82
102  M4  62
101  H6  79
103  C3  65
104  B3  77
102  P3  68
105  P3  89
Student (T2)
S#   S-name  DOB
101  Davis   11/04/1986
102  Daniel  11/06/1987
103  Sandra  10/02/1988
104  Evelyn  22/02/1986
Course (T3)
C#  C-Name            Pre-requisite    Duration  DOE
M4  Applied Maths     Basic Maths      7         11/11/2004
H6  American History  -                4         22/11/2004
C3  Bio Chemistry     Basic Chemistry  11        16/11/2004
B3  Botany            -                8         26/11/2004
• Important Point:
• Every relation in BCNF is also in 3NF, but not every relation in 3NF is in BCNF.
The Decomposition of a Table Structure to meet BCNF Requirements
Original structure:
◆ A + B → C, D
◆ C → B
Change the PK to A + C:
◆ A + C → B, D
◆ C → B
Decompose into two tables:
◆ A + C → D
◆ C → B
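A relation is in BCNF when the left side of every non-trivial dependency is a superkey. The sketch below checks that condition for the structure shown above. It is illustrative only: the name `bcnf_violations` is an assumption, attributes are single letters, and the dependencies of each decomposed table are supplied by hand rather than derived.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the non-trivial FDs whose left side is not a superkey."""
    attrs = set(attrs)
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = attrs <= closure(lhs, fds)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Original structure: A + B -> C, D and C -> B.
print(bcnf_violations("ABCD", [("AB", "CD"), ("C", "B")]))  # [('C', 'B')]

# After decomposition, each table is checked with its own dependency.
print(bcnf_violations("ACD", [("AC", "D")]))  # []
print(bcnf_violations("CB", [("C", "B")]))    # []
```

C → B is flagged in the original structure because C on its own determines only B and C, so it is not a superkey.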
The Boyce-Codd Normal Form (BCNF)
Decomposition into BCNF
• 4NF (Fourth Normal Form)
• R is said to be in fourth normal form if and only if:
• It is in 3NF.
• No multivalued dependency exists between non-key attributes and the key attribute.
• Group such multivalued attributes into a separate table and identify the key and name of the newly created table.
Student (one of the tables in 3NF)
S#   S-Name  Add.  DOB
101  Davis   A1    11/04/1986
101  Davis   A2    11/04/1986
101  Davis   A3    11/04/1986
102  Smith   B1    12/05/1987
102  Smith   B1    12/05/1987
Student (T1)
S#   S-Name  DOB
101  Davis   11/04/1986
102  Smith   12/05/1987
Stu-Address (T11)
S#   Add.
101  A1
101  A2
101  A3
102  B1
102  B1
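The split into Student (T1) and Stu-Address (T11) amounts to two projections of the original table. The sketch below reproduces it with plain Python tuples; the variable and function names are illustrative. One assumption differs from the slide: duplicate rows are dropped, so the repeated (102, B1) row appears only once.

```python
# Rows of the Student table as shown: (S#, S-Name, Add., DOB).
STUDENT = [
    (101, "Davis", "A1", "11/04/1986"),
    (101, "Davis", "A2", "11/04/1986"),
    (101, "Davis", "A3", "11/04/1986"),
    (102, "Smith", "B1", "12/05/1987"),
    (102, "Smith", "B1", "12/05/1987"),
]


def distinct(rows):
    """Drop duplicate rows while keeping their first-seen order."""
    return list(dict.fromkeys(rows))


# Student (T1): one row per student, without the multivalued address.
student_t1 = distinct((s, name, dob) for s, name, _, dob in STUDENT)

# Stu-Address (T11): one row per (student, address) pair.
stu_address_t11 = distinct((s, addr) for s, _, addr, _ in STUDENT)

# Joining the two tables on S# gives back the distinct original rows.
rejoined = [
    (s, name, addr, dob)
    for s, name, dob in student_t1
    for s2, addr in stu_address_t11
    if s == s2
]

print(student_t1)
print(stu_address_t11)
print(rejoined == distinct(STUDENT))  # True
```

Each student's name and date of birth are now stored once, however many addresses the student has.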
Dr. Amol Pande