Normalization PPT
Functional Dependency
• A functional dependency (FD) is a constraint
between two sets of attributes in a relation.
• A functional dependency says that if two tuples have the
same values for attributes A1, A2, ..., An, then those
two tuples must also have the same values for
attributes B1, B2, ..., Bn.
• A functional dependency is represented by an arrow
sign (→), that is, X→Y, where X functionally determines
Y (or Y is determined by X).
• The left-hand side attributes determine the values of
attributes on the right-hand side.
Functional Dependency
Sid → Sname (Sid determines Sname)
Sachin
Sachin
Are these two persons the same or different?
Rule out the functional dependencies based
on the table:
EID  ENAME
1    A
2    B
3    B
EID->ENAME holds (every EID appears with exactly one ENAME)
ENAME->EID INCORRECT (ENAME B appears with EID 2 and EID 3)
Rule out the functional dependencies based
on the table:
A  B
1  1
1  2
2  2
Candidate dependencies: A->B, B->A
Rule out the functional dependencies based
on the table:
A  B  C
1  1  4
1  2  4
2  1  3
2  2  3
2  4  3
Candidate dependencies: A->B, B->C, B->A, C->B, C->A, A->C
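The check asked for in this exercise can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper name `holds` is hypothetical, and the rows are the five tuples of the A, B, C table above. A dependency is ruled out as soon as two rows agree on the left-hand side but differ on the right-hand side.

```python
def holds(rows, header, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by the table instance."""
    idx = {name: i for i, name in enumerate(header)}
    seen = {}
    for row in rows:
        key = tuple(row[idx[a]] for a in lhs)
        val = tuple(row[idx[a]] for a in rhs)
        # The first row with this left-hand side fixes the expected right-hand side.
        if seen.setdefault(key, val) != val:
            return False
    return True

header = ["A", "B", "C"]
rows = [(1, 1, 4), (1, 2, 4), (2, 1, 3), (2, 2, 3), (2, 4, 3)]
candidates = [("A", "B"), ("B", "C"), ("B", "A"), ("C", "B"), ("C", "A"), ("A", "C")]

results = {f"{l}->{r}": holds(rows, header, [l], [r]) for l, r in candidates}
for fd, ok in results.items():
    print(fd, "not ruled out" if ok else "ruled out")
```

On this table only C->A and A->C survive. A table instance can only rule a dependency out; it cannot prove that the dependency holds for every future instance.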
Armstrong's Axioms
Example:
Question: R(ABCD), FD{A →B, B →C,C →D}
Solution:
A+ = AB (A determines itself trivially and directly determines B)
A+ = ABC (as B determines C, A also determines C by transitivity)
A+ = ABCD (as C determines D, A also determines D by transitivity)
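The closure computation above can be sketched as a small fixed-point loop. This is an illustrative sketch, assuming attributes are single letters and each dependency is written as a pair of strings; the function name `closure` is not from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)  # every attribute set determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole left-hand side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("A", "B"), ("B", "C"), ("C", "D")]
a_plus = closure("A", fds)
print("A+ =", "".join(sorted(a_plus)))
```

The loop stops when a full pass over the dependencies adds nothing new, which mirrors the step-by-step growth of A+ in the solution.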
Find Candidate Key
Example:
Question: R(ABCD), FD{A →B, B →C,C →D}
Solution:
A+ = ABCD (A is a candidate key, as its closure holds all the attributes of
R).
Now let's find the closures of B, C and D.
B+=BCD
C+=CD
D+=D
So A is the only candidate key.
Let us find the closure of AB:
(AB)+ = ABCD
AB can't be a candidate key because a candidate key must be minimal and A alone
is already a key. AB is a superkey.
Closure and Candidate Key
Example:
Question: R(ABCD), FD{A →B, B →C,C →D, D →A}
Solution:
A+=ABCD
B+=BCDA
C+=CDAB
D+=DABC
Candidate keys = {A, B, C, D} (each single attribute is a candidate key)
Closure and Candidate Key
Example:
Question: R(ABCDE), FD{A →B, B C→D,E→C, D →A}
Solution:
A+=AB
B+=B
C+=C
D+=ADB
E+=EC
If you check all the dependencies in the question, you can see that E never
appears as a dependent (on the right-hand side), so E must be part of every
candidate key. We therefore combine E with the other attributes to determine the candidate keys.
(AE)+=ABECD
(BE)+=BECDA
(CE)+=CE
(DE)+=DEABC
Candidate keys = {AE, BE, DE}
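The search for candidate keys in this example can be sketched by brute force: try attribute sets from smallest to largest and keep those whose closure is the whole relation and that contain no smaller key. This is a minimal sketch for small schemas only (it is exponential in the number of attributes); the names `closure` and `candidate_keys` are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is only a superkey
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

fds = [("A", "B"), ("BC", "D"), ("E", "C"), ("D", "A")]
keys = sorted("".join(sorted(k)) for k in candidate_keys("ABCDE", fds))
print(keys)
```

Because sets are tried in order of increasing size, any set that contains an already found key is skipped, which enforces minimality.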
Equivalence of Closure
• Two sets of FDs, F and G, are said to be
equivalent if F+ = G+
• For example:
{A→BC, A→B} and
{A→C, A→B}
are equivalent.
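Equivalence can be checked without enumerating F+ and G+: F and G are equivalent when every dependency of G follows from F and every dependency of F follows from G. The sketch below assumes the two example sets are read as {A→BC, A→B} and {A→C, A→B}; the helper names `covers` and `equivalent` are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every dependency in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

F = [("A", "BC"), ("A", "B")]
G = [("A", "C"), ("A", "B")]
print(equivalent(F, G))
```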
Trivial Functional Dependency
GIVEN:
R = {A, B, C, D}
A->BC
C->A
Find A+, B+, C+, D+
Insertion Anomaly
• Suppose a new course has been introduced but no student has registered for it
yet.
• So cid=205 and cname=OS, but all the other values will be null.
• Here, sid is the primary key and can't be null, so the row cannot be inserted.
• This type of anomaly is known as an insertion anomaly.
Update Anomaly
• When we try to update one data item having its copies scattered over
several places, a few instances get updated properly while a few others
are left with old values. Such instances leave the database in an
inconsistent state.
sid sname address cid cname grade
Department table:
DID  DNAME  ADDRESS
1    CSE    BLOCK A
2    ECE    BLOCK B
3    ME     BLOCK C
Employee table (DID is a foreign key):
EID  EN  DID
1    A   1
2    B   2
3    C   3
4    A   3
5    E   1
Redundancy
Normalization
Normalization is a process for assigning attributes to entities. It reduces
data redundancies and helps eliminate the data anomalies.
Normalization works through a series of stages called normal forms:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
– Boyce–Codd normal form (or BCNF or 3.5NF)
– Fourth normal form (4NF)
– Fifth normal form
– ……
Moving from 1NF toward the higher normal forms:
– Redundancy decreases
– Number of tables increases
– Complexity increases
Flight Weekdays
UA59 Mo We Fr
UA73 Mo Tu We Th Fr
1NF Solution
Flight Weekday
UA59 Mo
UA59 We
UA59 Fr
UA73 Mo
UA73 Tu
… …
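The 1NF solution above amounts to splitting the multi-valued Weekdays column into one row per (Flight, Weekday) pair. A minimal sketch, assuming the weekdays are stored as a space-separated string as in the first table; the variable names are illustrative.

```python
# Non-1NF table: one row per flight, several weekdays packed into one value.
flights = {
    "UA59": "Mo We Fr",
    "UA73": "Mo Tu We Th Fr",
}

# 1NF table: one atomic weekday per row.
rows_1nf = [
    (flight, day)
    for flight, days in flights.items()
    for day in days.split()
]

for flight, day in rows_1nf:
    print(flight, day)
```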
Second Normal Form
• A table is in 2NF iff
– It is in 1NF and
– no non-prime attribute is dependent on any
proper subset of any candidate key of the table
(i.e. no partial dependency exists)
• A non-prime attribute of a table is an attribute
that is not a part of any candidate key of the
table
• A candidate key is a minimal superkey
Example
• Library allows patrons to request books that
are currently out
Question: Find out whether the given relation is in 2NF or not.
R(ABCDE)
GIVEN:
AB->C
C->D
B->E
Solution:
(AB)+ = ABCDE, and neither A nor B appears on any right-hand side, so AB is the only candidate key.
Prime attributes: A, B. Non-prime attributes: C, D, E.
B->E makes the non-prime attribute E depend on B, a proper subset of the candidate key AB.
A partial dependency exists for the given relation; hence it is in 1NF but not in 2NF.
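The 2NF test used in this solution can be sketched as follows: find the candidate keys, then look for a non-prime attribute in the closure of a proper subset of some key. This is a brute-force sketch for small schemas; the function names are illustrative and not taken from the slides.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

def partial_dependencies(attributes, fds):
    """Pairs (subset, attr): a non-prime attr determined by a proper subset of a key."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                for attr in (closure(combo, fds) - set(combo)) & non_prime:
                    found.add(("".join(combo), attr))
    return sorted(found)

violations = partial_dependencies("ABCDE", [("AB", "C"), ("C", "D"), ("B", "E")])
print(violations)  # a non-empty list means the relation is not in 2NF
```

For the relation above the only reported pair is B determining E, which matches the conclusion that the relation is in 1NF but not in 2NF.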
Question: Find out whether the given relation is in 2NF or not.
R(ABCD)
GIVEN:
AB->C
BC->D
• Table BorrowedBooks
BookNo  Patron     Address            Due
B1      J. Fisher  101 Main Street    3/2/15
B2      L. Perez   202 Market Street  2/28/15
Patron     Address
J. Fisher  101 Main Street
L. Perez   202 Market Street
Another example
• Tournament winners
Tournament            Year  Winner          DOB
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Winner          DOB
Al Fredrickson  21 July 1975
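R(ABCD)
GIVEN:
AB->C
BC->D
Solution:
(AB)+ = ABCD, so AB is the only candidate key. A+ = A and B+ = B, so no non-prime attribute depends on a proper subset of AB.
BC->D violates 3NF because BC is not a superkey and D is a non-prime attribute.
No partial dependency exists for the given relation; hence it is in 2NF but not in 3NF.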
Boyce-Codd Normal Form
• Stricter form of 3NF
• A table T is in BCNF iff
– for every one of its non-trivial dependencies
X → Y, X is a superkey for T
• We can assume
– Manager → Branch
– {Project, Branch} → Manager
Example
Manager Project Branch
Alice Alpha Austin
Bob Delta Houston
Carol Alpha Houston
Alice Delta Austin
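The BCNF condition can be checked mechanically for the Manager, Project, Branch example: a non-trivial dependency X → Y violates BCNF when the closure of X does not cover every attribute. The sketch below assumes the two dependencies stated above are the only ones; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose left-hand side is not a superkey."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs
    ]

attributes = {"Manager", "Project", "Branch"}
fds = [
    ({"Manager"}, {"Branch"}),
    ({"Project", "Branch"}, {"Manager"}),
]
violations = bcnf_violations(attributes, fds)
print(violations)
```

Here Manager → Branch is reported, because Manager alone does not determine Project and so is not a superkey, while {Project, Branch} → Manager is fine.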
Decomposition
Original table R:
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2
Decomposed into R1(A) and R2(B, C):
A
a1
a2
B   C
b1  c1
b2  c2
When we need to combine the tables again we have to perform a natural join.
R1 and R2 share no attribute, so the natural join is the cross product R1 x R2:
A   B   C
a1  b1  c1
a1  b2  c2
a2  b1  c1
a2  b2  c2   (extra tuple, not in the original table)
Lossy Decomposition: the join does not give back the original table, so this is the wrong approach.
Decomposition
Original table R:
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2
Decomposed into R1(A, B) and R2(A, C):
A   B
a1  b1
a2  b1
a1  b2
A   C
a1  c1
a2  c1
a1  c2
Natural join of R1 and R2 on A:
A   B   C
a1  b1  c1
a1  b1  c2   (extra tuple)
a2  b1  c1
a1  b2  c1   (extra tuple)
a1  b2  c2
Lossy Decomposition: the join contains extra tuples that are not in the original table.
Solution 3-Decomposition
Original table R:
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2
Decomposed into R1(A, B) and R2(B, C):
A   B
a1  b1
a2  b1
a1  b2
B   C
b1  c1
b2  c2
Natural join of R1 and R2 on B gives back the original table:
A   B   C
a1  b1  c1
a2  b1  c1
a1  b2  c2
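The three decompositions of the table with tuples (a1, b1, c1), (a2, b1, c1), (a1, b2, c2) can be replayed with a small projection and natural-join sketch. The helper names `project`, `natural_join` and `is_lossless` are illustrative; the check is on this one table instance only.

```python
def project(rows, header, attrs):
    """Projection of rows onto attrs (duplicates collapse because a set is returned)."""
    idx = [header.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, h1, r2, h2):
    """Natural join on the shared attributes; a cross product if none are shared."""
    common = [a for a in h1 if a in h2]
    extra = [a for a in h2 if a not in h1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[h1.index(a)] == t2[h2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[h2.index(a)] for a in extra))
    return joined, h1 + extra

header = ["A", "B", "C"]
original = {("a1", "b1", "c1"), ("a2", "b1", "c1"), ("a1", "b2", "c2")}

def is_lossless(attrs1, attrs2):
    r1 = project(original, header, attrs1)
    r2 = project(original, header, attrs2)
    joined, joined_header = natural_join(r1, attrs1, r2, attrs2)
    # Put the columns back in A, B, C order before comparing.
    return project(joined, joined_header, header) == original

print(is_lossless(["A"], ["B", "C"]))       # cross product, extra tuple
print(is_lossless(["A", "B"], ["A", "C"]))  # join on A, extra tuples
print(is_lossless(["A", "B"], ["B", "C"]))  # join on B
```

Only the split into (A, B) and (B, C) reproduces the original table; the other two joins produce extra tuples.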
Example 1:
• R(ABC), FD: {A->B, B->C, C->A}
• R1(AB), R2(BC)
Q. 1 Lossless decomposition?
Ans. Yes (the common attribute B has closure B+ = ABC, so B is a key of both R1 and R2)
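For a decomposition into two relations, the usual test for a lossless join is that the common attributes functionally determine all of R1 or all of R2. A minimal sketch of that test, applied to Example 1; the function name `lossless_binary` is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2)+ contains all of R1 or all of R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

fds = [("A", "B"), ("B", "C"), ("C", "A")]
answer = lossless_binary("AB", "BC", fds)
print(answer)
```

If the two relations share no attribute, the closure of the empty set covers neither side, so the test reports a lossy decomposition.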
Functional dependency preserving
• Dependency preservation is another property of a decomposed
relational database schema D: each functional dependency X -> Y
specified in F either appears directly in one of the relation schemas Ri of the
decomposition D or can be inferred from the union of the dependencies that hold
in the individual Ri.
Example: Functional dependency preserving
• 1) To find the closure of FD1, we have to consider all combinations of (P, Q, R), i.e., we need to find
the closures of P, Q, R, PQ, QR, and RP.
• From the given result, in FD1, PQ determines R (PQ --> R) and in FD2, R determines S
(R --> S). But no dependency in FD1 or FD2 gives S determines P (S --> P).
• FD1 U FD2 covers only a proper subset of FD, since S --> P cannot be inferred from it.
• So, as a consequence, the given decomposition is not dependency preserving.
Example 1:
• R(ABC), FD: {A->B, B->C, C->A}
• R1(AB), R2(BC)
Q. 1 Lossless decomposition?
Ans. Yes
Q. 2 Dependency preserving?
Ans. Yes (A->B holds in R1, B->C holds in R2, and C->A follows from C->B in R2 and B->A in R1)
Example 2:
• R(ABCD), FD: {AB->CD, D->A}
• R1(AD), R2(BCD)
Q. 1 Lossless decomposition?
Ans. Yes (the common attribute D has closure D+ = DA, which covers R1)
Dependencies projected on R1(AD): A->D (does not hold), D->A
Dependencies projected on R2(BCD): B->B, C->C, D->D, BC->BC, CD->CD, BD->BDC
Only non-trivial dependencies: D->A in R1, BD->C in R2
Q. 2 Dependency preserving?
Ans. No (AB->CD cannot be inferred from D->A and BD->C)
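Dependency preservation can be tested without listing the projected dependencies. For each X -> Y in F, grow a set Z from X by repeatedly adding (closure of Z ∩ Ri) ∩ Ri for every Ri; the dependency is preserved when Y ends up inside Z. The sketch below uses this textbook procedure, which is a different route from the enumeration in the example; the function name `lost_dependencies` is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lost_dependencies(fds, decomposition):
    """Dependencies of fds that cannot be inferred from the projections onto the Ri."""
    lost = []
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # Attributes of Ri reachable from the part of Z that lies inside Ri.
                gained = closure(z & ri, fds) & ri
                if not gained <= z:
                    z |= gained
                    changed = True
        if not set(rhs) <= z:
            lost.append((lhs, rhs))
    return lost

example1 = lost_dependencies([("A", "B"), ("B", "C"), ("C", "A")], ["AB", "BC"])
example2 = lost_dependencies([("AB", "CD"), ("D", "A")], ["AD", "BCD"])
print(example1)  # empty list: dependency preserving
print(example2)  # lists the dependencies that are lost
```

An empty result means the decomposition is dependency preserving. For Example 2 the lost dependency is AB->CD.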
Degree of Redundancy
1NF > 2NF > 3NF > BCNF
BCNF: 0% redundancy due to FDs, but not 0% redundancy due to MVDs
DB Design Goals     1NF  2NF  3NF  BCNF
1. 0% redundancy    NO   NO   NO   YES (FD), NO (MVD)
Restaurant Pizza
Pizza Milano Thin crust
Pizza Milano Thick crust
Pizza Firenze Thin crust
Pizza Firenze Thick crust
Join dependency
Store Brand
Circuit City Apple
Circuit City Toshiba
CompUSA Apple
Lossless Decomposition: General Concept
• R = π_AB(R) ⋈ π_AC(R)
– π_AB(R) is the projection of R on AB
– ⋈ is the natural join operator
Example
π_Course,Text(R):
Course  Text
4330    none
3330    Patterson & Hennessy
π_Course,Instructor(R):
Course  Instructor
4330    Paris
4330    Cheng
3330    Hillford
A different case
π_Course,Instructor(R):
Course  Instructor
4330    Paris
4330    Cheng
3330    Hillford
Levels of Normalization
• Levels of normalization are based on the amount
of redundancy in the database.
• Various levels of normalization are:
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
– Fifth Normal Form (5NF)
– Domain Key Normal Form (DKNF)
• From 1NF toward DKNF, redundancy decreases while the number of tables and the complexity increase.
Most databases should be 3NF or BCNF in order to avoid the database anomalies.
An Example
Normalisation Example
– Second table
Order Customer Address
2NF Solution (II)
• Second decomposition
– First table
Order Product Quantity
– Second table
Product UnitPrice
3NF
• In second table
Order Customer
Customer Address
Normalisation to 2NF
• 1NF:
– {Order, Product, Customer, Address, Quantity, UnitPrice}
• 2NF:
– {Order, Customer, Address}, {Product, UnitPrice}, and
{Order, Product, Quantity}
• 3NF:
– {Product, UnitPrice}, {Order, Product, Quantity},
{Order, Customer}, and {Customer, Address}
Another Example
First Normal Form (1NF)
Example (Not 1NF)
AuName and AuPhone columns are not scalar
1NF - Decomposition
1. Place all items that appear in the repeating group
in a new table
2. Designate a primary key for each new table
produced.
3. Duplicate in the new table the primary key of the
table from which the repeating group was
extracted or vice versa.
Example (1NF)
0-55-123456-9 Main Street Small House 714-000-0000 $22.95
ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-55-123456-9  Jones   123-333-3333
Mary NULL Adam
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF Violated
MCQ
A functional dependency is a relationship between or
among
A. Entities
B. Rows
C. Attributes
D. Tables
MCQ
If one attribute is determinant of second, which in turn is
determinant of third, then the relation cannot be:
A. Well-structured
B. 1NF
C. 2NF
D. 3NF
MCQ
A relation is in this form if it is in BCNF and has no
multivalued dependencies
A. 2NF
B. 3NF
C. 4NF
D. DKNF
MCQ
In the __________ normal form, a composite
attribute is converted to individual attributes.
a) First
b) Second
c) Third
d) Fourth
MCQ
Which is a bottom-up approach to database
design that works by examining the relationships
between attributes:
a) Functional dependency
b) Database modeling
c) Normalization
d) Decomposition