Lecture 5
Normalization
Lecturer: Rana Salah
[email protected]
Room: 3005
Made by:
Shahinaz S. Azab
Edited by:
Mona Saleh
Rana Salah
What is Normalization?
• Normalization of data: a process that takes a table
through a series of tests (normal forms) to certify the
goodness of a design and thus minimize
redundancy and anomalies (insert, update, and delete
anomalies)
Why do we need Normalization?
Normalization Avoids
• Duplication of Data
• Insert Anomaly
• Delete Anomaly
• Update Anomaly
• Frequent Null Values
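To make two of these anomalies concrete, here is a minimal Python sketch. The row layout and the helper `level_managers` are hypothetical; the values follow the School Example used in this lecture, and the new manager name "Sara K." is invented for illustration.

```python
# Unnormalized rows: the manager of a level is repeated on every
# student row that belongs to that level.
students = [
    {"stud_id": 11, "name": "Ali", "level": "Primary", "level_mgr": "Noha M."},
    {"stud_id": 22, "name": "Mai", "level": "Primary", "level_mgr": "Noha M."},
    {"stud_id": 33, "name": "Marwa", "level": "Secon.", "level_mgr": "Moh.A."},
]

def level_managers(rows):
    """Collect every manager name recorded for each level."""
    managers = {}
    for row in rows:
        managers.setdefault(row["level"], set()).add(row["level_mgr"])
    return managers

# Update anomaly: the Primary manager changes, but only one copy is updated,
# so the table now records two different managers for the same level.
students[0]["level_mgr"] = "Sara K."  # hypothetical new manager
inconsistent = {lvl: m for lvl, m in level_managers(students).items() if len(m) > 1}
print(sorted(inconsistent))  # ['Primary']

# Delete anomaly: removing the only Secon. student also loses the fact
# that Moh.A. manages that level.
remaining = [r for r in students if r["stud_id"] != 33]
print("Secon." in level_managers(remaining))  # False
```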
When to use Normalization?
• To certify the goodness of a relational schema design
• When inheriting an existing database design from
legacy models or from existing files
Functional Dependency
• A constraint between two attributes (columns) or two
sets of attributes
• A → B if, in every valid instance of the relation, each
value of A uniquely determines the value of B
• Equivalently, A → B if there is at most one value of B
for every value of A
Examples
• Social security number determines employee name
SSN → ENAME
• Project number determines project name and location
PNUMBER → {PNAME, PLOCATION}
• Employee SSN and project number together determine
the hours per week that the employee works on the
project
{SSN, PNUMBER} → HOURS
• So "functional dependency" is the technical term
for "determines"
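The three dependencies above can be tested against sample data. The sketch below is illustrative only: the function `fd_holds` and the rows in `works_on` are hypothetical, not part of the lecture. Note that a sample can only refute a dependency; an FD is a constraint on every valid instance, so passing the check on one sample does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this list of row dicts."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; a different
        # value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows
works_on = [
    {"SSN": "111", "ENAME": "Ali", "PNUMBER": 1, "PNAME": "Payroll",
     "PLOCATION": "Cairo", "HOURS": 10},
    {"SSN": "111", "ENAME": "Ali", "PNUMBER": 2, "PNAME": "Portal",
     "PLOCATION": "Giza", "HOURS": 5},
    {"SSN": "222", "ENAME": "Mai", "PNUMBER": 1, "PNAME": "Payroll",
     "PLOCATION": "Cairo", "HOURS": 20},
]

print(fd_holds(works_on, ["SSN"], ["ENAME"]))                   # True
print(fd_holds(works_on, ["PNUMBER"], ["PNAME", "PLOCATION"]))  # True
print(fd_holds(works_on, ["SSN", "PNUMBER"], ["HOURS"]))        # True
print(fd_holds(works_on, ["SSN"], ["HOURS"]))                   # False
```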
Types of Functional Dependency
• Full Functional Dependency: X → Y is an FFD if
removing any attribute A from X means that the
dependency no longer holds
• Partial Functional Dependency: X → Y is a PFD if
some attribute A ∈ X can be removed from X and the
dependency still holds
• Transitive Functional Dependency: X → Y in a relation
R is a TFD if there exists a set of attributes Z in R that
is neither a candidate key nor a subset of any key of R,
and both X → Z and Z → Y hold
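The full/partial distinction can be checked mechanically by removing one attribute at a time from the left-hand side, exactly as the definitions describe. This is a sketch on hypothetical data; `fd_holds`, `classify_fd`, and the sample rows are illustrative names and values, and the result only describes this sample, not the schema's real constraints.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by this list of row dicts."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def classify_fd(rows, lhs, rhs):
    """Return 'none', 'partial', or 'full' for lhs -> rhs on this sample."""
    if not fd_holds(rows, lhs, rhs):
        return "none"
    for attr in lhs:
        reduced = [a for a in lhs if a != attr]
        # If the dependency survives after dropping an attribute,
        # it is only a partial dependency.
        if reduced and fd_holds(rows, reduced, rhs):
            return "partial"
    return "full"

# Hypothetical sample rows
works_on = [
    {"SSN": "111", "ENAME": "Ali", "PNUMBER": 1, "HOURS": 10},
    {"SSN": "111", "ENAME": "Ali", "PNUMBER": 2, "HOURS": 5},
    {"SSN": "222", "ENAME": "Mai", "PNUMBER": 1, "HOURS": 20},
]

print(classify_fd(works_on, ["SSN", "PNUMBER"], ["HOURS"]))  # full
print(classify_fd(works_on, ["SSN", "PNUMBER"], ["ENAME"]))  # partial
```

ENAME is partially dependent on {SSN, PNUMBER} because SSN alone already determines it, which is the situation 2NF removes.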
Definition
• Normalization: The process of decomposing
unsatisfactory "bad" relations by breaking up their
attributes into smaller relations
• Normal form: Condition using keys and FDs of a relation
to certify whether a relation schema is in a particular
normal form
First Normal Form
• A relation is in 1NF if it contains no multivalued
attributes, repeating groups, or composite attributes
To put a relation in 1NF
• Remove each repeating group and place it in a new
table carrying the PK as a FK
• Remove each multivalued attribute and place it in a
new table carrying the PK as a FK
• Put each subpart of a composite attribute in its own
column when necessary
School Example
Stud_ID  Name   Location  Tel       Level    Level_Mgr  Subject  Subj-Desc           Grade
11       Ali    Cairo     010       Primary  Noha M.    DB, CN   Database, Networks  A, B
22       Mai    Giza      011, 010  Primary  Noha M.    CN, DB   Networks, Database  B, C
33       Marwa  Giza      010       Secon.   Moh.A.     SW, DB   Software, Database  A, A
School Example 1NF
• Student (Stud_ID, Name, Location, Level, Level_Mgr)
• Student_Tel (Stud_ID, Tel)
• Stud_Subject (Stud_ID, Subject, Subject_Desc, Grade)
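The 1NF step for the School Example can be sketched in Python as follows. The function `to_1nf` and the dictionary layout are hypothetical. The values are transcribed from the School Example table (with the manager spelling normalized), and the sketch assumes that subjects, descriptions, and grades in a cell are paired by position.

```python
# Unnormalized rows: multivalued cells are held as Python lists.
unnormalized = [
    {"Stud_ID": 11, "Name": "Ali", "Location": "Cairo", "Tel": ["010"],
     "Level": "Primary", "Level_Mgr": "Noha M.",
     "Subjects": [("DB", "Database", "A"), ("CN", "Networks", "B")]},
    {"Stud_ID": 22, "Name": "Mai", "Location": "Giza", "Tel": ["011", "010"],
     "Level": "Primary", "Level_Mgr": "Noha M.",
     "Subjects": [("CN", "Networks", "B"), ("DB", "Database", "C")]},
    {"Stud_ID": 33, "Name": "Marwa", "Location": "Giza", "Tel": ["010"],
     "Level": "Secon.", "Level_Mgr": "Moh.A.",
     "Subjects": [("SW", "Software", "A"), ("DB", "Database", "A")]},
]

def to_1nf(rows):
    """Split multivalued and repeating data into tables that carry Stud_ID."""
    student, student_tel, stud_subject = [], [], []
    for r in rows:
        student.append((r["Stud_ID"], r["Name"], r["Location"],
                        r["Level"], r["Level_Mgr"]))
        # Multivalued attribute: one row per phone number.
        for tel in r["Tel"]:
            student_tel.append((r["Stud_ID"], tel))
        # Repeating group: one row per (student, subject) pair.
        for subject, desc, grade in r["Subjects"]:
            stud_subject.append((r["Stud_ID"], subject, desc, grade))
    return student, student_tel, stud_subject

student, student_tel, stud_subject = to_1nf(unnormalized)
print(len(student), len(student_tel), len(stud_subject))  # 3 4 6
```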
Second Normal Form
• A relation is in 2NF if it is in 1NF and no nonkey
attribute is partially dependent on the primary key
To put a relation in 2NF
• Move each partially dependent nonkey attribute,
together with the part of the key it depends on, into a
new table
Guidelines
• A relation is in 2NF if it is in 1NF and any one of these
is true:
• the PK consists of only 1 attribute
• all attributes are part of the PK (no nonkey attributes)
• every nonkey attribute is functionally dependent on the whole
PK
School Example 2NF
• Student (Stud_ID, Name, Location, Level, Level_Mgr)
• Student_Tel (Stud_ID, Tel)
• Stud_Subject (Stud_ID, Subject, Grade)
• Subject (Subject, Subject_Desc)
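In this 2NF step, Subject_Desc depends only on Subject, which is just part of the key (Stud_ID, Subject), so it moves to its own table. A minimal sketch of that projection follows; the helper `project` is hypothetical, and the rows assume the positional pairing of subjects and grades from the School Example.

```python
# 1NF rows: (Stud_ID, Subject, Subject_Desc, Grade)
stud_subject_1nf = [
    (11, "DB", "Database", "A"), (11, "CN", "Networks", "B"),
    (22, "CN", "Networks", "B"), (22, "DB", "Database", "C"),
    (33, "SW", "Software", "A"), (33, "DB", "Database", "A"),
]

def project(rows, indexes):
    """Project tuples onto column positions, dropping duplicates, keeping order."""
    result = []
    for row in rows:
        projected = tuple(row[i] for i in indexes)
        if projected not in result:
            result.append(projected)
    return result

stud_subject = project(stud_subject_1nf, [0, 1, 3])  # (Stud_ID, Subject, Grade)
subject = project(stud_subject_1nf, [1, 2])          # (Subject, Subject_Desc)

print(subject)  # [('DB', 'Database'), ('CN', 'Networks'), ('SW', 'Software')]

# Joining the two tables on Subject gives back the 1NF rows,
# so no information was lost by the decomposition.
desc_of = dict(subject)
rejoined = [(sid, sub, desc_of[sub], grade) for sid, sub, grade in stud_subject]
print(rejoined == stud_subject_1nf)  # True
```

Each description is now stored once (3 rows) instead of once per enrollment (6 rows).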
Third Normal Form
• A relation is in 3NF if it is in 2NF and no nonkey
attribute is transitively dependent on the primary key
To put a relation in 3NF
• Move each transitively dependent nonkey attribute,
together with the nonkey attribute it depends on, into a
new table (hint: also keep the determining nonkey
attribute in the original table, as a foreign key)
School Example 3NF
• Student (Stud_ID, Name, Location, Level)
• Level (Level, Level_Mgr)
• Student_Tel (Stud_ID, Tel)
• Stud_Subject (Stud_ID, Subject, Grade)
• Subject (Subject, Subject_Desc)
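One possible way to declare this 3NF design is sketched below with Python's built-in sqlite3 module. The column types, the NOT NULL choices, and the new manager name "Sara K." are assumptions added for illustration; the slides give only the table and column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Level (
    Level     TEXT PRIMARY KEY,
    Level_Mgr TEXT NOT NULL
);
CREATE TABLE Student (
    Stud_ID  INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    Location TEXT,
    Level    TEXT REFERENCES Level(Level)
);
CREATE TABLE Student_Tel (
    Stud_ID INTEGER REFERENCES Student(Stud_ID),
    Tel     TEXT,
    PRIMARY KEY (Stud_ID, Tel)
);
CREATE TABLE Subject (
    Subject      TEXT PRIMARY KEY,
    Subject_Desc TEXT
);
CREATE TABLE Stud_Subject (
    Stud_ID INTEGER REFERENCES Student(Stud_ID),
    Subject TEXT REFERENCES Subject(Subject),
    Grade   TEXT,
    PRIMARY KEY (Stud_ID, Subject)
);
""")

conn.execute("INSERT INTO Level VALUES ('Primary', 'Noha M.')")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(11, "Ali", "Cairo", "Primary"), (22, "Mai", "Giza", "Primary")],
)

# The manager is stored once, so a single UPDATE changes it for every
# Primary student and no update anomaly can occur.
conn.execute("UPDATE Level SET Level_Mgr = 'Sara K.' WHERE Level = 'Primary'")
rows = conn.execute("""
    SELECT s.Name, l.Level_Mgr
    FROM Student AS s JOIN Level AS l ON l.Level = s.Level
    ORDER BY s.Stud_ID
""").fetchall()
print(rows)  # [('Ali', 'Sara K.'), ('Mai', 'Sara K.')]
```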
ITI Example
ITI Students Sheet
Student Number: ITI205-40                    F-code: ENG
Student Name: Hassan Ali Ahmed               Faculty: Engineering
Address (Street, City): 12 Haram st, Giza    Major: Computer
Tel no/Mobile: 33868420, 01111111253

Department Name  Department Description              Admission grade  Comments
ERP-SAP          ERP-SAP Functional Consultant       59               Average personality
Java-MAD         Java mobile applications developer  70               Very Good
CS               Cyber Security                      60               Above average technical
1NF
• Student (Stud_No, Stud_Name, F-code, Faculty,
Major, Street, City)
• Student_Tel (Stud_No, Tel_No)
• Department_Student (Dept_Name, Stud_No,
Dept_Desc, Ad_Grade, Comments)
2NF
• Student (Stud_No, Stud_Name, F-code, Faculty,
Major, Street, City)
• Student_Tel (Stud_No, Tel_No)
• Department_Student (Dept_Name, Stud_No,
Ad_Grade, Comments)
• Department (Dept_Name, Dept_Desc)
3NF
• Student (Stud_No, Stud_Name, F-code, Major, Street,
City)
• Faculty (F-code, Faculty)
• Student_Tel (Stud_No, Tel_No)
• Department_Student (Dept_Name, Stud_No,
Ad_Grade, Comments)
• Department (Dept_Name, Dept_Desc)
Real World - School Data
Student
First     Parent 1       Parent 2        Application No
Renee     Ann Jones      Theodore Smith  123
Lucy      Barbara Mills  Steve Mills     558
Jennifer  Brendan Jones  Stephen Jones   145 …

City       Postal Code  Birth date  Previous Teacher  Current Teacher
Annandale  22003        6/25/1983   Hamil             Burke
Annandale  22003        8/14/1983   Hamil             Burke
Fairfax    22032        6/13/1984   Hamil             Burke

Student_Phone                  Course  Course-desc  Enrolled             Attended days
(703) 323-0893, (703) 3240708  X,Y,Z   X,y,z        96/97, 96/97, 97/98  0,0,0
(703) 764-5829                 Y       Y            96/97                0
(703) 978-1083                 Z       Z            96/97                0
0NF (unnormalized)
• Student (App_No, Stud_Fname, Parent1, Parent2,
City, Postal_Code, Birthdate, Prev_Teacher,
Curr_Teacher, Student_Phone, Course,
Course_Desc, Enrolled, Att_Days)
1NF
• Student (App_No, Stud_Fname, Parent1, Parent2,
City, Postal_Code, Birthdate, Prev_Teacher,
Curr_Teacher)
• Student_Course (App_No, Course, Course_Desc,
Enrolled, Att_Days)
• Student_Phone (App_No, Phone)
2NF
• Student (App_No, Stud_Fname, Parent1, Parent2,
City, Postal_Code, Birthdate, Prev_Teacher,
Curr_Teacher)
• Student_Course (App_No, Course, Enrolled,
Att_Days)
• Student_Phone (App_No, Phone)
• Course (Course, Course_Desc)
3NF
• Student (App_No, Stud_Fname, Parent1, Parent2,
Postal_Code, Birthdate, Prev_Teacher,
Curr_Teacher)
• Student_Course (App_No, Course, Enrolled,
Att_Days)
• Student_Phone (App_No, Phone)
• Course (Course, Course_Desc)
• City (Postal_Code, City)
Suppliers Data
• Relation (s#, country, currency, p#, qty)
where
• s#: supplier identification number (identifies each supplier)
• country: name of the country where the supplier is located
• currency: currency of the supplier's country
• p#: part number of the part supplied
• qty: quantity of parts supplied to date
• To uniquely associate the quantity supplied (qty) with a part
(p#) and a supplier (s#), a composite primary key composed of
s# and p# is used.
1NF
• Supplier (S#, country, currency)
• Supplier_Parts (S#, P#, qty)
2NF
Same as 1NF
• Supplier (S#, country, currency)
• Supplier_Parts (S#, P#, qty)
3NF
• Supplier (S#, Country)
• Country (Country, Currency)
• Supplier_Parts (S#, P#, qty)
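The reason for this split is the transitive chain S# → Country → Currency. The short Python sketch below illustrates it; all supplier rows and variable names are invented for the example.

```python
# Hypothetical rows for the 2NF table Supplier (S#, country, currency).
supplier = [
    ("S1", "Egypt", "EGP"),
    ("S2", "Egypt", "EGP"),
    ("S3", "Japan", "JPY"),
]

# Check country -> currency on this sample: setdefault keeps the first
# currency seen for a country, so a mismatch would be a violation.
currency_of = {}
for s_no, country, currency in supplier:
    assert currency_of.setdefault(country, currency) == currency

# 3NF decomposition: project onto (S#, country) and (country, currency).
supplier_3nf = sorted({(s_no, country) for s_no, country, _ in supplier})
country_3nf = sorted(currency_of.items())
print(country_3nf)  # [('Egypt', 'EGP'), ('Japan', 'JPY')]

# Joining the projections on country gives back the original rows.
rejoined = sorted((s_no, c, currency_of[c]) for s_no, c in supplier_3nf)
print(rejoined == sorted(supplier))  # True
```

The currency of Egypt is now stored once instead of once per Egyptian supplier.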
Sales Order
Fiction Company
202 N. Main
Manhattan, KS 66502

Customer Number: 1001                   Sales Order Number: 405
Customer Name: ABC Company              Sales Order Date: 2/1/2000
Customer Address: 100 Points            Clerk Number: 210
                  Manhattan, KS 66502   Clerk Name: Martin Lawrence

Item Ordered  Description    Quantity  Unit Price  Total
800           widgit small   40        60.00       2,400.00
801           tingimajigger  20        20.00       400.00
805           thingibob      10        100.00      1,000.00
                                       Order Total 3,800.00
Questions?