NORMALIZATION
BASIC NORMAL FORMS
INTRODUCTION
When we carefully define an E-R diagram, identifying
all entities correctly, the tables generated from the
E-R diagram should not need further normalization.
However, there can be functional dependencies
between attributes of an entity.
Normalization makes the database more able to
accommodate changes in the structure of the data. It
also protects the database against certain kinds of
errors.
Normalization is a process of rearranging the
database to put it into a standard (normal) form that
prevents modification anomalies.
DIFFERENT NORMAL FORMS
1NF: based on attributes only
2NF, 3NF, BCNF: based on keys and FDs
4NF: based on keys and multi-valued dependencies (MVDs)
5NF or PJNF: based on keys and join dependencies
DKNF: based on all constraints
1. FIRST NORMAL FORM (1NF)
All attributes must be atomic, and there must be no repeating groups.
A few extra properties are added to make the database more
useful, but these rules are mostly basic.
1. Each column must have a unique name.
2. The order of the rows and columns doesn’t matter.
3. Each column must have a single data type.
4. No two rows can contain identical values.
5. Each column must contain a single value.
6. Columns cannot contain repeating groups.
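Some of the rules above can be checked mechanically. The following is a minimal Python sketch, not a complete 1NF validator: it covers rules 1, 3, 4 and 5 only, because rule 2 (order) and rule 6 (repeating groups) depend on what the data means. The function name and the sample rows are hypothetical, invented for illustration.

```python
def first_normal_form_problems(columns, rows):
    """Return a list of the mechanically checkable 1NF rule violations."""
    problems = []
    # Rule 1: each column must have a unique name.
    if len(set(columns)) != len(columns):
        problems.append("duplicate column names")
    # Rule 4: no two rows can contain identical values.
    # repr() is used so that unhashable cells such as lists can be compared.
    if len({tuple(repr(cell) for cell in row) for row in rows}) != len(rows):
        problems.append("duplicate rows")
    for index, name in enumerate(columns):
        cells = [row[index] for row in rows]
        # Rule 5: each column must contain a single value.
        if any(isinstance(cell, (list, tuple, set, dict)) for cell in cells):
            problems.append(f"column {name!r} holds multiple values")
        # Rule 3: each column must have a single data type.
        if len({type(cell) for cell in cells}) > 1:
            problems.append(f"column {name!r} mixes data types")
    return problems


# Hypothetical rows, invented for illustration only.
columns = ["Passenger", "City", "City", "Connections"]
rows = [
    ("Alice", "Boston", "Denver", ["Chicago", "Omaha"]),
    ("Bob", "Boston", "Denver", 1),
    ("Bob", "Boston", "Denver", 1),
]
problems = first_normal_form_problems(columns, rows)
for problem in problems:
    print(problem)
```

A table that passes these checks can still violate 1NF through rule 2 or rule 6, so the result is a starting point rather than a verdict.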
1NF – DEMONSTRATION
STEPS TO MAKE THIS RELATION INTO 1NF.
1. Make sure every column has a unique name.
This table has two columns named City. To fix this
problem, rename those columns to StartCity and
DestinationCity.
2. Make sure the order of the rows and columns
doesn’t matter. If the order of the rows matters,
add a column to record the information implied by
their positions. In this example, make a new
Priority column and explicitly list the passengers’
priorities.
3. Make sure each column holds a single data type.
If a column holds more than one type of data, split it
into multiple columns, one for each data type. In
this case, list the connecting cities, and don’t even
record the number of cities. Just count them when
necessary.
4. Make sure no two rows can contain identical
values. If two rows contain identical values, add a
field to differentiate them. In this case, add a
CustomerId column so you can tell the customers
apart.
5. Make sure each column contains a single value. If a column
holds multiple data values, split them out into a new table. Make
a new Connections table and move the connection data there.
To tie those records back to their original rows in this table, add
columns that correspond to the primary key columns in the
original table (CustomerId and Date).
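Step 5 can be sketched in Python as follows. This is an illustration under assumptions, not the slide's own table: the sample trips are hypothetical, and the Position column is an added assumption that keeps the order of the connecting cities explicit, in the spirit of step 2.

```python
def split_connections(trips):
    """Move the multi-valued Connections column into its own table."""
    trip_rows = []
    connection_rows = []
    for trip in trips:
        # Keep every single-valued column in the original table.
        trip_rows.append({k: v for k, v in trip.items() if k != "Connections"})
        # Emit one Connections row per city, tied back by CustomerId and Date.
        for position, city in enumerate(trip["Connections"], start=1):
            connection_rows.append({
                "CustomerId": trip["CustomerId"],
                "Date": trip["Date"],
                "Position": position,  # assumed column, records the order
                "City": city,
            })
    return trip_rows, connection_rows


# Hypothetical data, invented for illustration only.
trips = [
    {"CustomerId": 1, "Date": "2024-04-01", "StartCity": "Boston",
     "DestinationCity": "Denver", "Connections": ["Chicago", "Omaha"]},
    {"CustomerId": 2, "Date": "2024-04-01", "StartCity": "Boston",
     "DestinationCity": "Austin", "Connections": []},
]
trip_rows, connection_rows = split_connections(trips)
```

A trip with no connections simply contributes no rows to the new table, so no empty placeholder values are needed.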
6. Make sure multiple columns don’t contain repeating groups.
In this example, you need to think about the two city fields
and decide whether they contain distinguishable values.
CLASS WORK
Make the following into 1NF
3
1
SOLUTIONS
1.
PRIME ATTRIBUTE, FULL FUNCTIONAL DEPENDENCY
AND TRANSITIVE FUNCTIONAL DEPENDENCY
A prime attribute must be a member of some candidate key
Example: roll
A non-prime attribute is not a member of any candidate key
Example: gender
An FD X → Y is a full functional dependency if the FD does not hold when any attribute
from X is removed
Example: (roll) → (name)
It is a partial functional dependency otherwise
Example: (roll, gender) → (name)
An FD X → Y is a transitive functional dependency if it can be derived from two FDs
X → Z and Z → Y
Example: (roll) → (hod) since (roll) → (deptid) and (deptid) → (hod) hold
It is non-transitive otherwise
Example: (roll) → (name)
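These definitions can be tested with the attribute-closure algorithm. The sketch below is a minimal Python illustration using the roll, name, gender, deptid and hod examples above. The function names are hypothetical, and the FD list contains only the dependencies named in the examples.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def holds(lhs, rhs, fds):
    """True if the FD lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)


def is_full(lhs, rhs, fds):
    """True if lhs -> rhs holds and breaks when any attribute leaves lhs."""
    return holds(lhs, rhs, fds) and not any(
        holds(set(lhs) - {attribute}, rhs, fds) for attribute in lhs
    )


fds = [
    (("roll",), ("name",)),
    (("roll",), ("deptid",)),
    (("deptid",), ("hod",)),
]

roll_determines_hod = holds(("roll",), ("hod",), fds)             # transitive FD
roll_name_is_full = is_full(("roll",), ("name",), fds)            # full FD
padded_is_full = is_full(("roll", "gender"), ("name",), fds)      # partial FD
```

Here (roll) → (hod) is never listed directly; it is found only because the closure of (roll) picks up deptid first and then hod.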
2. SECOND NORMAL FORM (2NF)
A relation is in 2NF if
Every non-prime attribute is fully functionally dependent on every
candidate key
(Or)
Alternatively, every attribute should either be
In a candidate key or
Depend fully on every candidate key
(Or)
1. It is in 1NF.
2. All of the non-key fields depend on all of the key
fields.
If a table is not in second normal form, remove
those non–key fields that are not dependent on
the whole of the primary key. Create another
table with these fields and the part of the primary
key on which they do depend.
PROBLEM
Consider (Id, ProjId, Hrs, Name, ProjName) with FDs:
(Id, ProjId) → (Hrs);
(Id) → (Name);
(ProjId) → (ProjName)
It is not in 2NF since (Name) and (ProjName) each depend
on only part of the key (Id, ProjId)
After 2NF normalization
(Id, ProjId, Hrs) with FD: (Id, ProjId) → (Hrs)
(Id, Name) with FD: (Id) → (Name)
(ProjId, ProjName) with FD: (ProjId) → (ProjName)
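The decomposition in this problem can be reproduced mechanically. The following Python sketch is a simplified illustration, not a general algorithm: it assumes a single candidate key that is supplied by hand, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = set(attributes) - set(key)
    found = {}
    assigned = set()
    # Smallest subsets first, so each attribute lands with its smallest determinant.
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = (closure(subset, fds) & non_key) - assigned
            if determined:
                found[subset] = determined
                assigned |= determined
    return found


def decompose_2nf(key, attributes, fds):
    """Split off one relation per partial dependency; keep the rest with the key."""
    partial = partial_dependencies(key, attributes, fds)
    moved = set().union(*partial.values())
    relations = [set(attributes) - moved]
    relations += [set(subset) | dependents for subset, dependents in partial.items()]
    return relations


attributes = {"Id", "ProjId", "Hrs", "Name", "ProjName"}
fds = [
    (("Id", "ProjId"), ("Hrs",)),
    (("Id",), ("Name",)),
    (("ProjId",), ("ProjName",)),
]
relations = decompose_2nf(("Id", "ProjId"), attributes, fds)
```

The result is the same three relations as in the worked answer: the key with Hrs, Id with Name, and ProjId with ProjName.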
PROBLEM 2
PROBLEM 3
CLASS WORK
Make the following relations into 2NF
1.
2.
3.
R = {part_no, part_description, part_price, supplier_id, supplier_address}
4.
SOLUTIONS
1.
2.
3.
part_no → part_description
supplier_id → supplier_address
{part_no, supplier_id} → part_price
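Before checking 2NF, the candidate key has to be known. One way to find it from the FDs above is a brute-force search over attribute subsets, sketched here in Python. The function names are hypothetical, and the search is exponential in the number of attributes, so it is suitable only for small classroom relations.

```python
from itertools import combinations


def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    # Try smaller sets first so that every key found is minimal.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


R = {"part_no", "part_description", "part_price",
     "supplier_id", "supplier_address"}
fds = [
    (("part_no",), ("part_description",)),
    (("supplier_id",), ("supplier_address",)),
    (("part_no", "supplier_id"), ("part_price",)),
]
keys = candidate_keys(R, fds)
```

For this relation the only candidate key is {part_no, supplier_id}, so part_description and supplier_address each depend on just part of the key.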
4.
3. THIRD NORMAL FORM (3NF)
A relation is in 3NF if
It is in 2NF, and
No non-prime attribute is transitively functionally dependent on
any candidate key
Alternatively, for every FD X → Y, either
It is trivial, or
X is a superkey, or
Every attribute in Y - X is prime
Alternatively, every non-prime attribute should be
Fully functionally dependent on every key, and
Non-transitively dependent on every key
PROBLEM 1
Consider (Id, Name, ProjId, ProjName) with FDs:
(Id) → (Name, ProjId); (ProjId) → (ProjName)
It is not in 3NF since (ProjName) depends transitively on (Id) through
(ProjId)
After 3NF normalization,
(Id, Name, ProjId) with FD: (Id) → (Name, ProjId)
(ProjId, ProjName) with FD: (ProjId) → (ProjName)
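The FD-based test for 3NF (each FD is trivial, or its left side is a superkey, or every attribute on its right side that is not on its left is prime) can be applied to this problem directly. The Python sketch below is a minimal illustration with hypothetical function names. It checks only the FDs that are listed, and the FDs of each decomposed relation are supplied by hand rather than derived.

```python
from itertools import combinations


def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


def violations_3nf(attributes, fds):
    """Return the listed FDs that break the 3NF test."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs:
            continue  # trivial FD
        if closure(lhs, fds) == attributes:
            continue  # left side is a superkey
        if rhs - lhs <= prime:
            continue  # every remaining right-side attribute is prime
        bad.append((lhs, rhs))
    return bad


before = violations_3nf(
    {"Id", "Name", "ProjId", "ProjName"},
    [(("Id",), ("Name", "ProjId")), (("ProjId",), ("ProjName",))],
)
after_first = violations_3nf(
    {"Id", "Name", "ProjId"}, [(("Id",), ("Name", "ProjId"))]
)
after_second = violations_3nf(
    {"ProjId", "ProjName"}, [(("ProjId",), ("ProjName",))]
)
```

Only (ProjId) → (ProjName) is reported for the original relation, and neither decomposed relation reports a violation.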
PROBLEM 2
CLASS WORK
Make the following relations into 3NF
1.
Student (Roll no, name, dept, year, hostelname)
- If students in a given year are all put in one hostel then year and the hostel
are functionally dependent
- Year implies hostel, hostel name unnecessarily duplicated
- If all students of year 1 are moved to another hostel many tuples need to be
changed.
2.
Employee (empcode, name, salary, project no, termination date of project)
3.
Passenger (Ticket no, Passenger name, Train no, Departure time, Fare)
4.
5.
6.
7.
8.
Student_id, Student_name, DOB, Street, city, State, Zip
SOLUTIONS
1.
Student ( Roll no, name, dept, year)
Hostel (year, hostel)
2.
Employee (empcode, name, salary, project no)
Project (project no, termination date of project)
3.
Passenger (Ticket no, Passenger name, Train no, Fare)
Train details (Train no, departure time)
4.
5.
6.
7.
8.
(Student_id, Student_name, DOB, Zip)
(Zip, Street, city, State)
3 NF ALL TOGETHER
PROBLEM
CONT…
2NF:
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
3NF:
PROJECT (PROJ_NUM, PROJ_NAME)
ASSIGN (PROJ_NUM, EMP_NUM, HOURS)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
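Splitting EMPLOYEE into EMPLOYEE and JOB loses no information, because joining the two tables on JOB_CLASS gives back the 2NF rows. The Python sketch below illustrates this with hypothetical rows; the employee names and hourly charges are invented for the example and do not come from the problem's table.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate results."""
    projected_rows = []
    for row in rows:
        projected = {column: row[column] for column in columns}
        if projected not in projected_rows:
            projected_rows.append(projected)
    return projected_rows


def natural_join(left, right, on):
    """Join two lists of rows on a single shared column."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical 2NF EMPLOYEE rows, invented for illustration only.
employee_2nf = [
    {"EMP_NUM": 101, "EMP_NAME": "A. Rao", "JOB_CLASS": "Engineer", "CHG_HOUR": 85.0},
    {"EMP_NUM": 102, "EMP_NAME": "B. Iyer", "JOB_CLASS": "Engineer", "CHG_HOUR": 85.0},
    {"EMP_NUM": 103, "EMP_NAME": "C. Das", "JOB_CLASS": "Analyst", "CHG_HOUR": 60.0},
]

# 3NF decomposition: CHG_HOUR moves to JOB, keyed by JOB_CLASS.
employee_3nf = project(employee_2nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS"])
job = project(employee_2nf, ["JOB_CLASS", "CHG_HOUR"])

rejoined = natural_join(employee_3nf, job, "JOB_CLASS")
```

In this sample the Engineer rate is stored once in JOB instead of once per engineer, so changing it means updating a single row.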
CLASS WORK
Normalize the following relation up to 3NF
SOLUTION
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
SUMMARY
Informally
1NF: All attributes depend on the key
2NF: All attributes depend on the whole key
3NF: All attributes depend on nothing but the key
Tests
1NF: The relation should have no multivalued attributes or nested relations
2NF: For a relation where candidate key contains multiple attributes, no nonkey
attribute should be functionally dependent on a part of the candidate key
3NF: The relation should not have a nonkey attribute functionally determined by
a set of nonkey attributes.
Remedies
1NF: Form new relations for each multi-valued attribute or nested relation.
2NF: Decompose and set up a relation for each partial key with its dependent(s);
retain the primary key and attributes fully dependent on it.
3NF: Decompose
and set up a relation for each nonkey attribute with nonkey attributes