Chapter 7 Normalization

The document discusses normalization in databases. The purpose of normalization is to organize attributes into relations to reduce data redundancy and anomalies. Normalization is achieved through several normal forms - first normal form removes repeating groups, second normal form removes partial dependencies, third normal form removes transitive dependencies, and Boyce-Codd normal form ensures every determinant is a candidate key. The document provides examples of tables with anomalies and how to normalize them through different forms.

Uploaded by

Jiawei Tan

Normalization

WITH EXAMPLES
Purpose of normalization

• The purpose of normalization is to identify the best grouping of attributes into relations.
• Relations formed this way have the following characteristics:
  • They support the data requirements with a minimal number of attributes.
  • Each relation holds attributes with a close logical relationship (functional dependency).
  • Each relation holds minimal redundancy: every attribute is stored only once, except foreign keys.
• Such relations help by:
  • increasing the performance of updates
  • reducing storage consumption
  • avoiding update anomalies (insertion, modification and deletion)
Why Normalization?
• The database design must be efficient (performance-wise) and free of insertion, deletion and update anomalies.

• 3 anomalies to avoid:
  • Insertion anomaly: we need to store a value for one attribute but cannot, because the value of another attribute is unknown.
  • Deletion anomaly: deleting rows may cause a loss of important information about an entity.
  • Modification/update anomaly: a change to a single attribute value requires changes in multiple records.

Example

• Anomalies in this table:
  • Insertion
  • Deletion
  • Modification
Anomalies in the table

• Insertion: we cannot enter a new employee who is not yet assigned to any course, because the course attribute does not allow a NULL value.

Chapter 6 Normalization
Anomalies in the table

• Deletion: if we remove employee 140, we lose the information that a Tax Acc class exists.
Anomalies in the table

• Modification/update: giving a salary increase to employee 100 forces us to update multiple records.
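The three anomalies can be reproduced on a small in-memory table. The sketch below is illustrative only: the column layout, names and salaries are invented, since the slide's table is not reproduced in this text. Only the employee numbers 100 and 140 and the Tax Acc class come from the slides.

```python
# Hypothetical single-table design: (EmpID, Name, Salary, Course).
# All values are invented for illustration except EmpID 100/140 and "Tax Acc".
employee_course = [
    (100, "Ann", 48000, "SPSS"),
    (100, "Ann", 48000, "Surveys"),
    (140, "Raj", 52000, "Tax Acc"),
]

def insert_employee(rows, emp_id, name, salary, course):
    """Insertion anomaly: Course is part of the key, so it may not be NULL."""
    if course is None:
        raise ValueError("cannot store an employee without a course")
    return rows + [(emp_id, name, salary, course)]

def delete_employee(rows, emp_id):
    """Deletion anomaly: removing an employee may remove the last trace of a course."""
    return [row for row in rows if row[0] != emp_id]

def raise_salary(rows, emp_id, new_salary):
    """Update anomaly: the salary is repeated, so several rows must change."""
    changed = sum(1 for row in rows if row[0] == emp_id)
    updated = [(e, n, new_salary if e == emp_id else s, c) for e, n, s, c in rows]
    return updated, changed

def courses(rows):
    """Set of course names that the table still knows about."""
    return {row[3] for row in rows}

# Deleting employee 140 also deletes the only record of the Tax Acc class.
lost_courses = courses(employee_course) - courses(delete_employee(employee_course, 140))

# One logical change (a raise for employee 100) touches two physical rows.
_, rows_changed = raise_salary(employee_course, 100, 50000)
```

Calling `insert_employee(employee_course, 150, "Lee", 40000, None)` raises `ValueError`, which is the insertion anomaly in miniature.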
Functional dependency (FD)

• A functional dependency (FD) describes the relationship between attributes in a relation.
• For example, if EmployeeCode and FirstName are attributes of the Employee relation, we can say that FirstName is functionally dependent on EmployeeCode. This means each EmployeeCode is associated with exactly one value of FirstName.
• We denote this as:
  • EmployeeCode -> FirstName
• The left-hand side of the arrow is called the determinant. Read from left to right, each determinant value maps to exactly one value on the right-hand side. The reverse need not hold, since several employees can share a FirstName.
• If the right-hand attribute depends on the whole left-hand side and not on any proper subset of it, we call the dependency a full functional dependency.
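The definition above can be tested against sample data. The following is a minimal sketch: the function name `fd_holds` and the sample rows are assumptions made for illustration, not part of the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to one rhs combination."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical Employee rows
employees = [
    {"EmployeeCode": "E1", "FirstName": "Ann"},
    {"EmployeeCode": "E2", "FirstName": "Ann"},
    {"EmployeeCode": "E3", "FirstName": "Raj"},
]

code_determines_name = fd_holds(employees, ["EmployeeCode"], ["FirstName"])
name_determines_code = fd_holds(employees, ["FirstName"], ["EmployeeCode"])
```

Here `code_determines_name` is `True` and `name_determines_code` is `False`, because two employees share the name Ann. Note that sample data can only refute a dependency. Whether an FD really holds is decided by the business rules, not by the rows that happen to exist.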
Types of functional dependency

• Full functional dependency
• Partial dependency
  • If the left-hand side is composite (two or more attributes) and the right-hand side can be determined by only part of it, the dependency is a partial dependency.
  • For example:
    • Product code, receipt number -> product name
    • Product code -> product name
    • Product name depends on product code alone, so it is only partially dependent on the composite (product code, receipt number).
• Transitive dependency (see later)
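The partial-dependency rule reduces to a subset test: a dependency is partial when its determinant is a proper subset of the composite key. A minimal sketch follows, with the helper name `is_partial` and the attribute spellings chosen here for illustration.

```python
def is_partial(determinant, key):
    """True if the determinant is a non-empty proper subset of the composite key."""
    return bool(determinant) and set(determinant) < set(key)

key = ("ProductCode", "ReceiptNumber")

# ProductCode alone determines ProductName, so that dependency is partial.
partial = is_partial(("ProductCode",), key)

# The whole key as determinant is not a proper subset of itself.
whole_key = is_partial(key, key)
```

`partial` is `True` and `whole_key` is `False`.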
First normal form

• To make sure that a relation is in 1NF, we need to make sure that:
  • There are no multiple values at the intersection of any row and column.
  • There are no repeating groups of attributes (like Course1, Course2, Course3... columns).
  • The order of attributes and tuples is insignificant.
  • There are no duplicate tuples in the relation.
• For example:
First normal form

• Since the relation has no multiple values at intersections and no repeating groups, it is now a 1NF relation.
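Removing a repeating group such as Course1, Course2, Course3 means turning each filled column into its own tuple. The sketch below assumes a hypothetical student table, because the slide's example table is not reproduced in this text. The function name `to_1nf` is also an assumption.

```python
def to_1nf(rows, group_prefix, new_column):
    """Replace repeating-group columns (Course1, Course2, ...) with one row per value."""
    flat = []
    for row in rows:
        # Attributes that are not part of the repeating group are copied to every new row
        base = {k: v for k, v in row.items() if not k.startswith(group_prefix)}
        for column, value in row.items():
            if column.startswith(group_prefix) and value is not None:
                flat.append({**base, new_column: value})
    return flat

# Hypothetical unnormalized rows with a repeating group
students = [
    {"StudentCode": "S1", "Name": "Ann",
     "Course1": "Maths", "Course2": "Physics", "Course3": None},
    {"StudentCode": "S2", "Name": "Raj",
     "Course1": "Maths", "Course2": None, "Course3": None},
]

flat = to_1nf(students, "Course", "Course")
```

`flat` holds three tuples, one per student and course, and the empty Course columns disappear.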
Second normal form

• To make sure that a relation is in 2NF, we need to make sure that:
  • The table is in 1NF.
  • Every non-primary-key attribute is fully dependent on the primary key (there is no partial dependency between the primary key and a non-primary-key attribute).
• Thus, the main task in this form is to remove partial dependencies.
• Step 1: identify the functional dependencies
  • StudentCode, Course -> DateRegistered
  • StudentCode -> Name, Town, Province
  • Town -> Province
• Step 2: identify the candidate key
  • StudentCode + Course
• Step 3: identify the type of each functional dependency
  • Full FD: StudentCode, Course -> DateRegistered
  • Partial dependency: StudentCode -> Name, Town, Province (the key StudentCode + Course determines these attributes, but StudentCode alone is enough)
  • Transitive dependency: Town -> Province (see later)
Second normal form

• Step 4: remove the partial dependency from the original table (taking StudentCode + Course as the primary key).
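Step 4 amounts to projecting the original table onto two attribute sets: the part of the key that causes the partial dependency together with its dependents, and the full key together with the fully dependent attributes. The sketch below uses invented rows and an assumed helper `project`. The resulting relation names are also assumptions, since the slide's result tables are not reproduced in this text.

```python
def project(rows, columns):
    """Project rows onto columns and drop duplicate tuples, keeping first-seen order."""
    seen = []
    for row in rows:
        projected = tuple(row[c] for c in columns)
        if projected not in seen:
            seen.append(projected)
    return seen

# Hypothetical 1NF rows with key StudentCode + Course
registrations = [
    {"StudentCode": "S1", "Name": "Ann", "Town": "TownA", "Province": "ProvinceX",
     "Course": "Maths", "DateRegistered": "2024-01-10"},
    {"StudentCode": "S1", "Name": "Ann", "Town": "TownA", "Province": "ProvinceX",
     "Course": "Physics", "DateRegistered": "2024-01-12"},
    {"StudentCode": "S2", "Name": "Raj", "Town": "TownB", "Province": "ProvinceX",
     "Course": "Maths", "DateRegistered": "2024-01-11"},
]

# StudentCode -> Name, Town, Province moves to its own relation
student = project(registrations, ["StudentCode", "Name", "Town", "Province"])

# StudentCode, Course -> DateRegistered stays with the composite key
registration = project(registrations, ["StudentCode", "Course", "DateRegistered"])
```

After the split, Ann's name, town and province are stored once in `student` instead of once per registered course.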
Third normal form

• To make sure that a relation is in 3NF, we need to make sure that:
  • The relation is in 1NF and 2NF.
  • No non-primary-key attribute is transitively dependent on the primary key.
• Transitive dependency:
  A -> B
  B -> C
• Thus, A -> C
Third normal form

• In this form, the task is to remove transitive dependencies if they exist.
Transitive dependency
• StudentCode -> Town
• Town -> Province
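The chain StudentCode -> Town -> Province can be removed by moving the dependent attribute into a relation keyed by the intermediate attribute. The following is a simplified sketch, assuming the student relation that results from 2NF and a single-attribute key. The function name `split_transitive` is hypothetical, and the code is not a general 3NF synthesis algorithm.

```python
def split_transitive(attributes, key, fds):
    """Move attributes that depend on a non-key determinant into their own relation."""
    key = set(key)
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in fds:
        # Skip dependencies whose determinant involves the key or is no longer present
        if set(lhs) & key or not set(lhs) <= set(remaining):
            continue
        moved = [a for a in rhs if a in remaining and a not in key]
        remaining = [a for a in remaining if a not in moved]
        new_relations.append(list(lhs) + moved)
    return remaining, new_relations

attributes = ["StudentCode", "Name", "Town", "Province"]
fds = [
    (["StudentCode"], ["Name", "Town", "Province"]),
    (["Town"], ["Province"]),
]

remaining, new_relations = split_transitive(attributes, ["StudentCode"], fds)
```

`remaining` becomes `["StudentCode", "Name", "Town"]` and `new_relations` becomes `[["Town", "Province"]]`, with Town kept in the student relation as the link between the two.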
Boyce-Codd normal form
(BCNF/3.5NF)
• To make sure that a relation is in BCNF, we need to make sure that:
  • The relation is in 1NF, 2NF and 3NF.
  • Every determinant in the identified functional dependencies, including those outside the primary key, is a candidate key. The definition reads: a relation is in BCNF if and only if every determinant is a candidate key.
• What exactly does this mean?
For example

• Assume that the business rules related to this relation are as follows:
  • A course has one or more subjects.
  • A course is managed by one or more lecturers.
  • A subject is taught by one or more lecturers.
  • A lecturer teaches only one subject.

Let's list all possible functional dependencies.

Course, Subject -> Lecturer
Course, Lecturer -> Subject
Lecturer -> Subject
• If we take Course + Subject as the primary key of this table, there is no violation of 1NF, 2NF or 3NF.
• Course + Lecturer is also a candidate key, as it identifies tuples uniquely.
• Lecturer cannot be a primary key because its values repeat. Lecturer is therefore a determinant that is not a candidate key, so the table violates BCNF.
BCNF

• To bring the table into BCNF, decompose it as below:
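The BCNF test can be automated with attribute closure: a determinant is acceptable only if its closure covers the whole relation. The sketch below applies this to the Course/Subject/Lecturer dependencies listed earlier and then performs the standard textbook split on the violating dependency. The function names are assumptions, and the resulting relations are derived from that standard step rather than copied from the slide's table.

```python
def closure(attrs, fds):
    """All attributes that can be derived from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Nontrivial FDs whose determinant does not determine the whole relation."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and not set(relation) <= closure(lhs, fds)]

def decompose(relation, violation):
    """Split on X -> Y into (X + Y) and (relation minus the dependents Y)."""
    lhs, rhs = violation
    r1 = sorted(set(lhs) | set(rhs))
    r2 = sorted(set(relation) - (set(rhs) - set(lhs)))
    return r1, r2

relation = ["Course", "Subject", "Lecturer"]
fds = [
    (("Course", "Subject"), ("Lecturer",)),
    (("Course", "Lecturer"), ("Subject",)),
    (("Lecturer",), ("Subject",)),
]

violations = bcnf_violations(relation, fds)
r1, r2 = decompose(relation, violations[0])
```

Only `Lecturer -> Subject` is reported as a violation. The split yields `["Lecturer", "Subject"]` and `["Course", "Lecturer"]`, in each of which every determinant is a candidate key.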
Fourth Normal Form (4NF)

• This normal form handles multi-valued dependencies that arise from 1NF.
  • When we see repeating groups or multiple values at an intersection, we add tuples so that each one holds a single value. That is what we do in 1NF.
  • When a relation has two independent multi-valued attributes, each value of one attribute has to be repeated with every value of the other attribute. This situation is referred to as a multi-valued dependency.
Fourth normal form

• To make sure that a relation is in 4NF, we need to make sure that:
  • The relation is in Boyce-Codd normal form, and
  • it does not contain nontrivial multi-valued dependencies (any multi-valued dependency that remains must be trivial).
• What is a trivial dependency?
  • A -> B is a trivial functional dependency if B is a subset of A.
  • The following dependencies are also trivial: A -> A and B -> B.
  • For example, consider a table with two columns, Student_id and Student_Name:
    • Student_id -> Student_id
    • Student_id, Student_Name -> Student_id
    • Student_Name -> Student_Name
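The subset rule for trivial functional dependencies is a one-line check. The helper name `is_trivial` is an assumption made for this sketch.

```python
def is_trivial(lhs, rhs):
    """A -> B is trivial when every attribute of B already appears in A."""
    return set(rhs) <= set(lhs)

same_attribute = is_trivial(["Student_id"], ["Student_id"])
subset_of_pair = is_trivial(["Student_id", "Student_Name"], ["Student_id"])
id_to_name = is_trivial(["Student_id"], ["Student_Name"])
```

The first two results are `True`. The last is `False`, because Student_Name is not part of the left-hand side.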
For example:

• If we apply 1NF to this relation:
Fourth normal form

• See the CustomerContacts table.
  • CustomerCode determines multiple Telephone values (CustomerCode ->> Telephone), and
  • CustomerCode determines multiple Address values (CustomerCode ->> Address).
• These are nontrivial multi-valued dependencies, so they need to be removed:
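The effect of the two multi-valued dependencies, and of removing them, can be shown with a small sketch. The customer code, telephone numbers and addresses below are invented, and the function names are assumptions. Only the CustomerContacts attributes come from the slides.

```python
from itertools import product

def to_1nf_rows(customer, telephones, addresses):
    """1NF forces every telephone to be paired with every address."""
    return [(customer, t, a) for t, a in product(telephones, addresses)]

def decompose_4nf(rows):
    """Split (CustomerCode, Telephone, Address) into two independent relations."""
    telephones = sorted({(c, t) for c, t, _ in rows})
    addresses = sorted({(c, a) for c, _, a in rows})
    return telephones, addresses

# Hypothetical customer with two telephones and two addresses
customer_contacts = to_1nf_rows(
    "C001", ["555-0101", "555-0102"], ["12 Hill Rd", "7 Lake St"]
)

customer_telephone, customer_address = decompose_4nf(customer_contacts)

# Joining the two relations on CustomerCode rebuilds the original rows
rejoined = {(c1, t, a)
            for c1, t in customer_telephone
            for c2, a in customer_address
            if c1 == c2}
```

`customer_contacts` needs four tuples to hold two telephones and two addresses, while the decomposed relations hold two tuples each. Since `rejoined` equals the original set of rows, no information is lost by the split.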
