Data Base - Database - Databse Chapter 9

The document provides an overview of database normalization, explaining its importance in organizing data to eliminate redundancy and prevent anomalies during data operations. It details the various forms of normalization, including First, Second, and Third Normal Forms, as well as Boyce-Codd Normal Form and Fourth Normal Form, highlighting the rules and conditions for each. The document also illustrates common issues related to data redundancy and the solutions provided by normalization techniques.

Uploaded by

shabir.ahmad1317

Faculty of Computer Science

Fundamentals of Database Systems
(Database I)

Lecturer: Barakzai

Normalization
What is normalization in a database?
• Database normalization is a technique for organizing
the data in a database.
• Normalization is a systematic approach of
decomposing tables to eliminate data
redundancy (repetition) and undesirable characteristics
such as insertion, update and deletion anomalies.
• It is a multi-step process that puts data into tabular
form and removes duplicated data from the relation tables.
• Normalization is used mainly for two purposes:
• Eliminating redundant (unnecessarily repeated) data.
• Ensuring data dependencies make sense, i.e. data is logically
stored.
Why is normalization important in a database?

• Normalization is a technique for organizing data in a
database. It is important that a database is normalized to
minimize redundancy (duplicate data) and to ensure that only
related data is stored in each table. It also prevents
issues stemming from database modifications such as
insertions, deletions, and updates.
• We will start by looking at the problems that arise when a table
or database is not normalized, and at how normalization solves these
problems.
• Normalization is a technique of organizing the data into
multiple related tables to minimize data redundancy.
• So what is data redundancy?
• Data redundancy is the repetition of the same data in
multiple places.
• Why do we want to reduce data redundancy?
• Because the repetition of data increases the
size of the database.
It also leads to other issues, such as:
• Insertion problems
• Deletion problems
• Update problems
Roll number   Name    Branch   Head of department   Office telephone
1             Ali     BCS      Mr. Sabit            12345
2             Ahmad   BCS      Mr. Sabit            12345
3             Sara    BCS      Mr. Sabit            12345
4             Omar    BCS      Mr. Sabit            12345

• From the table we can observe that the branch, head of department (HOD) and
office telephone number are the same for every entry.
Whenever a new entry is added to the table,
we must repeat the same values again. This is
called data redundancy.
• We can conclude from this that data redundancy leads to
• Insertion anomaly
• Deletion anomaly
• Update anomaly
• Insertion anomaly: to create a new entry in
the table we have to type the same branch data again, and if we do this
for 500 more students, it will obviously consume more
space.
(Having to insert redundant data for every new row of student data in our
class is a data insertion problem, or anomaly.)
• Deletion anomaly: if we delete a student's information,
the branch information in that row is deleted with it, and if we delete all
students from the table, we lose the branch
information entirely.
(Loss of a related dataset when some other dataset is deleted is
called a deletion anomaly.)
• Update anomaly: if the head of department leaves the university for any reason
and a new head arrives, we have to update every student
row, and if we miss any of these rows, the data becomes
inconsistent.
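The update anomaly can be sketched in a few lines of Python. This is only an illustration: the rows are the ones from the table above, while the function names, the skip_roll parameter (which simulates a missed row) and the new head's name "Mr. Karim" are assumptions made for the example.

```python
def flat_student_rows():
    """Return a fresh copy of the unnormalized table from the slide."""
    return [
        {"roll": 1, "name": "Ali", "branch": "BCS", "hod": "Mr. Sabit", "phone": "12345"},
        {"roll": 2, "name": "Ahmad", "branch": "BCS", "hod": "Mr. Sabit", "phone": "12345"},
        {"roll": 3, "name": "Sara", "branch": "BCS", "hod": "Mr. Sabit", "phone": "12345"},
        {"roll": 4, "name": "Omar", "branch": "BCS", "hod": "Mr. Sabit", "phone": "12345"},
    ]


def update_hod(rows, branch, new_hod, skip_roll=None):
    """Rewrite the HOD on every row of a branch.

    skip_roll (hypothetical) simulates forgetting one student row.
    Returns the number of rows that were changed.
    """
    touched = 0
    for row in rows:
        if row["branch"] == branch and row["roll"] != skip_roll:
            row["hod"] = new_hod
            touched += 1
    return touched


def hods_for(rows, branch):
    """Distinct HOD values stored for a branch; more than one means inconsistency."""
    return {row["hod"] for row in rows if row["branch"] == branch}
```

Calling update_hod(rows, "BCS", "Mr. Karim", skip_roll=3) changes three rows and leaves hods_for(rows, "BCS") with two different names, which is exactly the inconsistency described above.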
So how does normalization solve all these
problems?
Normalization breaks the existing table into two different
tables.
Student table
Roll number   Name    Branch
1             Ali     BCS
2             Ahmad   BCS
3             Sara    BCS

Branch table
Branch   Head of department   Office telephone
BCS      Mr. Sabit            12345
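
A minimal sketch of this split, using Python's built-in sqlite3 module with an in-memory database. The column names, the helper functions and the replacement head "Mr. Karim" are assumptions for illustration; only the sample rows come from the slide.

```python
import sqlite3


def build_normalized_db():
    """Create the student and branch tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE branch (
            branch           TEXT PRIMARY KEY,
            head_of_dept     TEXT,
            office_telephone TEXT
        );
        CREATE TABLE student (
            roll_number INTEGER PRIMARY KEY,
            name        TEXT,
            branch      TEXT REFERENCES branch(branch)
        );
        INSERT INTO branch VALUES ('BCS', 'Mr. Sabit', '12345');
        INSERT INTO student VALUES (1, 'Ali', 'BCS'), (2, 'Ahmad', 'BCS'), (3, 'Sara', 'BCS');
    """)
    return conn


def change_head(conn, branch, new_head):
    """Update the head of department; returns the number of rows touched."""
    cur = conn.execute(
        "UPDATE branch SET head_of_dept = ? WHERE branch = ?", (new_head, branch)
    )
    return cur.rowcount


def student_report(conn):
    """Join the two tables back into the original wide layout."""
    return conn.execute("""
        SELECT s.roll_number, s.name, s.branch, b.head_of_dept, b.office_telephone
        FROM student AS s JOIN branch AS b ON b.branch = s.branch
        ORDER BY s.roll_number
    """).fetchall()
```

With this layout a change of head touches a single branch row, every student picks up the new value through the join, and deleting all students leaves the branch row in place.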
Types of normalization
• Normalization can be achieved in multiple ways
• There are three basic normal forms, which are followed by
further, stricter forms:
• 1st normal form
• 2nd normal form
• 3rd normal form
• BCNF (Boyce-Codd normal form)
• 4th normal form
• 5th normal form
What is First Normal Form (1NF)?
• Rules for First Normal Form
The first normal form expects you to follow a few simple rules
while designing your database:
• Rule 1: Single-valued attributes
Each column of your table should be single valued, which means
it should not contain multiple values. We will explain this with the
help of an example later; let's see the other rules for now.
• Rule 2: Attribute domain should not change
This is more of a "common sense" rule. In each column, the
values stored must be of the same kind or type.
For example: if you have a column dob to store the dates of birth of a set of
people, then you must not store the names of some of them in
that column alongside the dates of birth of others. It should
hold only a date of birth for every record/row.
• Rule 3: Unique name for attributes/columns
This rule expects each column in a table to have a unique name.
This avoids confusion when retrieving data or performing any
other operation on the stored data.
If two or more columns have the same name, the DBMS will be
left confused.
• Rule 4: Order doesn't matter
This rule says that the order in which you store the data in your table
doesn't matter.

Example:
• Although the rules are self-explanatory, let's take an
example where we create a table to store student data, holding
each student's roll number, name and the names of the subjects
they have opted for.
• Here is our table, with some sample data added to it.
• Our table already satisfies 3 of the 4 rules: all our
column names are unique, we have stored the data in the order we
wanted, and we have not mixed different types of data in a
column.
• But of the 3 students in our table, 2 have opted for
more than one subject, and we have stored the subject names in a
single column. As per the First Normal Form, each column must
contain atomic values.
How to solve this problem?
• It's very simple: all we have to do is break the values into
atomic values.
• Here is our updated table, which now satisfies the First Normal
Form.
• By doing so, a few values are repeated, but the values in
the subject column are now atomic for each record/row.
• With the First Normal Form, data redundancy increases, as
several columns hold the same data in multiple rows, but each row as a whole
is unique.
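
The step of breaking a multi-valued column into atomic values can be sketched as follows. The function name is hypothetical, and so are the sample rows used in the usage note, since the slide's own table is not reproduced in this text; subjects are assumed to be stored as one comma-separated string.

```python
def to_first_normal_form(rows):
    """Split a comma-separated subject column into one atomic value per row.

    rows: list of (roll_no, name, subjects) tuples, where subjects is a
    string such as "OS, CN" (an assumed storage format).
    """
    atomic_rows = []
    for roll_no, name, subjects in rows:
        for subject in subjects.split(","):
            # Each output row carries exactly one subject.
            atomic_rows.append((roll_no, name, subject.strip()))
    return atomic_rows
```

For example, the hypothetical row (101, "Ali", "OS, CN") becomes two rows, (101, "Ali", "OS") and (101, "Ali", "CN"): the roll number and name repeat, but every row is still unique as a whole.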
What is Second Normal Form?
• For a table to be in the Second Normal Form, it must satisfy two
conditions:
1. The table should be in the First Normal Form.
2. There should be no Partial Dependency.

• What is Dependency?
• Let's take an example of a Student table with columns student_id, name,
reg_no (registration number), branch and address (student's home
address).
• In this table, student_id is the primary key and is unique for every
row, hence we can use student_id to fetch any row of data from this
table.
• Even when student names are the same, if we know the
student_id we can easily fetch the correct record.
• Hence we can say that the Primary Key of a table is the column or group of
columns (composite key) that uniquely identifies each record in the
table.

• I can ask for the branch name of the student with student_id 10 and get
it. Similarly, if I ask for the name of the student with student_id 10 or 11, I will
get it. So all I need is student_id; every other column depends on it,
or can be fetched using it.

• This is Dependency, and we also call it Functional Dependency.
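
As a rough illustration, a functional dependency can be tested against sample rows: the dependency holds in the data if no two rows agree on the determinant but differ on the dependent columns. The function name and the row format (a list of dicts) are assumptions. Passing this check only shows that the given rows do not violate the dependency; it does not prove the rule for all future data.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows.

    rows is a list of dicts; determinant and dependent are lists of
    column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key and returns
        # the stored one afterwards, so a mismatch means a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True
```

With two hypothetical students who share a name but have student_id 10 and 11, holds(rows, ["student_id"], ["name", "branch"]) is True, while name does not determine branch if the two are in different branches.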

What is Partial Dependency?
• Now that we know what dependency is, we are in a better position to
understand what partial dependency is.

• For a simple table like Student, a single column like student_id can
uniquely identify all the records in the table.

• But this is not always true. So let's extend our example to see how
more than one column together can act as a primary key.

• Let's create another table, Subject, which has subject_id and
subject_name fields, with subject_id as the primary key.
• Now we have a Student table with student information and
another table, Subject, for storing subject information.
• Let's create another table, Score, to store the marks obtained by
students in their respective subjects. We will also save the name
of the teacher who teaches each subject along with the marks.
• In the Score table we save the student_id to know which student's
marks these are, and the subject_id to know which subject the marks are
for.

• Together, student_id + subject_id form a Candidate Key
for this table, which can be the Primary Key.

• Confused about how this combination can be a primary key?

• If I ask you for the marks of the student with student_id 10, can you
get them from this table? No, because you don't know for which subject.
And if I give you only a subject_id, you would not know for which student.
Hence we need student_id + subject_id to uniquely identify any row.
But where is Partial Dependency?
• If you look at the Score table, we have a column named teacher
which depends only on the subject: for Java it's Java Teacher, for
C++ it's C++ Teacher, and so on.

• As we just discussed, the primary key of this table is
composed of two columns, student_id and subject_id, but the
teacher's name depends only on the subject, hence on subject_id, and has
nothing to do with student_id.

• This is Partial Dependency, where an attribute in a table depends on only
a part of the primary key and not on the whole key.
How to remove Partial Dependency?
• There can be many different solutions for this, but our objective is to
remove the teacher's name from the Score table.

• The simplest solution is to remove the teacher column from the Score table and
add it to the Subject table. The Subject table then holds subject_id,
subject_name and teacher.
Quick Recap
• For a table to be in the Second Normal form, it should be in the First
Normal form and it should not have Partial Dependency.
• Partial Dependency exists, when for a composite primary key, any
attribute in the table depends only on a part of the primary key and not
on the complete primary key.
• To remove Partial dependency, we can divide the table, remove the
attribute which is causing partial dependency, and move it to some other
table where it fits in well.
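
The recap can be turned into a small check. This is a simplified sketch under stated assumptions: the dependencies are declared by hand as (determinant, dependents) pairs, only one composite primary key is considered, and the function name is hypothetical.

```python
def partial_dependencies(primary_key, dependencies):
    """List (determinant, attribute) pairs where a non-key attribute
    depends on only a part of the composite primary key.

    dependencies is a list of (determinant_columns, dependent_columns).
    """
    key = frozenset(primary_key)
    found = []
    for determinant, dependents in dependencies:
        det = frozenset(determinant)
        if det < key:  # proper subset of the composite key
            for attr in dependents:
                if attr not in key:
                    found.append((sorted(det), attr))
    return found


# Score table before the fix: teacher depends on subject_id alone.
SCORE_KEY = ["student_id", "subject_id"]
SCORE_FDS = [
    (["student_id", "subject_id"], ["marks"]),
    (["subject_id"], ["teacher"]),
]

# Score table after teacher has been moved to the Subject table.
SCORE_FDS_2NF = [
    (["student_id", "subject_id"], ["marks"]),
]
```

partial_dependencies(SCORE_KEY, SCORE_FDS) reports the pair (["subject_id"], "teacher"), and the same call with SCORE_FDS_2NF returns an empty list.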
Third Normal Form (3NF)
• Third Normal Form is an upgrade to Second Normal Form. When a
table is in the Second Normal Form and has no transitive
dependency, it is in the Third Normal Form.

So let's use the same example, where we have 3 tables: Student,
Subject and Score.
Requirements for Third Normal Form
• For a table to be in the Third Normal Form,
1. It should be in the Second Normal Form.
2. It should not have Transitive Dependency.

What is Transitive Dependency?

With exam_name and total_marks added to our Score table, it
stores more data now. The primary key of our Score table is a
composite key, which means it is made up of two attributes or
columns: student_id + subject_id.
Our new column exam_name depends on both student and subject.
For example, a mechanical engineering student will have a Workshop
exam but a computer science student won't.
And some subjects have Practical exams while others
don't. So we can say that exam_name depends on both
student_id and subject_id.

And what about our second new column, total_marks? Does it
depend on our Score table's primary key?
Well, the column total_marks depends on exam_name, because the total
score changes with the exam type. For example, practicals carry fewer
marks while theory exams carry more.

But exam_name is just another column in the Score table. It is not the
primary key or even a part of the primary key, and total_marks
depends on it.

This is Transitive Dependency: a non-prime attribute depends
on other non-prime attributes rather than on the prime
attributes or primary key.
How to remove Transitive Dependency?
• Again the solution is very simple. Take the columns exam_name and
total_marks out of the Score table, put them in an Exam table, and use the
exam_id wherever required.
Advantage of removing Transitive Dependency
The advantages of removing transitive dependency are:
• The amount of data duplication is reduced.
• Data integrity is achieved.
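
One possible shape of the Exam table and the reduced Score table, sketched with Python's sqlite3 module. The table and column names follow the text; the marks column, the sample rows and the totals (100 for Theory, 50 for Practical) are hypothetical values chosen for the example.

```python
import sqlite3


def build_exam_schema():
    """Create Exam and Score tables where total_marks lives only in Exam."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE exam (
            exam_id     INTEGER PRIMARY KEY,
            exam_name   TEXT NOT NULL,
            total_marks INTEGER NOT NULL
        );
        CREATE TABLE score (
            student_id INTEGER,
            subject_id INTEGER,
            marks      INTEGER,
            exam_id    INTEGER REFERENCES exam(exam_id),
            PRIMARY KEY (student_id, subject_id)
        );
        -- Hypothetical sample data.
        INSERT INTO exam VALUES (1, 'Theory', 100), (2, 'Practical', 50);
        INSERT INTO score VALUES (10, 1, 70, 1), (10, 2, 35, 2), (11, 1, 80, 1);
    """)
    return conn


def scores_with_totals(conn):
    """Join Score with Exam to recover exam_name and total_marks per score."""
    return conn.execute("""
        SELECT sc.student_id, sc.subject_id, sc.marks, e.exam_name, e.total_marks
        FROM score AS sc JOIN exam AS e ON e.exam_id = sc.exam_id
        ORDER BY sc.student_id, sc.subject_id
    """).fetchall()
```

Each exam type and its total is stored once in exam, however many score rows refer to it.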
Boyce-Codd Normal Form (BCNF)
Rules for BCNF
• For a table to satisfy the Boyce-Codd Normal Form, it should
satisfy the following two conditions:
1. It should be in the Third Normal Form.
2. For any dependency A → B, A should be a super key.
• The second point sounds a bit tricky, right? In simple words, it
means that for a dependency A → B, A cannot be a non-prime
attribute if B is a prime attribute.
Below we have a college enrolment table with
columns student_id, subject and professor.
• As you can see, we have also added some sample data to the table.

• In the table above:
• One student can enrol for multiple subjects. For example, the student with
student_id 101 has opted for two subjects, Java and C++.
• For each subject, a professor is assigned to the student.
• There can be multiple professors teaching one subject, as we have
for Java.
What do you think should be the Primary Key?

In the table above, student_id and subject together form the primary key,
because using student_id and subject we can find all the columns of the
table.
One more important point to note here is that one professor teaches only one
subject, but one subject may have two different professors.
Hence, there is a dependency between subject and professor, where
subject depends on the professor name.
This table satisfies the 1st Normal Form because all the values are atomic,
column names are unique and all the values stored in a particular column
are of the same domain.

This table also satisfies the 2nd Normal Form, as there is no Partial
Dependency.

And there is no Transitive Dependency, hence the table also satisfies the
3rd Normal Form.
But this table is not in Boyce-Codd Normal Form.
Why is this table not in BCNF?
In the table above, student_id and subject form the primary key, which means the
subject column is a prime attribute.

But there is one more dependency, professor → subject.

And while subject is a prime attribute, professor is a non-prime attribute,
which is not allowed by BCNF.
How to satisfy BCNF?
• To make this relation (table) satisfy BCNF, we decompose the
table into two tables, a student table and a professor table.

• Below we have the structure for both the tables.
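
The super key rule can be checked mechanically with the standard attribute-closure computation. The sketch below is illustrative only: the function names are hypothetical, and the decomposed column lists in the usage note (student_id with professor, and professor with subject) are an assumption, since the slide's table structures are not reproduced in this text.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes` under the given
    dependencies, each written as (determinant_columns, dependent_columns)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in dependencies:
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


def bcnf_violations(all_attributes, dependencies):
    """Return the non-trivial dependencies whose determinant is not a super key."""
    violations = []
    for determinant, dependents in dependencies:
        if set(dependents) <= set(determinant):
            continue  # trivial dependency, always allowed
        if closure(determinant, dependencies) != set(all_attributes):
            violations.append((determinant, dependents))
    return violations


# The enrolment table and the two dependencies described in the text.
ENROLMENT = ["student_id", "subject", "professor"]
ENROLMENT_FDS = [
    (["student_id", "subject"], ["professor"]),
    (["professor"], ["subject"]),
]
```

bcnf_violations(ENROLMENT, ENROLMENT_FDS) reports only professor → subject, because professor alone does not determine student_id. For an assumed professor table with columns professor and subject, the same dependency no longer counts as a violation, since professor is a key of that table.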

Rules for 4th Normal Form
• For a table to satisfy the Fourth Normal Form, it should satisfy the
following two conditions:
1. It should be in the Boyce-Codd Normal Form.
2. The table should not have any Multi-valued Dependency.

What is Multi-valued Dependency?

A table is said to have a multi-valued dependency if the following conditions
are true:
For a dependency A → B, if multiple values of B exist for a single value of A,
then the table may have a multi-valued dependency.
Also, a table should have at least 3 columns for it to have a multi-valued
dependency.
And, for a relation R(A,B,C), if there is a multi-valued dependency
between A and B, then B and C should be independent of each other.
If all these conditions are true for a relation (table), it is said to have a
multi-valued dependency.

Below we have a college enrolment table with columns s_id, course and
hobby.
As you can see in the table above, the student with s_id 1 has opted for two
courses, Science and Maths, and has two hobbies, Cricket and Hockey.
You must be wondering what problem this can lead to, right?
The two records for the student with s_id 1 give rise to two more
records, as shown below, because one student has two hobbies,
so these hobbies should be specified along with both courses.
And, in the table above, there is no relationship between the columns
course and hobby. They are independent of each other.
So there is a multi-valued dependency, which leads to unnecessary
repetition of data and other anomalies as well.
How to satisfy 4th Normal Form?
• To make the above relation satisfy the 4th Normal Form, we can
decompose the table into 2 tables.
Now this relation satisfies the Fourth Normal Form.
A table can also have a functional dependency along with a multi-valued dependency.
In that case, the functionally dependent columns are moved to a separate table
and the multi-valued dependent columns are moved to separate tables.
If you design your database carefully, you can easily avoid these issues.
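
The decomposition can be sketched in Python using the values from the example (student 1, courses Science and Maths, hobbies Cricket and Hockey). The function names and the dict-based row format are assumptions for illustration.

```python
def project(rows, columns):
    """Project rows (dicts) onto the given columns, dropping duplicates."""
    return {tuple(row[c] for c in columns) for row in rows}


def join_on_s_id(courses, hobbies):
    """Join (s_id, course) pairs with (s_id, hobby) pairs on s_id."""
    return {
        (s_id, course, hobby)
        for (s_id, course) in courses
        for (other_id, hobby) in hobbies
        if s_id == other_id
    }


# The four records that the single three-column table needs for student 1.
ENROLMENT_ROWS = [
    {"s_id": 1, "course": "Science", "hobby": "Cricket"},
    {"s_id": 1, "course": "Maths", "hobby": "Hockey"},
    {"s_id": 1, "course": "Science", "hobby": "Hockey"},
    {"s_id": 1, "course": "Maths", "hobby": "Cricket"},
]
```

Projecting onto (s_id, course) and (s_id, hobby) gives two tables of two rows each, and joining them on s_id reproduces the original four rows, so nothing is lost by storing courses and hobbies separately.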
• 5NF (Fifth Normal Form) Rules
• A table is in 5th Normal Form only if it is in 4NF and it cannot be
decomposed into smaller tables without loss of data.
• 6NF (Sixth Normal Form) Proposed
• 6th Normal Form is not yet standardized; however, it has been discussed
for some time. Hopefully, we will have a clear and standardized
definition in the near future…
• That's all on SQL Normalization!
Thank you
