6 Normalization Part 1

The document discusses database normalization, focusing on its importance in eliminating redundancy and preventing data anomalies caused by update, insertion, and deletion operations. It outlines the requirements for the first three normal forms (1NF, 2NF, 3NF) and provides examples of data anomalies, including insertion, deletion, and modification anomalies. The goal of normalization is to create efficient relational schemas that allow for effective data storage and retrieval.

19ELC312 Database Management Systems

Normalization

Bindu K. R., Dept. of CSE, Amrita School of Engineering, Coimbatore, September 2020
Overview
Today we'll discuss:

Database Normalization
■ Data anomalies caused by:
● Update, insertion, deletion

Brief History/Overview
■ 1st Normal Form
■ 2nd Normal Form
■ 3rd Normal Form

Conclusion



The goal of relational database design is to
generate a set of relation schemas that allow
us to store information without unnecessary
redundancy, and to retrieve information
efficiently.

We formed relation schemas directly from the ER diagram.

How do we know whether a collection of relation schemas is desirable (good or bad)?
• Design alternatives
Large schema vs. smaller schemas

E.g., instead of instructor(Id, name, dept_name, salary) and department(dept_name, building, budget), suppose we have

Inst_dept(Id, name, salary, dept_name, building, budget)

• It is the result of the natural join of instructor and department.

• Drawbacks
(1) Department information is repeated for every instructor in the department.
- This can lead to inconsistency.
(2) A new department cannot be added until an instructor joins it.
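The two drawbacks can be seen in a small sketch. The rows below are hypothetical sample data (the slides give only the schemas), and natural_join is an illustrative helper, not a library function.

```python
# Hypothetical sample rows; names and amounts are illustrative only.
instructor = [
    {"Id": 1, "name": "Asha", "dept_name": "CSE", "salary": 50000},
    {"Id": 2, "name": "Babu", "dept_name": "CSE", "salary": 55000},
    {"Id": 3, "name": "Cini", "dept_name": "ECE", "salary": 52000},
]
department = [
    {"dept_name": "CSE", "building": "Block A", "budget": 900000},
    {"dept_name": "ECE", "building": "Block B", "budget": 700000},
    {"dept_name": "MECH", "building": "Block C", "budget": 600000},  # no instructor yet
]


def natural_join(left, right):
    """Pair up rows that agree on every attribute the two relations share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


inst_dept = natural_join(instructor, department)

# Drawback (1): the CSE building and budget are stored once per CSE instructor.
cse_rows = [row for row in inst_dept if row["dept_name"] == "CSE"]
print(len(cse_rows), "copies of the CSE department data")

# Drawback (2): MECH has no instructor, so it has no row in Inst_dept at all.
print(sorted({row["dept_name"] for row in inst_dept}))
```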
• Suppose we started with the large schema
Inst_dept(Id, name, salary, dept_name, building, budget)

• How do we identify the redundancy?

(1) By observing the content
- This is an unreliable process.
- In the real world, a database has a large schema and millions of records.

• Suppose we started with the large schema
Inst_dept(Id, name, salary, dept_name, building, budget)

• How do we identify the relation schema to be decomposed?

(1) Database designers specify rules such as
"each specific value of dept_name corresponds to at most one budget".

In other words, we say "if there is a schema (dept_name, budget), then dept_name is able to serve as the primary key".

This rule is specified as a functional dependency: dept_name → budget

In some schemas, this rule can be checked even if dept_name is not the primary key.
Given the rule
dept_name → budget

• Now we can identify the problem in

Inst_dept(Id, name, salary, dept_name, building, budget)

Here a department may appear in several records, so its budget amount is repeated.

So this schema should be decomposed into instructor and department.

Finding the right decomposition is a tough task (normalization gives us methodologies for that).
Not every decomposition is useful.

E.g., Employee(ID, Name, Street, City, Salary)

is decomposed into

Employee1(ID, Name) and Employee2(Name, Street, City, Salary)

Employee1      Employee2
(1234, Kim)    (Kim, Raja Street, Coimbatore, 20000)
(2345, Kim)    (Kim, V. H. Road, Chennai, 25000)

When we take the natural join of Employee1 and Employee2, we get:
1234, Kim, Raja Street, Coimbatore, 20000
1234, Kim, V. H. Road, Chennai, 25000
2345, Kim, Raja Street, Coimbatore, 20000
2345, Kim, V. H. Road, Chennai, 25000
This decomposition is unable to represent Employee: the join can no longer tell which Kim lives where, so it is a lossy decomposition.
The converse of this is a lossless decomposition.
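The Kim example can be reproduced in a few lines. This is a minimal sketch using plain Python sets as relations; the variable names are illustrative.

```python
# The two Employee tuples from the slide.
employee = [
    (1234, "Kim", "Raja Street", "Coimbatore", 20000),
    (2345, "Kim", "V. H. Road", "Chennai", 25000),
]

# Project onto Employee1(ID, Name) and Employee2(Name, Street, City, Salary).
employee1 = {(emp_id, name) for (emp_id, name, _, _, _) in employee}
employee2 = {(name, street, city, sal) for (_, name, street, city, sal) in employee}

# Natural join on the only shared attribute, Name.
rejoined = {
    (emp_id, name1, street, city, sal)
    for (emp_id, name1) in employee1
    for (name2, street, city, sal) in employee2
    if name1 == name2
}

# Tuples that the join produced but that were never in Employee.
spurious = rejoined - set(employee)
print(len(rejoined), "tuples after the join,", len(spurious), "of them spurious")
```

Because Name is not a key of either projection, the join gains tuples, which is exactly what loses the information about who lives where.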
Why Database Normalization?
The main goal of Database Normalization is to
restructure the logical data model of a database to:

Eliminate redundancy.

Organize data efficiently.

Reduce the potential for data anomalies.



NORMALIZATION
Normalization is a formal method to check tables for potential data storage problems, termed anomalies.

In order to clarify the discussion, we will formalize the definition of several terms.

KEYS: The term KEY is often confusing because it has different meanings during design and implementation of a system.

◦ DESIGN: During design, KEY means a combination of one or more attributes (columns) of a relational table that uniquely identifies rows in the table.

◦ A KEY guarantees uniqueness; no two rows can be identical.

NORMALIZATION
IMPLEMENTATION:
During implementation, the term KEY means a column on which the DBMS builds an index or other data structure to allow quick access to rows.

Such keys need not be unique: they may be secondary keys enabling access to a SET of rows.
Data Anomalies
• Data anomalies are inconsistencies in the data stored in a database as a result of an operation such as update, insertion, and/or deletion.
• Such inconsistencies may arise when we have a particular record stored in multiple locations and not all of the copies are updated.
• We can prevent such anomalies by implementing different levels of normalization called Normal Forms (NF).



Student

ROLLNO  NAME   BRANCH  HOD      OFFICE_TEL
401     Asha   CSE     Mr. Sam  53337
402     Babu   CSE     Mr. Sam  53337
403     Cini   CSE     Mr. Sam  53337
404     Dilip  CSE     Mr. Sam  53337



Employee

EmpId  Name      Dept  Salary  Course    DateTook  Fee
130    Margaret  Math  45,000  Calculus  01/15     150
130    Margaret  Math  45,000  Biology   02/15     200
200    Susan     Sci   38,000  Biology   01/15     200
250    Chris     Math  52,000  Calculus  03/15     150
250    Chris     Math  52,000  Biology   03/15     200
425    Bill      Math  48,000  Algebra   03/15     200
425    Bill      Math  48,000  Calculus  04/15



Normalisation
Problems With This Table:
Redundancy of data storage.
Potential inconsistencies on
updating data.
Data Anomalies
Data anomalies are problems with data storage caused by poorly structured tables.

EmpId  Name      Dept  Salary  Course    DateTook  Fee
130    Margaret  Math  45,000  Calculus  01/15     150
130    Margaret  Math  45,000  Biology   02/15     200
200    Susan     Sci   38,000  Biology   01/15     200
250    Chris     Math  52,000  Calculus  03/15     150
250    Chris     Math  52,000  Biology   03/15     200
425    Bill      Math  48,000  Algebra   03/15     200
425    Bill      Math  48,000  Calculus  04/15

Insertion Anomaly
If the primary key is EmpId + Course, to add a new employee, the employee must
first be enrolled in a course.
If an employee is not enrolled in a course, then the COURSE column that is part of
the composite primary key will be null, and null key values are not allowed.



Data Anomalies
EmpId Name Dept Salary Course DateTook Fee
130 Margaret Math 45,000 Calculus 01/15 150
130 Margaret Math 45,000 Biology 02/15 200
200 Susan Sci 38,000 Biology 01/15 200
250 Chris Math 52,000 Calculus 03/15 150
250 Chris Math 52,000 Biology 03/15 200
425 Bill Math 48,000 Algebra 03/15 200
425 Bill Math 48,000 Calculus 04/15

Deletion Anomaly
• Deleting data for Employee #425 (Bill) causes us to lose data about Algebra and the course fee for Algebra, because Bill is the only employee who has enrolled in Algebra.
Modification Anomaly
• If the fee for Calculus is increased, the data must be updated in more than one row.
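Both anomalies can be checked against the Employee rows shown above. The sketch below is illustrative: the tuple layout and helper names are assumptions, and the last row's fee is left as None because it is blank in the table.

```python
# Rows copied from the Employee table; the last row's Fee is blank there.
COLUMNS = ("EmpId", "Name", "Dept", "Salary", "Course", "DateTook", "Fee")
employee = [
    (130, "Margaret", "Math", 45000, "Calculus", "01/15", 150),
    (130, "Margaret", "Math", 45000, "Biology", "02/15", 200),
    (200, "Susan", "Sci", 38000, "Biology", "01/15", 200),
    (250, "Chris", "Math", 52000, "Calculus", "03/15", 150),
    (250, "Chris", "Math", 52000, "Biology", "03/15", 200),
    (425, "Bill", "Math", 48000, "Algebra", "03/15", 200),
    (425, "Bill", "Math", 48000, "Calculus", "04/15", None),
]
EMP_ID = COLUMNS.index("EmpId")
COURSE = COLUMNS.index("Course")


def courses(rows):
    """Return the set of course names that appear in the given rows."""
    return {row[COURSE] for row in rows}


# Deletion anomaly: removing employee 425 also removes every trace of Algebra.
without_bill = [row for row in employee if row[EMP_ID] != 425]
lost_courses = courses(employee) - courses(without_bill)
print("Courses lost by deleting employee 425:", lost_courses)

# Modification anomaly: a Calculus fee change has to touch several rows.
calculus_rows = [row for row in employee if row[COURSE] == "Calculus"]
print("Rows to update for a Calculus fee change:", len(calculus_rows))
```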
Brief History/Overview
Database Normalization was first proposed by Edgar F.
Codd.
Codd defined the first three Normal Forms.
In order to do normalization we must know what the
requirements are for each of the three Normal Forms .
One of the key requirements to remember is that Normal
Forms are progressive.
That is, in order to have 3rd NF we must have 2nd NF and in
order to have 2nd NF we must have 1st NF.



Normalization
A method for designing (refining) relational databases.

• Goal: to generate a set of relation schemas that allow us to store the data without redundancy and to retrieve the data easily.

• Approach: to design schemas that are in a normal form. (Rule)

• To check the normal form we need additional information, such as functional dependencies. (Conditions)



First Normal Form (1NF)-The Requirements

The requirements to satisfy the First Normal Form (1NF):

• Each table has a primary key: a minimal set of attributes that uniquely identifies a record.

• The values in each column of a table are atomic (no multi-valued attributes allowed).

• There are no repeating groups: two columns do not store similar information in the same table.



First Normal Form - No multivalued attributes
A domain is atomic if elements of the domain
are indivisible.
○ We say that a relation schema R is in first normal form (1NF) if
the domains of all attributes of R are atomic.



A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to list of values).
Example (Not 1NF)

Author and AuPhone columns are not scalar





1NF - Decomposition
1. Place all items that appear in the repeating group in a new
table
2. Designate a primary key for each new table produced.
3. Duplicate in the new table the primary key of the table from
which the repeating group was extracted or vice versa.
Example (1NF)



Is it in First normal form (1 NF)?



Is it in First normal form (1 NF)?



Not in 1 NF

In 1 NF



• There are no repeating groups: two columns do not store similar information in the same table.

This violates first normal form.

An apparent solution is to introduce more columns:
The two telephone number columns still form a "repeating group": adding an extra telephone number may require the table to be reorganized by the addition of a new column.

And we ensure no row contains more than one phone number.



First Normal Form (1NF) - Example
Un-normalized Students table:

Student#  AdvID  AdvName  AdvRoom  Class1  Class2
123       123A   James    555      102-8   104-9
124       123B   Smith    467      209-0   102-8

Normalized Students table:

Student#  AdvID  AdvName  AdvRoom  Class#
123       123A   James    555      102-8
123       123A   James    555      104-9
124       123B   Smith    467      209-0
124       123B   Smith    467      102-8

Student Table

Student#  AdvID  AdvName  AdvRoom
123       123A   James    555
124       123B   Smith    467

Student#  Class#
123       102-8
123       104-9
124       209-0
124       102-8
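Removing the Class1/Class2 repeating group can be sketched as follows. The data is the two rows from the un-normalized Students table; the name registrations for the second table is hypothetical, since the slide leaves that table unnamed.

```python
# The un-normalized Students rows, with the Class1/Class2 repeating group.
unnormalized = [
    {"Student#": 123, "AdvID": "123A", "AdvName": "James", "AdvRoom": 555,
     "Class1": "102-8", "Class2": "104-9"},
    {"Student#": 124, "AdvID": "123B", "AdvName": "Smith", "AdvRoom": 467,
     "Class1": "209-0", "Class2": "102-8"},
]

STUDENT_COLUMNS = ("Student#", "AdvID", "AdvName", "AdvRoom")
REPEATING_GROUP = ("Class1", "Class2")

students = []        # one row per student
registrations = []   # one (Student#, Class#) row per class taken

for row in unnormalized:
    students.append({col: row[col] for col in STUDENT_COLUMNS})
    for col in REPEATING_GROUP:
        if row[col] is not None:
            # The student's key is duplicated into the new table.
            registrations.append((row["Student#"], row[col]))

for reg in registrations:
    print(reg)
```

A student taking a third class now needs one more row in registrations rather than a new Class3 column.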



Functional Dependency
E.g.

○ Each specific value of dept_name corresponds to at most one budget.

○ In other words, "if there were a schema (dept_name, budget), then dept_name is able to serve as the primary key".

○ This rule is specified as a functional dependency:

dept_name → budget

In X → Y, X functionally determines Y. The left-hand side attributes determine the values of the attributes on the right-hand side.
Functional Dependencies
1. If one set of attributes in a table determines another set of attributes in the table, then the second set of attributes is said to be functionally dependent on the first set of attributes.

Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies:
{ISBN} → {Title}
{ISBN} → {Price}



Functional Dependencies
Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies:
{PubID} → {PubPhone}
{PubID} → {PubName}
{PubName, PubPhone} → {PubID}

Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies:
{AuID} → {AuPhone}
{AuID} → {AuName}
{AuName, AuPhone} → {AuID}



Functional Dependency Vs Superkey
A superkey is a set of attributes that uniquely identifies an entire tuple.
If K is a superkey of r(R), then K → R.

A functional dependency allows us to express constraints that uniquely identify the values of certain attributes.

In a relation, A → B means that for any two tuples t1 and t2, if t1[A] = t2[A] then t1[B] = t2[B].
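The tuple-based definition translates directly into a check on a relation instance. A minimal sketch follows; fd_holds is a hypothetical helper name and the rows are made-up sample data in the spirit of the Inst_dept schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by every pair of rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows agreeing on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows; names and amounts are illustrative only.
inst_dept = [
    {"Id": 1, "name": "Asha", "dept_name": "CSE", "budget": 900000},
    {"Id": 2, "name": "Babu", "dept_name": "CSE", "budget": 900000},
    {"Id": 3, "name": "Cini", "dept_name": "ECE", "budget": 700000},
]

print(fd_holds(inst_dept, ["dept_name"], ["budget"]))  # same department, same budget
print(fd_holds(inst_dept, ["dept_name"], ["name"]))    # CSE has two different names
```

A check like this can only refute a dependency: an instance that happens to satisfy X → Y does not prove that the rule holds for the schema in general.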



Trivial functional dependency
○ The FDs that are satisfied by all relations.

Example:

A → A

AB → A (i.e., X → Y is trivial if Y is a subset of X)

Given a set F of FDs that holds on a relation r(R), it is possible to infer certain other FDs; the set of all FDs inferred from F is denoted F+ (the closure of F).
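F+ itself can be very large, so a common way to test whether a particular X → Y belongs to F+ is to compute the attribute closure of X under F and check that it contains Y. The sketch below assumes FDs are given as (lhs, rhs) pairs of attribute tuples; the two FDs used for Inst_dept are an assumption consistent with the earlier slides, not something stated on this one.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Assumed FDs for Inst_dept(Id, name, salary, dept_name, building, budget).
fds = [
    (("Id",), ("name", "salary", "dept_name")),
    (("dept_name",), ("building", "budget")),
]

id_closure = closure({"Id"}, fds)
dept_closure = closure({"dept_name"}, fds)
print(sorted(id_closure))
print(sorted(dept_closure))
```

Here Id → budget is in F+ because budget is in the closure of {Id}, even though it is not listed in F directly.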



2NF - 1 NF plus no partial dependencies
A table is in 2NF if and only if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table.

A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table.



Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements:
○ The table is in first normal form
○ All non-key attributes in the table must be functionally dependent on the entire primary key
Note: Remember that we are dealing with non-key attributes

Example 1 (Not 2NF)

Schema: {Title, PubId, AuId, Price, AuAddress}
1. Key: {Title, PubId, AuId}
2. {Title, PubId, AuId} → {Price}
3. {AuId} → {AuAddress}
AuAddress does not belong to a key
AuAddress functionally depends on AuId, which is a proper subset of a key
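The check applied in Example 1 can be written down as a small function. This is a simplified sketch: it inspects only the FDs that are listed explicitly (it does not derive implied ones), and partial_dependencies is a hypothetical helper name.

```python
def partial_dependencies(candidate_keys, fds):
    """List (lhs, non-prime attributes) pairs that violate 2NF.

    candidate_keys is a list of attribute sets; fds is a list of
    (lhs, rhs) pairs of attribute tuples. Only the given FDs are checked.
    """
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        non_prime_rhs = set(rhs) - prime - set(lhs)
        lhs_is_part_of_key = any(set(lhs) < set(key) for key in candidate_keys)
        if non_prime_rhs and lhs_is_part_of_key:
            violations.append((set(lhs), non_prime_rhs))
    return violations


# Example 1 from the slide.
keys = [{"Title", "PubId", "AuId"}]
fds = [
    (("Title", "PubId", "AuId"), ("Price",)),
    (("AuId",), ("AuAddress",)),
]

violations = partial_dependencies(keys, fds)
for lhs, attrs in violations:
    print(sorted(lhs), "partially determines", sorted(attrs))
```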
Second Normal Form (2NF)
Example 2 (Not 2NF)
Schema: {City, Street, HouseNumber, HouseColor, CityPopulation}

1. Key: {City, Street, HouseNumber}

2. {City, Street, HouseNumber} → {HouseColor}

3. {City} → {CityPopulation}

CityPopulation does not belong to any key.

CityPopulation is functionally dependent on City, which is a proper subset of the key.

Example 3 (Not 2NF)

Schema: {studio, movie, budget, studio_city}
1. Key: {studio, movie}

2. {studio, movie} → {budget}

3. {studio} → {studio_city}
studio_city is not a part of a key
studio_city functionally depends on studio, which is a proper subset of the key
2NF - Decomposition
1. If a data item is fully functionally dependent on only a part of the primary key, move that data item and that part of the primary key to a new table.
2. If there are other data items which are functionally dependent on the same part of the key, place them also in the new table.
3. Make the partial primary key copied from the original table the primary key for the new table.
Example 1 (Convert to 2NF)
Old Schema: {Title, PubId, AuId, Price, AuAddress}
New Schema: {Title, PubId, AuId, Price}
New Schema: {AuId, AuAddress}
2NF - Decomposition
Example 2 (Convert to 2NF)
Old Schema: {Studio, Movie, Budget, StudioCity}
New Schema: {Movie, Studio, Budget}
New Schema: {Studio, StudioCity}

Example 3 (Convert to 2NF)

Old Schema: {City, Street, HouseNumber, HouseColor, CityPopulation}
New Schema: {City, Street, HouseNumber, HouseColor}
New Schema: {City, CityPopulation}
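At the schema level, the decomposition used in these examples amounts to moving the partially dependent attributes, together with the part of the key that determines them, into a new table. A minimal sketch, where decompose_2nf is a hypothetical helper that handles one partial dependency at a time:

```python
def decompose_2nf(schema, partial_fd):
    """Split schema on one partial dependency lhs -> rhs.

    Returns (remaining_schema, new_schema). The determinant (lhs) stays in
    the original table and becomes the key of the new one.
    """
    lhs, rhs = partial_fd
    remaining = [attr for attr in schema if attr not in rhs]
    new_table = list(lhs) + [attr for attr in rhs if attr not in lhs]
    return remaining, new_table


city_schema = ["City", "Street", "HouseNumber", "HouseColor", "CityPopulation"]
house, city = decompose_2nf(city_schema, (["City"], ["CityPopulation"]))
print(house)
print(city)

studio_schema = ["Studio", "Movie", "Budget", "StudioCity"]
movie, studio = decompose_2nf(studio_schema, (["Studio"], ["StudioCity"]))
print(movie)
print(studio)
```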



Second normal form:
A relation is in second normal form if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.

A university uses the following relation:

Student(IDSt, StudentName, IDProf, ProfessorName, Grade)

Key(IDSt, IDProf)

The attributes IDSt and IDProf are the identification keys.

All attributes are single valued (1NF).

The following functional dependencies exist:

1. The attribute ProfessorName is functionally dependent on attribute IDProf
(IDProf --> ProfessorName)
2. The attribute StudentName is functionally dependent on IDSt
(IDSt --> StudentName)
3. The attribute Grade is fully functionally dependent on IDSt and IDProf
(IDSt, IDProf --> Grade)

The table in this example is in first normal form (1NF) since all attributes are single valued. But it is not yet in 2NF. If student 1 leaves the university and the tuple is deleted, then we lose all information about professor Schmid, because ProfessorName depends only on IDProf (part of the key) yet is stored only in tuples that also hold a student.

A table is in 2NF only if it is in 1NF and every non-key attribute is fully dependent on the primary key.
The Second Normal Form eliminates partial dependencies on primary keys.
Let us see an example.
In the given table, we have a partial dependency; let us see how:
• The prime key attributes are StudentID and ProjectID.

• As stated, a non-prime attribute (here StudentName or ProjectName) is partially dependent if it is functionally dependent on only part of a candidate key.

• StudentName can be determined by StudentID alone, which makes the relation partially dependent.

• ProjectName can be determined by ProjectID alone, which makes the relation partially dependent.

• Therefore, the <StudentProject> relation violates 2NF and is considered a bad database design.

• COURSE_FEE alone cannot decide the value of COURSE_NO or STUD_NO;

• COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO;

• COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO;

Hence, COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate key {STUD_NO, COURSE_NO}.

But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO, which is a proper subset of the candidate key. The non-prime attribute COURSE_FEE is dependent on a proper subset of the candidate key, which is a partial dependency, and so this relation is not in 2NF.

This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase
Location].
In this case, [Purchase Location] only depends on [Store ID], which is only part of the primary key.
Therefore, this table does not satisfy second normal form.

To bring this table to second normal form, we break the table into two tables, and now we have the
following:

3 NF

What do you understand from the
diagram below?

Is there any partial FD?

Which normal form is carried out?

Third Normal Form
• 2NF and no transitive dependencies

• A transitive dependency is when a non-key attribute depends on another non-key attribute.
If A->B and B->C are two FDs, then A->C is called a transitive dependency.
(Here A is prime; B and C are non-prime.)

• Note: This is called transitive because the primary key is a determinant for another attribute, which in turn is a determinant for a third attribute.
FDs
StudentID → StudentName
StudentID → ProgramID
StudentID → ProgramName
ProgramID → ProgramName

Partial FDs: none.

Full FDs
StudentID → StudentName
StudentID → ProgramID
StudentID → ProgramName

Transitive FDs
ProgramID → ProgramName

Resulting tables:
(StudentID, ProgramID, StudentName)
(ProgramID, ProgramName)

• A table design is said to be in 3NF if both of the following conditions hold:
• The table must be in 2NF.
• Transitive functional dependencies of non-prime attributes on any super key should be removed.

Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name, emp_zip}, and so on
Candidate keys: {emp_id}
Non-prime attributes: all attributes except emp_id are non-prime, as they are not part of any candidate key.

FDs:
emp_zip → emp_state, emp_city, emp_district
emp_id → emp_zip

Here, emp_state, emp_city and emp_district are dependent on emp_zip.

And emp_zip is dependent on emp_id, which makes the non-prime attributes (emp_state, emp_city and emp_district) transitively dependent on the super key (emp_id). This violates the rule of 3NF.

Third Normal Form (3NF)
A relation is in third normal form if it is in second normal form and there is no transitive dependency for non-prime attributes.

FD set:
{STUD_NO -> STUD_NAME,
STUD_NO -> STUD_STATE,
STUD_STATE -> STUD_COUNTRY,
STUD_NO -> STUD_AGE}
Candidate Key:
{STUD_NO}

Decompose the relation

STUDENT(STUD_NO, STUD_NAME, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:

1. STUDENT(STUD_NO, STUD_NAME, STUD_STATE, STUD_AGE)

2. STATE_COUNTRY(STATE, COUNTRY)
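The decomposition can be tried on a few rows to confirm that nothing is lost. The student rows below are hypothetical, project is an illustrative helper, and the second table keeps the column names STUD_STATE and STUD_COUNTRY rather than the slide's STATE and COUNTRY so that the join condition stays obvious.

```python
# Hypothetical STUDENT rows; values are illustrative only.
student = [
    {"STUD_NO": 1, "STUD_NAME": "Asha", "STUD_STATE": "Kerala",
     "STUD_COUNTRY": "India", "STUD_AGE": 20},
    {"STUD_NO": 2, "STUD_NAME": "Babu", "STUD_STATE": "Kerala",
     "STUD_COUNTRY": "India", "STUD_AGE": 21},
    {"STUD_NO": 3, "STUD_NAME": "Cini", "STUD_STATE": "Tamil Nadu",
     "STUD_COUNTRY": "India", "STUD_AGE": 20},
]


def project(rows, attrs):
    """Project rows onto attrs and drop duplicates, keeping first-seen order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


student_3nf = project(student, ["STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_AGE"])
state_country = project(student, ["STUD_STATE", "STUD_COUNTRY"])

# Joining back on STUD_STATE should reproduce the original relation.
rejoined = [
    {**s, **c}
    for s in student_3nf
    for c in state_country
    if s["STUD_STATE"] == c["STUD_STATE"]
]

print(len(state_country), "state rows instead of", len(student))
print(rejoined == student)
```

Each state's country is now stored once, and the join is lossless because STUD_STATE is the key of the new table.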

The following dependencies exist:

1. Name, Account_No, Bank_Code_No are functionally dependent on ID
(ID --> Name, Account_No, Bank_Code_No)

2. Bank is functionally dependent on Bank_Code_No
(Bank_Code_No --> Bank)

In the table above, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type].
Therefore, [Book ID] determines [Genre Type] via [Genre ID], so we have a transitive functional dependency, and this structure does not satisfy third normal form.
To bring this table to third normal form, we split the table into two as follows:

You Try…
Is it in 3NF?

You Try…
Cust_id → zip
zip → Street, City, State
Not in 3NF

The advantages of removing transitive dependencies:

(1) The amount of data duplication is reduced, and therefore your database becomes smaller.

(2) Update anomalies become less likely: when duplicated data changes, there is a big risk of updating only some of the data, especially if it is spread out in many different places in the database.

Conclusion
We have seen how Database Normalization can

Decrease redundancy

Increase efficiency and

Reduce anomalies
by implementing three different levels of normalization called Normal Forms.

The first three Normal Forms are usually sufficient for most small to medium size applications.
Thank You

Happy to answer any questions ! ! !
