0% found this document useful (0 votes)
213 views

Ch04 Normalization

The document discusses different forms of database normalization including first normal form (1NF), second normal form (2NF), and third normal form (3NF). It explains that normalization is used to decompose relations and eliminate anomalies by removing duplications and dependencies between attributes. The goals of normalization are to organize data to avoid inconsistencies and make the structure more logical and stable.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
213 views

Ch04 Normalization

The document discusses different forms of database normalization including first normal form (1NF), second normal form (2NF), and third normal form (3NF). It explains that normalization is used to decompose relations and eliminate anomalies by removing duplications and dependencies between attributes. The goals of normalization are to organize data to avoid inconsistencies and make the structure more logical and stable.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 23

Chapter 4

Normalization

Data Normalization
Formal process of decomposing
relations with anomalies to
produce smaller, well-
structured and stable relations
Primarily a tool to validate and
improve a logical design so that
it satisfies certain constraints
that avoid unnecessary
duplication of data
2

1
Well-Structured Relations
A relation that contains minimal data redundancy
and allows users to insert, delete, and update rows
without causing data inconsistencies
Goal is to avoid (minimize) anomalies
Insertion Anomaly adding new rows forces user to
create duplicate data
Deletion Anomaly deleting a row may cause loss of
other data representing completely different facts
Modification Anomaly changing data in a row forces
changes to other rows because of duplication

General rule of thumb: a table should not pertain to


more than one entity type
3

Example Figure 4.2b

Question Is this a relation? Answer Yes: unique rows and no


multivalued attributes

Question Whats the primary key? Answer Composite: EmpID,


CourseTitle

2
Anomalies in this Table
Insertion cant enter a new employee without having
the employee take a class
Deletion if we remove employee 140, we lose
information about the existence of a Tax Acc class
Modification giving a salary increase to employee
100 forces us to update multiple records

Why do these anomalies exist?


Because there are two themes (entity types what are they?) in this
one relation (two themes, entity types, were combined). This results
in duplication, and an unnecessary dependency between the entities
5

Figure: 4-22 Steps in Normalization


Table with Multivalued
attributes
Remove Multivalued
Attributes
First normal
form (1NF)

Second normal
form(2NF)

Third normal
form (3NF)

Boyce-Codd normal
form (BC-NF)

Fourth normal
Form (4NF)

Fifth normal
form (5NF)
6

3
First Normal Form (1NF)
Only atomic attributes (simple, single-value)
A primary key has been identified
Every relation is in 1NF by definition
1NF example:
Student
StudentId StuName CourseId CourseName Grade
100 Mike 112 C++ A
100 Mike 111 Java B
101 Susan 222 Database A
140 Lorenzo 224 Graphics B

Figure: 4-22 Steps in Normalization


Table with Multivalued
attributes
Remove Multivalued
Attributes
First normal
form (1NF)
Remove Partial
Dependencies
Second normal
form(2NF)

Third normal
form (3NF)

Boyce-Codd normal
form (BC-NF)

Fourth normal
Form (4NF)

Fifth normal
form (5NF)
8

4
Functional Dependencies
Functional Dependency: The value of one attribute (the
determinant) determines the value of another attribute.
AB reads Attribute B is functionally dependent on A
AB means if two rows have same value of A they
necessarily have same value of B
FDs are determined by semantics: You cant say that a FD
exists just by looking at data. But can say whether it does
not exist by looking at data.

Quick Check

Id Name?
Age Gender?
Name Id?
Name, Age Id?

10

5
Functional Dependencies and Keys
Functional Dependency: The value of one attribute (the
determinant) determines the value of another attribute.
Candidate Key
Attribute that uniquely identifies a row in a relation
Could be a combination of (non-redundant) attributes
Each non-key field is functionally dependent on every
candidate key

12

Figure 4-23: Representing Functional Dependencies (cont.)

EmpID ____________________
EmpID, CourseTitle ____________________

13

6
Practice Exercise #7, page #193

1. Convert this table to a relation (named PART SUPPLIER) in 1NF


2. Draw a relational schema for PART SUPPLIER and show the
functional dependencies. Identify a candidate key.
3. Identify each of the following: an insert anomaly, a delete anomaly, and
a modification anomaly.

14

15

7
Second Normal Form (2NF)

1NF PLUS every non-key attribute is


fully functionally dependent on the
ENTIRE primary key
Every non-key attribute must be defined by
the entire key, not by only part of the key
No partial functional dependencies

16

Functional Dependencies in Student

StudentId StuName CourseId CourseName Grade

Can represent FDs with arrows as above, or


StudentId StuName,
CourseId CourseName
StudentId,CourseId Grade (and StuName,
CourseName)
Any partial FDs ?
17

8
Functional Dependencies in Student

StudentId StuName CourseId CourseName Grade

Can represent FDs with arrows as above, or


StudentId StuName,
CourseId CourseName
StudentId,CourseId Grade (and StuName,
CourseName)

Therefore, NOT in 2nd Normal Form!!


18

2NF: Normalizing
How do we convert the partial dependencies into
normal ones ? By breaking into more tables.

StudentId StuName CourseId CourseName Grade

Becomes (notice above arrows mean functional dependency,


below they mean FK constraints)

StudentId StuName CourseId CourseName

StudentId CourseId Grade

19

9
You Try

SeriesId EpisodeId SeriesTitle EpisodeTitle AiringDate

List all FDs


Eliminate partial FDs, if any

20

Figure: 4-22 Steps in Normalization


Table with Multivalued
attributes
Remove Multivalued
Attributes
First normal
form (1NF)
Remove Partial
Dependencies
Second normal
form(2NF)
Remove Transitive
Dependencies
Third normal
form (3NF)

Boyce-Codd normal
form (BC-NF)

Fourth normal
Form (4NF)

Fifth normal
form (5NF)
22

10
Third Normal Form
2NF and no transitive dependencies
A transitive dependency is when a non-key
attribute depends on another non-key
attribute
Note: This is called transitive, because the
primary key is a determinant for another
attribute, which in turn is a determinant for
a third attribute

23

3NF Example

Course SectNum Classroom Capacity

Classroom Capacity TRANSITIVE


Any partial FDs? NO
Any transitive FDs? YES !
How do we eliminate it?
By breaking into its own table

25

11
3NF Normalization

Course SectNum Classroom

Classroom Capacity

27

You Try

StudentId ProgramId StudentName ProgramName

Partial FDs? Eliminate, if any.


Transitive FDs? Eliminate, if any.

28

12
Practice Exercise #15, page #196

Insertion anomaly?
Deletion anomaly?
Modification anomaly?

1. Develop a diagram that shows the functional dependencies in the


SHIPMENT relation.
2. In what normal form is SHIPMENT? Why?
3. Convert SHIPMENT to 3NF if necessary. Show the resulting
table(s) with the sample data presented in SHIPMENT.

30

31

13
Figure: 4-22 Steps in Normalization
Table with Multivalued
attributes
Remove Multivalued
Attributes
First normal
form (1NF)
Remove Partial
Dependencies
Second normal
form(2NF)
Remove Transitive
Dependencies
Third normal
form (3NF)
Remove remaining
anomalies resulting from
Boyce-Codd normal multiple candidate keys
form (BC-NF)

Fourth normal
Form (4NF)

Fifth normal
form (5NF)
33

Further Normalization
Boyce-Codd Normal form (BCNF)
Slight difference with 3NF
To be in 3NF but not in BNF, needs two composite
candidate keys, with one attribute of one key depending
on one attribute of the other
Not very common J
If a table contains only one candidate key, the 3NF and the
BCNF are equivalent.
Fourth Normal Form (4NF)
To break it, need to have multivalued dependencies, a
generalization of functional dependencies

Usually, if youre in 3NF youre in BCNF, 4NF,


34

14
BCNF Example
Assume that
For each subject, each student is taught by one Instructor
Each Instructor teaches only one subject
Each subject is taught by several Instructors
Course, Student Instructor
Course Instructor Student
CS 121 Dr. A. James Bill Payne
Instructor Course
CS 121 Dr. A. James Tony Perez
CS 121 Dr. A. James James Atkinson
CS 121 Dr. A. James Linda Lee

Course Instructor Student Course Instructor Student


CS 141 Dr. T. Watson Linda Lee CS 101 Dr. M. Jones Linda Lee
CS 141 Dr. T. Watson Judith San CS 101 Dr. M. Jones Tony Perez
CS 141 Dr. T. Watson Bill Jones CS 101 Dr. M. Jones Bill Payne
CS 141 Dr. P. Hold Bill Payne
CS 141 Dr. P. Hold A. White

BCNF: Decompose into (Instructor, Course) and (Student, Instructor) 35

BCNF
Boyce-Codd normal form (BCNF)
A relation is in BCNF, if and only if, every
determinant is a candidate key.
The difference between 3NF and BCNF is that for
a functional dependency A B, 3NF allows this
dependency in a relation if B is a primary-key
attribute and A is not a candidate key,

whereas BCNF insists that for this dependency to


remain in a relation, A must be a candidate key.

36

15
Figure: 4-22 Steps in Normalization
Table with Multivalued
attributes
Remove Multivalued
Attributes
First normal
form (1NF)
Remove Partial
Dependencies
Second normal
form(2NF)
Remove
Third normal
form (3NF)
Remove remaining
anomalies resulting from
Boyce-Codd normal multiple candidate keys
form (BC-NF)
Remove Multivalued
Fourth normal Dependencies
Form (4NF)

Fifth normal
form (5NF)
37

4NF
A multi-valued dependency exists when
There are at least 3 attributes A, B, C in a relation and
For each value of A there is a well defined set of values
for B, and a well defined set of values for C,
But the set of values for B is independent on the set of
values for C

4NF = 3NF with no multi-valued dependency

38

16
4NF Example
Assume that
Each subject is taught by many Instructors
The same books are used in many subjects
Each Instructor uses a different book
Course, Instructor Text
Course, Text Instructor
Course Instructor Text
CS 121 Dr. A. James Int to Com Science
CS 121 Dr. P. Hold Comp Scien Int

Course Instructor Text


CS 141 Dr. T. Watson Int to Com Science
CS 141 Dr. P. Hold Comp Scien Int
CS 101 Dr. M. Jones COMP SCIEN

4NF: Decompose into (Course, Instructor) and (Course, Text)


39

Textbook Example

40

17
The Normalization Example in the Text Book
Figure 4-24 INVOICE (Pine Valley Furniture Company)

41

Figure 4-25 INVOICE Data


Table with multivalued attributes, not in 1st normal form

Note: this is NOT a relation. WHY?


42

18
Figure 4-26 INVOICE relation (1NF)
Table with no multivalued attributes and unique rows

Note: this is relation, but not a well-structured one. WHY?


43

Anomalies in this Table


Insertionif new product is ordered for order 1007 of
existing customer, customer data must be re-entered,
causing duplication

Deletionif we delete the Dining Table from Order


1006, we lose information concerning this item's finish
and price

Updatechanging the price of product ID 4 requires


update in several records

44

19
Figure 4-27 Functional dependency diagram for INVOICE

Order_ID Order_Date, Customer_ID, Customer_Name, Customer_Address


Customer_ID Customer_Name, Customer_Address
Product_ID Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID Order_Quantity

Therefore, NOT in 2nd Normal Form


45

Figure 4-28 Partial Dependencies were Removed (2NF)

2NF

Partial dependencies are removed, but


there are still transitive dependencies
46

20
Figure 4-29 Transitive Dependencies were Removed (3NF)

Two Relations
Remain

Getting it into
Third Normal
Form

3NF

47

You Try

1. Convert to 1NF Relation


2. Draw dependency diagram showing all functional dependencies
3. Identify anomalies
4. Convert to 3NF Relations
5. Develop EER Diagram with appropriate cardinalities

48

21
49

Logical Database Design


You have just learned
and completed one of
the most important
concepts and theories,
integrity constraints
and normalization,
for developing a
quality of database.

50

22
After learning one of most important
database concepts and theories...
WHATS NEXT ?

51

Steps of Database Development


User view-1 User view-2 User view-3 User view-N

User interview &


Integrated Model
Conceptual Schema (Model)

Logical Model
(ERD or E/ERD)

(more relations
(Six) Relations Transformation produced)

(more tables
NORMALIZATION (up to 3NF) created)

IMPLEMENTATION
52

23

You might also like