Chapter_4_(3)[1]

Chapter 4 discusses database normalization, which is a process aimed at reducing data redundancy and preventing update anomalies such as insertion, deletion, and modification issues. It outlines the steps of normalization from Unnormalized Form to Fifth Normal Form, emphasizing the importance of functional dependencies and the elimination of partial and transitive dependencies. The chapter also highlights the potential performance trade-offs of normalization and introduces advanced forms like Boyce-Codd Normal Form and Fourth Normal Form.

Uploaded by

mulukengashaw21

Chapter 4

Normalization

Cont…
Database normalization is a series
of steps followed to obtain a
database design that allows for
consistent storage and efficient
access of data in a relational
database.
These steps reduce data
redundancy and the risk of data
becoming inconsistent.
Cont…
Normalization is the process of identifying the logical associations between data items and designing a database that represents those associations without suffering from the update anomalies, which are:
1. Insertion anomalies
2. Deletion anomalies
3. Modification anomalies
Cont…
Normalization may reduce system performance, since data must be cross-referenced across many tables. Thus denormalization is sometimes used to improve performance, at the cost of reduced consistency guarantees.
Example

• Clearly, Name and Address are redundant: the relation is larger than necessary, and changing the Address means updating 3 rows.
• A student cannot be recorded unless he or she is taking at least one class.
Cont…
A mnemonic for remembering the rationale for normalization could be the following:
1. No Repeating or Redundancy: no repeating fields in the table
2. The Fields Depend Upon the Key: every field in the table should depend solely on the key
3. The Whole Key: no partial key dependency
4. And Nothing But the Key: no dependency between non-key fields
Cont…
• All the normalization rules will eventually remove the update anomalies that may otherwise arise during data manipulation after implementation. The update anomalies are insertion, deletion, and modification anomalies.
• Pitfalls of Normalization
◦ Requires data to see the problems
◦ May reduce performance of the system
◦ Is time consuming
◦ Difficult to design and apply
◦ Prone to human error
Cont…
• The underlying ideas in normalization are simple enough. Through normalization we want to design for our relational database a set of tables that:
1. Contain all the data necessary for the purposes that the database is to serve,
2. Have as little redundancy as possible,
3. Permit efficient updates of the data in the database, and
4. Avoid the danger of losing data unknowingly.
◦ The problems that can occur in an insufficiently normalized table are called update anomalies; they include insertion, deletion, and modification anomalies.
Insertion anomalies
• An "insertion anomaly" is a failure to place information about a new database entry into all the places in the database where information about that new entry needs to be stored.
• In a properly normalized database, information about a new entry needs to be inserted into only one place in the database; in an inadequately normalized database, information about a new entry may need to be inserted into more than one place and, human fallibility being what it is, some of the needed additional insertions may be missed.
Deletion anomalies
• A "deletion anomaly" is a failure to remove information about an existing database entry when it is time to remove that entry.
• In a properly normalized database, information about an old, to-be-gotten-rid-of entry needs to be deleted from only one place in the database; in an inadequately normalized database, information about that old entry may need to be deleted from more than one place and, human fallibility being what it is, some of the needed additional deletions may be missed.
Modification anomalies
• A modification of a database involves changing the value of some attribute in a table.
• In a properly normalized database table, whatever information the user modifies needs to be changed in only one place, and the change takes effect wherever that information is used.
• The purpose of normalization is to reduce the chances for anomalies to occur in a database.
Example
EmpID  FName   LName    SkillID  Skill   SkillType    School     School_Add   SkillLevel
12     Abebe   Mekuria  2        SQL     Database     BiT        Poly         5
16     Lemma   Alemu    5        C++     Programming  High land  Kebele 14    6
28     Chane   Kebede   2        SQL     Database     BiT        Poly         10
25     Abera   Taye     6        VB6     Programming  Blue Nile  Kebele 13    8
65     Almaz   Belay    2        SQL     Database     Blue Nile  Kebele 13    9
24     Dereje  Tamiru   8        Oracle  Database     High land  Kebele 14    5
51     Selam   Belay    4        Prolog  Programming  AAU        Addis Ababa  8
94     Alem    Kebede   3        Cisco   Networking   BiT        Poly         7
18     Girma   Dereje   1        IP      Programming  AAU        Addis Ababa  4
Deletion Anomalies:
If the employee with ID 16 is deleted, then every piece of information about the skill C++ and its skill type is deleted from the database. We will then have no information about C++ and its skill type.
Insertion Anomalies:
What if we have a new employee with a skill called Pascal? We cannot decide whether Pascal is allowed as a value for Skill, and we have no clue about the type of skill that Pascal should be categorized as.
Modification Anomalies:
What if the address for High land is changed from Kebele 14 to Kebele 7? We need to look for every occurrence of High land and change the value of School_Add from Kebele 14 to Kebele 7, which is prone to error.
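The deletion and modification anomalies described above can be reproduced on a few rows of the example table. The following is a minimal Python sketch: the three-row subset, the list-of-dictionaries layout, and the helper names are assumptions made for illustration and are not part of the original slides.

```python
# A three-row subset of the example table (illustrative only).
rows = [
    {"EmpID": 16, "Skill": "C++", "SkillType": "Programming",
     "School": "High land", "School_Add": "Kebele 14"},
    {"EmpID": 24, "Skill": "Oracle", "SkillType": "Database",
     "School": "High land", "School_Add": "Kebele 14"},
    {"EmpID": 12, "Skill": "SQL", "SkillType": "Database",
     "School": "BiT", "School_Add": "Poly"},
]

def known_skills(table):
    """Skills the database knows about, as (Skill, SkillType) pairs."""
    return {(r["Skill"], r["SkillType"]) for r in table}

def delete_employee(table, emp_id):
    """Return a copy of the table without the given employee."""
    return [r for r in table if r["EmpID"] != emp_id]

def rows_to_update(table, school):
    """Number of rows that must change when a school's address changes."""
    return sum(1 for r in table if r["School"] == school)

# Deletion anomaly: removing employee 16 also removes all knowledge of C++.
after_delete = delete_employee(rows, 16)
cpp_still_known = ("C++", "Programming") in known_skills(after_delete)

# Modification anomaly: one address change has to touch several rows.
high_land_rows = rows_to_update(rows, "High land")
```

In this subset `cpp_still_known` ends up false and `high_land_rows` is 2, which is exactly the kind of hidden loss and repeated update that normalization is meant to prevent.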
Functional Dependency (FD)
Data Dependency
• The logical associations between data items that point the database designer in the direction of a good database design are referred to as determinant or dependent relationships.
• Two data items A and B are said to be in a determinant or dependent relationship if certain values of data item B always appear with certain values of data item A.
• If data item A is the determinant and B the dependent data item, then the direction of the association is from A to B and not vice versa.
Cont...
• This relationship is expressed as "B is functionally dependent on A," or "A determines B," or "B is a function of A," or "A functionally governs B."
• Often, the notions of functionality and functional dependency are expressed briefly by the statement, "If A, then B."
• It is important to note that the value B must be unique for a given value of A, i.e., any given value of A must imply just one and only one value of B, in order for the relationship to qualify for the name "function."
Cont…
X → Y holds if, whenever two tuples have the same value for X, they also have the same value for Y.
The notation is A → B, which is read as "B is functionally dependent on A."
In general, a functional dependency is a relationship among attributes. In relational databases, a determinant can govern one other attribute or several other attributes.
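The tuple-based definition of X → Y translates directly into a check over rows. The sketch below is illustrative only: the function name `fd_holds` and the sample rows are assumptions, and a check on sample data can only refute a dependency, since real FDs come from the meaning of the attributes rather than from one snapshot of the data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first time x appears we remember y; later rows must agree.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"SkillID": 2, "Skill": "SQL", "SkillLevel": 5},
    {"SkillID": 5, "Skill": "C++", "SkillLevel": 6},
    {"SkillID": 2, "Skill": "SQL", "SkillLevel": 10},
]

skill_determined = fd_holds(rows, ["SkillID"], ["Skill"])       # True
level_determined = fd_holds(rows, ["SkillID"], ["SkillLevel"])  # False
```

Two rows share SkillID 2 and agree on Skill, so SkillID → Skill is not violated; they disagree on SkillLevel, so SkillID → SkillLevel does not hold.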
Cont…
• FDs are derived from the real-world constraints on the attributes.
• Example

• Since the type of Wine served depends on the type of Dinner, we say Wine is functionally dependent on Dinner.
Dinner → Wine
Partial Dependency
• If an attribute which is not a member of the primary key is dependent on only some part of the primary key (when the primary key is composite), then that attribute is partially functionally dependent on the primary key.
• Let {A,B} be the primary key and C a non-key attribute.
• If {A,B} → C and B → C,
• then C is partially functionally dependent on {A,B}.
Full Dependency
• If an attribute which is not a member of the primary key depends on the whole key and not on just some part of it (when the primary key is composite), then that attribute is fully functionally dependent on the primary key.
• Let {A,B} be the primary key and C a non-key attribute.
• If {A,B} → C holds, and neither B → C nor A → C holds,
• then C is fully functionally dependent on {A,B}.
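The difference between partial and full dependency can be sketched by testing every proper part of the composite key. This is a minimal illustration under stated assumptions: the helper names, the attribute D, and the sample rows are hypothetical, and the check only reflects the sample data.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def dependency_kind(rows, key, attr):
    """Classify attr as 'full', 'partial', or 'none' with respect to key."""
    if not fd_holds(rows, key, [attr]):
        return "none"
    # Try every proper, non-empty part of the composite key.
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if fd_holds(rows, list(part), [attr]):
                return "partial"
    return "full"

# Hypothetical relation with primary key {A, B}.
rows = [
    {"A": 1, "B": "x", "C": "red", "D": 10},
    {"A": 2, "B": "x", "C": "red", "D": 20},
    {"A": 1, "B": "y", "C": "blue", "D": 30},
    {"A": 2, "B": "y", "C": "blue", "D": 40},
]

kind_c = dependency_kind(rows, ["A", "B"], "C")  # B alone determines C
kind_d = dependency_kind(rows, ["A", "B"], "D")  # only {A, B} determines D
```

Here C comes out as partially dependent (B → C holds on its own), while D is fully dependent on {A,B}.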
Transitive Dependency
• In mathematics and logic, a transitive relationship is a relationship of the following form:
• "If A implies B, and if also B implies C, then A implies C."

Example:
• If Abebe is a Human, and if every Human is an Animal, then Abebe must be an Animal.
• A generalized way of describing transitive dependency is:
• If A functionally governs B, AND
◦ if B functionally governs C,
• THEN A functionally governs C,
• provided that neither C nor B determines A (B ↛ A and C ↛ A).
• In the normal notation:
• {(A → B) AND (B → C)} ==> A → C
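The rule above, including its proviso, can be sketched as a small search over declared dependencies. This is an illustration only: the function name and the dictionary representation (one determinant attribute mapped to the set of attributes it determines) are assumptions chosen to keep the example short.

```python
def transitive_dependencies(fds, key):
    """Find (key, via, target) chains where key -> via and via -> target.

    fds maps a single determinant attribute to the set of attributes
    it determines. Chains are skipped when via or target determines
    the key, matching the proviso in the rule.
    """
    found = []
    for via in sorted(fds.get(key, set())):
        if key in fds.get(via, set()):
            continue  # via determines the key, so this is not transitive
        for target in sorted(fds.get(via, set())):
            if key in fds.get(target, set()):
                continue  # target determines the key
            found.append((key, via, target))
    return found

# A -> B and B -> C, with nothing determining A.
chains = transitive_dependencies({"A": {"B"}, "B": {"C"}}, "A")

# Same dependencies, but B also determines A: the proviso fails.
no_chains = transitive_dependencies({"A": {"B"}, "B": {"A", "C"}}, "A")
```

In the first case the single chain A → B → C is reported; in the second case B determines A, so no transitive dependency is reported.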
Steps of Normalization:
Normalization towards a logical design consists of the following steps:
Unnormalized Form:
• Identify all data elements.
First Normal Form:
• Find the key with which you can find all data.
Second Normal Form:
• Remove part-key dependencies. Make all data dependent on the whole key.
Third Normal Form:
• Remove non-key dependencies. Make all data dependent on nothing but the key.
• For most practical purposes, databases are considered normalized if they adhere to third normal form.
First Normal Form (1NF)
• Requires that all column values in a table are atomic (e.g., a number is an atomic value, while a list or a set is not).
• We have two ways of achieving this:
1. Putting each repeating group into a separate table and connecting them with a primary key-foreign key relationship
2. Moving the repeating groups to new rows by repeating the common attributes. Then find the key with which you can find all data.
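Both ways of removing a repeating group can be sketched in a few lines of Python. The sample data is hypothetical (in particular, giving one employee two skills is an assumption made so that a repeating group exists), and the function names are illustrative.

```python
# Unnormalized: the Skills column holds a list, which is not atomic.
unnormalized = [
    {"EmpID": 12, "FName": "Abebe", "Skills": ["SQL", "VB6"]},
    {"EmpID": 16, "FName": "Lemma", "Skills": ["C++"]},
]

def split_repeating_group(table, key, group, new_name):
    """Way 1: move the group to a child table linked by the parent's key."""
    parent = [{k: v for k, v in row.items() if k != group} for row in table]
    child = [{key: row[key], new_name: value}
             for row in table for value in row[group]]
    return parent, child

def flatten_repeating_group(table, group, new_name):
    """Way 2: one row per value, repeating the common attributes."""
    flat = []
    for row in table:
        common = {k: v for k, v in row.items() if k != group}
        for value in row[group]:
            flat.append({**common, new_name: value})
    return flat

parent, child = split_repeating_group(unnormalized, "EmpID", "Skills", "Skill")
flat = flatten_repeating_group(unnormalized, "Skills", "Skill")
```

With the second way, EmpID alone no longer identifies a row, so the key of the flattened table becomes the combination (EmpID, Skill).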
Cont…
Definition of a table (relation) in 1NF
A table is in 1NF if:
◦ There are no duplicated rows in the table (each row has a unique identifier).
◦ Each cell is single-valued (i.e., there are no repeating groups).
◦ Entries in a column (attribute, field) are of the same kind.
Example 1
It is not in 1NF

Cont…
It is in 1NF

Example for first normal form (1NF)
UNNORMALIZED
FIRST NORMAL FORM (1NF)
Remove all repeating groups. Distribute the multi-valued attributes into different rows and identify a unique identifier for the relation, so that it can be regarded as a relation in a relational database.
Second Normal Form (2NF)
• No partial dependency of a non-key attribute on part of the primary key.
• Any table that is in 1NF and has a single-attribute (i.e., non-composite) key is automatically also in 2NF.
• Definition of a table (relation) in 2NF
◦ It is in 1NF, and
◦ all non-key attributes are dependent on all of the key, i.e., there is no partial dependency.
Cont…
Since a partial dependency occurs
when a non-key attribute is
dependent on only a part of the
(composite) key, the definition of
2NF is sometimes phrased as, "A
table is in 2NF if it is in 1NF and if it
has no partial dependencies."
Example for 2NF:

Cont…
This schema is in 1NF, since it has no repeating groups and no multi-valued attributes. To convert it to 2NF we need to remove all partial dependencies of non-key attributes on part of the primary key.
Cont…
• {EmpID, ProjNo} → EmpName, ProjName, ProjLoc, ProjFund, ProjMangID
• But in addition to this we have the following dependencies:
• EmpID → EmpName
• ProjNo → ProjName, ProjLoc, ProjFund, ProjMangID
• As we can see, some non-key attributes are partially dependent on some part of the primary key.
• Thus each of these collections of attributes should be moved to a new relation.
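The move to new relations can be sketched as three projections of the original table, followed by a join that checks nothing was lost. The sample rows, the relation names (`employee`, `project_rel`, `assignment`), and the helper `project` are assumptions for illustration; only the attribute names come from the dependencies above.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

# Hypothetical rows of the 1NF relation with key {EmpID, ProjNo}.
emp_proj = [
    {"EmpID": 1, "ProjNo": "P1", "EmpName": "Abebe", "ProjName": "Payroll",
     "ProjLoc": "Bahir Dar", "ProjFund": 5000, "ProjMangID": 9},
    {"EmpID": 2, "ProjNo": "P1", "EmpName": "Lemma", "ProjName": "Payroll",
     "ProjLoc": "Bahir Dar", "ProjFund": 5000, "ProjMangID": 9},
    {"EmpID": 1, "ProjNo": "P2", "EmpName": "Abebe", "ProjName": "Library",
     "ProjLoc": "Addis Ababa", "ProjFund": 8000, "ProjMangID": 7},
]

# One relation per determinant, plus the key itself to keep the link.
employee = project(emp_proj, ["EmpID", "EmpName"])
project_rel = project(emp_proj, ["ProjNo", "ProjName", "ProjLoc",
                                 "ProjFund", "ProjMangID"])
assignment = project(emp_proj, ["EmpID", "ProjNo"])

# Joining the three relations back should reproduce the original rows.
rejoined = [
    {**a, **e, **p}
    for a in assignment
    for e in employee if e["EmpID"] == a["EmpID"]
    for p in project_rel if p["ProjNo"] == a["ProjNo"]
]

def as_set(rows):
    """Order-independent view of a table, for comparison."""
    return {tuple(sorted(r.items())) for r in rows}

lossless = as_set(rejoined) == as_set(emp_proj)
```

After the split, each employee name and each project description is stored once, and `lossless` confirms that the original table can still be rebuilt from the parts.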
Third Normal Form (3NF)
Eliminate Columns Not Dependent On Key
◦ If attributes do not contribute to a description of the key, remove them to a separate table.
◦ This level avoids update and delete anomalies.
Definition of a Table (Relation) in 3NF
◦ It is in 2NF, and
◦ there are no transitive dependencies of non-key attributes on the key.
Cont…
It is not in 3NF

Cont…
Normalized to 3NF

Example 2 for 3NF
• Assumption: Students of the same batch (same year) live in one building or dormitory.

• This schema is in 2NF, since the primary key is a single attribute.
Cont…
Let’s take StudID, Year and Dormitary and see the dependencies.
StudID → Year AND Year → Dormitary
Then transitively StudID → Dormitary
To convert it to 3NF we need to remove the transitive dependency of the non-key attribute Dormitary on the primary key StudID.
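Removing the transitive dependency amounts to storing StudID → Year and Year → Dormitary in two separate tables. The sketch below uses Python dictionaries keyed by each table's primary key; the student IDs, dormitory names, and helper names are hypothetical.

```python
# 2NF relation: Dormitary is repeated for every student of the same year.
students = [
    {"StudID": 125, "Year": 1, "Dormitary": "Block A"},
    {"StudID": 126, "Year": 1, "Dormitary": "Block A"},
    {"StudID": 127, "Year": 2, "Dormitary": "Block B"},
]

# Rows that would need editing if year 1 moved to another dormitory.
rows_touched_before = sum(1 for s in students if s["Year"] == 1)

# 3NF decomposition: one table per dependency.
student_year = {s["StudID"]: s["Year"] for s in students}   # StudID -> Year
year_dorm = {s["Year"]: s["Dormitary"] for s in students}   # Year -> Dormitary

def dorm_of(stud_id):
    """Look up a student's dormitory through the Year link."""
    return year_dorm[student_year[stud_id]]

# The same change is now a single update in the Year table.
year_dorm[1] = "Block C"
```

In the single table the change touches two rows; after the decomposition one assignment is enough, and every first-year student sees the new dormitory through `dorm_of`.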
Boyce-Codd Normal Form (BCNF)

BCNF is a type of database normalization that is a stronger version of the Third Normal Form (3NF).

A table is in BCNF if, for every one of its non-trivial functional dependencies (X → Y), X is a superkey.

This means that the left-hand side of every such functional dependency must be a candidate key or contain one.
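The superkey condition can be checked mechanically by computing attribute closures. The following is a minimal sketch, not a full normalization tool: the function names are illustrative, and the relation (Student, Course, Teacher) with its two dependencies is a hypothetical textbook-style example that is not taken from the slides.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]

attrs = {"Student", "Course", "Teacher"}
fds = [
    ({"Student", "Course"}, {"Teacher"}),  # the key determines the teacher
    ({"Teacher"}, {"Course"}),             # each teacher teaches one course
]

violations = bcnf_violations(attrs, fds)
```

Here {Student, Course} is a superkey, so the first dependency is fine, but Teacher alone determines only Course, so Teacher → Course is reported as a BCNF violation.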
Fourth Normal Form (4NF)

4NF is a level of database normalization that addresses multi-valued dependencies.

A table is in 4NF if it is in Boyce-Codd Normal Form (BCNF) and has no non-trivial multi-valued dependencies other than those whose left-hand side is a superkey.

A multi-valued dependency occurs when one attribute in a table determines a set of values of another attribute, and that set is independent of the other attributes in the table.
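A multi-valued dependency X ↠ Y can be tested on sample rows by checking that, for any two rows agreeing on X, the row combining the Y values of the first with the remaining values of the second is also present. The sketch below is illustrative: the function name and the (EmpID, Skill, Language) relation are hypothetical, and as with functional dependencies a data check can only refute the dependency.

```python
def mvd_holds(rows, x, y):
    """Return True if the multi-valued dependency x ->> y is not violated.

    rows is a list of dicts sharing the same keys; x and y are lists of
    attribute names. z is every attribute outside x and y.
    """
    if not rows:
        return True
    z = [a for a in rows[0] if a not in x and a not in y]
    present = {tuple(r[a] for a in x + y + z) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Swap: y-values from t1 combined with z-values from t2.
                needed = (tuple(t1[a] for a in x + y)
                          + tuple(t2[a] for a in z))
                if needed not in present:
                    return False
    return True

# Skills and languages of employee 12 are independent of each other.
rows = [
    {"EmpID": 12, "Skill": "SQL", "Language": "Amharic"},
    {"EmpID": 12, "Skill": "SQL", "Language": "English"},
    {"EmpID": 12, "Skill": "VB6", "Language": "Amharic"},
    {"EmpID": 12, "Skill": "VB6", "Language": "English"},
]

independent = mvd_holds(rows, ["EmpID"], ["Skill"])
broken = mvd_holds(rows[:3], ["EmpID"], ["Skill"])
```

With all four combinations present, EmpID ↠ Skill holds; dropping one combination breaks it. A table like this is the usual candidate for a 4NF split into (EmpID, Skill) and (EmpID, Language).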
Fifth Normal Form (5NF)

5NF, also known as Project-Join Normal Form (PJNF), is a level of database normalization that deals with cases where information can be reconstructed from smaller pieces of data.

A table is in 5NF if it is in Fourth Normal Form (4NF) and has no join dependencies that are not implied by its candidate keys. Essentially, a table in 5NF cannot be losslessly decomposed into smaller tables any further, except by decompositions that follow from its candidate keys.
Questions?
