DBS Normalization

Normalization is a design technique for relational databases that aims to reduce data redundancy and improve data integrity by organizing data into separate tables. It addresses issues such as update, insertion, and deletion anomalies that arise from un-normalized forms, which combine multiple entities in a single table. The process involves progressing through various normal forms (1NF, 2NF, 3NF) to ensure that all non-key attributes are functionally dependent only on the primary key.

Uploaded by

dsmarusha
Copyright
© All Rights Reserved
INSTITUTE OF ACCOUNTANCY

ARUSHA

DATABASE SYSTEMS
NORMALIZATION
WHAT IS NORMALIZATION
• Normalization is a design technique that is
widely used as a guide in designing relational
databases
• Normalization theory is based on the concepts
of normal forms.
• Normalization is the process of converting
un-normalized relations into normal form.
PROBLEMS OF UN-NORMALIZED FORMS

Let us consider the following relation student.

sno    sname  address  cno    cname           instructor  office
85001  Smith  1, Main  CP302  Database        Gupta       102
85001  Smith  1, Main  CP303  Communication   Wilson      102
85001  Smith  1, Main  CP304  Software Engg.  Williams    1024
85005  Jones  12, 7th  CP302  Database        Gupta       102
PROBLEMS OF UN-NORMALIZED FORMS
• Repetition of information: a lot of information is
repeated. Student name, address, course name,
instructor name and office number all appear in
more than one tuple.
• Every time we wish to insert a student enrolment,
say, in CP302 we must insert the name of the
course CP302 as well as the name and office
number of its instructor.
• Also every time we insert a new enrolment for, say
Smith, we must repeat his name and address.
• Repetition of information results in wastage of
storage as well as other problems.
PROBLEMS OF UN-NORMALIZED FORMS
Update Anomalies
• Example, changing the name of the instructor
of CP302 would require that all tuples
containing CP302 enrolment information be
updated.
• If, for some reason, not all of those tuples are
updated, the database will give two different
instructor names for subject CP302.
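The update anomaly can be sketched in a few lines of Python, using the four rows of the student relation shown earlier. The tuple layout and the helper name `instructors_for` are assumptions made for illustration, and the replacement instructor "Brown" is hypothetical.

```python
# Rows of the un-normalized student relation:
# (sno, sname, address, cno, cname, instructor, office)
student = [
    ("85001", "Smith", "1, Main", "CP302", "Database", "Gupta", "102"),
    ("85001", "Smith", "1, Main", "CP303", "Communication", "Wilson", "102"),
    ("85001", "Smith", "1, Main", "CP304", "Software Engg.", "Williams", "1024"),
    ("85005", "Jones", "12, 7th", "CP302", "Database", "Gupta", "102"),
]


def instructors_for(rows, cno):
    """Return the set of instructor names recorded for course cno."""
    return {row[5] for row in rows if row[3] == cno}


# Change the CP302 instructor, but (by mistake) only in the first matching row.
partially_updated = list(student)
row = partially_updated[0]
partially_updated[0] = row[:5] + ("Brown",) + row[6:]

# The relation now records two different instructors for CP302.
inconsistent = instructors_for(partially_updated, "CP302")
```

If the instructor were stored once, in a separate course table, a single update could not leave the data in this state.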
PROBLEMS OF UN-NORMALIZED FORMS
• Insertion Anomalies: inability to represent
certain information
• Example: if one wanted to insert the number
and name of a new course in the database, it
would not be possible until a student enrolls
in the course, because a tuple needs values
for both sno and cno.
PROBLEMS OF UN-NORMALIZED FORMS
• Deletion Anomalies: loss of useful information. In
some instances, useful information may be lost when a
tuple is deleted.
• For example, if we delete the tuple corresponding to
student 85001 doing CP304, we will lose relevant
information about course CP304 (viz. course name,
instructor, office number) if student 85001 was the
only student enrolled in that course.
• Similarly, deleting course CP302 from the database
may remove all information about the student named
Jones. These are called deletion anomalies.
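Both deletion examples can be reproduced with a short Python sketch over the same four rows. The tuple layout and the helper name `course_info` are assumptions made for illustration.

```python
# Rows of the un-normalized student relation:
# (sno, sname, address, cno, cname, instructor, office)
student = [
    ("85001", "Smith", "1, Main", "CP302", "Database", "Gupta", "102"),
    ("85001", "Smith", "1, Main", "CP303", "Communication", "Wilson", "102"),
    ("85001", "Smith", "1, Main", "CP304", "Software Engg.", "Williams", "1024"),
    ("85005", "Jones", "12, 7th", "CP302", "Database", "Gupta", "102"),
]


def course_info(rows, cno):
    """Return the (cname, instructor, office) facts recorded for course cno."""
    return {(r[4], r[5], r[6]) for r in rows if r[3] == cno}


# Delete the enrolment of student 85001 in CP304.
after_enrolment_delete = [
    r for r in student if not (r[0] == "85001" and r[3] == "CP304")
]
before = course_info(student, "CP304")
after = course_info(after_enrolment_delete, "CP304")  # nothing is left

# Delete every row for course CP302: Jones disappears with it.
after_course_delete = [r for r in student if r[3] != "CP302"]
jones_left = any(r[1] == "Jones" for r in after_course_delete)
```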
PROBLEMS OF UN-NORMALIZED FORMS
• Root cause of the problems: combining more
than one entity in a single table
• Solution: decompose the table into individual
entities – that is normalization
Functional dependence
• Saying that column Y is functionally dependent upon X is the
same as saying the values of column X identify the values of
column Y.
• If column X is a primary key, then all columns in the relational
table R must be functionally dependent upon X.
• A short-hand notation for describing a functional dependency
is: R.x —> R.y, which can be read as: in the relational table
named R, column x functionally determines (identifies)
column y.
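As a rough illustration, a functional dependency can be tested against sample rows in Python. The function name `holds` and the dictionary layout are assumptions for this sketch. A check like this can only show that a dependency is violated by the data at hand; it cannot prove that the dependency is a rule of the design.

```python
def holds(rows, x, y):
    """Return True if the functional dependency x -> y holds in rows.

    rows is a list of dicts; x and y are tuples of column names.
    """
    seen = {}
    for row in rows:
        x_val = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        # setdefault stores y_val the first time x_val is seen and
        # returns the stored value on every later occurrence.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# A subset of the columns of the student relation.
student = [
    {"sno": "85001", "sname": "Smith", "cno": "CP302", "instructor": "Gupta"},
    {"sno": "85001", "sname": "Smith", "cno": "CP303", "instructor": "Wilson"},
    {"sno": "85001", "sname": "Smith", "cno": "CP304", "instructor": "Williams"},
    {"sno": "85005", "sname": "Jones", "cno": "CP302", "instructor": "Gupta"},
]

sno_determines_sname = holds(student, ("sno",), ("sname",))
cno_determines_instructor = holds(student, ("cno",), ("instructor",))
sno_determines_cno = holds(student, ("sno",), ("cno",))
```

Here sno identifies sname and cno identifies instructor, but sno does not identify cno, because student 85001 is enrolled in three courses.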
Full functional dependence
• Full functional dependence applies to tables
with composite keys.
• Full functional dependence means that when
a primary key is composite, made of two or
more columns, then the other columns must
be identified by the entire key and not just
some of the columns that make up the key.
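A minimal sketch of this test in Python: a column is fully dependent on a composite key when the whole key determines it and no proper subset of the key does. The function names and the `grade` column are hypothetical, added only so that the example has one column that depends on the entire key.

```python
from itertools import combinations


def holds(rows, x, y):
    """Return True if the functional dependency x -> y holds in rows."""
    seen = {}
    for row in rows:
        x_val = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def fully_dependent(rows, key, y):
    """True if key -> y holds and no proper, non-empty subset of key gives y."""
    if not holds(rows, key, y):
        return False
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if holds(rows, subset, y):
                return False  # partial dependency found
    return True


# Hypothetical enrolment rows with the composite key (sno, cno).
enrolment = [
    {"sno": "85001", "cno": "CP302", "sname": "Smith", "grade": "A"},
    {"sno": "85001", "cno": "CP303", "sname": "Smith", "grade": "B"},
    {"sno": "85005", "cno": "CP302", "sname": "Jones", "grade": "B"},
]

key = ("sno", "cno")
grade_is_full = fully_dependent(enrolment, key, ("grade",))
sname_is_full = fully_dependent(enrolment, key, ("sname",))
```

In this sample, `grade` needs both sno and cno, while `sname` is already determined by sno alone and is therefore only partially dependent on the key.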
Normalization Processes
First Normal Form
• Eliminate duplicate columns and multi-valued
columns from the same table.
• Create separate tables for each group of
related data and identify each row with a
unique column or set of columns (the primary key).
Not in 1NF
Manager  Subordinates
Bob      Jim, Mary, Beth
Mary     Mike, Jason, Carol, Mark
Jim      Alan
In 1NF
Manager  Subordinate
Bob      Jim
Bob      Mary
Bob      Beth
Mary     Mike
Mary     Jason
Mary     Carol
Mary     Mark
Jim      Alan
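The conversion shown in the two tables above can be sketched in Python as follows. The function name `to_1nf` is an assumption for illustration, as is the choice to store the multi-valued column as a comma-separated string.

```python
# The table that is not in 1NF: the second column holds a list of values.
not_1nf = [
    ("Bob", "Jim, Mary, Beth"),
    ("Mary", "Mike, Jason, Carol, Mark"),
    ("Jim", "Alan"),
]


def to_1nf(rows):
    """Split each comma-separated subordinate list into one row per name."""
    return [
        (manager, name.strip())
        for manager, subordinates in rows
        for name in subordinates.split(",")
    ]


in_1nf = to_1nf(not_1nf)
```

Each row of the result holds exactly one manager and one subordinate, so every column value is atomic.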
Second Normal Form
• Tables with composite primary keys can be in
1NF but not in 2NF.
• A relational table is in second normal form
2NF if it is in 1NF and every non-key column is
fully dependent upon the primary key.
• That is, every non-key column must be
dependent upon the entire primary key.
• To check for 2NF, examine how each non-key
column depends on the primary key.
In 1NF but not in 2NF
Functional Dependence in table FIRST
• FIRST is in 1NF but not in 2NF because status
and city are functionally dependent only on
the column s# of the composite key (s#, p#).
• This can be illustrated by listing the functional
dependencies in the table:
s# —> city, status
(s#,p#) —>qty
Anomalies in table FIRST (previous table)
• INSERT. The fact that a certain supplier (s5) is located
in a particular city (Athens) cannot be added until
that supplier has supplied a part.
• DELETE. If a row is deleted, then not only is the
information about quantity and part lost but also
information about the supplier.
• UPDATE. If supplier s1 moved from London to New
York, then six rows would have to be updated with
this new information.
Transforming a 1NF table to 2NF
• Identify any determinants other than the composite
key, and the columns they determine.
• Create a new table for each determinant and the
unique columns it determines.
• Move the determined columns from the original
table to the new table. The determinant becomes
the primary key of the new table.
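The steps above can be sketched in Python as two projections of FIRST. The rows of FIRST are not reproduced in the text, so the sample data here is illustrative only, and the helper name `project` is an assumption. The table names SUPPLIER and PARTS follow the later slides.

```python
# Illustrative rows of FIRST, with composite key (s#, p#).
first = [
    {"s#": "s1", "status": 20, "city": "London", "p#": "p1", "qty": 300},
    {"s#": "s1", "status": 20, "city": "London", "p#": "p2", "qty": 200},
    {"s#": "s2", "status": 10, "city": "Paris", "p#": "p1", "qty": 300},
    {"s#": "s2", "status": 10, "city": "Paris", "p#": "p2", "qty": 400},
    {"s#": "s3", "status": 10, "city": "Paris", "p#": "p2", "qty": 200},
]


def project(rows, columns):
    """Project rows onto columns, dropping duplicate result rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# s# is a determinant other than the composite key: s# -> city, status.
# It becomes the primary key of the new table SUPPLIER.
supplier = project(first, ["s#", "status", "city"])

# The remaining table keeps the composite key and the column that
# depends on the whole key: (s#, p#) -> qty.
parts = project(first, ["s#", "p#", "qty"])
```

After the split, each supplier's city and status are stored once, so a move from one city to another touches a single row of SUPPLIER.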
In 2NF
Third Normal Form

• The third normal form requires that all columns in a
relational table are dependent only upon the primary
key.
• A relational table is in third normal form (3NF) if it is
already in 2NF and all non-key attributes are
functionally dependent only upon the primary key.
• The non-key attributes should not depend on other
non-key attributes.
• To check for 3NF, examine the dependencies among
the non-key columns.
Third Normal Form
• Table PARTS is already in 3NF. The non-key column, qty, is fully
dependent upon the primary key (s#, p#).
• SUPPLIER is in 2NF but not in 3NF because it contains a
transitive dependency.
• The concept of a transitive dependency can be illustrated by
showing the functional dependencies in SUPPLIER:
– SUPPLIER.s# —> SUPPLIER.status
– SUPPLIER.s# —> SUPPLIER.city
– SUPPLIER.city —> SUPPLIER.status
• Note that SUPPLIER.status is determined both by the primary
key s# and the non-key column city.
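A transitive dependency of this kind can be checked on sample data with a small Python sketch. The rows of SUPPLIER are illustrative only, and the function name `holds` is an assumption.

```python
def holds(rows, x, y):
    """Return True if the functional dependency x -> y holds in rows."""
    seen = {}
    for row in rows:
        x_val = tuple(row[c] for c in x)
        y_val = tuple(row[c] for c in y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# Illustrative rows of SUPPLIER, with primary key s#.
supplier = [
    {"s#": "s1", "status": 20, "city": "London"},
    {"s#": "s2", "status": 10, "city": "Paris"},
    {"s#": "s3", "status": 10, "city": "Paris"},
    {"s#": "s4", "status": 20, "city": "London"},
]

key_to_city = holds(supplier, ("s#",), ("city",))
city_to_status = holds(supplier, ("city",), ("status",))
city_is_key = holds(supplier, ("city",), ("s#",))

# status depends on s# only by way of the non-key column city.
has_transitive_dependency = key_to_city and city_to_status and not city_is_key
```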
Transforming a table into 3NF
• Identify any determinants, other than the primary
key, and the columns they determine.
• Create a new table for each determinant and the
columns it determines.
• The determinant becomes the primary key of the
new table.
• To transform SUPPLIER into 3NF, we create a new
table called CITY_STATUS and move the columns city
and status into it.
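A minimal Python sketch of this step, using illustrative SUPPLIER rows. The helper name `project` and the name `supplier_city` for the table that remains are assumptions; only CITY_STATUS is named in the text.

```python
# Illustrative rows of SUPPLIER, with primary key s#.
supplier = [
    {"s#": "s1", "status": 20, "city": "London"},
    {"s#": "s2", "status": 10, "city": "Paris"},
    {"s#": "s3", "status": 10, "city": "Paris"},
    {"s#": "s4", "status": 20, "city": "London"},
]


def project(rows, columns):
    """Project rows onto columns, dropping duplicate result rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# city is the determinant, so it becomes the key of CITY_STATUS.
city_status = project(supplier, ["city", "status"])

# What remains of SUPPLIER keeps city as a link to CITY_STATUS.
supplier_city = project(supplier, ["s#", "city"])

# Joining the two tables on city rebuilds the original rows,
# so no information was lost by the decomposition.
status_of = {row["city"]: row["status"] for row in city_status}
rejoined = [
    {"s#": row["s#"], "status": status_of[row["city"]], "city": row["city"]}
    for row in supplier_city
]
```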
In 3NF
The table before normalization
In 1NF but not in 2NF
Final tables after normalization
The Object-Oriented Model
• Easy integration of existing software modules (objects /
components) into newly developed software systems.
• Process begins with OOA and OOD
• Then, acquire suitable components from reusable software
component libraries (or purchase them).
• Otherwise, develop as needed.
• Can involve adding to repertoire of library components.
• Economy: integrating reusable components; much lower
cost than developing
• Improved quality – using tested components
• Shorter development times: integration of reusable
software components.