Database Management Systems: Normalization

Normalization is the process of organizing data in a database to minimize redundancy and undesirable dependencies. It involves breaking tables into smaller tables and linking them through their keys. The goals are to simplify inserts, updates, and deletes while reducing data anomalies. Data passes through several normal forms: first normal form (1NF) ensures atomic values and removes repeating groups, and the higher normal forms (2NF, 3NF, BCNF, and eventually 5NF) remove all dependencies except those implied by the keys. Normalization improves data quality, can reduce storage space, and makes the relationships between data explicit.

Uploaded by

Karthik Krish
Copyright
© Attribution Non-Commercial (BY-NC)

Database Management Systems

Normalization

What is Normalization?
In relational database theory, normalization is the process of restructuring the logical data model of a database to eliminate redundancy, organize data efficiently, and reduce the potential for anomalies during data operations. Normalization may also improve data consistency and simplify future extension of the logical data model. The formal classifications used for describing a relational database's level of normalization are called normal forms, abbreviated NF.

Normalization
Is a process of removing anomalies by splitting a relation into two or more relations. The classes of normal form are 1NF, 2NF, 3NF, BCNF (Boyce-Codd normal form), 4NF and 5NF.

Normal Forms
The first normal form requires that tables be made up of a primary key and a number of atomic fields, and the second and third deal with the relationship of non-key fields to the primary key. These have been summarized as requiring that all non-key fields be dependent on "the key, the whole key and nothing but the key". In practice, most applications in 3NF are fully normalized. However, research has identified potential update anomalies in 3NF databases. BCNF (Boyce-Codd normal form) is a further refinement of 3NF that attempts to eliminate such anomalies. The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships: 4NF addresses multivalued dependencies and 5NF addresses join dependencies. Sixth normal form (6NF) was defined in the context of temporal databases.

Benefits of Normalization
Less storage space
Quicker updates
Less data inconsistency
Clearer data relationships
Easier to add data
Flexible structure

Data redundancy and update anomalies


Problems associated with data redundancy are illustrated by comparing the Staff and Branch tables with the StaffBranch table.

The StaffBranch table has redundant data; the branch information is repeated for every member of staff. In contrast, the branch information appears only once for each branch in the Branch table, and only the branch number (branchNo) is repeated in the Staff table, to represent where each member of staff is located.

Tables that contain redundant information may potentially suffer from update anomalies.

Types of update anomalies include insertion, deletion, and modification.
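
To make the three anomaly types concrete, here is a minimal Python sketch. It assumes a StaffBranch table held as a list of dictionaries; the sample rows and the `name` and `branchAddress` columns are hypothetical, since the original figures are not reproduced in the text.

```python
# Hypothetical sample rows; only the table name and branchNo come from the text.
staff_branch = [
    {"staffNo": "S1", "name": "Ann", "branchNo": "B1", "branchAddress": "1 High St"},
    {"staffNo": "S2", "name": "Bob", "branchNo": "B1", "branchAddress": "1 High St"},
    {"staffNo": "S3", "name": "Cy", "branchNo": "B2", "branchAddress": "9 Low Rd"},
]


def addresses_per_branch(rows):
    """Map each branchNo to the set of addresses recorded for it."""
    seen = {}
    for row in rows:
        seen.setdefault(row["branchNo"], set()).add(row["branchAddress"])
    return seen


# Modification anomaly: B1's address is changed on one row but not the other,
# so the table now holds two different addresses for the same branch.
staff_branch[0]["branchAddress"] = "5 New St"
inconsistent = {
    branch
    for branch, addrs in addresses_per_branch(staff_branch).items()
    if len(addrs) > 1
}

# Deletion anomaly: deleting the only member of staff at B2 also loses B2's details.
remaining = [row for row in staff_branch if row["staffNo"] != "S3"]
lost_branches = {"B2"} - {row["branchNo"] for row in remaining}

# Insertion anomaly: a new branch with no staff yet has no staffNo to store,
# so it cannot be recorded without inventing a placeholder member of staff.

print(inconsistent, lost_branches)
```

With separate Staff and Branch tables, each branch address would be stored once, so none of these three situations could arise in this form.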

Relationship of Normal Forms

Stages of Normalisation
Remove repeating groups -> First normal form (1NF)
Remove partial dependencies -> Second normal form (2NF)
Remove transitive dependencies -> Third normal form (3NF)
Remove remaining functional dependency anomalies -> Boyce-Codd normal form (BCNF)
Remove multivalued dependencies -> Fourth normal form (4NF)
Remove remaining anomalies -> Fifth normal form (5NF)

First Normal Form


First normal form (1NF) lays the groundwork for an organized database design:
Ensure that each table has a primary key: a minimal set of attributes that uniquely identifies a record.
Eliminate repeating groups (categories of data that would need to appear a different number of times on different records) by defining keyed and non-keyed attributes appropriately.
Atomicity: each attribute must contain a single value, not a set of values.
A table is in 1NF when the intersection of every column and record contains only one value.

Branch table is not in 1NF

Converting Branch table to 1NF
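
The conversion can be sketched in Python as follows. This is only an illustration under assumptions: the original figure is not reproduced here, so the rows, the `branchAddress` column, and the multi-valued `telNos` column are hypothetical stand-ins for a repeating group.

```python
# Hypothetical unnormalized Branch rows: telNos holds a set of values,
# so the column/record intersection is not atomic.
branch_unnormalized = [
    {"branchNo": "B1", "branchAddress": "1 High St", "telNos": ["555-0101", "555-0102"]},
    {"branchNo": "B2", "branchAddress": "9 Low Rd", "telNos": ["555-0201"]},
]


def to_1nf(rows):
    """Split the repeating group into its own table keyed by (branchNo, telNo)."""
    branch = [
        {"branchNo": r["branchNo"], "branchAddress": r["branchAddress"]}
        for r in rows
    ]
    branch_tel = [
        {"branchNo": r["branchNo"], "telNo": tel}
        for r in rows
        for tel in r["telNos"]
    ]
    return branch, branch_tel


branch, branch_tel = to_1nf(branch_unnormalized)
for row in branch_tel:
    print(row)
```

Moving the repeating group to its own table, rather than padding the Branch table with telNo1, telNo2, and so on, is one of several possible ways to reach 1NF; it is chosen here because it places no limit on the number of values per branch.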

Second Normal Form


Second normal form (2NF): if a table has a composite key, all attributes must be related to the whole key.
The database must meet all the requirements of the first normal form.
Data that is redundantly duplicated across multiple rows of a table is moved out to a separate table.

TempStaffAllocation table is not in 2NF

Second normal form (2NF)


The formal definition of 2NF is a table that is in 1NF and in which every non-primary-key column is fully functionally dependent on the primary key. Full functional dependency means that, if A and B are columns (or sets of columns) of a table, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.

Converting TempStaffAllocation table to 2NF
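
The full functional dependency test and the resulting split can be sketched in Python. The rows and the columns `name` and `hoursPerWeek` are hypothetical, as is the composite key (staffNo, branchNo); the original figure is not reproduced here. Note that checking a dependency against sample rows can only show that the data does not contradict it; real functional dependencies are a property of the schema's meaning.

```python
from itertools import combinations

# Hypothetical TempStaffAllocation rows with composite key (staffNo, branchNo).
rows = [
    {"staffNo": "S1", "branchNo": "B1", "name": "Ann", "hoursPerWeek": 16},
    {"staffNo": "S1", "branchNo": "B2", "name": "Ann", "hoursPerWeek": 9},
    {"staffNo": "S2", "branchNo": "B1", "name": "Bob", "hoursPerWeek": 14},
    {"staffNo": "S2", "branchNo": "B2", "name": "Bob", "hoursPerWeek": 10},
]
key = ("staffNo", "branchNo")


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (subset, attribute) pairs where a proper subset of the key determines the attribute."""
    found = []
    for attr in non_key:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found


def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed."""
    result = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in result:
            result.append(item)
    return result


partial = partial_dependencies(rows, key, ("name", "hoursPerWeek"))

# name depends on staffNo alone, so it moves to its own table with that determinant.
staff = project(rows, ("staffNo", "name"))
allocation = project(rows, ("staffNo", "branchNo", "hoursPerWeek"))
print(partial, len(staff), len(allocation))
```

After the split, each staff name is stored once in `staff`, and `allocation` keeps only the column that depends on the whole key.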

Third Normal Form


Third normal form (3NF) requires that data stored in a table be dependent only on the primary key, and not on any other field in the table.
The database must meet all the requirements of the second normal form.
Any field that is dependent not only on the primary key but also on another field is moved out to a separate table.
The formal definition of 3NF is a table that is in 1NF and 2NF and in which no non-primary-key column is transitively dependent on the primary key.

Converting the StaffBranch table to 3NF
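
A minimal Python sketch of this conversion follows, assuming the transitive dependency staffNo -> branchNo -> branchAddress. The rows and the `name` and `branchAddress` columns are hypothetical, since the original figure is not reproduced here. The sketch also checks that joining the two new tables gives back the original rows.

```python
# Hypothetical StaffBranch rows: branchAddress depends on branchNo,
# which in turn depends on the primary key staffNo.
staff_branch = [
    {"staffNo": "S1", "name": "Ann", "branchNo": "B1", "branchAddress": "1 High St"},
    {"staffNo": "S2", "name": "Bob", "branchNo": "B1", "branchAddress": "1 High St"},
    {"staffNo": "S3", "name": "Cy", "branchNo": "B2", "branchAddress": "9 Low Rd"},
]


def project(rows, attrs):
    """Projection onto attrs as a set of tuples (duplicates removed)."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Move the transitively dependent column out, with a copy of its determinant.
staff = project(staff_branch, ("staffNo", "name", "branchNo"))
branch = project(staff_branch, ("branchNo", "branchAddress"))

# Natural join on branchNo to confirm that no information was lost.
address_of = dict(branch)
rejoined = {(s, n, b, address_of[b]) for (s, n, b) in staff}
original = project(staff_branch, ("staffNo", "name", "branchNo", "branchAddress"))

print(sorted(branch))
print(rejoined == original)
```

Each branch address now appears once in `branch`, while `staff` repeats only branchNo, matching the Staff and Branch tables described earlier.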

Boyce-Codd Normal Form


Boyce-Codd normal form (BCNF) requires that there be no non-trivial functional dependencies of attributes on anything other than a superset of a candidate key (called a superkey). A relation is in BCNF if and only if every determinant is a candidate key.

3NF to BCNF
Identify all candidate keys in the relation.
Identify all functional dependencies in the relation.
If functional dependencies exist in the relation whose determinants are not candidate keys for the relation, remove those functional dependencies by placing them in a new relation along with a copy of their determinant.
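
The steps above can be sketched in Python using attribute closure to test whether a determinant is a superkey. The relation (student, course, tutor) and its two functional dependencies are a hypothetical example, not taken from the slides, and the helper names are illustrative.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Non-trivial dependencies whose determinant is not a superkey of the relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= set(relation)
    ]


def decompose(relation, violation):
    """Place the violating dependency in a new relation with a copy of its determinant."""
    lhs, rhs = violation
    new_relation = tuple(lhs) + tuple(a for a in rhs if a not in lhs)
    remaining = tuple(a for a in relation if a in lhs or a not in rhs)
    return new_relation, remaining


# Hypothetical relation: each tutor teaches one course, and a student has
# one tutor per course. It is in 3NF but not in BCNF.
relation = ("student", "course", "tutor")
fds = [
    (("student", "course"), ("tutor",)),
    (("tutor",), ("course",)),
]

violations = bcnf_violations(relation, fds)
new_relation, remaining = decompose(relation, violations[0])
print(violations, new_relation, remaining)
```

In this example the dependency (student, course) -> tutor can no longer be checked within a single relation after the split, which is a known trade-off of BCNF decomposition.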

Fourth Normal Form (4NF)


4NF: a relation that is in Boyce-Codd normal form and contains no non-trivial multivalued dependencies (MVDs). Moving from BCNF to 4NF involves removing the MVD from the relation by placing the attribute(s) in a new relation along with a copy of the determinant(s).
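
As a rough illustration, the Python sketch below tests a multivalued dependency on sample data and then splits the relation. The relation (branchNo, staffName, ownerName) and its rows are hypothetical; the check assumes a three-column relation in which staff and owners of a branch are independent of each other.

```python
from itertools import product

# Hypothetical rows: every staff member of a branch is paired with every owner.
rows = [
    {"branchNo": "B1", "staffName": "Ann", "ownerName": "Carol"},
    {"branchNo": "B1", "staffName": "Ann", "ownerName": "Dave"},
    {"branchNo": "B1", "staffName": "Bob", "ownerName": "Carol"},
    {"branchNo": "B1", "staffName": "Bob", "ownerName": "Dave"},
]


def mvd_holds(rows, x, y, z):
    """True if, for each x value, the (y, z) pairs form a full cross product."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True


has_mvd = mvd_holds(rows, "branchNo", "staffName", "ownerName")

# Remove the MVD: each dependent attribute goes to its own relation
# along with a copy of the determinant.
branch_staff = {(r["branchNo"], r["staffName"]) for r in rows}
branch_owner = {(r["branchNo"], r["ownerName"]) for r in rows}

# Joining the two relations on branchNo gives back the original rows.
rejoined = {(b, s, o) for (b, s) in branch_staff for (b2, o) in branch_owner if b == b2}
original = {(r["branchNo"], r["staffName"], r["ownerName"]) for r in rows}
print(has_mvd, len(branch_staff), len(branch_owner), rejoined == original)
```

The four original rows reduce to two rows in each new relation, and adding a new owner now takes one row instead of one row per staff member.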

Normalization BCNF to 4NF Relations

Fifth Normal Form


Fifth normal form (5NF, also called project-join normal form, PJ/NF) requires that there are no non-trivial join dependencies that do not follow from the key constraints. A table is in 5NF if and only if it is in 4NF and every join dependency in it is implied by the candidate keys.
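
A join dependency can be illustrated with a small Python sketch. The supplier/part/project relation and its rows are hypothetical and are not taken from the slides. The function rebuilds a three-column relation from its three two-column projections; the join dependency holds on the sample data only if the rebuilt relation equals the original.

```python
def join_of_projections(rows):
    """Join the (s, p), (p, j) and (s, j) projections of a three-column relation."""
    sp = {(s, p) for s, p, j in rows}
    pj = {(p, j) for s, p, j in rows}
    sj = {(s, j) for s, p, j in rows}
    return {
        (s, p, j)
        for (s, p) in sp
        for (p2, j) in pj
        if p2 == p and (s, j) in sj
    }


# Hypothetical (supplier, part, project) rows that satisfy the join dependency.
satisfying = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
    ("s1", "p1", "j1"),
}

# Dropping one row breaks it: the join produces a row that was not stored.
violating = satisfying - {("s1", "p1", "j1")}

print(join_of_projections(satisfying) == satisfying)
print(join_of_projections(violating) - violating)
```

When the dependency holds and is not implied by the candidate keys, the three projections can replace the original table, which is the decomposition that 5NF calls for.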

Domain/Key Normal Form


Domain/key normal form (or DKNF) requires that the database contains no constraints other than domain constraints and key constraints.

Sixth Normal Form


Sixth normal form (6NF) was, as of 2005, only recently proposed; it was defined when extending the relational model to take the temporal dimension (time) into account. As of 2005, most SQL technologies did not take this work into account, and most temporal extensions to SQL were not relational. In addition to normalization, a useful database design may also need to consider how tree-structured (hierarchical) data or logic-based models are implemented on top of the relational model.
