Database Normalization - New
Database normalization is the process of organizing the fields and tables of a relational
database to minimize redundancy.
Normalization usually involves dividing large tables into smaller (and less redundant)
tables and defining relationships between them.
The objective is to isolate data so that additions, deletions, and modifications of a field
can be made in just one table and then propagated through the rest of the database
using the defined relationships.
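To make the "change it in just one table" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (student, enrollment), the columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database with two related tables (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
    CREATE TABLE enrollment (
        stu_id INTEGER REFERENCES student(stu_id),
        course TEXT,
        PRIMARY KEY (stu_id, course)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO enrollment VALUES
        (1, 'Databases'), (1, 'Networks'), (2, 'Databases');
""")

# The name is stored once, so a single UPDATE corrects it everywhere.
conn.execute("UPDATE student SET stu_name = 'Asha K.' WHERE stu_id = 1")

# The relationship (stu_id) carries the change into every joined row.
rows = conn.execute("""
    SELECT s.stu_name, e.course
    FROM enrollment AS e
    JOIN student AS s ON s.stu_id = e.stu_id
    ORDER BY s.stu_id, e.course
""").fetchall()
print(rows)
```

Had the name been repeated in every enrollment row, the same correction would have required updating several rows, with the risk of missing one.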
Edgar F. Codd introduced the concept of normalization and the First Normal Form (1NF)
in 1970. He went on to define the Second Normal Form (2NF) and Third Normal Form (3NF)
in 1971.
Codd and Raymond F. Boyce defined the Boyce-Codd Normal Form (BCNF) in 1974.
First Normal Form:
Each row of data must have a unique identifier, i.e. a primary key. For example, consider a
table that is not in First Normal Form.
[Unorganized relation]
[ Relation in 1NF ]
Second Normal Form:
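Before we learn about Second Normal Form, we need to understand the following:
Prime attribute: an attribute that is part of a candidate key.
Non-prime attribute: an attribute that is not part of any candidate key.
In Second Normal Form, every non-prime attribute must be fully functionally dependent on
the key. That is, if X → A holds, then there must be no proper subset Y of X for which
Y → A also holds.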
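The proper-subset condition can be tried out on sample data. The sketch below is only an illustration: the helper names (holds, partial_determinants) and the sample rows are hypothetical, and the attribute names are borrowed from the Student_Project relation discussed in this section. Note that a functional dependency is a rule about the schema, so a check against sample rows can only show that the data does not contradict it.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def partial_determinants(rows, key, attr):
    """List the proper, non-empty subsets Y of key for which Y -> attr holds."""
    return [
        subset
        for size in range(1, len(key))
        for subset in combinations(key, size)
        if holds(rows, subset, attr)
    ]

# Hypothetical sample data.
rows = [
    {"Stu_ID": 1, "Proj_ID": 10, "Stu_Name": "Asha", "Proj_Name": "Compiler"},
    {"Stu_ID": 1, "Proj_ID": 20, "Stu_Name": "Asha", "Proj_Name": "Search"},
    {"Stu_ID": 2, "Proj_ID": 10, "Stu_Name": "Ben", "Proj_Name": "Compiler"},
]
key = ("Stu_ID", "Proj_ID")

print(partial_determinants(rows, key, "Stu_Name"))   # [('Stu_ID',)]
print(partial_determinants(rows, key, "Proj_Name"))  # [('Proj_ID',)]
```

A non-empty result means the attribute depends on only part of the key, which is exactly what Second Normal Form forbids.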
In the Student_Project relation, the prime key attributes are Stu_ID and Proj_ID.
According to the rule, the non-key attributes Stu_Name and Proj_Name must depend on
both together, and not on either prime key attribute individually. However, Stu_Name
can be identified by Stu_ID alone and Proj_Name can be identified by Proj_ID alone.
This is called partial dependency, which is not allowed in Second Normal Form.
[Relation in 2NF]
We split the relation in two, as depicted in the picture above, so that no partial
dependency remains.
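As a rough sketch of such a split, the code below projects a Student_Project table onto smaller relations. The sample rows and the helper name (project) are hypothetical. The third relation, which keeps only the (Stu_ID, Proj_ID) pairs, is an assumption added here so that the original rows can be rebuilt; the picture may draw the split differently.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    unique = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in unique]

# Hypothetical sample data for the unnormalized relation.
student_project = [
    {"Stu_ID": 1, "Proj_ID": 10, "Stu_Name": "Asha", "Proj_Name": "Compiler"},
    {"Stu_ID": 1, "Proj_ID": 20, "Stu_Name": "Asha", "Proj_Name": "Search"},
    {"Stu_ID": 2, "Proj_ID": 10, "Stu_Name": "Ben", "Proj_Name": "Compiler"},
]

# Each non-key attribute moves to the table whose key fully determines it.
student = project(student_project, ("Stu_ID", "Stu_Name"))
projects = project(student_project, ("Proj_ID", "Proj_Name"))
assignment = project(student_project, ("Stu_ID", "Proj_ID"))  # assumed link table

# Rejoin the pieces to confirm that no information was lost.
names = {s["Stu_ID"]: s["Stu_Name"] for s in student}
titles = {p["Proj_ID"]: p["Proj_Name"] for p in projects}
rejoined = [
    {**a, "Stu_Name": names[a["Stu_ID"]], "Proj_Name": titles[a["Proj_ID"]]}
    for a in assignment
]

print(len(student), len(projects), len(assignment))  # 2 2 3
print(rejoined == student_project)                   # True
```

Each student name and project name is now stored once, and the join on the key columns reproduces the original rows.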
Third Normal Form:
Transitive dependency
A transitive dependency is an indirect functional dependency, one in
which X→Z only by virtue of X→Y and Y→Z.
Super key
A superkey is a combination of attributes that can be used to uniquely identify a
database record. A table might have many superkeys.
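Both definitions can be explored with the standard attribute-closure computation: starting from a set of attributes, keep adding whatever the known functional dependencies determine. The sketch below is a minimal illustration; the function names and the abstract attributes X, Y, Z are chosen here for the example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once all of lhs is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attrs, fds) >= set(all_attrs)

# X -> Y and Y -> Z, so X -> Z holds transitively.
fds = [(("X",), ("Y",)), (("Y",), ("Z",))]
all_attrs = ("X", "Y", "Z")

print(sorted(closure(("X",), fds)))              # ['X', 'Y', 'Z']
print(is_superkey(("X",), all_attrs, fds))       # True
print(is_superkey(("X", "Y"), all_attrs, fds))   # True
print(is_superkey(("Y",), all_attrs, fds))       # False
```

Z appears in the closure of X only because of the chain through Y, which is the transitive dependency. Both {X} and {X, Y} are superkeys, which shows how one table can have many of them.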
For a relation to be in Third Normal Form, it must be in Second Normal Form and, for
every non-trivial functional dependency X → A, at least one of the following must hold:
X is a superkey, or
A is a prime attribute.
[Relation not in 3NF]
In the Student_detail relation depicted above, Stu_ID is the key and the only prime
key attribute.
City can be identified by Stu_ID as well as by Zip. Zip is not a superkey, and City is
not a prime attribute. Additionally, Stu_ID → Zip → City, so a transitive dependency
exists.
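The same reasoning can be written as a small check. This is a sketch under stated assumptions: the Stu_Name attribute, the helper names, the ZipCodes relation name, and the split shown at the end are introduced here for illustration, and the check only examines the dependencies that are listed explicitly.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(all_attrs, fds, candidate_keys):
    """Return (X, A) pairs where X is not a superkey and A is not prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) >= set(all_attrs)
        for attr in rhs:
            if attr in lhs or lhs_is_superkey or attr in prime:
                continue  # trivial, or allowed by one of the two conditions
            violations.append((lhs, attr))
    return violations

# Student_detail; Stu_Name is an assumed extra attribute.
attrs = ("Stu_ID", "Stu_Name", "City", "Zip")
fds = [(("Stu_ID",), ("Stu_Name", "Zip")), (("Zip",), ("City",))]
print(third_nf_violations(attrs, fds, [("Stu_ID",)]))  # [(('Zip',), 'City')]

# One possible fix: move City to its own relation keyed by Zip.
student_fds = [(("Stu_ID",), ("Stu_Name", "Zip"))]
zip_fds = [(("Zip",), ("City",))]
print(third_nf_violations(("Stu_ID", "Stu_Name", "Zip"), student_fds, [("Stu_ID",)]))  # []
print(third_nf_violations(("Zip", "City"), zip_fds, [("Zip",)]))                       # []
```

The only violation reported is Zip → City, matching the argument above. After City moves to a table keyed by Zip, neither relation has a listed dependency that breaks the rule.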
That is the complete process. Having started off with an unnormalized table we
finished with four normalized tables in 3NF. You will notice that duplication has been
removed (apart from the keys needed to establish the links between those tables).
The process may look complicated. However, if you follow the rules completely,
and do not miss out any steps, then you should arrive at the correct solution. If you
omit a rule there is a high probability that you will end up with too few tables or
incorrect keys.
2. Second normal form: A table is in the second normal form if it is in the first
normal form and contains only columns that are dependent on the whole
(primary) key.
3. Third normal form: A table is in the third normal form if it is in the second
normal form and all the non-key columns are dependent only on the primary
key. If the value of a non-key column is dependent on the value of another
non-key column, we have a situation known as transitive dependency. This can be
resolved by moving the columns that depend on non-key items to another table.
Solve this Example:
Q. Convert the table below to normal form.
UNF
Answer:
1NF
2NF
3NF