Database Normalization Updated
Normalization
What is Database Normalization?
An anomaly is a problem in the data that is not meant to be there.
Anomalies can occur when a database is not normalised.
We’ll be using a student database as an example, which records student, class,
and teacher information.
Student ID  Student Name   Fees Paid  Course Name       Class 1           Class 2         Class 3
This is not a normalised table, and there are a few issues with this.
Insert Anomaly
An insert anomaly happens when we try to insert a record into this table without knowing all
of the data it requires.
For example, suppose we want to add a new student but do not know their course name.
The new record would look like this:
Student ID  Student Name   Fees Paid  Course Name       Class 1           Class 2         Class 3
1           John Smith     200        Economics         Economics 1       Biology 1
2           Maria Griffin  500        Computer Science  Biology 1         Business Intro  Programming 2
3           Susan Johnson  400        Medicine          Biology 2
4           Matt Long      850        Dentistry
5           Jared Oldham   0          ?
We would be adding incomplete data to our table, which can cause issues when trying to analyse this data.
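The insert anomaly can be reproduced with a small sketch using Python's built-in sqlite3 module. The table and column names (student, course_name, class_1 and so on) are assumptions chosen to mirror the example table, not a schema given by the source.

```python
import sqlite3

# In-memory database holding the unnormalised table (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT,
        fees_paid    INTEGER,
        course_name  TEXT,
        class_1      TEXT,
        class_2      TEXT,
        class_3      TEXT
    )
""")

# The course name is not known yet, so it is simply left out and stored as NULL.
conn.execute(
    "INSERT INTO student (student_id, student_name, fees_paid) VALUES (?, ?, ?)",
    (5, "Jared Oldham", 0),
)

# Any later analysis has to cope with students that have no course.
rows = conn.execute(
    "SELECT student_name FROM student WHERE course_name IS NULL"
).fetchall()
missing_course = [name for (name,) in rows]
print(missing_course)
```

Declaring course_name as NOT NULL would reject the row instead, which trades incomplete data for being unable to record the student at all.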
Update Anomaly
An update anomaly happens when we want to update a piece of data but change only some of its
copies, leaving the others out of date.
For example, let’s say the class Biology 1 was renamed to “Intro to Biology”. We would have to
search every column that could contain this class name (Class 1, Class 2, and Class 3) and rename each occurrence.
Student ID  Student Name   Fees Paid  Course Name       Class 1           Class 2           Class 3
1           John Smith     200        Economics         Economics 1       Intro to Biology
2           Maria Griffin  500        Computer Science  Intro to Biology  Business Intro    Programming 2
3           Susan Johnson  400        Medicine          Biology 2
4           Matt Long      850        Dentistry
There’s a risk that we miss a value, which would leave the same class stored under two different names.
Ideally, we would only update the value once, in one location.
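As a rough illustration of the update anomaly, the sketch below renames the class in only one column first and then in all three. It uses Python's sqlite3 module; the table layout, column names, and the helper students_with_old_name are assumptions made for this example, and only the columns needed for the rename are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT,
        class_1      TEXT,
        class_2      TEXT,
        class_3      TEXT
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "John Smith", "Economics 1", "Biology 1", None),
        (2, "Maria Griffin", "Biology 1", "Business Intro", "Programming 2"),
    ],
)

def students_with_old_name(conn):
    """Return students that still have the old class name in any class column."""
    rows = conn.execute(
        "SELECT student_name FROM student "
        "WHERE class_1 = 'Biology 1' OR class_2 = 'Biology 1' OR class_3 = 'Biology 1' "
        "ORDER BY student_id"
    ).fetchall()
    return [name for (name,) in rows]

# Incomplete rename: only the Class 1 column is updated.
conn.execute(
    "UPDATE student SET class_1 = ? WHERE class_1 = ?",
    ("Intro to Biology", "Biology 1"),
)
stale_after_partial = students_with_old_name(conn)

# Complete rename: every class column has to be visited.
# The column names come from a fixed tuple, so formatting them into the SQL is safe here.
for column in ("class_1", "class_2", "class_3"):
    conn.execute(
        f"UPDATE student SET {column} = ? WHERE {column} = ?",
        ("Intro to Biology", "Biology 1"),
    )
stale_after_full = students_with_old_name(conn)

print(stale_after_partial, stale_after_full)
```

After the partial update, John Smith still has the old name in Class 2. Only the loop over all three columns removes every occurrence.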
Delete Anomaly
A delete anomaly occurs when we want to delete data from the table but end up deleting more than
we intended.
For example, let’s say Susan Johnson quits and her record needs to be deleted from the system. We could
delete her row:
Student ID  Student Name   Fees Paid  Course Name       Class 1           Class 2         Class 3
1           John Smith     200        Economics         Economics 1       Biology 1
2           Maria Griffin  500        Computer Science  Biology 1         Business Intro  Programming 2
3           Susan Johnson  400        Medicine          Biology 2
4           Matt Long      850        Dentistry
But, if we delete this row, we lose the record of the Biology 2 class, because it’s not stored anywhere else. The same
can be said for the Medicine course.
We should be able to delete one type of data or one record without affecting other records we don’t want
to delete.
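The delete anomaly can be sketched the same way. This example loads the four rows from the table into an in-memory SQLite database and compares the classes and courses known before and after Susan Johnson's row is deleted. The table name, column names, and the helpers all_classes and all_courses are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT,
        fees_paid    INTEGER,
        course_name  TEXT,
        class_1      TEXT,
        class_2      TEXT,
        class_3      TEXT
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "John Smith", 200, "Economics", "Economics 1", "Biology 1", None),
        (2, "Maria Griffin", 500, "Computer Science", "Biology 1", "Business Intro", "Programming 2"),
        (3, "Susan Johnson", 400, "Medicine", "Biology 2", None, None),
        (4, "Matt Long", 850, "Dentistry", None, None, None),
    ],
)

def all_classes(conn):
    """Collect every class name that appears in any of the three class columns."""
    rows = conn.execute(
        "SELECT class_1 FROM student "
        "UNION SELECT class_2 FROM student "
        "UNION SELECT class_3 FROM student"
    ).fetchall()
    return {name for (name,) in rows if name is not None}

def all_courses(conn):
    """Collect every course name recorded in the table."""
    rows = conn.execute("SELECT DISTINCT course_name FROM student").fetchall()
    return {name for (name,) in rows if name is not None}

classes_before, courses_before = all_classes(conn), all_courses(conn)

# Delete only Susan Johnson's row.
conn.execute("DELETE FROM student WHERE student_id = ?", (3,))

# Whatever existed only in her row is now gone from the database.
lost_classes = classes_before - all_classes(conn)
lost_courses = courses_before - all_courses(conn)
print(lost_classes, lost_courses)
```

Biology 1 survives because other students' rows also mention it, while Biology 2 and Medicine were stored only in the deleted row.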
Without any normalization, all information is stored in one table as shown below.
1NF Example
First Normal Form (1NF)
Table 2
Second Normal Form (2NF)
Table 2
Third Normal Form (3NF)