0% found this document useful (0 votes)
2 views

Normalization-2

The document explains the process of database normalization, which involves organizing a database using defined rules called Normal Forms, specifically the first, second, and third normal forms. Each normal form addresses specific issues such as eliminating repeating groups, ensuring all non-primary columns depend on the entire primary key, and removing non-key attributes that depend on other non-key attributes. The document also touches on denormalization, where normalization rules may be broken for performance improvements.

Uploaded by

balakhatun215
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

Normalization-2

The document explains the process of database normalization, which involves organizing a database using defined rules called Normal Forms, specifically the first, second, and third normal forms. Each normal form addresses specific issues such as eliminating repeating groups, ensuring all non-primary columns depend on the entire primary key, and removing non-key attributes that depend on other non-key attributes. The document also touches on denormalization, where normalization rules may be broken for performance improvements.

Uploaded by

balakhatun215
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

https://medium.

com/omarelgabrys-blog/database-normalization-part-7-ef7225150c7f

Database — Normalization
The process of organizing your database through a set of defined steps.

This is a process where you take your database design, and you apply a set of formal criteria of
rules called “Normal Forms”. And we step through them first normal form, second normal form,
and third normal form.

You usually will end up creating a few new tables as part of the process. But, the end result of
normalization is to make your database easier to edit, easier to maintain, and preform operations,
remove any duplicates, & more reliable to work with.

First Normal Form


First normal form requires that every column should have one, and only one value; there is no
multi-valued attribute, and there shouldn’t be repeating groups of data.

So, for example, if you have a customer, and this customer can have two emails; the main email
and the alternative one. Now, you can just put two columns in your table, each column represent
on of these emails (repeating group ☹).

Alternatively, you can have two rows for this customer, one with the main email, and the other
row with the alternative email, while keeping the other columns’ values as they are (repeating
group ☹).

But, “What if you want to extend this, and add more emails?”. We don’t want to have a table that
has repeating group of data.

So, what are going to do is to extract the customer emails in a new table. This new table will have
a foreign key points to the primary key of the original table, and another column for the email
values. Both the foreign key and the email value are together form the primary key of the new
table.

1
First Normal Form

Now, we can say that our tables are in the first normal form, let’s make it in the second normal
form.

Second Normal Form


The problem arise when you have a composite primary key in a table. The second normal form
requires that all of the non-primary columns have to be dependent on the entire (composite)
primary key.

An example could be, let’s say you have a table that stores books information. A book could have
a name, author, and date released.

According to the requirements, we can have book that’s written by more than one author, but we
can’t have more than one book with the same book name and the same author. So, both name and
author form a composite primary key.

2
A book table

Now, the date column relies only on the name of the book, it doesn’t care about who wrote this
book, it cares about the name of the book. And if you noticed we’ve written the date “2011” more
than one time with “Database Systems” book. This is not a good practice when you have
duplicates.

So, what we are going to do is to create a separate table, with one-to-many relationship with the
original table, it has the book name as a primary key, and all the dependent attributes (like date).

Now, the book name in the book table is a foreign key that points to the primary key of the new
created table, and every non-primary column in the book table (if exist) depends on the whole
composite primary key; name and author, and not one of them.

3
Second Normal Form

Now, we can say that our tables are in the second normal form, let’s make it in the third one.

Third Normal Form


The third normal form is concerned with the non-key attribute that rely on another non-key
attributes, and not the primary key.

An example for that could be with the course table, where every course has an name as a primary
key, room number and capacity.

4
Third Normal Form

Now, it’s obviously noted that the capacity depends on the room, it has nothing to do with the
course name. This means we have a non-key attribute that depends on another non-key attribute.

Why this sounds like a problem? Well, “What happens if somebody reached into this table, and
they changed that room for a specific course, but they didn’t change the capacity?”.

So, what we are going to do is the same, we will create a new table with one-to-many relationship
with the original table, it has the room number as a primary key, and all the dependent attributes
like the capacity.

Now, the room number in the course table is a foreign key that points to the primary key of the
new created table.

Third Normal Form

Derived Attribute
Another example of third normal form is when you have a (non-primary) derived attribute that
depends on other non-primary attributes.

5
We would remove that derived column form this table, as we can figure it out when we need to.
One of the main reasons for this is to prevent any conflicts, any inaccurate data to be inserted
mistakenly.

Many database systems offer you the option of defining a computed or calculated column. It’s not
actually stored in the database, it is a convenient read-only fiction. It’s value is automatically
calculated based on the other columns in the table, and you may find that useful from time to
time.

Denormalization
Sometimes you will hit the situation where you will break the normalization rules for performance
improvements, because normalization is often involves splitting data into multiple tables.

If you are following along, you remember at the first normal form, when we explained the
example of emails. “What happens if we added two email columns to the customer table?.”

Technically, this can be described as breaking first normal form. It’s a repeating group. But in
practice, you may find it more convenient to just allow an email1 and email2 columns, because
you’re sure there won’t be a flexible number of email addresses.

This will improve your performance, and save you from creating more tables.

You might also like