Normalization: by JK MCA-Sem-II
Database Normalization
Database normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity, and scalability. In the relational model, methods exist for quantifying how efficient a database is. These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them. Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.
History
Edgar F. Codd first proposed the process of normalization, and what came to be known as the 1st normal form, in his paper "A Relational Model of Data for Large Shared Data Banks". Codd stated: "There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by domains whose elements are atomic (nondecomposable) values."
Normal Form
Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications. Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).
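Table 1
Title | Author1 | Author2 | ISBN | Subject | Pages | Publisher
Database System Concepts | Abraham Silberschatz | Henry F. Korth | 0072958863 | MySQL, Computers | 1168 | McGraw-Hill
Operating System Concepts | Abraham Silberschatz | Henry F. Korth | 0471694665 | Computers | 944 | McGraw-Hill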
Table 1 problems
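Table 2
Title | Author | ISBN | Subject | Pages | Publisher
Database System Concepts | Abraham Silberschatz | 0072958863 | MySQL | 1168 | McGraw-Hill
Database System Concepts | Henry F. Korth | 0072958863 | Computers | 1168 | McGraw-Hill
Operating System Concepts | Abraham Silberschatz | 0471694665 | Computers | 944 | McGraw-Hill
Operating System Concepts | Henry F. Korth | 0471694665 | Computers | 944 | McGraw-Hill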
We now have two rows for a single book. Additionally, we would be violating the Second Normal Form. A better solution to our problem would be to separate the data into separate tables, an Author table and a Subject table, to store our information, removing that information from the Book table:
Subject Table
Subject_ID | Subject
1 | MySQL
2 | Computers
Author Table
Author_ID | Last Name | First Name
1 | Silberschatz | Abraham
2 | Korth | Henry F.
Book Table
ISBN | Title | Pages | Publisher
0072958863 | Database System Concepts | 1168 | McGraw-Hill
0471694665 | Operating System Concepts | 944 | McGraw-Hill
Each table has a primary key, used for joining tables together when querying the data. A primary key value must be unique within the table (no two books can have the same ISBN), and a primary key is also an index, which speeds up data retrieval based on the primary key. The next step is to define the relationships between the tables.
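The uniqueness rule for primary keys can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database; the helper name insert_books and the third, deliberately duplicated row are hypothetical, and the Publisher column is left out for brevity.

```python
import sqlite3

def insert_books(rows):
    """Insert (isbn, title, pages) rows into a fresh in-memory Book table.

    Returns the ISBNs that were rejected because they would have
    duplicated an existing primary key value.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Book ("
        " ISBN TEXT PRIMARY KEY,"   # primary key: must be unique per row
        " Title TEXT NOT NULL,"
        " Pages INTEGER)"
    )
    rejected = []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO Book (ISBN, Title, Pages) VALUES (?, ?, ?)", row
            )
        except sqlite3.IntegrityError:
            # Raised when the ISBN is already present in the table.
            rejected.append(row[0])
    conn.close()
    return rejected

books = [
    ("0072958863", "Database System Concepts", 1168),
    ("0471694665", "Operating System Concepts", 944),
    ("0072958863", "Hypothetical duplicate", 1),  # same ISBN as the first row
]
rejected = insert_books(books)
print(rejected)
```

The first two rows are accepted; the third is refused by the database itself, so the application does not have to check for duplicate ISBNs on its own.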
Relationships
Book_Author Table
ISBN | Author_ID
0072958863 | 1
0072958863 | 2
0471694665 | 1
0471694665 | 2

Book_Subject Table
ISBN | Subject_ID
0072958863 | 1
0072958863 | 2
0471694665 | 2
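To see how the linking tables are re-joined at query time, here is a minimal sketch using Python's sqlite3 module. The column names Last_Name and First_Name, the assignment of Author_ID 1 and 2 to the two authors, and the helper names build_library, authors_of and subjects_of are assumptions made for illustration.

```python
import sqlite3

def build_library():
    """Create the five tables in memory and load the two example books."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Subject (Subject_ID INTEGER PRIMARY KEY, Subject TEXT);
        CREATE TABLE Author  (Author_ID INTEGER PRIMARY KEY,
                              Last_Name TEXT, First_Name TEXT);
        CREATE TABLE Book    (ISBN TEXT PRIMARY KEY, Title TEXT,
                              Pages INTEGER, Publisher TEXT);
        -- Linking tables: one row per (book, author) or (book, subject) pair.
        CREATE TABLE Book_Author (
            ISBN TEXT REFERENCES Book(ISBN),
            Author_ID INTEGER REFERENCES Author(Author_ID),
            PRIMARY KEY (ISBN, Author_ID));
        CREATE TABLE Book_Subject (
            ISBN TEXT REFERENCES Book(ISBN),
            Subject_ID INTEGER REFERENCES Subject(Subject_ID),
            PRIMARY KEY (ISBN, Subject_ID));

        INSERT INTO Subject VALUES (1, 'MySQL'), (2, 'Computers');
        INSERT INTO Author VALUES (1, 'Silberschatz', 'Abraham'),
                                  (2, 'Korth', 'Henry F.');
        INSERT INTO Book VALUES
            ('0072958863', 'Database System Concepts', 1168, 'McGraw-Hill'),
            ('0471694665', 'Operating System Concepts', 944, 'McGraw-Hill');
        INSERT INTO Book_Author VALUES
            ('0072958863', 1), ('0072958863', 2),
            ('0471694665', 1), ('0471694665', 2);
        INSERT INTO Book_Subject VALUES
            ('0072958863', 1), ('0072958863', 2), ('0471694665', 2);
    """)
    return conn

def authors_of(conn, isbn):
    """Join Book_Author to Author to list the last names for one book."""
    rows = conn.execute(
        "SELECT a.Last_Name FROM Book_Author ba "
        "JOIN Author a ON a.Author_ID = ba.Author_ID "
        "WHERE ba.ISBN = ? ORDER BY a.Author_ID", (isbn,)).fetchall()
    return [r[0] for r in rows]

def subjects_of(conn, isbn):
    """Join Book_Subject to Subject to list the subjects for one book."""
    rows = conn.execute(
        "SELECT s.Subject FROM Book_Subject bs "
        "JOIN Subject s ON s.Subject_ID = bs.Subject_ID "
        "WHERE bs.ISBN = ? ORDER BY s.Subject_ID", (isbn,)).fetchall()
    return [r[0] for r in rows]

conn = build_library()
print(authors_of(conn, "0471694665"))
print(subjects_of(conn, "0072958863"))
```

Each author and each subject is stored once; a book gains another author or subject by adding one row to a linking table rather than a new column or a duplicated book row.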
Where First Normal Form deals with redundancy of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in vertical columns. The normal forms are progressive, so to achieve Second Normal Form, the tables must already be in First Normal Form. The Book Table will be used for the 2NF example.
2NF Table

Publisher Table
Publisher_ID | Publisher Name
1 | McGraw-Hill

Book Table
ISBN | Title | Pages | Publisher_ID
0072958863 | Database System Concepts | 1168 | 1
0471694665 | Operating System Concepts | 944 | 1
2NF
Here we have a one-to-many relationship between the publisher and the book table: a book has only one publisher, and a publisher will publish many books. When we have a one-to-many relationship, we place a foreign key in the Book Table, pointing to the primary key of the Publisher Table. The other requirement for Second Normal Form is that a table with a composite key cannot contain any data that does not relate to all portions of the composite key.
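The foreign key from Book to Publisher can be sketched with Python's sqlite3 module as follows. The helper names make_db and add_book, the column name Publisher_Name, and the third book with Publisher_ID 99 are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

def make_db():
    """Create Publisher and Book tables with an enforced foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE Publisher (
            Publisher_ID INTEGER PRIMARY KEY,
            Publisher_Name TEXT NOT NULL);
        CREATE TABLE Book (
            ISBN TEXT PRIMARY KEY,
            Title TEXT NOT NULL,
            Pages INTEGER,
            -- "many" side of the one-to-many relationship
            Publisher_ID INTEGER NOT NULL
                REFERENCES Publisher(Publisher_ID));
        INSERT INTO Publisher VALUES (1, 'McGraw-Hill');
    """)
    return conn

def add_book(conn, isbn, title, pages, publisher_id):
    """Insert a book; return False if the publisher does not exist."""
    try:
        conn.execute(
            "INSERT INTO Book VALUES (?, ?, ?, ?)",
            (isbn, title, pages, publisher_id))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
ok_first = add_book(conn, "0072958863", "Database System Concepts", 1168, 1)
ok_second = add_book(conn, "0471694665", "Operating System Concepts", 944, 1)
ok_orphan = add_book(conn, "0000000000", "Hypothetical book", 10, 99)

# One publisher row now serves many book rows.
books_for_publisher = conn.execute(
    "SELECT COUNT(*) FROM Book WHERE Publisher_ID = 1").fetchone()[0]
print(ok_first, ok_second, ok_orphan, books_for_publisher)
```

The publisher name is stored once, so renaming the publisher means updating a single row instead of every book that publisher has issued.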
BCNF requires that the table be in 3NF and that the only determinants are the candidate keys.
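One way to test the BCNF condition is to compute attribute closures from a list of functional dependencies and flag every determinant that does not determine the whole table. The sketch below is an illustration under stated assumptions, not a complete normalization tool: the function names closure and bcnf_violations are hypothetical, and the dependencies listed for the two tables are assumed for the example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs.

    fds is a list of (determinant, dependents) pairs of frozensets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the determinant, we also know its dependents.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Book table after the publisher has been split out: ISBN determines everything.
book_attrs = {"ISBN", "Title", "Pages", "Publisher_ID"}
book_fds = [
    (frozenset({"ISBN"}), frozenset({"Title", "Pages", "Publisher_ID"})),
]

# A flat table that still carries the publisher name next to each book.
flat_attrs = book_attrs | {"Publisher_Name"}
flat_fds = book_fds + [
    (frozenset({"Publisher_ID"}), frozenset({"Publisher_Name"})),
]

print(bcnf_violations(book_attrs, book_fds))
print(bcnf_violations(flat_attrs, flat_fds))
```

In the flat table, Publisher_ID is a determinant but not a key, which is exactly the situation the BCNF rule forbids; moving the publisher name into its own table removes the violation.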