Normalization
Step-02:
Recursively add the attributes to the result set which can be
functionally determined from the attributes already contained
in the result set.
Closure
Example-
A → BC
BC → DE
D → F
CF → G
Closure of attribute A-
A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We cannot determine any other attribute using the attributes D and F contained in the
result set.
Thus,
D+ = { D , F }
Closure of attribute set {B, C}-
{ B , C }+ = { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F } ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
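The closure computation worked through above can be sketched in Python. This is a minimal illustration, assuming each attribute is a single letter so that a string such as "BC" stands for the set {B, C}; the names closure and FDS are made up for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the result set,
            # everything on the right side can be added as well.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# The four FDs from the example: A -> BC, BC -> DE, D -> F, CF -> G.
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", FDS)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", FDS)))   # ['D', 'F']
print(sorted(closure("BC", FDS)))  # ['B', 'C', 'D', 'E', 'F', 'G']
```

The loop repeats until a full pass adds nothing new, which matches the "recursively add" wording of Step-02.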
Functional Dependency
X → Y
The left side of an FD is known as the determinant; the right side
is known as the dependent.
Assume we have an employee table with attributes: Emp_Id,
Emp_Name, Emp_Address.
Eid → dept
Ename → dept
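One way to see what X → Y means is to test it against the rows of a table: the same X value must never appear with two different Y values. The sketch below assumes rows are stored as dictionaries; the function name fd_holds and the sample rows are hypothetical and not taken from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in the given rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows for the employee table.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Mumbai"},
]

print(fd_holds(employees, ["Emp_Id"], ["Emp_Name"]))       # True
print(fd_holds(employees, ["Emp_Name"], ["Emp_Address"]))  # False
```

A check like this can only show that an FD is violated by the current data; whether an FD is meant to hold in general is a design decision about the schema.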
Normalization
What is normalization, and why is it needed?
Process of organizing data in a DB:
1. Minimize redundancy in a relation
2. Divide a big table into smaller tables
Need
To remove anomalies
• If a database design is not perfect, it may contain anomalies,
which are like a bad dream for any database administrator.
Managing a database with anomalies is next to impossible.
Problems Without Normalization
If a table is not properly normalized and has data redundancy,
it will not only consume extra memory space but will also
make it difficult to handle and update the database without
facing data loss.
In this table, a new branch such as Civil can't be added until you have the sid
of a student who took admission in that branch. You can't
enter data without an sid because all the branch data is kept in a single
table.
Anomalies
2. Update anomalies − If data items are scattered and are not
linked to each other properly, then it could lead to strange
situations.
For example, when we try to update one data item having its
copies scattered over several tables, a few instances get
updated properly while a few others are left with old values.
Such instances leave the database in an inconsistent state.
3. Deletion anomalies − We try to delete a record that is
unwanted, but this also deletes data that we wanted to keep.
Example: we want to delete the details of a student but do not
want to delete the branch information. Because the branch information
is not stored separately, it is deleted as well.
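The deletion anomaly can be reproduced with a few rows in Python. This is only a sketch: the rows, the hod column, and the helper names delete_student and known_branches are invented for illustration.

```python
# Hypothetical rows: student and branch details kept in one table.
combined = [
    {"sid": 1, "name": "Asha", "branch": "CSE", "hod": "Dr. X"},
    {"sid": 2, "name": "Ravi", "branch": "CSE", "hod": "Dr. X"},
    {"sid": 3, "name": "Meena", "branch": "Civil", "hod": "Dr. Y"},
]

def delete_student(rows, sid):
    """Return the rows that remain after removing the student with this sid."""
    return [row for row in rows if row["sid"] != sid]

def known_branches(rows):
    """Collect the branch names that still appear in the rows."""
    return {row["branch"] for row in rows}

# Single table: deleting the only Civil student also loses the Civil branch.
print(sorted(known_branches(delete_student(combined, 3))))  # ['CSE']

# Separate tables: branch details live in their own table.
students = [{"sid": r["sid"], "name": r["name"], "branch": r["branch"]} for r in combined]
branch_table = {r["branch"]: r["hod"] for r in combined}

students = delete_student(students, 3)
print(sorted(branch_table))  # ['CSE', 'Civil']
```

With the branch data in its own table, removing a student no longer touches the branch record.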
Insertion anomaly:
For example, if we try to delete a record from STUDENT with STUD_NO = 1.
• Normalization is the process of organizing the data in the
database.
• Normalization is used to minimize the redundancy from a
relation or set of relations. It is also used to eliminate the
undesirable characteristics like Insertion, Update and Deletion
Anomalies.
• Normalization divides a larger table into smaller tables
and links them using relationships.
• Normal forms are used to reduce redundancy in
database tables.
Normalization Types
First Normal Form (1NF)
Example 2: Check whether this relation is in 2NF or not.
scoreid  studentid  subjectid  marks  teacher
1        10         1          82     A
2        10         2          87     B
3        11         1          86     C
4        11         2          85     D
5        11         4          80     E
1. Display the teacher of the student with sid = 10
2. Display the teacher for subjectid = 1
(studentid, subjectid) → teacher
To know the teacher, the subject id is enough.
Subject: subjectid, subjectname, teacher
Final tables:
Student: studentid, name, regno, branch, address
Score: scoreid, studentid, subjectid, marks
Subject: subjectid, subjectname, teacher
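The 2NF check in this example can be sketched as a search for partial dependencies. The code assumes that (studentid, subjectid) is the only candidate key of the score table (scoreid is left out), and that the FDs are the two listed in the code; the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is an (lhs, rhs) pair of tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, attributes, fds):
    """List (key subset, attribute) pairs where a non-prime attribute
    is determined by a proper subset of the key."""
    non_prime = set(attributes) - set(key)
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((subset, attr))
    return found

attributes = ["studentid", "subjectid", "marks", "teacher"]
key = ("studentid", "subjectid")
# Assumed FDs: the full key determines marks, subjectid alone determines teacher.
fds = [
    (("studentid", "subjectid"), ("marks",)),
    (("subjectid",), ("teacher",)),
]

print(partial_dependencies(key, attributes, fds))
# [(('subjectid',), 'teacher')]
```

The reported pair is the reason teacher moves into the subject table, where subjectid is the whole key.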
3NF: No transitive dependency
A transitive dependency exists when a non-prime attribute in a table
depends on another non-prime attribute.
Consider the score table with 2 additional attributes:
scoreid, studentid, subjectid, marks, examname, total marks
A → B
A: non-prime and B: prime (not allowed in BCNF)
BCNF: 3.5NF
studentid subject professor
101 Java P.Java
101 C P.C
102 Java P.Java
103 C++ P.C++
2 subjects
Multiple professors can teach the same subject.
studentid  subject  professor
101        Java     P.Java
101        C        P.C
102        Java     P.Java
103        C++      P.C++
104        Java     P.Java
(studentid, subject) → professor
professor → subject
In professor → subject, a non-prime attribute determines a prime attribute.
studentid professor
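The BCNF condition (every determinant must be a super key) can be checked mechanically. The sketch below assumes the two FDs of this example are (studentid, subject) → professor and professor → subject; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is an (lhs, rhs) pair of tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != all_attrs
    ]

attributes = ["studentid", "subject", "professor"]
fds = [
    (("studentid", "subject"), ("professor",)),
    (("professor",), ("subject",)),
]

print(bcnf_violations(attributes, fds))
# [(('professor',), ('subject',))]
```

Only professor → subject is reported, because professor alone does not determine studentid and so is not a super key.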
Ternary relationship
5NF:
2. A customer buys from a supplier, and a supplier can have one or
more products.
3. A product used by a customer can be supplied by one or more
suppliers.
Binary Relationships can be:
Supplier-customer
Customer-product
Product-supplier
5NF: final tables
Supplier product customer
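Whether the ternary relation can be replaced by the three binary relations depends on the join dependency: joining the three projections must return exactly the original rows. A minimal sketch, with hypothetical (supplier, product, customer) tuples and made-up function names:

```python
def project(rows, positions):
    """Project (supplier, product, customer) tuples onto the given positions."""
    return {tuple(row[i] for i in positions) for row in rows}

def rejoin(rows):
    """Split the rows into three binary relations and natural-join them back."""
    supplier_product = project(rows, (0, 1))
    product_customer = project(rows, (1, 2))
    customer_supplier = project(rows, (2, 0))
    return {
        (s, p, c)
        for (s, p) in supplier_product
        for (p2, c) in product_customer
        if p == p2 and (c, s) in customer_supplier
    }

def decomposes_losslessly(rows):
    """True if the three binary projections reproduce exactly the original rows."""
    return rejoin(rows) == set(rows)

# Hypothetical data: the join would create the spurious row (S1, P1, C2).
lossy = [("S1", "P1", "C1"), ("S1", "P2", "C2"), ("S2", "P1", "C2")]
# With that row actually present, the decomposition loses nothing.
lossless = lossy + [("S1", "P1", "C2")]

print(decomposes_losslessly(lossy))     # False
print(decomposes_losslessly(lossless))  # True
```

The first data set shows why the split is not always safe: the three pairwise facts hold, yet the combined fact was never recorded.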
Non-prime attributes: In the given table, all attributes except EMP_ID are
non-prime.
That's why we need to move EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as the primary key.
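The reason for moving EMP_CITY and EMP_STATE can be shown by listing the FDs that break 3NF. The sketch assumes EMP_ID is the only candidate key and that the FDs are EMP_ID → EMP_NAME, EMP_ZIP and EMP_ZIP → EMP_STATE, EMP_CITY; the EMP_NAME column and the function names are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is an (lhs, rhs) pair of tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(key, attributes, fds):
    """Return (lhs, attribute) pairs where lhs is not a super key and the
    attribute is non-prime, assuming key is the only candidate key."""
    all_attrs = set(attributes)
    prime = set(key)
    found = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == all_attrs:
            continue  # lhs is a super key, so this FD is allowed
        for attr in rhs:
            if attr not in prime and attr not in lhs:
                found.append((lhs, attr))
    return found

attributes = ["EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"]
fds = [
    (("EMP_ID",), ("EMP_NAME", "EMP_ZIP")),
    (("EMP_ZIP",), ("EMP_STATE", "EMP_CITY")),
]

print(third_nf_violations(("EMP_ID",), attributes, fds))
# [(('EMP_ZIP',), 'EMP_STATE'), (('EMP_ZIP',), 'EMP_CITY')]
```

Both reported attributes depend on EMP_ID only through EMP_ZIP, which is why they belong in a table keyed by EMP_ZIP.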
Boyce Codd normal form (BCNF)
• For BCNF, the table should be in 3NF, and for every FD, the LHS must be a
super key.
Candidate key: {EMP-ID, EMP-DEPT}
Fourth normal form (4NF)
So, to bring the above table into 4NF, we can decompose it into two
tables: