NORMALIZATION (Autosaved)
Normalization is the process of organizing the data and the attributes of a database.
It is performed to reduce the data redundancy in a database and to ensure that data is stored
logically.
Data redundancy in a DBMS means storing the same data in multiple places.
Removing redundancy matters because it causes insertion, update, and deletion anomalies,
which make the database very hard for a database administrator to maintain.
Why Do We Need Normalization?
• Six normal forms are commonly described for relational databases
(1NF, 2NF, 3NF, BCNF, 4NF, and 5NF). The four covered here are:
• 1NF: A relation is in 1NF if every attribute holds only atomic
(indivisible) values.
• 2NF: A relation is in 2NF if it is in 1NF and every non-key attribute is
fully functionally dependent on each candidate key.
• 3NF: A relation is in 3NF if it is in 2NF and no non-key attribute is
transitively dependent on a candidate key.
• BCNF: A relation is in BCNF if it is in 3NF and, for every non-trivial
functional dependency, the left-hand side is a superkey.
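To make the 1NF rule concrete, here is a minimal Python sketch. The rows and the `phones` column are hypothetical illustration data, not taken from these slides; the helper `to_1nf` is likewise an assumed name.

```python
# Hypothetical rows: the "phones" column holds a list, so it is not atomic.
unnormalized = [
    {"employee_code": 101, "phones": ["555-0101", "555-0102"]},
    {"employee_code": 102, "phones": ["555-0201"]},
]


def to_1nf(rows, multi_valued_column):
    """Emit one row per value so that every attribute holds a single atomic value."""
    flat = []
    for row in rows:
        for value in row[multi_valued_column]:
            new_row = dict(row)          # copy the other attributes unchanged
            new_row[multi_valued_column] = value
            flat.append(new_row)
    return flat


rows_1nf = to_1nf(unnormalized, "phones")
for row in rows_1nf:
    print(row)
```

Each employee now appears once per phone number, which is the repetition that the later normal forms then deal with.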
Functional dependency
• Functional dependency is a relationship that exists between two sets
of attributes of a relational table where one set of attributes can
determine the value of the other set of attributes. It is denoted by
X → Y, where X is called the determinant and Y is called the dependent.
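The definition of X → Y can be tested against sample rows. The sketch below uses the rows of the <EmployeeProjectDetail> table from these slides; the function name `fd_holds` and the snake_case column names are assumptions made for illustration. Note that sample data can only refute a dependency, it cannot prove that one holds for every possible table state.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if equal values on `lhs` always come with equal values on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for this X; any later mismatch breaks X -> Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


rows = [
    {"employee_code": 101, "project_id": "P03", "employee_name": "John", "project_name": "Project103"},
    {"employee_code": 101, "project_id": "P01", "employee_name": "John", "project_name": "Project101"},
    {"employee_code": 102, "project_id": "P04", "employee_name": "Ryan", "project_name": "Project104"},
    {"employee_code": 103, "project_id": "P02", "employee_name": "Stephanie", "project_name": "Project102"},
]

code_determines_name = fd_holds(rows, ["employee_code"], ["employee_name"])
code_determines_project = fd_holds(rows, ["employee_code"], ["project_id"])
print(code_determines_name, code_determines_project)
```

Employee Code → Employee Name is consistent with the data, while Employee Code → Project ID is refuted because employee 101 works on two projects.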
First Normal Form (1NF)
<EmployeeProjectDetail>
Employee Code | Project ID | Employee Name | Project Name
101           | P03        | John          | Project103
101           | P01        | John          | Project101
102           | P04        | Ryan          | Project104
103           | P02        | Stephanie     | Project102
• In the above table, the prime attributes of the table are Employee
Code and Project ID. We have partial dependencies in this table
because Employee Name can be determined by Employee Code and
Project Name can be determined by Project ID. Thus, the above
relational table violates the rule of 2NF.
• Prime attributes: attributes that belong to a candidate key. Together,
the attributes of a key uniquely identify any record in the table.
• Non-prime attributes: attributes that are not part of any candidate
key. Their values may repeat across any number of records.
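One common way to remove such partial dependencies is to split the table so that each non-prime attribute sits with the part of the key that determines it. The following is a minimal sketch under that assumption; the helper `project` and the three result names are hypothetical, and the rows are those of <EmployeeProjectDetail>.

```python
def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate tuples while keeping order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


rows = [
    {"employee_code": 101, "project_id": "P03", "employee_name": "John", "project_name": "Project103"},
    {"employee_code": 101, "project_id": "P01", "employee_name": "John", "project_name": "Project101"},
    {"employee_code": 102, "project_id": "P04", "employee_name": "Ryan", "project_name": "Project104"},
    {"employee_code": 103, "project_id": "P02", "employee_name": "Stephanie", "project_name": "Project102"},
]

# Employee Name depends only on Employee Code.
employee_detail = project(rows, ["employee_code", "employee_name"])
# Project Name depends only on Project ID.
project_detail = project(rows, ["project_id", "project_name"])
# The composite key itself records who works on what.
employee_project = project(rows, ["employee_code", "project_id"])

print(len(employee_detail), len(project_detail), len(employee_project))
```

After the split, John's name is stored once instead of once per project.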
Partial dependencies
• To understand partial dependency, let us first introduce some basic
terminology with the help of an example.
• Consider a relation (table) with four attributes, P, Q, R, S, and the following
dependencies:
P,Q→S
Q→R
• Using P and Q, we can derive S, and using only Q, we can derive R. Hence,
if we use P and Q together, we can derive all the attributes of
the table, i.e., P, Q, R, S (P→P and Q→Q hold trivially).
• We can write this as (PQ)+={P,Q,R,S}; in simple words, the closure
of P and Q gives us all the attributes of the relation. Minimal sets like PQ
that are capable of deriving all the attributes of a relation (table)
are called candidate keys. There can be more than one candidate key in a table.
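The closure (PQ)+ and the candidate keys of this example can be computed mechanically. Below is a minimal sketch of the usual fixed-point closure computation plus a brute-force key search; the function names and the tuple-based encoding of dependencies are assumptions made for illustration, and the brute-force search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute derivable from `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole left-hand side is already derived.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Return the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


ATTRS = {"P", "Q", "R", "S"}
FDS = [(("P", "Q"), ("S",)), (("Q",), ("R",))]

pq_closure = closure({"P", "Q"}, FDS)
keys = candidate_keys(ATTRS, FDS)
print(sorted(pq_closure), keys)
```

For this relation the search finds PQ as the only candidate key.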
• If an attribute is part of any candidate key of the relation, it is called
a prime (primary) attribute; otherwise, it is a non-prime (non-primary)
attribute. In the example above, P and Q are prime attributes,
and R and S are non-prime attributes.
• We now know the basic definitions required to understand the concept of
partial dependency. In the above example, S is dependent on all the
prime attributes, i.e., P and Q. If either P or Q is missing, we
cannot derive S. The case of R is different.
• Even if P, a prime attribute, is missing, we can still derive R using only Q.
Hence, instead of depending on the whole candidate key, R is partially
dependent on Q, which is only part of a candidate key. This is the
concept of partial dependency.
Partial Functional Dependency
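The reasoning for the P, Q, R, S example can be sketched in code: take every proper subset of the candidate key and see which non-prime attributes it already determines. The function names are assumptions, and the sketch assumes the relation has a single candidate key, as in this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute derivable from `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(all_attrs, key, fds):
    """List (key_subset, attribute) pairs where a non-prime attribute
    is determined by a proper subset of the candidate key `key`."""
    non_prime = set(all_attrs) - set(key)  # assumes `key` is the only candidate key
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((subset, attr))
    return found


ATTRS = {"P", "Q", "R", "S"}
FDS = [(("P", "Q"), ("S",)), (("Q",), ("R",))]

partials = partial_dependencies(ATTRS, ("P", "Q"), FDS)
print(partials)
```

Only R is reported, because Q alone determines it, while S needs both P and Q.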
<ProjectDetail>
<EmployeeDetail>
Employee Code | Employee Name | Employee Zipcode
101           | John          | 110033
101           | John          | 110044
102           | Ryan          | 110028
103           | Stephanie     | 110064

<EmployeeLocation>
Employee Zipcode | Employee City
110033           | Model Town
110044           | Badarpur
110028           | Naraina
110064           | Hari Nagar
• Thus, we’ve converted the <EmployeeDetail> table into 3NF by
decomposing it into the <EmployeeDetail> and <EmployeeLocation>
tables, both of which are in 2NF and have no transitive
dependency.
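A decomposition like this is useful only if the original rows can be recovered by joining on the shared column. The sketch below joins the two tables on Employee Zipcode using the rows shown above; the helper `natural_join` and the snake_case column names are assumptions made for illustration.

```python
def natural_join(left, right, on):
    """Join two lists of dict rows on the shared column `on`."""
    index = {}
    for r in right:
        index.setdefault(r[on], []).append(r)
    # Pair every left row with each right row that has the same value for `on`.
    return [{**l, **r} for l in left for r in index.get(l[on], [])]


employee_detail = [
    {"employee_code": 101, "employee_name": "John", "employee_zipcode": "110033"},
    {"employee_code": 101, "employee_name": "John", "employee_zipcode": "110044"},
    {"employee_code": 102, "employee_name": "Ryan", "employee_zipcode": "110028"},
    {"employee_code": 103, "employee_name": "Stephanie", "employee_zipcode": "110064"},
]
employee_location = [
    {"employee_zipcode": "110033", "employee_city": "Model Town"},
    {"employee_zipcode": "110044", "employee_city": "Badarpur"},
    {"employee_zipcode": "110028", "employee_city": "Naraina"},
    {"employee_zipcode": "110064", "employee_city": "Hari Nagar"},
]

joined = natural_join(employee_detail, employee_location, "employee_zipcode")
for row in joined:
    print(row)
```

Each city is now stored once per zipcode, and the join still yields one full row per employee record.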
• 2NF and 3NF impose extra conditions on dependencies on
candidate keys and remove the redundancy those dependencies cause.
However, some dependencies that cause redundancy may still remain in
the database. These redundancies are removed by a stricter normal
form known as BCNF.
Boyce-Codd Normal Form (BCNF)
The above table satisfies all the normal forms up to 3NF, but it violates the rules of BCNF. The candidate key of the
above table is {Employee Code, Project ID}.
For the non-trivial functional dependency
Project Leader → Project ID,
the determinant Project Leader is not a superkey: Project ID is a prime attribute, but Project Leader is a non-prime
attribute. This is not allowed in BCNF.
<EmployeeProject>
<ProjectLead>
Thus, we’ve converted the <EmployeeProjectLead> table into BCNF by decomposing it into <EmployeeProject> and
<ProjectLead> tables.
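The BCNF rule (the left-hand side of every non-trivial dependency must be a superkey) can be checked with the attribute closure. This is a minimal sketch: the function names are assumptions, and the first dependency in `FDS` is not stated on the slides but is assumed from the candidate key {Employee Code, Project ID}.

```python
def closure(attrs, fds):
    """Return every attribute derivable from `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


ATTRS = {"employee_code", "project_id", "project_leader"}
FDS = [
    # Assumed from the stated candidate key {Employee Code, Project ID}.
    (("employee_code", "project_id"), ("project_leader",)),
    # Stated in the text: Project Leader -> Project ID.
    (("project_leader",), ("project_id",)),
]

violations = bcnf_violations(ATTRS, FDS)
print(violations)
```

Only Project Leader → Project ID is reported, matching the violation described above.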