UNIT-3 DBMS (Normalization and Functional Dependency)
Second and Third Normal Forms, Boyce-Codd Normal Form, Multivalued Dependencies and
Fourth Normal Form, Join Dependencies and Fifth Normal Form, and some examples related to
functional dependencies and normal forms.
Functional Dependencies
A functional dependency is a formal tool for analyzing relational schemas. It enables us
to detect and describe some of their problems in precise terms.
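Definition of Functional Dependency
A functional dependency X → Y between two sets of attributes X and Y of a relation
schema R states that any two tuples of a legal relation state r(R) that agree on X must
also agree on Y.
From the semantics of the attributes and the relation, we know that the following functional
dependencies should hold: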
a) Ssn → Ename
b) Pnumber → {Pname, Plocation}
c) {Ssn, Pnumber} → Hours
These functional dependencies specify that:
a) The value of an employee’s Social Security number (Ssn) uniquely
determines the employee name (Ename).
b) The value of a project’s number (Pnumber) uniquely determines the project
name (Pname) and location (Plocation).
c) A combination of Ssn and Pnumber values uniquely determines the
number of hours the employee currently works on the project per week
(Hours).
Alternatively, we say that Ename is functionally determined by (or functionally
dependent on) Ssn, or that given a value of Ssn, we know the value of Ename, and so on.
Relation extensions r(R) that satisfy the functional dependency constraints are called
legal relation states (or legal extensions) of R.
A functional dependency is a property of the relation schema R, not of a particular
legal relation state r of R.
Therefore, a functional dependency cannot be inferred automatically from a given
relation extension r but must be defined explicitly by someone who knows the
semantics of the attributes of R.
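A relation state can refute a functional dependency but can never establish one. The following minimal Python sketch illustrates this. The function name `satisfies_fd` and the sample tuples are hypothetical and chosen only for illustration; the attribute names follow the dependencies listed above.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def satisfies_fd(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val for a new key, otherwise returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical relation state (illustrative data only)
emp_proj = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 32.5, "Ename": "Smith"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 7.5, "Ename": "Smith"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 20.0, "Ename": "Wong"},
]

print(satisfies_fd(emp_proj, ["Ssn"], ["Ename"]))             # True
print(satisfies_fd(emp_proj, ["Ssn", "Pnumber"], ["Hours"]))  # True
print(satisfies_fd(emp_proj, ["Ssn"], ["Hours"]))             # False: refuted by this state
print(satisfies_fd(emp_proj, ["Hours"], ["Ename"]))           # True, but only by coincidence
```

The last call shows why a dependency must come from the semantics: Hours → Ename happens to hold in this small state, yet nothing in the meaning of the attributes guarantees it.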
EXAMPLE:
INFERENCE RULES:
EXAMPLE: Given the FDs for relation R{A,B,C,D,E,F}, find the closure of the FD set by
applying Armstrong’s axioms.
A->B, A->C, CD->E, CD->F, B->E.
Solution:
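One way to work toward the solution is to compute attribute closures: X → Y follows from the FD set exactly when Y is contained in X+. The sketch below is a minimal illustration, not a definitive implementation. The helper name `closure` is hypothetical, and R is assumed to contain B because the FDs mention it.

```python
def closure(attrs, fds):
    """Attribute closure: repeatedly apply FDs whose left side is already covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FDs from the example; each attribute is a single letter
fds = [("A", "B"), ("A", "C"), ("CD", "E"), ("CD", "F"), ("B", "E")]

print("".join(sorted(closure("A", fds))))   # ABCE
print("".join(sorted(closure("CD", fds))))  # CDEF
print("".join(sorted(closure("AD", fds))))  # ABCDEF
```

Reading the closures back as dependencies gives, among others, A → E (transitivity over A → B and B → E), A → BC and CD → EF (union), and AD → F (pseudotransitivity over A → C and CD → F). Since AD+ covers every attribute while A+ and D+ do not, AD is a candidate key under the stated assumption about R.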
EXAMPLE: Compute the closure of the following set F of functional dependencies for
relational schema R = (A,B,C,D,E): A->BC, CD->E, B->D, E->A
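A useful by-product of computing closures for this schema is its set of candidate keys. The sketch below enumerates attribute subsets by increasing size and keeps the minimal ones whose closure is all of R. The function names `closure` and `candidate_keys` are hypothetical, and the brute-force search is only practical for small schemas like this one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under the FDs (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            # Skip supersets of keys already found: they are not minimal
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
print(sorted("".join(sorted(k)) for k in keys))  # ['A', 'BC', 'CD', 'E']
```

Because A, E, BC, and CD each determine every attribute of R, any FD whose left side contains one of them belongs to the closure F+.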
EXAMPLE: State Armstrong’s axioms and use them to find the closure of the following FD
set.
A->B, AB->C, D->AC, D->E
Solution:
EXAMPLE: R={A,B,C,D,E,F} and the FDs are A->BC, E->CF, B->E, CD->EF. Compute the
closure {A,B}+.
Solution:
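A worked derivation, applying the attribute-closure procedure to the FDs stated in the example (the superscripted X is notation introduced here for the intermediate sets):

```latex
\begin{align*}
X^{(0)} &= \{A, B\} \\
X^{(1)} &= \{A, B, C\}       && \text{by } A \to BC \\
X^{(2)} &= \{A, B, C, E\}    && \text{by } B \to E \\
X^{(3)} &= \{A, B, C, E, F\} && \text{by } E \to CF \\
\{A, B\}^{+} &= \{A, B, C, E, F\}
\end{align*}
```

CD → EF never applies because D is not in the set, so D is not in {A,B}+ and {A,B} is not a superkey of R.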
We assume that a set of functional dependencies is given for each relation and that
each relation has a designated primary key.
This information combined with the tests (conditions) for normal forms drives the
normalization process for relational schema design.
The first three normal forms for a relation take into account all candidate keys of the
relation rather than just the primary key.
Normalization of Relations
The normalization process, as first proposed by Codd (1972a), takes a relation schema
through a series of tests to certify whether it satisfies a certain normal form.
Initially, Codd proposed three normal forms, which he called first, second, and third
normal form.
All these normal forms are based on a single analytical tool: the functional
dependencies among the attributes of a relation.
A fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on
the concepts of multivalued dependencies and join dependencies, respectively.
Normalization of data can be considered a process of analyzing the given relation
schemas based on their FDs and primary keys to achieve the desirable properties of
1) minimizing redundancy and
2) minimizing the insertion, deletion, and update anomalies
It can be considered as a “filtering” or “purification” process to make the design have
successively better quality.
Unsatisfactory relation schemas that do not meet certain conditions (the normal form
tests) are decomposed into smaller relation schemas that meet the tests and hence
possess the desirable properties.
Thus, the normalization procedure provides database designers with the following:
a) A formal framework for analyzing relation schemas based on their keys
and on the functional dependencies among their attributes.
b) A series of normal form tests that can be carried out on individual relation
schemas so that the relational database can be normalized to any desired
degree.
Definition: The normal form of a relation refers to the highest normal form
condition that it meets, and hence indicates the degree to which it has been
normalized.
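An attribute of a relation schema R is called a prime attribute if it is a member of some
candidate key of R; otherwise it is nonprime. In the WORKS_ON relation, both Ssn and
Pnumber are prime attributes, whereas the other attributes are nonprime.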
First Normal Form
First normal form (1NF) is defined to disallow multivalued attributes, composite
attributes, and their combinations.
It states that the domain of an attribute must include only atomic (simple, indivisible)
values and that the value of any attribute in a tuple must be a single value from the
domain of that attribute.
1NF disallows relations within relations or relations as attribute values within tuples.
The only attribute values permitted by 1NF are single atomic (or indivisible) values.
Consider the DEPARTMENT relation schema shown in the figure below.
As we can see, this is not in 1NF because Dlocations is not an atomic attribute, as
illustrated by the first tuple in the figure.
There are three main techniques to achieve first normal form for such a relation:
1) Remove the attribute Dlocations that violates 1NF and place it in a separate relation
DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT. The
primary key of this relation is the combination {Dnumber, Dlocation}. A distinct
tuple in DEPT_LOCATIONS exists for each location of a department. This
decomposes the non-1NF relation into two 1NF relations.
2) Expand the key so that there will be a separate tuple in the original DEPARTMENT
relation for each location of a department. In this case, the primary key becomes
the combination {Dnumber, Dlocation}. This solution has the disadvantage of
introducing redundancy in the relation.
3) If a maximum number of values is known for the attribute (for example, if it is known
that at most three locations can exist for a department), replace the Dlocations attribute
by three atomic attributes: Dlocation1, Dlocation2, and Dlocation3.
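The first technique can be sketched in Python as follows. This is a minimal illustration under assumptions: the function name `to_1nf`, the attribute Dname, and all sample rows are hypothetical; only Dnumber, Dlocations, Dlocation, and DEPT_LOCATIONS come from the description above.

```python
def to_1nf(rows):
    """Split a non-1NF DEPARTMENT state into DEPARTMENT and DEPT_LOCATIONS."""
    department = []
    dept_locations = []
    for row in rows:
        # DEPARTMENT keeps every attribute except the multivalued one
        department.append({k: v for k, v in row.items() if k != "Dlocations"})
        # DEPT_LOCATIONS gets one tuple per (Dnumber, Dlocation) pair
        for location in row["Dlocations"]:
            dept_locations.append({"Dnumber": row["Dnumber"], "Dlocation": location})
    return department, dept_locations


# Hypothetical non-1NF state: Dlocations holds a set of values per tuple
non_1nf = [
    {"Dname": "Research", "Dnumber": 5, "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dname": "Administration", "Dnumber": 4, "Dlocations": ["Stafford"]},
    {"Dname": "Headquarters", "Dnumber": 1, "Dlocations": ["Houston"]},
]

department, dept_locations = to_1nf(non_1nf)
print(len(department))      # 3
print(len(dept_locations))  # 5
```

Each department's other attributes are stored once in DEPARTMENT, which is why this technique avoids the redundancy that the second technique introduces.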
EXAMPLE:
Second Normal Form
Third Normal Form
EXAMPLE:
What is normalization? Normalize the STUDENT relation given below up to 3NF.
EXAMPLE:
Consider the relation R(ABC) with the following FDs: A->B, B->C and C->A. What is the
normal form of R?
Solution:
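A short check for this example, sketched in Python. It tests the Boyce-Codd condition that the left side of every nontrivial FD is a superkey. The function names `closure` and `is_bcnf` are hypothetical, and the test examines only the given FDs, which is sufficient for a relation that has not been decomposed.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the FDs (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if the left side of every nontrivial FD is a superkey."""
    all_attrs = set(attributes)
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FD, always allowed
        if closure(lhs, fds) != all_attrs:
            return False
    return True


fds = [("A", "B"), ("B", "C"), ("C", "A")]
print(["".join(sorted(closure(x, fds))) for x in "ABC"])  # ['ABC', 'ABC', 'ABC']
print(is_bcnf("ABC", fds))                                # True
```

A+, B+, and C+ each equal {A, B, C}, so A, B, and C are all candidate keys and every attribute is prime. Every left side is therefore a superkey, which places R in BCNF and hence also in 3NF and 2NF.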