Chapter five
Functional Dependency and Normalization
Contents
Purpose of Normalization
Functional Dependency
Normalization
Prepared by Elias B. Chap 5 1
Purpose of Normalization
Normalization
Is a technique for producing a set of suitable relations that support
the data requirements of an enterprise.
Is the process of decomposing unsatisfactory "bad"
relations by breaking up their attributes into smaller
relations.
Why do we need to normalize?
To avoid redundancy (less storage space needed, and data is
consistent)
To avoid Insert/update/delete anomalies.
Organize data efficiently
Normalization
If a database design is not perfect, it may contain anomalies, which are
like a bad dream for any database administrator. Managing a database
with anomalies is next to impossible.
Update anomalies − If data items are scattered and are not linked to
each other properly, then it could lead to strange situations. For
example, when we try to update one data item having its copies
scattered over several places, a few instances get updated properly
while a few others are left with old values. Such instances leave the
database in an inconsistent state.
Deletion anomalies − We try to delete a record, but parts of it are left
undeleted because, without our being aware of it, the data is also stored
somewhere else.
Insert anomalies − We try to insert data into a record that does not exist
at all.
Normalization is a method to remove all these anomalies and
bring the database to a consistent state.
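A minimal sketch of an update anomaly, assuming a hypothetical table in which the department name is repeated for every employee (the rows and names below are illustrative and not taken from the slides). It contrasts the repeated copy with a design that stores the name once.

```python
# Hypothetical unnormalized rows: DeptName is repeated for every employee.
employees = [
    {"EmpNum": 1, "DeptNum": 10, "DeptName": "Sales"},
    {"EmpNum": 2, "DeptNum": 10, "DeptName": "Sales"},
]

# Update anomaly: only one of the two copies gets updated.
employees[0]["DeptName"] = "Marketing"
names_for_dept_10 = sorted(
    {row["DeptName"] for row in employees if row["DeptNum"] == 10}
)
print(names_for_dept_10)  # two different names for one department

# Normalized sketch: the department name is stored exactly once.
departments = {10: "Sales"}
assignments = [{"EmpNum": 1, "DeptNum": 10}, {"EmpNum": 2, "DeptNum": 10}]
departments[10] = "Marketing"  # a single update reaches every employee
resolved = sorted({departments[a["DeptNum"]] for a in assignments})
print(resolved)  # one consistent name
```

In the second design there is no second copy to forget, which is the point of removing redundancy.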
Functional Dependencies
A functional dependency occurs when the value of one or a
set of attribute(s) determines the value of a second set of
attribute(s) in the same table.
StudentID → StudentName
StudentID → (DormName, DormRoom, Fee)
The attribute on the left side of the functional dependency is
called the determinant.
Functional dependencies are not equations!
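A small sketch of how a functional dependency can be tested against sample rows. The function name fd_holds and the student rows are assumptions made for illustration. Passing this test only shows that the sample does not violate the dependency; a functional dependency is a statement about all valid data.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows."""
    seen = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


# Hypothetical sample data.
students = [
    {"StudentID": 1, "StudentName": "Abebe", "DormName": "North"},
    {"StudentID": 2, "StudentName": "Sara", "DormName": "North"},
    {"StudentID": 3, "StudentName": "Abebe", "DormName": "South"},
]

print(fd_holds(students, ["StudentID"], ["StudentName"]))  # True
print(fd_holds(students, ["StudentName"], ["StudentID"]))  # False
```

The second call shows why a functional dependency is not an equation: StudentID determines StudentName, but the reverse does not hold because two students share a name.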
Characteristics of Functional Dependencies
Diagrammatic representation
Determinant
Attribute or group of attributes on left-hand side of arrow.
Composite determinant:
A determinant of a functional dependency that consists
of more than one attribute.
(StudentName, ClassName) → (Grade)
Functional Dependency Rules
If A → (B, C), then A → B and A → C.
If (A, B) → C, then in general neither A nor B determines C by itself;
A → C or B → C cannot be concluded from (A, B) → C alone.
An Example Functional Dependency
Example Functional Dependency that holds for all Time
EmpNum  EmpEmail           EmpFname  EmpLname
123     [email protected]  John      Doe
456     [email protected]  Peter     Smith
555     [email protected]  Alan      Lee
633     [email protected]  Peter     Doe
787     [email protected]  Alan      Lee
If EmpNum is the PK then the FDs:
EmpNum → EmpEmail
EmpNum → EmpFname
EmpNum → EmpLname
must exist.
Functional Dependencies
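3 different ways you might see FDs depicted:
1. One FD per line:
EmpNum → EmpEmail
EmpNum → EmpFname
EmpNum → EmpLname
2. One determinant pointing to a group of attributes:
EmpNum → (EmpEmail, EmpFname, EmpLname)
3. Arrows drawn on the relation heading, from EmpNum to each other attribute:
EmpNum EmpEmail EmpFname EmpLname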
Transitive dependency
Consider attributes A, B, and C, where
A → B and B → C.
Functional dependencies are transitive, which means that we also
have the functional dependency A → C.
We say that C is transitively dependent on A through B.
Transitive dependency
Relation: EmpNum EmpEmail DeptNum DeptName
EmpNum → DeptNum
DeptNum → DeptName
DeptName is transitively dependent on EmpNum via DeptNum:
EmpNum → DeptName
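The transitive step can be sketched with an attribute closure, which repeatedly applies the known FDs until nothing new is added. The function name closure and the tuple encoding of FDs are assumptions made for this sketch; the two FDs are the ones stated above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we learn the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [
    (("EmpNum",), ("DeptNum",)),
    (("DeptNum",), ("DeptName",)),
]

print("DeptName" in closure({"EmpNum"}, fds))  # True: EmpNum -> DeptName
print("EmpNum" in closure({"DeptNum"}, fds))   # False: no FD leads back
```

EmpNum → DeptName is never listed explicitly; it appears only because the closure chains EmpNum → DeptNum with DeptNum → DeptName.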
Partial dependency
A partial dependency exists when an attribute B is functionally
dependent on an attribute A, and A is a component of a multipart
candidate key.
InvNum LineNum Qty InvDate
Candidate keys: {InvNum, LineNum}
InvDate is partially dependent on {InvNum, LineNum}, as InvNum is a
determinant of InvDate and InvNum is part of a candidate key.
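One way to look for partial dependencies, sketched under some assumptions: the relation has a single candidate key, FDs are encoded as (left, right) tuples, and (InvNum, LineNum) → Qty is assumed here because {InvNum, LineNum} is the key. The helper names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """List (subset, attribute) pairs where a proper subset of the key
    determines a non-key attribute."""
    non_key = set(attributes) - set(key)
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in sorted(closure(subset, fds) & non_key):
                found.append((subset, attr))
    return found


attributes = ["InvNum", "LineNum", "Qty", "InvDate"]
fds = [
    (("InvNum", "LineNum"), ("Qty",)),  # assumed: the key determines Qty
    (("InvNum",), ("InvDate",)),        # from the slide
]

print(partial_dependencies(("InvNum", "LineNum"), attributes, fds))
# [(('InvNum',), 'InvDate')]
```

Qty is not reported because it needs both parts of the key; only InvDate depends on a part of it.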
Multivalued Dependencies
A multivalued dependency occurs when a determinant determines a particular set
of values:
Employee →→ Degree
Employee →→ Sibling
PartKit →→ Part
The determinant of a multivalued dependency can never be a primary key
Characteristics of Functional Dependencies
Determinants should have the minimal number of attributes
necessary to maintain the functional dependency with the
attribute(s) on the right-hand side. This is called full functional
dependency.
Full functional dependency indicates that if A and B are
attributes of a relation, B is fully functionally dependent on
A if B is functionally dependent on A, but not on any
proper subset of A.
Example Full Functional Dependency
Exists in the Staff relation:
staffNo, sName → branchNo
Each value of (staffNo, sName) is associated with a single
value of branchNo.
However, branchNo is also functionally dependent on a subset of
(staffNo, sName), namely staffNo.
So (staffNo, sName) → branchNo is a partial dependency, not a full
functional dependency.
Cont'd...
Main characteristics of functional dependencies used in
normalization:
There is a one-to-one relationship between the attribute(s) on the
left-hand side (determinant) and those on the right-hand side of a
functional dependency.
The determinant has the minimal number of attributes necessary to
maintain the dependency with the attribute(s) on the right-hand side.
Normalization
Normalization: The process of decomposing
unsatisfactory "bad" relations by breaking up their
attributes into smaller relations.
Normal form: Condition using keys and FDs of a
relation to certify whether a relation schema is in a
particular normal form
Normal Forms: Review
Unnormalized – There are multi-valued attributes or
repeating groups
1 NF – No multi-valued attributes or repeating groups.
2 NF – 1 NF plus no partial dependencies
3 NF – 2 NF plus no transitive dependencies
First Normal Form (1NF)
First Normal Form is part of the definition of a relation (table)
itself.
This rule requires that all the attributes in a relation have
atomic domains.
The values in an atomic domain are indivisible units.
All attributes are atomic (no repeating groups).
Disallows composite attributes, multivalued attributes, and nested
relations, that is, attributes whose values for an individual tuple
are non-atomic.
Values must be simple.
Remove horizontal redundancies
No two columns hold the same information
No single column holds more than a single item
Cont'd…
Each row must be unique
Use a primary key
Benefits
Easier to query/sort the data
More scalable
Each row can be identified for updating
Converting into 1NF form
First Normal Form
The following is not in 1NF:
EmpNum  EmpPhone  EmpDegrees
123     233-9876
333     233-1231  BA, BSc, PhD
679     233-1231  BSc, MSc
EmpDegrees is a multi-valued field:
employee 679 has two degrees: BSc and MSc
employee 333 has three degrees: BA, BSc, PhD
To obtain 1NF relations without loss of information, replace the
above with two relations:
First Normal Form
Employee
EmpNum  EmpPhone
123     233-9876
333     233-1231
679     233-1231

EmployeeDegree
EmpNum  EmpDegree
333     BA
333     BSc
333     PhD
679     BSc
679     MSc
An outer join between Employee and EmployeeDegree will produce the
information we saw before
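The split and the outer join can be sketched in a few lines. The data is the slide's example; the list-of-degrees encoding and the function name left_outer_join are assumptions made for this sketch.

```python
# The unnormalized rows, with EmpDegrees modelled as a list per employee.
unnormalized = [
    {"EmpNum": 123, "EmpPhone": "233-9876", "EmpDegrees": []},
    {"EmpNum": 333, "EmpPhone": "233-1231", "EmpDegrees": ["BA", "BSc", "PhD"]},
    {"EmpNum": 679, "EmpPhone": "233-1231", "EmpDegrees": ["BSc", "MSc"]},
]

# 1NF: one relation per fact, with one atomic value per column.
employee = [(r["EmpNum"], r["EmpPhone"]) for r in unnormalized]
employee_degree = [(r["EmpNum"], d) for r in unnormalized for d in r["EmpDegrees"]]


def left_outer_join(employee, employee_degree):
    """Join on EmpNum, keeping employees that have no degree (degree = None)."""
    joined = []
    for emp_num, phone in employee:
        degrees = [d for e, d in employee_degree if e == emp_num]
        for degree in degrees or [None]:
            joined.append((emp_num, phone, degree))
    return joined


joined = left_outer_join(employee, employee_degree)
for row in joined:
    print(row)
```

An outer join is needed rather than an inner join because employee 123 has no degree and would otherwise disappear from the result.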
Second Normal Form (2NF)
Table must be in First Normal Form
Remove vertical redundancy
The same value should not repeat across rows
Composite keys
• All columns in a row must refer to BOTH parts of the key
Benefits
• Increased storage efficiency
• Less data repetition
All non-key attributes are fully dependent on the
PK (“no partial dependencies”)
A relation in 2NF will not have any partial
dependencies.
Converting to 2NF forms
Before we learn about the second normal form, we
need to understand the following −
Prime attribute − An attribute that is part of a
candidate key is known as a prime attribute.
Non-prime attribute − An attribute that is not part
of any candidate key is said to be a non-prime attribute.
If we follow second normal form, then every non-prime
attribute should be fully functionally dependent on
the whole candidate key.
That is, if X → A holds for a candidate key X and a non-prime
attribute A, then there should not be any
proper subset Y of X for which Y → A also holds true.
Second Normal Form (2NF)
This table has a composite primary key [Customer ID, Store ID].
The non-key attribute is [Purchase Location]. In this case,
[Purchase Location] only depends on [Store ID], which is only
part of the primary key. Therefore, this table does not satisfy
second normal form.
Second Normal Form (2NF)
To bring this table to second normal form, we break the table into two
tables, and now we have the following:
What we have done is to remove the partial functional dependency that we
initially had. Now, in the table [TABLE_STORE], the column [Purchase
Location] is fully dependent on the primary key of that table, which is [Store
ID].
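A sketch of the decomposition, assuming hypothetical sample rows (the slide's table is not reproduced here) and a hypothetical name table_purchase for the first table. Each new table is a projection of the original with duplicate rows removed.

```python
# Hypothetical rows: the location is repeated for every purchase at a store.
purchases = [
    {"Customer ID": 1, "Store ID": 1, "Purchase Location": "Los Angeles"},
    {"Customer ID": 1, "Store ID": 3, "Purchase Location": "San Francisco"},
    {"Customer ID": 2, "Store ID": 1, "Purchase Location": "Los Angeles"},
    {"Customer ID": 3, "Store ID": 2, "Purchase Location": "New York"},
    {"Customer ID": 4, "Store ID": 3, "Purchase Location": "San Francisco"},
]


def project(rows, columns):
    """Keep only the given columns and drop duplicate rows (order preserved)."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# The key columns stay together; the partially dependent column moves out
# with the part of the key that determines it.
table_purchase = project(purchases, ["Customer ID", "Store ID"])
table_store = project(purchases, ["Store ID", "Purchase Location"])

print(len(table_purchase), len(table_store))  # 5 3
```

Each store's location is now stored once in table_store instead of once per purchase.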
Removing FD
Normalizing into 2NF
Third Normal Form (3NF)
• A relation in 3NF will not have any transitive dependencies of
non-key attribute on a candidate key through another non-key
attribute.
Table must be in Second Normal Form
If your table is 2NF, there is a good chance it is 3NF
All columns must relate directly to the primary key
Benefits
No extraneous data
For a relation to be in Third Normal Form, it must be
in Second Normal Form and the following must hold −
No non-prime attribute is transitively dependent on
a candidate key. Equivalently, for any non-trivial functional
dependency X → A, either X is a superkey or A
is a prime attribute.
Third Normal Form
Consider this Employee relation:
EmpNum EmpName DeptNum DeptName
Candidate keys are? …
EmpName, DeptNum, and DeptName are non-key attributes.
DeptNum determines DeptName, a non-key attribute, and DeptNum is not a
candidate key.
Is the relation in 3NF? … no
Is the relation in 2NF? … yes
Third Normal Form
EmpNum EmpName DeptNum DeptName
We correct the situation by decomposing the original relation into two 3NF
relations. Note the decomposition is lossless.
EmpNum EmpName DeptNum
DeptNum DeptName
Verify these two relations are in 3NF.
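The lossless claim can be checked on sample data by projecting the original relation into the two new ones and joining them back on DeptNum. The rows below are hypothetical; a check on one sample illustrates the property but does not prove it for all data.

```python
# Hypothetical Employee rows: (EmpNum, EmpName, DeptNum, DeptName).
employee = [
    (1, "Almaz", 10, "Sales"),
    (2, "Bekele", 10, "Sales"),
    (3, "Chala", 20, "Research"),
]

# Decompose by projection; sets remove the repeated department rows.
emp = sorted({(num, name, dnum) for num, name, dnum, _ in employee})
dept = sorted({(dnum, dname) for _, _, dnum, dname in employee})

# Natural join on DeptNum.
rejoined = sorted(
    (num, name, dnum, dname)
    for num, name, dnum in emp
    for d, dname in dept
    if d == dnum
)

print(dept)                          # each department appears once
print(rejoined == sorted(employee))  # True: nothing lost, nothing invented
```

The join recovers exactly the original rows because DeptNum, the shared attribute, is a key of the department relation.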
Converting into 3NF
Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) is a stricter extension of Third Normal
Form.
BCNF states that − For any non-trivial functional dependency X →
A, X must be a super-key.
In the example, Stu_ID is the super-key in the relation
Student_Detail and Zip is the super-key in the relation ZipCodes.
So Stu_ID → Stu_Name, Zip and Zip → City both have super-key
determinants, which confirms that both relations are in BCNF.
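The BCNF condition can be sketched as a check over a set of FDs: a determinant is a super-key when its attribute closure covers the whole relation. The function names and the combined single-table relation in the last call are assumptions added for contrast; the two FDs are the ones stated above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a super-key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations


fds_student = [(("Stu_ID",), ("Stu_Name", "Zip"))]
fds_zip = [(("Zip",), ("City",))]

print(bcnf_violations(["Stu_ID", "Stu_Name", "Zip"], fds_student))  # []
print(bcnf_violations(["Zip", "City"], fds_zip))                    # []

# Hypothetical single table holding all four attributes, for contrast.
combined = ["Stu_ID", "Stu_Name", "Zip", "City"]
print(bcnf_violations(combined, fds_student + fds_zip))
# [(('Zip',), ('City',))]
```

In the combined table Zip determines City without being a super-key, which is the kind of dependency that splitting into Student_Detail and ZipCodes removes.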
4NF form