Normalization

a database schema design technique to minimize data redundancy and avoid modification anomalies
Chapter 8 of the Textbook

©Anne Haxthausen and Flemming Schmidt


These slides have been prepared by Anne Haxthausen, partly reusing/modifying slides by Flemming Schmidt. Some
examples come from Silberschatz, Korth, Sudarshan, 2010.

7. Normalization 02170 Database Systems 1


Contents
▪ About Normalization
▪ Theoretical Foundations I: Functional Dependencies
▪ Normal Forms 1NF-3NF: Original Definitions
▪ Normal Forms 2NF-3NF, BCNF: General Definitions
▪ Higher Normal Forms: 4NF, 5NF, …


About Normalization
▪ Normalization is a simple, practical technique used to
minimize data redundancy and avoid modification anomalies.
• What: Normalization decomposes a table into smaller tables
without losing information, and links them by defining foreign keys.
• The objective is to isolate data so that additions, updates and
deletions can be made in just one place, and then the changes can be
propagated through the rest of the database using the defined foreign
keys. The goal is both to ensure data consistency and to avoid
extensive searches.


Redundancy and Modification Anomalies
▪ Redundancy is the Evil of Databases
• Redundancy means that the same data is stored in more than one
place.
• The problem with redundancy is the risk of modification anomalies:
when data is modified (with SQL INSERT, UPDATE and DELETE
commands), there is a risk that it is not modified everywhere,
making the database inconsistent.
• The problem with an inconsistent database is that it might return
different answers to the same question asked in different ways.
• If the database is redundancy free, then the DBMS can modify the
data efficiently, without having to search the entire database.


Redundancy and Modification Anomalies
• Suppose we combine instructor and department:

• This design has two problems:


• It has redundant information: building and budget are repeated
for each instructor working in the same department. This can
give rise to modification anomalies where the data becomes
inconsistent.
• It is not possible to store information about a department unless
at least one instructor works in the department.
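
To make the update anomaly concrete, here is a minimal Python sketch that is not part of the original slides. The table name inst_dept, the attribute names and all values are made-up illustration data; only the idea that building and budget are repeated for each instructor of a department is taken from the slide.

```python
# Hypothetical rows of a combined instructor/department table.
# Building and budget are repeated for every instructor of a department.
inst_dept = [
    {"id": 1, "name": "Ann", "dept_name": "Physics", "building": "B1", "budget": 70000},
    {"id": 2, "name": "Bob", "dept_name": "Physics", "building": "B1", "budget": 70000},
    {"id": 3, "name": "Cy", "dept_name": "Music", "building": "B2", "budget": 80000},
]


def inconsistent_departments(rows):
    """Return the departments for which more than one budget is recorded."""
    budgets = {}
    for row in rows:
        budgets.setdefault(row["dept_name"], set()).add(row["budget"])
    return sorted(dept for dept, values in budgets.items() if len(values) > 1)


before = inconsistent_departments(inst_dept)

# An update that reaches only one of the two Physics rows.
inst_dept[0]["budget"] = 75000
after = inconsistent_departments(inst_dept)

print(before)  # []
print(after)   # ['Physics']
```

After the partial update, the question "what is the budget of Physics?" has two different answers depending on which row is read. With a separate department table there would be only one row to update.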



Features of Good Relational Design
• To avoid modification anomalies, additional restrictions can be
imposed on tables/relation schemas.
• Tables/relation schemas with such restrictions are said to be in a given
normal form like “Third Normal Form” or 3NF.
• Normalization is the process of bringing tables and their relation
schemas to higher normal forms by removing redundancy. Normally it
is done by projecting a table onto two or more smaller tables.
• Normalization is information preserving: it yields lossless
decompositions (usually projections) and is fully reversible, normally
by joining the normalized tables back into the original table.


Normal Form Hierarchy
• If a table/relation schema is in, e.g., 3NF, then by definition it is also in
2NF and 1NF.
• Normalization proceeds from top to bottom:
1NF – First Normal Form
2NF – Second Normal Form
3NF – Third Normal Form
BCNF – Boyce-Codd Normal Form
4NF – Fourth Normal Form
5NF – Fifth Normal Form
DKNF – Domain Key Normal Form


Theoretical Foundations: Functional Dependencies
▪ Mathematics needed to define 1NF, 2NF, 3NF and BCNF
• Functional dependencies: attribute associations within a relation
(schema).
• Trivial and nontrivial functional dependencies
• Keys defined in terms of functional dependencies
• Armstrong’s axioms and derived theorems: to find derived functional dependencies

▪ Mathematics needed to define 4NF (later)
• Multivalued dependencies.


Functional Dependencies
• Describe dependencies between attribute sets A and B of a relation
schema R:
• Basically a many-to-one association from one set A of attributes to
another set B of attributes within a given relation.
• Example: Consider Shipments(Vendor, Part, Qty):
{Vendor, Part} → {Qty} is a functional dependency holding for Shipments.
The attribute set {Vendor, Part} functionally determines the attribute set {Qty}.
The attribute set {Qty} is functionally dependent on the attribute set {Vendor, Part}.


Formal Definition of Functional Dependency
Let X and Y be subsets of the set of attributes of a relation schema R.
▪ A functional dependency X → Y holds for R if and only if
• in every legal instance of R,
each X value is associated with precisely one Y value (i.e. whenever two rows
have the same X value, they also have the same Y value).
▪ When X → Y holds for R, we say
• Y is functionally dependent on X and X functionally determines Y
• X is said to be the determinant and Y the dependent.
• If X = {A1, …, An}, Y = {B1, …, Bm} we sometimes write A1, …, An → B1, …, Bm or A1 … An → B1 … Bm
▪ Trivial Dependencies
• X → Y is trivial if Y ⊆ X.
• Example 1: {Vendor, Part} → {Vendor}
• Example 2: {Part} → {Part}

▪ Nontrivial Dependencies
• Those FDs which are not trivial.
• Nontrivial FDs lead to definitions of integrity constraints like keys.
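
The "same X value implies same Y value" condition can be tested on a single table instance. The sketch below is an illustration only and is not from the slides: the function name fd_holds and the sample rows of Shipments are assumptions.

```python
def fd_holds(rows, determinant, dependent):
    """Check X -> Y on one instance: equal X values must have equal Y values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical instance of Shipments(Vendor, Part, Qty).
shipments = [
    {"Vendor": "V1", "Part": "P1", "Qty": 100},
    {"Vendor": "V1", "Part": "P2", "Qty": 200},
    {"Vendor": "V2", "Part": "P1", "Qty": 300},
]

print(fd_holds(shipments, ["Vendor", "Part"], ["Qty"]))     # True
print(fd_holds(shipments, ["Part"], ["Qty"]))               # False
print(fd_holds(shipments, ["Vendor", "Part"], ["Vendor"]))  # True (trivial FD)
```

A result of True only says that this one instance does not contradict the dependency. Since the definition quantifies over every legal instance, an instance can refute a functional dependency but never prove it.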
Functional Dependencies
▪ Many functional dependencies can be proposed.
▪ Example: Shipment1(Vendor, City, Part, Qty) (where a vendor can only be in one city)

No    Determinant       FD   Dependent   Validity   Remark
1     {Vendor, Part}    →    {Qty}       Legal
2     {Vendor}          →    {City}      Legal
3     {Qty}             →    {Vendor}    Illegal
4     {Part}            →    {Qty}       Illegal
5     {Vendor, Part}    →    {City}      Legal      Derived (2)
6     {Vendor}          →    {Vendor}    Legal      Trivial
…
256                     →

• To decide whether an FD is valid (legal for all relation instances), one has to consider the real world.
• From a set of 4 attributes, it is possible to make 2^4 = 16 subsets.
• Combining 16 determinants with 16 dependents gives 256 potential FDs.
• Of the 256 potential FDs, some are valid and some are not.
• (Canonical) Cover Set: An (irreducible/minimal) set of valid functional dependencies
from which all valid functional dependencies can be derived. In the example:
{ {Vendor, Part} → {Qty}, {Vendor} → {City} } is a canonical cover set.
• Closure Set F+ of a set F of functional dependencies: The set of all valid functional
dependencies that can be logically derived from F.
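
The counting argument and the "Derived (2)" remark for FD no. 5 can be checked mechanically. The following Python sketch is an illustration, not part of the slides. It uses the standard attribute-closure algorithm (an FD X → Y follows from the cover set exactly when Y is contained in the closure of X); the helper names subsets and closure are assumptions.

```python
from itertools import combinations

ATTRIBUTES = ["Vendor", "City", "Part", "Qty"]

# Cover set from the example, written as (determinant, dependent) pairs.
COVER = [
    (frozenset({"Vendor", "Part"}), frozenset({"Qty"})),
    (frozenset({"Vendor"}), frozenset({"City"})),
]


def subsets(attrs):
    """All subsets of attrs, including the empty set."""
    return [frozenset(c) for r in range(len(attrs) + 1) for c in combinations(attrs, r)]


def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


all_subsets = subsets(ATTRIBUTES)
potential = [(x, y) for x in all_subsets for y in all_subsets]
# X -> Y is in the closure set F+ exactly when Y is a subset of closure(X).
valid = [(x, y) for x, y in potential if y <= closure(x, COVER)]

fd5 = (frozenset({"Vendor", "Part"}), frozenset({"City"}))
fd3 = (frozenset({"Qty"}), frozenset({"Vendor"}))

print(len(all_subsets), len(potential))  # 16 256
print(fd5 in valid)                      # True
print(fd3 in valid)                      # False
```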



Super Key and Functional Dependencies
Let R be a relation schema and let K be a subset of the set AR of
attributes of R.
▪ Superkey, original definition:
• K is a superkey of R if, in any legal instance of R, for all pairs t1 and t2 of
tuples in the instance, t1 ≠ t2 implies t1[K] ≠ t2[K]. Hence, a specific
K value uniquely identifies a specific tuple in R.
▪ Superkey defined by functional dependencies:
• K is a superkey of R if K → X holds for every attribute X of R.
• Superkey examples for Shipment1(Vendor, City, Part, Qty)
• {Vendor, City, Part}
Knowing the values of Vendor, City and Part, the value of Qty is determined.
• However, of interest is a superkey with a minimal set of attributes:
{Vendor, Part}
Knowing the values of Vendor and Part, the values of City and Qty are
determined.


Candidate Key and Functional Dependencies
▪ A candidate key is a minimal superkey:
• K is a candidate key for R if and only if
K → AR and for no proper subset α ⊂ K, α → AR
• Candidate key example for Shipment1(Vendor, City, Part, Qty):
{Vendor, Part}
• In some cases R can have several candidate keys!

▪ Primary key
• A candidate key is selected by the DBA to be the primary key for a
relation.
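
The superkey and candidate key definitions translate directly into a small search. The Python sketch below is an illustration that is not part of the slides; the helper names closure, is_superkey and candidate_keys are assumptions, and the brute-force enumeration is only practical for small schemas.

```python
from itertools import combinations

ATTRIBUTES = frozenset({"Vendor", "City", "Part", "Qty"})
# Cover set of Shipment1 as (determinant, dependent) pairs.
FDS = [
    (frozenset({"Vendor", "Part"}), frozenset({"Qty"})),
    (frozenset({"Vendor"}), frozenset({"City"})),
]


def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_superkey(attrs, all_attrs, fds):
    """K is a superkey if K determines every attribute of the relation."""
    return closure(attrs, fds) >= all_attrs


def candidate_keys(all_attrs, fds):
    """Minimal superkeys, found by trying attribute sets in order of size."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = frozenset(combo)
            if any(key < candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(candidate, all_attrs, fds):
                keys.append(candidate)
    return keys


print(is_superkey({"Vendor", "City", "Part"}, ATTRIBUTES, FDS))  # True
print(is_superkey({"Vendor"}, ATTRIBUTES, FDS))                  # False
print([sorted(k) for k in candidate_keys(ATTRIBUTES, FDS)])      # [['Part', 'Vendor']]
```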



Armstrong’s Rules
Let R be a relation schema and let F be a set of
functional dependencies on R.
▪ Armstrong’s Rules
• are a set of axioms and derived theorems for deriving
functional dependencies from F
• are
• sound: only valid FDs are derived from valid FDs.
• complete: all valid FDs (the closure set of F) can be derived from a
cover set F.
• can e.g. be used for finding candidate keys


Armstrong’s Axioms and Derived Theorems
Let X, Y, Z, V and O be sets of attributes.
▪ Armstrong’s axioms
1. Reflexivity: If Y is a subset of X, then X → Y
2. Augmentation: If X → Y, then XZ → YZ
Note: XZ is used as a shorthand for X ∪ Z
Note: XY = YX and XX = X; sets have no order and no repetitions
3. Transitivity: If X → Y and Y → Z, then X → Z
▪ Derived theorems
4. Self-determination: X → X
5. Decomposition: If X → YZ, then X → Y and X → Z
6. Union: If X → Y and X → Z, then X → YZ
7. Composition: If X → Y and Z → V, then XZ → YV
8. General Unification: If X → Y and Z → V, then X ∪ (Z – Y) → YV
(where ‘∪’ is set union and ‘–’ is set difference)
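
As an illustration of why rules 4 to 8 are called derived, the worked derivation below (not part of the slides) obtains the Union and Decomposition rules from the three axioms only. The step numbering is local to this sketch.

```latex
\begin{align*}
&\textbf{Union: assume } X \to Y \text{ and } X \to Z.\\
&1.\quad X \to XY  && \text{augmentation of } X \to Y \text{ with } X \text{, using } XX = X\\
&2.\quad XY \to YZ && \text{augmentation of } X \to Z \text{ with } Y\\
&3.\quad X \to YZ  && \text{transitivity of 1 and 2}\\[6pt]
&\textbf{Decomposition: assume } X \to YZ.\\
&4.\quad YZ \to Y  && \text{reflexivity, since } Y \subseteq YZ\\
&5.\quad X \to Y   && \text{transitivity of } X \to YZ \text{ and 4}\\
&6.\quad YZ \to Z  && \text{reflexivity, since } Z \subseteq YZ\\
&7.\quad X \to Z   && \text{transitivity of } X \to YZ \text{ and 6}
\end{align*}
```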



Using Armstrong’s Rules
▪ To propose candidate keys
• Given R(A, B, C, D, E) with
functional dependencies: A → BC, CD → E, B → D, E → A
1. A → A         Rule 4 Self-determination
2. A → B         Rule 5 Decomposition of given A → BC
3. A → C         Rule 5 Decomposition of given A → BC
4. A → D         Rule 3 Transitivity of A → B and given B → D
5. A → CD        Rule 6 Union of A → C and A → D
6. A → E         Rule 3 Transitivity of A → CD and given CD → E
7. A → ABCDE     Rule 6 Union of A → A, A → B, A → CD and A → E
8. E → ABCDE     Rule 3 Transitivity of given E → A and A → ABCDE
9. Candidate keys: both A and E! (BC and CD are candidate keys as well, since
CD → E and B → D.) Select one for the Primary Key!
10. Relation Schema: R(A, B, C, D, E) if A is selected as Primary Key or
R(E, A, B, C, D) if E is selected as Primary key
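
The derivation above can be cross-checked by computing attribute closures, which is a standard mechanical way of applying Armstrong's rules. The Python sketch below is an illustration and not part of the slides; parse_fds, closure and candidate_keys are hypothetical helper names, and single-letter attribute names are assumed.

```python
from itertools import combinations


def parse_fds(text):
    """Parse 'A->BC, CD->E' into (determinant, dependent) pairs of attribute sets."""
    fds = []
    for part in text.split(","):
        lhs, rhs = part.split("->")
        fds.append((frozenset(lhs.strip()), frozenset(rhs.strip())))
    return fds


def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = frozenset(combo)
            if any(key < candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= all_attrs:
                keys.append(candidate)
    return keys


R = frozenset("ABCDE")
F = parse_fds("A->BC, CD->E, B->D, E->A")

print("".join(sorted(closure("A", F))))                   # ABCDE
print("".join(sorted(closure("B", F))))                   # BD
print(["".join(sorted(k)) for k in candidate_keys(R, F)])  # ['A', 'E', 'BC', 'CD']
```

The closures of A and E reproduce steps 7 and 8. The search also reports the two composite candidate keys BC and CD.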



Normal Forms 1NF-3NF: Original Definitions
• 2NF and 3NF are defined using the notion of functional dependencies.
The starting point for the definitions is:
• a relational schema R
• a cover set FD of functional dependencies for R
• The original definitions (here) assume the tables have one candidate
key which has been chosen as primary key, in the following called the key.
• The generalized definitions of 2NF and 3NF (in the book) extend
the original definitions by taking several candidate keys into account.
• When there is only one candidate key, the original and the generalized
definitions are equivalent.


Normal Forms Defined Informally
▪ 1st normal form
• All attributes depend on the key.
▪ 2nd normal form
• All attributes depend on the whole key.
▪ 3rd normal form
• All attributes depend on nothing but the key.


1NF – First Normal Form
▪ For a relation to be in First Normal Form 1NF
1. Each attribute value must be a single value (atomic, not multivalued or
composite).


Normalization to 1NF
▪ OrdersTable below is not in 1NF,
as the values of the attribute ItemNo are not atomic.
OrdersTable(OrderNo, ItemNo)

▪ Normalization to 1NF
Orders1NF(OrderNo, ItemNo)


2NF – Second Normal Form
▪ For a relation schema R to be in Second Normal Form 2NF
1. It must be in 1NF.
2. No non primary key attribute A may depend on a strict subset Kpart
of the primary key attribute set K, i.e. Kpart → A with Kpart ⊂ K must not
hold. A must depend on the entire primary key K.
▪ Some special cases where a 1NF relation schema is in 2NF:
• The primary key consists of only one attribute.
• The primary key consists of all attributes in the relation.
• Each non primary key attribute depends on all the attributes of the primary
key.
▪ Normalization 1NF to 2NF:
• Move the set of all attributes A which depend on a Kpart ⊂ K to a new
relation R2 together with a copy of Kpart, which becomes the primary key of R2
as well as a foreign key in R1. If there is only one such attribute A we have:
R(K, …, A) is replaced by R1(K, …) foreign key Kpart references R2, and R2(Kpart, A)
Key of R1: K\Kpart if K\Kpart → Kpart, otherwise K.
• Repeat the step above if R1 or R2 is not yet in 2NF.
• Decomposition is data lossless:
(select K, ... from R) natural join (select Kpart, A from R) = select * from R


Normalization to 2NF - Example
▪ Orders1NF is in 1NF, but not 2NF:
• Orders1NF(OrderNo, ItemNo, ItemName)
with FD: ItemNo → ItemName
• ItemName depends on ItemNo only, and not on
the full primary key (OrderNo, ItemNo). Violation of rule 2 for 2NF.

▪ Normalization to (Orders2NF and Items in) 2NF
• Orders2NF(OrderNo, ItemNo) foreign key(ItemNo) references Items(ItemNo)
• Items(ItemNo, ItemName)
• Note that ItemNo is included in the key of Orders2NF, as OrderNo ↛ ItemNo
• The associated normalization of the tables is done by projections:
Orders2NF ≡ ∏OrderNo, ItemNo (Orders1NF)
Items ≡ ∏ItemNo, ItemName (Orders1NF)
• A Natural Join brings back the table:
Orders1NF ≡ Orders2NF |X| Items
• Normalization is information preserving.
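
The projection and natural join steps of this example can be tried out on a small instance. The Python sketch below is an illustration that is not part of the slides: the rows of Orders1NF are made-up sample data, and relation, project and natural_join are hypothetical helpers that model a relation as a set of rows.

```python
def relation(rows):
    """A relation is a set of rows; a row is a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def project(rel, attrs):
    """Relational projection: keep only attrs; duplicate rows collapse in the set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def natural_join(left, right):
    """Combine rows that agree on all attributes they have in common."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l, r = dict(l_row), dict(r_row)
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.add(frozenset({**l, **r}.items()))
    return joined


# Hypothetical instance of Orders1NF(OrderNo, ItemNo, ItemName).
orders_1nf = relation([
    {"OrderNo": 1, "ItemNo": 10, "ItemName": "Bolt"},
    {"OrderNo": 1, "ItemNo": 20, "ItemName": "Nut"},
    {"OrderNo": 2, "ItemNo": 10, "ItemName": "Bolt"},
])

orders_2nf = project(orders_1nf, {"OrderNo", "ItemNo"})
items = project(orders_1nf, {"ItemNo", "ItemName"})

print(len(orders_2nf), len(items))                     # 3 2
print(natural_join(orders_2nf, items) == orders_1nf)   # True
```

The name "Bolt" is stored twice in Orders1NF but only once in Items, and the join recovers exactly the original rows.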
Normalization to 2NF - Example
▪ Teachers in 1NF
• Teachers(InstID, DeptName, InstName, Salary, Building, Budget)
• InstID → InstName, DeptName, Salary,
DeptName → Building, Budget
• Violation of 2NF when (InstID, DeptName) is taken as the primary key:
Building and Budget depend on DeptName only, and not on the full primary key.

▪ Normalizing Teachers to Instructor and Department in 2NF
• Instructor(InstID, InstName, DeptName, Salary)
foreign key(DeptName) references Department(DeptName)
• Department(DeptName, Building, Budget)
• Note that DeptName is not included in the key of Instructor, as InstID → DeptName.
• The associated normalization of the tables is done by projections:
Instructor ≡ ∏InstID, InstName, DeptName, Salary (Teachers)
Department ≡ ∏DeptName, Building, Budget (Teachers)
• A Natural Join brings back the table:
Teachers ≡ Instructor |X| Department
• Normalization is information preserving.


3NF – Third Normal Form
▪ For a relation schema R(K, …, A) to be in Third Normal Form 3NF
1. It must be in 2NF.
2. Each non primary key attribute A must depend directly on the entire
primary key K. It must not depend transitively via other attributes B,
like K → B → A, where B ↛ K and B → A is non-trivial.
▪ Normalization 2NF to 3NF
• Move attributes A which are transitively dependent on K via B,
i.e. K → B → A for some attribute set B, to a new relation R2 together
with a copy of the attributes in B, which become the
primary key of R2 and constitute a foreign key in R1:
R(K, …, B, A) is replaced by R1(K, ..., B) foreign key B references R2, and R2(B, A)
• Repeat the step above if R1 or R2 is not yet in 3NF.
• Decomposition is data lossless:
(select K, ..., B from R) natural join (select B, A from R) = select * from R


Normalization to 3NF - Example
▪ Customers2NF in 2NF
• Customers2NF(CustomerNo, PostNo, CityName)
• CustomerNo → PostNo, PostNo → CityName
• CityName depends on the full primary key, but
transitively via PostNo. Violation of rule 2 for 3NF.

▪ Normalization to 3NF
• Customers3NF(CustomerNo, PostNo),
foreign key(PostNo) references Post(PostNo)
• Post(PostNo, CityName)
• The associated normalization of the tables is done by projections:
Customers3NF ≡ ∏CustomerNo, PostNo (Customers2NF)
Post ≡ ∏PostNo, CityName (Customers2NF)
• A Natural Join brings back the table:
Customers2NF ≡ Customers3NF |X| Post
• Normalization is information preserving.
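
The same decomposition can be tried out in SQL. The sketch below is an illustration and not part of the slides: it uses Python's built-in sqlite3 module with an in-memory database, and the three customer rows and the column types are made-up assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical instance of Customers2NF; CityName is repeated per PostNo.
    CREATE TABLE Customers2NF (CustomerNo INTEGER PRIMARY KEY, PostNo TEXT, CityName TEXT);
    INSERT INTO Customers2NF VALUES
        (1, '2800', 'Lyngby'),
        (2, '2800', 'Lyngby'),
        (3, '8000', 'Aarhus');

    -- Projection onto (PostNo, CityName).
    CREATE TABLE Post (PostNo TEXT PRIMARY KEY, CityName TEXT);
    INSERT INTO Post SELECT DISTINCT PostNo, CityName FROM Customers2NF;

    -- Projection onto (CustomerNo, PostNo) with the foreign key from the slide.
    CREATE TABLE Customers3NF (
        CustomerNo INTEGER PRIMARY KEY,
        PostNo TEXT REFERENCES Post(PostNo)
    );
    INSERT INTO Customers3NF SELECT CustomerNo, PostNo FROM Customers2NF;
""")

original = conn.execute(
    "SELECT CustomerNo, PostNo, CityName FROM Customers2NF ORDER BY CustomerNo"
).fetchall()
rejoined = conn.execute(
    "SELECT CustomerNo, PostNo, CityName FROM Customers3NF NATURAL JOIN Post "
    "ORDER BY CustomerNo"
).fetchall()
post_rows = conn.execute("SELECT COUNT(*) FROM Post").fetchone()[0]

print(post_rows)             # 2
print(rejoined == original)  # True
```

Each city name is now stored once per postal code, and the natural join over PostNo returns the original rows.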
Normal Forms 2NF-3NF, BCNF: General Definitions
• Are defined using the notion of functional dependencies.
• The original definitions (on previous slides) of 2NF-3NF assume the
tables have one candidate key which has been chosen as primary key.
• The generalized definitions of 2NF-3NF and BCNF (in the book) take
several candidate keys into account.
• When there is only one candidate key, the original and the generalized
definitions of 2NF-3NF are equivalent.
• When there is only one candidate key: 3NF and BCNF are the same.
• 3NF and BCNF are the same (according to Date) unless
• there are several composite candidate keys CK1 and CK2
• which are overlapping (CK1 ∩ CK2 ≠ {})
This exception is very rare.



2NF-3NF General Definitions
▪ For a relation schema R to be in Second Normal Form 2NF
1. It must be in 1NF.
2. No non candidate key attribute A may depend on a strict
subset Kpart of any candidate key attribute set K, i.e. Kpart → A with
Kpart ⊂ K must not hold. A must depend on the entire K.
▪ For a relation schema R(K, …, A) to be in Third Normal Form 3NF
1. It must be in 2NF.
2. No non candidate key attribute A depends transitively on any
candidate key K via other attributes B, like K → B → A, where B ↛ K
and B → A is non-trivial.
In the book there is another general definition of 3NF. The two
definitions are equivalent.


BCNF - Boyce-Codd Normal Form
▪ Boyce-Codd Normal Form BCNF
• A relation schema is in Boyce-Codd Normal Form BCNF if and only if every
nontrivial, left-irreducible FD α → β has a candidate key as its determinant α.
• Left-irreducible means that there is no proper subset s of α such that s → β.
▪ Normalization of R(A1, …, An, B1, …, Bn, C) to BCNF:
• Assume B1, …, Bn → C is nontrivial, left-irreducible, but B1, …, Bn is not a
candidate key.
• R(A1, …, An, B1, …, Bn, C) is replaced by
R1(A1, …, An, B1, …, Bn) foreign key B1, …, Bn references R2, and R2(B1, …, Bn, C)
• Repeat the step above if R1 or R2 is not yet in BCNF.
▪ Normalization to BCNF, example:
• R(A, B, C) with FD = {A → B, B → C}
is not in BCNF (B → C, but B is not a candidate key)
• Decomposition of R: R1(A, B) and R2(B, C).


BCNF versus 3NF - Example
Given functional dependencies
Pizza, ToppingType → Topping
Topping → ToppingType
on R(Pizza, ToppingType, Topping)

Pizza   Topping      ToppingType
1       mozzarella   cheese
1       pepperoni    meat
1       olives       vegetable
2       mozzarella   cheese
2       sausage      meat
2       peppers      vegetable

Then R has two candidate keys:
• Pizza, ToppingType
• Pizza, Topping
The first is chosen as primary key.
Then R(Pizza, Topping, ToppingType)
is in 3NF, but not in BCNF, as Topping → ToppingType and Topping is not a candidate
key.
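
A BCNF check can be sketched with attribute closures. The Python code below is an illustration, not part of the slides. Instead of the slide's formulation (left-irreducible FDs with a candidate key as determinant) it uses the equivalent textbook test that the determinant of every nontrivial given FD must be a superkey; closure and bcnf_violations are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(all_attrs, fds):
    """Nontrivial FDs among fds whose determinant is not a superkey."""
    everything = frozenset(all_attrs)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != everything
    ]


pizza_attrs = {"Pizza", "Topping", "ToppingType"}
pizza_fds = [
    (frozenset({"Pizza", "ToppingType"}), frozenset({"Topping"})),
    (frozenset({"Topping"}), frozenset({"ToppingType"})),
]

abc_fds = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
]

for lhs, rhs in bcnf_violations(pizza_attrs, pizza_fds):
    print(sorted(lhs), "->", sorted(rhs))   # ['Topping'] -> ['ToppingType']

for lhs, rhs in bcnf_violations("ABC", abc_fds):
    print(sorted(lhs), "->", sorted(rhs))   # ['B'] -> ['C']
```

For a relation with a given cover set it is enough to test the given FDs, which is why the function does not enumerate the whole closure set F+.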



Higher Normal Forms: 4NF, 5NF, …
▪ Are defined using the notion of multivalued dependencies.
▪ Fourth Normal Form
A relation schema with multiple unrelated multivalued dependencies
must be broken into relation schemas that separate the unrelated
attributes.
▪ Fifth Normal Form is covered neither in the book nor here.


Multivalued Dependencies
Let R be a relation schema, let α and β be disjoint subsets of the attributes of
R, and let γ be the remaining attributes of R.
• A multivalued dependency α →→ β holds for R if and only if, in every
legal instance of R, the set of β values matching a given (α, γ) value pair
depends only on the α value and is independent of the γ value.
• When α →→ β holds, we say α multivalue determines β.
• When α →→ β holds for R, then α →→ γ also holds for R.
• When α → β holds for R, then α →→ β also holds for R.


Multivalued Dependencies - Example
▪ Example
• CTX(Course, Teacher, Text)
Course →→ Teacher, e.g. ‘Physics’ →→ ‘Prof. Green’ and ‘Prof. Brown’
Course →→ Text, e.g. ‘Physics’ →→ ‘Principles of Optics’ and ‘Basic Mechanics’


4NF - Fourth Normal Form
▪ Fourth Normal Form 4NF
• A relation schema R is in Fourth Normal Form 4NF if and only if,
whenever a non-trivial α →→ β holds, α is a key of R (all attributes of R
are also functionally dependent on α).
α →→ β is trivial means β ⊆ α or α ∪ β = set of all attributes in R.

▪ Normalization from BCNF to 4NF
• Assume α →→ β holds non-trivially and α is NOT a key of R.
• This usually arises from many-to-many relationship sets or
multivalued entity sets.
• R(α, β, γ) is replaced by R1(α, β), R2(α, γ)
▪ Decomposition is data lossless:
(select α, β from R) natural join (select α, γ from R) = select * from R
Multivalued Dependencies and 4NF
▪ Example
• CTX(Course, Teacher, Text)
Course →→ Teacher, e.g. ‘Physics’ →→ ‘Prof. Green’ and ‘Prof. Brown’
Course →→ Text, e.g. ‘Physics’ →→ ‘Principles of Optics’ and ‘Basic Mechanics’
• CTX is in 3NF, but not in 4NF, as Course is not a key!
• Decompose to reach 4NF:
CT(Course, Teacher), CX(Course, Text)
These are in 4NF, as Course →→ Teacher and Course →→ Text, respectively, are now
trivial.
• A Natural Join of CT and CX over Course will restore CTX.
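
The CTX example is small enough to write out in full. The Python sketch below is an illustration and not part of the slides: it uses the teachers and texts named on the slide and assumes, as the multivalued dependencies require, that CTX holds every teacher/text combination for the course.

```python
from itertools import product

# Values taken from the slide's Physics example.
teachers = {"Physics": ["Prof. Green", "Prof. Brown"]}
texts = {"Physics": ["Principles of Optics", "Basic Mechanics"]}

# Course ->> Teacher and Course ->> Text: teachers and texts of a course are
# independent, so CTX must hold every combination of the two.
ctx = {
    (course, teacher, text)
    for course in teachers
    for teacher, text in product(teachers[course], texts[course])
}

# Projections onto (Course, Teacher) and (Course, Text).
ct = {(course, teacher) for course, teacher, _ in ctx}
cx = {(course, text) for course, _, text in ctx}

# Natural join of CT and CX over Course.
rejoined = {
    (course, teacher, text)
    for course, teacher in ct
    for other_course, text in cx
    if course == other_course
}

print(len(ctx), len(ct), len(cx))  # 4 2 2
print(rejoined == ctx)             # True
```

With two teachers and two texts, CTX needs four rows while CT and CX need two rows each. Adding a third teacher would add two rows to CTX but only one row to CT.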



Summary
• Redundancy is the evil of all databases, leading to modification
anomalies (i.e. problems with SQL INSERT, UPDATE and DELETE).
• A good relational design avoids redundancy by decomposing
tables into multiple tables without redundancy.
• Decomposition of tables is done by lossless projections, and a
natural join of the decomposed tables will bring back the original
table.
• The decomposition process is called normalization, and the tables
and their relation schemas gradually improve in quality
as they reach higher and higher normal forms.
• Supplementary Literature:
An Introduction to Database Systems, C.J. Date, Addison-Wesley, Eighth Edition,
2004, Chapters 11, 12 & 13.


Demo Exercises
Demo Exercises clarify ideas and concepts from the Lecture
to provide you with good Database Skills.

Discuss and do the Demo Exercises.


Functional Dependencies
7.1.1 Number of Functional Dependencies
How many functional dependencies can be
proposed for the relation schema R(A, B, C)?

7.1.2 Candidate Keys with Armstrong’s Rules
Use Armstrong’s rules to make a list of
Candidate Keys for the following relation
schema R(A, B, C, D) with the following
functional dependencies:
B → AD, D → B, A → C. Please specify which of
Armstrong’s rules are used in each step.
Finally select a Primary Key.


First Normal Form
7.1.3 Violation of First Normal Form 1NF
What is the problem, and how can the table
below be normalized?

employee_no   Phone_numbers
67810101      28801633, 28801724
67810210      28801325, 28805647, 28801518
67810324      28801713


Third Normal Form
7.1.4 Violation of Third Normal Form 3NF
What is the problem, and how can the table
below be normalized?



BCNF
7.1.5 Violation of BCNF
Consider the table with relation schema
R(Pizza, Topping, ToppingType)
and functional dependencies
Pizza, ToppingType → Topping
Topping → ToppingType

Normalize the relation schema and table to BCNF.


Solutions to Demo Exercises



Functional Dependencies
7.1.1 Number of Functional Dependencies
How many functional dependencies can be
proposed for the relation schema R(A, B, C)?

7.1.1 Solution: Of a set of 3 attributes, 2^3 = 8 subsets can be
made (i.e. {}, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, and {A, B, C}).
Each subset can be a Determinant or Dependent,
thus 8 x 8 = 64 functional dependencies can be
proposed, like {A, B} → {C}, {A} → {B}.

7.1.2 Candidate Keys with Armstrong’s Rules
Use Armstrong’s rules to make a list of
Candidate Keys for the following relation
schema R(A, B, C, D) with the following
functional dependencies:
B → AD, D → B, A → C. Please specify which of
Armstrong’s rules are used in each step.
Finally select a Primary Key.

7.1.2 Solution: By using Armstrong’s rules:
1. B → B by rule 4 Self-determination
2. B → A by rule 5 Decomposition of B → AD
3. B → D by rule 5 Decomposition of B → AD
4. B → C given B → A and A → C by rule 3 Transitivity
5. B → ABCD by rule 6 Union
6. D → ABCD given D → B and B → ABCD by rule 3 Transitivity
Candidate Keys are B and D!
I select B to be the Primary Key, but I could have
selected D instead.


First Normal Form
7.1.3 Violation of First Normal Form 1NF
What is the problem, and how can the table
below be normalized?

employee_no   Phone_numbers
67810101      28801633, 28801724
67810210      28801325, 28805647, 28801518
67810324      28801713

7.1.3 Solution: It looks like the table has employee_no as
primary key, but Phone_numbers holds several values per row,
so the table is not in 1NF.
The normalization could be a table with one row per
(employee_no, Phone_no) pair, with a primary key
• (employee_no, Phone_no), if one phone number
can be shared between several employees.
• Phone_no, if a phone number can belong only to
one employee.
After normalization, there is no limit to the number of
phone numbers that an employee can have, and a
search of an employee’s phone numbers is simple.
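
The normalization of this table can be sketched in a few lines of Python. This is an illustration and not part of the slides: the data is the table from the exercise, while the function name to_1nf and the result name employee_phones are assumptions.

```python
# The table from exercise 7.1.3: Phone_numbers is a comma-separated list.
employees = [
    ("67810101", "28801633, 28801724"),
    ("67810210", "28801325, 28805647, 28801518"),
    ("67810324", "28801713"),
]


def to_1nf(rows):
    """Emit one (employee_no, phone_no) row per phone number."""
    return [
        (employee_no, phone.strip())
        for employee_no, phones in rows
        for phone in phones.split(",")
    ]


employee_phones = to_1nf(employees)
for row in employee_phones:
    print(row)

# Looking up one employee's numbers is now a simple filter on atomic values.
print([phone for emp, phone in employee_phones if emp == "67810210"])
```

The result has six rows, one per phone number, and every attribute value is atomic.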



Third Normal Form
7.1.4 Violation of Third Normal Form 3NF
What is the problem, and how can the table
below be normalized?

7.1.4 Solution: Inspecting the tables and using domain knowledge we
identify the following cover set of functional dependencies:
Driver_license → Full_name
License_plate → Driver_license
License_plate → Car_manufacturing
Using Armstrong’s transitivity rule we also have
License_plate → Full_name
As {License_plate} functionally determines all attributes and is
minimal, License_plate can be chosen as primary key. The
table is in 2NF as all attributes depend fully on the key, but
not in 3NF as Full_name transitively depends on the key via
Driver_license. The problem with this is that the information
that ‘Finn Jensen’ has driver license ‘31237248’ is recorded
twice (i.e. redundantly), and an update of Driver_license for
‘Finn Jensen’ might lead to inconsistency. Normalization gives
two 3NF tables. The first has Driver_license and the
second has License_plate as primary key. A natural join of the
two (join over attribute Driver_license) will bring back the
original table.


BCNF
7.1.5 Violation of BCNF
Consider the table with relation schema
R(Pizza, Topping, ToppingType)
and functional dependencies
Pizza, ToppingType → Topping
Topping → ToppingType
Normalize the relation schema and table to BCNF.

7.1.5 Solution: Normalization to BCNF gives tables with
schemas R1(Pizza, Topping) and R2(Topping, ToppingType).

R2:
Topping      ToppingType
mozzarella   cheese
pepperoni    meat
olives       vegetable
sausage      meat
peppers      vegetable


Exercises
Please answer all exercises
to demonstrate your
Database Skills.

Pencil and Paper Exercises
Solutions are available at 11:45


Normalization
7.2.1 Violation of First Normal Form 1NF
Consider the relation schema:
Employees(EmpNo, Jobs)
where Jobs is a multivalued attribute containing one or more
jobs.
Make changes needed to ensure 1NF.

7.2.2 Violation of Second Normal Form 2NF
Consider the relation schema:
Clients(ClientNo, SalesRepNo, CName, SName, Date)
where ClientNo and CName are the number and name
of a client, SalesRepNo and SName are the number
and name of a sales representative, and Date is the
date when the sales representative was assigned to
the client.
Explain why Clients is not in 2NF. Hint: First decide
the minimal, non-trivial functional dependencies.
Normalize Clients to 2NF.

7.2.3 Violation of Third Normal Form 3NF
Consider the relation schema:
Winners(Tournament, Year, Winner, Birthday)
where each row in the table tells who was the
Winner of a particular Tournament in a particular
Year. The Birthday of the Winner is also given.
Explain why Winners is not in 3NF and why that is a
problem. Is Winners in 2NF? Normalize Winners to
3NF.

7.2.4 Violation of Fourth Normal Form 4NF
We are told that in a company, an employee can
work on many projects and be employed in many
departments. Departments as well as projects can
have many employees.
Consider the relation schema:
Company(EmpNo, DepNo, ProjectNo)
with the dependency:
EmpNo →→ ProjectNo
Explain why Company is not in 4NF.
Normalize Company to 4NF.

7.2.5 Functional Dependencies
How many functional dependencies can be proposed
for the schema: R(A, B, C, D, E)?

7.2.6 Candidate keys with Armstrong’s rules
Use Armstrong’s rules to propose a primary key for
the following Relation Schema:
R(A, B, C, D, E, F, G, H)
with the following set of functional dependencies:
{A → BC, E → FG, AB → D, EG → H}
Please specify which of Armstrong’s rules are used in
each step.
