
Logical Database Design: Unit 7

The document discusses logical database design and normalization. It introduces key concepts like logical vs physical database design, normalization, functional dependencies, and normal forms. The goal of normalization is to reduce data redundancy and update anomalies by organizing data into tables and relations according to dependencies between attributes. Normalization results in relations in first normal form (1NF) up to fifth normal form (5NF).

Uploaded by

lekha

Unit 7

Logical Database Design


Contents
• 7.1 Introduction
• 7.2 Functional Dependency
• 7.3 First, Second, and Third Normal Forms (1NF, 2NF, 3NF)
• 7.4 Boyce/Codd Normal Form (BCNF)
• 7.5 Fourth Normal Form (4NF)
• 7.6 Fifth Normal Form (5NF)
• 7.7 The Entity/Relationship Model

Wei-Pang Yang, Information Management, NDHU 7-2


7.1 Introduction
• Logical Database Design vs. Physical Database Design
• Problem of Normalization
• Normal Forms

Logical Database Design
• Logical database design vs. physical database design
• Logical database design covers:
  - Normalization
  - Semantic modeling, e.g. the E-R model
• Problem of normalization
  - Given some body of data to be represented in a database, how do we decide on a suitable logical structure for it?
  - What relations should exist?
  - What attributes should they have?


Problem of Normalization
<e.g.> Data to be represented:
  S1, Smith, 20, London, P1, Nut, Red, 12, London, 300
  S1, Smith, 20, London, P2, Bolt, Green, 17, Paris, 200
  ...
  S4, Clark, 20, London, P5, Cam, Blue, 12, Paris, 400

Normalization gives S, P, SP:
  S  (S#, SNAME, STATUS, CITY)
  P  (P#, ...)
  SP (S#, P#, QTY)

or S', P, SP':
  S'  (S#, SNAME, STATUS)
  P   (P#, ...)
  SP' (S#, CITY, P#, QTY)

      S#   CITY     P#   QTY
      S1   London   P1   300
      S1   London   P2   200
      ...  ...      ...  ...

  In SP' the city of S1 is repeated in every tuple:
  Redundancy ⇒ Update Anomalies!

Normal Forms
• A relation is said to be in a particular normal form if it satisfies a certain set of constraints.
  <e.g.> 1NF: A relation is in First Normal Form (1NF) iff it contains only atomic values.

Fig. 7.1: Normal Forms (each class is contained in the one listed above it)
  universe of relations (normalized and un-normalized)
    1NF relations (normalized relations)
      2NF relations
        3NF relations
          BCNF relations
            4NF relations
              5NF relations

7.2 Functional Dependency
• Functional Dependency (FD)
• Fully Functional Dependency (FFD)

Functional Dependency
• Functional Dependency
  - Def: Given a relation R, R.Y is functionally dependent on R.X iff each X-value has associated with it precisely one Y-value (at any time).
  - Note: X and Y may be composite attributes.
• Notation: R.X → R.Y
  read as "R.X functionally determines R.Y"
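The definition can be tried out on a concrete table. The following is a minimal sketch, assuming a relation is stored as a list of dictionaries; the helper name `fd_holds` is hypothetical and not part of the slides. Since an FD is a statement about all possible states of a relation, a check on one table can refute an FD but can never prove it.

```python
def fd_holds(rows, x, y):
    """Return True if no two rows agree on x but differ on y."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        # setdefault stores y_val the first time x_val is seen,
        # and returns the stored value on every later occurrence
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# Supplier relation S as used in this unit
COLUMNS = ("S#", "SNAME", "STATUS", "CITY")
S = [dict(zip(COLUMNS, values)) for values in [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]]

print(fd_holds(S, ["S#"], ["CITY"]))      # True
print(fd_holds(S, ["STATUS"], ["CITY"]))  # False: status 30 occurs with Paris and Athens
```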


Functional Dependency (cont.)
<e.g.1> Relation S

  S#   SNAME   STATUS   CITY
  S1   Smith   20       London
  S2   Jones   10       Paris
  S3   Blake   30       Paris
  S4   Clark   20       London
  S5   Adams   30       Athens

  S.S# → S.SNAME
  S.S# → S.STATUS
  S.S# → S.CITY
  S.STATUS ↛ S.CITY

  Note: Assume STATUS is some factor of the supplier and has no relationship with CITY.

  FD diagram: S# → SNAME, S# → STATUS, S# → CITY


Functional Dependency (cont.)
<e.g.2> Relation P

  P#   PNAME   COLOR   WEIGHT   CITY
  P1   Nut     Red     12       London
  P2   Bolt    Green   17       Paris
  P3   Screw   Blue    17       Rome
  P4   Screw   Red     14       London
  P5   Cam     Blue    12       Paris
  P6   Cog     Red     19       London

  FD diagram: P# → PNAME, P# → COLOR, P# → WEIGHT, P# → CITY

<e.g.3> Relation SP

  S#   P#   QTY
  S1   P1   300
  S1   P2   200
  S1   P3   400
  S1   P4   200
  S1   P5   100
  S1   P6   100
  S2   P1   300
  S2   P2   400
  S3   P2   200
  S4   P2   200
  S4   P4   300
  S4   P5   400

  FD diagram: (S#, P#) → QTY

• If X is a candidate key of R, then all attributes Y of R are functionally dependent on X (i.e. X → Y).


Fully Functional Dependency (FFD)
• Def: Y is fully functionally dependent on X iff
  - (1) Y is FD on X, and
  - (2) Y is not FD on any proper subset of X.

<e.g.> SP' (S#, CITY, P#, QTY)

  S#   CITY     P#   QTY
  S1   London   P1   300
  S1   London   P2   200
  ...  ...      ...  ...

  (S#, P#) → QTY     FFD
  (S#, P#) → CITY    FD, but not FFD, because S# → CITY is already an FD

  Side table on the slide:
  S#   city1    city2
  S1   London   Taipei
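The two conditions of the FFD definition translate directly into a check over the proper subsets of X. This is a sketch only: `fd_holds` and `fully_dependent` are hypothetical helper names, only non-empty subsets are tried, and the two S2 rows are taken from the FIRST table of this unit because the slide shows only the S1 rows of SP'.

```python
from itertools import combinations


def fd_holds(rows, x, y):
    """Return True if no two rows agree on x but differ on y."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def fully_dependent(rows, x, y):
    """Condition (1): x -> y holds. Condition (2): no proper subset of x determines y."""
    if not fd_holds(rows, x, y):
        return False
    for size in range(1, len(x)):
        for subset in combinations(x, size):
            if fd_holds(rows, subset, y):
                return False
    return True


COLUMNS = ("S#", "CITY", "P#", "QTY")
SP_PRIME = [dict(zip(COLUMNS, values)) for values in [
    ("S1", "London", "P1", 300),
    ("S1", "London", "P2", 200),
    ("S2", "Paris", "P1", 300),   # assumption: rows for S2 come from the FIRST table
    ("S2", "Paris", "P2", 400),
]]

print(fully_dependent(SP_PRIME, ["S#", "P#"], ["QTY"]))   # True
print(fully_dependent(SP_PRIME, ["S#", "P#"], ["CITY"]))  # False: S# alone determines CITY
```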


Fully Functional Dependency (cont.)
<Note>
1. Normally, we take FD to mean FFD.
2. FD is a semantic notion.
   <e.g.> S# → CITY (Ref P.10-9)
   Means: each supplier is located in precisely one city.
3. FD is a special kind of integrity constraint.
   CREATE INTEGRITY RULE SCFD
   CHECK FORALL SX FORALL SY
   (IF SX.S# = SY.S# THEN SX.CITY = SY.CITY);
4. The FDs considered here apply within a single relation.
   <e.g.> SP.S# → S.S# is not considered!


7.3 First, Second, and Third Normal Forms (1NF, 2NF, 3NF)

Normal Forms: 1NF
• Def: A relation is in 1NF iff all underlying simple domains contain atomic values only.

Un-normalized relation:

  S#   STATUS   CITY     (P#, QTY)
  S1   20       London   {(P1, 300), (P2, 200), ..., (P6, 100)}
  S2   10       Paris    {(P1, 300), (P2, 400)}
  S3   10       Paris    {(P2, 200)}
  S4   20       London   {(P2, 200), (P4, 300), (P5, 400)}

Normalized relation FIRST (in 1NF), key: (S#, P#):

  S#   STATUS   CITY     P#   QTY
  S1   20       London   P1   300
  S1   20       London   P2   200
  S1   20       London   P3   400
  S1   20       London   P4   200
  S1   20       London   P5   100
  S1   20       London   P6   100
  S2   10       Paris    P1   300
  S2   10       Paris    P2   400
  S3   10       Paris    P2   200
  S4   20       London   P2   200
  S4   20       London   P4   300
  S4   20       London   P5   400

Suppose 1. CITY is the main office of the supplier.
        2. STATUS is some factor of CITY.
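Going from the un-normalized relation to FIRST amounts to unnesting the set-valued (P#, QTY) column. A minimal sketch, assuming the nested column is held as a Python list of pairs and using the quantities listed in FIRST for the parts the slide elides with "..."; `to_1nf` is a hypothetical name.

```python
# Un-normalized relation: the last component is a repeating group of (P#, QTY) pairs
unnormalized = [
    ("S1", 20, "London", [("P1", 300), ("P2", 200), ("P3", 400),
                          ("P4", 200), ("P5", 100), ("P6", 100)]),
    ("S2", 10, "Paris", [("P1", 300), ("P2", 400)]),
    ("S3", 10, "Paris", [("P2", 200)]),
    ("S4", 20, "London", [("P2", 200), ("P4", 300), ("P5", 400)]),
]


def to_1nf(rows):
    """Emit one tuple with atomic values per element of the repeating group."""
    return [
        (s, status, city, p, qty)
        for s, status, city, parts in rows
        for p, qty in parts
    ]


FIRST = to_1nf(unnormalized)
print(len(FIRST))  # 12
print(FIRST[0])    # ('S1', 20, 'London', 'P1', 300)
```

Every tuple of FIRST repeats STATUS and CITY for its supplier, which is the redundancy behind the update anomalies of 1NF relations.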


1NF Problem: Update Anomalies!
The relation FIRST (S#, STATUS, CITY, P#, QTY), key (S#, P#), suffers from three kinds of anomalies:

<1> Update
  If supplier S1 moves from London to Paris, then 6 tuples must be updated!
<2> Insertion
  We cannot insert information about a supplier that does not supply any part, because that would cause a null key value:

    S#   STATUS   CITY     P#     QTY
    ...  ...      ...      ...    ...
    S3   10       Paris    P2     200
    ...  ...      ...      ...    ...
    S5   30       Athens   NULL   NULL

<3> Deletion
  If we delete the information that "S3 supplies P2", then the fact "S3 is located in Paris" is also deleted.


1NF Problem: Update Anomalies! (cont.)
<e.g.> Suppose 1. CITY is the main office of the supplier.
               2. STATUS is some factor of CITY (ref. p.7-9)

Primary key of FIRST: (S#, P#)

FDs in FIRST:
  1. S# → STATUS
  2. S# → CITY
  3. CITY → STATUS
  4. (S#, P#) → QTY

FD diagram of FIRST, relative to the primary key:
  (S#, P#) → QTY       FFD
  (S#, P#) → STATUS    FD, but not FFD (S# → STATUS)
  (S#, P#) → CITY      FD, but not FFD (S# → CITY)


Normal Form: 2NF
• Def: A relation R is in 2NF iff
  (1) R is in 1NF (i.e. atomic), and
  (2) every non-key attribute is FFD on the primary key (the non-key attributes of FIRST are QTY, STATUS, CITY).

<e.g.> FIRST is in 1NF, but not in 2NF, because
  (S#, P#) → STATUS and
  (S#, P#) → CITY
  are not FFD.

Decompose FIRST into:
<1> SECOND (S#, STATUS, CITY)
    primary key: S#
    FD: 1. S# → STATUS
        2. S# → CITY
        3. CITY → STATUS
<2> SP (S#, P#, QTY)
    primary key: (S#, P#)
    FD: 4. (S#, P#) → QTY


Normal Form: 2NF (cont.)
FIRST (S#, STATUS, CITY, P#, QTY) is replaced by SECOND and SP:

  SECOND (in 2NF)              SP (in 2NF)
  S#   STATUS   CITY           S#   P#   QTY
  S1   20       London         S1   P1   300
  S2   10       Paris          S1   P2   200
  S3   10       Paris          S1   P3   400
  S4   20       London         S1   P4   200
  S5   30       Athens         S1   P5   100
                               S1   P6   100
                               S2   P1   300
                               S2   P2   400
                               S3   P2   200
                               S4   P2   200
                               S4   P4   300
                               S4   P5   400

The anomalies of FIRST, revisited:
<1> Update: S1 moves from London to Paris. Only one tuple of SECOND has to be updated.
<2> Insertion: (S5, 30, Athens) can be inserted into SECOND.
<3> Deletion: deleting "S3 supplies P2 (QTY 200)" from SP leaves the fact "S3 is located in Paris" in SECOND.


Normal Form: 2NF (cont.)
• A relation in 1NF can always be reduced to an equivalent collection of 2NF relations.
• The reduction process from 1NF to 2NF is a non-loss decomposition.

    FIRST (S#, STATUS, CITY, P#, QTY)
      |  projections             ^  natural joins
      v                          |
    SECOND (S#, STATUS, CITY),   SP (S#, P#, QTY)

• The collection of 2NF relations may contain "more" information than the equivalent 1NF relation.
  <e.g.> (S5, 30, Athens)
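The non-loss property can be checked on the sample data: project FIRST onto SECOND and SP, join the projections again, and compare the result with FIRST. The sketch below assumes relations are lists of dictionaries; `project`, `natural_join` and `same_relation` are hypothetical helpers, not a library API.

```python
def project(rows, attrs):
    """Projection: keep only attrs and drop duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]


def natural_join(left, right):
    """Natural join on the attributes that both relations share."""
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


def same_relation(a, b):
    """Compare two relations as sets of tuples, ignoring row order."""
    def canon(rows):
        return {tuple(sorted(row.items())) for row in rows}
    return canon(a) == canon(b)


SUPPLIERS = {"S1": (20, "London"), "S2": (10, "Paris"),
             "S3": (10, "Paris"), "S4": (20, "London")}
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]
FIRST = [{"S#": s, "STATUS": SUPPLIERS[s][0], "CITY": SUPPLIERS[s][1],
          "P#": p, "QTY": q} for s, p, q in SHIPMENTS]

SECOND = project(FIRST, ["S#", "STATUS", "CITY"])
SP = project(FIRST, ["S#", "P#", "QTY"])
print(same_relation(natural_join(SECOND, SP), FIRST))  # True: nothing lost, nothing spurious

# SECOND can also hold a supplier that ships nothing, which FIRST cannot represent
SECOND.append({"S#": "S5", "STATUS": 30, "CITY": "Athens"})
print(same_relation(natural_join(SECOND, SP), FIRST))  # still True
```

The second check illustrates the "more information" remark: the tuple for S5 lives in SECOND without disturbing the join.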


Problem: Update Anomalies in SECOND!
• Update anomalies in SECOND
  <1> UPDATE: if the status of London is changed from 20 to 60, then two tuples must be updated.
  <2> DELETE: if supplier S5 is deleted, then the fact "the status of Athens is 30" is also deleted!
  <3> INSERT: we cannot insert the fact "the status of Rome is 50"!

  SECOND (in 2NF)
  S#   STATUS   CITY
  S1   20       London
  S2   10       Paris
  S3   10       Paris
  S4   20       London
  S5   30       Athens

• Why:
    1. S.S# → S.STATUS
    2. S.S# → S.CITY
    3. S.CITY → S.STATUS    causes a transitive dependency

  FD diagram: S# → CITY → STATUS, and S# → STATUS


Normal Forms: 3NF
• Def: A relation R is in 3NF iff
  (1) R is in 2NF, and
  (2) every non-key attribute is non-transitively dependent on the primary key
      (i.e. the non-key attributes are mutually independent).
  e.g. in SECOND, STATUS is transitively dependent on S# (S# → CITY → STATUS).
<e.g.> SP is in 3NF, but SECOND is not!

  SP FD diagram:       (S#, P#) → QTY
  SECOND FD diagram:   S# → STATUS, S# → CITY, CITY → STATUS

  SECOND (not 3NF)
  S#   STATUS   CITY
  S1   20       London
  S2   10       Paris
  S3   10       Paris
  S4   20       London
  S5   30       Athens


Normal Forms: 3NF (cont.)
• Decompose SECOND (S#, STATUS, CITY) into:
  <1> SC (S#, CITY)
      primary key: S#
      FD diagram: S# → CITY
  <2> CS (CITY, STATUS)
      primary key: CITY
      FD diagram: CITY → STATUS

  SC (in 3NF)        CS (in 3NF)
  S#   CITY          CITY     STATUS
  S1   London        Athens   30
  S2   Paris         London   20
  S3   Paris         Paris    10
  S4   London        Rome     50
  S5   Athens


Normal Forms: 3NF (cont.)
• Note:
  (1) Any 2NF relation can always be reduced to a collection of 3NF relations.
  (2) The reduction process from 2NF to 3NF is a non-loss decomposition.
  (3) The collection of 3NF relations may contain "more information" than the equivalent 2NF relation.


Good and Bad Decomposition
• Consider the transitive FD in SECOND: S# → CITY, CITY → STATUS, and S# → STATUS.
  Suppose 1. CITY is the main office of the supplier.
          2. STATUS is some factor of CITY (ref. p.7-9)

① Decomposition A: SC (S#, CITY) and CS (CITY, STATUS)
   Good! (Rome, 50) can be inserted.
② Decomposition B: SC (S#, CITY) and SS (S#, STATUS)
   Bad! (Rome, 50) cannot be inserted unless there is a supplier located in Rome.
③ Decomposition C: S# → STATUS and CITY → STATUS, i.e. SS (S#, STATUS) and CS (CITY, STATUS)

'Good' or 'Bad' depends on the semantic meaning of the items!!


Good and Bad Decomposition (cont.)
• Independent Projection (by Rissanen '77, ref[10.6])
  Def: Projections R1 and R2 of R are independent iff
  (1) any FD in R can be deduced from those in R1 and R2, and
  (2) the common attribute of R1 and R2 forms a candidate key for at least one of R1 and R2.
<e.g.>
  - Decomposition A: SECOND (S#, STATUS, CITY) (R) is projected onto SC (S#, CITY) (R1) and CS (CITY, STATUS) (R2).
    (1) FDs in SECOND (R): S# → CITY, CITY → STATUS, S# → STATUS
        FDs in SC and CS: R1: S# → CITY; R2: CITY → STATUS; together they imply S# → STATUS.
    (2) The common attribute of SC and CS is CITY, which is the primary key of CS.
    ⇒ SC and CS are independent ⇒ Decomposition A is good!


Good and Bad Decomposition (cont.)
  - Decomposition B: SECOND (S#, STATUS, CITY) is projected onto SC (S#, CITY) and SS (S#, STATUS).
    (1) FDs in SC and SS: S# → CITY, S# → STATUS.
        CITY → STATUS cannot be deduced from them.
    ⇒ SC and SS are dependent ⇒ Decomposition B is bad!
      (even though the common attribute S# is the primary key of both SC and SS)

  - Decomposition C: SECOND (S#, STATUS, CITY) is projected onto SS (S#, STATUS) and CS (CITY, STATUS).
    (1) FDs in SS and CS: S# → STATUS, CITY → STATUS.
        S# → CITY cannot be deduced from them.
    (2) The common attribute STATUS is not a candidate key of either SS or CS.
    ⇒ Decomposition C is not only a bad decomposition, but also an invalid one, since it is not non-loss.
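Condition (1) of Rissanen's definition can be tested mechanically with an attribute-closure computation. The sketch below covers condition (1) only and assumes FDs are written as (left side, right side) pairs of attribute tuples; the helper names `closure`, `implied` and `preserves` are hypothetical.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implied(fd, fds):
    """True if fd can be deduced from fds."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)


# FDs of SECOND (S#, STATUS, CITY)
SECOND_FDS = [(("S#",), ("CITY",)), (("CITY",), ("STATUS",)), (("S#",), ("STATUS",))]

# FDs that can be stated inside the projections of each decomposition
A = [(("S#",), ("CITY",)), (("CITY",), ("STATUS",))]    # SC and CS
B = [(("S#",), ("CITY",)), (("S#",), ("STATUS",))]      # SC and SS
C = [(("S#",), ("STATUS",)), (("CITY",), ("STATUS",))]  # SS and CS


def preserves(local_fds):
    """Condition (1): every FD of SECOND follows from the FDs of the projections."""
    return all(implied(fd, local_fds) for fd in SECOND_FDS)


print(preserves(A), preserves(B), preserves(C))  # True False False
```

Decomposition B loses CITY → STATUS and decomposition C loses S# → CITY, matching the verdicts on the slides.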


Atomic Relation
• Def: Atomic relation: a relation that cannot be decomposed into independent components.
  <e.g.> SP (S#, P#, QTY) is an atomic relation, while S (S#, SNAME, STATUS, CITY) is not.


7.4 Boyce/Codd Normal Form (BCNF)
Problems of 3NF:
3NF does not deal with the cases where
<1> a relation has multiple candidate keys,
<2> those candidate keys are composite,
<3> the candidate keys overlap.

Example
SJT (S, J, T)
  - S: student
  - J: subject
  - T: teacher

  S       J         T
  Smith   Math.     Prof. White
  Smith   Physics   Prof. Green
  Jones   Math.     Prof. White
  Jones   Physics   Prof. Brown

  - Meaning of a tuple: student S is taught subject J by teacher T.
  - Suppose
    - For each subject, each student of that subject is taught by only one teacher,
      i.e. (S, J) → T
    - Each teacher teaches only one subject,
      i.e. T → J
  - Candidate keys: (S, J) and (S, T)
  - FD diagram: (S, J) → T, T → J


BCNF
• Def: A relation R is in BCNF iff every determinant is a candidate key.
  - A → B: A determines B, and A is a determinant.

  - <e.g.1> [only one candidate key]
    SP (S#, P#, QTY): in 3NF & BCNF
      (S#, P#) → QTY
    SECOND (S#, STATUS, CITY): not in 3NF & not in BCNF
      S# → STATUS, S# → CITY, CITY → STATUS
    SC (S#, CITY): in 3NF & BCNF
      S# → CITY
    CS (CITY, STATUS): in 3NF & BCNF
      CITY → STATUS

  Figure: nested normal forms 1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF.
  Relations in 3NF but not in BCNF: e.g.3, e.g.4 (P.7-33)
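The BCNF condition can be sketched as a test over a list of FDs: each non-trivial FD must have a determinant whose closure is the whole relation. This is an illustration under assumptions, not the slides' own procedure: FDs are (left side, right side) pairs, the determinants are taken to be irreducible (FFD), and `closure` and `bcnf_violations` are hypothetical names.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(attributes)]


# Relations of e.g.1
SP = (["S#", "P#", "QTY"], [(("S#", "P#"), ("QTY",))])
SECOND = (["S#", "STATUS", "CITY"],
          [(("S#",), ("STATUS",)), (("S#",), ("CITY",)), (("CITY",), ("STATUS",))])
SC = (["S#", "CITY"], [(("S#",), ("CITY",))])
CS = (["CITY", "STATUS"], [(("CITY",), ("STATUS",))])

print(bcnf_violations(*SP))      # []
print(bcnf_violations(*SECOND))  # [(('CITY',), ('STATUS',))]
print(bcnf_violations(*SC))      # []
print(bcnf_violations(*CS))      # []
```

In SECOND the determinant CITY is not a candidate key, which is the same dependency that produced the update anomalies discussed for 3NF.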


BCNF (cont.)
• <e.g.2> [two disjoint (nonoverlapping) candidate keys]
  S (S#, SNAME, STATUS, CITY)

  Assume:
  (1) CITY and STATUS are independent
  (2) SNAME is a candidate key

  FD diagram: S# ↔ SNAME, S# → STATUS, S# → CITY, SNAME → STATUS, SNAME → CITY

  S# and SNAME (the determinants) are candidate keys.
  ⇒ S is in BCNF (also in 3NF).

  Figure: BCNF is contained in 3NF; e.g.2 lies inside BCNF, not in the "3NF but not BCNF" region.


BCNF (cont.)
<e.g.3> [overlapping candidate keys -1]
  SSP (S#, SNAME, P#, QTY)
  Keys in SSP: (S#, P#), (SNAME, P#)
  FDs in SSP:
    1. S# → SNAME
    2. SNAME → S#
    3. {S#, P#} → QTY
    4. {SNAME, P#} → QTY

  - in 3NF: every nonkey attribute is FFD on the primary key and the nonkey attributes are mutually independent (the only nonkey attribute is QTY).
  - not in BCNF: S# is a determinant (S# → SNAME) but not a candidate key.

  Decompose:
    SS (S#, SNAME): in BCNF      FD diagram: S# ↔ SNAME
    SP (S#, P#, QTY): in BCNF    FD diagram: (S#, P#) → QTY


BCNF (cont.)
<e.g.4> [overlapping candidate keys-2]
SJT (S, J, T)
  - S: student
  - J: subject
  - T: teacher

  S       J         T
  Smith   Math.     Prof. White
  Smith   Physics   Prof. Green
  Jones   Math.     Prof. White
  Jones   Physics   Prof. Brown

  - Meaning of a tuple: student S is taught subject J by teacher T.
  - Suppose
    - For each subject, each student of that subject is taught by only one teacher,
      i.e. (S, J) → T
    - Each teacher teaches only one subject,
      i.e. T → J
  - Candidate keys: (S, J) and (S, T)
  - FD diagram: (S, J) → T, T → J


BCNF (cont.)
SJT (S, J, T):
  - In 3NF: there is no nonkey attribute.
  - Not in BCNF: T → J holds, but T is not a candidate key.
  - Update anomalies occur!
    e.g. if "Jones is studying Physics" is deleted, the fact "Brown teaches Physics" is also deleted!

Decompose SJT into:

  ST (S, T): in BCNF           TJ (T, J): in BCNF
  S       T                    T             J
  Smith   Prof. White          Prof. White   Math.
  Smith   Prof. Green          Prof. Green   Physics
  Jones   Prof. White          Prof. Brown   Physics
  Jones   Prof. Brown

  - Is this decomposition Good or Bad?
  - In Rissanen's sense, ST (S, T) and TJ (T, J) are not independent!
    The FD (S, J) → T cannot be deduced from the FD T → J.
  - The two objectives
    <1> decomposing a relation into BCNF, and
    <2> decomposing it into independent components
    may be in conflict!


BCNF (cont.)
<e.g.5> [overlapping candidate keys-3]
  EXAM (S, J, P); S: student, J: subject, P: position.
  - Meaning of a tuple: student S was examined in subject J and achieved position P in the class.
  - Suppose no two students obtained the same position in the same subject,
    i.e. (S, J) → P and (J, P) → S
  - FD diagram: (S, J) → P, (J, P) → S

  EXAM
  S   J         P
  A   DBMS      5
  B   DBMS      8
  A   Network   1

  - Candidate keys: (S, J) and (J, P); the overlapping attribute is J.
  - In BCNF!


Why Normal Form?
• Avoid update anomalies.
• Consider SSP (S#, SNAME, P#, QTY):
  common sense tells us that SS (S#, SNAME) & SP (S#, P#, QTY) is a better design.
• The concepts of FD, 1NF, 2NF, 3NF and BCNF formalize this common sense.
• Mechanization is possible!
  - i.e., we can write a program to do the work of normalization for us!
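As a small taste of such mechanization, the candidate keys of SSP can be derived from its FDs by brute force: try attribute sets in order of size and keep those whose closure covers the relation and that contain no smaller key. This is a sketch for small relations only, with hypothetical helper names `closure` and `candidate_keys`; it is not a full normalization program.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets that determine every attribute (exponential search)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # combo contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


SSP = ["S#", "SNAME", "P#", "QTY"]
SSP_FDS = [
    (("S#",), ("SNAME",)),
    (("SNAME",), ("S#",)),
    (("S#", "P#"), ("QTY",)),
    (("SNAME", "P#"), ("QTY",)),
]

print(candidate_keys(SSP, SSP_FDS))  # [('S#', 'P#'), ('SNAME', 'P#')]
```

The determinant S# of S# → SNAME is not among the keys found, which is why SSP is not in BCNF and is split into SS and SP.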


7.5 Fourth Normal Form (4NF)

Un-Normalized Relation
CTX

  COURSE    TEACHER                      TEXT
  Physics   {Prof. Green, Prof. Brown}   {Basic Mechanics, Principles of Optics}
  Math.     {Prof. Green}                {Basic Mechanics, Vector Analysis, Trigonometry}

• Meaning of a record: the specified course can be taught by any of the specified teachers and uses all of the specified texts as references.
• Assume:
  - For a given course, there exists any number of teachers and any number of texts.
  - Teachers and texts are independent.
  - A given teacher or a given text can be associated with any number of courses.


Un-normalized Relation (cont.)
• Note: No FD exists in this relation!

Normalized CTX:

  COURSE        TEACHER            TEXT
  Physics (c)   Prof. Green (t1)   Basic Mechanics (x1)
  Physics (c)   Prof. Green (t1)   Principles of Optics (x2)
  Physics (c)   Prof. Brown (t2)   Basic Mechanics (x1)
  Physics (c)   Prof. Brown (t2)   Principles of Optics (x2)
  Math          Prof. Green        Basic Mechanics
  Math          Prof. Green        Vector Analysis
  Math          Prof. Green        Trigonometry


Un-normalized Relation (cont.)
• Meaning of a tuple of the normalized CTX: course C can be taught by teacher T and uses text X as a reference.
  - primary key: (COURSE, TEACHER, TEXT)
• Check:
  - in 1NF (simple domains contain atomic values only)
  - in 2NF (nonkey attributes are FFD on the primary key; there are no nonkey attributes)
  - in 3NF (nonkey attributes are mutually independent)
  - in BCNF (every determinant is a candidate key)


Un-normalized Relation (cont.)
• Problem: a good deal of redundancy in CTX (COURSE, TEACHER, TEXT)!
  - property: if (c, t1, x1) and (c, t2, x2) both appear,
    then (c, t1, x2) and (c, t2, x1) both appear also!
  - reason: no FD, but there is an MVD!

Intuitively, CTX is decomposed into:

  CT:                       CX:
  COURSE    TEACHER         COURSE    TEXT
  Physics   Prof. Green     Physics   Basic Mechanics
  Physics   Prof. Brown     Physics   Principles of Optics
  Math      Prof. Green     Math      Basic Mechanics
                            Math      Vector Analysis
                            Math      Trigonometry

  The association of a course (Physics, Math) with its teachers or texts is not an FD!
  - The decomposition cannot be made on the basis of FDs.

MVD (Multi-Valued Dependencies)
• Def: Given R(A, B, C), the multivalued dependency (MVD) R.A →→ R.B holds in R iff the set of B-values matching a given (A-value, C-value) pair in R depends only on the A-value and is independent of the C-value.
  <e.g.> COURSE →→ TEACHER, COURSE →→ TEXT
    A = Physics, B = {Green, Brown}, C = {Basic Mechanics, ...}

• Thm: Given R(A, B, C), the MVD R.A →→ R.B holds iff the MVD R.A →→ R.C also holds.
  - Notation: R.A →→ R.B | R.C
    <e.g.> COURSE →→ TEACHER | TEXT
<Note> 1. FD is a special case of MVD: all FDs are also MVDs.
       2. MVDs (which are not also FDs) can exist only if the relation R has at least 3 attributes.
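For a three-attribute relation, the definition is equivalent to saying that within each A-value the (B, C) pairs form a complete cross product. The sketch below tests that on one relation instance; tuples are addressed by column position, and `mvd_holds` is a hypothetical helper name. As with FDs, an instance check can only refute a dependency.

```python
from collections import defaultdict


def mvd_holds(rows, a, b, c):
    """Check A ->> B on R(A, B, C), where a, b, c are column positions."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[a]].add((row[b], row[c]))
    for pairs in groups.values():
        b_values = {bv for bv, _ in pairs}
        c_values = {cv for _, cv in pairs}
        # every B-value must be paired with every C-value of the same A-value
        if len(pairs) != len(b_values) * len(c_values):
            return False
    return True


CTX = [
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Principles of Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Principles of Optics"),
    ("Math", "Prof. Green", "Basic Mechanics"),
    ("Math", "Prof. Green", "Vector Analysis"),
    ("Math", "Prof. Green", "Trigonometry"),
]

print(mvd_holds(CTX, 0, 1, 2))  # True: COURSE ->> TEACHER
print(mvd_holds(CTX, 0, 2, 1))  # True: COURSE ->> TEXT

# Dropping (Physics, Prof. Brown, Principles of Optics) breaks the cross product
broken = CTX[:3] + CTX[4:]
print(mvd_holds(broken, 0, 1, 2))  # False
```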


Normal Forms: 4NF
• Problem of CTX: it involves MVDs that are not also FDs.
• Def: A relation R is in 4NF iff whenever there exists an MVD in R, say A →→ B, then all attributes of R are also FD on A.
  i.e. R is in 4NF iff (i) R is in BCNF, and (ii) all MVDs in R are in fact FDs (R has no MVD that is not an FD).
  <e.g.1> CTX (COURSE, TEACHER, TEXT)
    COURSE →→ TEACHER
    COURSE →→ TEXT
    ⇒ not in 4NF
  <e.g.2> S (S#, SNAME, STATUS, CITY)
    FD diagram: S# ↔ SNAME, each determining STATUS and CITY
    no MVD which is not an FD
    ⇒ in 4NF


Normal Forms: 4NF (cont.)
• Thm: Relation R(A, B, C) can be non-loss decomposed into R1(A, B) and R2(A, C) iff A →→ B | C holds in R.
  <e.g.> CTX (COURSE, TEACHER, TEXT)
    COURSE →→ TEACHER | TEXT

    CT (COURSE, TEACHER): no MVD ⇒ in 4NF
    CX (COURSE, TEXT): no MVD ⇒ in 4NF


7.6 Fifth Normal Form (5NF)

A Surprise
• There exist relations that cannot be nonloss-decomposed into two projections, but can be decomposed into three or more.
• Def: n-decomposable (for some n > 2):
  the relation can be nonloss-decomposed into n projections, but not into m projections for any m < n.
• <e.g.> SPJ (S#, P#, J#); S: supplier, P: part, J: project.
  - Suppose that in the real world
    if (a) Smith supplies monkey wrenches, and
       (b) monkey wrenches are used in the Manhattan project, and
       (c) Smith supplies the Manhattan project,
    then
       (d) Smith supplies monkey wrenches to the Manhattan project.
    i.e.
    if (s1, p1, j2), (s2, p1, j1), (s1, p2, j1) appear in SPJ,
    then (s1, p1, j1) appears in SPJ also.
  - SPJ has no MVD ⇒ it is in 4NF.


A Surprise (cont.)
• Update problem of SPJ

  SPJ:  S#   P#   J#
        S1   P1   J2
        S1   P2   J1

  - If (S2, P1, J1) is to be inserted, then (S1, P1, J1) must also be inserted:

  SPJ:  S#   P#   J#
        S1   P1   J2
        S1   P2   J1
        S2   P1   J1
        S1   P1   J1

  - If (S1, P1, J1) is to be deleted, then one of the following must also be deleted:
    (i) (S1, P1, J2): means S1 no longer supplies P1.
    (ii) (S1, P2, J1): means S1 no longer supplies J1.
    (iii) (S2, P1, J1): means J1 no longer needs P1.


A Surprise (cont.)
• SPJ is not 2-decomposable, but is 3-decomposable!

  SPJ              SP          PJ          JS
  S#   P#   J#     S#   P#     P#   J#     J#   S#
  S1   P1   J2     S1   P1     P1   J2     J2   S1
  S1   P2   J1     S1   P2     P2   J1     J1   S1
  S2   P1   J1     S2   P1     P1   J1     J1   S2
  S1   P1   J1

  Join of SP and PJ over P#:

  S#   P#   J#
  S1   P1   J2
  S1   P1   J1
  S1   P2   J1
  S2   P1   J2     spurious
  S2   P1   J1

  Joining this result with JS over (J#, S#) removes the spurious tuple and gives back the ORIGINAL SPJ.
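The figure can be reproduced in a few lines. This is a sketch that stores each relation as a Python set of tuples and writes the joins as set comprehensions; the variable names `sp_pj`, `spurious` and `rejoined` are illustrative only.

```python
SPJ = {("S1", "P1", "J2"), ("S1", "P2", "J1"),
       ("S2", "P1", "J1"), ("S1", "P1", "J1")}

# The three binary projections of SPJ
SP = {(s, p) for s, p, j in SPJ}
PJ = {(p, j) for s, p, j in SPJ}
JS = {(j, s) for s, p, j in SPJ}

# Join SP and PJ over P#
sp_pj = {(s, p, j) for s, p in SP for p2, j in PJ if p == p2}
spurious = sp_pj - SPJ
print(spurious)  # {('S2', 'P1', 'J2')}

# Join the intermediate result with JS over (J#, S#)
rejoined = {(s, p, j) for s, p, j in sp_pj if (j, s) in JS}
print(rejoined == SPJ)  # True
```

The first join yields five tuples, one of them spurious. Only the third projection JS filters it out, which is what the join dependency *(SP, PJ, JS) expresses.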


Join Dependency (JD)
• Def: A relation R satisfies the join dependency (JD)
  *(X, Y, ..., Z)
  iff R is equal to the join of its projections on X, Y, ..., Z,
  where X, Y, ..., Z are subsets of the set of attributes of R.
• <e.g.> SPJ satisfies the JD *(SP, PJ, JS), i.e. SPJ is 3-decomposable.
• MVD is a special case of JD.
  Thm: R (A, B, C) can be nonloss-decomposed into R1(A, B) and R2(A, C) iff A →→ B | C holds.
  Thm: R (A, B, C) satisfies the JD *(AB, AC) iff A →→ B | C holds.
• <Note>
  JDs are the most general form of dependency possible, so long as we concentrate on the dependencies that deal with a relation being decomposed via projection and recomposed via join.


Normal Forms: 5NF
• Def: A relation R is in 5NF (or PJ/NF) iff every JD in R is a consequence of the candidate keys of R.
• <e.g.1> Suppose S# and SNAME are candidate keys of S (S#, SNAME, STATUS, CITY).
  <i> *((S#, SNAME, STATUS), (S#, CITY))
      is a consequence of S# (a candidate key of S).
  <ii> *((S#, SNAME), (S#, STATUS), (SNAME, CITY))
      is a consequence of the candidate keys S# and SNAME.


Normal Forms: 5NF (cont.)
• <e.g.2> Consider SPJ (S#, P#, J#); the candidate key of SPJ is (S#, P#, J#).
  However, there exists a JD
    *((S#, P#), (P#, J#), (J#, S#))
  which is not a consequence of (S#, P#, J#)
  ⇒ SPJ is not in 5NF!
  Decomposed:
    SP (S#, P#), PJ (P#, J#), JS (J#, S#)
    (no JD in them ⇒ all in 5NF!)

Note:
1. Discovering all the JDs is a nontrivial operation.
2. The intuitive meaning of a JD may not be obvious.
3. A relation in 4NF but not in 5NF is a pathological case, and likely to be rare in practice.


Concluding Remarks
• The technique of non-loss decomposition is an aid to logical database design.
• The overall process of normalization:
  - step 1: eliminate non-full dependencies.
  - step 2: eliminate any transitive FDs.
  - step 3: eliminate those FDs in which the determinant is not a candidate key.
  - step 4: eliminate any MVDs that are not FDs.
  - step 5: eliminate any JDs that are not a consequence of candidate keys.
• General objective:
  - reduce redundancy, and then
  - avoid certain update anomalies.
• Normalization guidelines are only guidelines.
  - Sometimes there are good reasons for not normalizing all the way.
