Unit 4 - PDF
Example: Constraints on Entity Set
Consider a relation obtained from Hourly_Emps:
  Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
Notation: We will denote this relation schema by listing the attributes: SNLRWH
  This is really the set of attributes {S,N,L,R,W,H}.
  Sometimes, we will refer to all attributes of a relation by using the relation name (e.g., Hourly_Emps for SNLRWH).
Some FDs on Hourly_Emps:
  ssn is the key: S → SNLRWH
  rating determines hrly_wages: R → W

Example (Contd.)
Problems in single “wide” table due to R → W:
  Update anomaly: Can we change W in just the first tuple of SNLRWH?
  Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating?
  Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5.
Are the two smaller tables better?

Hourly_Emps (SNLRWH)
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7
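The example instance can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the helper name fd_holds and the tuple layout (S, N, L, R, W, H) are assumptions made for illustration. It tests whether R → W holds on the instance and whether joining the two smaller tables gives back the wide table.

```python
# Rows of Hourly_Emps as (S, N, L, R, W, H), copied from the example instance.
hourly_emps = {
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
}

def fd_holds(rows, lhs, rhs):
    """Return True if this instance satisfies lhs -> rhs (tuples of column indexes)."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        val = tuple(row[i] for i in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Decompose into Hourly_Emps2(S, N, L, R, H) and Wages(R, W).
hourly_emps2 = {(s, n, l, r, h) for (s, n, l, r, w, h) in hourly_emps}
wages = {(r, w) for (s, n, l, r, w, h) in hourly_emps}

# Natural join of the two smaller tables on R.
rejoined = {
    (s, n, l, r, w, h)
    for (s, n, l, r, h) in hourly_emps2
    for (r2, w) in wages
    if r == r2
}

r_determines_w = fd_holds(hourly_emps, lhs=(3,), rhs=(4,))
print(r_determines_w, rejoined == hourly_emps, sorted(wages))
```

Note that a check on one instance can only refute an FD. Whether R → W holds for every legal instance is a statement about the application, not about this particular table.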
Reasoning About FDs
Given some FDs, we can infer additional FDs:
  ssn → did, did → lot implies ssn → lot
An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.
  F+ = closure of F; is the set of all FDs that are implied by F.
Armstrong’s Axioms (X, Y, Z are sets of attributes):
  Reflexivity: If X ⊆ Y, then Y → X.
  Augmentation: If X → Y, then XZ → YZ for any Z.
  Transitivity: If X → Y and Y → Z, then X → Z.
These are sound (generate only FDs in F+) and complete (generate all FDs in F+) inference rules for FDs.

Reasoning About FDs (Contd.)
Additional rules (that follow from the AA):
  Union: If X → Y and X → Z, then X → YZ
  Decomposition: If X → YZ, then X → Y and X → Z
Example: Contracts(cid, sid, jid, did, pid, qty, value) and:
  C is the key: C → CSJDPQV
  Project purchases each part using single contract: JP → C
  Dept purchases at most one part from a supplier: SD → P
JP → C, C → CSJDPQV imply JP → CSJDPQV
SD → P implies SDJ → JP
SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
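The Contracts inferences can also be checked by computing an attribute closure instead of applying the axioms one at a time. This is a sketch under stated assumptions: it uses the standard attribute-closure algorithm, which these slides do not show, and the function names closure and implies are chosen here for illustration. FDs are written as (left side, right side) pairs of single-letter attribute strings.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derivable, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, i.e. rhs lies inside the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

# FDs of Contracts(C, S, J, D, P, Q, V) from the example.
contracts_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

print(implies(contracts_fds, "JP", "CSJDPQV"))   # True
print(implies(contracts_fds, "SDJ", "CSJDPQV"))  # True
print(implies(contracts_fds, "SD", "C"))         # False: SD alone only yields P
```

The closure test avoids enumerating F+, which can be exponentially large. It only answers whether one given FD is implied.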
Problems Prevented By BCNF
If BCNF is violated by (non-trivial) FD X → A, one of the following holds:
X is a subset of some key K.
  • We store (X, A) pairs redundantly.
  • E.g., Reserves(S, B, D, C) with SBD as only key and FD S → C
  • Credit card number of a sailor stored for each reservation
X is not a proper subset of any key.
  • Redundant storage of (X, A) pairs as above
  • And there is a chain of FDs K → X → A, which means that we cannot associate an X value with a K value unless we also associate an A value with an X value.
  • E.g., Hourly_Emps(S, N, L, R, W, H) with S as only key and FD R → W
  • Have chain S → R → W, hence cannot record the fact that employee S has rating R without knowing the hourly wage for that rating

Third Normal Form (3NF)
Reln R with FDs F is in 3NF if, for all X → A in F+:
  A ∈ X (called a trivial FD), or
  X is a superkey for R, or
  A is part of some key for R.
Minimality of a key is crucial in third condition above.
If R is in BCNF, is it automatically in 3NF? What about the other direction?
If R is in 3NF, some redundancy is possible.
3NF is a compromise, used when BCNF is not achievable (e.g., no “good” decomposition, or performance considerations).
Lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible. (covered soon)
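The BCNF and 3NF conditions can be tested on the two running examples. The sketch below makes several assumptions: it inspects only the FDs that are given rather than all of F+ (which is enough when testing the original relation itself), it finds keys by brute force, and the names candidate_keys and violations are illustrative. The third call adds a hypothetical FD C → S, which is not in these slides, to show a relation that is in 3NF but not in BCNF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) >= set(schema):
                keys.append(cand)
    return keys

def violations(schema, fds):
    """Return (BCNF violations, 3NF violations) among the given FDs."""
    prime = set().union(*candidate_keys(schema, fds))  # attributes in some key
    bcnf, third = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(schema):
            continue  # left side is a superkey: fine for both normal forms
        for a in sorted(set(rhs) - set(lhs)):  # ignore the trivial part
            bcnf.append((lhs, a))
            if a not in prime:
                third.append((lhs, a))
    return bcnf, third

# Hourly_Emps with S as only key and R -> W: violates both normal forms.
print(violations("SNLRWH", [("S", "SNLRWH"), ("R", "W")]))
# Reserves with SBD as only key and S -> C: violates both normal forms.
print(violations("SBDC", [("SBD", "C"), ("S", "C")]))
# Hypothetical extra FD C -> S: CBD becomes a key too, so C and S are part of a key.
print(violations("SBDC", [("SBD", "C"), ("S", "C"), ("C", "S")]))
```

In the last case the BCNF list is non-empty and the 3NF list is empty. This matches the remark that a 3NF relation can still store some (X, A) pairs redundantly.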
Lossless Join Decompositions
Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for every instance r that satisfies F:
  π_X(r) ⋈ π_Y(r) = r
It is always true that r ⊆ π_X(r) ⋈ π_Y(r)
  In general, the other direction does not hold.
  If it does, the decomposition is lossless-join.
Definition extended to decomposition into three or more relations in a straightforward way.
It is essential that all decompositions used to deal with redundancy be lossless. Why?

More on Lossless Join
The decomposition of R into X and Y is lossless-join w.r.t. F if and only if the closure of F contains:
  X ∩ Y → X, or
  X ∩ Y → Y
Special case: For FD U → V, the decomposition of R into UV and R − V is lossless-join.

Instance r over ABC
A  B  C
1  2  3
4  5  6
7  2  8

Projection onto AB
A  B
1  2
4  5
7  2

Projection onto BC
B  C
2  3
5  6
2  8

Join of the two projections (the last two tuples are not in r)
A  B  C
1  2  3
4  5  6
7  2  8
1  2  8
7  2  3
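The ABC example can be reproduced in a few lines, which also shows where the extra tuples come from. This is a minimal sketch, not part of the slides: the helper names closure and is_lossless are assumptions, and is_lossless implements only the two-relation test (X ∩ Y → X or X ∩ Y → Y) using attribute closure.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(x, y, fds):
    """Binary decomposition test: the shared attributes must determine X or Y."""
    common = closure(set(x) & set(y), fds)
    return set(x) <= common or set(y) <= common

# Instance r over (A, B, C) from the example.
r = {(1, 2, 3), (4, 5, 6), (7, 2, 8)}

ab = {(a, b) for (a, b, c) in r}   # projection onto AB
bc = {(b, c) for (a, b, c) in r}   # projection onto BC

# Natural join of the projections on the shared attribute B.
joined = {(a, b, c) for (a, b) in ab for (b2, c) in bc if b == b2}
spurious = joined - r

print(sorted(spurious))                 # [(1, 2, 8), (7, 2, 3)]
print(is_lossless("AB", "BC", []))      # False: B determines neither A nor C
print(is_lossless("SNLRH", "RW", [("S", "SNLRWH"), ("R", "W")]))  # True
```

The spurious tuples appear because B = 2 is paired with two different A values and two different C values, so the join recombines them in every possible way. The last call checks the Hourly_Emps2/Wages split, where the shared attribute R determines all of Wages.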
Finding The Minimal Cover
F = {A → B, ABCD → E, EF → GH, ACDF → EG}
Decomposition to have single attribute on right side
  A → B, ABCD → E, EF → G, EF → H, ACDF → E, ACDF → G
Check if any attribute on left side can be deleted without changing closure
  B can be deleted from ABCD → E (because A → B), and then F from ACDF → E:
  A → B, ACD → E, EF → G, EF → H, ACD → E, ACDF → G
Delete FDs that are implied by others
  • The second copy of ACD → E is a duplicate
  • ACDF → G from ACD → E, EF → G
  A → B, ACD → E, EF → G, EF → H

Dependency-Preserving Decomposition into 3NF
Using minimal cover F of given FD set, we can now achieve a lossless-join, dependency-preserving decomposition into 3NF.
  1. Lossless-join decomposition until all smaller relations are in 3NF
  2. For each FD X → A in F that is not preserved, add relation XA
Result is lossless-join (X is superkey of XA) and dependency-preserving (obviously), but is it still in 3NF?
  All relations after step 1 are in 3NF, but what about XA?
  X → A is not a problem for 3NF because X is a superkey of XA
  What if another FD on XA is a problem for 3NF?
  • Any FD on XA can only contain attributes from X ∪ {A}
  • If right-hand side of FD in F_XA contains A, left must be X (otherwise X → A would not have been in minimal cover)
  • If right-hand side does not contain A, it must be a subset of X, i.e., is a subset of a key
  • Why is X a key? It is a superkey, but is it minimal?
  • Yes: if X’ ⊂ X was a key, then X → A would not have been in the minimal cover and X’ → A would have been there
Why not use the same algorithm for lossless-join, dependency-preserving decomposition into BCNF?
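The three minimal-cover steps for F = {A → B, ABCD → E, EF → GH, ACDF → EG} can be sketched in Python as follows. This is an illustration under assumptions, not the slides' own code: the function names are made up here, attributes are single letters, and the result depends on the order in which attributes and FDs are examined, since a minimal cover is in general not unique.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: one FD per right-side attribute.
    cover = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: delete a left-side attribute if the FD still follows without it.
    for i, (lhs, a) in enumerate(cover):
        for b in sorted(lhs):
            smaller = cover[i][0] - {b}
            if smaller and a in closure(smaller, cover):
                cover[i] = (smaller, a)
    # Step 3: drop exact duplicates, then FDs implied by the remaining ones.
    cover = list(dict.fromkeys(cover))
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover

f = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
result = minimal_cover(f)
for lhs, a in result:
    print("".join(sorted(lhs)), "->", a)
```

On this input the sketch prints A -> B, ACD -> E, EF -> G and EF -> H, which agrees with the cover derived by hand.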
Update on DB Design Process
Create ER diagram
Translate ER diagram into set of relations
Check relations for redundancy problems (not in 3NF, BCNF)
Perform decomposition to fix problems
Update ER diagram

Refining Entity Sets
Consider Hourly_Emps(ssn, name, lot, rating, hourly_wages, hours_worked)
  FDs: S → SNLRWH and R → W
Assume designer created entity set Hourly_Emps as above
  Redundancy problem with R → W
  Could not discover it in ER diagram (only shows primary key constraints)
To fix redundancy problem, create new entity set Wage_Table(rating, hourly_wages)
  Add relationship to connect Hourly_Emps2(S, N, L, H) and Wage_Table(R, W)
Similar for refining of relationship sets (see book)