Week 5
Week 5 Lecture 1
Class BSCCS2001
Materials
Module # 21
Type Lecture
Week # 5
Consider combining relations
It leads to anomalies
Anomaly: Inconsistencies that can arise due to data changes in a database with insertion, deletion and update
These problems occur in poorly planned, un-normalized DBs where all the data is stored in one table (a flat-file
DB)
Insertion anomaly
Deletion anomaly
Update anomaly
Insertion anomaly
When the insertion of a data record is not possible without adding some additional unrelated data to the record
We cannot add an Instructor in instructor_with_department if the department does not have a building or budget
Deletion anomaly
When deletion of a data record results in losing some unrelated information that was stored as part of the record
that was deleted from a table
If we delete the last Instructor of a Department from instructor_with_department, we lose the building and budget information for that department
Update anomaly
When a data is changed, which could involve many records having to be changed, leading to the possibility of
some changes being made incorrectly
When the budget changes for a Department having a large number of Instructors in instructor_with_department, the application may miss some of them
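The update and deletion anomalies can be reproduced on a tiny flat table. The sketch below is illustrative only: the rows, names, and budget values are made up, and `budgets_by_department` is a hypothetical helper, not part of the lecture.

```python
# Hypothetical sample rows for instructor_with_department (all values are made up).
rows = [
    {"ID": 1, "name": "Asha", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": 2, "name": "Ravi", "dept_name": "Physics", "building": "Watson", "budget": 70000},
    {"ID": 3, "name": "Meera", "dept_name": "Music", "building": "Packard", "budget": 80000},
]


def budgets_by_department(table):
    """Collect the set of distinct budgets stored for each department."""
    seen = {}
    for row in table:
        seen.setdefault(row["dept_name"], set()).add(row["budget"])
    return seen


# Update anomaly: the application changes the Physics budget but misses one row.
updated = [dict(r) for r in rows]
updated[0]["budget"] = 90000  # the row for ID 1 is updated
# The row for ID 2 is forgotten, so Physics now carries two conflicting budgets.
inconsistent = {d for d, b in budgets_by_department(updated).items() if len(b) > 1}

# Deletion anomaly: removing the last Music instructor also removes
# the only record of the Music building and budget.
after_delete = [r for r in rows if r["ID"] != 3]
remaining_departments = {r["dept_name"] for r in after_delete}

print(inconsistent)
print(remaining_departments)
```

With a separate department table, the budget would be stored once and neither problem could occur.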
We have observed the following:
Redundancy ⇒ Anomaly
Dependency ⇒ Redundancy
No
Decomposition
Suppose we had started with inst_dept
Write a rule "if there were a schema (dept_name, building, budget), then dept_name would be a candidate key"
In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated
Suppose we decompose
Note that if name can be duplicate, then employee2 is a weak entity set and cannot exist without an identifying
relationship
The next slide shows how we lose information — we cannot reconstruct the original employee relation — and so, this
is a lossy decomposition
Decomposition: Lossless-join Decomposition
Lossless Join Decomposition
Decomposition of R = (A, B, C)
R1 = (A, B), R2 = (B, C)
Lossless Join Decomposition is a decomposition of a relation R into relations R1 , R2 such that if we perform
natural join of two smaller relations it will return the original relation
R1 ∪ R2 = R, R1 ∩ R2 ≠ ϕ
For every legal instance r of R: r1 = Π_R1(r), r2 = Π_R2(r)
r1 ⋈ r2 = r
This is effective in removing the redundancy from DBs while preserving the original data
In other words, by lossless decomposition it becomes feasible to reconstruct the relation R from decomposed tables
R1 and R2 by using Joins
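A minimal sketch of the project-then-join check for R = (A, B, C) with R1 = (A, B) and R2 = (B, C). It tests one concrete instance only, not the schema as a whole. The helper names (`project`, `natural_join`, `rejoins_to_original`) and the sample tuples are assumptions made for illustration.

```python
def project(rows, attrs):
    """Projection onto attrs; duplicates collapse because the result is a set."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two sets of tuples, each tuple a frozenset of (attr, value)."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


def rejoins_to_original(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the instance."""
    all_attrs = sorted(set(attrs1) | set(attrs2))
    original = project(rows, all_attrs)
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Instance where each B value has a single C value: the join is lossless here.
good = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c1"},
]
# Instance where b1 pairs with two C values: the join adds spurious tuples.
bad = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c2"},
]

print(rejoins_to_original(good, ["A", "B"], ["B", "C"]))
print(rejoins_to_original(bad, ["A", "B"], ["B", "C"]))
```

In the second instance the join returns more tuples than the original relation, which is exactly how information is lost: the extra tuples cannot be told apart from the real ones.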
the value of each attribute contains only a single value from that domain
Non-atomic values complicate storage and encourage redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and set of owners stored with each account
Atomicity is actually a property of how the elements of the domain are used
Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
If the first two characters are extracted to find the department, the domain of the roll numbers is not atomic
This encodes information in the application program rather than in the database
Consider:
How to search telephone numbers
Duplicated information
Week 5 Lecture 2
Class BSCCS2001
Materials
Module # 22
Type Lecture
Week # 5
In the case that a relation R is not in "good" form, decompose it into a set of relations {R1, R2, ..., Rn} such that
Functional dependencies
Multi-valued dependencies
Other dependencies
Functional Dependencies
Constraints on the set of legal relations
Require that the values for a certain set of attributes uniquely determine the values for another set of attributes
α ⊆ R and β ⊆ R
The functional dependency (FD)
α → β
holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β
That is: t1[α] = t2[α] ⇒ t1[β] = t2[β]
dept_name → building
dept_name → budget
ID → budget
dept_name → salary
test relations to see if they are legal under a given set of functional dependencies
We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F
NOTE: A specific instance of a relation schema may satisfy a functional dependency even if the functional
dependency does not hold on all legal instances
name → ID
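Testing whether one instance satisfies an FD can be sketched as below. The function name `satisfies` and the sample instructor rows are assumptions for illustration. Note that passing this test says nothing about whether the FD holds on all legal instances.

```python
def satisfies(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical instance of an instructor relation (values are made up).
instructor = [
    {"ID": "101", "name": "Asha", "dept_name": "Physics", "salary": 90000},
    {"ID": "102", "name": "Ravi", "dept_name": "Physics", "salary": 85000},
    {"ID": "103", "name": "Meera", "dept_name": "Music", "salary": 70000},
]

# Satisfied by this instance only because no two instructors share a name.
print(satisfies(instructor, ["name"], ["ID"]))
# Violated: two Physics instructors have different salaries.
print(satisfies(instructor, ["dept_name"], ["salary"]))
```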
Example:
ID, name → ID
name → name
In general, α → β is trivial if β ⊆ α
StudentID → Semester
StudentID, Lecture → TA
EmployeeID → EmployeeName
EmployeeID → DepartmentID
DepartmentID → DepartmentName
Reflexivity: if β ⊆ α, then α → β
Augmentation: if α → β , then γα → γβ
Transitivity: if α → β and β → γ , then α → γ
These axioms can be repeatedly applied to generate new FDs and added to F
The process of generating FDs terminates after a finite number of steps, and the result is called the closure set F+ of F
This is the set of all FDs logically implied by F
Clearly, F ⊆ F+
These axioms are:
Sound (generate only functional dependencies that actually hold)
Complete (eventually generate all functional dependencies that hold)
Week 5 Lecture 3
Class BSCCS2001
Materials
Module # 23
Type Lecture
Week # 5
A → H
by transitivity from A → B and B → H
AG → I
by augmenting A → C with G, to get AG → CG and then transitivity with CG → I
CG → HI
by augmenting CG → I with CG to infer CG → CGI and augmenting CG → H with I to infer CGI → HI and then transitivity
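The three derivations above can be replayed mechanically. This sketch assumes single-letter attribute names and takes as given only the five dependencies the derivations cite (A → B, A → C, B → H, CG → H, CG → I). The helpers `fd`, `augment`, and `transitive` are hypothetical names.

```python
def fd(lhs, rhs):
    """Build an FD as a pair of frozensets; each character is one attribute."""
    return (frozenset(lhs), frozenset(rhs))


def augment(dep, extra):
    """Augmentation: from α → β infer γα → γβ."""
    lhs, rhs = dep
    extra = frozenset(extra)
    return (lhs | extra, rhs | extra)


def transitive(first, second):
    """Transitivity: from α → β and β → γ infer α → γ."""
    if first[1] != second[0]:
        raise ValueError("RHS of the first FD must equal LHS of the second")
    return (first[0], second[1])


a_to_b, a_to_c, b_to_h = fd("A", "B"), fd("A", "C"), fd("B", "H")
cg_to_h, cg_to_i = fd("CG", "H"), fd("CG", "I")

# A → H by transitivity
a_to_h = transitive(a_to_b, b_to_h)

# AG → I: augment A → C with G, then transitivity with CG → I
ag_to_i = transitive(augment(a_to_c, "G"), cg_to_i)

# CG → HI: CG → CGI and CGI → HI, then transitivity
cg_to_cgi = augment(cg_to_i, "CG")
cgi_to_hi = augment(cg_to_h, "I")
cg_to_hi = transitive(cg_to_cgi, cgi_to_hi)
```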
To compute the closure of a set of functional dependencies F :
F+ ← F
repeat
Reflexivity: if β ⊆ α, then α → β
Augmentation: if α → β , then γα → γβ
Transitivity: if α → β and β → γ , then α → γ
Prove the rules from:
Basic axioms
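To compute the closure α+ of a set of attributes α under F:
result ← α
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result ← result ∪ γ
    end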
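A runnable sketch of the attribute closure loop. The schema R = (A, B, C, G, H, I) and the FD set below are assumptions, taken from the dependencies cited in the derivations earlier in this lecture. Attribute names are single letters so that a string can stand for a set of attributes.

```python
def attribute_closure(attrs, fds):
    """Compute the closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply β → γ whenever β is already contained in the result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Assumed FD set: A → B, A → C, CG → H, CG → I, B → H
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(attribute_closure("AG", F)))
print(sorted(attribute_closure("A", F)))
```

The loop stops as soon as a full pass over F adds nothing, so it runs at most one pass per attribute added plus a final pass.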
Is AG a super key?
Does AG → R? == Is (AG)+ ⊇ R
Is any subset of AG a superkey?
Does A → R? == Is (A)+ ⊇ R
Does G → R? == Is (G)+ ⊇ R
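The superkey questions above reduce to closure computations. The sketch below assumes R = (A, B, C, G, H, I) and F = {A → B, A → C, CG → H, CG → I, B → H}, which are not stated in this part of the notes. The function names are hypothetical.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey when its closure covers the whole schema."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey with no superkey among its proper subsets.

    Dropping one attribute at a time is enough, because any subset of a
    non-superkey is also a non-superkey.
    """
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(is_superkey(set(attrs) - {a}, schema, fds) for a in attrs)


R = "ABCGHI"  # assumed schema
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]  # assumed FDs

print(is_superkey("AG", R, F), is_superkey("A", R, F), is_superkey("G", R, F))
print(is_candidate_key("AG", R, F))
```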
Computing closure of F
For each γ ⊆ R, we find the closure γ + and for each S ⊆ γ + , we output a functional dependency γ → S
because the non-trivial dependency dept_name → building, budget holds on instr_dept, but dept_name is not a
superkey
BCNF: Decomposition
If in schema R a non-trivial dependency α → β causes a violation of BCNF, we decompose R into:
α∪β
(R − (β − α))
In our example:
α = dept_name
β = building, budget
dept_name → building, budget
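inst_dept is replaced by
(α ∪ β) = (dept_name, building, budget)
(R − (β − α)) = all attributes of inst_dept except building and budget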
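The split into α ∪ β and R − (β − α) can be sketched in Python. The attribute list used for inst_dept below is an assumption (the notes do not list it), and `violates_bcnf` and `bcnf_split` are hypothetical names.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_bcnf(schema, alpha, beta, fds):
    """α → β violates BCNF if it is non-trivial and α is not a superkey."""
    alpha, beta = set(alpha), set(beta)
    nontrivial = not beta <= alpha
    alpha_is_superkey = attribute_closure(alpha, fds) >= set(schema)
    return nontrivial and not alpha_is_superkey


def bcnf_split(schema, alpha, beta):
    """Return the two schemas (α ∪ β) and (R − (β − α))."""
    schema, alpha, beta = set(schema), set(alpha), set(beta)
    return alpha | beta, schema - (beta - alpha)


# Assumed attribute list for inst_dept.
inst_dept = {"ID", "name", "salary", "dept_name", "building", "budget"}
alpha, beta = {"dept_name"}, {"building", "budget"}
fds = [(alpha, beta)]

if violates_bcnf(inst_dept, alpha, beta, fds):
    r1, r2 = bcnf_split(inst_dept, alpha, beta)
    print(sorted(r1))
    print(sorted(r2))
```

Because α stays in both result schemas, the two parts share dept_name and can be joined back together.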
Lossless Join
If we decompose a relation R into relations R1 and R2 :
Decomposition is lossy if R1 ⋈ R2 ⊃ R
Decomposition is lossless if R1 ⋈ R2 = R
To check if lossless join decomposition using FD set, the following must hold:
R1 ∪ R2 = R
The intersection of the attributes of R1 and R2 must not be empty
R1 ∩ R2 ≠ ϕ
Common attribute must be a key for at least one relation (R1 or R2 )
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
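The three conditions can be checked with an attribute closure. This is a sketch under assumptions: the example schema R = (A, B, C) with F = {A → B, B → C} and the name `passes_lossless_test` are chosen for illustration. A failed test only shows that the FDs alone do not guarantee a lossless join.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def passes_lossless_test(schema, r1, r2, fds):
    """Check R1 ∪ R2 = R, R1 ∩ R2 ≠ ϕ, and R1 ∩ R2 → R1 or R1 ∩ R2 → R2."""
    schema, r1, r2 = set(schema), set(r1), set(r2)
    if r1 | r2 != schema:
        return False
    common = r1 & r2
    if not common:
        return False
    closure = attribute_closure(common, fds)
    return r1 <= closure or r2 <= closure


F = [("A", "B"), ("B", "C")]  # assumed example FD set

print(passes_lossless_test("ABC", "AB", "BC", F))  # common attribute B, B+ = BC
print(passes_lossless_test("ABC", "AC", "BC", F))  # common attribute C, C+ = C
```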
Prove that BCNF ensures Lossless Join
If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that
all functional dependencies hold, then that decomposition is dependency preserving
Consider:
R = CSZ, F = {CS → Z, Z → C}
Key = CS
α → β ∈ F+
at least one of the following holds:
If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions must hold)
Goals of Normalization
Let R be a relation scheme with a set F of functional dependencies
In the case that a relation scheme R is not in "good" form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that
each relation scheme is in good form
Problems with Decomposition
There are 3 potential problems to consider:
Consider a relation
where an instructor may have more than one phone and can have multiple children
There are no non-trivial functional dependencies and therefore the relation is in BCNF
Insertion anomalies — that is, if we add a phone 981-992-3443 to 99999, we need to add two tuples
This suggests the need for higher normal forms such as the Fourth Normal Form (4NF)
Week 5 Lecture 4
Class BSCCS2001
Materials
Module # 24
Type Lecture
Week # 5
is AG a super key?
Does AG → R? == is (AG)+ ⊇ R
is any subset of AG a superkey?
Does A → R? == is (A)+ ⊇ R
Does G → R? == is (G)+ ⊇ R
There are several uses of the attribute closure algorithm
Computing closure of F
For each γ ⊆ R, we find the closure γ + and for each S ⊆ γ + , we output a functional dependency γ → S
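A brute-force sketch of this use of attribute closure. It assumes a small example, R = (A, B, C) with F = {A → B, B → C}, and skips the empty set on both sides to keep the output readable. The work grows exponentially with the number of attributes, so this is only practical for tiny schemas.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def nonempty_subsets(attrs):
    """Yield every non-empty subset of attrs as a frozenset."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def closure_of_fds(schema, fds):
    """For each γ ⊆ R and each S ⊆ γ+, output γ → S."""
    result = set()
    for gamma in nonempty_subsets(schema):
        for s in nonempty_subsets(attribute_closure(gamma, fds)):
            result.add((gamma, s))
    return result


F = [("A", "B"), ("B", "C")]  # assumed example FD set
f_plus = closure_of_fds("ABC", F)
print(len(f_plus))
```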
Extraneous Attributes
Consider a set F of FDs and the FD α → β in F
Attribute A is extraneous in α if A ∈ α and F logically implies
(F − {α → β}) ∪ {(α − A) → β}
Attribute A is extraneous in β if A ∈ β and the set of FDs
(F − {α → β}) ∪ {α → (β − A)} logically implies F
NOTE: Implication in the opposite direction is trivial in each of the cases above, since a "stronger" functional
dependency always implies a weaker one
Example: Given F = {A → C, AB → C}
B is extraneous in AB → C because {A → C, AB → C} logically implies A → C
(that is, the result of dropping B from AB → C)
A+ = AC in {A → C, AB → C}
Example: Given F = {A → C, AB → CD}
C is extraneous in AB → CD since AB → C can be inferred even after deleting C
AB+ = ABCD in {A → C, AB → D}
To test whether A ∈ β is extraneous in β, form F′ = (F − {α → β}) ∪ {α → (β − A)}
Compute α+ under F′; if α+ contains A, then A is extraneous in β
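Both extraneous-attribute tests can be sketched with attribute closure, using the two examples from this section. The function names and the frozenset representation of FDs are assumptions made for illustration.

```python
def fd(lhs, rhs):
    """Build an FD as a pair of frozensets; each character is one attribute."""
    return (frozenset(lhs), frozenset(rhs))


def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def extraneous_in_lhs(attr, dep, fds):
    """A is extraneous in α if (α − A)+ computed under F contains β."""
    lhs, rhs = dep
    return attr in lhs and rhs <= attribute_closure(lhs - {attr}, fds)


def extraneous_in_rhs(attr, dep, fds):
    """A is extraneous in β if α+ under F′ contains A,
    where F′ = (F − {α → β}) ∪ {α → (β − A)}."""
    lhs, rhs = dep
    if attr not in rhs:
        return False
    f_prime = [d for d in fds if d != dep] + [(lhs, rhs - {attr})]
    return attr in attribute_closure(lhs, f_prime)


F1 = [fd("A", "C"), fd("AB", "C")]
F2 = [fd("A", "C"), fd("AB", "CD")]

print(extraneous_in_lhs("B", fd("AB", "C"), F1))   # B in AB → C
print(extraneous_in_rhs("C", fd("AB", "CD"), F2))  # C in AB → CD
```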
G covers F: every functional dependency in F is logically implied by (is a member of the closure of) the functional dependency set G, that is, G+ ⊇ F
Canonical Cover
Sets of FDs may have redundant dependencies that can be inferred from the others
Can we have some kind of "optimal" or "minimal" set of FDs to work with?
A Canonical Cover for F is a set of dependencies Fc such that ALL the following properties are satisfied:
F + = Fc+
F logically implies all dependencies in Fc
Fc logically implies all dependencies in F
No functional dependency in Fc contains an irrelevant attribute
Equivalent to F
Canonical Cover: LHS
Canonical Cover
To compute a canonical cover for F :
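repeat
    Use the union rule to replace any dependencies in F of the form
    α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an
    irrelevant (extraneous) attribute either in α or in β
    If such an attribute is found, delete it from α → β
until F does not change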
NOTE: The union rule may become applicable after some irrelevant attributes have been deleted, so it has to be re-applied
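A minimal sketch of the canonical cover loop: apply the union rule, remove one irrelevant attribute, and repeat until nothing changes. The input F = {A → BC, B → C, A → B, AB → C} is an illustrative example not taken from these notes. A canonical cover is not unique in general, so other removal orders may give a different but equivalent result.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    cover = [(frozenset(l), frozenset(r)) for l, r in fds]
    changed = True
    while changed:
        changed = False
        # Union rule: merge α → β1 and α → β2 into α → β1 β2
        merged = {}
        for lhs, rhs in cover:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        cover = list(merged.items())
        # Remove one irrelevant attribute, then start over
        for i, (lhs, rhs) in enumerate(cover):
            for a in sorted(lhs):
                if rhs <= attribute_closure(lhs - {a}, cover):
                    cover[i] = (lhs - {a}, rhs)
                    changed = True
                    break
            if changed:
                break
            for a in sorted(rhs):
                f_prime = cover[:i] + [(lhs, rhs - {a})] + cover[i + 1:]
                if a in attribute_closure(lhs, f_prime):
                    cover[i] = (lhs, rhs - {a})
                    changed = True
                    break
            if changed:
                break
        # Drop dependencies whose right-hand side became empty
        cover = [(l, r) for l, r in cover if r]
    return set(cover)


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]  # illustrative input
Fc = canonical_cover(F)
print(sorted((sorted(l), sorted(r)) for l, r in Fc))
```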
Practice Problems on Functional Dependencies
Week 5 Lecture 5
Class BSCCS2001
Materials
Module # 25
Type Lecture
Week # 5
R1 ∩ R2 → R1
R1 ∩ R2 → R2
The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a
necessary condition only if all constraints are functional dependencies
Decompose as: Supplier (S#, Sname, City, Qty): Parts (P#, Qty)
Take natural join to reconstruct: Supplier ⋈ Parts
Decompose as: Supplier (S#, Sname, City, Qty): Parts (P#, Qty)
R = (A, B, C)
F = {A → B, B → C}
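Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition: R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition: R1 ∩ R2 = {A} and A → AB
Not dependency preserving (B → C cannot be checked without computing R1 ⋈ R2)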
Dependency Preservation
Let Fi be the set of dependencies in F+ that include only attributes in Ri
A decomposition is dependency preserving if (F1 ∪ F2 ∪ ... ∪ Fn)+ = F+
If it is not, then checking updates for violation of functional dependencies may require computing join, which is
expensive
Let R be the original relational schema with FD set F. Let R1 and R2, with FD sets F1 and F2 respectively, be the decomposed sub-relations of R. The decomposition of R is said to be dependency preserving if (F1 ∪ F2)+ = F+
The restriction of F + to Ri is the set of all functional dependencies in F + that include only attributes of Ri
compute F+;
for each schema Ri in the decomposition do
begin
    Fi = the restriction of F+ to Ri;
end
F′ = ϕ
for each restriction Fi do
begin
    F′ = F′ ∪ Fi
end
compute F′+;
if (F′+ = F+) then return (true) else return (false);
The procedure for checking dependency preservation takes exponential time to compute F+ and (F1 ∪ F2 ∪ ... ∪ Fn)+
F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}
Decomposition: R1(A, B, C, D) R2(B, F ) R3(D, E)
F ′ = F1 ∪ F2 ∪ F3
Checking for: A → E , A → F in F ′+
A → D (from R1), D → E (from R3) : A → E (By transitivity)
A → B (from R1), B → F (from R2) : A → F (By transitivity)
Checking for: BC → E , BC → F in F ′+
BC → D (from R1), D → E (from R3) : BC → E (by transitivity)
B → F (from R2) : BC → F (by augmentation)
R(A, B, C, D)
F = {A → B, B → C, C → D, D → A}
Decomposition: R1(A, B) R2(B, C) R3(C, D)
A → B is preserved on table R1
B → C is preserved on table R2
C → D is preserved on table R3
We have to check whether the one remaining FD: D → A is preserved or not
F ′ = F1 ∪ F2 ∪ F3
Checking for: D → A in F ′+
D → C (from R3), C → B (from R2), B → A (from R1) : D → A (by transitivity)
Hence, all dependencies are preserved
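result = α
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
α → β is preserved if result contains all attributes in β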
This procedure takes polynomial time, instead of the exponential time required to compute F+ and (F1 ∪ F2 ∪ ... ∪ Fn)+
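The polynomial-time test can be sketched as follows and run against the examples in this lecture. The closure (result ∩ Ri)+ is taken with respect to F. The function names `preserves` and `is_dependency_preserving` are hypothetical, and attribute names are assumed to be single letters.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as (lhs, rhs) pairs of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(alpha, beta, decomposition, fds):
    """Test whether α → β is preserved, without computing F+."""
    result = set(alpha)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            t = attribute_closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(beta) <= result


def is_dependency_preserving(decomposition, fds):
    """The decomposition is dependency preserving if every FD in F is preserved."""
    return all(preserves(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


F_cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
F_chain = [("A", "B"), ("B", "C")]

print(is_dependency_preserving(["AB", "BC", "CD"], F_cycle))
print(is_dependency_preserving(["AB", "AC"], F_chain))
```

For D → A in the first example, the result grows from D to CD (via R3), then BCD (via R2), then ABCD (via R1), matching the chain D → C, C → B, B → A shown earlier.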