Unit 3 Slides

Uploaded by

Shreya M
Copyright
© © All Rights Reserved

Database Management Systems

Functional Dependencies

Prof Nivedita Kasturi


Department of Computer Science and Engineering
Database Management Systems
Functional Dependencies

• Each relation schema consists of a number of attributes, and the relational database schema
consists of a number of relation schemas.
• We have assumed that attributes are grouped to form a relation schema by using the common
sense of the database designer or by mapping a database schema design from a conceptual data
model such as the ER data model.
• So far, we have not developed any measure of appropriateness or goodness to assess the quality of
the design, other than the intuition of the designer.
• There are two levels at which we can discuss the goodness of relation schemas. The first is the
logical (or conceptual) level—how users interpret the relation schemas and the meaning of their
attributes. The second is the implementation (or physical storage) level—how the tuples in a
base relation are stored and updated.
Database Management Systems
Functional Dependencies

• Database design may be performed using two approaches: bottom-up or top-down.

• A bottom-up design methodology (also called design by synthesis) considers the basic
relationships among individual attributes as the starting point and uses those to construct relation
schemas. This approach is not very popular because it suffers from the problem of having to collect
a large number of binary relationships among attributes as the starting point. For practical
situations, it is next to impossible to capture binary relationships among all such pairs of attributes.

• In contrast, a top-down design methodology (also called design by analysis) starts with a number
of groupings of attributes into relations that exist together naturally, for example, on an invoice, a
form, or a report. The relations are then analyzed individually and collectively, leading to further
decomposition until all desirable properties are met.
Database Management Systems
Functional Dependencies

• The theory is applicable primarily to the top-down design approach, and as such is more
appropriate when performing design of databases by analysis and decomposition of sets of
attributes that appear together in files, in reports, and on forms in real-life situations.
• Relational database design ultimately produces a set of relations. The implicit goals of the design
activity are information preservation and minimum redundancy.
• A functional dependency is a constraint between two sets of attributes from the database.
• Suppose that our relational database schema has n attributes A1, A2, … , An; let us think of the
whole database as being described by a single universal relation schema R = {A1, A2, … , An}.
• We do not imply that we will actually store the database as a single universal table; we use this
concept only in developing the formal theory of data dependencies.
Database Management Systems
Functional Dependencies

■ Some of the most commonly used types of real-world constraints can be represented
formally as keys (super keys, candidate keys, and primary keys), or as functional
dependencies
■ Functional dependencies (FDs)
■ Are used to specify formal measures of the "goodness" of relational designs
■ Together with keys, are used to define normal forms for relations
■ Are constraints that are derived from the meaning and interrelationships of the data attributes

■ A functional dependency (FD) is a relationship between two sets of attributes in which the value of
one can be derived, directly or indirectly, from the value of the other.
■ A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique
value for Y. X is called the determinant and Y the dependent.
Database Management Systems
Defining Functional Dependencies

■ A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are
subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The
constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y]
= t2[Y].
➔The values of the X component of a tuple uniquely (or functionally) determine the values of the
Y component. We also say that there is a functional dependency from X to Y, or that Y is functionally
dependent on X.
➔X functionally determines Y in a relation schema R if, and only if, whenever two tuples of r(R) agree
on their X-value, they must necessarily agree on their Y-value.
Note the following:
• If a constraint on R states that there cannot be more than one tuple with a given X-value in any
relation instance r(R)—that is, X is a candidate key of R—this implies that X → Y for any subset
of attributes Y of R (because the key constraint implies that no two tuples in any legal state r(R)
will have the same value of X). If X is a candidate key of R, then X → R.
• If X → Y in R, this does not say whether or not Y → X in R
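The tuple-based definition above can be checked mechanically against a single relation state. The following is a minimal sketch, not part of the original slides: the function name `satisfies_fd` and the sample rows are invented for illustration. Note that a state can only show that an FD is violated; an FD that happens to hold in one state is not thereby proven for the schema.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples with t1[X] = t2[X] but t1[Y] != t2[Y]
    return True


# Hypothetical relation state, invented for illustration only.
employees = [
    {"Ssn": 1, "Ename": "Asha", "Department": "Sales", "Salary": 50000},
    {"Ssn": 2, "Ename": "Ravi", "Department": "Sales", "Salary": 55000},
    {"Ssn": 3, "Ename": "Asha", "Department": "HR", "Salary": 50000},
]

ssn_to_ename = satisfies_fd(employees, ["Ssn"], ["Ename"])
ename_to_ssn = satisfies_fd(employees, ["Ename"], ["Ssn"])
dept_to_salary = satisfies_fd(employees, ["Department"], ["Salary"])
print(ssn_to_ename, ename_to_ssn, dept_to_salary)
```

In this invented state, Ssn → Ename is not violated, while Ename → Ssn is ruled out because two tuples share a name but have different Ssn values. This mirrors the last note above: X → Y says nothing about Y → X.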
Database Management Systems
Use of Functional Dependencies

Use functional dependencies in two ways:


1. To test instances of relations to see whether they satisfy a given set F of functional dependencies.
2. To specify constraints on the set of legal relations.

To Summarize
■ X → Y in R specifies a constraint on all relation instances r(R)
■ Written as X → Y; can be displayed graphically on a relation schema
■ FDs are derived from the real-world constraints on the attributes
■ A functional dependency is a property of the semantics or meaning of the attributes.
■ Relation extensions r(R) that satisfy the functional dependency constraints are called legal relation
states (or legal extensions) of R.
■ For example, {State, Driver_license_number} → Ssn should normally hold for any adult in the
United States and hence should hold whenever these attributes appear in a relation
Database Management Systems
Notational Conventions

• A set of attributes is denoted by a Greek letter, e.g., α
• A relation r over a schema R is written r(R)
• A set of attributes that is a superkey is denoted by K
• Relation names are written in lowercase, e.g., “teacher”

For simplicity, we assume that attribute names have only one meaning within
the database schema.
Database Management Systems
Defining Functional Dependencies

In this example, let's say we want to explore the functional dependency
"Department → Salary," which means that any two tuples with the same value for
the "Department" attribute must have the same value for the "Salary" attribute.

Is the functional dependency satisfied?


Database Management Systems
Examples of FD constraints (1)

■ Social security number determines employee name
■ SSN → ENAME

■ Project number determines project name and location
■ PNUMBER → {PNAME, PLOCATION}

■ Employee SSN and project number determine the hours per week that the
employee works on the project
■ {SSN, PNUMBER} → HOURS
Database Management Systems
Examples of FD constraints (2)

■ The main use of FDs is to describe a relation schema R further by specifying
constraints on its attributes that must hold at all times.

■ An FD is a property of the attributes in the schema R

■ The constraint must hold on every relation instance r(R)

■ If K is a key of R, then K functionally determines all attributes in R
■ (since we never have two distinct tuples with t1[K]=t2[K])
Database Management Systems
Diagrammatic Notation for displaying FDs

■ Each FD is displayed as a horizontal line.
■ The left-hand-side attributes are connected by vertical lines to the line representing the FD,
whereas the right-hand-side attributes are connected by lines with arrows pointing toward the
attributes.

Try and take a guess at what these FDs are?


Database Management Systems
Types of Functional Dependency

• An attribute of relation schema R is called a prime attribute of R if it is a member of some
candidate key of R.
• An attribute is called nonprime if it is not a prime attribute—that is, if it is not a member of
any candidate key.
• Both Ssn and Pnumber are prime attributes of WORKS_ON, whereas other attributes of
WORKS_ON are nonprime
Full Functional Dependency:
■ A functional dependency X → Y is a full functional dependency if removal of any attribute A
from X means that the dependency does not hold anymore; that is, for any attribute A ∈ X, (X −
{A}) does not functionally determine Y.
Database Management Systems
Types of Functional Dependency

■ Equivalently, a functional dependency X → Y is a full dependency if and only if no proper
subset of the determinant X functionally determines Y; the dependent Y can be either a prime
or a non-prime attribute.
Example: Consider the dependency ABC → D,
i.e., ABC determines D.
If no proper subset of ABC determines D (for example, none of BC → D, C → D, A → D holds),
then D is fully functionally dependent on ABC.
■ {Ssn, Pnumber} → Hours is a full dependency (neither Ssn → Hours nor Pnumber →
Hours holds).
Database Management Systems
Types of Functional Dependency

■ Partial Functional Dependency:
■ A functional dependency X → Y is a partial dependency if some attribute A ∈ X
can be removed from X and the dependency still holds; that is, for some A ∈ X,
(X − {A}) → Y.
■ In other words, if a non-prime attribute of the relation is determined by
only a part of a candidate key, the dependency is known as a
partial dependency.

Example: Consider the following


{Ssn, Pnumber} → Ename is partial because Ssn → Ename holds.
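The full/partial distinction can be tested by removing one attribute at a time from the left-hand side. The sketch below is illustrative only: the helper names `closure` and `is_full_dependency` are hypothetical, and the FD set contains just the two WORKS_ON-style dependencies used in the examples above. The `closure` helper computes the set of attributes determined by a given attribute set.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_full_dependency(lhs, rhs, fds):
    """True if lhs -> rhs is implied by fds and no (lhs - {A}) -> rhs is."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        raise ValueError("lhs -> rhs is not implied by the given FDs")
    # Full: dropping any single attribute A from lhs breaks the dependency.
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)


# FDs taken from the examples in the text.
fds = [({"Ssn", "Pnumber"}, {"Hours"}), ({"Ssn"}, {"Ename"})]

hours_is_full = is_full_dependency({"Ssn", "Pnumber"}, {"Hours"}, fds)
ename_is_full = is_full_dependency({"Ssn", "Pnumber"}, {"Ename"}, fds)
print(hours_is_full, ename_is_full)
```

Under these two FDs, {Ssn, Pnumber} → Hours is reported as full, and {Ssn, Pnumber} → Ename as partial, because Ssn alone already determines Ename.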
Database Management Systems
Types of Functional Dependency

■ Transitive Functional Dependency:
If a non-prime attribute of a relation is determined by another non-prime attribute, or by
a combination of part of a candidate key and a non-prime attribute, the
dependency is called a transitive dependency; i.e., in a relation, there may be dependencies
among non-key fields, and such a dependency is called a transitive functional dependency.

Example: if X → Y and Y → Z hold, then X → Z holds transitively.

■ Trivial Functional Dependency:
This follows from the reflexive rule: if X is a set of attributes and Y is a subset of X, then X → Y
holds.
Example: ABC → BC is a trivial dependency,
where X = {A, B, C} and Y = {B, C}.
Database Management Systems
Types of Functional Dependency

■ Multi-Valued Dependency:
Consider three attributes X, Y, and Z in a relation. If, for each value of X, there is a well-defined
set of values of Y and a well-defined set of values of Z, and the set of values of Y is independent of
the set of values of Z, the dependency is a multivalued dependency, written X →→ Y | Z.
Example: CAR_MODEL →→ MANUF_MONTH and CAR_MODEL →→ COLOUR
Database Management Systems
Ruling Out FDs
Consider the following TEACH relation.
We can say that the FD Text → Course may exist.

However, the FDs Teacher → Course, Teacher → Text, and Course → Text are ruled out.
(This is because, for each of them, some tuples in the current state agree on the LHS attribute but
differ on the RHS attribute.)
Database Management Systems
What FDs may exist?

• A relation R(A, B, C, D) with its extension.


• The following FDs may hold because the 4 tuples in the current extension have no
violation of these constraints: B -> C; C->B; {A,B} -> C; {A,B} -> D; and {C,D} -> B
• The following do not hold: A-> B, B-> A ; D -> C
Database Management Systems
Decomposition and Inference Rules

• We will discuss the rules of inference for functional dependencies and use them to
define the concepts of a cover, equivalence, and minimal cover among functional
dependencies.
• First, we will describe two desirable properties of decompositions, namely, the
dependency preservation property and the nonadditive (or lossless) join property, which
are both used by the design algorithms to achieve desirable decompositions.
• We now turn our attention to the process of decomposition that we will use to get rid of
unwanted dependencies and achieve higher normal forms.
Database Management Systems
Decomposition

• We use Decomposition to get rid of unwanted dependencies and achieve higher normal
forms.
• The only way to avoid the repetition-of-information problem in a schema is to
decompose it into two schemas. One must be careful while doing this as this could lead to
the loss of interesting relationships.
• Example: decompose
employee (ID, name, street, city, salary)
into
employee1 (ID, name) and employee2 (name, street, city, salary)

The flaw in this decomposition arises from the possibility that the enterprise has two
employees with the same name.
Database Management Systems
Loss of information via a bad decomposition (Lossy decomposition)

Figure 7.3 shows tuples, the resulting tuples using the schemas resulting from the decomposition,
and the result if we attempted to regenerate the original tuples using a natural join. As we see in
the figure, the two original tuples appear in the result along with two new tuples that incorrectly
mix data values pertaining to the two employees named Kim. We have more tuples, but we
actually have less information.
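The effect described above can be reproduced with a few lines of code. This is a rough sketch under stated assumptions: the two employee rows are made up (they are not the values from Figure 7.3), and `project` and `natural_join` are hypothetical helpers written for this illustration rather than any library API.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each result row is a tuple of (attr, value) pairs."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two sets of rows, matching on all shared attribute names."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                joined.add(tuple(sorted(merged.items())))
    return joined


# Invented rows: two different employees who share the name "Kim".
employee = [
    {"ID": 1, "name": "Kim", "street": "Main", "city": "Townsville", "salary": 75000},
    {"ID": 2, "name": "Kim", "street": "North", "city": "Hillside", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)
original = {tuple(sorted(row.items())) for row in employee}

print(len(original), len(rejoined))
```

With these rows, the join returns the two original tuples plus two spurious ones that pair each ID with the other Kim's address and salary, so the original relation can no longer be recovered.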
Database Management Systems
Lossless Decomposition

• Let R be a relation schema and let R1 and R2 form a decomposition of R—that is,
viewing R, R1, and R2 as sets of attributes, R = R1 ∪ R2.
• We say that the decomposition is a lossless decomposition if there is no loss of
information by replacing R with two relation schemas R1 and R2.
• Equivalently, the following SQL query must return exactly r:
select * from (select R1 from r) natural join (select R2 from r)
• Let F be a set of functional dependencies on R. R1 and R2 form a lossless decomposition of R if at
least one of the following functional dependencies is in F+
1. R1 ∩ R2 → R1
2. R1 ∩ R2 → R2
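The test above reduces to checking whether the attributes determined by R1 ∩ R2 include all of R1 or all of R2. The sketch below is one possible way to code it; `closure` and `passes_lossless_test` are hypothetical names. The FD "ID determines every other employee attribute" and the attribute lists for the instructor and department schemas are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def passes_lossless_test(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    determined = closure(set(r1) & set(r2), fds)
    return set(r1) <= determined or set(r2) <= determined


# Assumption: ID is a key of employee.
employee_fds = [({"ID"}, {"name", "street", "city", "salary"})]
split_on_name = passes_lossless_test(
    {"ID", "name"}, {"name", "street", "city", "salary"}, employee_fds
)

# Assumption: these attribute lists stand in for instructor and department.
dept_fds = [({"dept_name"}, {"building", "budget"})]
split_on_dept = passes_lossless_test(
    {"ID", "name", "dept_name", "salary"},
    {"dept_name", "building", "budget"},
    dept_fds,
)
print(split_on_name, split_on_dept)
```

The split on name fails the test because name determines nothing else, whereas the split on dept_name passes because dept_name determines every attribute of the second schema.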
Database Management Systems
Lossless Decomposition

Suppose we decompose r(R) into r1(R1) and r2(R2), where R1 ∩ R2 → R1. Then the following SQL
constraints must be imposed on the decomposed schema to ensure that its contents are consistent
with the original schema.
1. R1 ∩ R2 is the primary key of r1. This constraint enforces the functional dependency.
2. R1 ∩ R2 is a foreign key from r2 referencing r1. This constraint ensures that each tuple in r2
has a matching tuple in r1, without which it would not appear in the natural join of r1 and r2.

If r1 or r2 is decomposed further, as long as the decomposition ensures that all attributes in R1 ∩
R2 are in one relation, the primary or foreign-key constraint on r1 or r2 would be inherited by that
relation.


Database Management Systems
Lossless Decomposition

Consider the intersection of these two schemas, which is dept_name. Because
dept_name → dept_name, building, budget, the lossless-decomposition rule
is satisfied.

With in_dep decomposed into instructor and department, the amount of each budget is stored
exactly once. This suggests that using in_dep is a bad idea, since it stores the budget amounts
redundantly and runs the risk that some user might update the budget amount in one tuple but
not all, and thus create inconsistency.
Database Management Systems
Inference Rules for FDs (1)

■ Definition: An FD X → Y is inferred from or implied by a set of dependencies F
specified on R if X → Y holds in every legal relation state r of R; that is, whenever r
satisfies all the dependencies in F, X → Y also holds in r.
■ Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.
■ Example: if each department has one manager, so that Dept_no uniquely determines
Mgr_ssn (Dept_no → Mgr_ssn), and a manager has a unique phone number called
Mgr_phone (Mgr_ssn → Mgr_phone), then these two dependencies together imply that
Dept_no → Mgr_phone. This is an inferred or implied FD and need not be explicitly stated
in addition to the two given FDs.
Database Management Systems
Inference Rules for FDs (2)

■ Armstrong's inference rules:


■ IR1. (Reflexive) If Y is a subset-of X, then X → Y always holds.

■ IR2. (Augmentation) If X → Y, then XZ → YZ always holds.


■ (Notation: XZ stands for X ∪ Z)
■ Example: if {Ssn} → {Ename} then {Ssn, Bdate}→ {Ename, Bdate}

■ IR3. (Transitive) If X → Y and Y → Z, then X → Z always holds.


■ Example: if {Ssn} → {Dnumber} and {Dnumber} → {Dname} then {Ssn}→ {Dname}

■ IR1, IR2, IR3 form a sound and complete set of inference rules
■ These rules hold (soundness), and all other rules that hold can be deduced from them (completeness)

NOTE: By applying these rules repeatedly, we can find all of F+, given F. This collection of
rules is called Armstrong’s axioms (sound + complete).
Database Management Systems
Inference Rules for FDs (3)

■ Some additional inference rules that are useful:

■ Decomposition: If X → YZ, then X → Y and X → Z

■ Union/Additive: If X → Y and X → Z, then X → YZ


Example: if {Ssn} → {Ename} and {Ssn} → {Address} then {Ssn}→ {Ename, Address}

■ Pseudo-Transitivity: If X → Y and WY → Z, then WX → Z

■ The last three inference rules, as well as any other inference rules, can be deduced from
IR1, IR2, and IR3 (completeness property)
■ One important cautionary note regarding the use of these rules: Although X → A and X →
B implies X → AB by the union rule stated above, X → A and Y → B does imply that XY
→ AB. Also, XY → A does not necessarily imply either X → A or Y → A.
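The first claim in the cautionary note can be derived using only IR2 and IR3. The short derivation below is a sketch added for illustration; the step numbering is not from the original slides.

```latex
\begin{align*}
1.\;& X \to A   && \text{given} \\
2.\;& XY \to AY && \text{IR2: augment (1) with } Y \\
3.\;& Y \to B   && \text{given} \\
4.\;& AY \to AB && \text{IR2: augment (3) with } A \\
5.\;& XY \to AB && \text{IR3: transitivity of (2) and (4)}
\end{align*}
```

For the second claim, the earlier example {Ssn, Pnumber} → Hours serves as a counterexample: the pair determines Hours, yet neither Ssn → Hours nor Pnumber → Hours holds.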
Database Management Systems
Example 1

Suppose we are given a relation R{A, B, C, D, E, F} with the set of FDs shown
below:
A → BC
B→E
CD → EF
Let us show that the FD AD → F holds for R and is a member of the closure.
(1) A → BC {Given, A->B & A->C}
(2) A → C {Decomposition of (1)}
(3) AD → CD {Augmentation of (2) by adding D}
(4) CD → EF {Given}
(5) AD → EF {Transitivity of (3) and (4)}
(6) AD → F {Decomposition of (5)}
Database Management Systems
Example 2

Consider R = (Street, Zip, City) and the dependencies
F = { City Street -> Zip, Zip -> City }
Show that: Street Zip -> Street Zip City

1. Zip -> City – Given
2. Street Zip -> Street City – Augmentation of (1) by Street
3. City Street -> Zip – Given
4. City Street -> City Street Zip – Augmentation of (3) by City Street
5. Street Zip -> City Street Zip – Transitivity of (2) and (4)
Database Management Systems
Example 3

Consider the relation schema <R,F> where R = (ABCDEGHI) and the dependencies
F = { AB->E, BE->I, E->G, GI->H }. Show that AB->GH can be derived from F.

1. AB ->E Given
2. E ->G Given
3. BE->I Given
4. GI ->H Given
5. AB ->G Transitivity on (1) and (2)
6. AB -> BE Augmentation (1) by B
7. AB -> I Transitivity on (6) and (3)
8. AB -> GI Union on (5) and (7)
9. AB -> H Transitivity on (8) and (4)
10. AB -> GH Union on (5) and (9)
Database Management Systems
Closure

Database Management Systems
Recap of Functional Dependencies

■ X → Y holds if whenever two tuples have the same value for X, they must have
the same value for Y
■ For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then
t1[Y]=t2[Y]

■ X → Y in R specifies a constraint on all relation instances r(R)

■ FDs are derived from the real-world constraints on the attributes.


Database Management Systems
Functional Dependencies : Inference Rules and closure

In real life, it is impossible to specify all possible functional dependencies for a given situation.

For example, if each department has one manager, so that Dept_no uniquely determines
Mgr_ssn (Dept_no → Mgr_ssn), and a manager has a unique phone number called Mgr_phone
(Mgr_ssn → Mgr_phone), then these two dependencies together imply that Dept_no →
Mgr_phone.

This is an inferred or implied FD and need not be explicitly stated in addition to the two given
FDs.

Therefore, it is useful to define a concept called closure formally that includes all possible
dependencies that can be inferred from the given set F
Database Management Systems
Closure

• Suppose we are given a relation schema r(A, B, C, G, H, I) and the set of functional
dependencies: A → B, A → C, CG → H, CG → I, B → H

• The functional dependency: A → H is logically implied.

• That is, we can show that, whenever a relation instance satisfies our given set of functional
dependencies, A → H must also be satisfied by that relation instance.

• Suppose that t1 and t2 are tuples such that t1[A] = t2[A]. Since we are given that A → B, it
follows from the definition of functional dependency that t1[B] = t2[B]. Then, since we are
given that B → H, it follows from the definition of functional dependency that t1[H] = t2[H].

• Therefore, we have shown that, whenever t1 and t2 are tuples such that t1[A] = t2[A], it must
be that t1[H] = t2[H]. But that is exactly the definition of A → H.
Database Management Systems
Closure

Let F be a set of functional dependencies. The closure of F, denoted by F+, is the set of all
functional dependencies logically implied by F. Given F, we can compute F+ directly from
the formal definition of functional dependency.

NOTE:
Since a set of size n has 2^n subsets, there are a
total of 2^n × 2^n = 2^(2n)
possible functional dependencies, where n is the
number of attributes in R
Database Management Systems
Closure

• Example:
F = {Ssn → {Ename, Bdate, Address, Dnumber}, Dnumber → {Dname, Dmgr_ssn} }

• Some of the additional functional dependencies that we can infer from F are the following:

1. Ssn → {Dname, Dmgr_ssn}


2. Ssn → Ssn
3. Dnumber → Dname

• The closure F+ of F is the set of all functional dependencies that can be inferred from F.

• To determine a systematic way to infer dependencies, we need to use all inference rules
learnt in previous lecture.
Database Management Systems
Closure

■ We can compute F+, the closure of F, by repeatedly applying Armstrong’s
axioms:
1. Reflexive rule: if β ⊆ α, then α → β
2. Augmentation rule: if α → β, then γ α → γ β
3. Transitivity rule: if α → β, and β → γ, then α → γ

■ These rules are


1. Sound -- generate only functional dependencies that actually hold,
2. Complete -- generate all functional dependencies that hold.

❏ Formally, a functional dependency X → Y is trivial if X ⊇ Y; otherwise, it is
nontrivial
Database Management Systems
Example of closure
▪ R = (A, B, C, G, H, I)
F={A→B
A→C
CG → H
CG → I
B → H}
▪ Some members of F+
• A→H
▪ by transitivity from A → B and B → H
• AG → I
▪ by augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
• CG → HI
▪ by augmenting CG → I with CG to infer CG → CGI,
augmenting CG → H with I to infer CGI → HI,
and then applying transitivity
Database Management Systems
Closure of Attribute Sets

• We say that an attribute B is functionally determined by α if α → B. To test whether a
set α is a superkey, we must devise an algorithm for computing the set of attributes
functionally determined by α.

• One way of doing this is to compute F+, take all functional dependencies with α as the
left-hand side, and take the union of the right-hand sides of all such dependencies.

• Let α be a set of attributes. We call the set of all attributes functionally determined by α
under a set F of functional dependencies the closure of α under F; we denote it by α+

• Definition. For each such set of attributes X, we determine the set X+ of attributes that
are functionally determined by X based on F; X+ is called the closure of X under F.
Database Management Systems
Algorithm to determine Closure of X under F

■ Algorithm 15.1. Determining X+, the Closure of X under F
■ Input: A set F of FDs on a relation schema R, and a set of attributes X, which is
a subset of R.
X+ := X;
repeat
    oldX+ := X+;
    for each functional dependency Y → Z in F do
        if X+ ⊇ Y then X+ := X+ ∪ Z;
until (X+ = oldX+);
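The pseudocode above translates almost line by line into Python. This is a minimal sketch, assuming attributes are single-letter names so that a string such as "BC" can stand for the set {B, C}; the function name `attribute_closure` and the small FD set are chosen for illustration.

```python
def attribute_closure(x, fds):
    """Compute X+ under F, following the repeat/until loop of the algorithm."""
    closure = set(x)                      # X+ := X
    while True:
        old = set(closure)                # oldX+ := X+
        for lhs, rhs in fds:              # for each FD Y -> Z in F
            if set(lhs) <= closure:       # if X+ ⊇ Y
                closure |= set(rhs)       #     X+ := X+ ∪ Z
        if closure == old:                # until (X+ = oldX+)
            return closure


# Illustrative FD set over single-letter attributes:
# A -> BC, BC -> DE, D -> F, CF -> G
fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

closure_d = attribute_closure("D", fds)
closure_bc = attribute_closure("BC", fds)
closure_a = attribute_closure("A", fds)
print(sorted(closure_d), sorted(closure_bc), sorted(closure_a))
```

A set X is a superkey of R exactly when its closure contains every attribute of R, so in this example A is a superkey of R(A, B, C, D, E, F, G) while D and BC are not.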
Database Management Systems
Closure of Attribute Sets

For example, consider the following relation schema about classes held at a university in a
given academic year.
CLASS ( Classid, Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity).

Let F, the set of functional dependencies for the above relation include the following f.d.s:
FD1: Classid → Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity;
FD2: Course# → Credit_hrs;
FD3: {Course#, Instr_name} → Text, Classroom;
FD4: Text → Publisher
FD5: Classroom → Capacity

Note that the above FDs express certain semantics about the data in the relation CLASS.
Find the closure for 1. Classid 2. Course# 3. Course#, Instr_name
Database Management Systems
Closure of Attribute Sets

For example, FD1 states that each class has a unique Classid.

FD3 states that when a given course is offered by a certain instructor, the text is fixed and the
instructor teaches that class in a fixed room.
Using the inference rules about the FDs and applying the definition of closure,
we can define the following closures:


{ Classid } + = { Classid , Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom,
Capacity } = CLASS
{ Course#} + = { Course#, Credit_hrs}
{ Course#, Instr_name } + = { Course#, Instr_name ,Credit_hrs, Text, Publisher, Classroom,
Capacity }
Database Management Systems
Closure of Attribute Sets

• Note that each closure above has an interpretation that is revealing about the attribute(s)
on the left-hand side.

• For example, the closure of Course# has only Credit_hrs besides itself. It does not
include Instr_name because different instructors could teach the same course; it does not
include Text because different instructors may use different texts for the same course.

• Note also that the closure of {Course#, Instr_name} does not include Classid, which
implies that it is not a candidate key.

• This further implies that a course with given Course# could be offered by different
instructors, which would make the courses distinct classes.
Database Management Systems
Closure of Attribute Sets

Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-

A → BC
BC → DE
D→F
CF → G

Now, let us find the closure of D and BC


Closure of attribute D-

D+ = { D }
= { D , F } ( Using D → F )
We cannot determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Database Management Systems
Closure of Attribute Sets

Closure of attribute set {B, C}-

{ B , C }+ = { B , C }
={B,C,D,E} ( Using BC → DE )
={B,C,D,E,F} ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Database Management Systems
Closure of Attribute Sets

Problem-

Consider the given functional dependencies-


AB → CD
AF → D
DE → F
C→G
F→E
G→A

Which of the following options is false?


(A) { CF }+ = { A , C , D , E , F , G }
(B) { BG }+ = { A , B , C , D , G }
(C) { AF }+ = { A , C , D , E , F , G }
(D) { AB }+ = { A , C , D , F ,G }
Database Management Systems
Closure of Attribute Sets

Solution-

Let us check each option one by one-

Option-(A):

{ CF }+ = { C , F }
= { C , F , G } ( Using C → G )
= { C , E , F , G } ( Using F → E )
= { A , C , E , F , G } ( Using G → A )
= { A , C , D , E , F , G } ( Using AF → D )

Since the result we obtained is the same as the given set, option (A) is correctly given.
Database Management Systems
Closure of Attribute Sets

Option-(B):

{ BG }+ = { B , G }
={A,B,G} ( Using G → A )
={A,B,C,D,G} ( Using AB → CD )

Since the result we obtained is the same as the given set, option (B) is correctly given.
Database Management Systems
Closure of Attribute Sets

Option-(C):

{ AF }+ = { A , F }
={A,D,F} ( Using AF → D )
={A,D,E,F} ( Using F → E )

Since the result we obtained is different from the given set, option (C) is not correctly given.
Database Management Systems
Closure of Attribute Sets

Option-(D):

{ AB }+ = { A , B }
={A,B,C,D} ( Using AB → CD )
={A,B,C,D,G} ( Using C → G )

Since the result we obtained is different from the given set, option (D) is not correctly
given.
Thus,
Options (C) and (D) are the false statements.
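The option-by-option check can also be automated by comparing each claimed closure with a computed one. The sketch below is illustrative: the helper `closure` and the `options` dictionary are written just for this problem, with single-letter attributes encoded as strings.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FDs from the problem statement.
fds = [("AB", "CD"), ("AF", "D"), ("DE", "F"), ("C", "G"), ("F", "E"), ("G", "A")]

# Each option: (attribute set, closure claimed in the question).
options = {
    "A": ("CF", "ACDEFG"),
    "B": ("BG", "ABCDG"),
    "C": ("AF", "ACDEFG"),
    "D": ("AB", "ACDFG"),
}

false_options = [
    label
    for label, (attrs, claimed) in options.items()
    if closure(attrs, fds) != set(claimed)
]
print(false_options)
```

The computed closures for {AF} and {AB} are {A, D, E, F} and {A, B, C, D, G}, which is why those two options are flagged.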
Database Management Systems
Equivalence

Database Management Systems
Equivalence of Sets of FDs

■ A set of functional dependencies F is said to cover another set of functional
dependencies E if every FD in E is also in F+; that is, if every dependency in E can be
inferred from F; alternatively, we can say that E is covered by F.
■ Two sets of functional dependencies E and F are equivalent if E+ = F+.
■ Equivalence means that every FD in E can be inferred from F, and every FD in F can
be inferred from E; that is, E is equivalent to F if both the conditions—E covers F and
F covers E—hold.
■ We can determine whether F covers E by calculating X+ with respect to F for each FD
X → Y in E, and then checking whether this X+ includes the attributes in Y.

➔If this is the case for every FD in E, then F covers E.
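The cover test described above needs only attribute closures, not the full closures E+ and F+. A minimal sketch follows; the function names `closure`, `covers`, and `equivalent` are hypothetical, attributes are single letters encoded as strings, and the third set `H` is invented to show a non-equivalent case.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, e):
    """F covers E if, for every X -> Y in E, Y is contained in X+ computed under F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in e)


def equivalent(e, f):
    """E and F are equivalent if each covers the other."""
    return covers(e, f) and covers(f, e)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
H = [("A", "C")]  # invented set: implied by F, but too weak to imply F

f_equiv_g = equivalent(F, G)
f_covers_h = covers(F, H)
h_covers_f = covers(H, F)
print(f_equiv_g, f_covers_h, h_covers_f)
```

Here F and G cover each other, while H is covered by F but does not cover it, since AC → D cannot be inferred from A → C alone.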


Database Management Systems
Equivalence of Sets of FDs

Equivalence of Two Sets of Functional Dependencies-
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then the following 3 cases are
possible-
Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)

Show that the following two sets of FDs are equivalent:
F = {A → C, AC → D, E → AD, E → H} and
G = {A → CD, E → AH}
Which of the following holds true?
(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above
(solution is discussed after algorithm steps)
Database Management Systems
Equivalence of Sets of FDs

Case-01: Determining Whether F Covers G-


Following steps are followed to determine whether F covers G or not-

Step-01:
• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.

Step-02:
• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.

Step-03:
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set F determine all the attributes that were determined by the
functional dependencies of set G, then F covers G.
• In that case, we conclude that F covers G (F ⊇ G); otherwise, it does not.
Database Management Systems
Equivalence of Sets of FDs

Case-02: Determining Whether G Covers F-


Following steps are followed to determine whether G covers F or not-

Step-01:
• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.

Step-02:
• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.

Step-03:
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set G determine all the attributes that were determined by the
functional dependencies of set F, then G covers F.
• In that case, we conclude that G covers F (G ⊇ F); otherwise, it does not.
Database Management Systems
Equivalence of Sets of FDs

Case-03: Determining Whether Both F and G Cover Each Other-

• If F covers G and G covers F, then both F and G cover each other.


• Thus, if both the above cases hold true, we conclude both F and G cover each other (F
= G).
Database Management Systems
Equivalence of Sets of FDs :Solution

Determining whether F covers G-


Step-01:
• (A)+ = { A , C , D } // closure of left side of A → CD using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set G

Step-02:
• (A)+ = { A , C , D } // closure of left side of A → CD using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set F

Step-03:
Comparing the results of Step-01 and Step-02, we find Functional dependencies of set F can determine
all the attributes which have been determined by the functional dependencies of set G.
• Thus, we conclude F covers G i.e. F ⊇ G.
Database Management Systems
Equivalence of Sets of FDs : Solution

Determining whether G covers F-


Step-01:
• (A)+ = { A , C , D } // closure of left side of A → C using set F
• (AC)+ = { A , C , D } // closure of left side of AC → D using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set F

Step-02:
• (A)+ = { A , C , D } // closure of left side of A → C using set G
• (AC)+ = { A , C , D } // closure of left side of AC → D using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G

Step-03:
Comparing the results of Step-01 and Step-02, we find-
• Functional dependencies of set G can determine all the attributes which have been determined by the
functional dependencies of set F.
• Thus, we conclude G covers F i.e. G ⊇ F.
Database Management Systems
Equivalence of Sets of FDs

Determining whether both F and G cover each other-

• From the first check, we conclude F covers G.
• From the second check, we conclude G covers F.
• Thus, we conclude both F and G cover each other i.e. F = G.

Thus, Option (D) is correct.
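
The three cases above can also be checked mechanically. The sketch below is one possible way to do it, not the slides' own procedure: instead of comparing two closures, it uses the equivalent test that F covers G when, for every X → Y in G, Y lies inside the closure of X computed with F. The sets F and G are an assumption reconstructed from the closure annotations in the solution (A → C, AC → D, E → AD, E → H and A → CD, E → AH); the function names are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its whole left side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD X -> Y in g follows from f (Y is inside X+ under f)."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


# FDs as (lhs, rhs) pairs; each string is a set of single-letter attributes.
# These two sets are reconstructed from the solution's annotations (assumption).
F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(sorted(closure("E", F)))                        # ['A', 'C', 'D', 'E', 'H']
print(covers(F, G), covers(G, F), equivalent(F, G))   # True True True
```

Representing each FD as a pair of strings keeps the example short, but it only works while attribute names are single characters.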


Database Management Systems
Examples of equivalence

Q.1 Let us take an example to show the relationship between two FD sets. A relation R(A,B,C,D)
has two FD sets FD1 = {A->B, B->C, AB->D} and
FD2 = {A->B, B->C, A->C, A->D}

Step 1: Checking whether all FDs of FD1 are present in FD2


• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• AB->D is present in set FD1 but not directly in FD2 but we will check whether
we can derive it or not. For set FD2, (AB)+ = {A, B, C, D}. It means that AB can
functionally determine A, B, C, and D. So AB->D will also hold in set FD2.

As all FDs in set FD1 also hold in set FD2, FD2 ⊇ FD1 is true (FD2 covers FD1).
Database Management Systems
Examples of equivalence

Step 2: Checking whether all FDs of FD2 are present in FD1


• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->C is present in FD2 but not directly in FD1, so we check whether it can be derived.
For set FD1, (A)+ = {A, B, C, D}. It means that A can functionally determine A, B, C,
and D. So A->C also holds in set FD1.
• A->D is present in FD2 but not directly in FD1, so we check whether it can be derived.
For set FD1, (A)+ = {A, B, C, D}. It means that A can functionally determine A, B, C,
and D. So A->D also holds in set FD1.
As all FDs in set FD2 also hold in set FD1, FD1 ⊇ FD2 is true (FD1 covers FD2).

Step 3: As FD2 ⊇ FD1 and FD1 ⊇ FD2 are both true, FD2 = FD1 is true. These two FD sets
are semantically equivalent.
Database Management Systems
Examples of equivalence

Q.2 Let us take another example to show the relationship between two FD sets. A
relation R2(A,B,C,D) having two FD sets FD1 = {A->B, B->C,A->C} and FD2 = {A->B,
B->C, A->D}

Step 1: Checking whether all FDs of FD1 are present in FD2


• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• A->C is present in FD1 but not directly in FD2, so we check whether it can be
derived. For set FD2, (A)+ = {A, B, C, D}. It means that A can
functionally determine A, B, C, and D. So A->C also holds in set FD2.

As all FDs in set FD1 also hold in set FD2, FD2 ⊇ FD1 is true (FD2 covers FD1).
Database Management Systems
Examples of equivalence

Step 2: Checking whether all FDs of FD2 are present in FD1


• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->D is present in FD2 but not directly in FD1, so we check whether it can be
derived. For set FD1, (A)+ = {A, B, C}. It means that A cannot
functionally determine D.
• So A->D does not hold in FD1.
As not all FDs in set FD2 hold in set FD1, FD1 does not cover FD2 (FD1 ⊉ FD2).

Step 3: In this case FD2 ⊇ FD1 but FD1 ⊉ FD2, so these two FD sets are not semantically
equivalent.
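
As a cross-check of Q.1 and Q.2, the following sketch lists the FDs of one set that cannot be derived from the other. It is a minimal illustration under the assumption that attributes are single letters; the helper name `missing` is hypothetical and not part of the slides.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, where each FD is an (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def missing(f, g):
    """Return the FDs of g that do NOT follow from f (empty list: f covers g)."""
    return [(lhs, rhs) for lhs, rhs in g if not set(rhs) <= closure(lhs, f)]


# Q.1: the two sets cover each other.
q1_fd1 = [("A", "B"), ("B", "C"), ("AB", "D")]
q1_fd2 = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
print(missing(q1_fd1, q1_fd2), missing(q1_fd2, q1_fd1))   # [] []

# Q.2: FD2 covers FD1, but FD1 cannot derive A -> D.
q2_fd1 = [("A", "B"), ("B", "C"), ("A", "C")]
q2_fd2 = [("A", "B"), ("B", "C"), ("A", "D")]
print(missing(q2_fd2, q2_fd1))   # []
print(missing(q2_fd1, q2_fd2))   # [('A', 'D')]
```

Returning the offending FDs, rather than only True or False, shows directly which dependency breaks the equivalence.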
Database Management Systems
Exercise questions

15.4. When are two sets of functional dependencies equivalent? How can we determine their
equivalence?

Answer: Two sets of functional dependencies F and G are considered equivalent if they imply the same
set of functional dependencies. In other words, the closure of F (F+) must be the same as the closure
of G (G+).

To determine their equivalence:
• Compute the closure F+ of F and the closure G+ of G.
• If F+ = G+, the two sets are equivalent.
Database Management Systems
Dependency Preservation

Prof. Nivedita Kasturi


Department of Computer Science and Engineering
Database Management Systems
Dependency preservation

• It would be useful if each functional dependency X → Y specified in F either appeared
directly in one of the relation schemas Ri in the decomposition D or could be inferred from
the dependencies that appear in some Ri.
• Informally, this is the dependency preservation condition. We want to preserve the
dependencies because each dependency in F represents a constraint on the database.
• If one of the dependencies is not represented in some individual relation Ri of the
decomposition, we cannot enforce this constraint by dealing with an individual relation.
• We may have to join multiple relations so as to include all attributes involved in that
dependency.
Database Management Systems
Dependency preservation

• It is not necessary that the exact dependencies specified in F appear themselves in
individual relations of the decomposition D.
• It is sufficient that the union of the dependencies that hold on the individual
relations in D be equivalent to F.

Definition. Given a set of dependencies F on R, the projection of F on Ri, denoted by πRi(F)
where Ri is a subset of R, is the set of dependencies X → Y in F+ such that the
attributes in X ∪ Y are all contained in Ri. Hence, the projection of F on each relation
schema Ri in the decomposition D is the set of functional dependencies in F+, the closure
of F, such that all the left- and right-hand-side attributes of those dependencies are in Ri.
We say that a decomposition D = {R1, R2, … , Rm} of R is dependency-preserving with
respect to F if the union of the projections of F on each Ri in D is equivalent to F; that is,
((πR1(F)) ∪ … ∪ (πRm(F)))+ = F+.
Database Management Systems
Dependency preservation

• If a decomposition is not dependency-preserving, some dependency is lost in the
decomposition.

• To check that a lost dependency holds, we must take the JOIN of two or more
relations in the decomposition to get a relation that includes all left- and right-hand-
side attributes of the lost dependency, and then check that the dependency holds on the
result of the JOIN—an option that is not practical.
• An example of a decomposition that does not preserve dependencies is shown in
Figure 14.13(a), in which the functional dependency FD2 is lost when LOTS1A is
decomposed into {LOTS1AX, LOTS1AY}. The decompositions in Figure 14.12,
however, are dependency-preserving.
Database Management Systems
Dependency preservation

Figure 14.13

Figure 14.12
Database Management Systems
Dependency preservation

▪ Testing functional dependency constraints each time the database is updated can be
costly
▪ It is useful to design the database in a way that constraints can be tested efficiently.
▪ If testing a functional dependency can be done by considering just one relation, then the
cost of testing this constraint is low
▪ When decomposing a relation, it may no longer be possible to test a dependency
without first computing a join (or Cartesian product) of the decomposed relations.
▪ A decomposition that makes it computationally hard to enforce a functional dependency is
said to be NOT dependency preserving.
● Let F′ be the union of the projections of F on the relations of the decomposition. We say
that a decomposition having the property F′+ = F+ is a dependency-preserving decomposition.
Database Management Systems
Dependency preservation example

Consider a schema:
dept_advisor(s_ID, i_ID, dept_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
In the above design we are forced to repeat the department name once for each time an instructor
participates in a dept_advisor relationship.
To fix this, we need to decompose dept_advisor.
No relation in a decomposition of dept_advisor contains all three attributes s_ID,
dept_name, and i_ID, so the following FD cannot be checked on any single relation and is not preserved:
s_ID, dept_name → i_ID
Thus, the decomposition is NOT dependency preserving.
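
The dept_advisor example can be checked with a short sketch. It uses a well-known polynomial-time test that avoids computing the projections explicitly: starting from the left-hand side of an FD, it repeatedly adds whatever can be reached through the attributes of a single decomposed relation. This test is a substitute technique, not one shown in the slides, and the decomposition {(s_ID, i_ID), (i_ID, dept_name)} is an assumption chosen for illustration.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs`; each FD is a (set, set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_preserved(fd, decomposition, fds):
    """Check whether `fd` follows from the FDs projected on the relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes reachable using only what is visible inside ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


fds = [
    ({"i_ID"}, {"dept_name"}),
    ({"s_ID", "dept_name"}, {"i_ID"}),
]
# Hypothetical decomposition of dept_advisor (assumption, not from the slides).
decomposition = [{"s_ID", "i_ID"}, {"i_ID", "dept_name"}]

for fd in fds:
    print(sorted(fd[0]), "->", sorted(fd[1]), is_preserved(fd, decomposition, fds))
# ['i_ID'] -> ['dept_name'] True
# ['dept_name', 's_ID'] -> ['i_ID'] False
```

A decomposition is dependency preserving only if the check returns True for every FD in F.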
Database Management Systems
Dependency preservation exercise

1. What is the dependency preservation property for a decomposition? Why is it important?


2. Between the properties of dependency preservation and losslessness, which one must definitely be
satisfied? Why?
Database Management Systems
Dependency preservation exercise answers

1. What is the dependency preservation property for a decomposition? Why is it important?


A: Dependency preservation is a property of a database decomposition where the functional
dependencies of the original relation are preserved in the decomposed relations.
Importance of Dependency Preservation:
● Maintains Data Integrity
● Simplifies Enforcement of Constraints
● Reduces Redundancy

2. Between the properties of dependency preservation and losslessness, which one must definitely be
satisfied? Why?
A: Losslessness must definitely be satisfied because it ensures that no information is lost during
decomposition. This guarantees that the original relation can be perfectly reconstructed from the
decomposed relations, maintaining data integrity. Dependency preservation is important but secondary
to losslessness.
Database Management Systems
Minimal Cover

Prof. Shilpa S
Department of Computer Science and Engineering
Database Management Systems
Unit 3

❏ Minimal cover

❏ Algorithm for finding minimal cover of FD

❏ Examples of the minimal cover of FD

❏ Algorithm to determine the key of a relation

❏ Finding candidate keys


Database Management Systems
Finding Minimal Cover of FD’s

■ Just as we applied inference rules to expand a set F of FDs to arrive at F+,
its closure, it is possible to think in the opposite direction and ask whether we can
shrink or reduce the set F to a minimal form that is still
equivalent to the original set F.

■ We will use the concept of an extraneous attribute in an FD to define the
minimum cover.

■ Definition: An attribute in a functional dependency is considered an extraneous
attribute if we can remove it without changing the closure of the set of
dependencies. Formally, given F, the set of functional dependencies, and a
functional dependency X → A in F, attribute Y is extraneous in X if Y is a
subset of X and F logically implies (F − {X → A}) ∪ {(X − Y) → A}.
Database Management Systems
Minimal Sets of FDs

■ A set of FDs is minimal if it satisfies the following conditions:

1. Every dependency in F has a single attribute for its RHS.

2. We cannot replace any dependency X → A in F with a dependency Y → A,
where Y is a proper subset of X, and still have a set of dependencies that is
equivalent to F. (removing the redundancies via extraneous attributes on
the LHS of a dependency)
3. We cannot remove any dependency from F and still have a set of
dependencies that is equivalent to F. (removing the redundancies via having a
dependency that can be inferred from the remaining FDs in F)
Database Management Systems
Minimal Sets of FDs

■ DEFINITION: A minimal cover of a set of functional
dependencies E is a minimal set of dependencies (in the
standard canonical form and without redundancy) that is
equivalent to E.
Database Management Systems
Algorithm: Finding a Minimal Cover F for a Set of Functional Dependencies E

■ Input: A set of functional dependencies E.

1. Set F:= E.

2. Replace each functional dependency X → {A1, A2, ..., An} in F by the n functional
dependencies X →A1, X →A2, ..., X → An.
(*Place FDs in a canonical form, Preparatory step*)

3. For each functional dependency X → A in F
     for each attribute B that is an element of X
       if { {F – {X → A} } ∪ { (X – {B} ) → A} } is equivalent to F
         then replace X → A with (X – {B} ) → A in F.
   (* The above constitutes a removal of the extraneous attribute B from X *)

4. For each remaining functional dependency X → A in F
     if {F – {X → A} } is equivalent to F,
       then remove X → A from F.
   (* The above constitutes a removal of the redundant dependency X → A from F *)
Database Management Systems
Computing the Minimal Sets of FDs: Example 1

Let the set of FDs be E: {B → A, D → A, AB → D}.
Find the minimum cover of E.

❑ Step 1: All the above dependencies are in canonical form (they have only one attribute on the
RHS).

❑ Step 2: We need to determine whether AB → D has any redundant (EXTRANEOUS)
attribute on the left-hand side; that is, can it be replaced by B → D or A → D?
❑ Since B → A, by augmenting with B on both sides (IR2), we have BB → AB, or B → AB
(i). However, AB → D as given (ii).
❑ Hence by the transitive rule (IR3), we get from (i) and (ii), B → D. Hence AB → D may
be replaced by B → D.
❑ We now have a set equivalent to the original E, say E′: {B → A, D → A, B → D}. No further
reduction is possible in step 2 since all FDs have a single attribute on the left-hand
side.
Database Management Systems
Computing the Minimal Sets of FDs: Example 1

■ In step 3 we look for a redundant FD in E′. By using the transitive rule on
B → D and D → A, we derive B → A. Hence B → A is redundant in E′ and
can be eliminated.
■ Hence the minimum cover of E is F: {B → D, D → A}.

Every FD in E can be inferred from F and every FD in F can be inferred from E; in other
words, the two sets F and E are equivalent.
Database Management Systems
Computing the Minimal Sets of FDs: Example 2

Let the given set of FDs be G: {A → BCDE, CD → E}. We have to find the
minimum cover of G.

■ Step 1: The dependencies are not all in canonical form (A → BCDE has more than one
attribute on the RHS), so we convert them into:
F: { A -> B, A -> C, A -> D, A -> E, CD -> E}

■ Step 2: For CD -> E, neither C nor D is extraneous on the LHS, since we cannot
derive C -> E or D -> E from the given FDs. Hence we cannot replace it
with either.

■ Step 3: Is any FD redundant? Since A -> CD and CD -> E, by the transitive
rule (IR3) we get A -> E. Thus A -> E is redundant in F and can be removed.

Hence the minimum cover of G is F: { A -> BCD, CD -> E}
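
The minimal cover algorithm and the two examples above can be reproduced with a short sketch. It follows the three stages of the algorithm (canonical form, extraneous attributes, redundant FDs) and tests implication through attribute closure. It assumes single-letter attribute names, and the function names are illustrative rather than taken from the slides.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is an (lhs, rhs) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Canonical form: one attribute on every right-hand side.
    f = list(dict.fromkeys((frozenset(lhs), a) for lhs, rhs in fds for a in rhs))

    # Remove extraneous attributes: B is extraneous in X -> A
    # when A is already in the closure of X - {B} under the current set.
    for i in range(len(f)):
        lhs, a = f[i]
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and a in closure(reduced, f):
                lhs = reduced
                f[i] = (lhs, a)
    f = list(dict.fromkeys(f))  # drop duplicates created by the reduction

    # Remove redundant FDs: X -> A is redundant when the rest still implies it.
    i = 0
    while i < len(f):
        lhs, a = f[i]
        rest = f[:i] + f[i + 1:]
        if a in closure(lhs, rest):
            f = rest
        else:
            i += 1
    return f


def show(fds):
    """Format FDs as sorted strings such as 'AB->D' for easy comparison."""
    return sorted("".join(sorted(lhs)) + "->" + a for lhs, a in fds)


print(show(minimal_cover([("B", "A"), ("D", "A"), ("AB", "D")])))  # ['B->D', 'D->A']
print(show(minimal_cover([("A", "BCDE"), ("CD", "E")])))
# ['A->B', 'A->C', 'A->D', 'CD->E']
```

A set of FDs can have several minimal covers; which one this sketch returns depends on the order in which attributes and FDs are examined.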


Database Management Systems
Minimal Cover: Other Method – Example3.

■ Find the minimal cover of the set of functional dependencies given:
{A → C, AB → C, C → DI, CD → I, EC → AB, EI → C}
Answer: minimal cover Fc = {A → C, C → D, C → I, EC → A, EC → B, EI → C}.
Solution: Step 1: Decompose RHS of FD
FD = { A → C,
       AB → C,
       C → D,
       C → I,
       CD → I,
       EC → A,
       EC → B,
       EI → C }
Database Management Systems
Example3.

Step 2: Remove the redundant FD from above set


a. Check if A → C is redundant.
Find A+ = {ACDI} {Including A→C in FD}
A+={ A} {Excluding A→C from FD}
The closure of A changes when A → C is excluded, so A → C cannot be derived from the
other FDs.
A → C is not redundant

b. Check if AB→C is redundant.
AB+={ABCDI} {Including AB→C in FD}
AB+={ABCDI} {Excluding AB→C from FD}
The closure is the same with and without AB→C, so AB→C can be derived from the
remaining FDs. It is a redundant FD and is removed from FD.
AB → C is redundant

Therefore updated FD is FD1 = {A → C,C → D ,C → I ,CD → I, EC → A, EC →B, EI → C}


Database Management Systems
Example3.

Now for the below calculation use FD1

c. Check if C→D is redundant.


Find C+= {CDI} {Including C→D in FD1}
C+={CI} {Excluding C→D from FD1}
C→D is not redundant

d. Check if C→I is redundant.


Find C+ ={CDI} {Including C→I in FD1}
C+={CDI} { Excluding C→I from FD1}
We get the same set of attributes for C+ even when C→I is excluded from FD1. Therefore
C→I can be inferred from the rest of FD1 and can be removed as redundant.
C→I is redundant.

So updated FD is FD2={ A → C,C → D ,CD → I, EC → A, EC →B, EI → C}


Database Management Systems
Example3.

Now for below calculation use FD2

e. Check if EC→A is redundant.


Find EC+= {ECABDI} {Including EC→A in FD2}
EC+={ECBDI} {Excluding EC→A in FD2}
EC→A is Not redundant

f. Check if EC→B is redundant.


Find EC+= {ECBADI} {Including EC→B in FD2}
EC+={ECADI} {Excluding EC→B in FD2}
EC→B is Not redundant

g. Check if CD→I is redundant.


Find CD+={CDI} {Including CD→I in FD2}
CD+={CD} {Excluding CD→I in FD2}
CD→I is not redundant.
Database Management Systems
Example3.

h. Check if EI→C is redundant.


Find EI+={EICDAB} {Including EI→C in FD2}
EI+= {EI} {Excluding EI→C in FD2}
EI→C is not redundant.

Therefore FD is FD2={ A → C,C → D ,CD → I, EC → A, EC →B, EI → C}

Step 3: Check for extraneous attribute

Here consider the FD where we have more than one attribute on LHS
Following FD has more than one attribute on LHS
CD → I, EC → A, EC →B, EI → C
Now we will find extraneous attributes:
Database Management Systems
Example3.
a. CD→I
Find the closures CD+, C+ and D+ using FD2. If the closure of a single attribute already contains the
RHS attribute I, that attribute alone is sufficient on the LHS.
CD+ = {CDI}
C+= {CDI}
D+= {D}
C+ has the same set of attributes as CD+, so attribute C alone can determine I.
Hence D is an extraneous attribute.
Replace CD→I with C→I.

Therefore, updated FD is FD3= {A → C, C → D, C → I, EC → A, EC →B, EI → C}

b. EC→A
Find closure for EC+, E+ and C+ using FD3
EC+={ECABID}
E+={E}
C+={CDI}
In the above closure, we didn’t get closure of a single attribute similar to EC so no extraneous attribute.
Database Management Systems
Example3.

c. EC→B
Find closure of EC+, E+ and C+ using FD3
EC+={ECBAID}
E+={E}
C+={CDI}
In the above closure we didn’t get closure of a single attribute similar to EC so no extraneous attribute

Therefore, updated FD is FD3= {A → C, C → D, C → I, EC → A, EC →B, EI → C}

d. EI→C
Find closure of EI+, E+ and I+ using FD3
EI+={EICDAB}
E+={E}
I+={I}
In the above closure, we didn’t get closure of a single attribute similar to EI so no extraneous attribute

Therefore, minimal set of FD is {A → C, C → D, C → I, EC → A, EC →B, EI → C}
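
The result of Example 3 can be verified independently of the order in which the steps were carried out. The sketch below checks that the final set is equivalent to the original one, that no FD in it is redundant, and that no left-hand side contains an extraneous attribute. It also shows that the intermediate set FD2 is not yet minimal. Attribute names are assumed to be single letters, and `is_minimal` is a hypothetical helper name.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is an (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    return all(implies(f, l, r) for l, r in g) and all(implies(g, l, r) for l, r in f)


def is_minimal(f):
    for i, (lhs, rhs) in enumerate(f):
        if len(rhs) != 1:
            return False                      # not in canonical form
        if implies(f[:i] + f[i + 1:], lhs, rhs):
            return False                      # redundant FD
        for b in lhs:
            reduced = set(lhs) - {b}
            if reduced and implies(f, reduced, rhs):
                return False                  # extraneous attribute b
    return True


original = [("A", "C"), ("AB", "C"), ("C", "DI"), ("CD", "I"), ("EC", "AB"), ("EI", "C")]
fd2 = [("A", "C"), ("C", "D"), ("CD", "I"), ("EC", "A"), ("EC", "B"), ("EI", "C")]
answer = [("A", "C"), ("C", "D"), ("C", "I"), ("EC", "A"), ("EC", "B"), ("EI", "C")]

print(equivalent(original, answer), is_minimal(answer))  # True True
print(equivalent(original, fd2), is_minimal(fd2))        # True False
```

FD2 fails only because D is extraneous in CD → I, which is exactly what Step 3 of the worked solution removes.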


Database Management Systems
Algorithm to determine the key of a relation

■ Algorithm: Finding a Key K for R, given a set F of Functional Dependencies
■ Input: A universal relation R and a set of functional dependencies F on
the attributes of R.
1. Set K := R;
2. For each attribute A in K
   { Compute (K - {A})+ with respect to F;
     If (K - {A})+ contains all the attributes in R,
     then set K := K - {A} };
In this algorithm, we start by setting K to all the attributes of R; R itself is
always a default super key. We then remove one attribute at a time and check whether
the remaining attributes still form a super key.
This algorithm determines only one key out of the possible candidate keys for R; the
key returned depends on the order in which attributes are removed from R in step 2.
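
The key-finding algorithm above can be sketched as follows. The first call reuses the relation R(A,B,C,D) with FDs {A → B, B → C, AB → D} from the equivalence examples. The second relation, with two candidate keys, is an assumption added to show that the attribute order changes the key that is returned. Attribute names are assumed to be single letters.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is an (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(attributes, fds):
    """Return one key of the relation; `attributes` fixes the removal order."""
    key = list(attributes)            # step 1: K := R
    for a in list(attributes):        # step 2: try to drop each attribute
        candidate = [x for x in key if x != a]
        if closure(candidate, fds) >= set(attributes):
            key = candidate           # still a super key without a
    return key


print(find_key("ABCD", [("A", "B"), ("B", "C"), ("AB", "D")]))   # ['A']

# Hypothetical relation with two candidate keys, A and B.
two_keys = [("A", "B"), ("B", "A"), ("A", "C")]
print(find_key("ABC", two_keys))   # ['B']
print(find_key("CBA", two_keys))   # ['A']
```

The last two calls return different keys for the same relation, which illustrates why this algorithm finds only one key and not all candidate keys.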
Database Management Systems
Finding Candidate Keys

We can determine the candidate keys of a given relation using the following steps-

Step-01: Determine all essential attributes of the given relation.


•Essential attributes are those not present on RHS of any functional dependency.
•Essential attributes are always a part of every candidate key.
•This is because other attributes can not determine them.

Step-02: The remaining attributes of the relation are non-essential attributes. This
is because they can be determined by using essential attributes.

Now, the following two cases are possible-


Database Management Systems
Finding Candidate Keys

Case-01: If all essential attributes together can determine all remaining non-essential
attributes, then-
•The combination of essential attributes is the candidate key.
•It is the only possible candidate key.

Case-02: If all essential attributes together can not determine all remaining non-
essential attributes, then-
•The set of essential attributes and some non-essential attributes will be the candidate
key(s).
•In this case, multiple candidate keys are possible.
•To find the candidate keys, we check different combinations of essential and non-
essential attributes.
Database Management Systems
Practice Problem: Finding Candidate Keys

Problem-01: Let R = (A, B, C, D, E, F) be a relation scheme with the following dependencies-


C→F
E→A
EC → D
A→B

Which of the following is a key for R?

(A) CD
(B) EC
(C) AE
(D) AC

Also, determine the total number of candidate keys and super keys.
Database Management Systems
Practice Problem: Finding Candidate Keys

Solution- We will find candidate keys of the given relation in the following steps-

Step-01: Determine all essential attributes of the given relation.


Essential attributes of the relation are- C and E. So, attributes C and E will definitely be
a part of every candidate key.

Step-02: Now, We will check if the essential attributes together can determine all
remaining non-essential attributes. To check, we find the closure of CE.
So, we have-
{ CE }+
={C,E}
= { C , E , F } ( Using C → F )
= { A , C , E , F } ( Using E → A )
= { A , C , D , E , F } ( Using EC → D )
= { A , B , C , D , E , F } ( Using A → B )
Database Management Systems
Practice Problem: Finding Candidate Keys

We conclude that CE can determine all the attributes of the given relation.
So, CE is the only possible candidate key of the relation.

Thus, Option (B) is correct.

Total Number of Candidate Keys- Only one candidate key CE is possible.


Database Management Systems
Practice Problem: Finding Candidate Keys
Total Number of Super Keys- There are total 6 attributes in the given relation of which-
•There are 2 essential attributes- C and E.
•Remaining 4 attributes are non-essential attributes.
•Essential attributes will be definitely present in every key.
•Non-essential attributes may or may not be taken in every super key.

So, number of super keys possible = 2 x 2 x 2 x 2 = 16.


Thus, total number of super keys possible = 16.
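
The counts in Problem-01 can be confirmed by brute force. The sketch below first finds the essential attributes (those not on any right-hand side), then adds non-essential attributes in increasing numbers until the closure covers the relation, and finally counts super keys by testing every non-empty attribute subset. This exhaustive approach is only an illustration for small relations, and the function names are not from the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is an (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    attrs = set(attributes)
    on_rhs = set().union(*(set(rhs) for _, rhs in fds))
    essential = attrs - on_rhs                 # never determined by anything
    others = sorted(attrs - essential)
    keys = []
    for size in range(len(others) + 1):        # smallest combinations first
        for extra in combinations(others, size):
            cand = essential | set(extra)
            if any(k <= cand for k in keys):
                continue                       # contains a smaller key: not minimal
            if closure(cand, fds) == attrs:
                keys.append(cand)
    return essential, keys


def count_super_keys(attributes, fds):
    attrs = sorted(set(attributes))
    return sum(
        1
        for size in range(1, len(attrs) + 1)
        for combo in combinations(attrs, size)
        if closure(combo, fds) == set(attrs)
    )


fds = [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")]
essential, keys = candidate_keys("ABCDEF", fds)
print(sorted(essential), [sorted(k) for k in keys])   # ['C', 'E'] [['C', 'E']]
print(count_super_keys("ABCDEF", fds))                # 16
```

The count of 16 matches 2 x 2 x 2 x 2: each of the four non-essential attributes is either included or left out.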
Database Management Systems
Practice Problem: Finding Candidate Keys
Practice Problem-02:
Let R = (A, B, C, D, E) be a relation scheme with the following dependencies-
AB → C
C→D
B→E
Determine the total number of candidate keys and super keys.

Practice Problem-03: Consider the relation scheme R(E, F, G, H, I, J, K, L, M, N) and
the set of functional dependencies-
{ E, F } → { G }
{ F } → { I, J }
{ E, H } → { K, L }
{ K } → { M }
{ L } → { N }

Determine the total number of candidate keys and super keys.


Database Management Systems
Relational Decomposition (Matrix Method)

Prof. Shilpa S
Department of Computer Science and Engineering
Database Management Systems
Unit 3: Database Design

Properties of Relational Decomposition (matrix method)


Database Management Systems
Designing A Set Of Relations

■ The Approach of Relational Synthesis (Bottom-up Design):

■ Assumes that all possible functional dependencies are known.


■ First constructs a minimal set of FDs
■ Then applies algorithms that construct a target set of 3NF or BCNF relations.
■ Additional criteria may be needed to ensure the set of relations in a relational
database are satisfactory (see Algorithm 15.3 from Text book 1).
Database Management Systems
Designing A Set Of Relations
■ Goals:
  ■ Lossless join property (a must)
    ■ Algorithm to test for general losslessness.
  ■ Dependency preservation property
    ■ Observe as much as possible
    ■ Algorithm related to decomposition of a relation into BCNF components by
      sacrificing the dependency preservation.
  ■ Additional normal forms
    ■ 4NF (based on multi-valued dependencies)
    ■ 5NF (based on join dependencies)


Database Management Systems
Properties of Relational Decompositions

1) Relation Decomposition and Insufficiency of Normal Forms:


■ Universal Relation Schema:
■ A relation schema R = {A1, A2, …, An} that includes all the attributes of the database.

■ Universal relation assumption:


■ Every attribute name is unique.

■ Decomposition:
■ The process of decomposing the universal relation schema R into a set of relation schemas
D = {R1,R2, …, Rm} that will become the relational database schema by using the functional
dependencies.
■ Attribute preservation condition:
▪ Each attribute in R will appear in at least one relation schema Ri in the decomposition so that
no attributes are “lost”.
■ Another goal of decomposition is to have each individual relation Ri in the decomposition D be
in BCNF or 3NF.

■ Additional properties of decomposition are needed to prevent the generation of spurious tuples.
Database Management Systems
Properties of Relational Decompositions (Cont.)

2) Dependency Preservation Property of a Decomposition:


• Given a set of dependencies F on R, the projection of F on Ri, denoted by πRi(F) where Ri is a subset of R,
is the set of dependencies X → Y in F+ such that the attributes in X ∪ Y are all contained in Ri.
• Hence, the projection of F on each relation schema Ri in the decomposition D is the set of functional
dependencies in F+, the closure of F, such that all their left- and right-hand-side attributes are in Ri.

■ Dependency Preservation Property:
■ A decomposition D = {R1, R2, ..., Rm} of R is dependency-preserving with respect to F if the union
of the projections of F on each Ri in D is equivalent to F; that is ((πR1(F)) ∪ ... ∪ (πRm(F)))+ = F+
(See examples in Fig 14.13a and Fig 14.12)

■ Claim 1: It is always possible to find a dependency-preserving decomposition D with respect to F such
that each relation Ri in D is in 3NF.
Database Management Systems
Properties of Relational Decompositions (Cont.)

3) Non-additive (Lossless) Join Property of a Decomposition:


■ Definition: Lossless join property: a decomposition D = {R1, R2, ..., Rm} of R has the lossless
(non-additive) join property with respect to the set of dependencies F on R if, for every relation
state r of R that satisfies F, the following holds, where * is the natural join of all the relations in
D: * (π R1(r), ..., π Rm(r)) = r

■ Note: The word loss in lossless refers to loss of information, not to loss of tuples. In fact, for
“loss of information” a better term is “addition of spurious information”.
■ The term non-additive join means that no spurious tuples result after the application of PROJECT and
JOIN operations.
Database Management Systems
Properties of Relational Decompositions (Cont.)

Lossless (Non-additive) Join Property of a Decomposition :

■ Algorithm: Testing for Lossless Join Property


■ Input: A universal relation R, a decomposition D = {R1, R2, ..., Rm} of R, and a set F
of functional dependencies.
1. Create an initial matrix S with one row i for each relation Ri in D, and one column j for
each attribute Aj in R.
2. Set S(i,j):=bij for all matrix entries. (* each bij is a distinct symbol associated with indices
(i,j) *).
3. For each row i representing relation schema Ri
{for each column j representing attribute Aj
{if (relation Ri includes attribute Aj) then set S(i,j):= aj;};};
■ (* each aj is a distinct symbol associated with index (j) *)
Database Management Systems
Properties of Relational Decompositions (Cont.)

Lossless (Non-additive) Join Property of a Decomposition (cont.):

Algorithm: Testing for Lossless Join Property (continued)

4. Repeat the following loop until a complete loop execution results in no changes to S
{for each functional dependency X →Y in F
{for all rows in S which have the same symbols in the columns corresponding to
attributes in X
{make the symbols in each column that correspond to an attribute in Y be the
same in all these rows as follows:
If any of the rows has an “a” symbol for the column, set the other rows
to that same “a” symbol in the column.
If no “a” symbol exists for the attribute in any of the rows, choose one of
the “b” symbols that appear in one of the rows for the attribute and set the other rows to that
same “b” symbol in the column ;};
};
};
5. If a row is made up entirely of “a” symbols, then the decomposition has the lossless join
property; otherwise it does not.
Database Management Systems
Example-1

Figure: Nonadditive join test for n-ary decompositions.


(a) Case 1: Decomposition of EMP_PROJ into EMP_PROJ1 and EMP_LOCS fails test.
(b) A decomposition of EMP_PROJ that has the lossless join property.
Database Management Systems
Properties of Relational Decompositions (Cont.)

Nonadditive join test for n-ary decompositions. (above Figure)


(c) Case 2: Decomposition of EMP_PROJ into EMP, PROJECT, and WORKS_ON satisfies test.
Database Management Systems
Example-2

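R(S,A,I,P)
F = {S -> A, SI -> P}
Decomposition (as read from the a-symbols in the matrix): R1(S,A), R2(S,I,P)

Initial matrix:
     S    A    I    P
R1   a1   a2   b13  b14
R2   a1   b22  a3   a4

After applying S -> A:
     S    A    I    P
R1   a1   a2   b13  b14
R2   a1   a2   a3   a4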

Lossless (Row R2 has all a’s)


Database Management Systems
Example-3

R(A,B,C,D,E)
F = {A -> C, B -> C, C -> D, DE -> C, CE -> A}
R1(ADC), R2(AB), R3(BE), R4(CDE), R5(AE)

Initial matrix:
     A    B    C    D    E
R1   a1   b12  a3   a4   b15
R2   a1   a2   b23  b24  b25
R3   b31  a2   b33  b34  a5
R4   b41  b42  a3   a4   a5
R5   a1   b52  b53  b54  a5
Database Management Systems
Example-3

R(A,B,C,D,E)
After applying FD A -> C:

     A    B    C    D    E
R1   a1   b12  a3   a4   b15
R2   a1   a2   a3   b24  b25
R3   b31  a2   b33  b34  a5
R4   b41  b42  a3   a4   a5
R5   a1   b52  a3   b54  a5
Database Management Systems
Example-3

R(A,B,C,D,E)
After applying FD B -> C:

     A    B    C    D    E
R1   a1   b12  a3   a4   b15
R2   a1   a2   a3   b24  b25
R3   b31  a2   a3   b34  a5
R4   b41  b42  a3   a4   a5
R5   a1   b52  a3   b54  a5
Database Management Systems
Example-3

R(A,B,C,D,E)
After applying FD C -> D:

     A    B    C    D    E
R1   a1   b12  a3   a4   b15
R2   a1   a2   a3   a4   b25
R3   b31  a2   a3   a4   a5
R4   b41  b42  a3   a4   a5
R5   a1   b52  a3   a4   a5
Database Management Systems
Example-3

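R(A,B,C,D,E)
After applying FD CE -> A (DE -> C causes no change, since rows R3, R4 and R5 already agree on C):

     A    B    C    D    E
R1   a1   b12  a3   a4   b15
R2   a1   a2   a3   a4   b25
R3   a1   a2   a3   a4   a5
R4   a1   b42  a3   a4   a5
R5   a1   b52  a3   a4   a5

Lossless (Row R3 has all a’s)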
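
The matrix algorithm and the worked examples above can be traced with a short sketch. Each matrix cell holds either an "a" symbol or a distinct "b" symbol, and each FD equates the right-hand-side symbols of all rows that agree on its left-hand side, preferring an "a" symbol when one exists. The representation of the symbols as tuples and the function name are illustrative choices, and attribute names are assumed to be single letters. The last call uses a small lossy decomposition that is an assumption added for contrast.

```python
def lossless_join(attributes, decomposition, fds):
    """Matrix test for the lossless (non-additive) join property."""
    # Steps 1-3: ("a", attr) where Ri contains the attribute, else ("b", i, attr).
    rows = [
        {a: ("a", a) if a in ri else ("b", i, a) for a in attributes}
        for i, ri in enumerate(decomposition, start=1)
    ]
    # Step 4: apply the FDs until a full pass changes nothing.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in rows:
                groups.setdefault(tuple(row[x] for x in lhs), []).append(row)
            for group in groups.values():
                for y in rhs:
                    symbols = {row[y] for row in group}
                    if len(symbols) > 1:
                        # "a" sorts before "b", so an "a" symbol wins if present.
                        target = min(symbols)
                        for row in group:
                            row[y] = target
                        changed = True
    # Step 5: lossless if some row consists only of "a" symbols.
    return any(all(sym[0] == "a" for sym in row.values()) for row in rows)


# Example-3
fds3 = [("A", "C"), ("B", "C"), ("C", "D"), ("DE", "C"), ("CE", "A")]
print(lossless_join("ABCDE", ["ADC", "AB", "BE", "CDE", "AE"], fds3))   # True

# Example-2, with the decomposition R1(S,A), R2(S,I,P)
print(lossless_join("SAIP", ["SA", "SIP"], [("S", "A"), ("SI", "P")]))  # True

# Hypothetical lossy case: the common attribute B determines neither side.
print(lossless_join("ABC", ["AB", "BC"], [("A", "B")]))                 # False
```

The loop terminates because every change replaces a symbol with a smaller one in the chosen ordering, and there are only finitely many symbols.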
Database Management Systems
Properties of Relational Decompositions (Cont.)

4) Testing Binary Decompositions for Non-additive Join (Lossless Join) Property
■ Binary Decomposition: Decomposition of a relation R into two relations.

■ PROPERTY NJB (non-additive join test for binary decompositions): A
decomposition D = {R1, R2} of R has the lossless join property with respect to a
set of functional dependencies F on R if and only if either
■ The f.d. ((R1 ∩ R2) → (R1- R2)) is in F+, or
■ The f.d. ((R1 ∩ R2) → (R2 - R1)) is in F+.
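
Property NJB can be tested without computing F+ in full: an FD X → Y is in F+ exactly when Y is contained in the closure of X. The sketch below applies this to the decomposition R1(S,A), R2(S,I,P) of R(S,A,I,P) with F = {S → A, SI → P}, and to a small lossy decomposition that is an assumption added for contrast. Attribute names are assumed to be single letters.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is an (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """NJB test: (R1 ∩ R2) -> (R1 - R2) or (R1 ∩ R2) -> (R2 - R1) is in F+."""
    r1, r2 = set(r1), set(r2)
    common_closure = closure(r1 & r2, fds)
    return (r1 - r2) <= common_closure or (r2 - r1) <= common_closure


print(lossless_binary("SA", "SIP", [("S", "A"), ("SI", "P")]))   # True: S -> A
print(lossless_binary("AB", "BC", [("A", "B")]))                 # False
```

For two relations this check is much cheaper than the general matrix test and gives the same answer.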
Database Management Systems
Properties of Relational Decompositions (Cont.)

5) Successive Non-additive Join Decomposition:

■ Claim 2 (Preservation of non-additivity in successive decompositions):
■ If a decomposition D = {R1, R2, ..., Rm} of R has the lossless (non-additive)
join property with respect to a set of functional dependencies F on R,
■ and if a decomposition Di = {Q1, Q2, ..., Qk} of Ri has the lossless (non-
additive) join property with respect to the projection of F on Ri,
■ then the decomposition D2 = {R1, R2, ..., Ri-1, Q1, Q2, ..., Qk, Ri+1, ...,
Rm} of R has the non-additive join property with respect to F.
Database Management Systems
Exercise questions

Prof Nivedita Kasturi


Department of Computer Science and Engineering
Database Management Systems
Exercise questions

14.24 Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies F
= {{A, B}→{C}, {A}→{D, E}, {B}→{F}, {F}→{G, H}, {D}→{I, J}}. What is the key for R?

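Answer:

Attribute set closures

A+ = {A, D, E, I, J}
B+ = {B, F, G, H}
AB+ = {A, B, C, D, E, F, G, H, I, J}

A and B do not appear on the right-hand side of any FD, so every key must contain both, and
AB+ contains all the attributes of R. AF+ = {A, D, E, F, G, H, I, J} lacks B and C, so AF is not a key.
Candidate key {A, B}
Prime attributes {A,B}
Non-prime attributes {C,D,E,F,G,H,I,J}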

Repeat Exercise 14.24 for the following different set of functional dependencies G = {{A, B}→{C}, {B, D}→{E,
F}, {A, D}→{G, H}, {A}→{I}, {H}→{J}}
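
The closure computations used in Exercise 14.24 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original exercise: the names `closure`, `candidate_keys`, and `fd` are hypothetical helpers, attributes are assumed to be single letters, and the key search is brute force, which is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            # A superset of a key already found cannot be minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

def fd(lhs, rhs):
    """Build an FD from strings of single-letter attribute names (assumption)."""
    return (set(lhs), set(rhs))

R = "ABCDEFGHIJ"
F = [fd("AB", "C"), fd("A", "DE"), fd("B", "F"), fd("F", "GH"), fd("D", "IJ")]
G = [fd("AB", "C"), fd("BD", "EF"), fd("AD", "GH"), fd("A", "I"), fd("H", "J")]

keys_F = candidate_keys(R, F)
keys_G = candidate_keys(R, G)
print([sorted(k) for k in keys_F])  # [['A', 'B']]
print([sorted(k) for k in keys_G])  # [['A', 'B', 'D']]
```

For the set G, the attributes A, B, and D never appear on a right-hand side, so every key must contain them, which agrees with the single key {A, B, D} found by the search.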
Database Management Systems
Exercise questions

15.1. What is the role of Armstrong’s inference rules (inference rules IR1 through IR3) in the development
of the theory of relational design?

ans: Armstrong's inference rules (IR1 through IR3) are a set of formal rules used for reasoning about
functional dependencies (FDs) in relational databases. These rules are crucial in developing the theory of
relational design, as they allow database designers to derive new functional dependencies from existing
ones, helping ensure that a relational schema is well-structured. The rules are:

IR1 (Reflexivity Rule): If Y is a subset of X, then X → Y (trivial dependency).


IR2 (Augmentation Rule): If X → Y, then XZ → YZ (adding attributes to both sides).
IR3 (Transitivity Rule): If X → Y and Y → Z, then X → Z (transitive dependency).
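These rules are sound and complete: every dependency they derive is valid, and every dependency in F+ can
be derived with them. This makes them the basis for computing closures, testing the equivalence of sets of
dependencies, and finding minimal covers during normalization and schema refinement.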
Database Management Systems
Exercise questions

15.2. What is meant by the completeness and soundness of Armstrong’s inference rules?

ans: Soundness means that any dependency derived using Armstrong's rules must be a valid functional
dependency. In other words, the rules should not derive incorrect functional dependencies.
Completeness means that the rules are sufficient to derive all possible valid functional dependencies from a
given set of dependencies. No valid dependencies are left out.
Database Management Systems
Exercise questions

15.3. What is meant by the closure of a set of functional dependencies? Illustrate with an example.

ans: The closure of a set of functional dependencies F, denoted F+, is the set of all functional dependencies that can
be inferred from F using Armstrong's inference rules.

Example:
Given F = {A→B, B→C}, the closure F+ includes:

A→B (from the original set)
B→C (from the original set)
A→C (by the transitivity rule: A→B and B→C, hence A→C)

F+ also contains every trivial dependency (for example A→A and AB→A) and every dependency obtained by
augmentation or union (for example AB→C and A→BC), so the full closure is much larger than these three
dependencies.
Database Management Systems
Exercise questions

15.4. When are two sets of functional dependencies equivalent? How can we determine their equivalence?

ans: Two sets of functional dependencies F and G are equivalent if they imply the same set of functional
dependencies, that is, if the closure of F (F+) is the same as the closure of G (G+).

To determine their equivalence:

Check that F covers G: for every X→Y in G, compute X+ with respect to F and confirm that it contains Y.
Check that G covers F: for every X→Y in F, compute X+ with respect to G and confirm that it contains Y.
If both checks succeed, then F+ = G+ and the two sets are equivalent.

15.5. What is a minimal set of functional dependencies? Does every set of dependencies have a minimal equivalent
set? Is it always unique?

ans: A minimal set of functional dependencies is a set where:

Each functional dependency has a single attribute on the right-hand side (canonical form).
No functional dependency can be removed without changing the closure.
No attribute on the left-hand side of any functional dependency can be removed without changing the closure.

Does every set of dependencies have a minimal equivalent set?
Yes, every set of functional dependencies has at least one minimal equivalent set.

Is it always unique?
No. A set of dependencies can have several different minimal covers, depending on the order in which
redundant dependencies and extraneous attributes are removed.
Database Management Systems
Exercise questions

7.6 Compute the closure of the following set F of functional dependencies for rela-
tion schema R = (A, B, C, D, E).

A → BC
CD → E
B→D
E→A

List the candidate keys for R.

ans: Given F: A→BC, CD→E, B→D, E→A.

Closure of A (the set of attributes functionally determined by A):
Start with A+ = {A}.
From A→BC, add B and C: A+ = {A, B, C}.
From B→D, add D: A+ = {A, B, C, D}.
From CD→E, add E: A+ = {A, B, C, D, E}.
E→A adds nothing new, so A+ = {A, B, C, D, E}.

Candidate keys: A candidate key is a minimal set of attributes that determines all attributes in the relation.
Since A+ = {A, B, C, D, E}, A is a candidate key. The same check on the other attribute sets gives:
E+ = {A, B, C, D, E} (using E→A), so E is a candidate key.
(BC)+ = {A, B, C, D, E} (using B→D, CD→E, E→A), so BC is a candidate key.
(CD)+ = {A, B, C, D, E} (using CD→E, E→A), so CD is a candidate key.
B+ = {B, D}, C+ = {C}, D+ = {D}, and (BD)+ = {B, D}, so none of these is a key.
The candidate keys for R are therefore A, E, BC, and CD.
Database Management Systems
Exercise questions

7.7 Using the functional dependencies of Exercise 7.6, compute the canonical cover Fc.

ans: A canonical cover is a minimal set of functional dependencies that is equivalent to the original set and satisfies
three properties:
Each functional dependency has a single attribute on the right-hand side.
There are no redundant dependencies.
There are no extraneous attributes on the left-hand side of any dependency.

Given F: A→BC, CD→E, B→D, E→A.

Step 1: Decompose dependencies with multiple attributes on the right-hand side. A→BC becomes two
dependencies: A→B and A→C. CD→E, B→D, and E→A stay the same. Now we have the following set:
A→B, A→C, CD→E, B→D, E→A.

Step 2: Remove extraneous attributes. Only CD→E has more than one attribute on the left-hand side, and
C+ = {C} and D+ = {D}, so neither C nor D is extraneous and no changes are needed.

Step 3: Remove redundant dependencies. None of the dependencies in this set is redundant, because removing any
of them would change the closure of the set. Thus, the canonical cover Fc is: A→B, A→C, CD→E, B→D, E→A.
(If dependencies with the same left-hand side are merged, this is A→BC, CD→E, B→D, E→A, which is F itself.)
Database Management Systems
Exercise questions

1. Given a relational schema R(P, Q, R, S, T, U, V) with attributes P, Q, R, S, T, U, and V, and the set of
functional dependencies FD = { P->Q, QR->ST, PTU->V }.
Determine the closures (QR)+ and (PR)+.
QR+ = {Q,R,S,T}
PR+ = {P,R,Q,S,T}
2. Given a relational schema R(P, Q, R, S, T) with attributes P, Q, R, S, and T, and the set of functional
dependencies FD = { P->QR, RS->T, Q->S, T->P }.
Determine the closure (T)+.
T+ = {T,P,Q,R,S}
Database Management Systems
Numerical Examples on Equivalence of Two Sets of Functional Dependencies

Example 1: A relation R (A, C, D, E, H) has two sets of functional dependencies F and G as shown-
Set F-
A→C
AC → D
E → AD
E→H

Set G-
A → CD
E → AH

Which of the following holds true?


(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above
Database Management Systems
Numerical Examples on Equivalence of Two Sets of Functional Dependencies

Solution-

CASE 1: Determining whether F covers G-

Step-01: (A)+ = { A , C , D } // closure of left side of A → CD using set G


(E)+ = { A , C , D , E , H } // closure of left side of E → AH using set G

Step-02: (A)+ = { A , C , D } // closure of left side of A → CD using set F


(E)+ = { A , C , D , E , H } // closure of left side of E → AH using set F

Step-03: Comparing the results of Step-01 and Step-02, we find-


Functional dependencies of set F can determine all the attributes which have been
determined by the functional dependencies of set G.
Thus, we conclude F covers G i.e. F ⊇ G.
Database Management Systems
Numerical Examples on Equivalence of Two Sets of Functional Dependencies

Solution-

CASE 2: Determining whether G covers F-

Step-01: (A)+ = { A , C , D } // closure of left side of A → C using set F


(AC)+ = { A , C , D } // closure of left side of AC → D using set F
(E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set F

Step-02: (A)+ = { A , C , D } // closure of left side of A → C using set G


(AC)+ = { A , C , D } // closure of left side of AC → D using set G
(E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G

Step-03: Comparing the results of Step-01 and Step-02, we find-


Functional dependencies of set G can determine all the attributes which have been
determined by the functional dependencies of set F.
Thus, we conclude G covers F i.e. G ⊇ F.
Database Management Systems
Numerical Examples on Equivalence of Two Sets of Functional Dependencies

Solution-

CASE 3: Determining whether both F and G cover each other-

From Case 1, we conclude F covers G.
From Case 2, we conclude G covers F.
Thus, we conclude both F and G cover each other i.e. F = G (the two sets are equivalent).

Thus, Option (D) is correct.
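
The three cases above can be checked mechanically. The following is a minimal sketch under the assumption that each FD is stored as a pair of Python sets; `closure`, `covers`, and `equivalent` are hypothetical helper names, not part of the original solution.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def covers(f, g):
    """True if every FD X -> Y in g follows from f, i.e. Y is inside X+ under f."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

F = [({"A"}, {"C"}), ({"A", "C"}, {"D"}), ({"E"}, {"A", "D"}), ({"E"}, {"H"})]
G = [({"A"}, {"C", "D"}), ({"E"}, {"A", "H"})]

print(covers(F, G), covers(G, F), equivalent(F, G))  # True True True
```

Testing each dependency through an attribute closure avoids materializing F+ and G+, which can be exponentially large.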


Database Management Systems
Dependency preservation exercise

15.8. What is the dependency preservation property for a decomposition? Why is it important?
15.11. Between the properties of dependency preservation and losslessness,
which one must definitely be satisfied? Why?
Database Management Systems
Dependency preservation exercise answers

1. What is the dependency preservation property for a decomposition? Why is it important?

A: A decomposition preserves dependencies if every functional dependency of the original relation either
appears in one of the decomposed relations or can be inferred from the dependencies that hold in them.
Importance of Dependency Preservation:
● Maintains Data Integrity: every original constraint can still be enforced.
● Simplifies Enforcement of Constraints: each dependency can be checked within a single relation, without
computing a join.

2. Between the properties of dependency preservation and losslessness, which one must
definitely be satisfied? Why?
A: Losslessness must definitely be satisfied because it ensures that no information is lost during
decomposition. This guarantees that the original relation can be perfectly reconstructed from the
decomposed relations, maintaining data integrity. Dependency preservation is important but
secondary to losslessness.
THANK YOU
Prof Nivedita Kasturi
Department of Computer Science and Engineering
Database Management Systems
Dependency Preservation & Minimal
Cover Examples

Prof. Shilpa S
Department of Computer Science and Engineering
Database Management Systems
Dependency Preservation:

A lossless decomposition is necessary. We also want to keep dependencies, since losing a dependency
means that the corresponding constraint can be checked only through the natural join of the appropriate
resultant relations in the decomposition. This would be very expensive. So we
aim to get a lossless, dependency-preserving decomposition.
Database Management Systems
Dependency Preservation: Example 1

Question: R = (A, B, C) F= {A→ B, B→C}


Decomposition of R: R1=(A, C) R2=(B, C)
Does this decomposition preserve the given dependencies?

Solution:
In R1 the following dependencies hold: F1’ = { A→A, C→C, A→C, AC→AC }
In R2 the following dependencies hold: F2’ = { B→B, C→C, B→C, BC→BC }
F’ = F1’ ∪ F2’ = { A→C, B→C, trivial dependencies }
A→B cannot be derived from F’,
so this decomposition does NOT preserve dependencies.
Database Management Systems
Dependency Preservation: Example 2

Question: R = (A, B, C) F = {A→B, B→C}
Decomposition of R: R1=(A, B) R2=(B, C)
Does this decomposition preserve the given dependencies?

Solution:
In R1 the following dependencies hold: F1’ = { A→B, A→A, B→B, AB→AB }
In R2 the following dependencies hold: F2’ = { B→B, C→C, B→C, BC→BC }
F’ = F1’ ∪ F2’ = { A→B, B→C, trivial dependencies }
All the original dependencies occur in F’,
so this decomposition preserves dependencies.
Database Management Systems
Dependency Preservation: Example 3


Database Management Systems
Dependency Preservation: Example 4

R(A, B, C, D,E), F={A->D,B->E,DE->C}


Let S(A, B, C) be a decomposed relation of R. What FD-s do hold on S?
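
One way to answer this question is to compute, for every subset X of S, the closure X+ under F and keep only the attributes that lie in S. The sketch below does that by brute force; `closure` and `project` are hypothetical helper names, and the approach is exponential in the number of attributes of S, so it is meant only for small examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def project(fds, sub):
    """Nontrivial FDs X -> A that hold on sub, keeping only minimal left-hand sides."""
    sub = set(sub)
    found = []
    for size in range(1, len(sub) + 1):
        for combo in combinations(sorted(sub), size):
            lhs = set(combo)
            for attr in sorted((closure(lhs, fds) & sub) - lhs):
                # Skip FDs already implied by a smaller left-hand side.
                if not any(l <= lhs and a == attr for l, a in found):
                    found.append((lhs, attr))
    return found

F = [({"A"}, {"D"}), ({"B"}, {"E"}), ({"D", "E"}, {"C"})]
on_S = project(F, "ABC")
print([(sorted(lhs), attr) for lhs, attr in on_S])  # [(['A', 'B'], 'C')]
```

Under this sketch the only nontrivial dependency on S(A, B, C) is AB → C, obtained from A → D, B → E, and DE → C even though D and E are not attributes of S.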
Database Management Systems
Minimal Cover: Example1.

Find the minimal cover of the set of functional dependencies given:


{A -> B, B -> C, D -> ABC, AC -> D}

Step 1: Split the Functional Dependencies


{A -> B, B -> C,D -> A, D -> B, D -> C, AC -> D}

Step 2: Remove Redundant FDs


A -> B (not redundant) B -> C (not redundant)
D -> A (not redundant) D -> B (redundant, because D -> A and A -> B)
D -> C (redundant, because D -> A, A -> B and B -> C) AC -> D (not redundant)
After removing the redundant FDs, the set becomes
{A -> B, B -> C, D -> A, AC -> D}
Database Management Systems
Minimal Cover: Example1.

Find the minimal cover of the set of functional dependencies given:


{A -> B, B -> C, D -> ABC, AC -> D}

Step 3: Remove Extraneous Attributes


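In AC -> D, check if A or C is extraneous.
Compute closures
AC+ = {A,B,C,D}
A+ = {A,B,C,D}
C+ = {C}
Since A alone already determines D (A -> B, B -> C, then AC -> D), C is extraneous
Since C alone does not determine D, A is not extraneous
So AC -> D is replaced by A -> D. D -> C is not needed because it follows from D -> A, A -> B and B -> C.
Minimal cover of (A -> B, B -> C, D -> ABC, AC -> D) => (A -> B, B -> C, D -> A, A -> D)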
Database Management Systems
Minimal Cover: Example2.

Find the minimal cover of the set of functional dependencies given:


{A->C, AC->D, E->H, E->AD}

Step 1: {A->C, AC->D, E->H, E->A, E->D}


Step 2: {A->C, AC->D, E->H, E->A}
Here Redundant FD : {E->D} (it follows from E->A, A->C and AC->D)
Step 3: Check {AC->D} for extraneous attributes
{A}+ = {A,C,D} (A->C, then AC->D)
Therefore C is extraneous and is removed, giving
{A->D}

Minimal Cover = {A->C, A->D, E->H, E->A}


Database Management Systems
Minimal Cover: Example3.

Find the minimal cover of the set of functional dependencies given:


{AB->C, D->E, AB->E, E->C}

Step 1: {AB->C, D->E, AB->E, E->C}


Step 2: {D->E, AB->E, E->C}
Here Redundant FD = {AB->C} (it follows from AB->E and E->C)
Step 3: {AB->E}
{A}+ = {A}
{B}+ = {B}
There is no extraneous attribute.

Therefore, Minimal cover = {D->E, AB->E, E->C}
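
The three steps used in Examples 1 to 3 can be sketched as a small Python routine. This is an illustration under stated assumptions rather than a definitive algorithm: `closure`, `drop_redundant`, `minimal_cover`, and `fd` are hypothetical names, attributes are single letters, and a second redundancy pass is added after the extraneous-attribute step because shrinking a left-hand side can make another FD redundant. Minimal covers are not unique, so a different processing order may give a different, equally valid result.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def drop_redundant(fds):
    """Remove each FD that still follows from the remaining ones."""
    result = list(fds)
    for item in fds:
        others = [g for g in result if g != item]
        if item[1] <= closure(item[0], others):
            result = others
    return result

def minimal_cover(fds):
    # Step 1: split every right-hand side into single attributes.
    cover = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in sorted(rhs)]
    cover = list(dict.fromkeys(cover))  # drop duplicates, keep order
    # Step 2: remove redundant FDs.
    cover = drop_redundant(cover)
    # Step 3: remove extraneous left-hand-side attributes.
    reduced = []
    for lhs, rhs in cover:
        kept = set(lhs)
        for attr in sorted(lhs):
            if len(kept) > 1 and rhs <= closure(kept - {attr}, cover):
                kept.remove(attr)
        reduced.append((frozenset(kept), rhs))
    reduced = list(dict.fromkeys(reduced))
    # Final pass: a shortened left-hand side may have made another FD redundant.
    return set(drop_redundant(reduced))

def fd(lhs, rhs):
    """Build an FD from strings of single-letter attribute names (assumption)."""
    return (frozenset(lhs), frozenset(rhs))

ex1 = minimal_cover([fd("A", "B"), fd("B", "C"), fd("D", "ABC"), fd("AC", "D")])
ex2 = minimal_cover([fd("A", "C"), fd("AC", "D"), fd("E", "H"), fd("E", "AD")])
ex3 = minimal_cover([fd("AB", "C"), fd("D", "E"), fd("AB", "E"), fd("E", "C")])
```

With this processing order the three results agree with the minimal covers derived by hand in Examples 1, 2, and 3.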


Database Management Systems
Other Examples

Solve the given problems using the above method (Minimal cover).

Example 4: Consider another set F of functional dependencies:

F = { A → BC, CD → E, B → D, E → A }

Example 5: Given a relational schema R(A, B, C, D) and the set of functional dependencies

FD = { B → A, AD → BC, C → ABD }, find the canonical cover.

FD = { B → A, AD → C, C → BD } is a canonical cover of
FD = { B → A, AD → BC, C → ABD }
THANK YOU

Prof. Shilpa S
Department of Computer Science and Engineering
Database Management Systems
Normal Forms

Dr. J. Mannar Mannan


Department of Computer Science and Engineering
Database Management Systems
Normal Forms Based on Primary Keys

■ Normalization of Relations

■ Practical Use of Normal Forms

■ Definitions of Keys and Attributes Participating in Keys

■ First Normal Form

■ Second Normal Form

■ Third Normal Form


Database Management Systems
Normalization of Relations (1)

■ Normalization:
■ The process of decomposing unsatisfactory "bad" relations by breaking up their

attributes into smaller relations

■ Normal form:
■ Condition using keys and FDs of a relation to certify whether a relation schema is in a

particular normal form

Database Management Systems
Normalization of Relations (2)

■ 2NF, 3NF, BCNF


■ based on keys and FDs of a relation schema
■ 4NF
■ based on keys, multi-valued dependencies : MVDs;
■ 5NF
■ based on keys, join dependencies : JDs

Database Management Systems
Practical Use of Normal Forms

■ Normalization is carried out in practice so that the resulting designs are of high quality and
meet the desirable properties
■ The practical utility of these normal forms becomes questionable when the constraints on which
they are based are hard to understand or to detect
■ The database designers need not normalize to the highest possible normal form
■ (usually up to 3NF and BCNF. 4NF rarely used in practice.)
■ Denormalization:
■ The process of storing the join of higher normal form relations as a base relation—which is
in a lower normal form
Database Management Systems
Definitions of Keys and Attributes Participating in Keys (1)

■ A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R
with the property that no two tuples t1 and t2 in any legal relation state r of R will
have t1[S] = t2[S]

■ A key K is a superkey with the additional property that removal of any attribute from
K will cause K not to be a superkey any more.
Database Management Systems
Definitions of Keys and Attributes Participating in Keys (2)

■ If a relation schema has more than one key, each is called a candidate key.

■ One of the candidate keys is arbitrarily designated to be the primary key, and the
others are called secondary keys.
■ A Prime attribute must be a member of some candidate key
■ A Nonprime attribute is not a prime attribute—that is, it is not a member of any
candidate key.
Database Management Systems
First Normal Form

■ Disallows

■ composite attributes
■ multivalued attributes
■ nested relations; attributes whose values for an individual tuple are non-atomic
■ Considered to be part of the definition of a relation
■ Most RDBMSs allow only those relations to be defined that are in First Normal Form
Database Management Systems
Normalization into 1NF

Figure 14.9
Normalization into 1NF. (a) A relation schema that is not in 1NF.
(b) Sample state of relation DEPARTMENT. (c) 1NF version of the
same relation with redundancy.
Database Management Systems
Normalizing nested relations into 1NF

Figure 14.10
Normalizing nested relations into 1NF. (a) Schema of the EMP_PROJ relation with a nested relation
attribute PROJS. (b) Sample extension of the EMP_PROJ relation showing nested relations within each
tuple. (c) Decomposition of EMP_PROJ into relations EMP_PROJ1 and EMP_PROJ2 by propagating the
primary key.
Database Management Systems
3.5 Second Normal Form (1)

■ Uses the concepts of FDs, primary key


■ Definitions

■ Prime attribute: An attribute that is member of the primary key K


■ Full functional dependency: a FD Y -> Z where removal of any
attribute from Y means the FD does not hold any more
■ Examples:

■ {SSN, PNUMBER} -> HOURS is a full FD since neither SSN ->


HOURS nor PNUMBER -> HOURS hold
■ {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial
dependency ) since SSN -> ENAME also holds
Database Management Systems
Second Normal Form (2)

■ A relation schema R is in second normal form (2NF) if every non-prime


attribute A in R is fully functionally dependent on the primary key

■ R can be decomposed into 2NF relations via the process of 2NF


normalization or “second normalization”
Database Management Systems
Figure 14.11 Normalizing into 2NF and 3NF

Figure 14.11
Normalizing into 2NF and 3NF. (a) Normalizing EMP_PROJ into 2NF relations.
(b) Normalizing EMP_DEPT into 3NF relations.
Database Management Systems
Figure 14.12 Normalization into 2NF and 3NF

Figure 14.12
Normalization into 2NF and 3NF. (a) The LOTS relation with its functional dependencies FD1
through FD4. (b) Decomposing into the 2NF relations LOTS1 and LOTS2. (c) Decomposing LOTS1
into the 3NF relations LOTS1A and LOTS1B. (d) Progressive normalization of LOTS into a 3NF
design.
Database Management Systems
3.6 Third Normal Form (1)

■ Definition:
■ Transitive functional dependency: a FD X -> Z that can be derived from two
FDs X -> Y and Y -> Z

■ Examples:
■ SSN -> DMGRSSN is a transitive FD
■ Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold

■ SSN -> ENAME is non-transitive


■ Since there is no set of attributes X where SSN -> X and X -> ENAME
Database Management Systems
3.6 Third Normal Form (2)

■ A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime


attribute A in R is transitively dependent on the primary key
■ R can be decomposed into 3NF relations via the process of 3NF normalization

■ NOTE:
■ In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only
if Y is not a candidate key.
■ When Y is a candidate key, there is no problem with the transitive dependency .
■ E.g., Consider EMP (SSN, Emp#, Salary ).
■ Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
Database Management Systems
Normal Forms Defined Informally

■ 1st normal form


■ All attributes depend on the key

■ 2nd normal form


■ All attributes depend on the whole key
■ 3rd normal form

■ All attributes depend on nothing but the key


Database Management Systems
General Normal Form Definitions (For Multiple Keys) (1)

■ The above definitions consider the primary key only


■ The following more general definitions take into account relations with multiple
candidate keys

■ Any attribute involved in a candidate key is a prime attribute


■ All other attributes are called non-prime attributes.
Database Management Systems
General Definition of 2NF (For Multiple Candidate Keys)

■ A relation schema R is in second normal form (2NF) if every non-prime attribute A


in R is fully functionally dependent on every key of R

■ In Figure 14.12 the FD


County_name → Tax_rate violates 2NF.

So second normalization converts LOTS into


LOTS1 (Property_id#, County_name, Lot#, Area, Price)
LOTS2 ( County_name, Tax_rate)
Database Management Systems
General Definition of Third Normal Form

■ Definition:
■ Superkey of relation schema R - a set of attributes S of R that contains a key of
R

■ A relation schema R is in third normal form (3NF) if whenever a FD X → A


holds in R, then either:
■ (a) X is a superkey of R, or
■ (b) A is a prime attribute of R
■ LOTS1 relation violates 3NF because
Area → Price ; and Area is not a superkey in LOTS1. (see Figure 14.12).
Database Management Systems
Interpreting the General Definition of Third Normal Form

■ Consider the 2 conditions in the Definition of 3NF:


■ A relation schema R is in third normal form (3NF) if whenever a FD X → A holds in
R, then either:

■ (a) X is a superkey of R, or
■ (b) A is a prime attribute of R
■ Condition (a) catches two types of violations :

■ - one where a prime attribute functionally determines a non-prime attribute.


This catches 2NF violations due to non-full functional dependencies.
■ -second, where a non-prime attribute functionally determines a non-prime
attribute. This catches 3NF violations due to a transitive dependency.
Database Management Systems
Interpreting the General Definition of Third Normal Form (2)

■ ALTERNATIVE DEFINITION of 3NF: We can restate the definition as:


A relation schema R is in third normal form (3NF) if every non-prime attribute in
R meets both of these conditions:

■ It is fully functionally dependent on every key of R


■ It is non-transitively dependent on every key of R
Note that stated this way, a relation in 3NF also meets the requirements for 2NF.

■ The condition (b) from the last slide takes care of the dependencies that “slip
through” (are allowable to) 3NF but are “caught by” BCNF which we discuss next.
Database Management Systems
Interpreting the General Definition of Third Normal Form (3)

■ 2nd ALTERNATIVE DEFINITION of 3NF: We can restate the definition as:


A relation schema R is in third normal form with respect to a set F of functional

dependencies if, for all functional dependencies in F + of the form α → β, where α ⊆ R and
β ⊆ R, at least one of the following holds:

• α → β is a trivial functional dependency.


• α is a superkey for R.
• Each attribute A in β − α is contained in a candidate key for R.
Note that the third condition above does not say that a single candidate key must
contain all the attributes in β − α; each attribute A in β − α may be contained in a
different candidate key.
THANK YOU

Dr. J. Mannar Mannan


Department of Computer Science and Engineering
Database Management Systems
BCNF and overview of Higher Normal
Forms
Dr. J. Mannar Mannan
Department of Computer Science and Engineering
Database Management Systems
BCNF (Boyce-Codd Normal Form)

■ A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD


X → A holds in R, then X is a superkey of R
■ Each normal form is strictly stronger than the previous one
■ Every 2NF relation is in 1NF
■ Every 3NF relation is in 2NF
■ Every BCNF relation is in 3NF
■ There exist relations that are in 3NF but not in BCNF
■ Hence BCNF is considered a stronger form of 3NF
■ The goal is to have each relation in BCNF (or 3NF)
Database Management Systems
Fig. 14.13 BCNF (Boyce-Codd Normal Form)

Figure 14.13
Boyce-Codd normal form. (a) BCNF normalization of LOTS1A with
the functional dependency FD2 being lost in the decomposition.
(b) A schematic relation with FDs; it is in 3NF, but not in BCNF
due to the f.d. C → B.
Database Management Systems
Figure 14.14 A relation TEACH that is in 3NF but not in BCNF

■ Two FDs exist in the relation TEACH:


■ fd1: { student, course} -> instructor
■ fd2: instructor -> course

Figure 14.14
A relation TEACH that is in 3NF but not BCNF.
Database Management Systems
Achieving the BCNF by Decomposition (1)

■ Two FDs exist in the relation TEACH:


■ fd1: { student, course} -> instructor
■ fd2: instructor -> course
■ {student, course} is a candidate key for this relation, and the
dependencies shown follow the pattern in Figure 14.13 (b).
■ So this relation is in 3NF but not in BCNF
■ A relation NOT in BCNF should be decomposed so as to meet this
property, while possibly forgoing the preservation of all
functional dependencies in the decomposed relations.
■ (See Algorithm 15.3)
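
The claim that TEACH is in 3NF but not in BCNF can be checked with the general definitions. The sketch below is illustrative only: `closure`, `candidate_keys`, `is_bcnf`, and `is_3nf` are hypothetical helper names, and the tests inspect only the dependencies listed in F, which is sufficient when F is given for the whole relation.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

def is_bcnf(schema, fds):
    """Every nontrivial FD must have a superkey on its left-hand side."""
    schema = set(schema)
    return all(rhs <= lhs or closure(lhs, fds) == schema for lhs, rhs in fds)

def is_3nf(schema, fds):
    """Each FD needs a superkey on the left or only prime attributes on the right."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    return all(
        closure(lhs, fds) == schema or (rhs - lhs) <= prime
        for lhs, rhs in fds
    )

TEACH = {"student", "course", "instructor"}
F = [({"student", "course"}, {"instructor"}), ({"instructor"}, {"course"})]

print(is_3nf(TEACH, F), is_bcnf(TEACH, F))  # True False
```

The search finds two candidate keys, {student, course} and {student, instructor}, so every attribute is prime. That is why fd2 passes the 3NF test even though instructor is not a superkey.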
Database Management Systems
Achieving the BCNF by Decomposition (2)

■ Three possible decompositions for relation TEACH


■ D1: {student, instructor} and {student, course}

■ D2: {course, instructor } and {course, student}

■ D3: {instructor, course } and {instructor, student}

■ All three decompositions will lose fd1.


■ We have to settle for sacrificing the functional dependency preservation. But
we cannot sacrifice the non-additivity property after decomposition.
■ Out of the above three, only the 3rd decomposition will not generate spurious
tuples after join.(and hence has the non-additivity property).

■ A test to determine whether a binary decomposition (decomposition into two


relations) is non-additive (lossless) is discussed under Property NJB on the next
slide. We then show how the third decomposition above meets the property.
Database Management Systems
Test for checking non-additivity of Binary Relational Decompositions

■ Testing Binary Decompositions for Lossless Join (Non-additive Join)


Property
■ Binary Decomposition: Decomposition of a relation R into two relations.
■ PROPERTY NJB (non-additive join test for binary decompositions): A
decomposition D = {R1, R2} of R has the lossless join property with respect to
a set of functional dependencies F on R if and only if either
■ The f.d. ((R1 ∩ R2) → (R1- R2)) is in F+, or
■ The f.d. ((R1 ∩ R2) → (R2 - R1)) is in F+.
Database Management Systems
Test for checking non-additivity of Binary Relational Decompositions

If you apply the NJB test to the 3 decompositions of the TEACH relation:
■ D1 gives Student → Instructor or Student → Course, neither of which holds.
■ D2 gives Course → Instructor or Course → Student, neither of which holds.
■ However, in D3 we get Instructor → Course or Instructor → Student.
Since Instructor → Course is indeed true, the NJB property is satisfied and D3 is
determined as a non-additive (good) decomposition.
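
Property NJB reduces to one closure computation: the decomposition is lossless exactly when the closure of R1 ∩ R2 contains R1 − R2 or R2 − R1. A minimal Python sketch follows; the function names `closure` and `is_lossless_binary` are hypothetical.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """NJB test: (R1 ∩ R2) -> (R1 - R2) or (R1 ∩ R2) -> (R2 - R1) must be in F+."""
    r1, r2 = set(r1), set(r2)
    common = closure(r1 & r2, fds)
    return (r1 - r2) <= common or (r2 - r1) <= common

F = [({"student", "course"}, {"instructor"}), ({"instructor"}, {"course"})]

d1 = is_lossless_binary({"student", "instructor"}, {"student", "course"}, F)
d2 = is_lossless_binary({"course", "instructor"}, {"course", "student"}, F)
d3 = is_lossless_binary({"instructor", "course"}, {"instructor", "student"}, F)
print(d1, d2, d3)  # False False True
```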
Database Management Systems
General Procedure for achieving BCNF

■ Let R be the relation not in BCNF, let X ⊆ R, and let X → A be the
FD that causes a violation of BCNF. Then R may be decomposed into two
relations:
■ (i) R − A and (ii) X ∪ A.
■ If either R − A or X ∪ A is not in BCNF, repeat the process.

Note that the f.d. that violated BCNF in TEACH was Instructor → Course. Hence its BCNF
decomposition would be:
(TEACH − Course) and (Instructor ∪ Course), which gives
the relations: (Instructor, Student) and (Instructor, Course) that we obtained before in
decomposition D3.
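
The procedure can be sketched recursively. In this illustration, `closure`, `find_violation`, and `bcnf_decompose` are hypothetical names. A violation in a sub-relation is found by computing closures of its attribute subsets under the original F, which is an assumption of this sketch and is exponential, so it suits only small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of the attribute set attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(rel, fds):
    """Return (X, A) where X -> A holds on rel and X is not a superkey of rel."""
    rel = set(rel)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            determined = closure(x, fds) & rel
            extra = determined - x
            if extra and determined != rel:
                return x, min(extra)
    return None

def bcnf_decompose(rel, fds):
    """Split rel into R - A and X ∪ A until no violating FD X -> A remains."""
    rel = set(rel)
    violation = find_violation(rel, fds)
    if violation is None:
        return [rel]
    x, a = violation
    return bcnf_decompose(rel - {a}, fds) + bcnf_decompose(x | {a}, fds)

F = [({"student", "course"}, {"instructor"}), ({"instructor"}, {"course"})]
result = bcnf_decompose({"student", "course", "instructor"}, F)
print([sorted(r) for r in result])  # [['instructor', 'student'], ['course', 'instructor']]
```

Each recursive call works on a strictly smaller relation, so the recursion terminates.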
Database Management Systems
Important points for BCNF

• Every binary relation (a relation with only two attributes) is always in BCNF.
• BCNF is free from redundancies arising out of functional dependencies.
• A lossless BCNF decomposition always exists, but it is not always dependency preserving.
• Sometimes, going for BCNF may not preserve functional dependencies. So, go for BCNF
only if the lost functional dependencies are not required; otherwise normalize only up to 3NF.
• There exist more normal forms beyond BCNF, such as 4NF and 5NF.
• But in real-world database systems, it is generally not required to go beyond BCNF.
Database Management Systems
Comparison of Normal Forms

Explanation of Terms:

• Atomic Values: Values that cannot be divided further (e.g., an individual email
address as opposed to a list of email addresses).
• Functional Dependency: A relationship where one attribute uniquely determines
another attribute (e.g., StudentID → StudentName).
• Composite Key: A key that consists of more than one attribute.
• Transitive Dependency: A situation where one non-key attribute depends on another
non-key attribute through a chain of dependencies (e.g., A → B and B → C implies
A → C).
Database Management Systems
Multivalued Dependencies and Fourth Normal Form (1)

Definition:
■ A multivalued dependency (MVD) X —>> Y specified on relation schema R,
where X and Y are both subsets of R, specifies the following constraint on any
relation state r of R: If two tuples t1 and t2 exist in r such that t1[X] = t2[X], then
two tuples t3 and t4 should also exist in r with the following properties, where we
use Z to denote (R − (X ∪ Y)):
■ t3[X] = t4[X] = t1[X] = t2[X].
■ t3[Y] = t1[Y] and t4[Y] = t2[Y].
■ t3[Z] = t2[Z] and t4[Z] = t1[Z].

■ An MVD X —>> Y in R is called a trivial MVD if (a) Y is a subset of X, or (b) X ∪ Y = R.
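
The MVD definition can be tested directly on a relation state: for every pair of tuples that agree on X, the tuple that takes its X and Y values from the first and its Z values from the second must also be present. The sketch below is illustrative; `satisfies_mvd` is a hypothetical name, rows are assumed to be dictionaries over the same attributes, and the sample EMP rows are made-up data.

```python
def satisfies_mvd(rows, x, y):
    """Check X ->> Y on a relation state given as a list of dicts."""
    if not rows:
        return True
    attrs = set(rows[0])
    x, y = set(x), set(y)
    z = attrs - x - y
    state = {frozenset(r.items()) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # t3 takes X and Y from t1 and Z from t2; t4 is covered
                # when the loop visits the pair in the opposite order.
                t3 = {a: (t2[a] if a in z else t1[a]) for a in attrs}
                if frozenset(t3.items()) not in state:
                    return False
    return True

# Hypothetical sample state of EMP(Ename, Pname, Dname).
emp = [
    {"Ename": "Smith", "Pname": "X", "Dname": "John"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "X", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "John"},
]

full_ok = satisfies_mvd(emp, ["Ename"], ["Pname"])
partial_ok = satisfies_mvd(emp[:3], ["Ename"], ["Pname"])
print(full_ok, partial_ok)  # True False
```

Dropping one of the four rows breaks the dependency, which shows why every project must be repeated with every dependent when the two facts are stored in one relation.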
Database Management Systems
Multivalued Dependencies and Fourth Normal Form (3)

Definition: A relation schema R is in 4NF with respect to a set of dependencies


F (that includes functional dependencies and multivalued dependencies) if, for
every nontrivial multivalued dependency X —>> Y in F+, X is a superkey for R.

We can state the following points:


• An all-key relation is always in BCNF since it has no FDs.
• An all key relation such as the EMP relation in fig 14.15(a), which has no FDs but has MVD
Ename ->> Pname | Dname, is not in 4NF.
• A relation that is not in 4NF due to non trivial MVD must be decomposed to convert it into a set of
relations in 4NF.
• The decomposition removes the redundancy caused by the MVD.

Note: F+ is the (complete) set of all dependencies (functional or multivalued) that will hold
in every relation state r of R that satisfies F. It is also called the closure of F.
Database Management Systems
Fourth and fifth normal forms.

Figure 14.15 Fourth and fifth normal forms.


(a) The EMP relation with two MVDs: Ename –>> Pname and Ename –>> Dname.
(b) Decomposing the EMP relation into two 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if it has the JD(R1, R2, R3).
(d) Decomposing the relation SUPPLY into the 5NF relations R1, R2, R3.
Database Management Systems
Join Dependencies and Fifth Normal Form (1)

Definition:
■ A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified
on relation schema R, specifies a constraint on the states r of R.
■ The constraint states that every legal state r of R should have a non-
additive join decomposition into R1, R2, ..., Rn; that is, for every such r we
have
■ * (πR1(r), πR2(r), ..., πRn(r)) = r
Note: an MVD is a special case of a JD where n = 2.

■ A join dependency JD(R1, R2, ..., Rn), specified on relation


schema R, is a trivial JD if one of the relation schemas Ri in
JD(R1, R2, ..., Rn) is equal to R.
Database Management Systems
Join Dependencies and Fifth Normal Form (2)

Definition:
■ A relation schema R is in fifth normal form (5NF) (or Project-Join Normal
Form (PJNF)) with respect to a set F of functional, multivalued, and join
dependencies if,
■ for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied
by F),
■ every Ri is a superkey of R.

■ Discovering join dependencies in practical databases with hundreds of relations


is next to impossible. Therefore, 5NF is rarely used in practice.
Database Management Systems
Chapter Summary

■ Informal Design Guidelines for Relational Databases


■ Functional Dependencies (FDs)
■ Normal Forms (1NF, 2NF, 3NF) Based on Primary Keys
■ General Normal Form Definitions of 2NF and 3NF (For Multiple Keys)
■ BCNF (Boyce-Codd Normal Form)
■ Fourth and Fifth Normal Forms
Database Management Systems
Chapter Summary

Remember the following diagram which implies-


• A relation in BCNF will surely be in 3NF, 2NF, and 1NF.
• A relation in 3NF will surely be in 2NF and 1NF.
• A relation in 2NF will surely be in 1NF.

This diagram also implies-


• BCNF is stricter than 3NF.
• 3NF is stricter than 2NF.
• 2NF is stricter than 1NF.
Database Management Systems
Chapter Summary

While determining the normal form of any given relation,


• Start checking from BCNF.
• This is because if it is found to be in BCNF, then it will surely be in 3NF, 2NF,
and 1NF as well.
• If the relation is not in BCNF, then start moving towards the outer circles and check
for other normal forms in the order they appear.
THANK YOU
Dr. J. Mannar Mannan
Department of Computer Science and Engineering
Database Management Systems
Normalization Exercises

Dr. J. MANNAR MANNAN


Department of Computer Science and Engineering
Database Management Systems
Ex 14.8

Q. Define first, second, and third normal forms when only primary keys are considered. How do the general
definitions of 2NF and 3NF, which consider all keys of a relation, differ from those that consider only primary
keys?

Answer:

The concepts of First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF)
are important in database normalization to reduce redundancy and ensure data integrity. The definitions
below consider only the primary key of a relation.

1NF (First Normal Form) – Primary Key Consideration:


A relation is in 1NF if it contains only atomic (indivisible) values. In other words, the entries in each column
are singular values, not sets or lists. Additionally, the table must have a primary key, which uniquely identifies
each row.
•Key Point: The focus here is on ensuring that the data structure is in tabular format with no repeating groups
and a unique primary key.

[cont.]
Database Management Systems
Ex 14.8 [cont]

2NF (Second Normal Form) – Primary Key Consideration:


A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key. This
means that any partial dependency of a non-key attribute on a part of a composite primary key is eliminated. If the
primary key consists of a single column, then the table is automatically in 2NF.
•Key Point: In the context of primary keys only, 2NF eliminates partial dependencies (when part of a composite
primary key determines a non-key attribute).

3NF (Third Normal Form) – Primary Key Consideration:


A relation is in 3NF if it is in 2NF and there are no transitive dependencies, i.e., no non-key attribute is functionally
dependent on another non-key attribute. In other words, non-key attributes should only depend on the primary key.
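•Key Point: In the context of primary keys, 3NF removes transitive dependencies, ensuring that non-key attributes are
not dependent on other non-key attributes but only on the primary key.
•General definitions: When all candidate keys are considered, 2NF requires that no non-prime attribute is partially
dependent on any key of the relation, and 3NF requires that for every non-trivial functional dependency X → A, either
X is a superkey or A is a prime attribute. The tests are applied to every candidate key, not just the primary key.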
Database Management Systems
Ex 14.9

Q. What undesirable dependencies are avoided when a relation is in 2NF?

Answer:

When a relation is in Second Normal Form (2NF), the main undesirable dependency that is avoided is partial
dependency.

Partial Dependency:
A partial dependency occurs when a non-key attribute is dependent on only a part (a subset) of a composite primary key
(in the case of relations with composite keys). This situation can lead to redundancy and anomalies (such as update, insert,
and delete anomalies).
•Example of Partial Dependency:
Suppose we have a table with a composite primary key consisting of A and B. If a non-key attribute C depends only on A,
then we have a partial dependency of C on part of the primary key. In this case, C is only partially dependent on the
composite key (A, B), not fully dependent on the entire key.
Database Management Systems
Ex 14.10

Q. What undesirable dependencies are avoided when a relation is in 3NF?

Answer:

When a relation is in Third Normal Form (3NF), the main undesirable dependency
that is avoided is transitive dependency.

Transitive Dependency:
A transitive dependency occurs when a non-key attribute is dependent on another
non-key attribute rather than directly on the primary key (or candidate key).
Specifically, attribute A depends on B, and B depends on the primary key, so A is
transitively dependent on the primary key through B.
Database Management Systems
Ex 14.12

Q. Define Boyce-Codd normal form. How does it differ from 3NF? Why is it considered a stronger form of 3NF?

Answer:
Difference Between BCNF and 3NF:
BCNF is a stricter version of Third Normal Form (3NF). Both aim to eliminate undesirable dependencies, but they
differ in how strict they are regarding functional dependencies.
1.Functional Dependencies Involving Prime Attributes:
•3NF: A relation is in 3NF if, for every functional dependency (X → Y), one of the following conditions is true:
•X is a superkey, or
•Y is a prime attribute (i.e., part of a candidate key).
•BCNF: A relation is in BCNF if, for every functional dependency (X → Y), X must be a superkey. BCNF
does not allow the exception where Y is a prime attribute.
2.Handling of Candidate Keys:
•In 3NF, a relation can still satisfy the normal form if a non-superkey determines a prime attribute.
•In BCNF, such cases violate the form because BCNF requires that the left-hand side (determinant) of every
functional dependency be a superkey.
Database Management Systems
Ex 14.12

Q. Define Boyce-Codd normal form. How does it differ from 3NF? Why is it considered a stronger form of 3NF?

Answer:

BCNF Is Considered Stronger Than 3NF because:

•Stricter Dependency Rules: BCNF removes more potential anomalies by ensuring that all functional dependencies
have a superkey on the left-hand side. In contrast, 3NF allows certain functional dependencies that don’t involve a
superkey if the dependent attribute is a prime attribute.
•Eliminates Redundancies and Anomalies: BCNF helps avoid update, insert, and delete anomalies in cases where 3NF
might still permit them due to less strict rules for dependencies involving prime attributes.
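The BCNF condition can be checked mechanically with attribute closures. The sketch below is an illustration only, assuming single-letter attribute names written as strings; the function names and the schema R(A, B, C) with AB → C and C → B are hypothetical and not taken from the slides. For the relation as a whole, testing the given FDs is enough to detect a BCNF violation.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the given FDs X -> Y that are non-trivial with X not a superkey."""
    violations = []
    for lhs, rhs in fds:
        non_trivial = not set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) == set(schema)
        if non_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema R(A, B, C) with AB -> C and C -> B.
# Candidate keys are AB and AC, so B is prime: R is in 3NF but not in BCNF.
print(bcnf_violations("ABC", [("AB", "C"), ("C", "B")]))  # [('C', 'B')]
print(bcnf_violations("AB", [("A", "B")]))                # []
```

In the first call, C → B is exactly the kind of dependency 3NF tolerates (B is a prime attribute) and BCNF rejects (C is not a superkey).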
Database Management Systems
Ex 14.22

Q. Prove that any relation schema with two attributes is in BCNF

Answer:

Let’s consider a relation schema R(A,B), which has only two attributes: A and B. We need to show that such a relation
is always in Boyce-Codd Normal Form (BCNF).

Step 1: List Possible Functional Dependencies


For a relation with two attributes A and B, the possible non-trivial functional dependencies are:
1. A→B
2. B→A

Step 2: Define the BCNF Condition


A relation is in BCNF if, for every non-trivial functional dependency X→Y, the determinant X is a superkey.
(A superkey is any set of attributes that uniquely identifies each tuple (row) in the relation.)
[cont..]
Database Management Systems
Ex 14.22[cont]
Step 3: Consider Each Possible Functional Dependency

1.Case 1: A→B
1. If A→B, then A must uniquely determine B.
2. Since A can determine B, A is a superkey (because it uniquely identifies the whole tuple (A,B)).
3. Therefore, the dependency A→B satisfies the BCNF condition because the determinant A is a superkey.
2.Case 2: B→A
1. If B→A, then B must uniquely determine A.
2. Since B can determine A, B is a superkey (because it uniquely identifies the whole tuple (A,B)).
3. Therefore, the dependency B→A satisfies the BCNF condition because the determinant B is a superkey.

Step 4: No Other Functional Dependencies


Since the relation has only two attributes, the only possible non-trivial dependencies are A→B and B→A, and we have
shown that both satisfy the BCNF condition. If neither of them holds, only trivial dependencies remain, the key is
{A, B}, and the BCNF condition is satisfied vacuously.
Conclusion:
For any relation with two attributes A and B, every non-trivial functional dependency that holds has a superkey as its
determinant. Therefore, the relation is always in BCNF.
Thus, any relation schema with two attributes is in BCNF.
Closure Of Functional Dependency :
Introduction
• The closure of a set of attributes X under a set of functional dependencies F
is the complete set of all attributes that can be functionally derived from X
using the inference rules known as Armstrong’s Rules.
• The closure of X is denoted by “{X}+”, for example {Roll_No}+.
• There are three steps to calculate closure of functional dependency.
These are:

Step-1 : Add the attributes which are present on Left Hand Side in the original
functional dependency.

Step-2 : Now, add the attributes present on the Right Hand Side of the
functional dependency.

Step-3 : With the help of attributes present on Right Hand Side, check the other
attributes that can be derived from the other given functional dependencies.
Repeat this process until all the possible attributes which can be derived are
added in the closure.

• Seems difficult? Check out the example explained below and it will surely
clear your doubt on how to calculate closure of functional dependency.

Closure Of Functional Dependency : Examples

Example-1 : Consider the table student_details having (Roll_No, Name, Marks,
Location) as the attributes and having two functional dependencies.

FD1 : Roll_No -> Name, Marks

FD2 : Name -> Marks, Location

Now, we will calculate the closure of all the attributes present in the relation
using the three steps mentioned above.

Step-1 : Add attributes present on the LHS of the first functional dependency
to the closure.
{Roll_No}+ = {Roll_No}
Step-2 : Add attributes present on the RHS of the original functional
dependency to the closure.
{Roll_No}+ = {Roll_No, Name, Marks}

Step-3 : Add the other possible attributes which can be derived using attributes
already present in the closure. The Roll_No attribute cannot functionally
determine any further attribute, but the Name attribute can determine
Marks and Location using the 2nd functional dependency (Name -> Marks, Location).
Therefore, complete closure of Roll_No will be :

{Roll_no}+ = {Roll_No, Marks, Name, Location}

Similarly, we can calculate closure for other attributes too i.e “Name”.

Step-1 : Add attributes present on the LHS of the functional dependency to the
closure.
{Name}+ = {Name}

Step-2 : Add the attributes present on the RHS of the functional dependency to
the closure.
{Name}+ = {Name, Marks, Location}

Step-3 : Since we don’t have any functional dependency where the “Marks” or
“Location” attribute functionally determines any other attribute, we cannot
add more attributes to the closure. Hence complete closure of Name would be :
{Name}+ = {Name, Marks, Location}

NOTE : We don’t have any Functional dependency where marks and location
can functionally determine any attribute. Hence, for those attributes we can
only add the attributes themselves in their closures. Therefore,
{Marks}+ = {Marks}

and

{Location}+ = { Location}
Example-2 : Consider a relation R(A,B,C,D,E) having below mentioned
functional dependencies.

FD1 : A -> BC

FD2 : C -> B

FD3 : D -> E

FD4 : E -> D

Now, we need to calculate the closure of attributes of the relation R. The
closures will be:

{A}+ = {A, B, C}

{B}+ = {B}

{C}+ = {B, C}

{D}+ = {D, E}

{E}+ = {D, E}

Closure Of Functional Dependency : Calculating Candidate Key

• A Candidate Key of a relation is a minimal attribute or set of attributes that
can determine the whole relation, i.e. one that contains all the attributes of the
relation in its closure.
• Let’s try to understand how to calculate candidate keys.

Example-1 : Consider the relation R(A,B,C) with given functional dependencies :

FD1 : A -> B

FD2 : B -> C

Now, calculating the closure of the attributes as :

{A}+ = {A, B, C}

{B}+ = {B, C}

{C}+ = {C}
Clearly, “A” is the candidate key as, its closure contains all the attributes
present in the relation “R”.

Example-2 : Consider another relation R(A, B, C, D, E) having the Functional
dependencies :

FD1 : A -> BC

FD2 : C -> B

FD3 : D -> E

FD4 : E -> D

Now, calculating the closure of the attributes as :

{A}+ = {A, B, C}

{B}+ = {B}

{C}+ = {C, B}

{D}+ = {E, D}

{E}+ = {E, D}
In this case, no single attribute is able to determine all the attributes on its
own as in the previous example. Here, we need to combine two or more attributes to
determine the candidate keys.

{A, D}+ = {A, B, C, D, E}

{A, E}+ = {A, B, C, D, E}


Hence, "AD" and "AE" are the two candidate keys of the given relation “R”. Any
other set of attributes whose closure covers R is a superset of one of these two,
so its additional attributes are extraneous.

NOTE : Any relation “R” can have either single or multiple candidate keys.
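The search for candidate keys in Example-2 can be reproduced with a brute-force sketch: try attribute subsets in order of increasing size and keep those whose closure is the whole relation. This is only an illustration in Python, assuming single-letter attribute names written as strings; the function names are hypothetical, and the approach is exponential in the number of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Brute force: smallest subsets first, so every key kept is minimal."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            # A superset of a key already found is a superkey, not a candidate key.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]


# Example-2: R(A, B, C, D, E) with A -> BC, C -> B, D -> E, E -> D.
fds = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]
print(candidate_keys("ABCDE", fds))  # ['AD', 'AE']
```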

Closure Of Functional Dependency : Key Definitions

1. Prime Attributes : Attributes which are part of at least one candidate
key. For example : “A, D, E” are prime attributes in Example-2 above.
2. Non-Prime Attributes : Attributes other than prime attributes, which do
not take part in the formation of any candidate key. For example : “B, C” are
non-prime attributes in Example-2 above.
3. Extraneous Attributes : Attributes whose addition to or removal from a
candidate key has no effect on its closure.

For example : Consider the relation R(A, B, C, D) with functional dependencies :

FD1 : A -> BC

FD2 : B -> C

FD3 : D -> C

Here, Candidate key can be “AD” only. Hence,

Prime Attributes : A, D.

Non-Prime Attributes : B, C

Extraneous Attributes : B, C (if we add either of them to the candidate key, its
closure remains unaffected). In general, these are the attributes which, if
removed from a set, do not affect the closure of that set.

Closure of an attribute set X is the set of all attributes that are functionally
dependent on X with respect to F. It is denoted by X+, which means what X
can determine.

Algorithm
Let’s see the algorithm to compute X+

• Step 1 − X+ =X
• Step 2 − repeat until X+ does not change
o For each FD Y->Z in F
▪ If Y ⊆ X+ then X+ = X+ U Z
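The algorithm above translates almost line for line into code. The following Python sketch is illustrative only; the function name `closure` and the representation of each FD as a pair of attribute sets are assumptions made for the example. It is run on the student_details dependencies from earlier in this section.

```python
def closure(attrs, fds):
    """Return X+ for the attribute set attrs under fds.

    fds is a list of (Y, Z) pairs, each side a set of attribute names,
    standing for the functional dependency Y -> Z.
    """
    result = set(attrs)              # Step 1: X+ = X
    changed = True
    while changed:                   # Step 2: repeat until X+ does not change
        changed = False
        for lhs, rhs in fds:
            # If Y is contained in X+, add Z to X+.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# FD1: Roll_No -> Name, Marks   FD2: Name -> Marks, Location
fds = [
    ({"Roll_No"}, {"Name", "Marks"}),
    ({"Name"}, {"Marks", "Location"}),
]
print(sorted(closure({"Roll_No"}, fds)))  # ['Location', 'Marks', 'Name', 'Roll_No']
print(sorted(closure({"Marks"}, fds)))    # ['Marks']
```

The loop stops as soon as a full pass over F adds nothing, which is the "until X+ does not change" condition.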

Example 1
Consider a relation R(A,B,C,D,E,F)

F: E->A, E->D, A->C, A->D, AE->F, AG->K.

Find the closure of E, i.e. E+.

Solution
The closure of E or E+ is as follows −

E+ = E
=EA {for E->A add A}
=EAD {for E->D add D}
=EADC {for A->C add C}
=EADC {for A->D, D is already added}
=EADCF {for AE->F add F}
=EADCF {for AG->K do not add K, since AG ⊄ E+}

Example 2
Let the relation R(A,B,C,D,E,F)

F: B->C, BC->AD, D->E, CF->B. Find the closure of B.

Solution
The closure for B is as follows −

B+ = {B,C,A,D,E}

Closure is used to find the candidate keys of R and to compute F+.

Candidate key of R: X is a candidate key of R if X->{R}, i.e. X+ contains all the
attributes of R, and no proper subset of X has this property.

For example,

R(A,B,C,D,E,F) where F: A->BC, B->D, C->DE, BC->F. Then, find the
candidate keys of R.

Solution
A+= {A,B,C,D,E,F}={R}=>A is a candidate key

B+= {B,D} => B is not a candidate key

C+= {C,D,E} => C is not a candidate key


BC+= {B,C,D,E,F} => BC is not a candidate key

Closure of F (F+): F+ is the set of all FDs that can be inferred/ derived from
F. Using Armstrong Axioms repeatedly on F, we can compute all the FDs.

Example
R(A,B,C,D,E) AND F: A->B,B->C, C->D, A->E. Find the closure of F

Solution
A+= {A,B,C,D,E}

B+= {B,C,D}

C+= {C,D}

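F+ includes, among others, {A->A, A->B, A->C, A->D, A->E, B->B, B->C, B->D, C->C, C->D}.
The complete closure also contains the trivial FDs D->D and E->E and every FD that
follows for composite left-hand sides, such as AB->C.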

GATE Question: Consider the relation scheme R = {E, F, G, H, I, J, K, L, M,
N} and the set of functional dependencies {{E, F} -> {G}, {F} -> {I, J}, {E,
H} -> {K, L}, {K} -> {M}, {L} -> {N}} on R. What is the key for R? (GATE-CS-2014)
A. {E, F}
B. {E, F, H}
C. {E, F, H, K, L}
D. {E}
Answer: Finding attribute closure of all given options, we get:
{E,F}+ = {EFGIJ}
{E,F,H}+ = {EFHGIJKLMN}
{E,F,H,K,L}+ = {EFHGIJKLMN}
{E}+ = {E}
{EFH}+ and {EFHKL}+ both result in the set of all attributes, but EFH is minimal. So
it is the candidate key, and the correct option is (B).

How to check whether an FD can be derived from a given FD set?

To check whether an FD A->B can be derived from an FD set F,

1. Find (A)+ using FD set F.
2. If B is subset of (A)+, then A->B is true else not true.
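The two-step test above can be sketched in a few lines of Python. This is an illustration, assuming single-letter attribute names written as strings; the function names are hypothetical. It uses the FD set F: A->B, B->C, C->D, A->E from the closure-of-F example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds, i.e. rhs is a subset of (lhs)+."""
    return set(rhs) <= closure(lhs, fds)


fds = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
print(implies(fds, "A", "D"))  # True:  (A)+ = {A, B, C, D, E}
print(implies(fds, "B", "E"))  # False: (B)+ = {B, C, D}
```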

GATE Question: In a schema with attributes A, B, C, D and E following
set of functional dependencies are given
{A -> B, A -> C, CD -> E, B -> D, E -> A}
Which of the following functional dependencies is NOT implied by the
above set? (GATE IT 2005)
A. CD -> AC
B. BD -> CD
C. BC -> CD
D. AC -> BC
Answer: Using FD set given in question,
(CD)+ = {CDEAB} which means CD -> AC also holds true.
(BD)+ = {BD} which means BD -> CD can’t hold true. So this FD is not
implied by the FD set. So (B) is the required option.
Others can be checked in the same way.

Prime and non-prime attributes


Attributes which are part of any candidate key of a relation are called
prime attributes; the others are non-prime attributes. For example, STUD_NO in
the STUDENT relation is a prime attribute, and the others are non-prime attributes.
GATE Question: Consider a relation scheme R = (A, B, C, D, E, H) on
which the following functional dependencies hold: {A–>B, BC–> D, E–>C,
D–>A}. What are the candidate keys of R? [GATE 2005]
(a) AE, BE
(b) AE, BE, DE
(c) AEH, BEH, BCH
(d) AEH, BEH, DEH
Answer: (AE)+ = {ABECD} which is not set of all attributes. So AE is not a
candidate key. Hence option A and B are wrong.
(AEH)+ = {ABCDEH}
(BEH)+ = {BEHCDA}
(BCH)+ = {BCHDA} which is not set of all attributes. So BCH is not a
candidate key. Hence option C is wrong.
(DEH)+ = {DEHABC}
So correct answer is D.
Equivalence of Two Sets of Functional Dependencies-

In DBMS,
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then following 3 cases are
possible-

Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)

Case-01: Determining Whether F Covers G-

Following steps are followed to determine whether F covers G or not-

Step-01:

• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional
dependencies of set G.

Step-02:

• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional
dependencies of set F.

Step-03:

• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set F determine all those attributes that were
determined by the functional dependencies of set G, then it means F covers G.
• Thus, we conclude F covers G (F ⊇ G), otherwise not.
Case-02: Determining Whether G Covers F-

Following steps are followed to determine whether G covers F or not-

Step-01:

• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional
dependencies of set F.

Step-02:

• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional
dependencies of set G.

Step-03:

• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set G determine all those attributes that were
determined by the functional dependencies of set F, then it means G covers F.
• Thus, we conclude G covers F (G ⊇ F), otherwise not.

Case-03: Determining Whether Both F and G Cover Each Other-

• If F covers G and G covers F, then both F and G cover each other.
• Thus, if both the above cases hold true, we conclude both F and G cover each other (F
= G).

PRACTICE PROBLEM BASED ON EQUIVALENCE OF FUNCTIONAL


DEPENDENCIES-

Problem-

A relation R (A , C , D , E , H) is having two functional dependencies sets F and G as shown-


Set F-
A→C
AC → D
E → AD
E→H

Set G-
A → CD
E → AH

Which of the following holds true?


(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above

Solution-

Determining whether F covers G-

Step-01:

• (A)+ = { A , C , D } // closure of left side of A → CD using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set G

Step-02:

• (A)+ = { A , C , D } // closure of left side of A → CD using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set F

Step-03:

Comparing the results of Step-01 and Step-02, we find-


• Functional dependencies of set F can determine all the attributes which have been
determined by the functional dependencies of set G.
• Thus, we conclude F covers G i.e. F ⊇ G.

Determining whether G covers F-

Step-01:

• (A)+ = { A , C , D } // closure of left side of A → C using set F
• (AC)+ = { A , C , D } // closure of left side of AC → D using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set F

Step-02:

• (A)+ = { A , C , D } // closure of left side of A → C using set G
• (AC)+ = { A , C , D } // closure of left side of AC → D using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G

Step-03:

Comparing the results of Step-01 and Step-02, we find-


• Functional dependencies of set G can determine all the attributes which have been
determined by the functional dependencies of set F.
• Thus, we conclude G covers F i.e. G ⊇ F.

Determining whether both F and G cover each other-

• From Step-01, we conclude F covers G.


• From Step-02, we conclude G covers F.
• Thus, we conclude both F and G cover each other i.e. F = G.

Thus, Option (D) is correct.
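The comparison worked through above can be cross-checked with a short Python sketch. It is an illustration only, assuming single-letter attribute names written as strings; the function names `covers` and `equivalent` are hypothetical. F covers G when every FD of G follows from F, which is tested with the closure of each left-hand side.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD X -> Y in g satisfies Y being a subset of X+ under f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each one covers the other."""
    return covers(f, g) and covers(g, f)


# Sets F and G from the practice problem on R(A, C, D, E, H).
F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(covers(F, G), covers(G, F), equivalent(F, G))  # True True True
```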


Sample Questions
Q.1 Let us take an example to show the relationship between two
FD sets. A relation R(A,B,C,D) having two FD sets FD1 = {A->B, B->C,
AB->D} and FD2 = {A->B, B->C, A->C, A->D}

Step 1: Checking whether all FDs of FD1 are present in FD2


• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• AB->D is present in set FD1 but not directly in FD2 but we will check whether
we can derive it or not. For set FD2, (AB)+ = {A, B, C, D}. It means that AB can
functionally determine A, B, C, and D. So AB->D will also hold in set FD2.
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Step 2: Checking whether all FDs of FD2 are present in FD1
• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->C is present in FD2 but not directly in FD1 but we will check whether we
can derive it or not. For set FD1, (A)+ = {A, B, C, D}. It means that A can
functionally determine A, B, C, and D. SO A->C will also hold in set FD1.
• A->D is present in FD2 but not directly in FD1 but we will check whether we
can derive it or not. For set FD1, (A)+ = {A, B, C, D}. It means that A can
functionally determine A, B, C, and D. SO A->D will also hold in set FD1.
As all FDs in set FD2 also hold in set FD1, FD1 ⊃ FD2 is true.
Step 3: As FD2 ⊃ FD1 and FD1 ⊃ FD2 both are true FD2 =FD1 is true. These two FD sets
are semantically equivalent.
Q.2 Let us take another example to show the relationship between two FD sets. A
relation R2(A,B,C,D) having two FD sets FD1 = {A->B, B->C,A->C} and FD2 = {A->B,
B->C, A->D}
Step 1: Checking whether all FDs of FD1 are present in FD2
• A->B in set FD1 is present in set FD2.
• B->C in set FD1 is also present in set FD2.
• A->C is present in FD1 but not directly in FD2 but we will check whether we
can derive it or not. For set FD2, (A)+ = {A, B, C, D}. It means that A can
functionally determine A, B, C, and D. SO A->C will also hold in set FD2.
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Step 2: Checking whether all FDs of FD2 are present in FD1
• A->B in set FD2 is present in set FD1.
• B->C in set FD2 is also present in set FD1.
• A->D is present in FD2 but not directly in FD1 but we will check whether we
can derive it or not. For set FD1, (A)+ = {A,B,C}. It means that A can’t
functionally determine D.
• So A->D will not hold in FD1.
As not all FDs in set FD2 hold in set FD1, FD1 ⊃ FD2 is false (FD2 ⊄ FD1).
Step 3: In this case, FD2 ⊃ FD1 holds but FD1 ⊃ FD2 does not, so these two FD sets are not
semantically equivalent.
Find minimal cover of set of functional dependencies Exercise

1. Find the minimal cover of the set of functional dependencies given: {A → C, AB
→ C, C → DI, CD → I, EC → AB, EI → C}

Answer: minimal cover Fc = {A → C, C → D, C → I, EC → A, EC → B, EI → C}.


Solution:

Step 1: Decompose RHS of FD


FD= { A → C
AB → C
C→D
C→I
CD → I
EC → A
EC →B
EI → C}

Step 2: Remove the redundant FD from above set


a. Check if A → C is redundant
Find A+ = {ACDI} {Including A→C in FD}
A+={ A} {Excluding A→C from FD}
As we got different closures for A+ with A → C included in FD and with it excluded
from FD, A → C cannot be derived from the remaining dependencies.
A → C is not redundant

b. Check if AB→C is redundant?


AB+={ABCDI} {Including AB→C in FD}
AB+={ABCDI} {Excluding AB→C from FD}

As we have got the same closure when including AB→C and when excluding AB→C, we
can derive AB→C from the given FD without AB→C, so it is a redundant FD. Remove it
from FD.

AB→C is redundant
Therefore updated FD is FD1 = {A → C,C → D ,C → I ,CD → I, EC → A, EC →B,
EI → C}
Now for below calculation use FD1

c. Check if C→D is redundant ?


Find C+= {CDI} {Including C→D in FD1}
C+={CI} {Excluding C→D from FD1}

C→D is not redundant

d. Check if C→I is redundant?


Find C+ ={CDI} {Including C→I in FD1}
C+={CDI} { Excluding C→I from FD1}
As we got the same set of attributes for C+ even when we excluded C→I from FD1,
we can infer C→I from the remaining dependencies of FD1, so it can be removed from FD1 as it is
redundant.
C→I is redundant.
So updated FD is FD2={ A → C,C → D ,CD → I, EC → A, EC →B, EI → C}
Now for below calculation use FD2
e. Check if EC→A is redundant?
Find EC+= {ECABDI} {Including EC→A in FD2}
EC+={ECBDI} {Excluding EC→A in FD2}
EC→A is Not redundant

f. Check if EC→B is redundant ?

Find EC+= {ECBADI} {Including EC→B in FD2}


EC+={ECADI} {Excluding EC→B in FD2}
EC→B is Not redundant

g. Check if CD→I is redundant?

Find CD+={CDI} {Including CD→I in FD2}


CD+={CD} {Excluding CD→I in FD2}
CD→I is not redundant.
h. Check if EI→C is redundant?

Find EI+={EICDAB} {Including EI→C in FD2}

EI+= {EI} {Excluding EI→C in FD2}

EI→C is not redundant .

Therefore FD is FD2={ A → C,C → D ,CD → I, EC → A, EC →B, EI → C}

Step 3: Check for extraneous attribute

Here consider the FD where we have more than one attribute on LHS

Following FD has more than one attribute on LHS

CD → I, EC → A, EC →B, EI → C

Now we will find extraneous attribute :

a. CD→I

Find closure for CD+, C+ and D+ from FD2. check if any closure has got same
attributes, then only that attribute alone can be sufficient for FD

CD+ = {CDI}
C+= {CDI}
D+= {D}

In above result the C+ has same set of attributes as CD+ so C attribute alone can
determine the I. hence D is an extraneous attribute.

Consider FD as C→I.

Therefore, updated FD is FD3= {A → C, C → D, C → I, EC → A, EC →B, EI
→ C}

b. EC→A

Find closure for EC+, E+ and C+ using FD3

EC+={ECABID}
E+={E}
C+={CDI}
In above closure we didn’t get closure of single attribute similar to EC so no
extraneous attribute.

c. EC→B
Find closure of EC+, E+ and C+ using FD3
EC+={ECBAID}
E+={E}
C+={CDI}

In above closure we didn’t get closure of single attribute similar to EC so no
extraneous attribute
Therefore, updated FD is FD3= {A → C, C → D, C → I, EC → A, EC →B, EI
→ C}

d. EI→C
Find closure of EI+, E+ and I+
EI+={EICDAB}
E+={E}
I+={I}
In above closure we didn’t get closure of single attribute similar to EI so no
extraneous attribute

Therefore, minimal set of FD is


{A → C, C → D, C → I, EC → A, EC →B, EI → C}
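The result above can be cross-checked with a short Python sketch of a minimal-cover computation. Note that this sketch follows the common textbook ordering (split right-hand sides, then remove extraneous left-hand attributes, then remove redundant FDs), which differs from the order of steps used in the worked solution; for this exercise both give the same answer. Single-letter attribute names written as strings and the function names are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split right-hand sides so each FD has one attribute on the RHS.
    work = []
    for lhs, rhs in fds:
        for attr in rhs:
            fd = (frozenset(lhs), frozenset(attr))
            if fd not in work:
                work.append(fd)

    # Step 2: drop a left-hand attribute when the rest of the LHS still
    # determines the RHS under the current set of FDs.
    for i in range(len(work)):
        lhs, rhs = work[i]
        for extra in sorted(lhs):
            reduced = lhs - {extra}
            if reduced and rhs <= closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, rhs)

    # Left reduction can create duplicates (e.g. AB -> C becomes A -> C).
    unique = []
    for fd in work:
        if fd not in unique:
            unique.append(fd)

    # Step 3: remove every FD that is implied by the remaining ones.
    result = list(unique)
    for fd in unique:
        rest = [other for other in result if other != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


fds = [("A", "C"), ("AB", "C"), ("C", "DI"), ("CD", "I"), ("EC", "AB"), ("EI", "C")]
cover = minimal_cover(fds)
print(["".join(sorted(l)) + "->" + "".join(sorted(r)) for l, r in cover])
# ['A->C', 'C->D', 'C->I', 'CE->A', 'CE->B', 'EI->C']
```

A minimal cover is not unique in general, so a different processing order may yield a different but equivalent set for other inputs.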

Solve the below problems using above method.

2. Consider another set F of functional dependencies:
F={
A → BC
CD → E
B → D
E → A
}
3. Given a relational schema R(A, B, C, D) and set of functional dependencies FD = { B
→ A, AD → BC, C → ABD }. Find the canonical cover?

FD = { B → A, AD → C, C → BD } is the canonical cover of FD = { B → A, AD → BC, C
→ ABD }
