Unit 3 Slides
Functional Dependencies
• Each relation schema consists of a number of attributes, and the relational database schema
consists of a number of relation schemas.
• We have assumed that attributes are grouped to form a relation schema by using the common
sense of the database designer or by mapping a database schema design from a conceptual data
model such as the ER data model.
• We have not yet developed any measure of appropriateness or goodness to assess the quality of
the design, other than the intuition of the designer.
• There are two levels at which we can discuss the goodness of relation schemas. The first is the
logical (or conceptual) level—how users interpret the relation schemas and the meaning of their
attributes. The second is the implementation (or physical storage) level—how the tuples in a
base relation are stored and updated.
Database Management Systems
Functional Dependencies
• A bottom-up design methodology (also called design by synthesis) considers the basic
relationships among individual attributes as the starting point and uses those to construct relation
schemas. This approach is not very popular because it suffers from the problem of having to collect
a large number of binary relationships among attributes as the starting point. For practical
situations, it is next to impossible to capture binary relationships among all such pairs of attributes.
• In contrast, a top-down design methodology (also called design by analysis) starts with a number
of groupings of attributes into relations that exist together naturally, for example, on an invoice, a
form, or a report. The relations are then analyzed individually and collectively, leading to further
decomposition until all desirable properties are met.
Functional Dependencies
• The theory is applicable primarily to the top-down design approach, and as such is more
appropriate when performing design of databases by analysis and decomposition of sets of
attributes that appear together in files, in reports, and on forms in real-life situations.
• Relational database design ultimately produces a set of relations. The implicit goals of the design
activity are information preservation and minimum redundancy.
• A functional dependency is a constraint between two sets of attributes from the database.
• Suppose that our relational database schema has n attributes A1, A2, … , An; let us think of the
whole database as being described by a single universal relation schema R = {A1, A2, … , An}.
• We do not imply that we will actually store the database as a single universal table.
• We use this concept only in developing the formal theory of data dependencies.
Functional Dependencies
■ Some of the most commonly used types of real-world constraints can be represented
formally as keys (super keys, candidate keys, and primary keys), or as functional
dependencies
■ Functional dependencies (FDs)
■ Are constraints that are derived from the meaning and interrelationships of the data attributes
■ Are used to specify formal measures of the "goodness" of relational designs
■ Together with keys, are used to define normal forms for relations
■ A functional dependency (FD) is a relationship between two sets of attributes in which the value of
one set determines the value of the other.
■ A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique
value for Y
Defining Functional Dependencies
■ A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are
subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The
constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y]
= t2[Y].
➔ The values of the X component of a tuple uniquely (or functionally) determine the values of the
Y component. We also say that there is a functional dependency from X to Y, or that Y is
functionally dependent on X.
➔X functionally determines Y in a relation schema R if, and only if, whenever two tuples of r(R) agree
on their X-value, they must necessarily agree on their Y-value.
Note the following:
• If a constraint on R states that there cannot be more than one tuple with a given X-value in any
relation instance r(R)—that is, X is a candidate key of R—this implies that X → Y for any subset
of attributes Y of R (because the key constraint implies that no two tuples in any legal state r(R)
will have the same value of X). If X is a candidate key of R, then X → R.
• If X → Y in R, this does not say whether or not Y → X in R
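The definition above can be checked mechanically against a given relation state. The following is a minimal Python sketch; the function name `satisfies_fd` and the sample tuples are hypothetical and invented only for illustration. Note that a single state can only refute an FD. It cannot prove that the FD holds in every legal state of R.

```python
from itertools import combinations


def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of tuples that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[a] != t2[a] for a in rhs)
        if agree_on_lhs and differ_on_rhs:
            return False
    return True


# Hypothetical relation state (made-up values, not taken from the slides).
r = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 10, "Ename": "Ann"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 5, "Ename": "Ann"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 10, "Ename": "Bob"},
]

print(satisfies_fd(r, ["Ssn"], ["Ename"]))             # Ssn -> Ename
print(satisfies_fd(r, ["Ssn"], ["Hours"]))             # Ssn -> Hours
print(satisfies_fd(r, ["Ssn", "Pnumber"], ["Hours"]))  # {Ssn, Pnumber} -> Hours
```

In this sample state, Ssn → Hours is violated by the first two tuples, while the other two dependencies are not contradicted.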
Use of Functional Dependencies
To Summarize
■ X → Y in R specifies a constraint on all relation instances r(R)
■ Written as X → Y; can be displayed graphically on a relation schema
■ FDs are derived from the real-world constraints on the attributes
■ A functional dependency is a property of the semantics or meaning of the attributes.
■ Relation extensions r(R) that satisfy the functional dependency constraints are called legal relation
states (or legal extensions) of R.
■ For example, {State, Driver_license_number} → Ssn should normally hold for any adult in the
United States and hence should hold whenever these attributes appear in a relation
Notational Conventions
For simplicity, we assume that attribute names have only one meaning within
the database schema.
Defining Functional Dependencies
■ A functional dependency X → Y is a full functional dependency if removing any attribute A from
X means that the dependency no longer holds; that is, no proper subset of X functionally
determines Y.
Example: Consider the dependency ABC → D,
i.e., ABC determines D.
If none of the proper subsets of ABC (for example BC, C, or A) determines D,
then D is fully functionally dependent on ABC.
■ {Ssn, Pnumber} → Hours is a full dependency (neither Ssn → Hours nor Pnumber →
Hours holds).
Types of Functional Dependency
■ Multi-Valued Dependency:
Consider three fields X, Y, and Z in a relation. If each value of X is associated with a well-defined
set of values of Y and a well-defined set of values of Z, and the set of Y values is independent of
the set of Z values, then the dependency is a multi-valued dependency, written X →→ Y | Z.
Example: CAR_MODEL →→ MANUF_MONTH and CAR_MODEL →→ COLOUR
Ruling Out FDs
Consider the following TEACH relation.
From this relation state, we can say that the FD Text → Course may exist.
However, the FDs Teacher → Course, Teacher → Text, and Course → Text are ruled out.
(This is because some tuples agree on the LHS attribute but have different values for the RHS
attribute.)
What FDs may exist?
• We will discuss the rules of inference for functional dependencies and use them to
define the concepts of a cover, equivalence, and minimal cover among functional
dependencies.
• First, we describe two desirable properties of decompositions, namely the
dependency preservation property and the nonadditive (or lossless) join property, which
are both used by the design algorithms to achieve desirable decompositions.
• We now turn our attention to the process of decomposition that we will use to get rid of
unwanted dependencies and achieve higher normal forms.
Decomposition
• We use Decomposition to get rid of unwanted dependencies and achieve higher normal
forms.
• The only way to avoid the repetition-of-information problem in a schema is to
decompose it into two schemas. One must be careful while doing this as this could lead to
the loss of interesting relationships.
• e.g:
employee (ID, name, street, city, salary)
INTO
employee1 (ID, name) + employee2 (name, street, city, salary)
The flaw in this decomposition arises from the possibility that the enterprise has two
employees with the same name.
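The flaw can be made concrete with a small experiment. The sketch below is illustrative only: the helper names `project` and `natural_join` and the two sample employees are assumptions, not part of the slides. It projects a relation onto employee1 and employee2 and joins the projections back.

```python
def project(rows, attrs):
    """Projection with duplicate elimination."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in seen]


def natural_join(r1, r2):
    """Join two relations on all attributes they have in common."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                joined.append({**t1, **t2})
    return joined


# Hypothetical state: two different employees who share the same name.
employee = [
    {"ID": 1, "name": "Kim", "street": "Main", "city": "Pune", "salary": 50},
    {"ID": 2, "name": "Kim", "street": "North", "city": "Delhi", "salary": 70},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)

print(len(employee), len(rejoined))  # the join has more tuples than the original
```

The join returns the two original tuples plus two spurious ones, so we can no longer tell which address and salary belong to which ID.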
Loss of information via a bad decomposition (Lossy decomposition)
• Let R be a relation schema and let R1 and R2 form a decomposition of R—that is,
viewing R, R1, and R2 as sets of attributes, R = R1 ∪ R2.
• We say that the decomposition is a lossless decomposition if there is no loss of
information by replacing R with two relation schemas R1 and R2.
• Essentially, the following SQL query must return exactly r:
select * from (select R1 from r) natural join (select R2 from r)
• Let R, R1, R2, and F be as above. R1 and R2 form a lossless decomposition of R if at
least one of the following functional dependencies is in F+
1. R1 ∩ R2 → R1
2. R1 ∩ R2 → R2
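The test above can be sketched in Python by computing the attribute closure of R1 ∩ R2. This is a minimal sketch: the function names are hypothetical, and the FD ID → {name, street, city, salary} is an assumption (the slides do not state F for the employee schema).

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 follows from fds."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


# Assumed FD set: ID is the key of employee.
F = [({"ID"}, {"name", "street", "city", "salary"})]

bad = ({"ID", "name"}, {"name", "street", "city", "salary"})
good = ({"ID", "name"}, {"ID", "street", "city", "salary"})

print(is_lossless_binary(*bad, F))   # common attribute is name
print(is_lossless_binary(*good, F))  # common attribute is ID
```

Under the assumed F, the split on name fails the test, while a split that keeps ID in both schemas passes it.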
Lossless Decomposition
Then the following SQL constraints must be imposed on the decomposed schema to ensure their
contents are consistent with the original schema.
1. R1 ∩ R2 is the primary key of r1. This constraint enforces the functional dependency.
2. R1 ∩ R2 is a foreign key from r2 referencing r1. This constraint ensures that each tuple in r2
has a matching tuple in r1, without which it would not appear in the natural join of r1 and r2.
■ IR1, IR2, and IR3 form a sound and complete set of inference rules
■ These rules hold, and all other rules that hold can be deduced from them
NOTE: By applying these rules repeatedly, we can find all of F+, given F. This collection of
rules is called Armstrong’s axioms (sound and complete).
Inference Rules for FDs (3)
■ The last three inference rules, as well as any other inference rules, can be deduced from
IR1, IR2, and IR3 (completeness property)
■ One important cautionary note regarding the use of these rules: Although X → A and X →
B implies X → AB by the union rule stated above, X → A and Y → B does imply that XY
→ AB. Also, XY → A does not necessarily imply either X → A or Y → A.
Example 1
1. AB ->E Given
2. E ->G Given
3. BE->I Given
4. GI ->H Given
5. AB ->G Transitivity on (1) and (2)
6. AB -> BE Augmentation (1) by B
7. AB -> I Transitivity on (6) and (3)
8. AB -> GI Union on (5) and (7)
9. AB -> H Transitivity on (8) and (4)
10. AB -> GH Union on (5) and (9)
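The derivation above can be cross-checked by computing the closure of AB under the four given FDs: AB → GH follows exactly when G and H are in that closure. A minimal sketch, assuming single-letter attribute names so that a string such as "AB" can stand for the set {A, B}; the function name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure; attrs and FD sides are strings of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The given FDs (steps 1 to 4 of the derivation).
F = [("AB", "E"), ("E", "G"), ("BE", "I"), ("GI", "H")]

ab_closure = closure("AB", F)
print(sorted(ab_closure))       # all attributes determined by AB
print(set("GH") <= ab_closure)  # True means AB -> GH follows from F
```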
■ X → Y holds if whenever two tuples have the same value for X, they must have
the same value for Y
■ For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then
t1[Y]=t2[Y]
In real life, it is impossible to specify all possible functional dependencies for a given situation.
For example, if each department has one manager, so that Dept_no uniquely determines
Mgr_ssn (Dept_no → Mgr_ssn), and a manager has a unique phone number called Mgr_phone
(Mgr_ssn → Mgr_phone), then these two dependencies together imply that Dept_no →
Mgr_phone.
This is an inferred or implied FD and need not be explicitly stated in addition to the two given
FDs.
Therefore, it is useful to define a concept called closure formally that includes all possible
dependencies that can be inferred from the given set F
Closure
• Suppose we are given a relation schema r(A, B, C, G, H, I) and the set of functional
dependencies: A → B, A → C, CG → H, CG → I, B → H
• The functional dependency A → H is logically implied. That is, we can show that, whenever a
relation instance satisfies our given set of functional dependencies, A → H must also be
satisfied by that relation instance.
• Suppose that t1 and t2 are tuples such that: t1[A] = t2[A] Since we are given that A → B, it
follows from the definition of functional dependency that: t1[B] = t2[B] Then, since we are
given that B → H, it follows from the definition of functional dependency that: t1[H] = t2[H]
• Therefore, we have shown that, whenever t1 and t2 are tuples such that t1[A] = t2[A], it must
be that t1[H] = t2[H]. But that is exactly the definition of A → H.
Closure
Let F be a set of functional dependencies. The closure of F, denoted by F+, is the set of all
functional dependencies logically implied by F. Given F, we can compute F+ directly from
the formal definition of functional dependency.
NOTE:
Since a set of size n has 2^n subsets, there are a
total of 2^n × 2^n = 2^(2n)
possible functional dependencies, where n is the
number of attributes in R
Closure
• Example:
F = {Ssn → {Ename, Bdate, Address, Dnumber}, Dnumber → {Dname, Dmgr_ssn} }
• Some of the additional functional dependencies that we can infer from F are the following:
Ssn → {Dname, Dmgr_ssn}, Ssn → Ssn, Dnumber → Dname
• The closure F+ of F is the set of all functional dependencies that can be inferred from F.
• To determine a systematic way to infer dependencies, we need to use all the inference rules
learnt in the previous lecture.
Closure
• Let α be a set of attributes. We call the set of all attributes functionally determined by α
under a set F of functional dependencies the closure of α under F; we denote it by α+.
• One way of computing α+ is to compute F+, take all functional dependencies with α as the
left-hand side, and take the union of the right-hand sides of all such dependencies.
• Definition. For each set of attributes X that appears as a left-hand side in F, we determine the
set X+ of attributes that are functionally determined by X based on F; X+ is called the closure of X under F.
Algorithm to determine Closure of X under F
For example, consider the following relation schema about classes held at a university in a
given academic year.
CLASS ( Classid, Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity).
Let F, the set of functional dependencies for the above relation include the following f.d.s:
FD1: Classid → Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity;
FD2: Course# → Credit_hrs;
FD3: {Course#, Instr_name} → Text, Classroom;
FD4: Text → Publisher
FD5: Classroom → Capacity
Note that the above FDs express certain semantics about the data in the relation CLASS.
Find the closure for 1. Classid 2. Course# 3. Course#, Instr_name
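The closure of an attribute set can be computed by repeatedly adding the right-hand side of every FD whose left-hand side is already contained in the result, until nothing changes. The sketch below applies this fixpoint loop to FD1 through FD5; the function name `closure` and the representation of FDs as pairs of Python sets are illustrative choices, not taken from the slides.

```python
def closure(attrs, fds):
    """Return X+ for the attribute set attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply X -> Y whenever X is already inside the result.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


FDS = [
    ({"Classid"}, {"Course#", "Instr_name", "Credit_hrs", "Text",
                   "Publisher", "Classroom", "Capacity"}),   # FD1
    ({"Course#"}, {"Credit_hrs"}),                           # FD2
    ({"Course#", "Instr_name"}, {"Text", "Classroom"}),      # FD3
    ({"Text"}, {"Publisher"}),                               # FD4
    ({"Classroom"}, {"Capacity"}),                           # FD5
]

for x in ({"Classid"}, {"Course#"}, {"Course#", "Instr_name"}):
    print(sorted(x), "->", sorted(closure(x, FDS)))
```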
Closure of Attribute Sets
For example, FD1 states that each class has a unique Classid.
FD3 states that when a given course is offered by a certain instructor, the text is fixed and the
instructor teaches that class in a fixed room.
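Using the inference rules about the FDs and applying the definition of closure, we obtain:
{Classid}+ = {Classid, Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity} = CLASS
{Course#}+ = {Course#, Credit_hrs}
{Course#, Instr_name}+ = {Course#, Instr_name, Credit_hrs, Text, Publisher, Classroom, Capacity}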
• Note that each closure above has an interpretation that is revealing about the attribute(s)
on the left-hand side.
• For example, the closure of Course# has only Credit_hrs besides itself. It does not
include Instr_name because different instructors could teach the same course; it does not
include Text because different instructors may use different texts for the same course.
• Note also that the closure of {Course#, Instr_name} does not include Classid, which
implies that it is not a candidate key.
• This further implies that a course with given Course# could be offered by different
instructors, which would make the courses distinct classes.
Closure of Attribute Sets
A → BC
BC → DE
D→F
CF → G
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Closure of Attribute Sets
{ B , C }+ = { B , C }
={B,C,D,E} ( Using BC → DE )
={B,C,D,E,F} ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Closure of Attribute Sets
Problem-
Solution-
Option-(A):
{ CF }+ = { C , F }
= { C , F , G } ( Using C → G )
= { C , E , F , G } ( Using F → E )
= { A , C , E , F , G } ( Using G → A )
= { A , C , D , E , F , G } ( Using AF → D )
Since, our obtained result set is same as the given result set, so, it means it is correctly given.
Closure of Attribute Sets
Option-(B):
{ BG }+ = { B , G }
={A,B,G} ( Using G → A )
={A,B,C,D,G} ( Using AB → CD )
Since, our obtained result set is same as the given result set, so, it means it is correctly given.
Closure of Attribute Sets
Option-(C):
{ AF }+ = { A , F }
={A,D,F} ( Using AF → D )
={A,D,E,F} ( Using F → E )
Since, our obtained result set is different from the given result set, so,it means it is not correctly
given.
Closure of Attribute Sets
Option-(D):
{ AB }+ = { A , B }
={A,B,C,D} ( Using AB → CD )
={A,B,C,D,G} ( Using C → G )
Since, our obtained result set is different from the given result set, so,it means it is not
correctly given.
Thus,
the closures given in Option (C) and Option (D) are the ones that are not correct.
Equivalence of Two Sets of Functional Dependencies-
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then the following 3 cases are
possible-
Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)
Problem-
Show that the following two sets of FDs are equivalent:
F = {A → C, AC → D, E → AD, E → H} and
G = {A → CD, E → AH}
Which of the following holds true?
(A) G ⊇ F
(B) F ⊇ G
(C) F = G
(D) All of the above
(The solution is discussed after the algorithm steps.)
Equivalence of Sets of FDs
Step-01:
• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.
Step-02:
• Take the functional dependencies of set G into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.
Step-03:
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set F determine all the attributes that were determined by the
functional dependencies of set G, then F covers G.
• In that case we conclude F covers G (F ⊇ G); otherwise it does not.
Equivalence of Sets of FDs
Step-01:
• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set F.
Step-02:
• Take the functional dependencies of set F into consideration.
• For each functional dependency X → Y, find the closure of X using the functional dependencies of set G.
Step-03:
• Compare the results of Step-01 and Step-02.
• If the functional dependencies of set G determine all the attributes that were determined by the
functional dependencies of set F, then G covers F.
• In that case we conclude G covers F (G ⊇ F); otherwise it does not.
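The cover test can be sketched with attribute closures. The version below uses a common formulation: F covers G when, for every FD X → Y in G, Y is contained in the closure of X computed under F. The function names are illustrative, single-letter attributes are assumed, and F and G are the two sets from the problem stated earlier.

```python
def closure(attrs, fds):
    """Attribute closure; attrs and FD sides are strings of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD of g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(covers(F, G), covers(G, F), equivalent(F, G))
```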
Equivalence of Sets of FDs
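Determining whether F covers G-
Step-01:
• (A)+ = { A , C , D } // closure of left side of A → CD using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set G
Step-02: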
• (A)+ = { A , C , D } // closure of left side of A → CD using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AH using set F
Step-03:
Comparing the results of Step-01 and Step-02, we find Functional dependencies of set F can determine
all the attributes which have been determined by the functional dependencies of set G.
• Thus, we conclude F covers G i.e. F ⊇ G.
Equivalence of Sets of FDs : Solution
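Determining whether G covers F-
Step-01:
• (A)+ = { A , C , D } // closure of left side of A → C using set F
• (AC)+ = { A , C , D } // closure of left side of AC → D using set F
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set F
Step-02: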
• (A)+ = { A , C , D } // closure of left side of A → C using set G
• (AC)+ = { A , C , D } // closure of left side of AC → D using set G
• (E)+ = { A , C , D , E , H } // closure of left side of E → AD and E → H using set G
Step-03:
Comparing the results of Step-01 and Step-02, we find-
• Functional dependencies of set G can determine all the attributes which have been determined by the
functional dependencies of set F.
• Thus, we conclude G covers F i.e. G ⊇ F.
Equivalence of Sets of FDs
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Examples of equivalence
Step 3: As FD2 ⊃ FD1 and FD1 ⊃ FD2 both are true FD2 =FD1 is true. These two FD sets
are semantically equivalent.
Examples of equivalence
Q.2 Let us take another example to show the relationship between two FD sets. A
relation R2(A,B,C,D) having two FD sets FD1 = {A->B, B->C,A->C} and FD2 = {A->B,
B->C, A->D}
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.
Examples of equivalence
Step 3: In this case, FD2 ⊃ FD1 holds but FD1 ⊃ FD2 does not, because A → D cannot be derived
from FD1. These two FD sets are therefore not semantically equivalent.
• To check that a lost dependency holds, we must take the JOIN of two or more
relations in the decomposition to get a relation that includes all left- and right-hand-
side attributes of the lost dependency, and then check that the dependency holds on the
result of the JOIN—an option that is not practical.
• An example of a decomposition that does not preserve dependencies is shown in
Figure 14.13(a), in which the functional dependency FD2 is lost when LOTS1A is
decomposed into {LOTS1AX, LOTS1AY}. The decompositions in Figure 14.12,
however, are dependency-preserving.
Dependency preservation
Figure 14.13
Figure 14.12
Dependency preservation
▪ Testing functional dependency constraints each time the database is updated can be
costly
▪ It is useful to design the database in a way that constraints can be tested efficiently.
▪ If testing a functional dependency can be done by considering just one relation, then the
cost of testing this constraint is low
▪ After a relation is decomposed, it may no longer be possible to test a dependency
without computing a join (or Cartesian product) of the decomposed relations.
▪ A decomposition that makes it computationally hard to enforce functional dependency is
said to be NOT dependency preserving
● Let F′ be the union of the restrictions of F to each relation schema Ri of the decomposition.
We say that a decomposition having the property F′+ = F+ is a dependency-
preserving decomposition.
Dependency preservation example
Consider a schema:
dept_advisor(s_ID, i_ID, dept_name)
With functional dependencies:
i_ID → dept_name
s_ID, dept_name → i_ID
In the above design we are forced to repeat the department name once for each time an instructor
participates in a dept_advisor relationship.
To fix this, we need to decompose dept_advisor
No decomposition of dept_advisor will keep all three attributes s_ID, dept_name, and i_ID together
in one relation, so the following FD will not be preserved:
s_ID, dept_name → i_ID
Thus, the decomposition will NOT be dependency preserving.
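Whether a particular FD survives a decomposition can be tested without computing F+ in full. The sketch below uses a well-known textbook procedure: starting from the left-hand side, it repeatedly adds the attributes obtainable inside each Ri, and the FD is preserved if the right-hand side ends up in the result. The function names and the chosen decomposition into (s_ID, i_ID) and (i_ID, dept_name) are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """True if lhs -> rhs can be enforced using only the individual relations."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes of Ri determined by the part of result that lies in Ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


F = [({"i_ID"}, {"dept_name"}), ({"s_ID", "dept_name"}, {"i_ID"})]
D = [{"s_ID", "i_ID"}, {"i_ID", "dept_name"}]  # assumed decomposition

for lhs, rhs in F:
    print(sorted(lhs), "->", sorted(rhs), is_preserved(lhs, rhs, D, F))
```

With this decomposition, i_ID → dept_name is preserved, while s_ID, dept_name → i_ID is not.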
Dependency preservation exercise
1. Between the properties of dependency preservation and losslessness, which one must definitely be
satisfied? Why?
A: Losslessness must definitely be satisfied because it ensures that no information is lost during
decomposition. This guarantees that the original relation can be perfectly reconstructed from the
decomposed relations, maintaining data integrity. Dependency preservation is important but secondary
to losslessness.
Prof. Shilpa S
Department of Computer Science and Engineering
Unit 3
❏ Minimal cover
1. Set F:= E.
2. Replace each functional dependency X → {A1, A2, ..., An} in F by the n functional
dependencies X →A1, X →A2, ..., X → An.
(*Place FDs in a canonical form, Preparatory step*)
❑ Step 1: All the above dependencies are in canonical form (they have only one attribute on the
RHS).
The original set F can be inferred from E; in other words, the two sets F
and E are equivalent.
Computing the Minimal Sets of FDs: Example 2
Let the given set of FDs be G : {A → BCDE, CD → E}. We have to find the
minimal cover of G.
Step 1: The above dependencies are not all in canonical form (A → BCDE has more than one
attribute on the RHS), so we have to convert them into:
A → B, A → C, A → D, A → E, CD → E
■ Step 2: For CD → E, neither C nor D is extraneous on the LHS, since we cannot
derive C → E or D → E from the given FDs. Hence we cannot replace it
with either.
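The whole procedure can be sketched in Python and run on G. The slides show only the first two steps of the algorithm, so the remaining two steps below (removing extraneous left-hand-side attributes, then removing redundant FDs) follow the standard textbook procedure and are an assumption here, as are the function names. Single-letter attributes are assumed.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Canonical form: exactly one attribute on every right-hand side.
    cover = [(frozenset(lhs), frozenset(a)) for lhs, rhs in fds for a in sorted(rhs)]
    # Assumed step: drop extraneous attributes from each left-hand side.
    for i, (lhs, rhs) in enumerate(cover):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            if reduced and rhs <= closure(reduced, cover):
                lhs = reduced
                cover[i] = (lhs, rhs)
    cover = list(dict.fromkeys(cover))  # remove duplicates, keep order
    # Assumed step: drop every FD that the remaining FDs already imply.
    result = list(cover)
    for fd in cover:
        rest = [g for g in result if g != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


G = [("A", "BCDE"), ("CD", "E")]
for lhs, rhs in minimal_cover(G):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For G, the sketch drops A → E, because it follows from A → C, A → D, and CD → E.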
Here we consider the FDs that have more than one attribute on the LHS.
The following FDs have more than one attribute on the LHS:
CD → I, EC → A, EC → B, EI → C
Now we will find the extraneous attributes:
Example 3.
a. CD → I
Find the closures CD+, C+, and D+ using FD2. If the closure of a single attribute equals the closure
of the full left-hand side, that attribute alone is sufficient for the FD.
CD+ = {C, D, I}
C+ = {C, D, I}
D+ = {D}
In the above result, C+ has the same set of attributes as CD+, so C alone can determine I.
Hence D is an extraneous attribute.
Replace the FD with C → I.
b. EC → A
Find the closures EC+, E+, and C+ using FD3.
EC+ = {E, C, A, B, I, D}
E+ = {E}
C+ = {C, D, I}
Neither single-attribute closure equals EC+, so there is no extraneous attribute.
c. EC → B
Find the closures EC+, E+, and C+ using FD3.
EC+ = {E, C, B, A, I, D}
E+ = {E}
C+ = {C, D, I}
Neither single-attribute closure equals EC+, so there is no extraneous attribute.
d. EI → C
Find the closures EI+, E+, and I+.
EI+ = {E, I, C, D, A, B}
E+ = {E}
I+ = {I}
Neither single-attribute closure equals EI+, so there is no extraneous attribute.
We can determine the candidate keys of a given relation using the following steps-
Step-02: The remaining attributes of the relation are non-essential attributes. This
is because they can be determined by using essential attributes.
Case-01: If all essential attributes together can determine all remaining non-essential
attributes, then-
•The combination of essential attributes is the candidate key.
•It is the only possible candidate key.
Case-02: If all essential attributes together can not determine all remaining non-
essential attributes, then-
•The set of essential attributes and some non-essential attributes will be the candidate
key(s).
•In this case, multiple candidate keys are possible.
•To find the candidate keys, we check different combinations of essential and non-
essential attributes.
Practice Problem: Finding Candidate Keys
Also, determine the total number of candidate keys and super keys.
Practice Problem: Finding Candidate Keys
Solution- We will find candidate keys of the given relation in the following steps-
Step-02: Now, We will check if the essential attributes together can determine all
remaining non-essential attributes. To check, we find the closure of CE.
So, we have-
{ CE }+
={C,E}
= { C , E , F } ( Using C → F )
= { A , C , E , F } ( Using E → A )
= { A , C , D , E , F } ( Using EC → D )
= { A , B , C , D , E , F } ( Using A → B )
Practice Problem: Finding Candidate Keys
We conclude that CE can determine all the attributes of the given relation.
So, CE is the only possible candidate key of the relation.
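The two cases can be sketched in Python. The sketch assumes that an essential attribute is one that appears on no right-hand side, and it uses the four FDs that appear in the closure steps above on a relation with attributes A to F (inferred from the final closure, since the problem statement is not reproduced here). All function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    attributes = set(attributes)
    determined = set().union(*(rhs for _, rhs in fds))
    essential = attributes - determined      # assumed meaning of "essential"
    if closure(essential, fds) >= attributes:
        return [essential]                   # Case-01: single candidate key
    optional = sorted(attributes - essential)
    keys = []                                # Case-02: try adding other attributes
    for size in range(1, len(optional) + 1):
        for extra in combinations(optional, size):
            candidate = essential | set(extra)
            if any(key <= candidate for key in keys):
                continue                     # contains a smaller key, not minimal
            if closure(candidate, fds) >= attributes:
                keys.append(candidate)
    return keys


def count_super_keys(attributes, fds):
    """Count all attribute subsets whose closure is the whole relation."""
    attributes = sorted(attributes)
    return sum(
        1
        for size in range(1, len(attributes) + 1)
        for combo in combinations(attributes, size)
        if closure(combo, fds) >= set(attributes)
    )


R = "ABCDEF"
F = [({"C"}, {"F"}), ({"E"}, {"A"}), ({"E", "C"}, {"D"}), ({"A"}, {"B"})]

print(candidate_keys(R, F))
print(count_super_keys(R, F))
```

Every super key must contain C and E, so the count equals the number of subsets of the other four attributes.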
Relational Decomposition (Matrix Method)
Unit 3: Database Design
■ Decomposition:
■ The process of decomposing the universal relation schema R into a set of relation schemas
D = {R1,R2, …, Rm} that will become the relational database schema by using the functional
dependencies.
■ Attribute preservation condition:
▪ Each attribute in R will appear in at least one relation schema Ri in the decomposition so that
no attributes are “lost”.
■ Another goal of decomposition is to have each individual relation Ri in the decomposition D be
in BCNF or 3NF.
■ Additional properties of decomposition are needed to prevent the generation of spurious tuples
Properties of Relational Decompositions (Cont.)
■ Note: The word loss in lossless refers to loss of information, not to loss of tuples. In fact, for
“loss of information” a better term is “addition of spurious information”.
■ The term nonadditive join means that no spurious tuples result after the application of PROJECT and
JOIN operations.
Properties of Relational Decompositions (Cont.)
4. Repeat the following loop until a complete loop execution results in no changes to S
   {for each functional dependency X → Y in F
      {for all rows in S which have the same symbols in the columns corresponding to
       attributes in X
         {make the symbols in each column that corresponds to an attribute in Y be the
          same in all these rows as follows:
            If any of the rows has an “a” symbol for the column, set the other rows
            to that same “a” symbol in the column.
            If no “a” symbol exists for the attribute in any of the rows, choose one of
            the “b” symbols that appear in one of the rows for the attribute and set the
            other rows to that same “b” symbol in the column;
         };
      };
   };
5. If a row is made up entirely of “a” symbols, then the decomposition has the lossless join
   property; otherwise it does not.
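The matrix method can be sketched in Python. Steps 1 to 3 are not reproduced above, so the initial matrix here is an assumption based on the worked examples: one row per Ri, one column per attribute, an "a" symbol where the attribute belongs to Ri and a distinct "b" symbol elsewhere. As a small variation, when two symbols are equated the sketch replaces every occurrence of them in that column, not only those in the matching rows. The function name is illustrative.

```python
def is_lossless(attributes, decomposition, fds):
    """Matrix (chase) test for the lossless join property."""
    # Assumed steps 1-3: build S with ("a", j) or ("b", i, j) symbols.
    table = [
        {attr: ("a", j) if attr in ri else ("b", i, j)
         for j, attr in enumerate(attributes)}
        for i, ri in enumerate(decomposition)
    ]
    changed = True
    while changed:                                # step 4
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in table:                     # rows that agree on X
                groups.setdefault(tuple(row[a] for a in lhs), []).append(row)
            for rows in groups.values():
                for attr in rhs:
                    symbols = {row[attr] for row in rows}
                    if len(symbols) < 2:
                        continue
                    a_symbols = [s for s in symbols if s[0] == "a"]
                    target = a_symbols[0] if a_symbols else min(symbols)
                    for row in table:             # equate the symbols column-wide
                        if row[attr] in symbols:
                            row[attr] = target
                    changed = True
    # Step 5: lossless if some row consists only of "a" symbols.
    return any(all(cell[0] == "a" for cell in row.values()) for row in table)


print(is_lossless(
    ["S", "A", "I", "P"],
    [{"S", "A"}, {"S", "I", "P"}],
    [(["S"], ["A"]), (["S", "I"], ["P"])],
))
```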
Example-1
R(S,A,I,P)
F = {S -> A, SI -> P}
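Initial matrix S:
     S    A    I    P
R1   a1   a2   b13  b14
R2   a1   b22  a3   a4
After applying S → A:
     S    A    I    P
R1   a1   a2   b13  b14
R2   a1   a2   a3   a4
Row R2 consists entirely of “a” symbols, so the decomposition has the lossless join property.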
R(A,B,C,D,E)
F = {A -> C, B -> C, C -> D, DE -> C, CE -> A}
R1(ADC), R2(AB), R3(BE), R4(CDE), R5(AE)
A B C D E
R1 a1 b12 a3 a4 b15
R2 a1 a2 b23 b24 b25
R3 b31 a2 b33 b34 a5
R4 b41 b42 a3 a4 a5
R5 a1 b52 b53 b54 a5
Example-3
R(A,B,C,D,E)
FD = A -> C
A B C D E
R1 a1 b12 a3 a4 b15
R2 a1 a2 a3 b24 b25
R3 b31 a2 b33 b34 a5
R4 b41 b42 a3 a4 a5
R5 a1 b52 a3 b54 a5
Example-3
R(A,B,C,D,E)
FD = B -> C
A B C D E
R1 a1 b12 a3 a4 b15
R2 a1 a2 a3 b24 b25
R3 b31 a2 a3 b34 a5
R4 b41 b42 a3 a4 a5
R5 a1 b52 a3 b54 a5
Example-3
R(A,B,C,D,E)
FD = C -> D
A B C D E
R1 a1 b12 a3 a4 b15
R2 a1 a2 a3 a4 b25
R3 b31 a2 a3 a4 a5
R4 b41 b42 a3 a4 a5
R5 a1 b52 a3 a4 a5
Example-3
R(A,B,C,D,E)
FD = CE -> A
A B C D E
R1 a1 b12 a3 a4 b15
R2 a1 a2 a3 a4 b25
R3 a1 a2 a3 a4 a5
R4 a1 b42 a3 a4 a5
R5 a1 b52 a3 a4 a5
Exercise questions
14.24 Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies F
= {{A, B}→{C}, {A}→{D, E}, {B}→{F}, {F}→{G, H}, {D}→{I, J}}. What is the key for R?
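Answer:
{A, B}+ = {A, B, C, D, E, F, G, H, I, J}
{A}+ = {A, D, E, I, J}
{B}+ = {B, F, G, H}
Neither A nor B appears on the right-hand side of any FD, so every key must contain both.
{A, F}+ = {A, D, E, F, G, H, I, J} does not include B or C, so AF is not a key.
Candidate key: {A, B}
Prime attributes: {A, B}
Non-prime attributes: {C, D, E, F, G, H, I, J}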
Repeat Exercise 14.24 for the following different set of functional dependencies G = {{A, B}→{C}, {B, D}→{E,
F}, {A, D}→{G, H}, {A}→{I}, {H}→{J}}
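Answers to both versions of this exercise can be checked with an exhaustive search: try attribute subsets in order of increasing size and keep those whose closure is all of R and that contain no smaller key. This is a brute-force sketch meant only for small schemas; the function names are illustrative and single-letter attributes are assumed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure; attrs and FD sides are strings of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= set(attributes):
                keys.append(candidate)
    return keys


R = "ABCDEFGHIJ"
F = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]
G = [("AB", "C"), ("BD", "EF"), ("AD", "GH"), ("A", "I"), ("H", "J")]

print([sorted(k) for k in candidate_keys(R, F)])
print([sorted(k) for k in candidate_keys(R, G)])
```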
Exercise questions
15.1. What is the role of Armstrong’s inference rules (inference rules IR1 through IR3) in the development
of the theory of relational design?
ans: Armstrong's inference rules (IR1 through IR3) are a set of formal rules used for reasoning about
functional dependencies (FDs) in relational databases. These rules are crucial in developing the theory of
relational design, as they allow database designers to derive new functional dependencies from existing
ones, helping ensure that a relational schema is well-structured. The rules are:
IR1 (reflexive rule): If X ⊇ Y, then X → Y.
IR2 (augmentation rule): {X → Y} implies XZ → YZ.
IR3 (transitive rule): {X → Y, Y → Z} implies X → Z.
15.2. What is meant by the completeness and soundness of Armstrong’s inference rules?
ans: Soundness means that any dependency derived using Armstrong's rules must be a valid functional
dependency. In other words, the rules should not derive incorrect functional dependencies.
Completeness means that the rules are sufficient to derive all possible valid functional dependencies from a
given set of dependencies. No valid dependencies are left out.
Exercise questions
15.3. What is meant by the closure of a set of functional dependencies? Illustrate with an example.
ans: The closure of a set of functional dependencies F, denoted F+, is the set of all functional dependencies that can
be inferred from F using Armstrong's inference rules.
Example:
Given the set F = {A → B, B → C}, the closure F+ includes A → B, B → C, and A → C (by transitivity),
together with trivial dependencies such as A → A and AB → B.
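15.4. When are two sets of functional dependencies equivalent? How can we determine their equivalence?
ans: Two sets of functional dependencies F and G are equivalent if they imply the same set of functional
dependencies, that is, if the closure of F (F+) is the same as the closure of G (G+). Equivalently, F covers G and G covers F.
To determine equivalence without enumerating the closures, compute X+ with respect to F for every dependency X → Y in G
and check that Y ⊆ X+; then repeat the test with the roles of F and G exchanged.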
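Equivalence can be tested without enumerating either closure: F covers G when every dependency X → Y in G satisfies Y ⊆ X+ computed under F. The sketch below illustrates this; the two sets F and G are small examples made up for this illustration, and `covers` and `equivalent` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD of g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """F and G are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

# Illustrative sets: A -> C in F is implied by A -> B and B -> C.
F = [("A", "B"), ("B", "C"), ("A", "C")]
G = [("A", "B"), ("B", "C")]
print(equivalent(F, G))             # True
print(equivalent(F, [("A", "B")]))  # False: B -> C cannot be inferred
```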
15.5. What is a minimal set of functional dependencies? Does every set of dependencies have a minimal equivalent
set? Is it always unique?
ans: A minimal set of functional dependencies is a set where:
Each functional dependency has a single attribute on the right-hand side (canonical form).
No functional dependency can be removed without changing the closure.
No attribute on the left-hand side of any functional dependency can be removed without changing the closure.
Does every set of dependencies have a minimal equivalent set?
Yes, every set of functional dependencies has a minimal equivalent set.
Is it always unique?
No. A set of dependencies may have several different minimal covers, depending on the order in which redundant
dependencies and extraneous attributes are removed.
7.6 Compute the closure of the following set F of functional dependencies for relation schema R = (A, B, C, D, E).
A → BC
CD → E
B → D
E → A
ans: We first compute A+, the set of attributes functionally determined by A:
Start with A+ = {A}.
From A → BC, add B and C, so A+ = {A, B, C}.
From B → D, add D, so A+ = {A, B, C, D}.
From CD → E, add E, so A+ = {A, B, C, D, E}.
From E → A, nothing new is added.
Thus A+ = {A, B, C, D, E}.
Candidate keys: A candidate key is a minimal set of attributes that determines all attributes in the relation. Since
A+ = {A, B, C, D, E}, A is a candidate key for R. The same computation gives E+ = (BC)+ = (CD)+ = {A, B, C, D, E},
while B+ = {B, D}, C+ = {C} and D+ = {D}, so the candidate keys are A, E, BC and CD.
The closure F+ therefore consists of every dependency X → Y in which X contains a candidate key (Y may be any subset
of R), plus the dependencies with Y ⊆ X+ for the remaining left-hand sides B, C, D and BD; of these, only B → D
(and B → BD) is nontrivial.
7.7 Using the functional dependencies of Exercise 7.6, compute the canonical cover Fc.
ans: A canonical cover is a minimal set of functional dependencies that is equivalent to the original set and satisfies
three properties:
Each functional dependency has a single attribute on the right-hand side.
There are no redundant dependencies.
There are no extraneous attributes on the left-hand side of any dependency.
Given F: A → BC, CD → E, B → D, E → A
Step 1: Decompose dependencies with multiple attributes on the right-hand side. A → BC becomes two
dependencies: A → B and A → C. CD → E, B → D and E → A stay the same. Now we have
the following set: A → B, A → C, CD → E, B → D, E → A
Step 2: Remove extraneous attributes. CD → E is the only dependency with more than one attribute on the left-hand
side, and neither C+ = {C} nor D+ = {D} contains E, so no attribute is extraneous and no changes are needed.
Step 3: Remove redundant dependencies. None of the dependencies in this set is redundant because removing any
of them would change the closure of the set. Thus, the canonical cover Fc is: A → B, A → C, CD → E, B → D, E → A
Note: some textbooks instead require each left-hand side in a canonical cover to be unique; under that convention
A → B and A → C are merged back into A → BC, giving Fc = {A → BC, CD → E, B → D, E → A}.
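The three steps can be sketched in code. This is a minimal illustration of the usual minimal-cover procedure under the single-attribute right-hand-side convention, not a definitive implementation; `minimal_cover` is a name chosen here, and the result can depend on the order in which dependencies are examined.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is (lhs, rhs) with iterable sides."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split right-hand sides into single attributes.
    work = list(dict.fromkeys((frozenset(l), a) for l, r in fds for a in r))
    # Step 2: drop extraneous left-hand-side attributes.
    for i, (lhs, a) in enumerate(work):
        for b in sorted(lhs):
            reduced = work[i][0] - {b}
            if reduced and a in closure(reduced, work):
                work[i] = (reduced, a)
    work = list(dict.fromkeys(work))  # step 2 may create duplicates
    # Step 3: drop FDs that follow from the remaining ones.
    result = list(work)
    for fd in list(result):
        rest = [g for g in result if g != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result

F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
for lhs, rhs in minimal_cover(F):
    print("".join(sorted(lhs)), "->", rhs)
```

For the set F of Exercise 7.6 nothing is removed in steps 2 and 3, so the output is the five single-attribute dependencies produced by step 1.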
Set G-
A → CD
E → AH
Solution-
1. Between the properties of dependency preservation and losslessness, which one must
definitely be satisfied? Why?
A: Losslessness must definitely be satisfied because it ensures that no information is lost during
decomposition. This guarantees that the original relation can be perfectly reconstructed from the
decomposed relations, maintaining data integrity. Dependency preservation is important but
secondary to losslessness.
THANK YOU
Prof Nivedita Kasturi
Department of Computer Science and Engineering
Database Management Systems
Dependency Preservation & Minimal
Cover Examples
Prof. Shilpa S
Department of Computer Science and Engineering
Dependency Preservation:
Solution:
In R1 the following dependencies hold: F1’ = {A → A, C → C, A → C, AC → AC}
In R2 the following dependencies hold: F2’ = {B → B, C → C, B → C, BC → BC}
F’ = F1’ ∪ F2’ = {A → C, B → C, trivial dependencies}
A → B cannot be derived from F’,
so this decomposition does NOT preserve dependencies.
Dependency Preservation: Example 2
Question: R = (A, B, C), F = {A → B, B → C}
Decomposition of R: R1 = (A, B), R2 = (B, C)
Does this decomposition preserve the given dependencies?
Solution:
In R1 the following dependencies hold: F1’ = {A → B, A → A, B → B, AB → AB}
In R2 the following dependencies hold: F2’ = {B → B, C → C, B → C, BC → BC}
F’ = F1’ ∪ F2’ = {A → B, B → C, trivial dependencies}
All the original dependencies occur in F’,
so this decomposition preserves dependencies.
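The check in these examples can be automated without listing the projected dependencies. The sketch below uses the standard iterative test: starting from the left-hand side X of an FD, repeatedly add whatever each Ri contributes, and see whether the right-hand side is reached. The helper name `preserved` is illustrative, and the second decomposition, (A, C) and (B, C), is inferred from the first solution above with F assumed to be {A → B, B → C}.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, decomposition, fds):
    """True if fd = (X, Y) can be enforced using only the relations Ri."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # What the attributes of Ri seen so far determine inside Ri.
            gained = closure(result & set(ri), fds) & set(ri)
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

F = [("A", "B"), ("B", "C")]
print(all(preserved(fd, ["AB", "BC"], F) for fd in F))  # True
print(all(preserved(fd, ["AC", "BC"], F) for fd in F))  # False: A -> B is lost
```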
Dependency Preservation: Example 3
Dependency Preservation: Example 4
Solve the given problems using the above method (Minimal cover).
FD = { B → A, AD → C, C → BD } is Canonical Cover of
FD = { B → A, AD → BC, C → ABD}
THANK YOU
Prof. Shilpa S
Department of Computer Science and Engineering
Database Management Systems
Normal Forms
■ Normalization of Relations
■ Normalization:
■ The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations
■ Normal form:
■ Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form
Normalization of Relations (2)
Practical Use of Normal Forms
■ Normalization is carried out in practice so that the resulting designs are of high quality and
meet the desirable properties
■ The practical utility of these normal forms becomes questionable when the constraints on which
they are based are hard to understand or to detect
■ The database designers need not normalize to the highest possible normal form
■ (usually up to 3NF and BCNF. 4NF rarely used in practice.)
■ Denormalization:
■ The process of storing the join of higher normal form relations as a base relation—which is
in a lower normal form
Definitions of Keys and Attributes Participating in Keys (1)
■ A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R
with the property that no two distinct tuples t1 and t2 in any legal relation state r of R will
have t1[S] = t2[S]
■ A key K is a superkey with the additional property that removal of any attribute from
K will cause K not to be a superkey any more.
Definitions of Keys and Attributes Participating in Keys (2)
■ If a relation schema has more than one key, each is called a candidate key.
■ One of the candidate keys is arbitrarily designated to be the primary key, and the
others are called secondary keys.
■ A Prime attribute must be a member of some candidate key
■ A Nonprime attribute is not a prime attribute—that is, it is not a member of any
candidate key.
First Normal Form
■ Disallows
■ composite attributes
■ multivalued attributes
■ nested relations; attributes whose values for an individual tuple are non-atomic
■ Considered to be part of the definition of a relation
■ Most RDBMSs allow only those relations to be defined that are in First Normal Form
Normalization into 1NF
Figure 14.9
Normalization into 1NF. (a) A relation schema that is not in 1NF.
(b) Sample state of relation DEPARTMENT. (c) 1NF version of the
same relation with redundancy.
Normalizing nested relations into 1NF
Figure 14.10
Normalizing nested relations into 1NF. (a) Schema of the EMP_PROJ relation with a nested relation attribute PROJS.
(b) Sample extension of the EMP_PROJ relation showing nested relations within each tuple. (c) Decomposition of
EMP_PROJ into relations EMP_PROJ1 and EMP_PROJ2 by propagating the primary key.
3.5 Second Normal Form (1)
Figure 14.11
Normalizing into 2NF and 3NF. (a) Normalizing EMP_PROJ into 2NF relations. (b) Normalizing EMP_DEPT into 3NF
relations.
Figure 14.12 Normalization into 2NF and 3NF
Figure 14.12
Normalization into 2NF and 3NF. (a) The LOTS relation with its functional dependencies FD1 through FD4.
(b) Decomposing into the 2NF relations LOTS1 and LOTS2. (c) Decomposing LOTS1 into the 3NF relations
LOTS1A and LOTS1B. (d) Progressive normalization of LOTS into a 3NF design.
3.6 Third Normal Form (1)
■ Definition:
■ Transitive functional dependency: a FD X -> Z that can be derived from two
FDs X -> Y and Y -> Z
■ Examples:
■ SSN -> DMGRSSN is a transitive FD
■ Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
■ NOTE:
■ In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only
if Y is not a candidate key.
■ When Y is a candidate key, there is no problem with the transitive dependency .
■ E.g., Consider EMP (SSN, Emp#, Salary ).
■ Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
Normal Forms Defined Informally
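■ Definition:
■ Superkey of relation schema R - a set of attributes S of R that contains a key of
R
■ A relation schema R is in third normal form (3NF) if, whenever a nontrivial FD X -> A holds in R, either:
■ (a) X is a superkey of R, or
■ (b) A is a prime attribute of R
■ A violation of condition (a) can take two forms:
■ a proper subset of a key functionally determines a nonprime attribute (a 2NF violation), or
■ a nonprime attribute functionally determines another nonprime attribute (a 3NF violation due to a transitive dependency)
■ Condition (b) takes care of the dependencies that “slip
through” (are allowable to) 3NF but are “caught by” BCNF, which we discuss next.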
Interpreting the General Definition of Third Normal Form (3)
dependencies if, for all functional dependencies in F + of the form α → β, where α ⊆ R and
β ⊆ R, at least one of the following holds:
Figure 14.13
Boyce-Codd normal form. (a) BCNF normalization of LOTS1A with
the functional dependency FD2 being lost in the decomposition.
(b) A schematic relation with FDs; it is in 3NF, but not in BCNF
due to the f.d. C → B.
Figure 14.14 A relation TEACH that is in 3NF but not in BCNF
Figure 14.14
A relation TEACH that is in 3NF but not BCNF.
Achieving the BCNF by Decomposition (1)
If you apply the NJB test to the 3 decompositions of the TEACH relation:
■ D1 gives Student → Instructor or Student → Course, none of which is true.
■ D2 gives Course → Instructor or Course → Student, none of which is true.
■ However, in D3 we get Instructor → Course or Instructor → Student.
Since Instructor → Course is indeed true, the NJB property is satisfied and D3 is
determined as a non-additive (good) decomposition.
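The NJB (nonadditive join test for binary decompositions) reduces to two closure checks: the decomposition {R1, R2} is nonadditive if (R1 ∩ R2) → (R1 − R2) or (R1 ∩ R2) → (R2 − R1) is in F+. The sketch below applies it to TEACH. It assumes the textbook's dependencies {Student, Course} → Instructor and Instructor → Course, and it assumes the three decompositions D1, D2, D3 are the attribute pairs implied by the dependencies listed above; `njb_nonadditive` is an illustrative name.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is a (lhs, rhs) pair of sets of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def njb_nonadditive(r1, r2, fds):
    """Binary nonadditive-join test on relation schemas r1 and r2."""
    reach = closure(r1 & r2, fds)
    return (r1 - r2) <= reach or (r2 - r1) <= reach

FDS = [({"Student", "Course"}, {"Instructor"}),
       ({"Instructor"}, {"Course"})]

D1 = ({"Student", "Instructor"}, {"Student", "Course"})
D2 = ({"Course", "Instructor"}, {"Course", "Student"})
D3 = ({"Instructor", "Course"}, {"Instructor", "Student"})

for name, (r1, r2) in [("D1", D1), ("D2", D2), ("D3", D3)]:
    print(name, njb_nonadditive(r1, r2, FDS))
# D1 False, D2 False, D3 True
```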
General Procedure for achieving BCNF
■ Let R be the relation not in BCNF, let X ⊆ R, and let X → A be the
FD that causes a violation of BCNF. Then R may be decomposed into two
relations:
■ (i) R − A and (ii) X ∪ A.
■ If either R − A or X ∪ A is not in BCNF, repeat the process.
Note that the FD that violated BCNF in TEACH was Instructor → Course. Hence its BCNF
decomposition would be:
(TEACH − Course) and (Instructor ∪ Course), which gives
the relations (Instructor, Student) and (Instructor, Course) that we obtained before in
decomposition D3.
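The procedure can be sketched recursively. This is a minimal illustration rather than a production algorithm: `find_violation` and `bcnf_decompose` are hypothetical names, the violation search tries every subset of the schema (exponential), and the FD {Student, Course} → Instructor is assumed from the textbook's TEACH example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; each FD is a (lhs, rhs) pair of sets of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(schema, fds):
    """Return (X, {A}) where X -> A violates BCNF inside schema, else None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            reach = closure(x, fds) & set(schema)
            extra = reach - x
            # X determines something new but is not a superkey of schema.
            if extra and reach != set(schema):
                return x, {sorted(extra)[0]}
    return None

def bcnf_decompose(schema, fds):
    violation = find_violation(schema, fds)
    if violation is None:
        return [set(schema)]
    x, a = violation
    # Split into (R - A) and (X ∪ A), then repeat on both parts.
    return bcnf_decompose(set(schema) - a, fds) + bcnf_decompose(x | a, fds)

TEACH = {"Student", "Course", "Instructor"}
FDS = [({"Student", "Course"}, {"Instructor"}),
       ({"Instructor"}, {"Course"})]
print(bcnf_decompose(TEACH, FDS))
```

For TEACH the result is the pair of schemas (Student, Instructor) and (Instructor, Course), matching decomposition D3.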
Important points for BCNF
• Every binary relation (a relation with only two attributes) is always in BCNF.
• BCNF is free from redundancies arising out of functional dependencies (zero redundancy).
• BCNF decomposition is always lossless but not always dependency preserving.
• Sometimes, going for BCNF may not preserve functional dependencies. So, go for BCNF
only if the lost functional dependencies are not required else normalize till 3NF only.
• There exist many more normal forms even after BCNF like 4NF and more.
• But in the real world database systems, it is generally not required to go beyond BCNF.
Comparison of Normal Forms
Explanation of Terms:
• Atomic Values: Values that cannot be divided further (e.g., an individual email
address as opposed to a list of email addresses).
• Functional Dependency: A relationship where one attribute uniquely determines
another attribute (e.g., StudentID → StudentName).
• Composite Key: A key that consists of more than one attribute.
• Transitive Dependency: A situation where one non-key attribute depends on another
non-key attribute through a chain of dependencies (e.g., A → B and B → C implies
A → C).
Multivalued Dependencies and Fourth Normal Form (1)
Definition:
■ A multivalued dependency (MVD) X ->> Y specified on relation schema R,
where X and Y are both subsets of R, specifies the following constraint on any
relation state r of R: If two tuples t1 and t2 exist in r such that t1[X] = t2[X], then
two tuples t3 and t4 should also exist in r with the following properties, where we
use Z to denote (R − (X ∪ Y)):
■ t3[X] = t4[X] = t1[X] = t2[X].
■ t3[Y] = t1[Y] and t4[Y] = t2[Y].
■ t3[Z] = t2[Z] and t4[Z] = t1[Z].
Note: F+ is the (complete) set of all dependencies (functional or multivalued) that will hold
in every relation state r of R that satisfies F. It is also called the closure of F.
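The MVD definition can be tested directly on a relation state. The sketch below is an illustration under stated assumptions: rows are Python dictionaries, `satisfies_mvd` is a name chosen here, and the sample EMP rows are made-up values in the spirit of the Ename ->> Pname example. For every ordered pair (t1, t2) that agrees on X it builds the required tuple t3 (X and Y from t1, Z from t2); t4 is covered when the pair is visited in the opposite order.

```python
from itertools import product

def satisfies_mvd(rows, x, y):
    """Check X ->> Y on a relation state given as a list of dicts."""
    attrs = set(rows[0])
    z = attrs - set(x) - set(y)

    def proj(t, names):
        return tuple(t[n] for n in sorted(names))

    present = {proj(t, attrs) for t in rows}
    for t1, t2 in product(rows, repeat=2):
        if proj(t1, x) != proj(t2, x):
            continue
        # t3 takes X and Y from t1 and Z from t2.
        t3 = {n: t1[n] for n in set(x) | set(y)}
        t3.update({n: t2[n] for n in z})
        if proj(t3, attrs) not in present:
            return False
    return True

emp = [
    {"Ename": "Smith", "Pname": "X", "Dname": "John"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "X", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "John"},
]
print(satisfies_mvd(emp, ["Ename"], ["Pname"]))      # True
print(satisfies_mvd(emp[:3], ["Ename"], ["Pname"]))  # False: (Smith, Y, John) is missing
```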
Fourth and fifth normal forms.
Definition:
■ A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified
on relation schema R, specifies a constraint on the states r of R.
■ The constraint states that every legal state r of R should have a non-
additive join decomposition into R1, R2, ..., Rn; that is, for every such r we
have
■ * (πR1(r), πR2(r), ..., πRn(r)) = r
Note: an MVD is a special case of a JD where n = 2.
Definition:
■ A relation schema R is in fifth normal form (5NF) (or Project-Join Normal
Form (PJNF)) with respect to a set F of functional, multivalued, and join
dependencies if,
■ for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied
by F),
■ every Ri is a superkey of R.
Figure 14.15 Fourth and fifth normal forms. (a) The EMP relation with two MVDs: Ename
–>> Pname and Ename –>> Dname. (b) Decomposing the EMP relation into two 4NF
relations EMP_PROJECTS and EMP_DEPENDENTS. (c) The relation SUPPLY with no MVDs is
in 4NF but not in 5NF if it has the JD(R1, R2, R3). (d) Decomposing the relation SUPPLY
into the 5NF relations R1, R2, R3.
Chapter Summary
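Q. Define first, second, and third normal forms when only primary keys are considered. How do the general
definitions of 2NF and 3NF, which consider all keys of a relation, differ from those that consider only primary
keys?
Answer:
First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF) are used in database
normalization to reduce redundancy and protect data integrity. When only the primary key is considered:
• 1NF: every attribute value is atomic (no composite attributes, multivalued attributes, or nested relations).
• 2NF: the relation is in 1NF and every nonprime attribute is fully functionally dependent on the primary key.
• 3NF: the relation is in 2NF and no nonprime attribute is transitively dependent on the primary key.
The general definitions apply the same tests with respect to every candidate key: 2NF requires that no nonprime
attribute be partially dependent on any key, and 3NF requires that for every nontrivial FD X → A, either X is a
superkey or A is a prime attribute.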
Ex 14.8 [cont]
Answer:
When a relation is in Second Normal Form (2NF), the main undesirable dependency that is avoided is partial
dependency.
Partial Dependency:
A partial dependency occurs when a non-key attribute is dependent on only a part (a subset) of a composite primary key
(in the case of relations with composite keys). This situation can lead to redundancy and anomalies (such as update, insert,
and delete anomalies).
•Example of Partial Dependency:
Suppose we have a table with a composite primary key consisting of A and B. If a non-key attribute C depends only on A,
then we have a partial dependency of C on part of the primary key. In this case, C is only partially dependent on the
composite key (A, B), not fully dependent on the entire key.
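Partial dependencies can be detected by computing the closure of every proper subset of each key. The sketch below mirrors the example above; the fourth attribute D and the dependency AB → D are assumptions added only so that the relation has a fully dependent attribute, and `partial_dependencies` is an illustrative name.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(schema, fds, keys):
    """List (part_of_key, nonprime_attribute) pairs that violate 2NF."""
    prime = set().union(*map(set, keys))
    nonprime = set(schema) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in sorted(closure(part, fds) & nonprime):
                    found.append(("".join(part), attr))
    return found

# Key (A, B); C depends on A alone, D on the whole key.
FDS = [("A", "C"), ("AB", "D")]
print(partial_dependencies("ABCD", FDS, ["AB"]))  # [('A', 'C')]
```

An empty result means the relation is in 2NF with respect to the keys supplied.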
Ex 14.10
Answer:
When a relation is in Third Normal Form (3NF), the main undesirable dependency
that is avoided is transitive dependency.
Transitive Dependency:
A transitive dependency occurs when a non-key attribute is dependent on another
non-key attribute rather than directly on the primary key (or candidate key).
Specifically, attribute A depends on B, and B depends on the primary key, so A is
transitively dependent on the primary key through B.
Ex 14.12
Q. Define Boyce-Codd normal form. How does it differ from 3NF? Why is it considered a stronger form of 3NF?
Answer:
Difference Between BCNF and 3NF:
BCNF is a stricter version of Third Normal Form (3NF). Both aim to eliminate undesirable dependencies, but they
differ in how strict they are regarding functional dependencies.
1. Functional Dependencies Involving Prime Attributes:
• 3NF: A relation is in 3NF if, for every nontrivial functional dependency (X → Y), one of the following conditions is true:
• X is a superkey, or
• Y is a prime attribute (i.e., part of a candidate key).
• BCNF: A relation is in BCNF if, for every nontrivial functional dependency (X → Y), X must be a superkey. BCNF
does not allow the exception where Y is a prime attribute.
2. Handling of Candidate Keys:
• In 3NF, a relation can still satisfy the normal form if a non-superkey determines a prime attribute.
• In BCNF, such cases violate the form because BCNF requires that the left-hand side (determinant) of every
functional dependency be a superkey.
• Stricter Dependency Rules: BCNF removes more potential anomalies by ensuring that all functional dependencies
have a superkey on the left-hand side. In contrast, 3NF allows certain functional dependencies that don’t involve a
superkey if the dependent attribute is a prime attribute.
• Eliminates Redundancies and Anomalies: BCNF helps avoid update, insert, and delete anomalies in cases where 3NF
might still permit them due to less strict rules for dependencies involving prime attributes.
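The difference between the two tests shows up on a relation that is in 3NF but not in BCNF. The sketch below uses the schematic relation R(A, B, C) with C → B; the second dependency AB → C and the candidate keys AB and AC are assumptions taken from the usual textbook version of that example, and `violations` is an illustrative name.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations(schema, fds, keys):
    """Return (BCNF violations, 3NF violations) among the given FDs."""
    prime = set().union(*map(set, keys))
    bcnf_bad, third_bad = [], []
    for lhs, rhs in fds:
        new_attrs = set(rhs) - set(lhs)
        if not new_attrs or closure(lhs, fds) >= set(schema):
            continue  # trivial FD, or the determinant is a superkey
        bcnf_bad.append((lhs, rhs))
        if not new_attrs <= prime:
            third_bad.append((lhs, rhs))  # determines a nonprime attribute
    return bcnf_bad, third_bad

FDS = [("AB", "C"), ("C", "B")]
bcnf_bad, third_bad = violations("ABC", FDS, ["AB", "AC"])
print(bcnf_bad)   # [('C', 'B')]  -> not in BCNF
print(third_bad)  # []            -> still in 3NF, because B is prime
```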
Ex 14.22
Answer:
Let’s consider a relation schema R(A, B), which has only two attributes: A and B. We need to show that such a relation
is always in Boyce-Codd Normal Form (BCNF). The only possible nontrivial functional dependencies are A→B and B→A.
1. Case 1: A→B holds
1. If A→B, then A uniquely determines B.
2. Since A determines B, A is a superkey (because it uniquely identifies the whole tuple (A, B)).
3. Therefore, the dependency A→B satisfies the BCNF condition because the determinant A is a superkey.
2. Case 2: B→A holds
1. If B→A, then B uniquely determines A.
2. Since B determines A, B is a superkey (because it uniquely identifies the whole tuple (A, B)).
3. Therefore, the dependency B→A satisfies the BCNF condition because the determinant B is a superkey.
3. Case 3: neither A→B nor B→A holds
1. Only trivial dependencies hold, the key is {A, B}, and the BCNF condition is satisfied trivially.
If both A→B and B→A hold, Cases 1 and 2 apply together and both A and B are keys. In every case R is in BCNF.
Step-1 : Add the attributes which are present on Left Hand Side in the original
functional dependency.
Step-2 : Now, add the attributes present on the Right Hand Side of the
functional dependency.
Step-3 : With the help of attributes present on Right Hand Side, check the other
attributes that can be derived from the other given functional dependencies.
Repeat this process until all the possible attributes which can be derived are
added in the closure.
• Seems difficult? Check out the example explained below and it will surely
clear your doubt on how to calculate closure of functional dependency.
Now, We will calculate the closure of all the attributes present in the relation
using the three steps mentioned below.
Step-1 : Add attributes present on the LHS of the first functional dependency
to the closure.
{Roll_no}+ = {Roll_No}
Step-2 : Add attributes present on the RHS of the original functional
dependency to the closure.
{Roll_no}+ = {Roll_No, Marks}
Step-3 : Add the other possible attributes which can be derived using attributes
present on the RHS of the closure. Roll_No itself cannot functionally
determine any further attribute, but the Name attribute can determine other attributes
such as Marks and Location using the 2nd Functional Dependency (Name → Marks, Location).
Therefore, complete closure of Roll_No will be :
Similarly, we can calculate closure for other attributes too i.e “Name”.
Step-1 : Add attributes present on the LHS of the functional dependency to the
closure.
{Name}+ = {Name}
Step-2 : Add the attributes present on the RHS of the functional dependency to
the closure.
{Name}+ = {Name, Marks, Location}
NOTE : We don’t have any Functional dependency where marks and location
can functionally determine any attribute. Hence, for those attributes we can
only add the attributes themselves in their closures. Therefore,
{Marks}+ = {Marks}
and
{Location}+ = { Location}
Example-2 : Consider a relation R(A,B,C,D,E) having below mentioned
functional dependencies.
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {D, E}
FD1 : A → B
FD2 : B → C
{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}
Clearly, “A” is the candidate key as, its closure contains all the attributes
present in the relation “R”.
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}
In this case, a single attribute is unable to determine all the attribute on its own
like in previous example. Here, we need to combine two or more attributes to
determine the candidate keys.
NOTE : Any relation “R” can have either single or multiple candidate keys.
FD1 : A → BC
FD2 : B → C
FD3 : D → C
Prime Attributes : A, D.
Non-Prime Attributes : B, C
Extraneous Attributes : B, C (if we add either of them to the candidate key, its closure
remains unaffected). Extraneous attributes are those whose removal does not affect the closure
of that set.
Algorithm
Let’s see the algorithm to compute X+
• Step 1 − X+ =X
• Step 2 − repeat until X+ does not change
o For each FD Y->Z in F
▪ If Y ⊆ X+ then X+ = X+ U Z
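The algorithm translates almost line for line into code. The sketch below is one straightforward way to write it, assuming single-letter attributes so that strings can stand for attribute sets; it is run on the dependencies A → BC, C → B, D → E, E → D used in the closure examples earlier.

```python
def closure(x, fds):
    """Compute X+ for attribute set x under fds, a list of (Y, Z) strings."""
    x_plus = set(x)                    # Step 1: X+ = X
    changed = True
    while changed:                     # Step 2: repeat until X+ does not change
        changed = False
        for y, z in fds:               # for each FD Y -> Z in F
            if set(y) <= x_plus and not set(z) <= x_plus:
                x_plus |= set(z)       # X+ = X+ ∪ Z
                changed = True
    return x_plus

F = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]
for attr in "ABCDE":
    print(attr, sorted(closure(attr, F)))
# A ['A', 'B', 'C']
# B ['B']
# C ['B', 'C']
# D ['D', 'E']
# E ['D', 'E']
```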
Example 1
Consider a relation R(A,B,C,D,E,F)
Solution
The closure of E or E+ is as follows −
E+ = E
=EA {for E->A add A}
=EAD {for E->D add D}
=EADC {for A->C add C}
=EADC {for A->D D already added}
=EADCF {for AE->F add F}
=EADCF {for AG->K, do not add K because AG ⊄ E+}
Example 2
Let the relation R(A,B,C,D,E,F)
Solution
The closure for B is as follows −
B+ = {B,C,A,D,E}
For example,
Solution
A+= {A,B,C,D,E,F}={R}=>A is a candidate key
Closure of F (F+): F+ is the set of all FDs that can be inferred/ derived from
F. Using Armstrong Axioms repeatedly on F, we can compute all the FDs.
Example
R(A,B,C,D,E) AND F: A->B,B->C, C->D, A->E. Find the closure of F
Solution
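A+= {A,B,C,D,E}
B+= {B,C,D}
C+= {C,D}
D+= {D}
E+= {E}
Restricted to a single attribute on each side, F+= {A->A, A->B, A->C, A->D, A->E, B->B, B->C, B->D, C->C, C->D, D->D, E->E}
The full F+ also contains every X->Y with composite X or Y in which Y is a subset of X+, for example AB->CD.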
In DBMS,
• Two different sets of functional dependencies for a given relation may or may not be
equivalent.
• If F and G are the two sets of functional dependencies, then following 3 cases are
possible-
Case-01: F covers G (F+ ⊇ G+)
Case-02: G covers F (G+ ⊇ F+)
Case-03: Both F and G cover each other (F+ = G+), so F and G are equivalent
Problem-
Set G-
A → CD
E → AH
Solution-
As we get the same closure whether AB→C is included or excluded, AB→C can be
derived from the remaining FDs, so it is a redundant FD. Remove it
from the FD set.
AB→C is redundant
Therefore the updated FD set is FD1 = {A → C, C → D, C → I, CD → I, EC → A, EC → B,
EI → C}
Use FD1 for the calculations below.
Here consider the FDs that have more than one attribute on the LHS:
CD → I, EC → A, EC → B, EI → C
a. CD→I
Find the closures CD+, C+ and D+ from FD2. If the closure of a single attribute already contains the
right-hand side, that attribute alone is sufficient for the FD and the other one is extraneous.
CD+ = {C, D, I}
C+ = {C, D, I}
D+ = {D}
In the above result C+ has the same set of attributes as CD+, so C alone can
determine I. Hence D is an extraneous attribute.
Replace the FD with C→I.
b. EC→A
EC+ = {E, C, A, B, I, D}
E+ = {E}
C+ = {C, D, I}
Neither E+ nor C+ contains A, so there is no
extraneous attribute.
c. EC→B
Find the closures EC+, E+ and C+ using FD3.
EC+ = {E, C, B, A, I, D}
E+ = {E}
C+ = {C, D, I}
Neither E+ nor C+ contains B, so there is no extraneous attribute.
d. EI→C
Find the closures EI+, E+ and I+.
EI+ = {E, I, C, D, A, B}
E+ = {E}
I+ = {I}
Neither E+ nor I+ contains C, so there is no
extraneous attribute.
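The extraneous-attribute test used in steps a to d can be written as a short function: an attribute B on the left-hand side of X → Y is extraneous when Y is already contained in the closure of X − {B}. The sketch below applies it to the set FD1 given above; `extraneous_lhs` is an illustrative name and attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_lhs(fd, fds):
    """Left-hand-side attributes of fd that can be dropped without loss."""
    lhs, rhs = fd
    return [b for b in lhs
            if len(lhs) > 1 and set(rhs) <= closure(set(lhs) - {b}, fds)]

FD1 = [("A", "C"), ("C", "D"), ("C", "I"), ("CD", "I"),
       ("EC", "A"), ("EC", "B"), ("EI", "C")]

for fd in FD1[3:]:  # only the FDs with a composite left-hand side
    print(fd, extraneous_lhs(fd, FD1))
# ('CD', 'I') ['D']
# ('EC', 'A') []
# ('EC', 'B') []
# ('EI', 'C') []
```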