UNIT - IV Chapter 1: Relational Database Design Via Er Modelling
Fig 2.1
END GOAL: RELATIONAL MODEL
Fig 2.2
Domain:
A domain D is a set of atomic values used to model data.
By atomic, we mean that each value in the domain is indivisible as
far as the relational model is concerned. A domain is a pool of
values from which one or more attributes (or columns) draw their
actual values.
For example: The domain of name is the set of character strings that
represent names of persons.
Tuple:
According to the model, every relation or table is made up of many
tuples.
Tuples are also called records. They are the rows that a table is
made up of.
Relation
A relation is a subset of the Cartesian product of a list of domains
characterized by a name. Given n domains denoted by D1, D2, ..., Dn,
R is a relation defined on these domains if R ⊆ D1 × D2 × ... × Dn.
A relation can be viewed as a "table".
In that table, each row represents a tuple of data values and each
column represents an attribute.
Attribute
An attribute is a column of a relation designated by a name. The name should be
meaningful. Each attribute is associated with a domain. A relation schema
denoted by R is a list of attributes (A1, A2, ..., An).
The data shows that there are employees with no mobile number because
the number might be unknown at this point of time.
Integrity Constraints over Relations
An integrity constraint (IC) is a condition specified on a database schema
that restricts the data that can be stored in an instance of the database. If a
database instance satisfies all the integrity constraints specified on the
database schema, it is a legal instance. A DBMS permits only legal
instances to be stored in the database.
Many kinds of integrity constraints can be specified in the relational model:
Domain Constraints:
A relation schema specifies the domain of each field in the relation instance.
These domain constraints in the schema specify the condition that each
instance of the relation has to satisfy: The values that appear in a column
must be drawn from the domain associated with that column. Thus, the
domain of a field is essentially the type of that field.
Key Constraints:
A Key Constraint is a statement that a certain minimal subset of the fields
of a relation is a unique identifier for a tuple.
Super Key:
An attribute or set of attributes that uniquely identifies a tuple within a
relation.
However, a super key may contain additional attributes that are not
necessary for a unique identification.
Ex: The customer_id of the relation customer is sufficient to distinguish one
tuple from another. Thus, customer_id is a super key. Similarly, the
combination of customer_id and customer_name is a super key for the
relation customer. Here customer_name alone is not a super key, because
several people may have the same name.
We are often interested in super keys for which no proper subset is a super
key. Such minimal super keys are called candidate keys.
Candidate Key:
A super key such that no proper subset is a super key within the relation.
There are two parts of the candidate key definition:
i. Two distinct tuples in a legal instance cannot have identical values in
all the fields of a key.
ii. No subset of the set of fields in a candidate key is a unique identifier
for a tuple.
A relation may have several candidate keys.
Ex: The combination of customer_name and customer_street is sufficient to
distinguish the members of the customer relation. Then both, {customer_id}
and {customer_name, customer_street} are candidate keys. Although
customer_id and customer_name together can distinguish customer tuples,
their combination does not form a candidate key, since the customer_id
alone is a candidate key.
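The difference between a super key and a candidate key can be tried out on a small sample table. The sketch below is only an illustration: the sample rows and the function names are assumptions, not part of the notes. A key is a constraint on every legal instance, so a sample can only show that an attribute set is not a key; it can never prove that it is one.

```python
from itertools import combinations

def is_unique_identifier(rows, attrs):
    """Return True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def minimal_identifiers(rows, attributes):
    """Attribute sets that are unique in rows and have no unique proper subset."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of an already found minimal set.
            if any(set(f) <= set(combo) for f in found):
                continue
            if is_unique_identifier(rows, combo):
                found.append(combo)
    return found

# Hypothetical sample instance of the customer relation.
customers = [
    {"customer_id": 1, "customer_name": "Ravi", "customer_street": "Main"},
    {"customer_id": 2, "customer_name": "Ravi", "customer_street": "Park"},
    {"customer_id": 3, "customer_name": "Anu", "customer_street": "Main"},
]
columns = ["customer_id", "customer_name", "customer_street"]
candidates = minimal_identifiers(customers, columns)
```

On this sample, customer_id and customer_name together are unique (a super key) but are skipped as a candidate because customer_id alone is already unique.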
Primary Key:
The primary key is the candidate key that is selected to identify tuples uniquely within the
relation. Out of all the available candidate keys, a database designer
chooses one as the primary key. The candidate keys that are not selected as the
primary key are called alternate keys.
Features of the primary key:
i. Primary key will not allow duplicate values.
ii. Primary key will not allow null values.
iii. Only one primary key is allowed per table.
Ex: For the student relation, we can choose student_id as the primary key.
Foreign Key:
Foreign keys represent the relationships between tables. A foreign key is a
column (or a group of columns) whose values are derived from the primary
key of some other table.
The table in which foreign key is defined is called a Foreign table or Details
table. The table that defines the primary key and is referenced by the foreign
key is called the Primary table or Master table.
Features of foreign key:
Records cannot be inserted into a detail table if corresponding records in
the master table do not exist.
Records of the master table cannot be deleted or updated if
corresponding records in the detail table actually exist.
General Constraints:
Domain, primary key, and foreign key constraints are considered to be a
fundamental part of the relational data model. Sometimes, however, it is
necessary to specify more general constraints.
For example, we may require that student ages be within a certain range of
values. Given such an IC, the DBMS rejects inserts and updates that violate
the constraint.
Current database systems support such general constraints in the form of
table constraints and assertions. Table constraints are associated with a
single table and checked whenever that table is modified. In contrast,
assertions involve several tables and are checked whenever any of these
tables is modified.
Storing the same information in more than one place within a database is
called redundancy and can lead to several problems:
Redundant Storage: Some information is stored repeatedly.
Update Anomalies: If one copy of such repeated data is updated, an
inconsistency is created unless all copies are similarly updated.
Insertion Anomalies: It may not be possible to store certain
information unless some other, unrelated, information is stored as
well.
Deletion Anomalies: It may not be possible to delete certain
information without losing some other, unrelated, information as well.
The key for Hourly_Emps is ssn. In addition, suppose that the hourly_wages
attribute is determined by the rating attribute. That is, for a given rating
value, there is only one permissible hourly_wages value. This IC is an
example of a functional dependency. It leads to possible redundancy in the
relation Hourly_Emps, as shown below:
If the same value appears in the rating column of two tuples, the IC tells us
that the same value must appear in the hourly_wages column as well. This
redundancy has the following problems:
Null Values
Null values cannot provide a complete solution, but they can provide some
help.
Consider the example Hourly_Emps relation. Here null values cannot help to
eliminate redundant storage, update or deletion anomalies. It appears that
they can address insertion anomalies. For instance, we can insert an
employee tuple with null values in the hourly wage field. However, null
values cannot address all insertion anomalies. Thus, null values do not
provide a general solution to the problems of redundancy, even though they
can help in some cases.
Decompositions
Wages(rating, hourly_wages)
To answer a first question, several normal forms have been proposed for
relations. If a relation schema is in one of these normal forms, we know that
certain kinds of problems cannot arise.
Functional Dependencies
An FD X→Y says that if two tuples agree on the values in attributes X, they
must also agree on the values in attributes Y.
X→Y is read as X functionally determines Y, or simply as X determines Y.
A   B   C   D
a1  b1  c1  d1
a1  b1  c1  d2
a1  b2  c2  d1
a2  b1  c3  d1
Here, if we add a tuple <a1, b1, c2, d1> to the instance shown in the figure, the
resulting instance would violate the FD.
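The check described above can be sketched in a few lines. The notes do not name the FD that the figure illustrates; the sketch assumes it is AB → C, which the four tuples satisfy and the added tuple violates. The function name is hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, otherwise returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

instance = [dict(zip("ABCD", t)) for t in [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d2"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c3", "d1"),
]]
extended = instance + [dict(zip("ABCD", ("a1", "b1", "c2", "d1")))]

holds_before = satisfies_fd(instance, "AB", "C")   # assumed FD: AB -> C
holds_after = satisfies_fd(extended, "AB", "C")
```

An instance can only refute an FD. An FD that happens to hold on one instance is not thereby a constraint of the schema.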
Given a set of FDs over a relation schema R, typically several additional FDs
hold over R whenever all of the given FDs hold.
With given FDs ssn → did and did→ lot. Then, in any legal instance of
Workers, if two tuples have the same ssn value, they must have the same
did value, and because they have the same did value, they must also have
the same lot value. Therefore, the FD ssn→ lot also holds on Workers.
The set of all FDs implied by a given set F of FDs is called the closure of F,
denoted as F+. The closure F+ can be computed by repeatedly applying the following
rules, known as Armstrong's Axioms. Let X, Y, and Z be sets of attributes over a
relation schema R:
Ex2:
i) C→ CSJDPQV.
ii) JP→C.
iii) SD→P.
Several additional FDs hold in the closure of the set of given FDs:
Note:
In a trivial FD, the right side contains only attributes that also
appear on the left side. Using reflexivity, we can generate all trivial
dependencies, which are of the form:
Attribute Closure
If we want to check whether a given dependency, say, X→Y, is in the closure
F+, we can do so efficiently without computing F+. We first compute the
attribute closure X+ with respect to F, which is the set of attributes A such
that X→A can be inferred using Armstrong Axioms. The algorithm for
computing the attribute closure of a set X of attributes is shown below:
closure = X;
(1) An F.D. deals with the 1-1 relationship between attributes, and rarely it
also describes a 1-M relationship.
(2) An F.D. must be defined on the schema, not on instances.
(3) An F.D. should be non-trivial.
(4) In a trivial F.D., the RHS is a subset of the LHS. Eg: ABC → BC
(5) In a non-trivial F.D., at least one of the RHS attributes is not in the
LHS. Eg: ABC → BD
(6) In a completely non-trivial F.D., none of the RHS attributes is in
the LHS. Eg: ABC → DE
Once the F.D.s are identified from semantics, additional F.D.s can
be derived from the existing set.
Eg: If F1 comes from semantics and F2 is derived from F1, then the total set of F.D.s = F1 + F2. The
input for the normalization process should be F1 + F2. F2 can be
identified in different ways:
(1) By using inference rules.
(2) By closure set of attributes.
INFERENCE RULES
(1) Reflexive: If B is a subset of A, then A always determines B:
A → B
(2) Augmentation: If A → B then AC → BC
(3) Transitive: If A → B and B → C then A → C
(4) Union: It applies to F.D.s with the same LHS, i.e., if A → B and A → C then
A → BC
(5) Decomposition: If A → BC then we can write it as A → B and A → C
(6) Composition: If A → B and C → D then AC → BD
(7) Self determination: A → A, B → B
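Q1 Find the additional F.D.s (F2) derived from F1, where F1 is the set of F.D.s
obtained from semantics.
F1:
(1) A → B
(2) B → C
(3) C → D
(4) D → E
(5) D → H
(6) E → F
(7) F → G
(8) G → H
Derived F.D.s:
A → C (transitive rule on 1 and 2)
D → EH (union rule on 4 and 5)
F → H (transitive rule on 7 and 8)
F2 = {A → C, D → EH, F → H}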
(1) Let X be the set of attributes whose closure is to be found.
(2) Repeatedly search for an F.D. whose LHS is contained in X, and
add the RHS of that F.D. to X if it is not already present.
(3) Repeat step (2) as many times as necessary until no more attributes
can be added to X.
(4) The set X, once no more attributes can be added to it, is the
closure set.
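The four steps above can be sketched as a small Python function. The representation is an assumption made for illustration: each F.D. is a pair of strings (LHS, RHS) with single-letter attribute names, and the F.D. set used is A → BC, CD → E, B → D, E → A.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)              # step 1: start from X
    changed = True
    while changed:                   # step 3: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            # step 2: LHS inside X and RHS brings something new
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result                    # step 4: X is now the closure

fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
b_plus = closure("B", fds)
ab_plus = closure("AB", fds)
cd_plus = closure("CD", fds)
```

To test whether X → Y follows from the F.D. set, it is enough to check that Y is contained in closure(X, fds).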
Example: F.D.s are
A → B
BC → DE
AEG → G
Find AC+
Ans: X = AC
= ACB (using A → B)
= ABCDE (using BC → DE)
AC+ = ABCDE
Example: F.D.s are
A → BC
CD → E
B → D
E → A
Find B+, AB+ and CD+
Ans: X = B
= BD
B+ = BD
Find AB+
X = AB
= ABC
= ABCD
= ABCDE
AB+ = ABCDE
Find CD+
X = CD
= CDE
= ACDE
= ABCDE
CD+ = ABCDE
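Example: Check whether D → A, AB → D and AB → F can be derived from F1.
F1:
AB → C
BC → AD
D → E
CF → B
i. D → A
X = D
= DE
D+ = DE
D ↛ A, i.e., D cannot determine A.
ii. AB → D
X = AB
= ABC
= ABCD
AB+ contains D, so AB → D holds.
iii. AB → F
X = AB
= ABC
= ABCD
= ABCDE
AB+ = ABCDE
AB ↛ F, i.e., AB cannot determine F.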
Sol:
i. BCD → H
X = BCD
= BCDE
= BCDEH
= BCDEAH
BCD+ = BCDEAH
BCD → H holds.
(ii) ABC → H
X = ABC
ABC ↛ H
(2) Identification of keys by using the closure set of attributes
A → BC
CD → E
E → C
D → AEH
AEH → BD
DH → BC
Find the keys.
Super key = ABCDEH
Now find A+ = ABC
E+ = EC
D+ = DAEH
= ABCDEH
D is a key.
If the closure of any LHS attribute, or of any combination of
LHS attributes, includes all the attributes in the table, then
that attribute set is a key of the table, provided no proper subset of it has the same property.
A table can have 2 or more keys.
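The key search above can be sketched as a brute-force loop over attribute subsets, smallest first. This is an illustration under assumptions: F.D.s are (LHS, RHS) string pairs, the function names are hypothetical, and the search is exponential in the number of attributes, so it only suits classroom-sized schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # A superset of a known key is a super key, not a candidate key.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

fds = [("A", "BC"), ("CD", "E"), ("E", "C"),
       ("D", "AEH"), ("AEH", "BD"), ("DH", "BC")]
keys = candidate_keys("ABCDEH", fds)
```

For this F.D. set the search reports D and also AEH, since AEH → BD brings in D and no proper subset of AEH reaches all attributes.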
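Q2 Consider a relation with five attributes ABCDE. The FDs are
A → B
BC → E
ED → A
Closures (X marks a set that is not a key):
A+ = AB X
BC+ = BCE X
ED+ = ABDE X
AB+ = AB X
AC+ = ABCE X
BD+ = BD X
ABC+ = ABCE X
BCD+ = ABCDE
ACD+ = ABCDE
CDE+ = ABCDE
BCD, ACD and CDE are keys.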
Q3 R(ABCDE) and the FDs are
AB → C
CD → E
DE → B
AB+ = ABC
CD+ = CDE
DE+ = DEB
ABC+ = ABC
ABD+ = ABCDE
ABE+ = ABCE
ACD+ = ABCDE
ADE+ = ABCDE
ABD, ACD and ADE are keys.
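Example: F.D.s are
AB → C
A → DE
B → F
F → GH
D → IJ
Note:
Sometimes not all the attributes in the table appear in the F.D.s.
AB+ = ABCDEFGHIJ
Attributes missing from the F.D.s must be attached to the key, because no F.D. can
derive them. For example, if J appeared in no F.D., the key for the relation R would be ABJ.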
Example: F.D.s are
AB → C
BD → EF
AD → GH
A → I
H → J
Sol:
Super key = ABDH
AB+ = ABCI X
BD+ = BDEF X
AD+ = ADGHIJ X
ABH+ = ABCHIJ X
ABD+ = ABCDEFGHIJ
ABD is the key.
Different database designers may define different sets of F.D.s from the
same requirements. Two sets F and G are equivalent if we are
able to derive all F.D.s in G from F and vice versa.
Example:
F = {A → C, AC → D, E → AD, E → H}
G = {A → CD, E → AH}
Step 1: Take set F and check that all F.D.s in G can be derived from F.
A → CD
A+ from F
X = A
= AC
= ACD
A → CD can be derived from F.
E → AH
E+ from F
X = E
= EAD
= EADH
E → AH can be derived from F.
Step 2: Take set G and check that all F.D.s in F can be derived from G.
A → C
A+ from G
X = A
= ACD
A → C can be derived from G.
AC → D
X = AC
= ACD
AC → D can be derived from G.
E → AD and E → H
X = E
= EAH
= EAHCD
E → H and E → AD can be derived from G.
Therefore F and G are equivalent.
Example:
F = {B → CD, AD → E, B → A}
G = {B → CDE, B → ABC, AD → E}
Sol:
Step 1:
B → CDE
B+ from F
X = B
= BCDA
= ABCDE
All F.D.s in G are derivable from F.
Step 2:
B → CD
B+ from G
X = B
= BCDE
= ABCDE
All F.D.s in F are derivable from G.
F ≡ G
F is preferable because it states the same constraints with fewer,
smaller dependencies.
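The two-step equivalence check can be sketched with the attribute closure: F covers G when every F.D. in G has its RHS inside the closure of its LHS computed under F. The function names and the pair-of-strings representation are assumptions for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every F.D. in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f and g derive each other (step 1 and step 2)."""
    return covers(f, g) and covers(g, f)

F = [("B", "CD"), ("AD", "E"), ("B", "A")]
G = [("B", "CDE"), ("B", "ABC"), ("AD", "E")]
same = equivalent(F, G)
```

Both directions are needed: {A → B} and {B → A} each fail to derive the other, so they are not equivalent.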
Steps to find an irreducible (minimal) set of F.D.s:
Step 1: Apply the decomposition rule so that every F.D. has a single
attribute on its RHS.
Step 2: Evaluate all F.D.s from step 1 for their necessity. If they are not
necessary, remove them from the list.
Step 3: Evaluate the necessity of the LHS attributes in the F.D.s obtained from
step 2. If they are not necessary, remove them from the F.D.
Step 4: Apply the union rule to F.D.s with a common LHS among those
obtained from step 3. Then we get the irreducible set.
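A minimal sketch of these steps follows. Note one deliberate change: it reduces LHS attributes before removing redundant F.D.s, that is, it swaps Steps 2 and 3. This is the ordering commonly used for minimal covers, because shrinking an LHS can make another F.D. redundant that was not redundant before. The function names and the string-pair representation are assumptions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: one attribute on every RHS (duplicates dropped, order kept).
    work = list(dict.fromkeys((frozenset(l), a) for l, r in fds for a in r))
    # LHS reduction: drop b from the LHS if the rest still determines the RHS.
    reduced = []
    for lhs, a in work:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, work):
                lhs.discard(b)
        reduced.append((frozenset(lhs), a))
    reduced = list(dict.fromkeys(reduced))
    # Redundancy check: drop an F.D. that the remaining ones already imply.
    kept = list(reduced)
    for fd in reduced:
        rest = [g for g in kept if g != fd]
        if fd[1] in closure(fd[0], rest):
            kept = rest
    # Step 4: union rule for F.D.s with a common LHS.
    merged = {}
    for lhs, a in kept:
        merged.setdefault(lhs, set()).add(a)
    return {"".join(sorted(l)): "".join(sorted(r)) for l, r in merged.items()}

cover = minimal_cover([("A", "B"), ("C", "B"), ("D", "ABC"), ("AC", "D")])
```

The result is a dictionary from LHS to merged RHS. A set of F.D.s can have more than one minimal cover, and the one found depends on the order in which F.D.s and attributes are examined.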
Example: Find the irreducible set of
F = {A → B, C → B, D → ABC, AC → D}
Sol:
Step 1:
(1) A → B
(2) C → B
(3) D → A
(4) D → B
(5) D → C
(6) AC → D
Step 2:
Remove 1 and compute A+ from 2, 3, 4, 5, 6
A+ = A
We need 1.
Remove 2 and compute C+ from 1, 3, 4, 5, 6
C+ = C
We need 2.
Remove 3 and compute D+ from 1, 2, 4, 5, 6
D+ = DBC
We need 3.
Remove 4 and compute D+ from 1, 2, 3, 5, 6
D+ = ADCB
D → B can be removed.
Remove 5 and compute D+ from 1, 2, 3, 4, 6
D+ = ABD
We need 5.
Remove 6 and compute AC+ from 1, 2, 3, 4, 5
AC+ = ACB
We need 6.
Step 3:
Remaining F.D.s: A → B, C → B, D → A, D → C, AC → D
Remove A from AC → D (giving C → D):
C+ with C → D = CDAB, C+ with AC → D = CB
The closures differ, so A cannot be removed.
Remove C from AC → D (giving A → D):
A+ with A → D = ADCB, A+ with AC → D = AB
The closures differ, so C cannot be removed.
Step 4:
Apply the union rule to D → A and D → C:
A → B
C → B
D → AC
AC → D
Therefore, it is an irreducible set of F.D.s.
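Example: F.D.s are
AB → C
C → B
A → B
Find the irreducible set.
Sol:
Step 2:
None of the three F.D.s can be removed at this stage.
Step 3:
Remove A from AB → C (giving B → C):
B+ with AB → C = B, B+ with B → C = BC
The closures differ, so A cannot be removed.
Remove B from AB → C (giving A → C):
A+ with AB → C = ABC, A+ with A → C = ACB
The closures are equal, so B can be removed.
Step 4:
A → C
C → B
A → B
A → B now follows from A → C and C → B by the transitive rule, so it can
also be removed, leaving A → C and C → B.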
Q3 FDs are
F = {ABD → AC, C → BE, AD → BF, B → E}
Find the minimal set.
Step 1:
(1) ABD → A
(2) ABD → C
(3) C → B
(4) C → E
(5) AD → B
(6) AD → F
(7) B → E
Step 2:
Remove (1) and compute ABD+ from (2-7)
ABD+ = ABDCEF
(1) can be removed.
Remove (2) and compute ABD+ from (1, 3-7)
ABD+ = ABDEF
We need (2).
Remove (3) and compute C+ from (1, 2, 4-7)
C+ = CE
We need (3).
Remove (4) and compute C+
C+ = BCE
(4) can be removed.
Remove (5) and compute AD+
AD+ = ADF
We need (5).
Remove (6) and compute AD+
AD+ = ADBCE
We need (6).
Remove (7) and compute B+
B+ = B
We need (7).
Step 3:
Remaining F.D.s: ABD → C, C → B, AD → B, AD → F, B → E
Remove A from ABD → C (giving BD → C):
BD+ with ABD → C = BDE, BD+ with BD → C = BDCE
The closures differ, so A cannot be removed.
Remove B from ABD → C (giving AD → C):
AD+ with ABD → C = ABFECD, AD+ with AD → C = ADCFBE
The closures are equal.
B can be removed.
Eg: R = ABCD
F: AB → C
AB → D
C → D
Key = AB
C → D is a transitive dependency, and it causes insertion, deletion and
update problems in the table.
NORMALIZATION
It is a tool to validate or evaluate the logical database design with the
help of rules which are called Normal Forms. They are
1 NF
2 NF
3 NF
BCNF
4 NF
5 NF
DKNF
Moving down this list, the problem intensity reduces and the number of tables
needed increases.
Points to Remember
F = AB → C
A → DE
B → F
F → GH
D → IJ
Sol: (a) Key = AB
AB+ = ABCDEFGHIJ
(b) A+ = ADEIJ
B+ = BFGH
(c) R1 = ADEIJ
R2 = BFGH
R3 = ABC (the remaining attributes, R = ABC)
This is the required 2NF.
Second example:
(a) Key = AB
(b) A+ = ACDF
B+ = BE
(c) R1 = ACDF
R2 = BE
R3 = AB (the remaining attributes, R = AB)
This is the required 2NF.
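The 2NF procedure used in these examples (take the closure of each part of the key, move what it determines into its own relation, keep the rest with the key) can be sketched as follows. This is a simplified illustration: it assumes a single candidate key and single-letter attribute names, and the function name to_2nf is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def to_2nf(attributes, key, fds):
    """Split off attributes that depend on only a part of the key."""
    key = set(key)
    relations, moved = [], set()
    for size in range(1, len(key)):              # proper subsets of the key
        for part in combinations(sorted(key), size):
            dependent = closure(part, fds) - key  # partially dependent attributes
            if dependent:
                relations.append("".join(sorted(set(part) | dependent)))
                moved |= dependent
    # Whatever is left stays with the full key.
    relations.append("".join(sorted(set(attributes) - moved)))
    return relations

fds = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]
result = to_2nf("ABCDEFGHIJ", "AB", fds)
```

With a key of three or more attributes, closures of overlapping key parts can share attributes, so the output of this sketch would need a manual check in that case.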
Q3 Consider the relation R = ABCDE. Find the key and normalize up to 2NF.
F = B → E
C → D
A → B
Sol: (a) Key = AC
(b) A+ = ABE
C+ = CD
(c) R1 = ABE
R2 = CD
R3 = AC
This is the required 2NF.
F = AB → C
BD → EF
AD → GH
A → I
H → J
Find the key and normalize up to 2NF.
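For F = {AB → C, A → DE, B → F, F → GH, D → IJ} with 2NF relations ADEIJ, BFGH and ABC:
ADEIJ has the transitive dependency D → IJ, so it is split into DIJ and ADE.
BFGH has the transitive dependency F → GH, so it is split into FGH and BF.
ABC stays as ABC.
R1 = DIJ
R2 = ADE
R3 = FGH
R4 = BF
R5 = ABC
It is in 3NF.
For the second example, with 2NF relations ACDF, BE and AB:
R1 = CD
R2 = ACF
R3 = BE
R4 = AB
It is in 3NF.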
Q3 Consider the relation R = ABCDE. Find the key and normalize up to 3NF.
F = B → E
C → D
A → B
R1 = ABE is split into BE and AB (transitive dependency B → E).
R2 = CD stays as CD.
R3 = AC stays as AC.
R1 = BE
R2 = AB
R3 = CD
R4 = AC
It is in 3NF.
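F = AB → C
BD → EF
AD → GH
A → I
H → J
Normalize up to 3NF.
2NF relations:
R1 = AI
R2 = ABC
R3 = BDEF
R4 = ADGHJ, which is split into HJ and ADGH (transitive dependency H → J; H must stay with AD so that AD → GH is kept)
R5 = ABD
Result:
R1 = AI
R2 = ABC
R3 = BDEF
R4 = HJ
R5 = ADGH
R6 = ABD
It is in 3NF.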
Q5 (a) Give a set of FDs for the relation schema R(ABCD) with primary
key AB under which R is in 1NF but not in 2NF.
(b) Find FDs such that R is in 2NF but not in 3NF.
R = ABCD
Key = AB
Sol: (a) With any of these FDs the table cannot be in 2NF:
B → C    A → C
B → D    A → D
(b) With these FDs the table may be in 2NF but not in 3NF:
C → D    D → C
Note 1: In general, an F.D. X → Y where X is A or B (a proper subset of the key AB)
and Y is a non-key attribute violates 2NF.
Note 2: In general, an F.D. X → Y where Y is a non-key attribute and X is neither a proper
subset of AB nor a super key violates 3NF but not 2NF.
Note 3: An F.D. X → Y is allowed in 3NF (and also in 2NF) if X is a
super key or Y is part of a key.
Q:
R:
A   B   C
a1  b1  c1
a2  b2  c2
a3  b1  c3
R1
A   B
a1  b1
a2  b2
a3  b1
R2
B   C
b1  c1
b2  c2
b1  c3
R1 ⋈ R2:
A   B   C
a1  b1  c1
a1  b1  c3
a2  b2  c2
a3  b1  c1
a3  b1  c3
5 rows
∴ R ≠ ∏R1(R) ⋈ ∏R2(R)
Lossy decomposition
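The row count above can be reproduced by projecting R onto AB and BC and joining the projections on the common column B. The helper names project and natural_join are assumptions made for this sketch, with each row stored as a dictionary.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    distinct = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(pairs) for pairs in distinct]

def natural_join(left, right):
    """Combine every pair of rows that agree on all common attributes."""
    out = []
    for l_row in left:
        for r_row in right:
            common = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in common):
                out.append({**l_row, **r_row})
    return out

relation = [dict(zip("ABC", t)) for t in [
    ("a1", "b1", "c1"),
    ("a2", "b2", "c2"),
    ("a3", "b1", "c3"),
]]
joined = natural_join(project(relation, "AB"), project(relation, "BC"))
spurious = [row for row in joined if row not in relation]
```

The join returns the three original rows plus two spurious ones, (a1, b1, c3) and (a3, b1, c1), because b1 does not identify a single row in either projection.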
The above method is time consuming and error prone. Therefore, to check
the lossless join property we use the following shortcut method:
if the common column between the relations holds unique values in at least one
of them (that is, it is a key of R1 or of R2), then the decomposition is
lossless; otherwise it is a lossy decomposition.
R1 ∩ R2 → R1
or
R1 ∩ R2 → R2
Sol: R1 (ABC)
R2 (CD)
A → B (R1)
A → C (R1)
C → D (R2)
Q:
(1) R = ABCD
A → D
C → A
B → C
Sol:
(a) Key = B
(b) At present ABCD is in 2NF
(c) BACD
(2) B → C
D → A
Sol:
(a) Key = BD
(b) 1NF
(c) 2NF = BC, DA, BD
3NF = BC, DA, BD
(d) BCNF = BC, DA, BD
BCD ⋈ DA
Lossless decomposition.
(3) ABC → D
D → A
Sol:
(a) Key = ABC, BCD
(b) Let key = ABC
3NF
(c) BCNF check:
ABC → D: ABC is a super key
D → A: D is not a super key
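Q) R = ABCD
A → B
BC → D
A → C
Sol:
a) Key = A
b) 2NF
c) 3NF = BCD, ABC
d) BCNF: A is the key of ABC and BC is the key of BCD, so the decomposition is in BCNF.
Q) AB → C
AB → D
C → A
D → B
Sol:
a) Key = AB, BC, CD, AD
b) 3NF
c) BCNF check:
AB → C, AB → D: AB is a super key
C → A: C is not a super key (X)
D → B: D is not a super key (X)
So the relation is in 3NF.
Q) AB → CEFG
A → D
F → G
FB → H
HBC → ADEFG
FBC → ADE
Sol:
a) Key = AB, HBC, FBC
b) 2NF = AD, ABCEFGH
c) 3NF = AD, FG, ABCEFH
d) BCNF: the determinants to check are AB, A, F, FB, HBC and FBC
BCNF = FG, AD, FBH, ABCEF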
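Q) R = ABCD
(1) B → C
D → A
R1 = BC, R2 = AD
Sol:
a) Key = BD
b) R1 ∩ R2 = ∅
Lossy decomposition (bad decomposition)
(2) AB → C
C → A
C → D
R1 = ACD, R2 = BC
a) Key = AB, BC
b) R1 ∩ R2 = C
C is a key of R1, so the decomposition is lossless.
(3) A → BC
C → AD
R1 = ABC, R2 = AD
a) Key = A
b) R1 ∩ R2 = A
A is a key of R1 and R2, so the decomposition is lossless.
(4) A → B
B → C
C → D
R1 = AB, R2 = ACD
Sol:
a) Key = A
b) R1 ∩ R2 = A
Lossless
FD not preserved: B → C
Q)
A → B
B → C
C → D
R1 = AB, R2 = AD, R3 = CD
Sol:
It is a lossy decomposition.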
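Q)
R = ABCDE
AB → DE
A → C
D → E
Sol:
(a) Key = AB
(b) 2NF = ABDE, AC
(c) 3NF = ABD, DE, AC
(d) BCNF: AB, D and A are the keys of ABD, DE and AC respectively, so the
3NF decomposition is also in BCNF.
Q)
AB → CDE
C → A
D → E
Sol:
(a) Key = AB, BC
(b) Choose AB as the key; there are no partial dependencies
3NF = ABCD, DE
BCNF = BCD, CA, DE (ABCD itself is not in BCNF because C → A holds and C is not a super key)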
Limitations of Normalization:
EmpNum  EmpPhone    EmpDegrees
111     040-222222  BA
111     040-333333  BA
A →→ B
A →→ C
For example, we specify the MVD in the above Employee relation as follows:
EmpNum →→ EmpPhone
EmpNum →→ EmpDegrees
If a relation R satisfies X →→ Y, the following must be true for every legal
instance r of R:
for any two tuples t1, t2 in r with t1(X) = t2(X), there exists t3 in r such
that
t3(X) = t1(X), t3(Y) = t1(Y), t3(Z) = t2(Z), where Z = R ─ XY.
By symmetry, there exists t4 in r such that t4(X) = t1(X), t4(Y) = t2(Y), t4(Z) =
t1(Z).
X →→ Y can be read as X multi-determines Y.
X   Y   Z
x1  y1  z1   t1
x1  y2  z2   t2
x1  y1  z2   t3
x1  y2  z1   t4
The MVD X →→ Y says that the relationship between X and Y is independent
of the relationship between X and R ─ Y.
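The tuple condition for an MVD can be checked mechanically on a sample instance. The sketch below uses the four tuples t1 to t4 from the table; the function name and the dictionary representation of rows are assumptions. As with FDs, a sample instance can only show that an MVD is violated.

```python
def satisfies_mvd(rows, x, y):
    """Check X ->-> Y on a list of dict rows; x and y are lists of attribute names."""
    attrs = list(rows[0].keys())
    z = [a for a in attrs if a not in x and a not in y]   # Z = R - XY

    def pick(row, names):
        return tuple(row[a] for a in names)

    present = {(pick(r, x), pick(r, y), pick(r, z)) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if pick(t1, x) == pick(t2, x):
                # Required t3: X and Y values from t1, Z value from t2.
                if (pick(t1, x), pick(t1, y), pick(t2, z)) not in present:
                    return False
    return True

instance = [dict(zip("XYZ", t)) for t in [
    ("x1", "y1", "z1"),   # t1
    ("x1", "y2", "z2"),   # t2
    ("x1", "y1", "z2"),   # t3
    ("x1", "y2", "z1"),   # t4
]]
full_ok = satisfies_mvd(instance, ["X"], ["Y"])
without_t4 = satisfies_mvd(instance[:3], ["X"], ["Y"])
```

Because the loop visits every ordered pair (t1, t2), the symmetric tuple t4 is covered by the same test with the roles of t1 and t2 exchanged.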
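Trivial MVD:
An MVD A →→ B in a relation R is trivial if
a) B is a subset of A or
b) A U B = R.
A relation R is in 4NF if, for every MVD X →→ Y that holds over R, one of the
following is true:
Y is a subset of X or XY = R, or
X is a super key.
Example:
Emp1(EmpNum, EmpPhone)
111  040-333333
Emp2(EmpNum, EmpDegrees)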
Both new relations are in 4NF because the Emp1 relation contains the trivial
MVD EmpNum→→EmpPhone, and the Emp2 relation contains the trivial
MVD EmpNum→→EmpDegrees.
Properties of Decomposition
Lossless-Join Decomposition:
From the definition it is easy to see that r is always a subset of natural join
of decomposed relations. If we take projections of a relation and recombine
them using natural join, we typically obtain some tuples that were not in the
original relation.
Example:
By replacing the instance r shown in the figure with the instances ∏SP (r) and
∏PD (r), we lose some information.
Instance r:
S   P   D
s1  p1  d1
s2  p2  d2
s3  p1  d3
∏SP (r):
S   P
s1  p1
s2  p2
s3  p1
∏PD (r):
P   D
p1  d1
p2  d2
p1  d3
∏SP (r) ⋈ ∏PD (r):
S   P   D
s1  p1  d1
s2  p2  d2
s3  p1  d3
s1  p1  d3
s3  p1  d1
Theorem: Let R be a relation and F be a set of FDs that hold over R. The
decomposition of R into relations with attribute sets R1 and R2 is lossless if
and only if F+ contains either the FD R1 ∩ R2 → R1 (or R1─R2) or the FD R1
∩ R2 → R2 (or R2─R1).
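The theorem gives a direct test: compute the closure of the common attributes R1 ∩ R2 and check whether it contains all of R1 or all of R2. The sketch below assumes F.D.s written as (LHS, RHS) string pairs with single-letter attributes; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2)+ must contain R1 or R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

# R1 = BC, R2 = AD with B -> C, D -> A: nothing in common, so lossy.
case_disjoint = is_lossless("BC", "AD", [("B", "C"), ("D", "A")])
# R1 = ACD, R2 = BC with AB -> C, C -> A, C -> D: C determines all of R1.
case_common_c = is_lossless("ACD", "BC", [("AB", "C"), ("C", "A"), ("C", "D")])
# SP and PD with no FDs: P alone determines neither side.
case_spd = is_lossless("SP", "PD", [])
```

The test applies to a decomposition into two relations. For three or more relations it can be applied step by step as relations are joined one at a time, but failing one particular order of joining does not by itself prove the decomposition lossy.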
Dependency-Preserving Decomposition:
Consider the Contracts relation with attributes CSJDPQV. The given FDs are
C→CSJDPQV, JP→C, and SD→P. Because SD is not a key, the dependency
SD→P causes a violation of BCNF.
We can decompose Contracts into relations with schemas CSJDQV and SDP
to address this violation. The decomposition is lossless-join. But, there is
one problem. If we want to enforce an integrity constraint JP→C, it requires
an expensive join of the two relations. We say that this decomposition is not
dependency-preserving.
Let R be a relation schema that is decomposed into two schemas with
attribute sets X and Y, and let F be a set of FDs over R. The projection of
F on X is the set of FDs in the closure F+ that involve only attributes in X.
We denote the projection of F on attributes X as FX. Note that a dependency
U→V in F+ is in FX only if all the attributes in U and V are in X.
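Whether a particular FD survives a decomposition can be checked without writing out the projections. The sketch below uses a commonly taught iterative test, named plainly here because the notes do not give it: start from the LHS, and for each decomposed schema add whatever the attributes already collected inside that schema determine within it, until nothing changes. The function names are assumptions; the data is the Contracts example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_preserved(lhs, rhs, schemas, fds):
    """True if lhs -> rhs follows from the projections of fds onto the schemas."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            schema = set(schema)
            # Attributes of this schema determined by what we hold inside it.
            gained = closure(result & schema, fds) & schema
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

contracts_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
schemas = ["CSJDQV", "SDP"]
jp_c = is_preserved("JP", "C", schemas, contracts_fds)
sd_p = is_preserved("SD", "P", schemas, contracts_fds)
```

JP → C is not preserved because J and P never appear together in one of the two schemas, which is why enforcing it would need a join.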
Example:
The closure of F contains all dependencies in F plus A→C, B→A, and C→B.
Consequently FAB contains
A→B and B→A, and FBC contains B→C and C→B. Therefore, FAB U FBC
contains A→B, B→C, B→A
and C→B. The closure of FAB and FBC now includes C→A (which follows from
C→B and B→A). Thus