DBMS Unit-5


PARUL INSTITUTE OF ENGINEERING & TECHNOLOGY

FACULTY OF ENGINEERING & TECHNOLOGY

PARUL UNIVERSITY

Database Management System


Relational Database Design
Computer Science & Engineering

Nilesh Khodifad
Outline
• Functional Dependency (FD) and its Types
• Armstrong's axioms OR Inference rules
• Closure of a set of FDs
• Closure of attribute sets
• Decomposition
• Anomaly and its types
• Normalization and normal forms
What is Functional Dependency (FD)?
• Let R be a relation schema having n attributes A1, A2, A3, …, An.
• Let X and Y be two subsets of the attributes of R.
• If the values of the X component of a tuple uniquely (or functionally) determine the values of the Y component, then there is a functional dependency from X to Y.
• This is denoted by X → Y (e.g., in a Student relation, RollNo → {Name, SPI, BL}).
• It is read as: Y is functionally dependent on X, or X functionally determines Y.
Diagrammatic representation of Functional Dependency (FD)

 Example
 Consider the relation Account(account_no, balance, branch).
 account_no can determine balance and branch.
 So, there is a functional dependency from account_no to balance and branch.
 This can be denoted by account_no → {balance, branch}.
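As a minimal sketch (not part of the original slides), the Account example can be checked on sample data in Python. The rows and the helper name fd_holds are hypothetical. Note that a sample instance can only refute an FD: stating that the FD holds for every valid instance is a design decision about the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this lhs value fixes the rhs value;
        # any later row with the same lhs must agree with it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for Account(account_no, balance, branch).
account = [
    {"account_no": "A01", "balance": 5000, "branch": "Rajkot"},
    {"account_no": "A02", "balance": 7000, "branch": "Surat"},
    {"account_no": "A03", "balance": 5000, "branch": "Rajkot"},
]

print(fd_holds(account, ["account_no"], ["balance", "branch"]))  # True
print(fd_holds(account, ["branch"], ["account_no"]))             # False
```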
Types of Functional Dependency (FD)
• Full Functional Dependency
  • In a relation, the attribute B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
  • E.g. {Roll_No, Semester, Department_Name} → SPI
  • We need all three of {Roll_No, Semester, Department_Name} to find SPI.
• Partial Functional Dependency
  • In a relation, the attribute B is partially functionally dependent on A if B is functionally dependent on A and also on some proper subset of A.
  • If some attribute can be removed from A and the dependency still holds, then it is a partial functional dependency.
  • E.g. {Enrollment_No, Department_Name} → SPI
  • Enrollment_No is sufficient to find SPI; Department_Name is not required.
Types of Functional Dependency (FD)
• Transitive Functional Dependency
  • In a relation, if A → B and B → C hold, then A → C holds (C is transitively dependent on A via B).
  • E.g. Subject → Faculty and Faculty → Age, so Subject → Age.
  • By the rule of transitive dependency, Subject → Age holds. This makes sense: if we know the subject, we know its faculty, and hence the faculty's age.
Types of Functional Dependency (FD)
 Trivial Functional Dependency
 X → Y is trivial FD if Y is a subset of X
 Eg. {Roll_No, Department_Name, Semester} → Roll_No
 Nontrivial Functional Dependency
 X → Y is nontrivial FD if Y is not a subset of X
 Eg. {Roll_No, Department_Name, Semester} → Student_Name
Armstrong's axioms OR Inference rules
 Armstrong's axioms are a set of rules used to infer (derive) all the
functional dependencies on a relational database.
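For reference, the three axioms (reflexivity, augmentation, transitivity) and the commonly derived rules used in the examples that follow can be summarized as below, where W, X, Y and Z denote sets of attributes. This is the standard textbook statement, given as a summary rather than a reproduction of the original slide.

```latex
\begin{align*}
&\text{Reflexivity:}         && Y \subseteq X \;\Rightarrow\; X \to Y \\
&\text{Augmentation:}        && X \to Y \;\Rightarrow\; XZ \to YZ \\
&\text{Transitivity:}        && X \to Y,\ Y \to Z \;\Rightarrow\; X \to Z \\
&\text{Union:}               && X \to Y,\ X \to Z \;\Rightarrow\; X \to YZ \\
&\text{Decomposition:}       && X \to YZ \;\Rightarrow\; X \to Y,\ X \to Z \\
&\text{Pseudo-transitivity:} && X \to Y,\ WY \to Z \;\Rightarrow\; WX \to Z
\end{align*}
```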
What is closure of a set of FDs?
• Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
• E.g. if F = {A → B, B → C}, then we can infer A → C (by the transitivity rule).
• The set of all functional dependencies (FDs) that are logically implied by F is called the closure of F.
• It is denoted by F+.
Closure of a set of FDs [Example]
• Suppose we are given a relation schema R(A,B,C,G,H,I) and the set of functional dependencies:
• F = (A → B, A → C, CG → H, CG → I, B → H)
• The functional dependency A → H is logically implied.
• We have A → B and B → H, so by the transitivity rule A → H.
Closure of a set of FDs [Example]
• Suppose we are given a relation schema R(A,B,C,G,H,I) and the set of functional dependencies:
• F = (A → B, A → C, CG → H, CG → I, B → H)
• The functional dependency CG → HI is logically implied.
• We have CG → H and CG → I, so by the union rule CG → HI.
Closure of a set of FDs [Example]
• Suppose we are given a relation schema R(A,B,C,G,H,I) and the set of functional dependencies:
• F = (A → B, A → C, CG → H, CG → I, B → H)
• The functional dependency AG → I is logically implied.
• We have A → C and CG → I, so by the pseudo-transitivity rule AG → I.
Closure of a set of FDs [Example]
• Suppose we are given a relation schema R(A,B,C,G,H,I) and the set of functional dependencies:
• F = (A → B, A → C, CG → H, CG → I, B → H)
• The functional dependency AG → I is logically implied (alternative derivation).
• We have A → C, so by the augmentation rule AG → CG.
• We have AG → CG and CG → I, so by the transitivity rule AG → I.
Closure of a set of FDs [Example]
• Suppose we are given a relation schema R(A,B,C,G,H,I) and the set of functional dependencies:
• F = (A → B, A → C, CG → H, CG → I, B → H)
• Find out the closure of F.
• F+ contains every FD in F plus all FDs derivable from them, including A → H, CG → HI and AG → I.
Closure of a set of FDs [Example]
• Compute the closure of the following set F of functional dependencies for the relational schema R = (A,B,C,D,E,F):
• F = (A → B, A → C, CD → E, CD → F, B → E)
• Find out the closure of F.
Closure of a set of FDs [Example]
• Compute the closure of the following set F of functional dependencies for the relational schema R = (A,B,C,D,E):
• F = (AB → C, D → AC, D → E)
• Find out the closure of F.
• F+ contains every FD in F plus all FDs derivable from them, including D → A, D → C and D → ACE.
What is a closure of attribute sets?
• Given a set of attributes α, the closure of α under F is the set of attributes that are functionally determined by α under F.
• It is denoted by α+.
• Algorithm to compute α+, the closure of α under F:
  1. result = α
  2. while (changes to result) do
       for each β → γ in F do
         if β ⊆ result then result = result ∪ γ
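The steps above can be sketched in Python as follows. This is an illustrative implementation, not the author's code: the function name attribute_closure and the representation of each FD as a (left, right) pair of strings of single-letter attributes are assumptions made for the example.

```python
def attribute_closure(attrs, fds):
    """Compute the closure of the attribute set attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:                      # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:            # "for each beta -> gamma in F"
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)      # "result = result U gamma"
                changed = True
    return result


# FDs from the R(A, B, C, G, H, I) examples in these slides.
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(attribute_closure("AG", F))))  # ABCGHI
print("H" in attribute_closure("A", F))             # True, so A -> H is in F+
```

The same function answers membership questions about F+: X → Y is implied by F exactly when Y is contained in the closure of X.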
Closure of attribute sets [Example]
• Consider the relation schema R = (A, B, C, G, H, I).
• For this relation, a set of functional dependencies F can be given as F = {A → B, A → C, CG → H, CG → I, B → H}
• Find the closure of AG, written (AG)+.
• Step 1: result = α, so result = AG.
• Step 2: scan the FDs in F repeatedly until result stops changing:
  • A → B: A ⊆ AG, so result = ABG
  • A → C: A ⊆ ABG, so result = ABCG
  • CG → H: CG ⊆ ABCG, so result = ABCGH
  • CG → I: CG ⊆ ABCGH, so result = ABCGHI
  • B → H: B ⊆ ABCGHI, but H is already in result
  • A second pass adds nothing, so the loop stops.
• (AG)+ = ABCGHI
Closure of attribute sets [Exercise]
• Given functional dependencies (FDs) for relational schema R = (A,B,C,D,E):
• F = {A → BC, CD → E, B → D, E → A}
• Find the closure of A, CD, B, BC and E.
Answer
• A+ = ABCDE
• CD+ = ABCDE
• B+ = BD
• BC+ = ABCDE
• E+ = ABCDE
What is decomposition?
• Decomposition is the process of breaking down a given relation into two or more relations.
• Relation R is replaced by two or more relations in such a way that:
  • Each new relation contains a subset of the attributes of R
  • Together, they include all tuples and attributes of R
• Types of decomposition
  • Lossy decomposition
  • Lossless decomposition (non-loss decomposition)
Lossy decomposition
• The decomposition of relation R into R1 and R2 is lossy when the join of R1 and R2 does not yield the same relation as R.
• This is also referred to as a lossy-join decomposition.
• The disadvantage of this kind of decomposition is that some information is lost when the original relation is reconstructed.
• From a practical point of view, a decomposition should not be lossy.
Lossless decomposition
• The decomposition of relation R into R1 and R2 is lossless when the join of R1 and R2 produces the same relation as R.
• This is also referred to as a non-additive (non-loss) decomposition.
• All decompositions must be lossless.
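The difference between the two kinds of decomposition can be seen by projecting a small relation onto R1 and R2 and joining the projections back. The sketch below is only an illustration: the students rows, the attribute names and the helper functions are hypothetical, and the check applies to this one instance rather than to every possible instance of the schema.

```python
def project(rows, attrs):
    """Project rows onto attrs, eliminating duplicate tuples."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two projections produced by project()."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly rows."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical instance: two students in the same department.
students = [
    {"RollNo": 1, "Name": "Raj", "Dept": "CE"},
    {"RollNo": 2, "Name": "Meet", "Dept": "CE"},
]

# Joining on RollNo reconstructs the relation: lossless.
print(is_lossless(students, ["RollNo", "Name"], ["RollNo", "Dept"]))  # True
# Joining on Dept pairs every name with every roll number: lossy.
print(is_lossless(students, ["RollNo", "Dept"], ["Name", "Dept"]))    # False
```

In the lossy case the join returns four tuples instead of two. The extra (spurious) tuples are what makes it impossible to tell which name belongs to which roll number.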
What is an anomaly in database design?
• Anomalies are problems that can occur in a poorly planned, un-normalized database where all the data are stored in one table.
• Three types of anomalies can arise in the database because of redundancy:
  • Insert anomaly
  • Delete anomaly
  • Update / Modification anomaly
Insert anomaly
• Consider a relation Emp_Dept(EID, Ename, City, DID, Dname, Manager) with EID as the primary key.
• Suppose a new department (IT) has been started by the organization, but initially no employee has been appointed to that department.
• We want to insert the details of that department into the Emp_Dept table.
• But the tuple for this department cannot be inserted into this table, because EID would be NULL, which is not allowed for a primary key.
• This kind of problem, where some tuple cannot be inserted into the relation, is known as an insert anomaly.
Delete anomaly
• Consider a relation Emp_Dept(EID, Ename, City, DID, Dname, Manager) with EID as the primary key.
• Now suppose there is only one employee in some department (IT) and that employee leaves the organization.
• So we need to delete the tuple of that employee (Jay).
• But in doing so, the information about the department is deleted as well.
• This kind of problem, where deleting some tuples leads to the loss of other data that was not intended to be removed, is known as a delete anomaly.
Update anomaly
• Consider a relation Emp_Dept(EID, Ename, City, Dname, Manager) with EID as the primary key.
• Suppose the manager of the CE department has changed. The Manager value in all the tuples corresponding to that department must then be changed to reflect the new status.
• If we fail to update all the tuples of the given department, two records of employees working in the same department might show different managers, which leads to inconsistency in the database.
How to deal with insert, delete and update anomaly
What is normalization?
• Normalization is the process of removing redundant data from tables to improve data integrity, scalability and storage efficiency.
  • data integrity (completeness, accuracy and consistency of data)
  • scalability (ability of a system to continue to function well under a growing amount of work)
  • storage efficiency (ability to store and manage data using the least amount of space)
• What do we do in normalization?
  • Normalization generally involves splitting an existing table into multiple tables, which can be re-joined or linked each time a query is executed.
How many normal forms are there?
• Normal forms:
  • 1NF (First normal form)
  • 2NF (Second normal form)
  • 3NF (Third normal form)
  • BCNF (Boyce–Codd normal form)
  • 4NF (Fourth normal form)
  • 5NF (Fifth normal form)

As we move from 1NF to 5NF, the number of tables and the complexity increase but redundancy decreases.
1NF (First Normal Form)
• Conditions for 1NF
  • Each cell of a table should contain a single value.
• A relation R is in first normal form (1NF) if and only if it does not contain any composite attributes, multi-valued attributes, or combinations of them.
OR
• A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.
1NF (First Normal Form) [Example - Composite attribute]

• In the customer relation, address is a composite attribute that is further divided into the sub-attributes "Road" and "City".
• So the customer relation is not in 1NF.
• Problem: It is difficult to retrieve the list of customers living in 'Jamnagar' city from the customer table.
• The reason is that the address attribute is a composite attribute that contains the road name as well as the city name in a single cell.
• It is possible that the city name also appears in the road name.
• In our example, the word 'Jamnagar' occurs in both records: in the first record it is part of the road name, and in the second it is the name of the city.
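A small Python illustration of this retrieval problem follows. The customer rows are hypothetical stand-ins for the table on the slide, chosen so that 'Jamnagar' is part of the road name in the first record and the city in the second.

```python
# Address stored as one composite cell (not in 1NF).
customers_composite = [
    {"cid": "C01", "name": "Raj", "address": "Jamnagar Road, Rajkot"},
    {"cid": "C02", "name": "Meet", "address": "Nehru Road, Jamnagar"},
]

# A text search on the composite cell cannot tell road from city,
# so it wrongly returns the customer who only lives on Jamnagar Road.
by_text_search = [c["name"] for c in customers_composite
                  if "Jamnagar" in c["address"]]
print(by_text_search)  # ['Raj', 'Meet']

# Same data with the composite attribute split into atomic attributes.
customers_1nf = [
    {"cid": "C01", "name": "Raj", "road": "Jamnagar Road", "city": "Rajkot"},
    {"cid": "C02", "name": "Meet", "road": "Nehru Road", "city": "Jamnagar"},
]

by_city = [c["name"] for c in customers_1nf if c["city"] == "Jamnagar"]
print(by_city)  # ['Meet']
```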
1NF (First Normal Form) [Example - Multivalued attribute]
2NF (Second Normal Form)
• Conditions for 2NF
  • It is in 1NF and each table should contain a single primary key.
• A relation R is in second normal form (2NF)
  • if and only if it is in 1NF and
  • every non-primary-key attribute is fully dependent on the primary key
OR
• A relation R is in second normal form (2NF)
  • if and only if it is in 1NF and
  • no non-primary-key attribute is partially dependent on the primary key
2NF (Second Normal Form) [Example]

• FD1: {CID, ANO} → {AccessDate, Balance, BranchName}
• FD2: ANO → {Balance, BranchName}
• Balance and BranchName are partially dependent on the primary key (CID + ANO). So the customer relation is not in 2NF.
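The partial dependency in this example can also be found mechanically. The sketch below assumes that the primary key {CID, ANO} is the only candidate key, so every other attribute is non-prime. The function names are hypothetical.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where each FD is a (set, set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of key to the non-key attributes it determines.

    Assumes key is the only candidate key of the relation.
    """
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = attribute_closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found


key = {"CID", "ANO"}
fds = [
    ({"CID", "ANO"}, {"AccessDate", "Balance", "BranchName"}),  # FD1
    ({"ANO"}, {"Balance", "BranchName"}),                       # FD2
]

# ANO alone determines Balance and BranchName, so 2NF is violated.
print(partial_dependencies(key, fds))
```

An empty result would mean that no proper subset of the key determines a non-key attribute, which is the 2NF condition stated above.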
2NF (Second Normal Form) [Example]

• Problem: In the case of a joint account, multiple customers share one account.
• If an account 'A01' is operated jointly by two customers, say 'C01' and 'C02', then the values of Balance and BranchName are duplicated in the two tuples for customers 'C01' and 'C02'.
2NF (Second Normal Form) [Example]

• Solution: Decompose the relation in such a way that the resultant relations do not have any partial FD.
  • Remove the partially dependent attributes from the relation that violates 2NF.
  • Place them in a separate relation along with the prime attribute on which they are fully dependent.
  • The primary key of the new relation will be the attribute on which they are fully dependent.
  • Keep the other attributes in the original table with the same primary key.
3NF (Third Normal Form)
• Conditions for 3NF
  • It is in 2NF and there is no transitive dependency.
  • (Transitive dependency: if A → B and B → C, then A → C.)
• A relation R is in third normal form (3NF)
  • if and only if it is in 2NF and
  • every non-key attribute is non-transitively dependent on the primary key
OR
• A relation R is in third normal form (3NF)
  • if and only if it is in 2NF and
  • no non-key attribute is transitively dependent on the primary key
3NF (Third Normal Form) [Example]

• FD1: ANO → {Balance, BranchName, BranchAddress}
• FD2: BranchName → BranchAddress
• So ANO → BranchAddress (using the transitivity rule).
• BranchAddress is transitively dependent on the primary key (ANO). So the customer relation is not in 3NF.
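A transitive dependency of this kind shows up as an FD whose left-hand side is not a superkey and whose right-hand side is a non-key attribute. The following sketch looks for such FDs. It assumes ANO is the only candidate key, and the function names are hypothetical.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where each FD is a (set, set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, key, fds):
    """List (determinant, attribute) pairs that violate 3NF.

    Assumes key is the only candidate key, so the prime attributes
    are exactly the attributes of key.
    """
    violations = []
    for lhs, rhs in fds:
        is_superkey = attribute_closure(lhs, fds) >= attributes
        for attr in sorted(rhs - lhs):
            if not is_superkey and attr not in key:
                violations.append((lhs, attr))
    return violations


attributes = {"ANO", "Balance", "BranchName", "BranchAddress"}
key = {"ANO"}
fds = [
    ({"ANO"}, {"Balance", "BranchName", "BranchAddress"}),  # FD1
    ({"BranchName"}, {"BranchAddress"}),                    # FD2
]

print(third_nf_violations(attributes, key, fds))
# [({'BranchName'}, 'BranchAddress')]
```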
3NF (Third Normal Form) [Example]

• Problem: In this relation, the branch address is stored repeatedly for each account of the same branch, which occupies more space.
3NF (Third Normal Form) [Example]

• Solution: Decompose the relation in such a way that the resultant relations do not have any transitive FD.
  • Remove the transitively dependent attributes from the relation that violates 3NF.
  • Place them in a new relation along with the non-prime attributes that caused the transitive dependency.
  • The primary key of the new relation will be the non-prime attributes that caused the transitive dependency.
  • Keep the other attributes in the original table with the same primary key, and add the prime attributes of the new relation to it as a foreign key.
BCNF (Boyce-Codd Normal Form)

• A relation R is in Boyce-Codd normal form (BCNF)
  • if and only if it is in 3NF and
  • for every nontrivial functional dependency X → Y, X is a superkey of the table (a candidate key or a superset of one).
OR
• A relation R is in Boyce-Codd normal form (BCNF)
  • if and only if it is in 3NF and
  • every prime attribute is non-transitively dependent on the primary key
OR
• A relation R is in Boyce-Codd normal form (BCNF)
  • if and only if it is in 3NF and
  • no prime attribute is transitively dependent on the primary key
BCNF (Boyce-Codd Normal Form) [Example]
Multivalued dependency (MVD)
• For a dependency X → Y, if multiple values of Y exist for a single value of X, then the table may have a multivalued dependency.

Student
RNO   Subject   Faculty
101   DS        Patel
101   DBMS      Patel
101   DS        Shah
101   DBMS      Shah

• A multivalued dependency (MVD) is denoted by →→ and is written as X →→ Y.
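For the Student table above, RNO →→ Subject can be checked against the usual formal definition: whenever two tuples agree on X, the tuple that takes X and Y from the first and the remaining attributes from the second must also be present. The sketch below applies that definition to one instance. The function name mvd_holds is hypothetical.

```python
def mvd_holds(rows, x, y):
    """Return True if the MVD x ->> y holds in this instance.

    x and y are lists of attribute names. z is every other attribute.
    """
    z = [a for a in rows[0] if a not in x and a not in y]
    order = x + y + z
    existing = {tuple(row[a] for a in order) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # x and y values from t1, z values from t2 must exist too.
                needed = (tuple(t1[a] for a in x + y)
                          + tuple(t2[a] for a in z))
                if needed not in existing:
                    return False
    return True


student = [
    {"RNO": 101, "Subject": "DS", "Faculty": "Patel"},
    {"RNO": 101, "Subject": "DBMS", "Faculty": "Patel"},
    {"RNO": 101, "Subject": "DS", "Faculty": "Shah"},
    {"RNO": 101, "Subject": "DBMS", "Faculty": "Shah"},
]

print(mvd_holds(student, ["RNO"], ["Subject"]))      # True
# Without the (DBMS, Shah) row the combinations are incomplete.
print(mvd_holds(student[:3], ["RNO"], ["Subject"]))  # False
```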
4NF (Fourth Normal Form)
• Conditions for 4NF
  • A relation R is in fourth normal form (4NF)
    • if and only if it is in BCNF and
    • has no multivalued dependencies
• The above Student table has a multivalued dependency, so it is not in 4NF.
Functional dependency & Multivalued dependency
5NF (Fifth Normal Form)
• Conditions for 5NF
  • A relation R is in fifth normal form (5NF)
    • if and only if it is in 4NF and
    • it cannot be losslessly decomposed into any number of smaller tables (relations).
• The Student_Result relation can be further decomposed into sub-relations, so it is not in 5NF.
How to find key?

• Conditions to find key
  • An attribute is part of the key if it does not occur on either side of any FD.
  • An attribute is part of the key if it occurs on the left-hand side of an FD but never on the right-hand side.
  • An attribute is not part of the key if it occurs on the right-hand side of an FD but never on the left-hand side.
  • An attribute may or may not be part of the key if it occurs on both sides of FDs.
How to find key? [Example]
• Let R be a relation with attributes ABCD and FDs C → A, B → C. Find the keys of R.
  • Attribute on neither side of any FD: D (in the key) √
  • Attribute only on the left-hand side of FDs: B (in the key) √
  • Attribute only on the right-hand side of FDs: A (not in the key) X
  • Attribute on both sides of FDs: C (undecided) ?
• The core is BD.
• B determines C and C determines A, so by the transitivity rule B also determines A.
• So BD is a key.
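The four rules and the "core" idea can be sketched in Python as follows. This is an illustrative implementation with hypothetical function names. Attributes are single letters and each FD is a (left, right) pair of strings. When the core alone is not a key, the sketch tries adding attributes that occur on both sides, smallest combinations first.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def classify_attributes(attributes, fds):
    """Group attributes by the side(s) of the FDs on which they occur."""
    attributes = set(attributes)
    left = set().union(*(set(lhs) for lhs, _ in fds))
    right = set().union(*(set(rhs) for _, rhs in fds))
    return {
        "neither": attributes - left - right,         # always in the key
        "left_only": (left - right) & attributes,     # always in the key
        "right_only": (right - left) & attributes,    # never in the key
        "both": left & right & attributes,            # may or may not be
    }


def candidate_keys(attributes, fds):
    attributes = set(attributes)
    groups = classify_attributes(attributes, fds)
    core = groups["neither"] | groups["left_only"]
    undecided = sorted(groups["both"])
    keys = []
    for size in range(len(undecided) + 1):
        for extra in combinations(undecided, size):
            candidate = core | set(extra)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if attribute_closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


print(candidate_keys("ABCD", [("C", "A"), ("B", "C")]))
# one key, consisting of B and D
```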
How to find key? [Exercise]
• Let R be a relation with attributes ABCD and FDs C → D, C → A and B → C. Find the keys of R.
  • The core is B. B determines C, which determines A and D, so B is the key.

• Let R be a relation with attributes ABCD and FDs B → C, D → A. Find the keys of R.
  • The core is BD. B determines C and D determines A, so BD is the key.

• Let R be a relation with attributes ABCD and FDs A → B, BC → D and A → C. Find the keys of R.
  • The core is A. A determines B and C, which together determine D, so A is the key.
Find (candidate) key & check for normal forms [Example]
• Suppose you are given a relation R with four attributes ABCD and the set of FDs F = (B → C, D → A).
• Identify the candidate key(s) for R.
• Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
• The candidate key is BD.
• Relation R is in 1NF but not in 2NF, because the FDs contain partial dependencies:
  • By B → C, C depends only on B, but the key is BD, so C is partially dependent on the key (BD).
  • By D → A, A depends only on D, but the key is BD, so A is partially dependent on the key (BD).
Find (candidate) key & check for normal forms [Example]
• Suppose you are given a relation R with four attributes ABCD and the set of FDs F = (C → D, C → A, B → C).
• Identify the candidate key(s) for R.
• Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
• The candidate key is B.
• Relation R is in 2NF but not in 3NF, because the FDs contain transitive dependencies:
  • By B → C and C → D, we get B → D, so D is transitively dependent on the key (B).
  • By B → C and C → A, we get B → A, so A is transitively dependent on the key (B).
Find (candidate) key & check for normal forms [Example]
• Suppose you are given a relation R with four attributes ABCD and the set of FDs F = (ABC → D, D → A).
• Identify the candidate key(s) for R.
• Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).
• The candidate keys are ABC and BCD.
• Relation R is in 3NF but not in BCNF.
• In both FDs the dependent (right-hand) side is a prime attribute (D and A), so 3NF holds. But in D → A the determinant D is not a key, so BCNF does not hold.
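The three worked examples above can be cross-checked with a short Python sketch. It is a brute-force illustration rather than a definitive procedure: the function names are hypothetical, attributes are single letters, 1NF is taken for granted, and candidate keys are found by trying every attribute combination, which is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, each FD a (left, right) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


def highest_normal_form(attributes, fds):
    """Return '1NF', '2NF', '3NF' or 'BCNF' (the relation is assumed to be in 1NF)."""
    attributes = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = attributes - prime

    def is_superkey(attrs):
        return closure(attrs, fds) == attributes

    # Split every FD into nontrivial (determinant, single attribute) pairs.
    pairs = [(set(lhs), a) for lhs, rhs in fds for a in set(rhs) - set(lhs)]

    if all(is_superkey(lhs) for lhs, _ in pairs):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in pairs):
        return "3NF"
    # 2NF fails if part of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if (closure(part, fds) - set(part)) & non_prime:
                    return "1NF"
    return "2NF"


print(highest_normal_form("ABCD", [("B", "C"), ("D", "A")]))              # 1NF
print(highest_normal_form("ABCD", [("C", "D"), ("C", "A"), ("B", "C")]))  # 2NF
print(highest_normal_form("ABCD", [("ABC", "D"), ("D", "A")]))            # 3NF
```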
How to normalize database?
 A software contract and consultancy firm maintains details of all the
various projects in which its employees are currently involved. These
details comprise: Employee Number, Employee Name, Date of Birth,
Department Code, Department Name, Project Code, Project
Description, Project Supervisor.
 Assume the following:
 Each employee number is unique.
 Each department has a single department code.
 Each project has a single code and supervisor.
 Each employee may work on one or more projects.
 Employee names need not necessarily be unique.
 Project Code, Project Description and Project Supervisor are repeating fields.
 Normalize this data to Third Normal Form.
How to normalize database?

Employee Number  Employee Name  Date of Birth  Department Code  Department Name  Project Code  Project Description  Project Supervisor
1                Raj            1-1-85         1                CE               1             IOT                  Patel
2                Meet           4-4-86         2                EC               2             PHP                  Shah
3                Suresh         2-2-85         1                CE               1             IOT                  Patel
1                Raj            1-1-85         1                CE               2             PHP                  Shah
How to normalize database?
[Figures: the relation in 1NF and in 2NF]
How to normalize database?
3NF
Employee Number  Employee Name  Date of Birth  Department Code
1                Raj            1-1-85         1
2                Meet           4-4-86         2
3                Suresh         2-2-85         1

Department Code  Department Name
1                CE
2                EC

Employee Number  Project Code
1                1
2                2
3                1
1                2

Project Code  Project Description  Project Supervisor
1             IOT                  Patel
2             PHP                  Shah
Thank You!!!
www.paruluniversity.ac.in
