20CS404-DBMS (5 Unit Notes)-82-140

The document outlines the syllabus and contents of a unit on the Entity-Relationship (E-R) model in database design, covering topics such as E-R diagrams, relational mapping, functional dependencies, and normalization forms. It details the database design process, including requirement analysis, conceptual design, and logical design, along with the definitions of entities, relationships, and attributes. Additionally, it explains mapping cardinalities, weak entities, and the concepts of specialization and generalization in the Enhanced E-R model.

Uploaded by

praveen

UNIT - II

Syllabus
Entity-Relationship model - E-R Diagrams - Enhanced-ER Model - ER-to-Relational Mapping -
Functional Dependencies - Non-loss Decomposition - First, Second, Third Normal Forms,
Dependency Preservation - Boyce/Codd Normal Form - Multi-valued Dependencies and Fourth
Normal Form - Join Dependencies and Fifth Normal Form.

Contents
2.1 Introduction to Entity Relationship Model
2.2 Mapping Cardinality
2.3 ER Diagrams
2.4 Enhanced ER Model
2.5 Examples based on ER Diagram
2.6 ER to Relational Mapping (AU : May-17, Marks 13)
2.7 Concept of Relational Database Design
2.8 Functional Dependencies
2.9 Concept of Redundancy and Anomalies
2.10 Decomposition (AU : Dec.-17, Marks 7)
2.11 Normal Forms (AU : Dec.-14, 15, May-18, Marks 16)
2.12 Boyce / Codd Normal Form (BCNF)
2.13 Multivalued Dependencies and Fourth Normal Form (AU : May-14, Dec.-16, Marks 16)
2.14 Join Dependencies and Fifth Normal Form
2.15 Two Marks Questions with Answers

Database Management Systems 2-2 Database Design

Part I Entity Relationship Model

2.1 Introduction to Entity Relationship Model


The Entity-Relationship (E-R) model is used to identify the entities to be represented in
the database and to describe how those entities are related.
Let us first understand the database design process.

2.1.1 Design Phases


The database design process consists of the following six steps. The ER model is most
relevant to the first three steps.

Fig. 2.1.1 : Database design process

Step 1 : Requirement analysis :

• In this step, it is necessary to understand what data needs to be stored in the
database, what applications must be built, and which operations are frequently
used by the system.
• Requirement analysis is an informal process and it requires proper
communication with user groups.
• There are several methods for organizing and presenting the information gathered in
this step.
• Some automated tools can also be used for this purpose.

Step 2 : Conceptual database design :

• This is the step in which the E-R model, i.e. the Entity Relationship model, is built.
• The E-R model is a high-level data model used in database design.
• The goal of this design is to create a simple description of the data that matches
the requirements of the users.

Step 3 : Logical database design :

• This is the step in which the ER model is converted to a relational database schema,
sometimes called the logical schema in the relational data model.

Step 4 : Schema refinement :

• In this step, the relational database schema is analyzed to identify potential
problems and to refine it.
• Schema refinement can be done by normalizing and restructuring the relations.

Step 5 : Physical database design :

• In this step, the design of the database is refined further.
• The tasks performed in this step are - building indexes on tables, clustering
tables, and redesigning some parts of the schema obtained from earlier design
steps.

Step 6 : Application and security design :

• The design of the database can be carried out using design methodologies such as
UML (Unified Modeling Language).
• The role of each entity in every process must be reflected in the application task.
• For each role, access to some parts of the database must be permitted and access
to other parts must be prohibited.
• Thus access rules must be enforced on the application (which accesses
the database) to keep the data secure.

2.1.2 ER Model
The ER data model specifies an enterprise schema that represents the overall logical
structure of a database.
The E-R model is very useful in mapping the meanings and interactions of real-world
entities onto a conceptual schema.
The ER model consists of three basic concepts –

1) Entity Sets
• Entity : An entity is an object that exists and is distinguishable from other objects.
For example - a student named “Poonam” is an entity and can be identified by her
name. An entity can be concrete or abstract. A concrete entity can be a person,
a book or a bank. An abstract entity can be a holiday or a concept. An entity is
represented as a box, for example Student, Employee, Department.
• Entity set : An entity set is a set of entities of the same type. For example - all
students studying in class X of a school. Entity sets need not be disjoint. Each
entity in an entity set has the same set of attributes, and this set of attributes
distinguishes it from other entity sets. No other entity set will have exactly the same
set of attributes.

2) Relationship Sets
A relationship is an association among two or more entities.
A relationship set is a collection of similar relationships. For example - Fig. 2.1.2
shows the relationship works_for between the two entity sets Employee and
Department.

Fig. 2.1.2 : Relationship set

The association between entity sets is called participation; that is, the entity sets E1,
E2, . . . , En participate in relationship set R.
The function that an entity plays in a relationship is called that entity’s role.

3) Attributes
Attributes define the properties of an entity. For example, if Student is an
entity, then ID, name, address, date of birth and class are its attributes. The attributes
help in identifying a unique entity. Refer to Fig. 2.1.3 for the Student entity set with
attributes - ID, name, address. Note that an entity is shown by a rectangular box and
attributes are shown in ovals. The primary key is underlined.

Types of Attributes

Fig. 2.1.3 : Student entity set with attributes

1) Simple and Composite Attributes :

i) Simple attributes : Attributes that are drawn from atomic value domains.
For example - Name = {Parth} ; Age = {23}
ii) Composite attributes : Attributes that consist of a hierarchy of attributes.
For example - Address may consist of “Number”, “Street” and “Suburb”
→ Address = {59 + ‘JM Road’ + ‘ShivajiNagar’}

2) Single valued and multivalued :

• Single valued attributes : Attributes that can be represented using a single value. For
example - the StudentID attribute of a Student holds exactly one student ID.
• Multivalued attributes : Attributes that have a set of values for each entity. A
multivalued attribute is represented by concentric ovals.
For example - Degrees of a person : ‘BSc’, ‘MTech’, ‘PhD’

3) Derived attribute :
Derived attributes are attributes whose values are calculated from other
attributes. A derived attribute is represented by a dotted ellipse inside a solid ellipse. For
example - Age can be derived from the attribute DateOfBirth. In this situation, DateOfBirth
is called a stored attribute.

Fig. 2.1.4
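
The difference between a stored and a derived attribute can be sketched in code. The following is only an illustrative Python sketch: the class Student, its field names and the helper age_on are assumptions made for this example and do not come from the notes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Student:
    # Stored attributes: these values are kept as they are.
    student_id: int
    name: str
    date_of_birth: date

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one year if the birthday has not occurred yet in that year.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


# Hypothetical sample entity.
parth = Student(1, "Parth", date(2000, 6, 15))
age = parth.age_on(date(2023, 6, 14))  # one day before the 23rd birthday
```

Because the age is recomputed from DateOfBirth each time it is needed, it cannot become out of date, which is the usual reason for not storing a derived attribute.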

2.2 Mapping Cardinality


Mapping cardinality represents the number of entities to which another entity can be
associated via a relationship set.
Mapping cardinalities are most useful in describing binary relationship sets.
For a binary relationship set between entity sets A and B, the mapping cardinality is
one of the following -
1) One to One : An entity in A is associated with at most one entity in B, and an entity
in B is associated with at most one entity in A.

2) One to Many : An entity in A is associated with any number of entities in B. An
entity in B, however, can be associated with at most one entity in A.

3) Many to One : An entity in A is associated with at most one entity in B. An entity in
B, however, can be associated with any number of entities in A.

4) Many to many : An entity in A is associated with any number (zero or more) of
entities in B, and an entity in B is associated with any number (zero or more) of
entities in A.
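
The four cases can be made concrete with a small sketch. The function below is an illustration only (its name and the sample data are hypothetical): it inspects the (a, b) pairs actually present in a binary relationship set and reports which of the four cardinalities they fit. It describes the data it is given, not a constraint declared in a schema.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs."""
    pairs = set(pairs)
    b_per_a = Counter(a for a, _ in pairs)  # number of B entities linked to each A entity
    a_per_b = Counter(b for _, b in pairs)  # number of A entities linked to each B entity
    a_has_many = any(n > 1 for n in b_per_a.values())
    b_has_many = any(n > 1 for n in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical relationship sets.
manages = [("m1", "p1"), ("m2", "p2")]
places = [("c1", "o1"), ("c1", "o2"), ("c2", "o3")]
teaches = [("t1", "s1"), ("t1", "s2"), ("t2", "s1")]

kinds = [mapping_cardinality(r) for r in (manages, places, teaches)]
```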

2.3 ER Diagrams
An E-R diagram expresses the overall logical structure of a database graphically. E-R
diagrams are used to model real-world objects like a person, a car or a company, and the
relationships between these real-world objects.

Features of ER model
i) E-R diagrams are used to represent the E-R model of a database, which makes them
easy to convert into relations (tables).
ii) E-R diagrams model real-world objects directly, which makes them very useful.
iii) E-R diagrams require no technical knowledge and no hardware support.
iv) These diagrams are easy to understand and easy to create, even for a naive user.
v) They give a standard way of visualizing the data logically.

Various components used in the ER model are -

Entity : Any real-world object about which data can be stored in a database can be
represented as an entity. Real-world objects like a book, an organization, a product,
a car or a person are examples of entities.

Relationship : A rhombus is used to set up relationships between two or more
entities.

Attribute : Each entity has a set of properties. These properties are termed
attributes. For example, a car entity would be described by attributes such as price,
registration number, model number, color etc.

Derived attribute : Derived attributes are those which are derived from other
attributes; for example, age can be derived from date of birth. To represent a derived
attribute, a dotted ellipse is drawn inside the main ellipse.

Multivalued attribute : An attribute that can hold multiple values is known as a
multivalued attribute. It is represented with double ellipses in an E-R diagram. E.g.
a person can have more than one phone number, so the phone number attribute is
multivalued.

Total participation : Each entity is involved in the relationship. Total
participation is represented by double lines.

2.3.1 Mapping Cardinality Representation using ER Diagram


There are four types of relationships that are considered for key constraints.
i) One to one : When an entity in A is associated with at most one entity in B, and
vice versa, the two share a one to one relation. For example - one project manager
manages only one project.

ii) One to many : When an entity in A is associated with more than one entity in B,
there is a one to many relation. For example - one customer places many orders.

iii) Many to one : When more than one entity in A is associated with only one entity
in B, there is a many to one relation. For example - many students take one
ComputerSciCourse.

Alternate representation can be

iv) Many to many : When more than one entity in A is associated with more than one
entity in B. For example - many teachers can teach many students.

Alternate representation can be

2.3.2 Ternary Relationship


A relationship in which three entity sets are involved is called a ternary relationship. For
example -

2.3.3 Binary and Ternary Relationships

• Although binary relationships seem natural to most of us, in reality it is
sometimes necessary to connect three or more entities. If a relationship connects
three entities, it is called ternary or "3-ary."
• Ternary relationships are required when binary relationships are not sufficient to
accurately describe the semantics of an association among three entities.
• For example - Suppose you have a database for a company that contains the
entities PRODUCT, SUPPLIER and CUSTOMER. The usual relationship might
be PRODUCT/SUPPLIER, where the company buys products from a supplier - a
normal binary relationship. The intersection attribute for PRODUCT/SUPPLIER is
wholesale_price.

Fig. 2.3.1 : A binary relationship of PRODUCT and SUPPLIER and an intersection attribute, wholesale_price

• Now consider the CUSTOMER entity, and that the customer buys products. If all
customers pay the same price for a product, regardless of supplier, then you have
a simple binary relationship between CUSTOMER and PRODUCT. For the
CUSTOMER/PRODUCT relationship, the intersection attribute is retail_price.

Fig. 2.3.2 : A binary relationship of PRODUCT and CUSTOMER and an intersection attribute, retail_price

• Single ternary relation : Now consider a different scenario. Suppose the customer
buys products but the price depends not only on the product, but also on the
supplier. Suppose you need a customerID, a productID and a supplierID to
identify a price. Now you have an attribute that depends on three things, and
hence you have a relationship among three entities (a ternary relationship) that
has the intersection attribute price.

Fig. 2.3.3 : Ternary relation
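
The ternary case can be sketched as a single table whose key combines all three identifiers. This is a minimal illustration using Python's built-in sqlite3 module; the table name Price and the sample rows are assumptions made for the example, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Price (
        customerID INTEGER,
        productID  INTEGER,
        supplierID INTEGER,
        price      REAL,   -- intersection attribute of the ternary relationship
        PRIMARY KEY (customerID, productID, supplierID)
    )
    """
)
# Same customer and product, two suppliers, two different prices (hypothetical data).
conn.executemany(
    "INSERT INTO Price VALUES (?, ?, ?, ?)",
    [(1, 10, 100, 25.0), (1, 10, 200, 27.5)],
)
rows = conn.execute(
    "SELECT supplierID, price FROM Price "
    "WHERE customerID = 1 AND productID = 10 ORDER BY supplierID"
).fetchall()

# A second price for the same (customer, product, supplier) triple is rejected.
try:
    conn.execute("INSERT INTO Price VALUES (1, 10, 100, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

A pair of binary tables (customer/product and product/supplier) could not record these two rows without losing which price belongs to which supplier, which is why the three-way key is needed here.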

2.3.4 Weak Entity Set

• A weak entity is an entity that cannot be uniquely identified by its own attributes
alone. An entity set which does not have sufficient attributes to form a primary
key is called a weak entity set.

Fig. 2.3.4 : Weak entity set

• Strong entity set : An entity set that has a primary key is called a strong entity set.

Weak entity rules

• A weak entity set has one or more many-one relationships to other (supporting)
entity sets.
• The key for a weak entity set consists of its own underlined attributes and the keys
of the supporting entity sets. For example - player-number and team-name together
form a key for Players.

Difference between Strong and Weak Entity Set

Sr. No. | Strong entity set | Weak entity set
1. | It has its own primary key. | It does not have sufficient attributes to form a primary key on its own.
2. | It is represented by a rectangle. | It is represented by a double rectangle.
3. | It has a primary key, which is underlined. | It has a partial key or discriminator, which is shown with a dashed underline.
4. | A member of a strong entity set is called a dominant entity. | A member of a weak entity set is called a subordinate entity.
5. | The relationship between two strong entity sets is represented by a diamond symbol. | The relationship between a strong entity set and a weak entity set is represented by a double diamond symbol.
6. | The primary key is one of the attributes which uniquely identifies its members. | The primary key of a weak entity set is a combination of the partial key and the primary key of the strong entity set.

2.4 Enhanced ER Model

2.4.1 Specialization and Generalization

• Some entities have relationships that form hierarchies. For instance, an Employee can
be an hourly employee or a contracted employee.
• In these relationship hierarchies, some entities act as superclasses and others
act as subclasses.
• Superclass : An entity type that represents a general concept at a high level is
called a superclass.
• Subclass : An entity type that represents a specific concept at a lower level is
called a subclass.
• The subclass is said to inherit from the superclass. When a subclass inherits from one
or more superclasses, it inherits all their attributes. In addition to the inherited
attributes, a subclass can also define its own specific attributes.
• The process of making subclasses from a general concept is called specialization.
This is a top-down process. In this process, sub-groups are identified within an
entity set which have attributes that are not shared by all entities.
• The process of making a superclass from subclasses is called generalization. This is
a bottom-up process. In this process, multiple entity sets are synthesized into a
higher-level entity set.
• The symbol used for specialization/generalization is

• For example - There can be two subclass entities, namely Hourly_Emps and
Contract_Emps, which are subclasses of the Employees class. We might have attributes
hours_worked and hourly_wage defined for Hourly_Emps and an attribute
contractid defined for Contract_Emps.
Therefore, the attributes defined for an Hourly_Emps entity are the attributes of
Employees plus those of Hourly_Emps. We say that the attributes of the entity set
Employees are inherited by the entity set Hourly_Emps and that Hourly_Emps
ISA (read "is a") Employees. This can be represented by the following Fig. 2.4.1.

Fig. 2.4.1
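
Attribute inheritance in an ISA hierarchy behaves much like class inheritance in a programming language. The sketch below uses Python dataclasses as an analogy only; the superclass attributes emp_id and name are assumed for illustration, while hours_worked, hourly_wage and the contract identifier follow the example above.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    emp_id: int   # assumed superclass attribute
    name: str     # assumed superclass attribute


@dataclass
class HourlyEmp(Employee):       # HourlyEmp ISA Employee
    hours_worked: float
    hourly_wage: float


@dataclass
class ContractEmp(Employee):     # ContractEmp ISA Employee
    contract_id: int


# The subclass carries the inherited attributes plus its own.
hourly_attrs = [f.name for f in fields(HourlyEmp)]
worker = HourlyEmp(1, "Poonam", 40.0, 250.0)
```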

2.4.2 Constraints on Specialization/Generalization

There are four types of constraints on a specialization/generalization relationship. These
are -
1) Membership constraints : These constraints determine which entities can be
members of a given lower-level entity set. There are two types of
membership constraints -
i) Condition defined : In condition-defined lower-level entity sets, membership
is evaluated on the basis of whether or not an entity satisfies an explicit
condition or predicate. For example - Consider the higher-level entity set
Employee that has the attribute Employee_type. All Employee entities are
evaluated on the Employee_type attribute. All entities that satisfy the
condition Employee_type = “ContractEmployee” are included in Contracted
Employee. Since all the lower-level entities are evaluated on the basis of the
same attribute, this type of generalization is said to be attribute-defined.
ii) User defined : In user-defined lower-level entity sets, the membership is
manually defined.
2) Disjoint constraints : The disjoint constraint only applies when a superclass has
more than one subclass. If the subclasses are disjoint, then an entity occurrence can
be a member of only one of the subclasses. For example - a Student entity is either a
Postgraduate_Student or an Undergraduate_Student.

3) Overlapping : An entity can be a member of more than one subclass. For
example - a Person can be both a Student and a Staff member. The keyword And can be
used to represent this constraint.

4) Completeness : It specifies whether or not an entity in the higher-level entity set
must belong to at least one of the lower-level entity sets within the
generalization/specialization. This constraint may be one of the following -
i) Total generalization or specialization : Each higher-level entity must belong
to a lower-level entity set. For example - an Account in the bank must be either a
Savings account or a Current account. The keyword Mandatory can be used to
represent this constraint.

ii) Partial generalization or specialization : Some higher-level entities may not
belong to any lower-level entity set.

2.4.3 Aggregation
Aggregation is a feature of the entity relationship model that allows a relationship set to
participate in another relationship set. This is indicated on an ER diagram by drawing a
dashed box around the aggregation.
For example - We treat the relationship set work and the entity sets employee and
project as a higher-level entity set called work.

Fig. 2.4.2 : ER model with aggregation

2.5 Examples based on ER Diagram


Example 2.5.1 Draw the ER diagram for banking systems (home loan applications).
AU : Dec.-17, Marks 8
OR Draw an ER diagram corresponding to customers and loans. AU : May.-14, Marks 8

OR Write short notes on : E-R diagram for banking system . AU : Dec.-14, Marks 8

Solution :

Example 2.5.2 Consider the relation schema given below. Design and draw an ER
diagram that captures the information of this schema. AU : May-17, Marks 5

Employee(empno,name,office,age)
Books(isbn,title,authors,publisher)
Loan(empno,isbn,date)

Solution :

Example 2.5.3 Construct an E-R diagram for a car insurance company whose customers own
one or more cars each. Each car has associated with it zero to any number of recorded
accidents. Each insurance policy covers one or more cars and has one or more premium
payments associated with it. Each payment is for a particular period of time and has an
associated due date and date when the payment was received. AU : Dec.-16, Marks 7

Solution :

Example 2.5.4 A car rental company maintains a database for all vehicles in its current fleet.
For all vehicles, it includes the vehicle identification number, license number, manufacturer,
model, date of purchase and color. Special data are included for certain types of vehicles.
Trucks : cargo capacity
Sports cars : horsepower, renter age requirement
Vans : number of passengers
Off-road vehicles : ground clearance, drivetrain (four- or two-wheel drive)
Construct an ER model for the car rental company database. AU : Dec.-15, Marks 16

Solution :

Example 2.5.5 Draw E-R diagram for the "Restaurant Menu Ordering System", which will
facilitate the food items ordering and services within a restaurant. The entire restaurant
scenario is detailed as follows. The customer is able to view the food items menu, call the
waiter, place orders and obtain the final bill through the computer kept at their table. The
waiters through their wireless tablet PC are able to initialize a table for customers, control
the table functions to assist customers, orders, send orders to food preparation staff (chef)
and finalize the customer's bill. The food preparation staff (chefs), with their touch-display
interfaces to the system, are able to view orders sent to the kitchen by waiters. During
preparation they are able to let the waiter know the status of each item, and can send
notifications when items are completed. The system should have full accountability and
logging facilities, and should support supervisor actions to account for exceptional
circumstances, such as a meal being refunded or walked out on. AU : May-15, Marks 16

Solution :

Example 2.5.6 A university registrar’s office maintains data about the following entities :
(1) courses, including number, title, credits, syllabus, and prerequisites;
(2) course offerings, including course number, year, semester, section number,
instructor(s), timings, and classroom;
(3) students, including student-id, name, and program; and
(4) instructors, including identification number, name, department, and title.
Further, the enrollment of students in courses and grades awarded to students in each
course they are enrolled for must be appropriately modeled. Construct an E-R diagram for
the registrar’s office. Document all assumptions that you make about the mapping
constraints.
AU : Dec.-13, Marks 10

Solution :

Example 2.5.7 What is aggregation in ER model ? Develop an ER diagram using
aggregation that captures following information : Employees work for projects. An
employee working for particular project uses various machinery. Assume necessary
attributes. State any assumptions you make. Also discuss about the ER diagram you have
designed. AU : Dec.-11, Marks 8

Solution : Aggregation : Refer section 2.4.3.

ER Diagram : The ER diagram for the above scenario can be drawn as follows -

The above ER model contains redundant information, because every Employee,
Project, Machinery combination in the works_on relationship is also considered in the
manages relationship. To avoid this redundancy problem we can make use of
aggregation in the ER diagram as follows -

We can then create a binary relationship manages between Manager and
(Employee, Project, Machinery).
Example 2.5.8 Construct an E-R diagram for a hospital with a set of patients and a set of
medical doctors. Associate with each patient a log of the various tests and examinations
conducted. AU : Dec.-07, Marks 8

Solution :

2.6 ER to Relational Mapping AU : May-17, Marks 13

In this section we will discuss how to map various ER model constructs to Relational
Model construct.

2.6.1 Mapping of Entity Set to Relation

• An entity set is mapped to a relation in a straightforward way.
• Each attribute of the entity set becomes an attribute of the table.
• The primary key attribute of the entity set becomes the primary key of the table.
• For example - Consider the following ER diagram.

The converted Employee table is as follows -

EmpID   EName     Salary
201     Poonam    30000
202     Ashwini   35000
203     Sharda    40000

The following SQL statement captures the information in the above ER diagram -

CREATE TABLE Employee (
    EmpID  CHAR(11),
    EName  CHAR(30),
    Salary INTEGER,
    PRIMARY KEY (EmpID)
);
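
As a quick check, the statement above can be run against an in-memory SQLite database through Python's sqlite3 module. This is only a sketch: SQLite is an assumption made for the example (the notes do not name a DBMS), and the rows are the sample rows from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employee (
        EmpID  CHAR(11),
        EName  CHAR(30),
        Salary INTEGER,
        PRIMARY KEY (EmpID)
    )
    """
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [("201", "Poonam", 30000), ("202", "Ashwini", 35000), ("203", "Sharda", 40000)],
)

# The primary key rejects a second row with an existing EmpID.
try:
    conn.execute("INSERT INTO Employee VALUES ('201', 'Duplicate', 0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```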

2.6.2 Mapping Relationship Sets (without Constraints) to Tables

• Create a table for the relationship set.
• Add all primary keys of the participating entity sets as fields of the table.
• Add a field for each attribute of the relationship.
• Declare a primary key using all key fields from the entity sets.
• Declare foreign key constraints for all these fields from the entity sets.
For example - Consider the following ER model

The following SQL statement captures the information for the relationship present in
the above ER diagram -

-- Only the keys of Employee and Department are stored here; their other
-- attributes (EName, Salary, DeptName, Building) stay in the entity tables.
CREATE TABLE Works_In (
    EmpID  CHAR(11),
    DeptID CHAR(11),
    PRIMARY KEY (EmpID, DeptID),
    FOREIGN KEY (EmpID) REFERENCES Employee,
    FOREIGN KEY (DeptID) REFERENCES Department
);
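
The steps above can be sketched end to end with Python's sqlite3 module. The column Since is a hypothetical relationship attribute added only to illustrate the third step, and the sample rows are invented; SQLite enforces foreign keys only after the PRAGMA shown below is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Employee   (EmpID CHAR(11) PRIMARY KEY, EName CHAR(30));
    CREATE TABLE Department (DeptID CHAR(11) PRIMARY KEY, DeptName CHAR(20));
    CREATE TABLE Works_In (
        EmpID  CHAR(11),
        DeptID CHAR(11),
        Since  DATE,                 -- hypothetical relationship attribute
        PRIMARY KEY (EmpID, DeptID),
        FOREIGN KEY (EmpID)  REFERENCES Employee (EmpID),
        FOREIGN KEY (DeptID) REFERENCES Department (DeptID)
    );
    INSERT INTO Employee   VALUES ('201', 'Poonam');
    INSERT INTO Department VALUES ('D1', 'Sales'), ('D2', 'Accounts');
    INSERT INTO Works_In   VALUES ('201', 'D1', '2020-01-01'), ('201', 'D2', '2021-06-01');
    """
)

# A Works_In row that refers to a non-existent employee violates the foreign key.
try:
    conn.execute("INSERT INTO Works_In VALUES ('999', 'D1', '2022-01-01')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

depts_of_201 = [row[0] for row in conn.execute(
    "SELECT DeptID FROM Works_In WHERE EmpID = '201' ORDER BY DeptID")]
```

Because the primary key is the pair (EmpID, DeptID), the same employee may appear with several departments, which is what "without constraints" means here.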

2.6.3 Mapping Relationship Sets (with Constraints) to Tables

• If a relationship set involves n entity sets and some m of them are linked via
arrows in the ER diagram, the key for any one of these m entity sets constitutes a
key for the relation to which the relationship set is mapped.
• Hence we have m candidate keys, and one of these should be designated as the
primary key.
• There are two approaches used to convert a relationship set with key constraints
into a table.
• Approach 1 :

o In this approach, the relationship set is represented by a separate table of its
own. For example - Consider the following ER diagram. Each department has
at most one manager, according to the key constraint on Manages.

Here the constraint is that each department has at most one manager to manage it.
Hence no two tuples can have the same DeptID, and there can be a separate table
named Manages with DeptID as primary key. The table can be defined using the
following SQL statement -

CREATE TABLE Manages (
    EmpID  CHAR(11),
    DeptID INTEGER,
    Since  DATE,
    PRIMARY KEY (DeptID),
    FOREIGN KEY (EmpID) REFERENCES Employees,
    FOREIGN KEY (DeptID) REFERENCES Departments
);

• Approach 2 :

o This approach is generally preferred for translating a relationship set with key
constraints.
o It is a superior approach because it avoids creating a distinct table for the
relationship set.
o The idea is to include the information about the relationship set in the
table corresponding to the entity set with the key, taking advantage of the
key constraint.
o This approach eliminates the need for a separate Manages relation, and
queries asking for a department's manager can be answered without
combining information from two relations.
o The only drawback to this approach is that space could be wasted if
several departments have no managers.
o The following SQL statement, defining a Dep_Mgr relation that captures
the information in both Departments and Manages, illustrates the second
approach to translating relationship sets with key constraints :

CREATE TABLE Dep_Mgr (
    DeptID INTEGER,
    DName  CHAR(20),
    Budget REAL,
    EmpID  CHAR(11),
    since  DATE,
    PRIMARY KEY (DeptID),
    FOREIGN KEY (EmpID) REFERENCES Employees
);
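
Both approaches can be tried out with Python's sqlite3 module. In this sketch the foreign key clauses are left out so that the example stays self-contained, and all inserted rows are hypothetical; only the effect of choosing DeptID as the primary key is demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Approach 1: a separate table for the relationship set.
    CREATE TABLE Manages (
        EmpID  CHAR(11),
        DeptID INTEGER,
        Since  DATE,
        PRIMARY KEY (DeptID)
    );
    -- Approach 2: the relationship is folded into the department table.
    CREATE TABLE Dep_Mgr (
        DeptID INTEGER,
        DName  CHAR(20),
        Budget REAL,
        EmpID  CHAR(11),   -- NULL when the department has no manager
        Since  DATE,
        PRIMARY KEY (DeptID)
    );
    INSERT INTO Manages VALUES ('201', 1, '2020-01-01');
    INSERT INTO Dep_Mgr VALUES (1, 'Sales', 50000.0, '201', '2020-01-01');
    INSERT INTO Dep_Mgr VALUES (2, 'Accounts', 30000.0, NULL, NULL);
    """
)

# Key constraint: a second manager for department 1 is rejected.
try:
    conn.execute("INSERT INTO Manages VALUES ('202', 1, '2021-01-01')")
    second_manager_rejected = False
except sqlite3.IntegrityError:
    second_manager_rejected = True

# Approach 2 answers "who manages department 1?" from a single table.
manager_of_sales = conn.execute(
    "SELECT EmpID FROM Dep_Mgr WHERE DeptID = 1").fetchone()[0]
# The drawback: departments without a manager carry NULL columns.
unmanaged = conn.execute(
    "SELECT COUNT(*) FROM Dep_Mgr WHERE EmpID IS NULL").fetchone()[0]
```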

2.6.4 Mapping Weak Entity Sets to Relational Mapping

A weak entity can be identified uniquely only by considering the primary key of
another (owner) entity. The following steps are used for mapping a weak entity set to
a relation -
• Create a table for the weak entity set.
• Make each attribute of the weak entity set a field of the table.
• Add fields for the primary key attributes of the identifying owner.
• Declare a foreign key constraint on these identifying owner fields.
• Instruct the system to automatically delete any tuples in the table for which there
are no owners.
For example - Consider the following ER model

The following SQL statement illustrates this mapping -

CREATE TABLE Department (
    DeptID   CHAR(11),
    DeptName CHAR(20),
    Bldg_No  CHAR(5),
    PRIMARY KEY (DeptID, Bldg_No),
    FOREIGN KEY (Bldg_No) REFERENCES Buildings ON DELETE CASCADE
);
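
The effect of ON DELETE CASCADE can be observed with a short sqlite3 sketch. The Buildings table definition and the sample rows are assumptions made for this example; SQLite applies the cascade only when foreign key enforcement is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply the cascade
conn.executescript(
    """
    CREATE TABLE Buildings (Bldg_No CHAR(5) PRIMARY KEY);   -- hypothetical owner table
    CREATE TABLE Department (
        DeptID   CHAR(11),
        DeptName CHAR(20),
        Bldg_No  CHAR(5),
        PRIMARY KEY (DeptID, Bldg_No),
        FOREIGN KEY (Bldg_No) REFERENCES Buildings (Bldg_No) ON DELETE CASCADE
    );
    INSERT INTO Buildings  VALUES ('B1'), ('B2');
    INSERT INTO Department VALUES ('D1', 'Sales', 'B1'), ('D1', 'Stores', 'B2');
    """
)

# Deleting the owner removes the weak entities that depend on it.
conn.execute("DELETE FROM Buildings WHERE Bldg_No = 'B1'")
remaining = conn.execute("SELECT DeptID, Bldg_No FROM Department").fetchall()
```

Note that the same DeptID 'D1' appears in two buildings: DeptID alone is only a partial key, and a row is identified by the pair (DeptID, Bldg_No).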

2.6.5 Mapping of Specialization / Generalization (EER Construct) to Relational Mapping

The specialization/generalization relationship (Enhanced ER construct) can be
mapped to database tables (relations) using three methods. To demonstrate the methods,
we will take the entity sets InventoryItem, Book and DVD, where Book and DVD are
subclasses of InventoryItem.

Method 1 : All the entities in the relationship are mapped to individual tables

InventoryItem(ID , name)
Book(ID,Publisher)
DVD(ID, Manufacturer)

Method 2 : Only subclasses are mapped to tables. The attributes in the superclass
are duplicated in all subclasses. For example -

Book(ID,name,Publisher)
DVD(ID, name,Manufacturer)

Method 3 : Only the superclass is mapped to a table. The attributes in the subclasses
are taken to the superclass. For example -

InventoryItem(ID , name,Publisher,Manufacturer)

This method will introduce null values. When we insert a Book record in the table, the
Manufacturer column value will be null. In the same way, when we insert a DVD record
in the table, the Publisher value will be null.
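
Methods 1 and 3 can be compared with a small sqlite3 sketch. The table name InventoryItemAll, the column types and the sample rows are assumptions made for illustration; only the table layouts follow the schemas listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Method 1: one table per entity set; subclasses share the superclass key.
    CREATE TABLE InventoryItem (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Book (ID INTEGER PRIMARY KEY REFERENCES InventoryItem (ID), Publisher TEXT);
    CREATE TABLE DVD  (ID INTEGER PRIMARY KEY REFERENCES InventoryItem (ID), Manufacturer TEXT);
    INSERT INTO InventoryItem VALUES (1, 'DBMS Notes'), (2, 'SQL Tutorial');
    INSERT INTO Book VALUES (1, 'Example Press');
    INSERT INTO DVD  VALUES (2, 'Example Media');

    -- Method 3: a single superclass table; columns that do not apply stay NULL.
    CREATE TABLE InventoryItemAll (
        ID INTEGER PRIMARY KEY, name TEXT, Publisher TEXT, Manufacturer TEXT
    );
    INSERT INTO InventoryItemAll (ID, name, Publisher)    VALUES (1, 'DBMS Notes', 'Example Press');
    INSERT INTO InventoryItemAll (ID, name, Manufacturer) VALUES (2, 'SQL Tutorial', 'Example Media');
    """
)

# Method 1 needs a join to see a book together with its inherited attributes.
book_row = conn.execute(
    "SELECT i.ID, i.name, b.Publisher "
    "FROM InventoryItem i JOIN Book b ON b.ID = i.ID"
).fetchone()

# Method 3 needs no join, but the Book row has a NULL Manufacturer.
single_table_book = conn.execute(
    "SELECT Publisher, Manufacturer FROM InventoryItemAll WHERE ID = 1"
).fetchone()
```

The trade-off is the usual one: Method 1 avoids null values at the cost of joins, while Method 3 avoids joins at the cost of null values.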

Example 2.6.1 Construct an E-R diagram for a hospital with a set of patients and a set of
medical doctors. Associate with each patient a log of the various tests and examinations
conducted. Also construct appropriate tables for the ER diagram you have drawn.
Solution :
ER Diagram - Refer example 2.5.8.
Relational Mapping

patients (P_id, name, insurance, date-admitted, date-checked-out)
doctors (Dr_id, name, specialization)
test (testid, testname, date, time, result)
doctor-patient (P_id, Dr_id)
test-log (testid, P_id)
performed-by (testid, Dr_id)

University Question
1. Discuss the correspondence between the ER model construct and the relational model constructs.
Show how each ER model construct can be mapped to the relational model. Discuss the option for
mapping EER construct. AU : May-17, Marks 13

Part II Relational Database Design

2.7 Concept of Relational Database Design


• There are two primary goals of relational database design – i) to generate a set of
relation schemas that allows us to store information without unnecessary
redundancy, and ii) to allow us to retrieve information easily.
• For achieving these goals, the database design needs to be normalized. That means
we have to check whether the schema is in normal form or not.
• For checking the normal form of the schema, it is necessary to check the functional
dependencies and other data dependencies that exist within the schema.
Hence, before discussing what normalization means, it is necessary to
understand the concept of functional dependencies.
2.8 Functional Dependencies
Definition : Let P and Q be sets of columns. Then P functionally determines Q,
written P -> Q, if and only if any two rows that are equal on (all the attributes in) P must
be equal on (all the attributes in) Q.
In other words, the functional dependency holds if, for any two tuples T1 and T2,
T1.P = T2.P implies T1.Q = T2.Q
where the notation T1.P denotes the projection of the tuple T1 onto the attributes in P.


For example : Consider a relation in which the roll number (R) of the student and his/her name (N) are
stored as follows :

R N
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE

Fig. 2.8.1 : Table which holds the functional dependency R->N

Here, R->N is true, i.e. the functional dependency holds, because every
RollNumber is associated with exactly one name. For instance, the
name of the student whose RollNo is 1 is AAA. But if we get two different names for the
same roll number, then the table does not hold the functional dependency.
Following is such a table -
R N
1 AAA
2 BBB
3 CCC
1 XXX
2 YYY

Fig. 2.8.2 : Table which does not hold the functional dependency

In the above table, for RollNumber 1 we get two different names - “AAA” and
“XXX”. Hence the functional dependency R->N does not hold here.
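The check described above can be sketched in a few lines of Python. This is only an illustration: the function name fd_holds and the representation of a table as a list of dictionaries are assumptions made here, not part of the notes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts (one dict per tuple); lhs and rhs are lists
    of column names. The dependency fails as soon as two rows agree on
    lhs but differ on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val for a new key, else returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True


# Data of Fig. 2.8.1 and Fig. 2.8.2
fig_2_8_1 = [{"R": 1, "N": "AAA"}, {"R": 2, "N": "BBB"}, {"R": 3, "N": "CCC"},
             {"R": 4, "N": "DDD"}, {"R": 5, "N": "EEE"}]
fig_2_8_2 = [{"R": 1, "N": "AAA"}, {"R": 2, "N": "BBB"}, {"R": 3, "N": "CCC"},
             {"R": 1, "N": "XXX"}, {"R": 2, "N": "YYY"}]

print(fd_holds(fig_2_8_1, ["R"], ["N"]))
print(fd_holds(fig_2_8_2, ["R"], ["N"]))
```

Note that such a test only shows whether one particular instance violates the dependency; a functional dependency is a constraint on all legal instances of the schema.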

2.8.1 Computing Closure Set of Functional Dependency


The closure set is the set of all functional dependencies implied by a given set F. It is
denoted by F+.
The closure set of functional dependencies can be computed using three basic rules,
which are also called Armstrong’s Axioms.
These are as follows -
i) Reflexivity : If Y ⊆ X, then X -> Y
ii) Augmentation : If X -> Y, then XZ -> YZ for any Z
iii) Transitivity : If X -> Y and Y -> Z, then X -> Z
In addition to the above axioms, some additional rules for computing the closure set of
functional dependencies are as follows -
• Union : If X -> Y and X -> Z, then X -> YZ
• Decomposition : If X -> YZ, then X -> Y and X -> Z
Example 2.8.1 Compute the closure of the following set of functional dependencies for a
relation scheme R(A,B,C,D,E), F={A->BC, CD->E, B->D, E->A}
Solution : Consider F as follows
A->BC
CD->E
B->D
E->A
The closure can be written for each attribute of the relation as follows
• (A)+ = Step 1 : {A} -> the attribute itself
Step 2 : {ABC} as A->BC
Step 3 : {ABCD} as B->D
Step 4 : {ABCDE} as CD->E
Step 5 : {ABCDE} as E->A and A is already present
Hence (A)+ ={ABCDE}
• (B)+ = Step 1 : {B}
Step 2 : {BD} as B->D
Step 3 : {BD} as no other FD in F has its LHS contained in {BD}
Hence (B)+ ={BD}
• (C)+ = Step 1 : {C}
Step 2 : {C} as there is no FD in F with only C on its LHS
Hence (C)+ ={C}
• (D)+ = Step 1 : {D}
Step 2 : {D} as there is no FD in F with only D on its LHS
Hence (D)+ ={D}
• (E)+ = Step 1 : {E}
Step 2 : {EA} as E->A
Step 3 : {EABC} as A->BC
Step 4 : {EABCD} as B->D
Step 5 : {EABCD} as CD->E and E is already present
By rearranging we get {ABCDE}
Hence (E)+ ={ABCDE}
• (CD)+ = Step 1 : {CD}
Step 2 : {CDE} as CD->E
Step 3 : {CDEA} as E->A
Step 4 : {CDEAB} as A->BC
By rearranging we get {ABCDE}
Hence (CD)+ ={ABCDE}
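The step-by-step procedure used in this example is the usual attribute closure algorithm. A minimal Python sketch is given below; the function name closure and the encoding of each FD as a pair of strings (LHS, RHS) are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the FDs in fds.

    fds is a list of (lhs, rhs) pairs, e.g. ("CD", "E") for CD->E.
    An FD is applied whenever its whole LHS is already in the result;
    the loop stops when a full pass adds nothing new.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# F from Example 2.8.1
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

for x in ["A", "B", "C", "D", "E", "CD"]:
    print(x, "".join(sorted(closure(x, F))))
```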
Example 2.8.2 Compute the closure of the following set of functional dependencies for a
relation scheme R(A,B,C,D,E), F={A->BC, CD->E, B->D, E->A} and find the candidate
keys.
Solution : For finding the closure of functional dependencies - Refer example 2.8.1.
We can identify the candidate keys of the given relation schema with the help of the functional
dependencies. For that purpose we compute the closure of attribute sets and look for
the minimal sets whose closure is the complete relation R(A,B,C,D,E).
Let, (A)+ = {ABCDE}
(B)+ = {BD}
(C)+ = {C}
(D)+ = {D}
(E)+ = {ABCDE}
(CD)+ = {ABCDE}
(BC)+ = {ABCDE} (B->D gives D, CD->E gives E and E->A gives A)
Clearly (A)+, (E)+, (BC)+ and (CD)+ give us {ABCDE} i.e. the complete relation R, and no
proper subset of BC or CD does so. Hence A, E, BC and CD
are the candidate keys.
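The search for candidate keys can be automated by trying attribute sets in increasing size and keeping those whose closure is the whole relation. The sketch below is a brute-force illustration that is suitable only for small schemas; the names closure and candidate_keys are assumptions introduced here.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return all candidate keys of relation as sorted attribute strings.

    Subsets are tried from smallest to largest, so any subset that
    contains an already found key is a superkey but not minimal.
    """
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            attrs = set(combo)
            if any(set(k) <= attrs for k in keys):
                continue  # contains a smaller key, hence not minimal
            if closure(attrs, fds) == set(relation):
                keys.append("".join(combo))
    return keys


F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
print(candidate_keys("ABCDE", F))
```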

2.8.2 Canonical Cover or Minimal Cover


Formal Definition : A minimal cover for a set F of FDs is a set G of FDs such that :
1) Every dependency in G is of the form X->A, where A is a single attribute.
2) The closure F+ is equal to the closure G+.
3) If we obtain a set H of dependencies from G by deleting one or more dependencies
or by deleting attributes from a dependency in G, then F+ ≠ H+.

Concept of Extraneous Attributes


Definition : An attribute of a functional dependency is said to be extraneous if we can
remove it without changing the closure of the set of functional dependencies. The formal
definition of extraneous attributes is as follows:
Consider a set F of functional dependencies and the functional dependency α -> β in F.
• Attribute A is extraneous in α if A ∈ α, and F logically implies
(F – {α -> β}) ∪ {(α – A) -> β}
• Attribute A is extraneous in β if A ∈ β, and the set of functional dependencies
(F – {α -> β}) ∪ {α -> (β – A)} logically implies F.
Algorithm for computing Canonical Cover for set of functional Dependencies F
Fc = F
repeat
Use the union rule to replace any dependencies in Fc of the form
α1 -> β1 and α1 -> β2 with α1 -> β1β2
Find a functional dependency α -> β in Fc with an extraneous attribute either in α or in β.
/* The test for extraneous attributes is done using Fc, not F */
If an extraneous attribute is found, delete it from α -> β in Fc .
until (Fc does not change)
Example 2.8.3 Consider the following functional dependencies over the attribute set
R(ABCDE) for finding minimal cover FD = {A->C, AC->D, B->ADE}
Solution :
Step 1 : Split the FD such that R.H.S contain single attribute. Hence we get
A->C
AC->D
B->A

B->D
B->E
Step 2 : Find the redundant entries and delete them. This can be done as follows -

o For A->C : We find (A)+ by assuming that we delete A->C temporarily. We
get (A)+={A}. Thus after deleting A->C it is not possible to obtain C from A.
This means we can not delete A->C
o For AC->D : We find (AC)+ by assuming that we delete AC->D
temporarily. We get (AC)+={AC}. Thus by such deletion it is not possible to
obtain D. This means we can not delete AC->D
o For B->A : We find (B)+ by assuming that we delete B->A temporarily. We
get (B)+={BDE}. Thus by such deletion it is not possible to obtain A. This
means we can not delete B->A
o For B->D : We find (B)+ by assuming that we delete B->D temporarily. We
get (B)+={BEACD}. This shows clearly that even if we delete B->D we can
obtain D. This means we can delete B->D. Thus it is redundant.
o For B->E : We find (B)+ by assuming that we delete B->E temporarily. We
get (B)+={BDAC}. Thus by such deletion it is not possible to obtain E. This
means we can not delete B->E
To summarize we get now
A->C
AC->D
B->A
B->E
Thus R.H.S gets simplified.
Step 3 : Now we will simplify L.H.S.
Consider AC->D. Here we check whether A or C is extraneous on the L.H.S. For that we find the closure of A and of C.
(A)+ = {ACD}
(C)+ = {C}
Since A->C, the closure (A)+ already contains C and hence D, so D can be obtained from A alone. That means we need not
have AC on L.H.S. Instead, only A is kept and C is eliminated. Thus after
simplification we get
A->D
To summarize we get now
A->C
A->D
B->A
B->E
Thus L.H.S gets simplified.
Step 4 : The simplified FDs having the same L.H.S. can be combined together using the union rule to form

A->CD
B->AE
This is a minimal cover or Canonical cover of functional dependencies.
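The computation of a minimal cover can be sketched in Python as follows. This sketch reduces the L.H.S. before removing redundant FDs, which is the order commonly used in textbooks and differs slightly from the order of steps shown above; for this example both orders give the same answer. The names closure and minimal_cover are assumptions introduced for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover as a dict {lhs_string: rhs_string}."""
    # Step 1: split every RHS into single attributes.
    g = list(dict.fromkeys((frozenset(l), a) for l, r in fds for a in r))
    # Step 2: remove extraneous attributes from each LHS.
    for i, (lhs, a) in enumerate(g):
        for x in sorted(lhs):
            smaller = g[i][0] - {x}
            if smaller and a in closure(smaller, g):
                g[i] = (smaller, a)
    g = list(dict.fromkeys(g))
    # Step 3: remove FDs that are implied by the remaining ones.
    for fd in list(g):
        rest = [f for f in g if f != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    # Step 4: combine FDs with the same LHS using the union rule.
    merged = {}
    for lhs, a in g:
        merged.setdefault(lhs, set()).add(a)
    return {"".join(sorted(l)): "".join(sorted(r)) for l, r in merged.items()}


# FD set from Example 2.8.3
print(minimal_cover([("A", "C"), ("AC", "D"), ("B", "ADE")]))
```

A set of FDs can have more than one minimal cover; which one is produced depends on the order in which attributes and dependencies are examined.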
2.9 Concept of Redundancy and Anomalies
Definition : Redundancy is a condition created in a database in which the same piece of
data is held at two or more different places.
Redundancy is at the root of several problems associated with relational schemas.
Problems caused by redundancy : Following problems can be caused by redundancy -
i) Redundant storage : Some information is stored repeatedly.
ii) Update anomalies : If one copy of such repeated data is updated, then inconsistency
is created unless all other copies are similarly updated.
iii) Insertion anomalies : Due to insertion of a new record, repeated information gets
added to the relation.
iv) Deletion anomalies : Due to deletion of a particular record, some other important
information associated with the deleted record also gets deleted, and thus we may lose
some other important information from the relation.
Example : Following example illustrates the above discussed anomalies or redundancy
problems
Consider following Schema in which all possible information about Employee is
stored.

1) Redundant storage : Note that the information about DeptID, DeptName and
DeptLoc is repeated.
2) Update anomalies : In the above table, if we change DeptLoc from Pune to Chennai in only one row, then
it will result in inconsistency, as for DeptID 101 the DeptLoc is still Pune in the other rows. Otherwise, we
need to update multiple copies of DeptLoc from Pune to Chennai. Hence this is an
update anomaly.
3) Insertion anomalies : For the above table, if we want to add a new tuple, say
(5, EEE, 50000) for DeptID 101, then the information
(101, XYZ, Pune) will be repeated.
4) Deletion anomalies : For the above table, if we delete the record for EmpID 4, then
the information about DeptID 102, DeptName PQR and DeptLoc
Mumbai will also get deleted, and one may no longer be aware of DeptID 102. This causes a
deletion anomaly.
2.10 Decomposition AU : Dec.-17, Marks 7

• Decomposition is the process of breaking down one table into multiple tables.
• Formal definition of decomposition is -
• A decomposition of a relation schema R consists of replacing the relation schema by
two (or more) relation schemas that each contain a subset of the attributes of R and together
include all attributes of R. The instance is decomposed by storing projections of the original instance.
• For example - Consider the following
Employee_Department table -

Eid Ename Age City Salary Deptid DeptName
E001 ABC 29 Pune 20000 D001 Finance
E002 PQR 30 Pune 30000 D002 Production
E003 LMN 25 Mumbai 5000 D003 Sales
E004 XYZ 24 Mumbai 4000 D004 Marketing
E005 STU 32 Hyderabad 25000 D005 Human Resource

We can decompose the above relation schema into two relation schemas, Employee
(Eid, Ename, Age, City, Salary) and Department (Deptid, Eid, DeptName), as follows -

Employee Table
Eid Ename Age City Salary
E001 ABC 29 Pune 20000
E002 PQR 30 Pune 30000
E003 LMN 25 Mumbai 5000
E004 XYZ 24 Mumbai 4000
E005 STU 32 Hyderabad 25000

Department Table
Deptid Eid DeptName
D001 E001 Finance
D002 E002 Production
D003 E003 Sales
D004 E004 Marketing
D005 E005 Human Resource
• The decomposition is used for eliminating redundancy.
• For example : Consider the following relation Schema R in which we assume that the
grade determines the salary; this causes redundancy.

Schema R

• Hence, the above table can be decomposed into two schemas S and T as follows :

Schema S
Name eid deptname Grade
AAA 121 Accounts 2
AAA 132 Sales 3
BBB 101 Marketing 4
CCC 106 Purchase 2

Schema T
Grade Salary
2 8000
3 7000
4 7000

Problems Related to Decomposition :


Following are the potential problems to consider :
1) Some queries become more expensive.
2) Given instances of the decomposed relations, we may not be able to reconstruct the
corresponding instance of the original relation!

3) Checking some dependencies may require joining the instances of the decomposed
relations.
4) There may be loss of information during decomposition.

Properties Associated With Decomposition


There are two properties associated with decomposition and those are –
1) Loss-less Join or Non-loss Decomposition : When all information found in the
original database is preserved after decomposition, we call it a lossless or non-loss
decomposition.
2) Dependency Preservation : This is a property in which the constraints on the
original table can be maintained by simply enforcing some constraints on each of
the smaller relations.

2.10.1 Non-loss Decomposition or Loss-less Join


The lossless join can be defined using the following three conditions :
i) The union of the attributes of R1 and R2 must be equal to the attributes of R. Each attribute of R
must be either in R1 or in R2.
Att(R1) ∪ Att(R2) = Att(R)
ii) The intersection of the attributes of R1 and R2 must not be empty.
Att(R1) ∩ Att(R2) ≠ ∅
iii) The common attributes must be a key for at least one relation (R1 or R2)
Att(R1) ∩ Att(R2) -> Att(R1)
or Att(R1) ∩ Att(R2) -> Att(R2)
Example 2.10.1 Consider the relation R(A,B,C,D) with FD A->BC and the decomposition
of R into R1(A,B,C), R2(A,D). Check if the decomposition is lossless join or not.
Solution :
Step 1 : Here Att(R1) ∪ Att(R2) = Att(R) i.e. R1(A,B,C) ∪ R2(A,D) = (A,B,C,D) i.e. R.
Thus the first condition gets satisfied.
Step 2 : Here R1 ∩ R2 = {A}. Thus Att(R1) ∩ Att(R2) ≠ ∅. Here the second condition
gets satisfied.
Step 3 : Att(R1) ∩ Att(R2) = {A}. Now (A)+ = {A,B,C} = attributes of R1. Thus the
third condition gets satisfied.
This shows that the given decomposition is a lossless join.
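The three conditions can be checked mechanically for a decomposition into two relations. The Python sketch below is an illustration; the names closure and is_lossless_binary are assumptions introduced here, and the test applies only to binary decompositions.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r, r1, r2, fds):
    """Check the three lossless-join conditions for R split into R1 and R2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    common = r1 & r2
    # Conditions i) and ii): attributes are covered and the overlap is non-empty.
    if r1 | r2 != r or not common:
        return False
    # Condition iii): the common attributes determine all of R1 or all of R2.
    determined = closure(common, fds)
    return r1 <= determined or r2 <= determined


# Example 2.10.1
print(is_lossless_binary("ABCD", "ABC", "AD", [("A", "BC")]))
```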

Example 2.10.2 Consider the relation R(A,B,C,D,E,F) with FDs A->BC, C->A, D->E,
F->A, E->D and the decomposition of R into R1(A,C,D), R2(B,C,D), and R3(E,F,D).
Check whether the decomposition is lossless.
Solution :
Step 1 : R1 ∪ R2 ∪ R3 = R. Here the first condition for checking lossless join is satisfied
as (A,C,D) ∪ (B,C,D) ∪ (E,F,D) = {A,B,C,D,E,F} which is nothing but R.
Step 2 : Consider R1 ∩ R2 = {CD} and R2 ∩ R3 = {D}. Hence the second condition of the
intersection not being ∅ gets satisfied.
Step 3 : Now, consider R1(A,C,D) and R2(B,C,D). We find R1 ∩ R2 = {CD}.
(CD)+ = {ABCDE} ⊇ attributes of R1 i.e. {A,C,D}. Hence condition 3 for checking
lossless join for R1 and R2 gets satisfied.
Step 4 : Now, consider R2(B,C,D) and R3(E,F,D). We find R2 ∩ R3 = {D}.
(D)+ = {D,E}, which contains neither the complete set of attributes of R2 nor that of R3. [Note that F is
missing for it to cover R3.]
Hence it is not a lossless join decomposition. In other words, it is a
lossy decomposition.
Example 2.10.3 Suppose that we decompose schema R=(A,B,C,D,E) into (A,B,C) (C,D,E)
Show that it is not a lossless decomposition.
Solution :
Step 1 : Here we need to assume some data for the attributes A, B, C, D, and E.
Using this data we can represent the relation as follows –
Relation R
A B C D E
a 1 x p q
b 2 x r s

Relation R1 = (A,B,C)
A B C
a 1 x
b 2 x

Relation R2 = (C,D,E)
C D E
x p q
x r s

Step 2 : Now we will join these tables using natural join, i.e. the join based on the
common attribute C. We get R1 ⋈ R2 as

A B C D E
a 1 x p q
a 1 x r s
b 2 x p q
b 2 x r s

Here we get more rows or tuples than in the original relation R.
Clearly R1 ⋈ R2 ≠ R. Hence it is not a lossless decomposition.
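The same experiment can be repeated in Python by projecting R onto (A,B,C) and (C,D,E) and joining the projections again. The helper names project and natural_join are assumptions made for this sketch; rows are represented as dictionaries.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    seen = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in seen:
            seen.append(p)
    return seen


def natural_join(left, right):
    """Join two lists of dict rows on all attributes they have in common."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                out.append({**l, **r})
    return out


# Relation R of Example 2.10.3
R = [dict(A="a", B=1, C="x", D="p", E="q"),
     dict(A="b", B=2, C="x", D="r", E="s")]

R1 = project(R, "ABC")
R2 = project(R, "CDE")
joined = natural_join(R1, R2)
spurious = [t for t in joined if t not in R]

print(len(joined), "tuples after the join,", len(spurious), "of them spurious")
```

The two extra tuples are called spurious tuples. They appear because the common attribute C is not a key of either R1 or R2.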

2.10.2 Dependency Preservation


• Definition : A decomposition D = {R1, R2, R3, …, Rn} of R is dependency
preserving for a set F of functional dependencies if (F1 ∪ F2 ∪ … ∪ Fn)+ = F+, where Fi is the
set of dependencies in F+ that involve only the attributes of Ri.
• If a decomposition is not dependency-preserving, some dependency is lost in the
decomposition.

Example 2.10.4 Consider the relation R (A, B, C) for functional dependency set {A -> B and
B -> C} which is decomposed into two relations R1 = (A, C) and R2 = (B, C). Then check if
this decomposition dependency preserving or not.
Solution : This can be solved in following steps :
Step 1 : For checking whether the decomposition is dependency preserving or not
we need to check the following condition
F+ = (F1 ∪ F2)+
Step 2 : We have with us F = {A->B, B->C}
Step 3 : Let us find (F1)+ for relation R1 and (F2)+ for relation R2

R1(A,C)
A->A Trivial
C->C Trivial
A->C In (F)+ as A->B->C, and it is non-trivial
AC->AC Trivial
A->B holds in (F)+ but is not useful as B is not part of R1
We can not obtain C->A

R2(B,C)
B->B Trivial
C->C Trivial
B->C In (F)+, and it is non-trivial
BC->BC Trivial
We can not obtain C->B

Step 4 : We will eliminate all the trivial and useless dependencies. Hence we
obtain for R1 and R2

R1(A,C) : A->C (non-trivial)
R2(B,C) : B->C (non-trivial)

(F1 ∪ F2)+ = {A->C, B->C}+ ≠ {A->B, B->C}+ i.e. (F)+, because A->B can not be derived
from A->C and B->C.
Thus the condition specified in step 1, i.e. F+ = (F1 ∪ F2)+, is not true. Hence it is not a
dependency preserving decomposition.

Example 2.10.5 Let relation R(A,B,C,D) be a relational schema with following functional
dependencies {A->B, B->C,C->D, and D->B}. The decomposition of R into (A,B), (B,C)
and (B,D). Check whether this decomposition is dependency preserving or not.
Solution :
Step 1 : Let F = {A->B, B->C, C->D, D->B}.
Step 2 : We will find (F1)+, (F2)+, (F3)+ for relations R1(A,B), R2(B,C) and R3(B,D) as
follows -

R1(A,B)
A->A Trivial
B->B Trivial
A->B In (F)+ and it is non-trivial
B->A can not be obtained
AB->AB Trivial

R2(B,C)
B->B Trivial
C->C Trivial
B->C In (F)+ and it is non-trivial
C->B In (F)+ as C->D->B, and it is non-trivial
BC->BC Trivial

R3(B,D)
B->B Trivial
D->D Trivial
B->D In (F)+ as B->C->D, and it is non-trivial
D->B In (F)+ and it is non-trivial
BD->BD Trivial

Step 3 : We will eliminate all the trivial and useless dependencies. Hence we
obtain for R1 ∪ R2 ∪ R3

R1(A,B) : A->B
R2(B,C) : B->C, C->B
R3(B,D) : B->D, D->B

Step 4 : From the above FDs we get A->B, B->C and D->B directly, and C->D follows
from C->B and B->D by transitivity.
Step 5 : This proves that F+ = (F1 ∪ F2 ∪ F3)+. Hence the given decomposition is
dependency preserving.
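Dependency preservation can also be tested without listing every projected dependency. The sketch below uses a common textbook technique: for each FD X->Y in F, the set X is repeatedly extended with whatever can be derived inside a single relation Ri, and Y must finally be contained in the result. The names closure and preserves_dependencies are assumptions introduced for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(fds, decomposition):
    """Return True if every FD in fds is implied by the projected FDs."""
    parts = [set(p) for p in decomposition]
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                # What the attributes of result inside Ri determine within Ri.
                gained = closure(result & part, fds) & part
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


# Example 2.10.4 and Example 2.10.5
print(preserves_dependencies([("A", "B"), ("B", "C")], ["AC", "BC"]))
print(preserves_dependencies(
    [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")], ["AB", "BC", "BD"]))
```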

University Question
1. Differentiate between lossless join decomposition and dependency preserving decomposition.
AU : Dec.-17, Marks 7

2.11 Normal Forms AU : Dec.-14, 15, May-18, Marks 16

• Normalization is the process of reorganizing data in a database so that it meets
two basic requirements:
1) There is no redundancy of data (all data is stored in only one place), and
2) data dependencies are logical (all related data items are stored together)
• Normalization is important because it allows the database to take up less disk
space.
• It also helps in increasing the performance.

2.11.1 First Normal Form


A table is said to be in 1NF if it follows the following rules -
i) It should only have single (atomic) valued attributes/columns.
ii) Values stored in a column should be of the same domain.
iii) All the columns in a table should have unique names.
iv) The order in which data is stored does not matter.
Consider the following Student table
Student
sid sname Phone
1 AAA 11111, 22222
2 BBB 33333
3 CCC 44444, 55555
As there are multiple values of phone number for sid 1 and 3, the above table is not in
1NF. We can bring it into 1NF. The conversion is as follows -

sid sname Phone
1 AAA 11111
1 AAA 22222
2 BBB 33333
3 CCC 44444
3 CCC 55555

2.11.2 Second Normal Form


Before understanding the second normal form let us first discuss the concept of partial
functional dependency and prime and non prime attributes.

Concept of Partial Functional Dependency


Partial dependency means that a nonprime attribute is functionally dependent on part
of a candidate key.
For example : Consider a relation R(A,B,C,D) with functional dependency
{AB->CD,A->C}
Here (AB) is a candidate key because
(AB)+ = {ABCD}={R}
Hence {A,B} are prime attributes and {C,D} are non prime attribute. In A->C, the non
prime attribute C is dependent upon A which is actually a part of candidate key AB.
Hence due to A->C we get partial functional dependency.

Prime and Non Prime Attributes


• Prime attribute : An attribute which is a part of some candidate key is known as a
prime attribute.
• Non-prime attribute : An attribute which is not a part of any candidate key is said to
be a non-prime attribute.
• Example : Consider a relation R={A,B,C,D} with candidate key AB. Then
Prime attributes : A, B
Non Prime attributes : C, D

The Second Normal Form


For a table to be in the Second Normal Form, following conditions must be followed
i) It should be in the First Normal form.
ii) It should not have partial functional dependency.

For example : Consider the following table in which all the information about the
student is maintained, such as student id (sid), student name (sname), course
id (cid) and course name (cname).

Student_Course
sid sname cid cname

1 AAA 101 C
2 BBB 102 C++
3 CCC 101 C
4 DDD 103 Java
This table is not in 2NF. For converting the above table to 2NF we must follow the
following steps -
Step 1 : The above table is in 1NF.
Step 2 : Here sname is associated with sid, and similarly cname is associated
with cid. Now if we delete the record with sid=2, then automatically the
course C++ will also get deleted. Taking {sid,cid} as the candidate key of the above table,
sid->sname and cid->cname are partial functional dependencies, because sname and
cname each depend on only a part of the key. Hence to bring the above table
to 2NF we must decompose it as follows :

Student
sid sname cid
1 AAA 101
2 BBB 102
3 CCC 101
4 DDD 103
Here the candidate key is (sid,cid) and (sid,cid)->sname

Course
cid cname
101 C
102 C++
103 Java
Here the candidate key is cid and cid->cname

Thus the partial functional dependency cid->cname is removed. Strictly, sid->sname is
still a partial dependency on the key (sid,cid) of Student; splitting Student further into
Student(sid, sname) and Enrollment(sid, cid) removes it, and then all the tables are in 2NF.



2.11.3 Third Normal Form


Before understanding the third normal form let us first discuss the concept of
transitive dependency, super key and candidate key

Concept of Transitive Dependency


A functional dependency is said to be transitive if it is indirectly formed by two
functional dependencies. For example -
X -> Z is a transitive dependency if the following functional dependencies hold true :
X->Y
Y->Z

Concept of Super key and Candidate Key


Superkey : A super key is a set of one or more columns (attributes) that uniquely
identifies rows in a table.
Candidate key : A minimal set of attributes which can uniquely identify a tuple is
known as a candidate key. For example consider the following table

RegID RollNo Sname

101 1 AAA
102 2 BBB
103 3 CCC
104 4 DDD

Superkeys
• {RegID}
• {RollNo}
• {RegID, RollNo}
• {RegID, Sname}
• {RollNo, Sname}
• {RegID, RollNo, Sname}

Candidate Keys
• {RegID}
• {RollNo}
Third Normal Form


A table is said to be in the Third Normal Form when,
i) It is in the Second Normal form (i.e. it does not have partial functional dependency).
ii) It doesn't have transitive dependency.
In other words, 3NF can be defined as : A table is in 3NF if it is in 2NF and for each
non-trivial functional dependency
X-> Y
at least one of the following conditions holds :
i) X is a super key of the table
ii) Y is a prime attribute of the table
For example : Consider the following table Student_details -

sid sname zipcode cityname state
1 AAA 11111 Pune Maharashtra
2 BBB 22222 Surat Gujarat
3 CCC 33333 Chennai Tamilnadu
4 DDD 44444 Jaipur Rajasthan
5 EEE 55555 Mumbai Maharashtra
Here
Super keys : {sid},{sid,sname},{sid,sname,zipcode}, {sid,zipcode,cityname}… and so on.
Candidate keys : {sid}
Non-Prime attributes : {sname,zipcode,cityname,state}
The dependencies can be denoted as
sid->sname
sid->zipcode
zipcode->cityname
cityname->state
Here sid->zipcode and zipcode->cityname together give the transitive dependency
sid->cityname. Hence the above table is not in 3NF. We can
convert it into 3NF as follows :
Student
sid sname zipcode
1 AAA 11111
2 BBB 22222
3 CCC 33333
4 DDD 44444
5 EEE 55555

Zip
zipcode cityname state
11111 Pune Maharashtra
22222 Surat Gujarat
33333 Chennai Tamilnadu
44444 Jaipur Rajasthan
55555 Mumbai Maharashtra

Example 2.11.1 Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of
functional dependencies F = { {A, B} -> C, A -> {D, E}, B -> F, F -> {G, H}, D -> {I, J} }
1. What is the key for R ? Demonstrate it using the inference rules.
2. Decompose R into 2NF, then 3NF relations.
Solution : Let,
A -> DE (given)
∴ A -> D, A -> E (decomposition)
As D -> IJ, A -> IJ (transitivity)
Using union rule we get
A -> DEIJ
As A -> A (reflexivity)
we get A -> ADEIJ
Using augmentation rule (augmenting with B) we get
AB -> ABDEIJ
But AB -> C (given)
∴ AB -> ABCDEIJ
B -> F (given), F -> GH ∴ B -> GH (transitivity)
∴ AB -> AGH is also true (augmentation with A)
Similarly AB -> AF ∵ B -> F (given)
Thus now using union rule
AB -> ABCDEFGHIJ
∴ AB is a key
The table can be converted to 2NF as
R1 = (A, B, C)
R2 = (A, D, E, I, J)
R3 = (B, F, G, H)
The above 2NF relations can be converted to 3NF as follows
R1 = (A, B, C)
R2 = (A, D, E)
R3 = (D, I, J)
R4 = (B, F)
R5 = (F, G, H).

University Questions
1. What is database normalization ? Explain the first normal form, second normal form and third normal form.
AU : May-18, Marks 13; Dec.-15, Marks 16
2. What are normal forms. Explain the types of normal form with an example.
AU : Dec.-14, Marks 16

2.12 Boyce / Codd Normal Form (BCNF)


Boyce and Codd Normal Form is a stricter version of the Third Normal form. This
form deals with a certain type of anomaly that is not handled by 3NF.
A 3NF table which does not have multiple overlapping candidate keys is said to be in
BCNF.
Or in other words,
For a table to be in BCNF, the following conditions must be satisfied :
i) R must be in 3rd Normal Form
ii) For each functional dependency ( X -> Y ), X should be a super key. In simple
words, if Y is a prime attribute then X can not be a non prime attribute.
For example - Consider the following table that represents student enrollment for
courses -

Enrollment Table
sid course Teacher
1 C Ankita
1 Java Poonam
2 C Ankita
3 C++ Supriya
4 C Archana
From the above table the following observations can be made :
• One student can enroll for multiple courses. For example, the student with sid=1 can
enroll for C as well as Java.
• For each course, a teacher is assigned to the student.
• There can be multiple teachers teaching one course; for example, course C is
taught by both Ankita and Archana.
• The candidate key for the above table can be (sid,course), because using these two
columns we can find the teacher.
• The above table holds the following dependencies
o (sid,course)->Teacher
o Teacher->course
• The above table is not in BCNF because of the dependency Teacher->course. Note
that Teacher is not a superkey; in other words, Teacher is a non prime
attribute, course is a prime attribute, and a non-prime attribute determines the prime
attribute.
• To convert the above table to BCNF we must decompose it into Student
and Course tables

Student
sid Teacher
1 Ankita
1 Poonam
2 Ankita
3 Supriya
4 Archana

Course
Teacher course
Ankita C
Poonam Java
Supriya C++
Archana C
Now the tables are in BCNF.

Example 2.12.1 Consider a relation R(A,B,C,D) having the following FDs {AB->C, AB->D,
C->A, B->D}. Find out the normal form of R.
Solution :
Step 1 : We will first find out the candidate keys from the given FDs.
(AB)+ = {ABCD} = R
(BC)+ = {ABCD} = R
(AC)+ = {AC} ≠ R
D does not appear on the LHS of any FD and is determined by B. Hence D is not part of any
candidate key. Thus we obtain two candidate keys AB and BC. Hence
prime attributes = {A,B,C}
Non prime attributes = {D}
Step 2 : Now, we will check in reverse order, that means first BCNF,
then 3NF, then 2NF.
Step 3 : For R to be in BCNF, for every X->Y the X should be a candidate key or super key.
From the above FDs consider C->A, in which C is not a candidate key or super key.
Hence the given relation is not in BCNF.
Step 4 : For R to be in 3NF, for every X->Y either i) X should be a candidate key or super
key or ii) Y should be a prime attribute. (For prime and non prime attributes refer
step 1)

o For AB->C or AB->D, AB is a candidate key. Condition for 3NF is
satisfied.
o Consider C->A. In this FD, C is not a candidate key but A is a prime
attribute. Condition for 3NF is satisfied.
o Now consider B->D. In this FD, B is not a candidate key and D is
not a prime attribute. Hence the condition for 3NF fails here.
Hence the given relation is not in 3NF.
Step 5 : For R to be in 2NF the following condition should not occur.
Let X->Y, where X is a proper subset of a candidate key and Y is a non prime attribute. This
is a case of partial functional dependency.
For a relation to be in 2NF there should not be any partial functional dependency.
o For AB->C or AB->D, AB is a complete candidate key. Condition for
2NF is satisfied.
o Consider C->A. In this FD, A is a prime attribute. Condition for 2NF is
satisfied.
o Now consider B->D. In this FD, B is a part of a candidate key (AB or BC) and
D is not a prime attribute. That means a partial functional
dependency occurs here. Hence the condition for 2NF fails here.
Hence the given relation is not in 2NF.
Therefore we can conclude that the given relation R is in 1NF.
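The reverse-order check used in this example (BCNF, then 3NF, then 2NF) can be sketched in Python for small schemas. The function names closure, candidate_keys and highest_normal_form are assumptions introduced here, the relation is assumed to be in 1NF already, and candidate keys are found by brute force.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(rel, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for n in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), n):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == set(rel):
                keys.append(s)
    return keys


def highest_normal_form(rel, fds):
    """Return "BCNF", "3NF", "2NF" or "1NF" for a relation assumed to be in 1NF."""
    keys = candidate_keys(rel, fds)
    prime = set().union(*keys)
    # Non-trivial FDs with a single attribute on the right-hand side.
    single = [(set(l), a) for l, r in fds for a in r if a not in l]

    def is_superkey(x):
        return closure(x, fds) == set(rel)

    if all(is_superkey(x) for x, _ in single):
        return "BCNF"
    if all(is_superkey(x) or a in prime for x, a in single):
        return "3NF"
    # Partial dependency: a proper subset of a key determines a non-prime attribute.
    for k in keys:
        for n in range(1, len(k)):
            for sub in combinations(sorted(k), n):
                if closure(sub, fds) - prime:
                    return "1NF"
    return "2NF"


# Example 2.12.1
print(highest_normal_form("ABCD", [("AB", "C"), ("AB", "D"), ("C", "A"), ("B", "D")]))
```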
Example 2.12.2 Consider a relation R(ABC) with the following FDs : A->B, B->C and C->A.
What is the normal form of R ?
Solution :
Step 1 : We will find the candidate keys
(A)+ = {ABC} = R
(B)+ = {ABC} = R
(C)+ = {ABC} = R
Hence A, B and C are all candidate keys
Prime attributes = {A,B,C}
Non prime attributes = {}
Step 2 : For R to be in BCNF, for every X->Y the X should be a candidate key or super key.
From the above FDs
o Consider A->B, in which A is a candidate key. Condition for
BCNF is satisfied.
o Consider B->C, in which B is a candidate key. Condition for
BCNF is satisfied.
o Consider C->A, in which C is a candidate key. Condition for
BCNF is satisfied.
This shows that the given relation R is in BCNF.
Example 2.12.3 Prove that any relational schema with two attributes is in BCNF.
Solution : Here, we will consider R={A,B} i.e. a relational schema with two attributes.
The possible non-trivial FDs are A->B and B->A. This gives four cases :
o Only A->B holds. Then A is a candidate key. Condition for
BCNF is satisfied.
o Only B->A holds. Then B is a candidate key. Condition for
BCNF is satisfied.
o Both A->B and B->A hold. Then both A and B are candidate keys.
Condition for BCNF is satisfied.
o No FD holds in relation R. Then {A,B} is the candidate key, and the
condition for BCNF is trivially satisfied.
This shows that any relation R with two attributes is in BCNF.
2.13 Multivalued Dependencies and Fourth Normal Form
AU : May-14, Dec.-16, Marks 16

Concept of Multivalued Dependencies


 A table is said to have multi-valued dependency, if the following conditions are
true,
1) For a dependency A  B, if for a single value of A, multiple values of B
exists, then the table may have multi-values dependency.
2) Also, a table should have at-least 3 columns for it to have a multi-valued
dependency.
3) And, for a relation R(A,B,C), if there is a multi-valued dependency between,
A and B, then B and C should be independent of each other.
If all these conditions are true for any relation(table), it is said to have multi-valued
dependency.
 In simple terms, if there are two columns A and B - and for column A if there are
multiple values of column B then we say that MVD exists between A and B
 The multivalued dependency is denoted by
 If there exists a multivalued dependency then the table is not in 4th normal form.
 For example : Consider following table for information about student

Student
sid    Course     Skill
1      C, C++     English, German
2      Java       English, French

Here sid = 1 leads to multiple values for Course and Skill. The following table shows this

sid    Course    Skill
1      C         English
1      C++       German
1      C         German
1      C++       English
2      Java      English
2      Java      French

Here Course depends on sid and Skill depends on sid, but Course and Skill are
independent of each other. The multivalued dependencies are denoted as :
sid →→ Course
sid →→ Skill
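The two dependencies can be verified mechanically on the table above. The following is a minimal Python sketch, not part of the original notes; the function name holds_mvd and the list-of-dictionaries representation of the table are assumptions made for illustration. It uses the usual definition: X →→ Y holds if, for any two rows that agree on X, the row that takes its Y values from the first and its remaining values from the second is also in the table.

```python
STUDENT = [
    {"sid": 1, "Course": "C", "Skill": "English"},
    {"sid": 1, "Course": "C++", "Skill": "German"},
    {"sid": 1, "Course": "C", "Skill": "German"},
    {"sid": 1, "Course": "C++", "Skill": "English"},
    {"sid": 2, "Course": "Java", "Skill": "English"},
    {"sid": 2, "Course": "Java", "Skill": "French"},
]

def holds_mvd(rows, x, y):
    """Check whether the multivalued dependency X ->> Y holds on `rows`."""
    # Z is every column that is in neither X nor Y.
    z = set(rows[0]) - set(x) - set(y)
    present = {tuple(sorted(row.items())) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Keep X and Y from t1, take the Z values from t2.
                swapped = dict(t1)
                swapped.update({a: t2[a] for a in z})
                if tuple(sorted(swapped.items())) not in present:
                    return False
    return True

print(holds_mvd(STUDENT, ["sid"], ["Course"]))  # True
print(holds_mvd(STUDENT, ["sid"], ["Skill"]))   # True

# Dropping the row (1, C, German) breaks the independence of Course and Skill.
incomplete = STUDENT[:2] + STUDENT[3:]
print(holds_mvd(incomplete, ["sid"], ["Course"]))  # False
```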

Fourth Normal Form


Definition : For a table to satisfy the Fourth Normal Form, it should satisfy the following
two conditions :
1) It should be in the Boyce-Codd Normal Form (BCNF).
2) The table should not have any multi-valued dependency.
For example : Consider the following Student relation, which is not in 4NF because it
contains multivalued dependencies.

Student Table
sid Course Skill

1 C English
1 C++ German
1 C German
1 C++ English
2 Java English
2 Java French
To convert the above table to 4NF, we decompose it into the following
two tables.

Student_Course Table

Key : (sid,Course)
sid Course
1 C
1 C++
2 Java

Student_Skill Table

Key : (sid,Skill)
sid Skill
1 English
1 German
2 English
2 French
Thus the tables are now in 4NF.
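The decomposition above loses no information, which can be confirmed by projecting the Student table and joining the projections back together. The following is a minimal Python sketch under stated assumptions: the helper names project, natural_join and as_set are hypothetical and chosen only for this illustration.

```python
STUDENT = [
    {"sid": 1, "Course": "C", "Skill": "English"},
    {"sid": 1, "Course": "C++", "Skill": "German"},
    {"sid": 1, "Course": "C", "Skill": "German"},
    {"sid": 1, "Course": "C++", "Skill": "English"},
    {"sid": 2, "Course": "Java", "Skill": "English"},
    {"sid": 2, "Course": "Java", "Skill": "French"},
]

def project(rows, attrs):
    """Project `rows` onto `attrs`, dropping duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(left, right):
    """Join two tables on every column name they share."""
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

def as_set(rows):
    """Order-independent form of a table, used for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

student_course = project(STUDENT, ["sid", "Course"])
student_skill = project(STUDENT, ["sid", "Skill"])
rejoined = natural_join(student_course, student_skill)

print(len(student_course), len(student_skill))  # 3 4
print(as_set(rejoined) == as_set(STUDENT))      # True
```

The join returns exactly the six original rows, so Student_Course and Student_Skill form a lossless decomposition of Student.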

University Questions
1. Explain first normal form, second normal form, third normal form and BCNF with example.
AU : Dec.-16, Marks 13

2. Explain Boyce Codd Normal form and fourth normal form with suitable example.
AU : May-14, Marks 16

2.14 Join Dependencies and Fifth Normal Form


Concept of Join Dependencies

o A join dependency is a further generalization of a multivalued
dependency.
o Let R1(A, B, C) and R2(C, D) be decompositions of a given relation
R(A, B, C, D).
o If the join of R1 and R2 over C is equal to relation R, then we say that a
Join Dependency (JD) exists.
o Equivalently, R1 and R2 are a lossless decomposition of R.
o A JD ⋈ {R1, R2, ..., Rn} is said to hold over a relation R if R1, R2, ..., Rn is a
lossless-join decomposition of R.
o *((A, B, C), (C, D)) is a JD of R if the join of R1 and R2 over their common
attribute C is equal to the relation R.
o Here, *(R1, R2, R3, ...) is used to indicate that the relations R1, R2, R3 and so on are
a JD of R.

Concept of Fifth Normal Form


A database is said to be in 5NF if -
i) It is in 4th Normal Form, and
ii) Any further decomposition of a table, made to eliminate redundancy and anomalies,
is lossless : when we rejoin the decomposed tables we must neither lose original data
nor get a new record (Join Dependency Principle).
The fifth normal form is also called project-join normal form (PJNF).
For example - Consider the following table

Seller    Company     Product
Rupali    Godrej      Cinthol
Sharda    Dabur       Honey
Sharda    Dabur       HairOil
Sharda    Dabur       Rosewater
Sunil     Amul        Icecream
Sunil     Britania    Biscuits
Here we assume the key is {Seller, Company, Product}.
The above table has the multivalued dependency
Seller →→ {Company, Product}. Hence the table is not in 4th Normal Form. To bring the
above table into 4th normal form we decompose it into two tables as

Seller_Company Table
Seller    Company
Rupali    Godrej
Sharda    Dabur
Sunil     Amul
Sunil     Britania

Seller_Product Table
Seller    Product
Rupali    Cinthol
Sharda    Honey
Sharda    HairOil
Sharda    Rosewater
Sunil     Icecream
Sunil     Biscuits


The above tables are in 4th Normal Form as there is no multivalued dependency. But they
are not in 5th normal form, because if we join the two tables we get

Seller    Company     Product
Rupali    Godrej      Cinthol
Sharda    Dabur       Honey
Sharda    Dabur       HairOil
Sharda    Dabur       Rosewater
Sunil     Amul        Icecream
Sunil     Amul        Biscuits      (newly added record)
Sunil     Britania    Icecream      (newly added record)
Sunil     Britania    Biscuits

The two marked records are newly added; they are not present in the original table.

To avoid the above problem we can decompose the table into three tables :
Seller_Company, Seller_Product and Company_Product.

Seller_Company Table
Seller    Company
Rupali    Godrej
Sharda    Dabur
Sunil     Amul
Sunil     Britania

Seller_Product Table
Seller    Product
Rupali    Cinthol
Sharda    Honey
Sharda    HairOil
Sharda    Rosewater
Sunil     Icecream
Sunil     Biscuits

Company_Product Table
Company     Product
Godrej      Cinthol
Dabur       Honey
Dabur       HairOil
Dabur       Rosewater
Amul        Icecream
Britania    Biscuits

Thus the tables are in 5th normal form.
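The effect of the two-table and three-table decompositions can be reproduced on the Seller data. The following is a minimal Python sketch, not part of the original notes; the variable names are assumptions chosen for illustration, and each table is represented as a set of tuples.

```python
ORIGINAL = {
    ("Rupali", "Godrej", "Cinthol"),
    ("Sharda", "Dabur", "Honey"),
    ("Sharda", "Dabur", "HairOil"),
    ("Sharda", "Dabur", "Rosewater"),
    ("Sunil", "Amul", "Icecream"),
    ("Sunil", "Britania", "Biscuits"),
}

# Projections of the original table onto pairs of columns.
seller_company = {(s, c) for s, c, p in ORIGINAL}
seller_product = {(s, p) for s, c, p in ORIGINAL}
company_product = {(c, p) for s, c, p in ORIGINAL}

# Join Seller_Company with Seller_Product on Seller.
two_way = {
    (s, c, p)
    for s, c in seller_company
    for s2, p in seller_product
    if s == s2
}

# Additionally require that (Company, Product) appears in Company_Product.
three_way = {(s, c, p) for s, c, p in two_way if (c, p) in company_product}

spurious = two_way - ORIGINAL
print(sorted(spurious))
# [('Sunil', 'Amul', 'Biscuits'), ('Sunil', 'Britania', 'Icecream')]
print(three_way == ORIGINAL)  # True
```

The two-table join produces the two extra records shown earlier, while the three-table join returns exactly the original table.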



2.15 Two Marks Questions with Answers

Explain Entity Relationship model. AU : May-16


Ans. : • The ER data model specifies an enterprise schema that represents the overall
logical structure of a database.
• The E-R model is very useful in mapping the meanings and interactions of
real-world entities onto a conceptual schema.

Give the limitations of E-R model ? How do you overcome this ? AU : May-07
Ans. : 1) Loss of information content : Some information may be lost or hidden in the ER
model.
2) Limited relationship representation : The ER model represents limited relationships as
compared to other data models such as the relational model.
3) No representation of data manipulation : It is difficult to show data manipulation
in the ER model.
4) Popular for high level design : The ER model is popular mainly for high level
design.

List the design phases of Entity Relationship model.


Ans. : 1) Requirement Analysis, 2) Conceptual Database Design, 3) Logical
Database Design, 4) Schema Refinement, 5) Physical Database Design,
6) Application and Security Design.

What is an entity ? AU : May-14


Ans. : • An entity is an object that exists and is distinguishable from other objects.
• For example - a student named "Poonam" is an entity and can be identified by her
name. An entity is represented as a box in the ER model.

What do you mean by derived attributes ?


Ans. : • Derived attributes are attributes whose values are calculated
from other attributes.
• A derived attribute is represented by a dotted ellipse inside a solid ellipse. For
example - Age can be derived from the attribute DateOfBirth. In this situation,
DateOfBirth is called a stored attribute.

What is a weak entity ? Give example. AU : Dec.-16, May-18


Ans. : Refer section 2.3.4

What are the problems caused by redundancy ? AU : Dec.-17


Ans. : Following problems can be caused by redundancy -
i) Redundant Storage : Some information is stored repeatedly.
ii) Update Anomalies : If one copy of such repeated data is updated, an
inconsistency is created unless all other copies are similarly updated.
iii) Insertion Anomalies : Inserting a new record adds repeated information
to the relation.
iv) Deletion Anomalies : Deleting a particular record also deletes other important
information associated with it, so that information is lost
from the relation.

Define functional dependency. AU : Dec 04,05, May 05,14,15


Ans. : Let P and Q be sets of columns. Then P functionally determines Q, written
P → Q, if and only if any two rows that are equal on (all the attributes in) P must be equal
on (all the attributes in) Q.
In other words, the functional dependency holds if, for any two tuples T1 and T2,
T1.P = T2.P implies T1.Q = T2.Q
where the notation T1.P denotes the projection of the tuple T1 onto the attributes in P.
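This definition translates directly into a check over the rows of a table. The following is a minimal Python sketch, not part of the original answer; the function name holds_fd is an assumption for illustration, and the data is the Seller table from section 2.14.

```python
SELLER = [
    {"Seller": "Rupali", "Company": "Godrej", "Product": "Cinthol"},
    {"Seller": "Sharda", "Company": "Dabur", "Product": "Honey"},
    {"Seller": "Sharda", "Company": "Dabur", "Product": "HairOil"},
    {"Seller": "Sharda", "Company": "Dabur", "Product": "Rosewater"},
    {"Seller": "Sunil", "Company": "Amul", "Product": "Icecream"},
    {"Seller": "Sunil", "Company": "Britania", "Product": "Biscuits"},
]

def holds_fd(rows, p, q):
    """True if any two rows that are equal on P are also equal on Q."""
    seen = {}  # maps each P value to the Q value first seen with it
    for row in rows:
        p_val = tuple(row[a] for a in p)
        q_val = tuple(row[a] for a in q)
        if seen.setdefault(p_val, q_val) != q_val:
            return False
    return True

# Each product is sold by one company in this data, so Product -> Company holds.
print(holds_fd(SELLER, ["Product"], ["Company"]))  # True

# Sunil appears with both Amul and Britania, so Seller -> Company fails.
print(holds_fd(SELLER, ["Seller"], ["Company"]))   # False
```

A check like this can only show that an FD is satisfied by the rows currently stored; whether the FD is a rule of the schema is a design decision.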

Why certain functional dependencies are called trivial functional dependencies ?
AU : May-06,12
Ans. : • A functional dependency FD : X → Y is called trivial if Y is a subset of X.
This kind of dependency is called trivial because it always holds, whatever data the
relation contains. The left side is called the determinant and the right side the
dependent.
• For example - {A,B} → B is a trivial functional dependency because B is a subset of
{A,B}. Since {A,B} already includes B, any two rows that agree on {A,B} necessarily
agree on B.
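The subset test can be written in one line. The following is a minimal Python sketch; the function name is_trivial is an assumption for illustration.

```python
def is_trivial(x, y):
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(y) <= set(x)

print(is_trivial({"A", "B"}, {"B"}))  # True
print(is_trivial({"A"}, {"B"}))       # False
```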

Define normalization. AU : May -14


Ans. : Normalization is the process of reorganizing data in a database so that it meets
two basic requirements :
1) There is no redundancy of data (all data is stored in only one place), and
2) data dependencies are logical (all related data items are stored together)

State anomalies of 1NF. AU : Dec.-15


Ans. : A relation that is only in 1NF can suffer from all of the insertion, deletion and
update anomalies.

What is multivalued dependency ? AU : Dec. -06


Ans. : A table is said to have a multi-valued dependency if the following
conditions are true :
1) For a dependency A →→ B, a single value of A is associated with multiple values
of B.
2) The table has at least 3 columns.
3) For a relation R(A,B,C), if there is a multi-valued dependency between
A and B, then B and C are independent of each other.

Describe BCNF and describe a relation which is in BCNF. AU : Dec. -02


Ans. : Refer section 2.12.

Why 4NF in normal form is more desirable than BCNF ? AU : Dec. -14
Ans. :
• 4NF is more desirable than BCNF because it reduces the repetition of information.
• If we consider a BCNF schema that is not in 4NF, we observe that decomposition into
4NF does not lose information, provided that a lossless join decomposition is used,
yet redundancy is reduced.

Give an example of a relation schema R and set of dependencies such that R is in
BCNF but not in 4NF. AU : May -12
Ans. : Consider relation R(A,B,C,D) with dependencies
AB → C
ABC → D
AC →→ B
Here the only key is AB. Thus each functional dependency has a superkey on the left.
But the MVD AC →→ B has a non-superkey on its left. So R is not in 4NF.

Show that if a relation is in BCNF, then it is also in 3NF. AU : Dec.-12


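Ans. :
• Boyce-Codd Normal Form is a stricter version of the Third Normal Form.
• BCNF requires that for every non-trivial functional dependency X → Y, X is a super
key. 3NF requires only that X is a super key or that every attribute of Y not in X is
part of some candidate key. Any dependency that meets the BCNF condition therefore
also meets the 3NF condition.
• A 3NF table which does not have multiple overlapping candidate keys is said to
be in BCNF. When the table is in BCNF, it has neither partial functional
dependencies nor transitive dependencies.
• Hence if a relation is in BCNF, then it is also in 3NF.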

Why it is necessary to decompose a relation ? AU : May-07

Ans. : • Decomposition is the process of breaking down one table into multiple
tables.
• Decomposition is used for eliminating redundancy.

Explain at least two desirable properties of decomposition. AU : May-03,17, Dec.-05


Ans. :
There are two properties associated with decomposition -
1) Lossless Join or Non-loss Decomposition : When all information found in the
original database is preserved after decomposition, we call it a lossless or non-loss
decomposition.
2) Dependency Preservation : This is a property in which the constraints on the
original table can be maintained by simply enforcing some constraints on each of
the smaller relations.

Explain with simple example lossless join decomposition. AU : May-03


Ans. : Refer section 2.10.1.




























