Chapter 9. Database Design
Introduction
In the previous chapter, we discussed how the physical architecture for the proposed system is
determined. The physical architecture includes several components such as database, programs, and
inputs/outputs, which need to be designed in the detailed design phase. Database design marks the first
major step within the design phase. Database design deals with the selection of the best structure for
storing the required data for the proposed system. The logical system specifications that facilitate
detailed design are available at the end of the high level design stage. One of the logical system
specifications is the list of unique data items for the proposed system. This list is the primary input for the
detailed database design stage. The first step in the database design process is to identify data items
that should be stored in the database. The next step is to determine how those data items should be
stored.
Learning Outcomes
We assume that the database will be implemented using a relational database management
system. Although a database can follow a “hierarchical”, “network”, or relational model, most commercial
database management systems follow relational models. Further, a majority of databases are
implemented using relational database software. In a relational database system, a database is defined
as a collection of related tables. A table is a collection of related records (rows), and each record consists
of a set of related data items (fields). For most practical purposes, a field is the primitive data item in a
database. For example, employee and department could be two tables in a database used by a firm.
The database could store information about which employees belong to which departments, a relationship
between employees and departments. The employee table stores a record for each employee in the firm,
and each employee record could contain fields such as employee#, name, title, phone#, and
department_code. Similarly, the department table could contain department records with department_code
and department_name as fields. In a relational database, a relationship between two tables is established
by creating common fields in them. For instance, in the previous example, department_code is the
common field in the employee and department tables. Please note that an employee record without the
department_code cannot relate an employee to the department to which the employee belongs. In
relational models, the relationship between the two tables is thus established with a field common to both
the tables. Such relationships can exist between two or more tables, and even within one table as
discussed later in this chapter. In contrast, the hierarchical and network databases use pointers to
identify the relationships. Although these databases can perform faster than relational databases, they
lost favor with the database designers due to complexities in creating and maintaining the databases.
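To make the idea of a common field concrete, the following minimal sketch builds the employee and department tables with Python's built-in sqlite3 module and relates them through department_code. The sample rows and the column names employee_no and phone_no (standing in for employee# and phone#) are assumptions made for this illustration only.

```python
import sqlite3

# Hypothetical sample data; only the table and field names come from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        department_code TEXT PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_no     TEXT PRIMARY KEY,   -- "employee#" in the text
        name            TEXT NOT NULL,
        title           TEXT,
        phone_no        TEXT,               -- "phone#" in the text
        department_code TEXT NOT NULL       -- common field shared with department
    );
    INSERT INTO department VALUES ('D10', 'Accounting'), ('D20', 'Sales');
    INSERT INTO employee VALUES
        ('E1', 'John', 'Manager', '555-0100', 'D10'),
        ('E2', 'Mary', 'Analyst', '555-0101', 'D20');
""")

# The relationship is followed by matching the common field in both tables.
rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employee AS e
    JOIN department AS d ON d.department_code = e.department_code
    ORDER BY e.employee_no
""").fetchall()
print(rows)
```

Without department_code in the employee table, the join condition would have nothing to match on, which is the point made above about the common field.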
Database design consists of two steps: logical database design and physical database design.
The objective of logical database design is to produce a database structure that will result in a high level
of data integrity. The logical database design process achieves this by reducing data redundancy and
maintenance anomalies. Data redundancy or data duplication not only increases the storage space, but
also, more importantly, makes data maintenance error-prone. For instance, suppose multiple copies of a
data item are stored in the database and the data item needs to be updated, then, unless the update
procedure ensures that all copies of the data item are updated, the update will result in an inconsistent
database. Similarly, maintenance (insertion, deletion, and modification) anomalies occur when the
maintenance activity results in unintended consequences. For instance, in a database designed for a
university, assume that a table stores both course and faculty data. When a course is deleted from this
table (perhaps because the university has eliminated the course from its curriculum), then there is
potential to lose the data associated with the faculty member teaching that course. This will occur if that
faculty member’s data exist only as part of the record that is deleted. So, the faculty member data could
be deleted even though he/she is still a faculty member of the university! Such anomalies occur if the
database is poorly designed. One way to deal with the anomaly problem is to let the maintenance
programs take care of it. However, this approach makes maintenance programs complex. The cause of
such maintenance anomalies is poor database design, and hence the logical database design process
addresses database maintenance anomalies. The logical database design produces a “data model” of
the system in the form of an Entity-Relationship-Attribute (ERA) diagram. Please recall that dataflow
diagrams represent the process model of a system. In a process model, processes and the data used
and produced by each process are the focus of the representation. A process model does not represent
the relationships among the data. In contrast, a data model represents a system from the perspective of
its data and does not represent processes or the relationships among the processes. The process
model and the data model are two sides of the same coin, and provide two different views of the same
system.
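The deletion anomaly described above can be reproduced in a few lines. This is only a sketch under assumed data: the table name course_faculty, its columns, and the faculty names are hypothetical, and Python's built-in sqlite3 module stands in for whatever database software is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Poor design: course data and faculty data share one table.
    CREATE TABLE course_faculty (
        course_no      TEXT PRIMARY KEY,
        course_title   TEXT,
        faculty_name   TEXT,
        faculty_office TEXT
    );
    INSERT INTO course_faculty VALUES
        ('MIS471', 'Systems Analysis and Design', 'Smith', 'Room 210'),
        ('MIS301', 'Databases', 'Jones', 'Room 115');
""")

def faculty_on_record(connection):
    """Return the faculty names that can still be found in the database."""
    cursor = connection.execute(
        "SELECT DISTINCT faculty_name FROM course_faculty ORDER BY faculty_name"
    )
    return [name for (name,) in cursor]

before = faculty_on_record(conn)
# The university drops the course from its curriculum.
conn.execute("DELETE FROM course_faculty WHERE course_no = 'MIS471'")
after = faculty_on_record(conn)

print(before)  # both faculty members are on record
print(after)   # Smith has disappeared along with the course
```

Keeping faculty data in its own table would let the course row be deleted without touching the faculty member's record.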
The objectives of physical database design are to facilitate database implementation and to
improve the database performance, e.g., database access time and speed of transaction processing.
The normalization process used in logical database design improves data integrity, but frequently degrades
database performance. One of the challenges of the database design process is to strike a good
balance between database integrity and database performance. The physical database design process aims
to achieve this balance and includes techniques such as indexing, denormalization, and partitioning.
In a database course, you might have learned to draw an ERA diagram and identify various
tables along with their respective attributes from a description of the system and without the benefit of a
process analysis and process model. The systems development cycle following the waterfall model has
the benefit of completing the process model that leads to identification of various data items needed to
support the system, even before the database design begins. Please recall the chapter on High Level
design (Chapter 8), which identifies a list of data items required to support a system. The data items thus
identified can be classified as basic, generated, or derived data items.
Basic data items are those data items that come from outside of the system or are found in master
tables that are created with the system. Examples for basic data items include data items such as
customer number, customer name, and customer address that come from the customer entity to an order
processing system. In this order processing system, data items such as item name and unit price are
taken from the master catalog file. Generated data items are those data items that are generated by various
processes within the system. Since these data items originate in the system, they should be stored in
the database. Examples of generated data items in an order processing system include data items such as
order number and order date. Derived data items are those data items that are derived as a function of
other data items that are available in the system. Examples for derived data items in an order processing
system include data items such as extension (= quantity * unit price) and total extension (=sum of all
extensions).
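The two derived data items mentioned above can be written out as a short sketch. The class name OrderLine, the function total_extension, and the sample values are assumptions for illustration; only the formulas (extension = quantity * unit price, total extension = sum of all extensions) come from the text.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class OrderLine:
    item_name: str        # basic data item, taken from the master catalog
    unit_price: Decimal   # basic data item, taken from the master catalog
    quantity: int         # basic data item, supplied with the order

    @property
    def extension(self) -> Decimal:
        # Derived data item: computed on demand, not stored.
        return self.quantity * self.unit_price

def total_extension(lines) -> Decimal:
    # Derived data item: the sum of all extensions in a transaction.
    return sum((line.extension for line in lines), Decimal("0"))

lines = [
    OrderLine("Widget", Decimal("2.50"), 4),
    OrderLine("Gadget", Decimal("10.00"), 1),
]
print(lines[0].extension)      # 10.00
print(total_extension(lines))  # 20.00
```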
Whereas basic and generated data items are always stored, the decision to store or not store
derived data items is debatable. Database concepts require that derived data items not be stored, to
avoid redundancy and to conserve storage space. However, if one of the underlying data items has time-variant
values, then the derived data item needs to be stored. Since the data item extension discussed above
is derived from quantity and unit price available in the system, one may decide not to store this data item.
However, the data item, unit price of an item, may change over time, and any extension calculated at a
different time may give a value different from the original value. Consider an item that you buy from a
department store. A return of this item would have to be accompanied by the original receipt. The
procedures for return and accounting of such returns depend on how the unit price or extension of this
item is stored in the database. If the original transaction is available in the system, it can be reversed
upon return. A store following the principles of database design strictly would not have the unit price or
extension stored for the transaction. If the price has gone up since you bought the item, the store would
have to return more money than what the customer paid originally. On the other hand, if the unit price or
extension were stored for the transaction in the database, then the store would pay exactly what the
customer paid. A data item such as unit price, whose values vary with time, is called a temporal data item.
Since the design of temporal databases is complex, many designers store such data items with each
transaction. This results in some redundant data that must be carefully controlled. An
alternative that avoids such redundancy is to associate time-variant data with the period for which each value applies.
Thus, if the unit price of an item carries the dates for which the price is applicable, the redundant data item
need not be stored with each transaction.
Another motivation to store derived data is to save processing. Certain data items that may be
derived after complex calculations and multiple table accesses should be stored to save processing time.
Even in the case of data items that involve moderate processing, it may be worthwhile to store a derived
data item if it is going to be used frequently. Derived data items that are time-invariant, simple to
calculate, and not frequently used need not be stored. The total extension discussed above is a good
example for such data items because it can be calculated as a sum of all extensions in a transaction.
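The alternative of associating a time-variant unit price with the dates for which it applies can be sketched as follows. The structure PRICE_HISTORY, the item number, the dates, and the prices are all hypothetical; the sketch assumes each item's history is kept sorted by effective date.

```python
from datetime import date
from decimal import Decimal

# Hypothetical price history: (effective_from, unit_price), sorted by date.
PRICE_HISTORY = {
    "ITEM-1": [
        (date(2024, 1, 1), Decimal("10.00")),
        (date(2024, 6, 1), Decimal("12.00")),
    ],
}

def unit_price_on(item_no, on_date):
    """Return the unit price that was in effect for item_no on on_date."""
    applicable = [price for start, price in PRICE_HISTORY[item_no] if start <= on_date]
    if not applicable:
        raise LookupError(f"no price for {item_no} on {on_date}")
    return applicable[-1]  # latest price whose effective date is not after on_date

def extension(item_no, quantity, transaction_date):
    """Recompute the extension using the price valid at the time of the transaction."""
    return quantity * unit_price_on(item_no, transaction_date)

print(extension("ITEM-1", 3, date(2024, 3, 15)))  # 30.00, priced at 10.00
print(extension("ITEM-1", 3, date(2024, 7, 1)))   # 36.00, priced at 12.00
```

With dated prices, a return processed after a price increase can still be reversed at the amount originally paid, without storing the extension in the transaction.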
In drawing an E-R-A diagram, a designer needs to select the attributes from the list identified in high level
design.
The logical database design involves drawing an E-R-A diagram to determine various tables in
the database, relationships among tables, and fields in each table. The ERA diagram also helps to
identify constraints that should be imposed on data. Normalization rules can be applied, if necessary, to
ensure tables are in at least 3rd Normal Form (3NF). After finalizing the logical database design, the two
models of the system, process model and data model, are integrated, so that each process and its inputs
and outputs are defined in terms of the newly formed tables. This section will discuss the various
elements of the E-R-A diagram, followed by an example of constructing one. An illustration
using the AP system will further exemplify the concepts and the integration of process model and data
model.
E-R-A Diagram
Entity: An entity is an identifiable thing or object that is distinguishable from other objects. The reader is
cautioned that the term entity is used differently in dataflow diagrams (DFDs) and in E-R-A diagrams. An
entity in a DFD represents an external organization, customer, person, or system that interacts with
the system under consideration. An entity in an E-R-A diagram can be an organization, a place, a
department, or an event. In short, an entity in an E-R-A diagram is any object about which a system
needs to collect, process, and store various data. Example entities include employee with identification
number 123, Systems Analysis and Design course with course identification MIS471, etc. An entity could
represent real or abstract objects as well as activities (e.g., shipping, registration) and others. Entities can
be described at two levels of detail. An entity class represents a collection of similar entities; an entity
instance represents a particular entity. For example, an entity called ‘student’ represents a collection of
students; hence, student is an entity class. A student with the student# ‘93844737373’ represents a
particular student; and, hence, is an instance of the student class. In the E-R-A diagram, we generally
model entity classes. Because all instances of an entity class are similar, the database design process
treats all instances of an entity in the same manner. Consequently, there is no need to model entities at
the instance level. Further, modeling the data at the instance level will make the diagram unnecessarily
big and complex without adding any new information content. We show an entity class using a
rectangle.
There are several types of entities such as basic/fundamental entities, associative entities, weak
entities, and super/sub entities. Later parts of this chapter explain these types of entities. Basic or
fundamental entities are independent entities that do not depend on another entity for their existence. The
entities Department, Employee, and Skill in Figure 9-2 are basic entities. Associative entities are formed
when two or more entities are involved in a many-to-many relationship. The entities Employee-Skill in
Figure 9-3, Pre-requisite in Figure 9-5, and Factory-Warehouse-Product in Figure 9-7 are associative
entities. Video copy in Figure 9-8 is a weak entity. Super and sub entities are illustrated in Figure 9-9.
Attribute: An attribute is a property that describes an entity. For instance, the entity class ‘employee’ can have attributes ‘employee#’, ‘name’, and ‘jobtitle’.
Note that all employee instances will have the same set of attributes. Every attribute of an entity instance
has an associated value. For example, the values of ‘employee#’, ‘name’, and ‘jobtitle’ of a particular
employee could be ‘E1’, ‘John’, and ‘Manager’, respectively. An attribute of an entity is shown in an
E-R-A diagram using an oval shaped box connected to the rectangle representing the entity.
At least one attribute serves as an identifier for the entity. For example, item number can be an
identifier for an item entity containing attributes such as item number, item name, and item price. Certain
entities need more than one attribute as the identifier. For example, order number and item number together can
be the identifier for an order entity containing attributes such as order number, item number, and order quantity.
To find how much of an item was ordered, a user needs to know both the order number and the item
number. An identifier uniquely identifies an instance. That is, no two entity instances will have the same
value (or combination of values) for the identifier. In some cases, an attribute of an entity instance could have more than one value.
We call such an attribute a multi-valued attribute. In general, it is good practice to avoid using multi-valued
attributes in an E-R-A diagram. Multi-valued attributes can be transformed into an equivalent
model with only single-valued attributes using relationships, which we discuss next.
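Whether a chosen set of attributes really identifies each instance uniquely can be checked mechanically. The helper below is a hypothetical sketch, and the order rows are invented sample data; it shows that order number alone is not an identifier for the order entity described above, while order number and item number together are.

```python
def find_duplicate_identifiers(rows, identifier):
    """Return identifier values that occur in more than one row."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[name] for name in identifier)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates

# Hypothetical instances of the order entity from the text.
order_lines = [
    {"order_number": "O1", "item_number": "I1", "order_quantity": 5},
    {"order_number": "O1", "item_number": "I2", "order_quantity": 2},
    {"order_number": "O2", "item_number": "I1", "order_quantity": 1},
]

# Order number alone repeats, so it cannot identify an instance.
print(find_duplicate_identifiers(order_lines, ["order_number"]))
# The combination of order number and item number is unique.
print(find_duplicate_identifiers(order_lines, ["order_number", "item_number"]))
```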
Relationship: A relationship models an association between instances of entities. The number of entities
involved may be one, two, three, or more. Relationships between two entities are commonplace. For
example, suppose we have employee and department as entities in an E-R-A diagram. We want to store
information about employees that work in a department (alternatively, the department that an employee
works in). This information is modeled in an E-R-A diagram using a relationship between the employee
and department entities. This is an example of a binary relationship. A binary relationship connects two
entity classes, or has a degree of two. Though binary relationships are the most frequently used
relationships in an E-R-A diagram, unary or recursive (an entity related to itself or degree 1), ternary
(three entities or degree 3), and relationships having a degree greater than three (n-ary) are possible.
An E-R-A diagram also documents the cardinalities of relationships. Cardinalities model the
nature of associations between entity instances. We will explain cardinalities using examples. Consider
a simple E-R-A model that has three entities: Department, Employee, and Skill, as shown in Figure 9-2.
(i) Several employees work in a department, but an employee works in one department only.
(ii) An employee has several skills and needs at least one skill. In addition, several employees
can have the same or different skills. Furthermore, some skills required in the firm may not be held by any employee.
Each relationship can be modeled as a binary relationship that connects the two entity classes mentioned
in it. Now consider the relationship between department and employees. For a particular department
(department instance), there can be many employees, and there has to be at least one employee. In
other words, a department instance is related to many instances of employees and it has to be related to
at least one instance of employee. We say that the minimum and maximum cardinalities of this
relationship on the employee side are one and many, respectively. We show the minimum and maximum
cardinalities using crow’s foot notation; the various forms of cardinality possible in a
relationship are shown in Figure 9-1. Similarly, since each employee instance is
related to a maximum of one department instance and must be related to a
minimum of one department instance, the maximum and minimum cardinalities on the department side are both
one.
In the relationship between employee and skill, since an employee instance can be related to
many skill instances and a skill instance can be related to many employee instances, the maximum
cardinality is many on both sides of the relationship. Since an employee needs at least one skill, the
minimum cardinality on the skill side is one. A skill may not be held by any employee, and
therefore the minimum cardinality on the employee side is zero. We show the minimum and maximum
cardinalities using crow’s foot notation as shown in Figure 9-2. Each entity has at least one
attribute associated with it, and at least one identifier that acts as a primary key to identify an instance
uniquely.
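Minimum and maximum cardinalities can be read as rules that a set of related instances must satisfy. The following sketch checks the Department-Employee rules stated above against invented sample data; the function cardinality_violations and the instance codes are hypothetical.

```python
from collections import Counter

def cardinality_violations(pairs, instances, minimum, maximum=None):
    """Return the instances whose number of related partners breaks the rule.

    pairs holds (instance, partner) tuples; maximum=None means "many".
    """
    counts = Counter(left for left, _right in set(pairs))
    return [
        inst for inst in instances
        if counts[inst] < minimum or (maximum is not None and counts[inst] > maximum)
    ]

# Hypothetical instances: (employee, department) pairs for "works in".
works_in = [("E1", "D10"), ("E2", "D10"), ("E3", "D20")]
employees = ["E1", "E2", "E3"]
departments = ["D10", "D20", "D30"]

# Each employee must be related to exactly one department (min 1, max 1).
employee_side = cardinality_violations(works_in, employees, minimum=1, maximum=1)

# Each department must be related to at least one employee (min 1, max many).
department_side = cardinality_violations(
    [(dept, emp) for emp, dept in works_in], departments, minimum=1
)

print(employee_side)    # no employee breaks the rule
print(department_side)  # D30 has no employees, so it breaks the minimum of one
```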
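Maximum/Minimum Cardinality    Minimum/Maximum Cardinality
on Entity Side A               on Entity Side B
One/One                        One/One
One/One                        Zero/One
One/One                        Zero/Many
One/One                        One/Many
One/Zero                       One/Many
[Figure 9-1: forms of cardinality possible in a relationship between entities A and B; the crow’s foot representations are not reproduced here.]
[Figure 9-2: E-R-A diagram relating the entities Department, Employee, and Skill.]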
When two or more entities are involved in a many-to-many relationship, the relationship itself
can have attributes. Suppose each employee has certain weeks of experience in
each of the skills s/he has. To identify an employee’s experience in a skill, we need to know both the
employee and the skill, so the attribute experience is associated with the relationship. In modified
E-R-A diagrams, a many-to-many relationship creates a new type of entity called an associative entity. Each
associative entity has its own identifiers and attributes, such as experience in the above example.
The identifiers are the identifiers of the entities that participate in the many-to-many relationship.
The associative entity Employee-Skill is represented as shown in the modified E-R-A diagram in Figure 9-3. It
shows that an employee may have experience in many skills but must have at least one skill. Similarly, a skill
may be held by many employees, and some skills may not be held by any employee.
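A minimal sketch of the Employee-Skill associative entity as a relational table follows, using Python's built-in sqlite3 module. The column names, the unit of experience_weeks, and all sample rows are assumptions; the point is that the identifier of the associative entity combines the identifiers of Employee and Skill, and that experience belongs to the pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE employee (employee_no TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skill (skill_code TEXT PRIMARY KEY, skill_name TEXT NOT NULL);
    CREATE TABLE employee_skill (
        employee_no      TEXT NOT NULL REFERENCES employee(employee_no),
        skill_code       TEXT NOT NULL REFERENCES skill(skill_code),
        experience_weeks INTEGER NOT NULL,   -- attribute of the relationship
        PRIMARY KEY (employee_no, skill_code)
    );
    INSERT INTO employee VALUES ('E1', 'John'), ('E2', 'Mary');
    INSERT INTO skill VALUES ('S1', 'SQL'), ('S2', 'Java'), ('S3', 'COBOL');
    INSERT INTO employee_skill VALUES ('E1', 'S1', 52), ('E1', 'S2', 10), ('E2', 'S1', 30);
""")

def experience(employee_no, skill_code):
    """Look up experience; both identifiers are needed to find the row."""
    row = conn.execute(
        "SELECT experience_weeks FROM employee_skill "
        "WHERE employee_no = ? AND skill_code = ?",
        (employee_no, skill_code),
    ).fetchone()
    return row[0] if row else None

# Skills that no employee holds (minimum cardinality of zero on the employee side).
unused_skills = [code for (code,) in conn.execute("""
    SELECT s.skill_code
    FROM skill AS s
    LEFT JOIN employee_skill AS es ON es.skill_code = s.skill_code
    WHERE es.employee_no IS NULL
    ORDER BY s.skill_code
""")]

print(experience("E1", "S1"))
print(unused_skills)
```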
Figure 9-3. Modified E-R-A Diagram for the Example in Figure 9-2.
Cardinalities are shown for every relationship including recursive, ternary, and others. For
instance, consider the recursive relationship shown in Figure 9-4. In a university system, a course may
have one or more courses as its pre-requisites. The relationship shows the pre-requisite association that
exists among instances of course entity. Suppose that a course may have many pre-requisites, but not
all courses have pre-requisites. In addition, a course may serve as a pre-requisite for many other courses
and a course may not serve as a pre-requisite for any course. The maximum and minimum cardinalities
are many and zero, respectively, on both sides of the relationship. Since this recursive relationship is
many-to-many, it would create an associative entity and have attributes such as course number and its
pre-requisite, which is also a course number. The modified E-R-A diagram is shown in Figure 9-5.
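The pre-requisite associative entity can be sketched as a set of (course number, pre-requisite course number) pairs, where both values identify instances of the same Course entity. The course numbers other than MIS471 and the helper function names are hypothetical.

```python
# Hypothetical instances of the Course entity.
COURSES = {"MIS101", "MIS201", "MIS301", "MIS471"}

# Associative entity: (course number, pre-requisite course number).
PREREQUISITE = {
    ("MIS471", "MIS301"),
    ("MIS471", "MIS201"),
    ("MIS301", "MIS201"),
}

def prerequisites_of(course_no):
    """Courses that must be taken before course_no (zero to many)."""
    return sorted(pre for course, pre in PREREQUISITE if course == course_no)

def is_prerequisite_for(course_no):
    """Courses that list course_no as a pre-requisite (zero to many)."""
    return sorted(course for course, pre in PREREQUISITE if pre == course_no)

print(prerequisites_of("MIS471"))     # two pre-requisites
print(is_prerequisite_for("MIS201"))  # serves two other courses
print(prerequisites_of("MIS101"))     # none: minimum cardinality is zero
print(is_prerequisite_for("MIS101"))  # none on the other side as well
```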
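[Figures 9-4 and 9-5: the recursive pre-requisite relationship on the entity Course, and its modified form with the associative entity Pre-Requisite.]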
In a ternary relationship, three entities are involved. Such a relationship is needed when certain data can be
described only by an association of three entities. Figure 9-6 shows a ternary
relationship representing a shipping activity that transports products from factories to warehouses. In this
model, recording which product goes from which factory to which warehouse requires an association of the three
entities. Since zero to many products may be sent from zero to many factories to zero to many
warehouses, the minimum and maximum cardinalities are zero and many, respectively.
[Figure 9-6: ternary relationship Shipping Activity among the entities Factory, Product, and Warehouse.]
The many-to-many relationships in this example would have an associative entity and have
factory number, warehouse number, product number, and the number of units of the product shipped as
its attributes. To find the number of units shipped, a user would need to know factory number, warehouse
number, and the product number. In other words, these three attributes form the identifier for this
relationship. The modified E-R-A diagram for the ternary relationship example is shown in Figure 9-7.
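As a small illustration of the three-part identifier, the sketch below keys each shipment by (factory number, warehouse number, product number) and stores the number of units shipped as the attribute. All codes and quantities are invented sample data.

```python
# Hypothetical instances of the Factory-Warehouse-Product associative entity:
# (factory number, warehouse number, product number) -> units shipped.
shipments = {
    ("F1", "W1", "P1"): 100,
    ("F1", "W2", "P1"): 40,
    ("F2", "W1", "P2"): 75,
}

def units_shipped(factory_no, warehouse_no, product_no):
    """All three identifiers are needed to find the number of units shipped."""
    return shipments.get((factory_no, warehouse_no, product_no), 0)

def total_units_of_product(product_no):
    """Units of one product shipped from any factory to any warehouse."""
    return sum(
        units for (_factory, _warehouse, product), units in shipments.items()
        if product == product_no
    )

print(units_shipped("F1", "W2", "P1"))
print(total_units_of_product("P1"))
```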
Figure 9-7. Modified E-R-A Diagram for the Ternary Relationship Example.
In addition to the relationships discussed above, there are some special relationships such as weak
entities and super/sub entities.
Weak Entity: Weak entities are also known as dependent entities, as an instance in a weak entity is
dependent on an instance in another entity. In the absence of the instance in the latter entity, an instance
in the weak entity cannot exist. Let us assume a video store identifies each video by a video number,
title, year, and rating. Each video has multiple copies that are rented to various customers. Since a
customer can be related to a video copy and not to the original video, information about each copy would
have to be in a separate entity. Such an entity would have a video number, copy number of the video,
rental date, due date, and the customer who rented the video. In order to identify data about a video copy, the
user would need to know the video number and its copy number. A weak entity is shown in Figure 9-8.
An associative entity can be viewed as a weak entity dependent on two or more entities. Thus,
the associative entity, Employee-Skill in Figure 9-3 is dependent on the two entities Employee and Skill.
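The existence dependency of the weak entity Video Copy can be sketched with Python's built-in sqlite3 module. Column names, sample values, and the choice of ON DELETE CASCADE are assumptions made for the illustration; other designs might instead refuse to delete a video that still has copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE video (
        video_no INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        year     INTEGER,
        rating   TEXT
    );
    CREATE TABLE video_copy (
        video_no    INTEGER NOT NULL REFERENCES video(video_no) ON DELETE CASCADE,
        copy_no     INTEGER NOT NULL,
        rental_date TEXT,
        due_date    TEXT,
        customer_no TEXT,
        PRIMARY KEY (video_no, copy_no)   -- a copy is identified only together with its video
    );
    INSERT INTO video VALUES (1, 'Example Title', 1999, 'PG');
""")

def add_copy(video_no, copy_no):
    conn.execute(
        "INSERT INTO video_copy (video_no, copy_no) VALUES (?, ?)",
        (video_no, copy_no),
    )

add_copy(1, 1)
add_copy(1, 2)

# A copy cannot exist without its video.
try:
    add_copy(99, 1)
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Removing the video removes its dependent copies.
conn.execute("DELETE FROM video WHERE video_no = 1")
remaining_copies = conn.execute("SELECT COUNT(*) FROM video_copy").fetchone()[0]

print(orphan_allowed, remaining_copies)
```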
Super/Sub Entities: Normally, we will need to treat all instances of an entity in the same way. All
instances of an entity will have the same set of attributes and participate in same relationships. However,
sometimes there may be a need to model data that apply to only a sub set of instances of an entity. We
use sub entities to model such scenarios. For instance, consider ‘student’ entity. Suppose we want to
model data that are specific only to undergraduate students and data specific only to graduate students.
For instance, for every graduate student, we may want to store the college where the student obtained
the undergraduate degree. We can model this by creating two sub entities, namely ‘undergraduate’ and
‘graduate’, as shown in Figure 9-9. Each of these entities represents a subset of the ‘student’ entity.
We call ‘student’ the super entity for these sub entities. Often, we also refer to a super entity and a sub
entity as parent and child, respectively. The parent-child relationship between a super entity and a sub
entity is also known as the ‘is-a’ relationship. That is, an instance of a sub entity is (also) an instance of the
super entity. Theoretically, a sub entity could have its own sub entities, thus forming a hierarchy of ‘is-a’
relationships.
One of the important properties of a parent-child relationship is inheritance. A sub entity inherits
all attributes and relationships of its parent. However, a parent does not inherit attributes and
relationships of any of its sub entities. In Figure 9-9, while every undergraduate and graduate student
also has a ‘name’ and ‘number’, only graduate students have the attribute ‘college’.
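Inheritance between a super entity and its sub entities behaves much like class inheritance in a programming language. The sketch below uses Python dataclasses as an analogy only; the field names and the sample instance are assumptions, and a relational implementation would use tables rather than classes.

```python
from dataclasses import dataclass, fields

@dataclass
class Student:                  # super entity
    student_no: str
    name: str

@dataclass
class Undergraduate(Student):   # sub entity: inherits every Student attribute
    pass

@dataclass
class Graduate(Student):        # sub entity: inherits and adds 'college'
    college: str

def attribute_names(entity_class):
    return [f.name for f in fields(entity_class)]

grad = Graduate(student_no="S100", name="Ann", college="Example College")

print(attribute_names(Student))   # the parent does not gain 'college'
print(attribute_names(Graduate))  # the child has the inherited attributes plus its own
print(isinstance(grad, Student))  # the 'is-a' relationship
```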
[Figure 9-9: super entity Student (attributes Student No., Student Name) with sub entities Undergraduate and Graduate (attribute College).]
Primary Key: Each basic entity in an E-R-A model has one or more attributes that serve as its primary key and identify each instance.
An associative entity has a primary key made up of more than one attribute: the primary keys of the parent entities
together become the primary key of the associative entity. Since the basic entity Employee has Employee No.
as its primary key and the basic entity Skill has Skill Code as its primary key, the associative entity
Employee-Skill, which depends on these two entities, has the primary key attributes Employee No. and Skill
Code. A weak
entity also has a primary key made up of more than one attribute. One of them must be the primary
key of the entity on which the weak entity depends. Thus, Video Number is part of the primary key of the
weak entity Video Copy. Since Video Number alone cannot identify a copy uniquely, Video Copy needs Copy Number in addition to
Video Number in its primary key. In the super/sub entity case, the primary key(s) of the super entity
migrate as the primary key(s) of each sub entity. Thus, Student No., the primary key of the super entity Student,
also becomes the primary key of the two sub entities, Graduate and Undergraduate.
Primary key(s) in an E-R-A diagram can be shown by underscoring the attributes selected as primary key(s).
Foreign Key: We discussed earlier that a common field in two or more related entities would help to
relate the entities. A customer can have many orders, and therefore, the entity, Customer would have
one-to-many relationship with the entity, Order. If Customer No. is the primary key of
the Customer entity, then to relate the various orders of a customer to that customer, the Order entity would have
Customer No. as an attribute. In relational tables, this attribute would be called a foreign key. A foreign
key also helps to navigate a database in the reverse direction. For example, an order record can be
processed for its foreign key to find its parent record. In all one-to-many relationships, the primary key(s)
of the parent entity migrate as foreign key(s) in the child entities. A weak entity can have multiple
instances for each instance in its parent entity. The one-to-many relationship in weak entity also requires
a foreign key as above. An associative entity can be viewed as having multiple one-to-many
relationships. In Figure 9-3, a parent instance in Employee entity has one or more instances of
Employee-Skill entity. Therefore, Employee No., the primary key of Employee migrates as a foreign key
to Employee-Skill. Similarly, a parent instance in Skill entity has one or more instances of Employee-Skill
entity. Therefore, Skill Code, the primary key of Skill, migrates as a foreign key to Employee-Skill.
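Navigation in both directions through a foreign key can be sketched with two small lookups. The customer and order records below are invented, and plain dictionaries stand in for the Customer and Order tables.

```python
# Hypothetical Customer instances, keyed by the primary key Customer No.
customers = {
    "C1": {"name": "Acme"},
    "C2": {"name": "Globex"},
}

# Hypothetical Order instances; customer_no is the foreign key to Customer.
orders = {
    "O1": {"customer_no": "C1", "order_date": "2024-03-01"},
    "O2": {"customer_no": "C1", "order_date": "2024-03-05"},
    "O3": {"customer_no": "C2", "order_date": "2024-03-07"},
}

def orders_of(customer_no):
    """Parent to children: all orders whose foreign key matches the customer."""
    return sorted(
        order_no for order_no, order in orders.items()
        if order["customer_no"] == customer_no
    )

def customer_of(order_no):
    """Child to parent: follow the order's foreign key to its customer."""
    return customers[orders[order_no]["customer_no"]]

print(orders_of("C1"))
print(customer_of("O3")["name"])
```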
Simple rules to draw the E-R-A diagram and identify the relational tables.
Identify other types of entities such as weak (a.k.a. existence or dependent) entities.
Identify super-type and sub-type entities, if any.
Determine relationships between basic entities (relationships may be unary, binary, ternary, etc.).
Primary keys of participating basic entities become the primary keys of associative entities.
Primary key of basic entity also becomes a part of the primary key(s) for the weak entity. A weak
entity with multiple instances for an instance of a parent entity would need another attribute as an
additional primary key. If none is available, introduce a new attribute to identify each instance
uniquely.
Primary key(s) of the parent entity become the foreign key(s) in the child entity in one-to-many
relationships. Primary key(s) of basic entities also become foreign key(s) in weak entities and
associative entities.
We illustrate E-R-A modeling using the following simple case for a fictitious firm called Orangemen
Enterprises. Orangemen Enterprises is a software development company with the following project
1. There are several divisions in the company and each division has a number, name, and a manager.
2. A division can operate several projects but each project belongs to a single division. Some of the
divisions do not operate any projects at all. Each project has a number, name, number of person
3. A division has many employees with at least one employee but each employee is assigned to only
one division. Each employee has a number, name, date of hiring, date of birth, certain IS skills (at
4. An employee may not have all of the skills needed in the company and no employee may have
certain skills required in the company. Each employee should have at least one skill. A skill code
5. An employee may be assigned to many projects or to none at all but each project has at least one
employee assigned to it. For each project assigned to employee, the company keeps track of the
6. An employee may or may not have dependents. Each dependent of an employee has a roll number,
name, and date of birth. A spouse, who is also an employee of the company, should be identified as
such but s/he should not be treated as a dependent of the employee. In addition, a dependent such
The first step in drawing the E-R-A diagram is identifying the entities. These are objects whose
data we want to store in the database. For Orangemen Enterprises, these are division, project,
employee, skill, dependent, and spouse. The data of interest for a division are given in (1) as number,
name, and manager. These are modeled as attributes of division. Similarly, (2), (3), (4), and (6) state the
attributes for the entities project, employee, skill, and dependent, respectively. The case also states
several relationships among entities. For instance, (2) states that division and project are related. The
maximum cardinality is many on the project side and one on the division side. The minimum cardinality is
zero on the project side and one on the division side. By analyzing each sentence given in the case
description, we can identify other relationships and cardinalities. Figure 9-10 shows the entities and the
relationships among them. Since project and employee entities, and employee and skill entities have
many-to-many relationships, the E-R-A model in Figure 9-10 is modified and shown in Figure 9-11. Using
the simple rules described in the previous section, various entities, attributes, primary key(s), and foreign
key(s) are identified.
[Figures 9-10 and 9-11: E-R-A diagram for Orangemen Enterprises and its modified form, with the entities Division, Project, Employee, Dependent, Spouse, and Skill and the associative entities Project-Employee and Employee-Skill. Legend: underline = primary key; italic = foreign key.]
The attributes associated with each entity become a relational table as below:
The database design we get at this stage should be considered a preliminary design. If the
E-R-A diagram is well designed, then the database design derived from the E-R-A diagram will also be well
designed and in 3rd normal form. However, it is better to verify that the tables are well designed by
applying normalization.
Normalization
anomalies in a database design. The normalization procedure consists of several stages called normal
forms. Each time a design is transformed into a higher normal form, the design
improves. Normalization works at the table level: each table is assessed and improved
independently, even though all tables are part of the same database. Converting a table into a higher
normal form typically involves splitting the table into two or more tables. We discuss different normal forms
A table is said to be in 1NF if the table does not have multi-valued attributes (repeating groups).
For example, an order has several items. If order data and item data are stored in the same table, so that
one order row holds several items, then the table is not in 1NF.
For a table to be in 1NF, it is enough that every field in every row of the table holds a single value. For
example, consider the division table in Orangemen Enterprises. Each row in this table has the data for
one division. Since each division has only one code, only one name, and only one manager, each row in
this table will have only one value for each of these three fields. Therefore, the division table is in 1NF.
Suppose instead that a division could have many managers. Then, in this design, rows corresponding to divisions
with many managers would have multiple values for the manager field, and division would not be in
1NF. Note that we created our E-R-A diagram, and subsequently the database design, on the
assumption that a division has only one manager. If this assumption changes, then the E-R-A diagram
changes, and we would not arrive at the current database design in the first place.
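The 1NF condition can be checked mechanically. The following Python sketch is illustrative only: the sample divisions, the manager names, and the helper functions `is_1nf` and `split_managers` are assumptions made for this example and do not come from the Orangemen Enterprises case. It models the "many managers" scenario as a list-valued field and then removes the repeating group by moving managers into a table of their own.

```python
# Hypothetical data: division D1 is allowed two managers, which violates 1NF.
division_not_1nf = [
    {"DivisionCode": "D1", "DivisionName": "Retail", "Manager": ["Ava", "Ben"]},
    {"DivisionCode": "D2", "DivisionName": "Wholesale", "Manager": ["Cy"]},
]


def is_1nf(rows):
    """Return True if no field in any row holds a list, tuple, or set of values."""
    return not any(
        isinstance(value, (list, tuple, set))
        for row in rows
        for value in row.values()
    )


def split_managers(rows):
    """Move the repeating Manager field into its own table, one row per manager."""
    division = [
        {"DivisionCode": r["DivisionCode"], "DivisionName": r["DivisionName"]}
        for r in rows
    ]
    division_manager = [
        {"DivisionCode": r["DivisionCode"], "Manager": m}
        for r in rows
        for m in r["Manager"]
    ]
    return division, division_manager


division, division_manager = split_managers(division_not_1nf)
```

After the split, both resulting tables hold a single value in every field, at the cost of one extra table that repeats DivisionCode.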
A table is said to be in 2NF if the table is in 1NF and all non-key attributes depend on the whole set
Before we discuss the definition for 2NF, we need to define the concept of dependency among
attributes in a table. We say that an attribute, Y, depends on another attribute, X, if only one value of Y
can be associated with a given value of X. In other words, if we know the value of X in a table, say, Vx,
then in all rows that have Vx as the value of X, the value of Y is the same, say Vy. Again, consider the
division table. We can conclude that DivisionName depends on DivisionCode: because DivisionCode is the
key of the table, only one row can have a given value of DivisionCode,
and in that row there is only one value for DivisionName
because the attributes are single-valued in this table. Can we say that DivisionCode depends on
DivisionName? Given a DivisionName, there could be multiple rows in the table with that value for the
DivisionName attribute because there is no restriction that DivisionName be unique for each division. In
each of these rows with the given value of DivisionName, there will be a different DivisionCode because
DivisionCode is unique in each row. Consequently, we find many DivisionCode values for a given DivisionName,
In a table in 2NF, every non-key attribute depends on the full set of keys. That is, no non-key
attribute can depend only on a part of the key. The question of partial dependency arises only when the
key has more than one attribute as in Dependent, EmployeeSkill, and ProjectEmployee. Consider the
table ProjectEmployee. There is only one non-key attribute, viz., NumberOfHours. In order to determine
whether the table is in 2NF, we need to determine whether NumberOfHours depends on both
ProjectNumber and EmployeeNumber. That is, NumberOfHours cannot depend only on ProjectNumber
or only on EmployeeNumber. Given a ProjectNumber, is there only one value for NumberOfHours?
Since a project can have multiple employees working in it, and each employee works a certain number of
hours in the project, each project is associated with multiple NumberOfHours, one for each employee
working in the project. Thus, NumberOfHours does not depend on just the ProjectNumber. A similar
reasoning can be used to verify that NumberOfHours does not depend just on the EmployeeNumber.
However, given a ProjectNumber and an EmployeeNumber, there can be only one value for
A table is in 3NF if the table is in 2NF and does not contain a transitive dependency. In
other words, no non-key attribute depends on another non-key attribute.
A transitive dependency exists in a table if the table contains three attributes, X, Y, and Z, such
that Y is dependent on X and Z is dependent on Y. Consider a table that contains order number, order
date, order value, customer number, customer name, and customer address, in which order number is
the primary key. Customer name and customer address depend on customer number, not directly on order
number. Thus, both customer name and customer address violate the 3NF rule. To resolve this, the
customer name and customer address should be moved to a separate table with customer number as its primary
key. There are higher-order normal forms. However, in most practical applications, it is sufficient if the
database is in 3NF. We discussed the three normal forms individually. The essence of 3NF can be
summarized as follows:
‘A table is in 3NF if every non-key attribute depends on the key, the whole key, and nothing else but the
key.’
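The notion of dependency used in the 2NF and 3NF definitions can be tested against sample rows. The Python sketch below is a minimal illustration, not a normalization tool: the function `depends_on` and all sample rows are assumptions made for this example. It reports whether every value of X (one attribute or a combination) is associated with exactly one value of Y.

```python
def depends_on(rows, x, y):
    """Return True if attribute y depends on attribute(s) x in the given rows.

    x may be a single attribute name or a tuple of names (a composite key).
    """
    x_cols = (x,) if isinstance(x, str) else tuple(x)
    seen = {}
    for row in rows:
        x_value = tuple(row[c] for c in x_cols)
        # setdefault stores the first y seen for this x and returns it afterwards.
        if seen.setdefault(x_value, row[y]) != row[y]:
            return False
    return True


# Hypothetical rows for the order table discussed under 3NF.
orders = [
    {"OrderNo": 1, "CustomerNo": "C1", "CustomerName": "Acme"},
    {"OrderNo": 2, "CustomerNo": "C1", "CustomerName": "Acme"},
    {"OrderNo": 3, "CustomerNo": "C2", "CustomerName": "Bolt"},
]

# Hypothetical rows for the ProjectEmployee table discussed under 2NF.
project_employee = [
    {"ProjectNumber": "P1", "EmployeeNumber": "E1", "NumberOfHours": 10},
    {"ProjectNumber": "P1", "EmployeeNumber": "E2", "NumberOfHours": 25},
    {"ProjectNumber": "P2", "EmployeeNumber": "E1", "NumberOfHours": 40},
]

# NumberOfHours needs the whole key: neither part alone determines it.
hours_on_project_only = depends_on(project_employee, "ProjectNumber", "NumberOfHours")
hours_on_whole_key = depends_on(
    project_employee, ("ProjectNumber", "EmployeeNumber"), "NumberOfHours"
)
# CustomerName depends on the non-key attribute CustomerNo: a transitive dependency.
name_on_customer_no = depends_on(orders, "CustomerNo", "CustomerName")
```

A check of this kind can only refute a dependency. A dependency is a rule about the business, so sample rows that happen to satisfy it do not prove that it always holds.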
Suppose we find that a table we derived is not in 3NF. How do we convert it into a 3NF design? We
illustrate the normalization procedure using the following example. Because the design for Orangemen
Enterprises is already well designed, we use a different example for the normalization procedure.
Consider the following table design for storing purchase order data:
(i) The first step in the normalization process is to identify the dependencies among attributes in the table.
In the above notation for dependency, each of the attributes on the right hand side depends on the
(ii) The next step is to convert each dependency into a table with the left hand side as the key of the table.
(iii) Now we check for transitive dependency in each of the tables. We know that in table Order,
put each dependency that relates only non-key attributes as a separate table and eliminate the right hand
side of the dependency from the original table. In the table Order, the only dependency that has only non-
key attributes is Customer# → CustomerName. If we put this as a separate table, we get the
Customer table, which we already have in our design. Consequently, there is no need for another table
that has Customer# and CustomerName. Then, we eliminate CustomerName from the original Order
We can verify that the other tables are already in 3NF. Thus, the database design in 3NF will be the
following.
Again, the italicized attributes in the database design are the foreign keys.
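The step that removes a transitive dependency can be sketched in runnable form. The following Python script uses the standard-library sqlite3 module. It is an illustration under stated assumptions: the tables OrderFlat and PurchaseOrder, their columns, and the sample rows are hypothetical and are not the purchase order design used in the text. CustomerNo stands in for Customer# because # is not allowed in an unquoted SQL identifier, and PurchaseOrder is used as a table name because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Starting point (hypothetical): CustomerName depends on CustomerNo,
    -- a non-key attribute, so this table is not in 3NF.
    CREATE TABLE OrderFlat (
        OrderNo      INTEGER PRIMARY KEY,
        OrderDate    TEXT,
        CustomerNo   TEXT,
        CustomerName TEXT
    );
    INSERT INTO OrderFlat VALUES
        (1, '2024-01-05', 'C1', 'Acme'),
        (2, '2024-01-09', 'C1', 'Acme'),
        (3, '2024-01-12', 'C2', 'Bolt');

    -- Put the dependency CustomerNo -> CustomerName into its own table.
    CREATE TABLE Customer (
        CustomerNo   TEXT PRIMARY KEY,
        CustomerName TEXT
    );
    INSERT INTO Customer
        SELECT DISTINCT CustomerNo, CustomerName FROM OrderFlat;

    -- Remove CustomerName from the order table; CustomerNo stays as a foreign key.
    CREATE TABLE PurchaseOrder (
        OrderNo    INTEGER PRIMARY KEY,
        OrderDate  TEXT,
        CustomerNo TEXT REFERENCES Customer(CustomerNo)
    );
    INSERT INTO PurchaseOrder
        SELECT OrderNo, OrderDate, CustomerNo FROM OrderFlat;
""")

# Joining the two 3NF tables reproduces the original rows without loss.
rejoined = conn.execute("""
    SELECT o.OrderNo, o.OrderDate, o.CustomerNo, c.CustomerName
    FROM PurchaseOrder AS o
    JOIN Customer AS c ON c.CustomerNo = o.CustomerNo
    ORDER BY o.OrderNo
""").fetchall()
original = conn.execute("SELECT * FROM OrderFlat ORDER BY OrderNo").fetchall()
customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```

In this sketch each customer name is stored once rather than once per order, and the join shows that no information is lost by the split.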
After the logical database design is completed, the next step is to design the physical database.
The primary purpose of physical database design is to improve the database performance. The ultimate
performance measure of a database system is the speed and accuracy with which queries and updates
are performed by the system. Though the database performance can be accurately measured only after
it is implemented, a number of decisions can be made in the design stage from the performance
perspective. Tuning a database by changing the physical database design is an activity that continues
after a system is implemented. We discuss some of the physical database design techniques below.
Indexing Tables
The logical database design specifies what data should be stored in each table. It does not
specify the implementation details, or how the records are stored within a table. Indexing decisions
specify some of these details. Indexing a table based on a field improves the access speed when the
table is searched using that field. It is similar to the index in a textbook. To a new reader, who does not
know the contents of the book, the fastest way to locate a topic is to use the index to find the page
number, and then read each line on the page sequentially. Without an index, the reader would have to
start on page one and read sequentially until the topic is found. Indexing reduces the search space to
one page. When accessing a row in a table using its primary key, searching from the first row would be
time consuming. A table can be split (logically) into several pages and the page containing the
The important indexing decision to be made is which tables should be indexed and which field or
combination of fields should be used to index the table. In determining which indexes to create, we begin
with the list of operations, both queries and update operations, to be done using the database. In a
typical database system, there will be a large number of data access and manipulation operations. Since
optimizing the performance of the database system for all database operations is impossible, we prioritize
the operations in the order of importance, and make indexing decisions starting from the most important
1. We index every table on its primary key field as the key is often used to retrieve records from a table.
2. We index a table on foreign keys, if any, as the foreign keys are used to join the table with related
tables.
3. We index a table on attributes that are used in query operations. For instance, if product data are
frequently accessed using product name, the product table is indexed on ProductName attribute. We
The input-process-output (IPO) tables discussed in previous chapters provide valuable information for indexing.
Since each process in an IPO table describes how each table is accessed and describes its processing
mode, it can be used to identify the indexes. The frequency of each online or real-time process identified
from work measurement data indicates tables that need to be accessed more frequently.
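The effect of an index on a frequently used query attribute can be observed directly. The sketch below uses Python's standard-library sqlite3 module; the Product table, its columns, the generated rows, and the index name `idx_product_name` are assumptions made for this illustration. SQLite's `EXPLAIN QUERY PLAN` statement describes how a query would be executed, so comparing the plan before and after `CREATE INDEX` shows whether the index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        ProductNo   INTEGER PRIMARY KEY,
        ProductName TEXT,
        UnitPrice   REAL
    )
""")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [(n, f"Product {n}", 1.0 + n) for n in range(1, 1001)],
)

query = "SELECT * FROM Product WHERE ProductName = ?"


def plan(sql):
    """Return SQLite's query-plan description for a one-parameter query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("Product 500",)).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


plan_before = plan(query)   # no index on ProductName yet: the table is scanned
conn.execute("CREATE INDEX idx_product_name ON Product (ProductName)")
plan_after = plan(query)    # the plan now refers to idx_product_name
```

The exact wording of the plan text differs between SQLite versions, so the sketch only relies on the index name appearing in the plan once the index exists.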
Denormalization
Denormalization is the reverse of normalization process. While normalization splits a table into
two or more tables, denormalization combines two or more tables. Normalization improves data integrity,
but it also increases data access time. Thus, when access time is critical, we may sacrifice
some data integrity in order to improve data access time. Consider, for example, a customer table that contains
customer number, name, address, city, state, and zip code. This table is not in 3NF because both city
and state depend on zip code. Normalization up to 3NF would lead to two tables: one containing
customer number, name, address, and zip code, and the other containing zip code, city, and state. A firm
having a small number of customers throughout the country might want to denormalize these two tables
because the two tables have to be joined frequently to generate addresses. In addition, city, state, and zip
code data do not change often. A downside of this denormalization is that a city name might not be
spelled the same way in all records, and data entry may involve additional key strokes. Additional
Good candidate tables to combine are those that are frequently joined to answer queries. For
instance, in a logical database design for order, there is a frequent need to generate a sales report that
contains for each product, the product name, the unit price, and quantity. This query requires the use of
Product and OrderProduct tables. Since using two tables in the same query often slows down the
database, we may choose to combine the two tables into the following one table.
Note that NewOrderProduct table is not in 3NF because ProductName depends only on Product# and
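The trade-off can be made concrete with a small sketch in Python using the standard-library sqlite3 module. The column names and sample rows below are assumptions made for illustration, including the layout chosen here for NewOrderProduct. The sketch builds the two normalized tables, derives a combined table from them, and produces the same sales report both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (
        ProductNo   INTEGER PRIMARY KEY,
        ProductName TEXT,
        UnitPrice   REAL
    );
    CREATE TABLE OrderProduct (
        OrderNo   INTEGER,
        ProductNo INTEGER REFERENCES Product(ProductNo),
        Quantity  INTEGER,
        PRIMARY KEY (OrderNo, ProductNo)
    );
    INSERT INTO Product VALUES (1, 'Widget', 2.50), (2, 'Gadget', 10.00);
    INSERT INTO OrderProduct VALUES (100, 1, 4), (100, 2, 1), (101, 1, 6);

    -- Denormalized copy: product name and price are repeated on every order line.
    CREATE TABLE NewOrderProduct AS
        SELECT op.OrderNo, op.ProductNo, p.ProductName, p.UnitPrice, op.Quantity
        FROM OrderProduct AS op
        JOIN Product AS p ON p.ProductNo = op.ProductNo;
""")

# Sales report from the normalized design: requires a join.
joined_report = conn.execute("""
    SELECT p.ProductName, p.UnitPrice, SUM(op.Quantity)
    FROM OrderProduct AS op
    JOIN Product AS p ON p.ProductNo = op.ProductNo
    GROUP BY p.ProductNo
    ORDER BY p.ProductNo
""").fetchall()

# Same report from the denormalized table: a single-table query.
denormalized_report = conn.execute("""
    SELECT ProductName, UnitPrice, SUM(Quantity)
    FROM NewOrderProduct
    GROUP BY ProductNo
    ORDER BY ProductNo
""").fetchall()
```

The single-table query avoids the join, but the name and price of a product are now stored once per order line, so a change to either must be applied to every such row.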
Partitioning
Partitioning splits a table for implementation and storage purposes. Unlike normalization, which
splits a table at the logical level, partitioning splits a table solely at the implementation (physical)
level. Typically, large tables are partitioned so that smaller tables can be accessed to answer queries,
which will improve response time. Partitioning is also needed when a table has to be geographically
distributed on many servers. Records needed frequently for local use would be stored on a table in the
local server.
There are two ways of partitioning a table: horizontal and vertical. In horizontal partitioning, all
the partitions have identical attributes but each partition contains only a portion of the original number of
records. Horizontal partitioning reduces the table size and enables a DBMS to handle the table more
efficiently for processing. Accesses, updates, and joins with other tables can be faster with smaller
tables. Sometimes, specific DBMSs such as Access cannot handle large tables and have to be
In vertical partitioning, all the partitions have the same number of records but each partition
contains only a subset of the original attributes. In vertical partitioning, the primary key has to be
replicated in each partition. Vertical partitions are commonly employed in distributed databases.
Replicating a table in several remote locations requires elaborate processes to keep the copies
synchronized and consistent. If the data items needed locally by each distributed site are disjoint, the
table can be split according to local needs. This reduces, and often eliminates, the need to keep the tables
synchronized.
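The difference between the two kinds of partitioning can be shown on a few rows. The Python sketch below is illustrative only: the customer rows, the Region column used as the partitioning criterion, and the helper functions are assumptions made for this example. Horizontal partitioning keeps all attributes and divides the rows; vertical partitioning keeps all rows and divides the attributes, repeating the primary key in each partition.

```python
# Hypothetical customer table held as a list of rows.
customers = [
    {"CustomerNo": "C1", "Name": "Acme", "Region": "East", "CreditLimit": 5000},
    {"CustomerNo": "C2", "Name": "Bolt", "Region": "West", "CreditLimit": 2000},
    {"CustomerNo": "C3", "Name": "Cole", "Region": "East", "CreditLimit": 7500},
]


def horizontal_partition(rows, column):
    """Group whole rows by the value of one column: same attributes, fewer rows."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[column], []).append(row)
    return partitions


def vertical_partition(rows, key, column_groups):
    """Split attributes into groups: same rows, fewer attributes, key in each."""
    return [
        [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]


by_region = horizontal_partition(customers, "Region")
contact_part, credit_part = vertical_partition(
    customers, "CustomerNo", [["Name", "Region"], ["CreditLimit"]]
)
```

Because CustomerNo appears in both vertical partitions, the original rows can be rebuilt by matching on it when a query needs attributes from both.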
Physical database design is an art and determining the optimal design requires a lot of
experimentation. However, modern database systems offer advanced tools such as index tuning wizards
and index advisors to help database designers and administrators in the physical design process.
Databases have the advantage of program-data independence. Changing the data and data
structure should have very little impact on programs. Databases can be accessed and manipulated from a
variety of programming languages. Programs written in languages such as Visual Basic and C++ can
connect to a database to store and retrieve data. In addition, DBMSs provide an easy-to-learn,
high-level language called Structured Query Language (SQL) to input, output, and
manipulate data.
All modern relational database systems, such as Microsoft SQL Server, Oracle, and
Microsoft Access, use SQL to manipulate a database. Like any other programming language, SQL has a
number of commands, but SQL commands are structured like English and are readable. For example, the
credit check process in an order processing system has to access the following customer table to obtain
The process can use the following SQL statement to obtain the “AvailableCredit”:
SELECT AvailableCredit
FROM Customer
WHERE CustomerNo = '374808';
The above SQL statement is equivalent to the statement in natural English “get the available
credit information from the customer table, where customer number is 374808.” Note that it needs just
three lines of simple code to retrieve data from the database. A program written in a third-generation
language would need many more statements to accomplish the same task. In addition to retrieving data as
shown above, SQL statements can be used to add data to a table using INSERT command, delete data
with DELETE command, create new tables with CREATE command, remove tables with DROP
Stored Procedure
The SQL program as written above and executed would be compiled each time before execution.
To save compilation time, speed up execution, and enable sharing of a SQL program, it can be stored in
the database server in its compiled state. Such compiled and stored SQL programs are known as Stored
Procedures. The SQL program example discussed in the previous section can obtain the available
credit for the customer with a specific customer number of 374808. The same program as a stored
procedure can be used to obtain the available credit for any customer by what is known as
parameterization. Any customer number can be passed to this stored procedure as a parameter, and the
stored procedure then returns the available credit for that customer. Business rules can be
standardized and used within the same system and in other systems. Stored procedures can provide
modularity because these procedures can be called from any program module. A stored procedure can
call another stored procedure in a nested operation. Since these are stored in the server, they save valuable
network bandwidth. Stored procedures can be scheduled to be executed as batch programs without
human intervention.
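Parameterization can be illustrated with a short Python sketch. SQLite, which ships with Python as the sqlite3 module, does not support stored procedures, so the sketch shows only the parameter-passing idea with a reusable function around a parameterized query; it is not a stored procedure and is not compiled or stored on a database server. The function name `get_available_credit`, the second customer, and the credit amounts are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerNo TEXT PRIMARY KEY, AvailableCredit REAL)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("374808", 1500.0), ("374809", 0.0)],   # hypothetical credit amounts
)


def get_available_credit(connection, customer_no):
    """Return AvailableCredit for one customer, or None if the customer is unknown."""
    row = connection.execute(
        # The ? placeholder is filled with the customer number at run time.
        "SELECT AvailableCredit FROM Customer WHERE CustomerNo = ?",
        (customer_no,),
    ).fetchone()
    return row[0] if row else None


credit = get_available_credit(conn, "374808")
```

The same query text now serves every customer; only the parameter changes. In a DBMS that supports stored procedures, the equivalent logic would be defined and kept on the server itself.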
Trigger
A trigger is a stored procedure that automatically executes when a certain event takes place. In
DBMS terminology, a trigger is fired. Triggers add to the flexibility offered by stored procedures. In using
a trigger, a programmer needs to identify when a trigger should be fired. A trigger can be associated with
commands (events) such as INSERT, DELETE, and UPDATE. In such cases, it can be used AFTER the
Let us discuss a simple example that can use a trigger. In an inventory control system, a
procurement order has to be placed each time the quantity on hand in the inventory table is less than or
equal to the reorder level. Since the quantity on hand is reduced each time the inventory table is
updated for an issue of materials to production, a necessary task is to check whether the quantity on hand
is less than or equal to the reorder level. For each UPDATE of QUANTITYONHAND in INVENTORY, a trigger called
Although the update for an issue, the comparison of quantity on hand with the reorder level (ROL), and the
placement of a procurement order can be combined into a single program, the activities would then be
tightly coupled (discussed in Chapter 12 on Program Design). By having a separate program for each of the
above, the programs become
modular. If the system needs to compare quantity on hand with ROL in any other place, the second program
can be reused. If a procurement order has to be placed for any other reason, the last program can be
reused.
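The inventory example can be sketched with a real trigger, since SQLite supports `CREATE TRIGGER`. The Python script below uses the standard-library sqlite3 module. The table layouts, the item A100 and its quantities, the trigger name `CheckReorderLevel`, and the helper functions are all assumptions made for illustration. For simplicity, the comparison with ROL and the placement of the procurement order are folded into the trigger itself rather than kept as separate programs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (
        ItemNo         TEXT PRIMARY KEY,
        QuantityOnHand INTEGER,
        ReorderLevel   INTEGER
    );
    CREATE TABLE ProcurementOrder (
        OrderId INTEGER PRIMARY KEY AUTOINCREMENT,
        ItemNo  TEXT REFERENCES Inventory(ItemNo)
    );

    -- Fires after every update of QuantityOnHand; the WHEN clause holds the
    -- comparison of quantity on hand with the reorder level.
    CREATE TRIGGER CheckReorderLevel
    AFTER UPDATE OF QuantityOnHand ON Inventory
    WHEN NEW.QuantityOnHand <= NEW.ReorderLevel
    BEGIN
        INSERT INTO ProcurementOrder (ItemNo) VALUES (NEW.ItemNo);
    END;

    INSERT INTO Inventory VALUES ('A100', 50, 20);
""")


def issue_material(item_no, quantity):
    """Reduce quantity on hand for an issue of materials to production."""
    conn.execute(
        "UPDATE Inventory SET QuantityOnHand = QuantityOnHand - ? WHERE ItemNo = ?",
        (quantity, item_no),
    )


def order_count():
    """Return the number of procurement orders placed so far."""
    return conn.execute("SELECT COUNT(*) FROM ProcurementOrder").fetchone()[0]


issue_material("A100", 10)            # 40 on hand: above the reorder level
orders_after_first_issue = order_count()
issue_material("A100", 25)            # 15 on hand: at or below the reorder level
orders_after_second_issue = order_count()
```

As written, the trigger would place another order on every later issue while the quantity stays at or below the reorder level; a fuller design would first check for an order that is already open.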
Orangemen Enterprises and other examples used so far are examples of cases that have already
been structured sufficiently for us to draw an E-R-A model from the description. However, in reality, the
information required for drawing an E-R-A model is rarely as well structured as Orangemen Enterprises.
Fortunately, the DFD, its supplements, and the data dictionary for the proposed system contain the
information we require to draw the E-R-A diagram and to design the database. We illustrate this
procedure using the Accounts Payable system discussed in the previous chapter.
In Chapter 8, we derived the unique set of data items for the proposed AP system. We first
identify the data items that will be stored in the database. Table 9-1 shows the data items that will be
stored and reasons for choosing or not choosing to store an item in the database.
Table 9-1 Data Items That Will Be Stored in the Database in the AP System
We can identify five entities in the proposed system. They are vendor, invoice, item, check, and
checkbook register. The vendor entity represents the collection of vendors from whom OG has bought
various items. The invoice entity represents the collection of invoices received from various vendors. The item
entity contains all items bought by OG. Check is the set of all checks written by OG to pay its vendors.
The checkbook register keeps a list of all checks, their date and time, and the corresponding balances. One
way to identify relationships, other than knowledge about the problem domain, is the structure of data
Vendor Ledger: vendor # + vendor name + vendor address + {invoice # + invoice date + Amount Owed +
The above data structure contains data elements associated with three entities: vendor, invoice,
and check. Vendor #, vendor name, and vendor address are attributes of vendor. Invoice #, Invoice
date, and Amount owed are attributes of invoice. Check # and payment amount are attributes of check.
The data structure suggests that these entities are related. Further, it also shows that a vendor can have
multiple invoices (note the repeating group for invoice data for each vendor) and that an invoice is related
to only one check (note the absence of repeating group for check data for an invoice). By using similar
analysis, we can identify other relationships and maximum cardinalities. The data structure does not
indicate the minimum cardinalities; but we should determine those using domain knowledge. The E-R-A
diagram for the proposed AP system is shown in Figure 9-12. Since a credit entry (such as deposit) in
the checkbook register would not have a corresponding check, the cardinalities are as shown in the
model.
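[Figure 9-12. E-R-A diagram for the proposed AP system. Entity labels that survive from the figure: Vendor, Check, Checkbook Register, Branch Ledger, Invoice, and Item.]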
The many-to-many relationship between the Item and the Invoice would spawn an associative
entity in the modified E-R-A diagram for the system. Various data items associated with each entity, their
primary key(s), and foreign key(s) are shown in Figure 9-13. A new data item, Item # has been added to
identify the entity Item because item names may have multiple spellings and thus cause duplicate rows
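[Figure 9-13. Modified E-R-A diagram for the proposed AP system, showing data items, primary key(s), and foreign key(s). Labels that survive from the figure: Vendor, Check, Checkbook Register, Item, Invoice-Item, Invoice#, SP Name, SP Sign, Extension, and Quantity.]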
If the modified E-R-A diagram has been drawn correctly, the tables associated with the respective entities
should be in 3NF.
Invoice (invoice #, invoice date, payment due date, delivered to store, sales tax, shipping & handling
charges, amount owed, invoice status, vendor#, SalespersonName)
The table InvoiceItem appears to violate 2NF because one might expect Unit Price to depend on
only Item Number. However, recall our discussion that UnitPrice may vary, and therefore it needs to be
identified with each Invoice#. All the tables are in 3NF. The next step is the physical design of the
We do not denormalize the database design because the number of tables in the database is fairly
small, and none of the queries will require joining a large number of tables.
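The point about UnitPrice in InvoiceItem can be illustrated with a small schema sketch in Python using the standard-library sqlite3 module. Only a subset of the attributes is shown, and the column names, data types, and sample rows are assumptions made for illustration rather than the chapter's final design. The composite primary key (InvoiceNo, ItemNo) lets the same item carry a different unit price on different invoices while still preventing the same item from being entered twice on one invoice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Invoice (
        InvoiceNo   TEXT PRIMARY KEY,
        InvoiceDate TEXT,
        AmountOwed  REAL
        -- remaining Invoice attributes omitted for brevity
    );
    CREATE TABLE Item (
        ItemNo   TEXT PRIMARY KEY,
        ItemName TEXT
    );
    CREATE TABLE InvoiceItem (
        InvoiceNo TEXT REFERENCES Invoice(InvoiceNo),
        ItemNo    TEXT REFERENCES Item(ItemNo),
        Quantity  INTEGER,
        UnitPrice REAL,            -- may differ from invoice to invoice
        PRIMARY KEY (InvoiceNo, ItemNo)
    );
    INSERT INTO Invoice VALUES ('INV1', '2024-03-01', 5.00), ('INV2', '2024-03-15', 5.50);
    INSERT INTO Item VALUES ('I1', 'Bolt');
    INSERT INTO InvoiceItem VALUES ('INV1', 'I1', 10, 0.50), ('INV2', 'I1', 10, 0.55);
""")

# The same item appears at two different unit prices, one per invoice.
prices = [
    row[0]
    for row in conn.execute(
        "SELECT UnitPrice FROM InvoiceItem WHERE ItemNo = 'I1' ORDER BY InvoiceNo"
    )
]

# A second row for the same invoice and item violates the composite key.
try:
    conn.execute("INSERT INTO InvoiceItem VALUES ('INV1', 'I1', 5, 0.60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because the price is recorded per invoice line, UnitPrice depends on the whole key rather than on the item number alone in this sketch.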
After the database design is finalized, we need to revise our input-process-output tables because
the number of data stores and the contents of each data store have changed following the database
(Please note that Item # has been added to the data dictionary to identify each item uniquely, and
Process: Verify Signature (Manual)

Inputs:
  Invoice (from Store): delivered to store + sales person signature (other data items, including vendor # + vendor name + vendor address + invoice # + invoice date + payment due date + {item # + item name + item quantity + unit price + extension} + total extension + sales tax + shipping & handling charges + amount owed, are not needed for the process but appear on the paper invoice)
  Sign (from Bookkeeper): sales person name + sales person’s signature

Process logic:
  Compare the signatures on the invoice and the branch ledger.
  If signatures are not identical
    Return the invoice to store for confirmation
  Endif

Outputs:
  Verified Invoice (with Bookkeeper): vendor # + vendor name + vendor address + invoice # + invoice date + payment due date + delivered to store + sales person signature + {item # + item name + item quantity + unit price + extension} + total extension + sales tax + shipping & handling charges + amount owed
  Invoice (Returned to store): vendor # + vendor name + vendor address + invoice # + invoice date + payment due date + delivered to store + sales person signature + {item # + item name + item quantity + unit price + extension} + total extension + sales tax + shipping & handling charges + amount owed
Process: Prepare Check (Batch)

Inputs:
  Invoice: invoice # + amount owed + invoice status
  Checkbook: date/time + balance

Process logic:
  Once in ten days,
  Get the Invoice table
  For each invoice in the invoice file with invoice status = “outstanding”
    If payment due date – current date <= 10
      Get checkbook table
      Get the last balance in the checkbook
      If amount owed <= balance
        Check date/time = current date/time
        Generate Check #
        Check Status = “No”
        Write Check #, Check Date/Time, Check Status, and Invoice # in Check record
        Dr/cr = Dr
        Balance = balance – amount owed
        Update checkbook record
        Invoice status = "in-process"
        Update invoice record
      Endif
    Endif
  Next invoice

Outputs:
  Invoice: invoice # + invoice status
  Checkbook: date/time + dr/cr + balance
  Check: Check # + Check Date/time + Check Status + Invoice #
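The Prepare Check logic can be sketched as a small Python function. This is one reading of the process under stated assumptions, not the system's implementation: the dictionary field names, the starting check number 1001, the sample invoices, and the choice to record each checkbook update as a new register entry are all introduced here for illustration.

```python
from datetime import date


def prepare_checks(invoices, checkbook, today, first_check_no=1001):
    """Write a check for each outstanding invoice due within 10 days, funds permitting.

    invoices and checkbook are lists of dicts and are updated in place.
    The last checkbook entry holds the current balance.
    Returns the list of check records written.
    """
    checks = []
    for invoice in invoices:
        due_soon = (invoice["payment_due_date"] - today).days <= 10
        if invoice["status"] != "outstanding" or not due_soon:
            continue
        balance = checkbook[-1]["balance"]           # last balance in the checkbook
        if invoice["amount_owed"] <= balance:
            check_no = first_check_no + len(checks)  # stand-in for "Generate Check #"
            checks.append({"check_no": check_no, "date": today,
                           "status": "No", "invoice_no": invoice["invoice_no"]})
            checkbook.append({"date": today, "dr_cr": "Dr",
                              "balance": balance - invoice["amount_owed"]})
            invoice["status"] = "in-process"
    return checks


# Hypothetical data for one batch run.
today = date(2024, 3, 1)
invoices = [
    {"invoice_no": "INV1", "amount_owed": 400.0,
     "payment_due_date": date(2024, 3, 8), "status": "outstanding"},
    {"invoice_no": "INV2", "amount_owed": 900.0,
     "payment_due_date": date(2024, 3, 9), "status": "outstanding"},
    {"invoice_no": "INV3", "amount_owed": 100.0,
     "payment_due_date": date(2024, 4, 30), "status": "outstanding"},
]
checkbook = [{"date": date(2024, 2, 28), "dr_cr": "Cr", "balance": 1000.0}]
checks = prepare_checks(invoices, checkbook, today)
```

In this run only INV1 is paid: INV2 is due soon but exceeds the remaining balance, and INV3 is not yet due within ten days, so both stay outstanding for a later batch.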
Invoice #
Process: This process is not required because the outputs of this process are not stored in the database.
Output: Vendor Payable Ledger: Vendor No. + Vendor Name + Total Purchases + Total Payments + Total Outstanding Amount
Process: Provide Invoice Status (online)

Inputs:
  Query (from Vendor): Vendor #
  Vendor: Vendor No. + Vendor Name
  Invoice: Invoice # + invoice date + amount owed + invoice status + vendor#

Process logic:
  Upon receiving a query from a vendor, Get the Vendor No.
  Get Vendor record
  For each invoice of vendor w/ Invoice status = “Outstanding” or “In-Process”
    Get Invoice record
    If Invoice Status = Outstanding
      Write Invoice No., Invoice Date, Amount Owed, and Invoice Status
      Total outstanding amount = total outstanding amount + amount owed
    Elseif Invoice Status = In-process
      Write Invoice No., Invoice Date, Amount Owed, and Invoice Status
      Total amount in-process = total amount in-process + amount owed
    Endif
  Next invoice
  Format & display Response

Output:
  Response (Vendor): Vendor No. + Vendor Name + {Invoice No. + Invoice Date + Amount Owed + Invoice Status} + Total Outstanding Amount + Total Amount in process
In larger systems, you may find each table interacting with each process, and thus, producing a
complex DFD. In the design stage, we need information about each process and each data item that is
input or output. Since the DFD representation is no longer useful in the design stage, we will discard it
Chapter Summary
The database design process includes two stages: logical database design and physical database
design.
Logical database design shows the tables, the data elements in each table, and relationships
between tables.
The input to the logical database design process is the data dictionary prepared at the end of the
analysis phase.
The first step in the logical database design process is identifying the data elements that will be
This chapter discussed the various elements of an E-R-A diagram and how to construct an E-R-A
diagram.
The next step in the logical database design process is the conversion of E-R-A diagram into
This chapter discussed the step-by-step procedure for converting an E-R-A diagram into a logical
The chapter discussed normalization as a tool to verify the goodness of logical database design.
Once the logical database design is completed, the physical database design is done.
The physical database design tunes the database to improve performance.
This chapter discussed indexing, partitioning, and denormalization methods to tune the
database.
After the database design is finalized, the input-process-output tables derived at the end of the
Finally, this chapter illustrated the database design process using a case study.
End of Chapter Exercises
Key Terms
Associative Entity
Attribute
Binary Relationship
Denormalization
Entity
Foreign Key
Indexing
Maximum Cardinality
Minimum Cardinality
Normalization
Partitioning
Primary Key
Relationship
SQL
Stored Procedure
Sub entity
Super Entity
Ternary Relationship
Trigger
Weak Entity
Self-Study Questions
b. Normalization
c. Indexing
d. b and c
e. a and b.
a. True
b. False
3. Data items identified in the data dictionary are modeled in an E-R-A diagram as
a. entities
b. attributes
c. relationships
a. The key of the entity corresponding to the one part is put in the entity corresponding to
b. The key of the entity corresponding to the many part is put in the entity corresponding to
c. Both a and b
5. Normalization process results in
d. Smaller tables
a. true
b. false
7. In horizontal partitioning,
a. Each partition will have a subset of attributes from the table
b. Each partition will have all attributes, but a subset of records
c. Each partition will have a subset of attributes and a subset of records.
d. Each partition will have a subset of tables from the database.
8. Indexing of a table
9. Denormalization
10. A table that has a transitive dependency violates the condition for
Review Questions
2. What do (i) entities, (ii) attributes, and (iii) relationships model in an E-R-A diagram?
4. What are different types of entities? Give examples for each type.
7. What is the purpose of logical database design and physical database design?
10. Under what conditions will a derived attribute be stored (or not stored) in a database?
11. Why do we say that DFDs and E-R-A diagrams of a system are like “two sides of the same coin”?
12. E-R-A diagrams capture more information about data than data documentation associated with
Exercises
The BG bank serves the people of Bowling Green. A customer of the BG bank can have a checking account
or a savings account or both. A customer can have only one savings account, and one checking account.
BG bank serves two types of customers: individual and institutional. When the customer opens an account,
the bank obtains several data depending on the type of customer. For individual customers, data such as the
social security number, name, occupation, and salary are obtained. For institutional customers, data such as
the name, number of employees, revenue, and profit are collected. The customer is then given an account
number. The bank maintains, for each account, the account balance.
The institutional customers are assigned personal officers; one customer has one assigned officer.
However, an officer may handle multiple customers. The officer schedules meetings with the institutional
members frequently to discuss customer-related issues. The bank maintains data such as time, date, venue,
The bank also provides loans. Customers who obtain loans are called clients. The same client can
obtain several loans from the bank. The bank gives home and car loans. Each loan has a repayment
period. This period is fixed by the bank based on the client’s financial status, the type of loan, and a
2. Consider the following data dictionary for a dentist’s office. Develop a database design in 3NF.
Data Dictionary:
7. servcode service code
13. insurname name of the insurance company to which the family belongs.
A patient gets a service only once in a day. A family has only one insurance company. There is a fixed fee
for each service. The fee owed by the patient and the insurance company for a service depends on the
insurance company.
3. IT Services, Inc. is an engineering firm with approximately 1000 employees. A database is required to
keep track of all employees, their skills, and projects assigned and departments worked in. Every
employee has a unique number assigned by the firm. The firm needs to store his/her name and date-of-
birth. Each employee is given a job title. The employees are also categorized into different groups such
as professionals, and administrative assistants. The relevant data to be recorded for professionals is the
There are several departments, each with a unique name. An employee can report to only one
To procure various kinds of equipment, each department deals with many vendors. A vendor typically
supplies equipment to many departments. It is required to store the name and address of each vendor,
Many employees can work on a project. An employee can work on many projects. Each project is
carried out in a city. For each city, we are interested in its state and the population. An employee can
have many skills. An employee uses each skill she/he possesses in at least one project. Each skill is
assigned a number. A short description is required to be stored for each skill. Projects are distinguished
The US Airlines Company publishes a monthly flight log report that tracks the type of aircraft flown and
the number of hours flown by an individual pilot during the month. A separate report is prepared for each
pilot and is used to monitor pilot flight proficiency for the two types of aircraft (fixed-wing and rotorcraft) that a
pilot may be qualified to fly. Pilots may fly a different aircraft on each trip. Each aircraft has a single crew chief
permanently assigned to perform maintenance on it, although a crew chief may crew more than one
aircraft. Each aircraft is identified by an aircraft number and has a seating capacity. Pilots
have pilot license numbers. The report also specifies, for each aircraft, characteristics such as the number of
engines and the type of propeller in the case of fixed-wing aircraft, and the rotor speed in the case of
rotorcraft.
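One way to begin this exercise is to treat aircraft as a super entity with fixed-wing and rotorcraft sub entities. The sketch below, using Python's built-in sqlite3 module, shows only that part of a possible design; pilots and flight hours are left out. All table and column names are hypothetical, and placing the number of engines with fixed-wing aircraft only is one reading of the problem statement, not a given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE crew_chief (
    chief_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- Super entity: fields shared by every aircraft. Each aircraft has exactly
-- one crew chief; a chief may appear on many aircraft rows.
CREATE TABLE aircraft (
    aircraft_no      TEXT PRIMARY KEY,
    seating_capacity INTEGER NOT NULL,
    kind             TEXT NOT NULL CHECK (kind IN ('fixed-wing', 'rotorcraft')),
    chief_id         INTEGER NOT NULL REFERENCES crew_chief(chief_id)
);
-- Sub entities: share the parent's primary key and hold type-specific fields.
CREATE TABLE fixed_wing (
    aircraft_no    TEXT PRIMARY KEY REFERENCES aircraft(aircraft_no),
    num_engines    INTEGER NOT NULL,
    propeller_type TEXT NOT NULL
);
CREATE TABLE rotorcraft (
    aircraft_no TEXT PRIMARY KEY REFERENCES aircraft(aircraft_no),
    rotor_speed REAL NOT NULL
);
""")

def add_fixed_wing(aircraft_no, num_engines, propeller_type):
    """Return True if stored, False if the parent row is missing or duplicated."""
    try:
        conn.execute("INSERT INTO fixed_wing VALUES (?, ?, ?)",
                     (aircraft_no, num_engines, propeller_type))
        return True
    except sqlite3.IntegrityError:
        return False

def aircraft_per_chief(chief_id):
    """Count the aircraft assigned to one crew chief."""
    return conn.execute("SELECT COUNT(*) FROM aircraft WHERE chief_id = ?",
                        (chief_id,)).fetchone()[0]

# Made-up sample data, for illustration only.
conn.execute("INSERT INTO crew_chief VALUES (1, 'R. Diaz')")
conn.execute("INSERT INTO aircraft VALUES ('N100', 6, 'fixed-wing', 1)")
conn.execute("INSERT INTO aircraft VALUES ('N200', 4, 'rotorcraft', 1)")
add_fixed_wing("N100", 2, "constant-speed")
conn.execute("INSERT INTO rotorcraft VALUES ('N200', 390.0)")
```

Note that this sketch does not stop a fixed_wing row from being attached to an aircraft whose kind is 'rotorcraft'; enforcing that would need an extra constraint or a trigger.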
List of Figures
Figure 9-3. Modified E-R-A Diagram for the Example in Figure 9-2
List of Tables
Table 9-1. Data Items That Will Be Stored in the Database in the AP System
Chapter Index
Index entry as it will appear in the Book Index: String to search for in the text body
Associative Entity: many-to-many relationships create a new type of entity called associative entity
Attribute: An attribute can be defined as a property of an entity.
Binary Relationship: A binary relationship connects two entity classes, or has a degree of two.
Denormalization: Denormalization is the reverse of normalization process
Entity: An entity is an identifiable thing or object that is distinguishable from other objects
First Normal Form: A table is said to be in 1NF if the table does not have multi-valued attributes
Foreign Key: A foreign key also helps to navigate a database in the reverse direction.
Indexing: Indexing a table based on a field improves the access speed when the table is searched using that field.
Logical database design: The logical database design involves drawing an E-R-A diagram to determine various tables in the database, relationships among tables, and fields in each table.
Maximum Cardinality: We say that the minimum and maximum cardinalities of this relationship on the employee side are one and many respectively.
Minimum Cardinality: We say that the minimum and maximum cardinalities of this relationship on the employee side are one and many respectively.
Normalization: Normalization is a procedure that reduces data redundancy and mitigates maintenance anomalies in a database design.
Partitioning: Partitioning splits a table for implementation and storage purposes
Physical database design: The primary purpose of physical database design is to improve the database performance.
Primary Key: Each basic entity in an E-R-A model would have primary key(s) that identify each instance.
Recursive Relationship: an entity related to itself or degree 1
Relationship: A relationship models an association between instances of entities.
Second Normal Form: A table is said to be in 2NF if the table is 1NF and all non-key attributes depend on the whole set of keys.
SQL: In addition, DBMSs have an easy to learn and use high level language
Stored Procedure: The SQL program as written above and executed would
Structured Query Language: In addition, DBMSs have an easy to learn and use high level language
Sub entity: We use sub entities to model such scenarios.
Super Entity: we also refer to super entity and a sub entity as parent and child, respectively.
Ternary Relationship: In a ternary relationship, three or more entities will be involved
Third Normal Form: A table is in 3NF if the table is in 2NF, and the table does not contain a transitive dependency.
Trigger: A trigger is a stored procedure that automatically executes
Weak Entity: Weak entities are also known as dependent entities,
Additional Reading
2. R. Elmasri and S. Navathe, Fundamentals of Database Systems, Third Edition, Addison Wesley,
2000.