Design A Database

Uploaded by

solomon baynes

Database System
A database is a collection of related data whose
fundamental element is the table. Tables are related
to one another and are organized into tuples (rows)
and fields (columns).
Database systems are designed to manage large data
sets in an organization. Data management involves
both the definition and the manipulation of data,
ranging from the simple representation of data to
the design of structures for storing information.
It also covers the provision of mechanisms for
manipulating information.
Today, databases are essential to every business.
They are used to maintain internal records, to
present data to customers and clients on the
World Wide Web, and to support many other
commercial processes. Databases are likewise
found at the core of many modern organizations.

Debre Markos Polytechnic college Dessie A.



The power of databases comes from a body of
knowledge and technology that has developed over
several decades and is embodied in specialized
software called a database management system,
or DBMS. A DBMS is a powerful tool for creating
and managing large amounts of data efficiently and
allowing it to persist over long periods of time,
safely. These systems are among the most
complex types of software available.
In common usage, the term database refers to a
collection of data that is managed by a DBMS.
Thus the database course is about:
1. How to organize data
2. Supporting multiple users
3. Efficient and effective data retrieval
4. Secured and reliable storage of data
5. Maintaining consistent data
6. Making information useful for decision
making
Data management has passed through different
levels of development along with advances in
technology and services. These can best be
grouped into three levels. Even though each new
level brings an advantage and overcomes a problem
of the previous one, all methods of data handling
are still in use to some extent. The three major
levels are:
1. Manual Approach
2. Traditional File Based Approach
3. Database Approach
1. Manual Approach
In the manual approach, data storage and retrieval
follow the primitive, traditional way of handling
information, in which cards and paper are used.
Storage and retrieval are performed by human labour.
1. Files for as many events and objects as the
organization has are used to store information.
2. Each of the files, containing various kinds of
information, is labelled and stored in one or
more cabinets.
3. The cabinets can be kept in safe places for
security purposes, based on the sensitivity of the
information they contain.
4. Insertion and retrieval are done by searching
first for the right cabinet, then for the right
file, then for the information.
One could have an indexing system to facilitate
access to the data.
Limitations of the Manual Approach
Prone to error
Difficult to update, retrieve and integrate
The data exists, but it is difficult to compile
it into information
Limited to small volumes of information
Cross-referencing is difficult

2. Traditional File Based Approach


After computers were introduced to the business
community for data processing, the need to use
them for data storage and processing increased.
There were, and still are, several computer
applications with file based processing used for
data handling. Even though the approach has evolved
over time, the basic structure is still similar,
if not identical.
File based systems were an early attempt to
computerize the manual filing system.
This approach is a decentralized, computerized
data handling method.
A collection of application programs performs
services for the end users. In such systems,
every application program that provides a service
to end users defines and manages its own data.
Such systems have a number of programs for
each of the different applications in the
organization.
Since every application defines and manages its
own data, the system is subject to a serious
data duplication problem.
A file, in the traditional file based approach, is a
collection of records that contain logically
related data.
Limitations of the Traditional File Based Approach
As business applications became more complex,
demanding more flexible and reliable data handling
methods, the shortcomings of the file based
system became evident. These shortcomings
include, but are not limited to:
Separation or isolation of data: information
available in one application may not be known to
another.
Limited data sharing
Lengthy development and maintenance time
Duplication or redundancy of data
Data dependency on the application
Incompatible file formats between different
applications and programs, creating inconsistency
Fixed query processing, defined during
application development
The limitations of the traditional file based data
handling approach arise from two basic reasons.
1. The definition of the data is embedded in
the application program, which makes it difficult
to modify the data definition.
2. There is no control over the access and
manipulation of the data beyond that imposed
by the application programs.
3. Database Approach
Following a famous paper written by Ted Codd
in 1970, database systems changed
significantly. Codd proposed that database
systems should present the user with a view of
data organized as tables called relations.
Behind the scenes, there might be a complex
data structure that allows rapid response to a
variety of queries. But, unlike the user of earlier
database systems, the user of a relational
system would not be concerned with the
storage structure. Queries could be expressed
in a very high-level language, which greatly
increased the efficiency of database
programmers. The database approach
emphasizes the integration and sharing of data
throughout the organization.
Thus, in the database approach:

A database is just a computerized record keeping
system, or a kind of electronic filing cabinet.
A database is a repository for a collection of
computerized data files.
A database is a shared collection of logically
related data designed to meet the information
needs of an organization. Since it is a shared
corporate resource, the database is integrated
with minimal or no duplication.
A database is a collection of logically related data
comprising the entities, attributes, relationships
and business rules of an organization's information.
In addition to containing the data required by an
organization, a database also contains a
description of the data, which is called
“Metadata”, the “Data Dictionary”, the “System
Catalogue” or “Data about Data”.
Since a database contains information about the
data (metadata), it is called a self-describing
collection of integrated records.
The purpose of a database is to store information
and to allow users to retrieve and update that
information on demand.
A database is designed once and used
simultaneously by many users.

Unlike the traditional file based approach, the
database approach provides program-data
independence, that is, the separation of the data
definition from the application. Thus the
application is not affected by changes made to
the data structure and file organization.
Each database application performs some
combination of creating the database and
reading, updating and deleting data.
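The points above about creating, reading, updating and deleting data, and about a database carrying its own description, can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database. The student table and its rows are hypothetical, and sqlite_master is SQLite's own catalogue table; other DBMSs expose their catalogue differently.

```python
import sqlite3

# In-memory database; the table, columns and rows are illustrative only.
conn = sqlite3.connect(":memory:")

# Create: define a table and insert rows.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO student (id, name) VALUES (?, ?)",
                 [(1, "Abebe"), (2, "Sara")])

# Read: retrieve the stored rows.
rows_after_insert = conn.execute(
    "SELECT id, name FROM student ORDER BY id").fetchall()

# Update and delete existing rows.
conn.execute("UPDATE student SET name = ? WHERE id = ?", ("Abebe K.", 1))
conn.execute("DELETE FROM student WHERE id = ?", (2,))
rows_after_changes = conn.execute(
    "SELECT id, name FROM student ORDER BY id").fetchall()

# Metadata: the catalogue describes the objects stored in the database.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'").fetchall()

print(rows_after_insert)
print(rows_after_changes)
print(catalogue)
```

The last query reads the description of the data rather than the data itself, which is what makes the database self-describing.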
Benefits of the database approach
Data can be shared: two or more users can
access and use the same data, instead of the data
being stored redundantly for each user.
Improved accessibility of data: by using
structured query languages, users can easily
access data without programming experience.
Redundancy can be reduced: isolated data is
integrated in the database to decrease the redundant
data stored by different applications.
Quality data can be maintained: the different
integrity constraints in the database approach
maintain data quality, leading to better
decision making.
Inconsistency can be avoided: controlled data
redundancy avoids inconsistency of the data in
the database to some extent.
Transaction support can be provided: the basic
demands of any transaction support system are
implemented in a full scale DBMS.
Integrity can be maintained: data from different
applications is integrated, with additional
constraints, to provide a shared data resource.
Security measures can be enforced: the shared data
can be secured by having different levels of
clearance and other data security mechanisms.
Improved decision support: the database
provides information useful for decision making.
Standards can be enforced: the different ways of
using and dealing with data by different units of
an organization can be balanced and
standardized by using the database approach.
Compactness: since it is an electronic data
handling method, the data is stored compactly
(no voluminous papers).
Speed: data storage and retrieval are fast because
they use modern, fast computer systems.
Less labour: unlike the other data handling
methods, data maintenance does not demand
many resources.
Centralized information control: since relevant
data in the organization is stored in one
repository, it can be controlled and managed at
the central level.
Limitations and Risks of the Database Approach
Introduction of new professional and specialized
personnel
Complexity in designing and managing data
Cost and risk during conversion from the old to
the new system
High cost of developing and maintaining the system
Complex backup and recovery services from the
user's perspective
Reduced performance due to centralization and
data independence
Database Management System (DBMS)
A Database Management System (DBMS) is a
software package that provides EFFICIENT,
CONVENIENT and SAFE MULTI-USER (many
people or programs accessing the same database, or
even the same data, simultaneously) storage of and
access to MASSIVE amounts of PERSISTENT (data
outlives the programs that operate on it) data.
A DBMS also provides a systematic method for
creating, updating, storing and retrieving data in a
database. It also provides services for
controlling data access, enforcing data integrity,
managing concurrency and recovering from failure.
With this in mind, a full scale DBMS should
provide at least the following services to the
user:
1. Data storage, retrieval and update in the
database
2. A user accessible catalogue
3. Transaction support service: all-or-none
transactions, which minimize data
inconsistency.
4. Concurrency control services:
access to and update of the database by different
users simultaneously should be implemented
correctly.
5. Recovery services: a mechanism
for recovering the database after a failure must
be available.
6. Authorization services: the data
administrator uses these to create accounts and
account restrictions, which the DBMS should then
enforce automatically.
7. Security: the DBMS must support the
implementation of access and authorization
services for the database administrator and users.
8. Support for data communication:
the DBMS should provide the facility to
integrate with data transfer software or data
communication managers.
9. Integrity services: rules about
data and the changes made to it, the
correctness and consistency of stored data, and
the quality of data based on business constraints.
10. Services to promote independence
between the data and the application
11. Utility services: a set of utility
facilities, such as importing data
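The all-or-none behaviour of the transaction support service can be sketched as follows. This is a minimal illustration, assuming Python's built-in sqlite3 module. The account table, the transfer function and the amounts are hypothetical and are not part of any particular DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()


ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)  # would drive account 1 below zero
after_failed = balances(conn)

print(ok, after_ok)
print(failed, after_failed)
```

In the second call the credit to account 2 is executed first, but it is undone when the debit violates the CHECK constraint, so the balances stay as they were after the first transfer.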

DBMS and Components of DBMS Environment
A DBMS is a software package used to design,
manage and maintain databases. Each DBMS
should have facilities to define the database,
manipulate the content of the database and
control the database. These facilities help
the designer, the user and the database
administrator to discharge their responsibilities in
designing, using and managing the database. A DBMS
provides the following facilities:
Data Definition Language (DDL):
• Language used to define each data element
required by the organization.
• Commands for setting up the schema, or the
intension, of the database.
• These commands are used to set up a database
and to create, delete and alter tables, with the
facility of handling constraints.
Data Manipulation Language (DML):
• The core commands used by end users and
programmers to store, retrieve and access the
data in the database, e.g. SQL.
• Since the data required by the user (the query)
is extracted using this type of language, it
is also called a "Query Language".
Data Dictionary:
• Because a database is a self-describing
system, this tool, the Data
Dictionary, is used to store and organize
information about the data stored in the
database.
Data Control Language (DCL):
• A database is a shared resource that demands
control of data access and usage. The database
administrator should have the facility to control
the overall operation of the system.
• Data Control Languages are commands that
help the database administrator to control the
database.
• The commands grant or revoke
privileges to access the database or particular
objects within the database, and save or
undo database transactions.
GRANT - gives users access privileges to the
database
REVOKE - withdraws access privileges given
with the GRANT command
Transaction Control Language (TCL):
• Statements used to manage the changes made by
DML statements. They allow statements
to be grouped together into logical transactions.
e.g. COMMIT - saves the work done
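The difference between these command categories can be seen in a short sketch. It assumes Python's built-in sqlite3 module; the employee table and its columns are hypothetical. SQLite has no user accounts, so the DCL commands appear only as comments showing the general SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, then alter it.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL DEFAULT 0")

# DML: store and retrieve data.
conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (1, 'Hana', 5000)")

# TCL: COMMIT saves the work done by the DML statement above.
conn.commit()

salary = conn.execute("SELECT salary FROM employee WHERE emp_id = 1").fetchone()[0]

# The schema itself can be inspected; column 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# DCL is not available in SQLite. In a multi-user DBMS it typically looks like:
#   GRANT SELECT ON employee TO some_user;
#   REVOKE SELECT ON employee FROM some_user;

print(columns, salary)
```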
The DBMS is a software package that helps to
design, manage and use data using the
database approach. Taking a DBMS as a
system, one can describe it with respect to its
environment, or the other systems interacting with
it. The DBMS environment has five
components. To design and use a database,
there must be interaction and integration of
Hardware, Software, Data, Procedure and
People.
1. Hardware: components that one can touch
and feel. These comprise
various types of personal computers, mainframes
or any server computers used in a multi-user
system, network infrastructure and other
peripherals required in the system.
2. Software: collections of commands and
programs used to make the hardware
perform a function. These include components
such as the DBMS software, application programs,
operating systems, network software, language
software and other relevant software.
3. Data: since the goal of any database system is
to have better control of the data and to make
data useful, data is the most important
component to the user of the database. There are
two categories of data in any database system:
operational data and metadata. Operational
data is the data actually stored in the system to
be used by the user. Metadata is the data that
stores information about the database
itself. The structure of the data in the database is
called the schema, which is composed of the
entities, the properties of entities and the
relationships between entities.
4. Procedure: the rules and regulations on
how to design and use a database. It includes
procedures such as how to log on to the DBMS, how
to use its facilities, how to start and stop a
transaction, how to make a backup, how to treat
hardware and software failure, and how to change
the structure of the database.
5. People: this component is composed of the
people in the organization who are responsible for,
or play a role in, designing, implementing,
managing, administering and using the resources
in the database. It ranges from
people with a high level of knowledge about the
database and the design technology to others with
no knowledge of the system beyond using the
data in the database.


Database Development Life Cycle


Because a database is one component of most
information system development tasks, designing a
database system involves several steps. Here more
emphasis is given to the design phases of the
system development life cycle.

The major steps in database design are:

1. Planning: identifying the information gap in
an organization and proposing a database solution
to solve the problem.

2. Analysis: concentrates on fact
finding about the problem or the opportunity.
Feasibility analysis, requirement determination
and structuring, and selection of the best design
method are also performed at this phase.

3. Design: in database design, most emphasis
is given to this phase. The phase is further
divided into three sub-phases.
A. Conceptual Design: a concise description
of the data, data types, relationships between
data and constraints on the data.
• No implementation or physical detail is
considered.
• Used to elicit and structure all information
requirements.
B. Logical Design: the conceptual
design expressed in a selected, specific
data model used to implement the data
structure.
• It is independent of any particular DBMS
and involves no other physical considerations.
C. Physical Design: the physical implementation
of the higher level design of the database with
respect to the internal storage and file structure
of the database for the selected DBMS.
• Develops all technology and
organizational specifications.
4. Implementation: the testing and deployment
of the designed database for use.
5. Operation and Support: administering and
maintaining the operation of the database system
and providing support to users.

Roles in Database Design and Use

Since people are one of the components of the DBMS
environment, different stakeholders play a number
of roles in the design and operation of a
database system.
Database Administrator (DBA)

Responsible for overseeing, controlling and managing
the database resources (the database itself, the
DBMS and other related software)
Authorizes access to the database
Coordinates and monitors the use of the
database
Responsible for determining and acquiring
hardware and software resources
Accountable for problems such as poor security and
poor system performance
Involved in all steps of database development
In big organizations with huge amounts of data
and many user requirements, this role can be
classified further.
1. Data Administrator (DA): responsible
for the management of data resources. Involved in
database planning, development, and the
maintenance of standards, policies and
procedures at the conceptual and logical
design phases.

2. Database Administrator (DBA): a more
technically oriented role. Responsible for the
physical realization of the database. Involved
in physical design, implementation, and security
and integrity control of the database.

1. Database Designer (DBD)
Identifies the data to be stored and chooses the
appropriate structures to represent and store
the data.
Should understand the user requirements and
should choose how the user views the
database.
Involved in the design phase, before the
implementation of the database system.
There are two kinds of database designer,
one involved in the conceptual and logical design
and another involved in the physical design.
a) Logical and Conceptual DBD
Identifies data (entities, attributes and
relationships) relevant to the organization
Identifies constraints on each data item
Understands the data and business rules in the
organization
Sees the database independently of any data
model at the conceptual level, and considers one
specific data model at the logical design phase.
b) Physical DBD
• Takes the logical design specification as input
and decides how it should be physically realized.
• Maps the logical data model onto the specified
DBMS with respect to tables and integrity
constraints (DBMS dependent design).
• Selects the specific storage structure and access
path to the database.
• Designs the security measures required on the
database.
2. Application Programmer and Systems
Analyst
The systems analyst determines the user
requirements and how the user wants to view
the database.
The application programmer implements
these specifications as programs: codes, tests,
debugs, documents and maintains the
application program.
Determines the interface for retrieving,
inserting, updating and deleting data in the
database.
The application may use any high level
programming language, according to
availability, facilities and the required
service.
3. End Users
Workers whose jobs require accessing the
database frequently for various purposes. There
are different groups of users in this category.
a) Naïve Users
A sizable proportion of users
Unaware of the DBMS
Only access the database based on their
access level and demand
Use standard and pre-specified types of
queries.
b) Sophisticated Users
Are familiar with the structure of the
database and the facilities of the DBMS.
Have complex requirements
Have higher level queries
Are most of the time engineers, scientists,
business analysts, etc.
c) Casual Users
Users who access the database occasionally.
Need different information from the database
each time.
Use sophisticated database queries to satisfy
their needs.
Are most of the time middle to high level
managers.
• Users may also be divided into two groups. Those
who actually use and control the database content,
and those who design, develop and maintain database
applications, are called “Actors on the Scene”.
Those who design and develop the DBMS and
related tools, and the computer systems
operators, are called “Workers behind the Scene”.
Actors on the Scene:
• Data Administrator
• Database Administrator
• Database Designer
• End Users
Workers behind the Scene:
• DBMS designers and implementers: who
design and implement different DBMS
software.
• Tool developers: experts who develop
software packages that facilitate database
system design and use. Developers of prototyping,
simulation and code generation tools are
examples. Independent software vendors
can also be placed in this group.
• Operators and maintenance personnel:
system administrators who are responsible for
actually running and maintaining the hardware
and software of the database system and the
information technology facilities.
Data Independence
Logical Data Independence:
• Refers to the immunity of external schemas to
changes in the conceptual schema.
• Conceptual schema changes, e.g. the
addition or removal of entities, should not require
changes to external schemas or rewrites of
application programs.
• It is the capacity to change the conceptual schema
without having to change the external
schemas and their application programs.
Physical Data Independence:
• The ability to modify the physical schema
without changing the logical schema.
• Applications depend on the logical schema.
• In general, the interfaces between the various
levels and components should be well defined
so that changes in some parts do not seriously
influence others.
• It is the capacity to change the internal schema
without having to change the conceptual
schema.
• Refers to the immunity of the conceptual schema to
changes in the internal schema.
• Internal schema changes, e.g. using different
file organizations or storage structures/devices,
should not require changes to the conceptual or
external schemas.
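Logical data independence can be sketched with a view standing in for the external schema. The example below is a minimal illustration, assuming Python's built-in sqlite3 module. The tables staff, staff_v2 and department and the view staff_list are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original conceptual schema: one table holds everything.
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO staff VALUES (1, 'Lensa', 'Finance')")

# External schema: the application only ever queries this view.
conn.execute("CREATE VIEW staff_list AS SELECT id, name, dept FROM staff")

APP_QUERY = "SELECT name, dept FROM staff_list ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema change: departments move to their own table.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("INSERT INTO department VALUES (10, 'Finance')")
conn.execute("CREATE TABLE staff_v2 (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")
conn.execute("INSERT INTO staff_v2 VALUES (1, 'Lensa', 10)")
conn.execute("DROP VIEW staff_list")
conn.execute("DROP TABLE staff")

# Only the mapping (the view definition) is rewritten.
conn.execute("""CREATE VIEW staff_list AS
                SELECT s.id, s.name, d.dept_name AS dept
                FROM staff_v2 AS s
                JOIN department AS d ON d.dept_id = s.dept_id""")

after = conn.execute(APP_QUERY).fetchall()
print(before, after)
```

The application query text never changes, although the tables underneath it were reorganized.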
The distinction between a Data Definition
Language (DDL) and a Data Manipulation
Language (DML)
Data Definition Language (DDL)
• Allows the DBA or user to describe and name the
entities, attributes and relationships required
for the application.
• A specification notation for defining the
database schema.
Data Manipulation Language (DML)
• Provides basic data manipulation operations
on data held in the database.
• A language for accessing and manipulating the
data organized by the appropriate data model.
• A DML is also known as a query language.
Procedural DML: the user specifies what data is
required and how to get the data.
Non-Procedural DML: the user specifies what
data is required but not how it is to
be retrieved.
SQL is the most widely used non-procedural
query language.
Fourth Generation Language (4GL)
• Query Languages
• Forms Generators
• Report Generators
• Graphics Generators
• Application Generators
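The difference between procedural and non-procedural data manipulation described above can be sketched side by side. This is an illustration only, assuming Python's built-in sqlite3 module and a hypothetical employee table. The loop stands in for a procedural DML, in which the program says how to locate the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("Almaz", "Sales", 4000),
                  ("Bekele", "IT", 7000),
                  ("Chaltu", "IT", 6500)])

# Procedural style: the program spells out HOW to find the rows,
# visiting every record and testing it in turn.
procedural = []
for name, dept, salary in conn.execute("SELECT name, dept, salary FROM employee"):
    if dept == "IT" and salary > 6000:
        procedural.append(name)
procedural.sort()

# Non-procedural style: the query states WHAT is wanted;
# the DBMS decides how to retrieve it.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM employee "
    "WHERE dept = 'IT' AND salary > 6000 ORDER BY name")]

print(procedural, declarative)
```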
A Classification of Data Models
A specific DBMS has its own specific Data Definition
Language, but this type of language is too low level
to describe the data requirements of an
organization in a way that is readily
understandable by a variety of users.
We need a higher-level language.
Such a higher-level language is called a data model.
Data Model: a set of concepts to describe the
structure of a database, and certain
constraints that the database should obey.
A data model is a description of the way that data
is stored in a database. A data model helps to
understand the relationships between entities and
to create the most effective structure to hold data.
A data model is a collection of tools or concepts for
describing:
• Data
• Data relationships
• Data semantics
• Data constraints
The main purpose of a data model is to represent
the data in an understandable way.
Categories of data models include:
• Object-based
• Record-based
• Physical
Record-based Data Models
Consist of a number of fixed format records.
Each record type defines a fixed number of fields.
Each field is typically of a fixed length.
• Hierarchical Data Model
• Network Data Model
• Relational Data Model
1. Hierarchical Model
The simplest data model
A record type is referred to as a node or
segment
The top node is the root node
Nodes are arranged in a hierarchical
structure, as a sort of upside-down tree
A parent node can have more than one child
node
A child node can have only one parent node
The relationship between parent and child is
one-to-many
A relationship is established by creating a physical
link between stored records (each is stored
with a predefined access path to other
records)
To add a new record type or relationship, the
database must be redefined and then stored
in a new form.

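[Figure: hierarchical model example. Department is the root node; Employee and Job are on the second level; Time Card and Activity are on the third level.]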
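The rules of the hierarchical model (one root, one parent per child, access along predefined paths) can be sketched as a small tree. This is an illustration in plain Python, not the interface of any hierarchical DBMS. The Segment class and its helper functions are hypothetical, and placing Time Card under Employee and Activity under Job is an assumption about the figure.

```python
class Segment:
    """A node (record type) in a hierarchy; it has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def walk(node):
    """Visit the tree top-down, left to right, starting at node."""
    yield node.name
    for child in node.children:
        yield from walk(child)


def path_from_root(node):
    """Follow the single parent link upward to recover the access path."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return list(reversed(path))


# Assumed placement: Time Card under Employee, Activity under Job.
department = Segment("Department")  # root node
employee = Segment("Employee", department)
job = Segment("Job", department)
time_card = Segment("Time Card", employee)
activity = Segment("Activity", job)

order = list(walk(department))
path = path_from_root(time_card)
print(order)
print(path)
```

Because every record is reached by navigating from the root, the traversal order is fixed by the stored structure rather than chosen by a query optimizer.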

ADVANTAGES of Hierarchical Data Model:
• The hierarchical model is simple to construct and
operate on.
• Corresponds to a number of naturally
hierarchically organized domains, e.g.
assemblies in manufacturing, personnel
organization in companies.
• The language is simple; it uses constructs like GET,
GET UNIQUE, GET NEXT, GET NEXT WITHIN
PARENT etc.
DISADVANTAGES of Hierarchical Data Model:
• Navigational and procedural nature of
processing
• The database is visualized as a linear
arrangement of records
• Little scope for "query optimization"


2. Network Model
Allows record types to have more than one
parent, unlike the hierarchical model
A network data model sees records as set
members
Each set has an owner and one or more
members
Allows many-to-many relationships between
entities
Like the hierarchical model, the network model is a
collection of physically linked records.
Allows member records to have more than one
owner

[Figure: network model example with the record types Department, Job, Employee, Activity and Time Card.]

ADVANTAGES of Network Data Model:
• The network model is able to model complex
relationships and represents the semantics of
add/delete on the relationships.
• Can handle most situations for modeling using
record types and relationship types.
• The language is navigational; it uses constructs like
FIND, FIND member, FIND owner, FIND NEXT
within set, GET etc. Programmers can do
optimal navigation through the database.
DISADVANTAGES of Network Data Model:
• Navigational and procedural nature of
processing
• The database contains a complex array of
pointers that thread through a set of records.
• Little scope for automated "query
optimization"
3. Relational Data Model
Developed by Dr. Edgar Frank Codd in
1970 (famous paper, 'A Relational Model of
Data for Large Shared Data Banks')
The terminology originates from the branches of
mathematics called set theory and relations
Can define more flexible and complex
relationships
Viewed as a collection of tables called
“Relations”, equivalent to a collection of record
types
Relation: a two dimensional table
Stores information or data in the form of
tables (rows and columns)
A row of the table is called a tuple, equivalent
to a record
A column of a table is called an attribute,
equivalent to a field
A data value is the value of an attribute
Records are related by the data stored jointly
in the fields of records in two tables or files.
The related tables contain the information that
creates the relation
The tables seem to be independent but are
related somehow.
No physical consideration of the storage is
required of the user
Many tables are merged together to come up
with a new virtual view of the relationship
Alternative terminologies
Relation    Table    File
Tuple       Row      Record
Attribute   Column   Field
The rows represent records (collections of
information about separate items)
The columns represent fields (particular
attributes of a record)
A relational database conducts searches by using
data in specified columns of one table to find
additional data in another table
In conducting searches, a relational database
matches information from a field in one table
with information in a corresponding field of
another table to produce a third table that
combines the requested data from both tables
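Matching a field in one table with the corresponding field in another to produce a third, combined table is what SQL calls a join. The sketch below assumes Python's built-in sqlite3 module; the department and employee tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("CREATE TABLE employee "
             "(emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(1, "Accounting"), (2, "Registrar")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(10, "Dawit", 2), (11, "Meron", 1), (12, "Tigist", 2)])

# The dept_id field of employee is matched with the dept_id field of
# department; the result combines columns from both tables.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(joined)
```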
Relational Data Model
Properties of Relational Databases
 Each row of a table is uniquely identified by a
PRIMARY KEY composed of one or more
columns
 Each tuple in a relation must be unique
 Group of columns, that uniquely identifies a
row in a table is called a CANDIDATE KEY
 ENTITY INTEGRITY RULE of the model
states that no component of the primary key
may contain a NULL value.
 A column or combination of columns that
matches the primary key of another table is
called a FOREIGN KEY. Used to cross-
reference tables.
 The REFERENTIAL INTEGRITY RULE of the
model states that, for every foreign key value
in a table there must be a corresponding
primary key value in another table in the
database or it should be NULL.
 All tables are LOGICAL ENTITIES
 A table is either a BASE TABLE (Named
Relation) or a VIEW (Unnamed Relation)
 Only Base Tables are physically stored
 VIEWS are derived from BASE TABLES with
SQL instructions like: [SELECT .. FROM ..
WHERE .. ORDER BY]
 A relational database is a collection of tables
o Each entity is stored in one table
o Attributes are fields (columns) in the table
 Order of rows and columns is immaterial
 Entries with repeating groups are said to be
un-normalized
 Entries are single-valued
 Each column (field or attribute) has a distinct
name
All values in a column represent the same attribute
and have the same data format
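The key and integrity properties listed above can be tried out with a small sketch. It uses Python's built-in sqlite3 module. The tables department and student, their columns, and the sample rows are assumptions made for illustration and do not come from the handout.

```python
import sqlite3

# Hypothetical schema for illustration: table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,   -- primary key of department
        name    TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,                        -- primary key of student
        name       TEXT NOT NULL,
        dept_id    INTEGER REFERENCES department(dept_id)      -- foreign key, may be NULL
    )""")

conn.execute("INSERT INTO department VALUES (1, 'Computer Science')")
conn.execute("INSERT INTO student VALUES (100, 'Abebe', 1)")    # matches a primary key value
conn.execute("INSERT INTO student VALUES (101, 'Sara', NULL)")  # NULL foreign key is allowed


def try_insert(sql):
    """Return True if the statement succeeds, False if an integrity rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second row with the same primary key breaks the uniqueness of tuples.
duplicate_key_ok = try_insert("INSERT INTO student VALUES (100, 'Copy', 1)")
# A foreign key value with no matching primary key breaks referential integrity.
dangling_fk_ok = try_insert("INSERT INTO student VALUES (102, 'Lost', 99)")
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Both rejected inserts leave the student table with its two valid rows, which is the behaviour the primary key and referential integrity rules describe.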
Building Blocks of the Relational Data Model
The building blocks of the relational data model
are:
 Entities: real world physical or logical object
 Attributes: properties used to describe each
Entity or real world object.
 Relationship: the association between Entities
 Constraints: rules that should be obeyed while
manipulating the data.
1. The ENTITIES (persons, places, things etc.)
which the organization has to deal with.
Relations can also describe relationships
The name given to an entity should always
be a singular noun descriptive of each item
to be stored in it. E.g.: student NOT
students.
Every relation has a schema, which
describes the columns, or fields. The relation
itself corresponds to our familiar notion of a
table:
A relation is a collection of tuples, each of
which contains values for a fixed number of
attributes
 Existence Dependency: the dependence of
an entity on the existence of one or more
entities.
 Weak entity : an entity that can not exist
without the entity with which it has a
relationship – it is indicated by a double
rectangle
2. The ATTRIBUTES - the items of information
which characterize and describe these
entities.
Attributes are pieces of information ABOUT
entities. The analysis must of course
identify those which are actually relevant to
the proposed application. Attributes will
give rise to recorded items of data in the
database
At this level we need to know such things
as:
 Attribute name (be explanatory words
or phrases)
 The domain from which attribute
values are taken (a DOMAIN is a set
of values from which attribute values
may be taken). For example, the domain
of Name is string and that of Salary is real
 Whether the attribute is part of the
entity identifier (attributes which just
describe an entity and those which
help to identify it uniquely)
 Whether it is permanent or time-varying
(which attributes may change
their values over time)
 Whether it is required or optional for
the entity (whose values will
sometimes be unknown or irrelevant)
Types of Attributes
(1) Simple (atomic) Vs Composite
attributes
 Simple : contains a single value (not
divided into sub parts)
E.g. Age, gender
 Composite: Divided into sub parts
(composed of other attributes)
E.g. Name, address
(2) Single-valued Vs multi-valued
attributes
 Single-valued : have only a single
value (the value may change but
has only one value at one time)
E.g. Name, Sex, Id. No.,
color_of_eyes
 Multi-Valued: have more than one
value
E.g. Address, dependent-name
Person may have several college
degrees
(3) Stored vs. Derived Attribute
 Stored : not possible to derive or
compute
E.g. Name, Address
 Derived: The value may be derived
(computed) from the values of other
attributes.
E.g. Age (current year – year of
birth)
Length of employment (current
date- start date)
Profit (earning-cost)
G.P.A (grade point/credit hours)
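The derived attributes above can be sketched as small functions that compute a value from stored attributes instead of storing it. The function and parameter names are hypothetical. The age formula follows the handout's simplification (current year minus year of birth), which ignores whether the birthday has already passed.

```python
from datetime import date


def age(year_of_birth, today=None):
    """Age derived as current year minus year of birth (the handout's formula)."""
    today = today or date.today()
    return today.year - year_of_birth


def length_of_employment_days(start_date, today=None):
    """Length of employment derived as current date minus start date, in days."""
    today = today or date.today()
    return (today - start_date).days


def profit(earning, cost):
    """Profit derived as earning minus cost."""
    return earning - cost


def gpa(total_grade_points, total_credit_hours):
    """G.P.A derived as grade points divided by credit hours."""
    return total_grade_points / total_credit_hours
```

Only the inputs (year of birth, start date, earning, cost, grade points, credit hours) would be stored. The derived values are recomputed whenever they are needed.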
(4) Null Values
 NULL applies to attributes which are
not applicable or which do not have
values.
 You may enter the value NA
(meaning not applicable)
 Value of a key attribute cannot be
null.
Default value: the value assumed if no
explicit value is given
Entity versus Attributes
When designing the conceptual specification
of the database, one should pay attention to
the distinction between an Entity and an
Attribute.
 Consider designing a database of
employees for an organization:
 Should address be an attribute of
Employees or an entity (connected to
Employees by a relationship)?
 If we have several addresses per
employee, address must be an
entity (attributes cannot be set-
valued/multi valued)
 If the structure (city, Woreda, Kebele, etc)
is important, e.g. want to retrieve
employees in a given city, address must be
modeled as an entity (attribute values are
atomic)
3. The RELATIONSHIPS between entities
which exist and must be taken into account
when processing information. In any
business processing one object may be
associated with another object due to some
event. Such kind of association is what we
call a RELATIONSHIP between entity
objects.
 One external event or process may affect
several related entities.
 Related entities require setting of LINKS
from one part of the database to another.
 A relationship should be named by a word
or phrase which explains its function
 Role names are different from the names
of entities forming the relationship: one
entity may take on many roles, the same
role may be played by different entities
 For each RELATIONSHIP, one can talk
about the Number of Entities and the
Number of Tuples participating in the
association. These two concepts are called
DEGREE and CARDINALITY of a
relationship respectively.
Degree of a Relationship
 An important point about a relationship is
how many entities participate in it. The
number of entities participating in a
relationship is called the DEGREE of the
relationship.
Among the Degrees of relationship, the
following are the basic:
o UNARY/RECURSIVE RELATIONSHIP:
Tuples/records of a single entity are
related with each other.
o BINARY RELATIONSHIP: Tuples/records
of two entities are associated in a
relationship
o TERNARY RELATIONSHIP: Tuples/records
of three different entities are associated
o And a generalized one:
 N-ARY RELATIONSHIP: Tuples from
an arbitrary number of entity sets
participate in a relationship.
Cardinality of a Relationship
 Another important concept about
relationship is the number of
instances/tuples that can be associated
with a single instance from one entity in a
single relationship. The number of
instances participating or associated with
a single instance from an entity in a
relationship is called the CARDINALITY of
the relationship. The major cardinalities of
a relationship are:
o ONE-TO-ONE: one tuple is associated
with only one other tuple.
 E.g. Building – Location as a single
building will be located in a single
location and as a single location will
only accommodate a single
Building.
o ONE-TO-MANY, one tuple can be
associated with many other tuples, but
not the reverse.
 E.g. Department-Student as one
department can have multiple
students.
o MANY-TO-ONE, many tuples are
associated with one tuple but not the
reverse.
 E.g. Employee – Department: as
many employees belong to a single
department.
o MANY-TO-MANY: one tuple is
associated with many other tuples and,
from the other side, with a different
role name, one tuple will be associated
with many tuples
 E.g. Student – Course as a student
can take many courses and a single
course can be attended by many
students.
4. Relational Constraints/Integrity Rules
 Relational Integrity
 Domain Integrity: No value of the
attribute should be beyond the
allowable limits
 Entity Integrity: In a base relation,
no attribute of a Primary Key can
assume a value of NULL

 Referential Integrity: If a Foreign
Key exists in a relation, either the
Foreign Key value must match a
Candidate Key value in its home
relation or the Foreign Key value
must be NULL
 Enterprise Integrity: Additional
rules specified by the users or
database administrators of a
database are incorporated
 Key constraints
If tuples need to be unique in the
database, then we need to make each
tuple distinct. To do this we need to have
relational keys that uniquely identify each
tuple.
Super Key: an attribute or set of
attributes that uniquely identifies a
tuple within a relation.
Candidate Key: a super key such that no
proper subset of that collection is a
Super Key within the relation.
A candidate key has two properties:
1. Uniqueness
2. Irreducibility
If a super key has only one
attribute, it is automatically a
Candidate Key.
If a candidate key consists of more
than one attribute it is called a
Composite Key.
Primary Key: the candidate key that is
selected to identify tuples uniquely
within the relation.
The entire set of attributes in a
relation can be considered as a
primary key in the worst case.
Foreign Key: an attribute, or set of
attributes, within one relation that
matches the candidate key of some
relation.
A foreign key is a link between
different relations to create the view or
the unnamed relation
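The super key and candidate key definitions above (uniqueness plus irreducibility) can be sketched as a search over attribute subsets. The function names and the sample rows are assumptions made for illustration. A key is a property of the schema, that is, of every legal state, so checking one set of rows can only suggest candidate keys or rule them out.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs (uniqueness)."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the super keys that contain no smaller super key (irreducibility)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is reducible
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical relation instance.
students = [
    {"id": 1, "email": "a@x.org", "dept": "CS", "year": 1},
    {"id": 2, "email": "b@x.org", "dept": "CS", "year": 1},
    {"id": 3, "email": "c@x.org", "dept": "IT", "year": 1},
]
found = candidate_keys(students, ["id", "email", "dept", "year"])
```

Here ("id", "dept") is a super key but not a candidate key, because "id" alone is already unique. One of the candidate keys found would then be chosen as the primary key.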
 Relational Views
Relations are perceived as tables from the
users’ perspective. Actually, there are two
kinds of relation in a relational database. The
two categories or types of Relations are
Named and Unnamed Relations. The basic
difference is in how the relation is created,
used and updated:
1. Base Relation
A Named Relation corresponding to an
entity in the conceptual schema, whose
tuples are physically stored in the
database.
2. View (Unnamed Relation)
A View is the dynamic result of one or
more relational operations operating on
the base relations to produce another
virtual relation that does not actually
exist as presented. So a view is virtually
derived relation that does not necessarily
exist in the database but can be produced
upon request by a particular user at the
time of request. The virtual table or
relation can be created from single or
different relations by extracting some
attributes and records with or without
conditions.
Purpose of a view
 Hides unnecessary information from
users: since only part of the base
relation (Some collection of attributes,
not necessarily all) are to be included
in the virtual table.
 Provide powerful flexibility and
security: since unnecessary
information will be hidden from the
user there will be some sort of data
security.
 Provide customized view of the
database for users: each user works
with their own preferred data set and
format by making use of the Views.
 A view of one base relation can be
updated.
 Update on views derived from various
relations is not allowed since it may
violate the integrity of the database.
 Update on view with aggregation and
summary is not allowed, since
aggregation and summary results are
computed from a base relation and
do not actually exist as stored data.
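The following sketch illustrates two of the points above with Python's built-in sqlite3 module: a view that hides a column from users, and an aggregate view whose result is recomputed from the base table on every request. The employee table, the view names and the sample rows are assumptions made for illustration. SQLite itself treats every view as read-only, so the sketch only shows the rejected update on the aggregate view and does not demonstrate updating a single-table view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table and data.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Almaz", "Sales", 5000.0), (2, "Bekele", "Sales", 7000.0), (3, "Chaltu", "IT", 9000.0)],
)

# A view that hides the salary column from its users.
conn.execute("CREATE VIEW employee_public AS SELECT id, name, dept FROM employee")
# A view built with aggregation; its rows exist only as a computed result.
conn.execute(
    "CREATE VIEW dept_summary AS "
    "SELECT dept, COUNT(*) AS staff, AVG(salary) AS avg_salary FROM employee GROUP BY dept"
)

public_columns = [d[0] for d in conn.execute("SELECT * FROM employee_public").description]

# The view is dynamic: a change to the base table shows up on the next request.
before = conn.execute("SELECT staff FROM dept_summary WHERE dept = 'IT'").fetchone()[0]
conn.execute("INSERT INTO employee VALUES (4, 'Dawit', 'IT', 11000.0)")
after = conn.execute("SELECT staff FROM dept_summary WHERE dept = 'IT'").fetchone()[0]

# Updating the aggregate view is rejected.
try:
    conn.execute("UPDATE dept_summary SET staff = 10 WHERE dept = 'IT'")
    aggregate_view_updatable = True
except sqlite3.Error:
    aggregate_view_updatable = False
```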
Schemas and Instances and Database
State
When a database is designed using a Relational
data model, all the data is represented in a form of
a table. In such definitions and representation,
there are two basic components of the database.
The two components are the definition of the
Relation or the Table and the actual data stored in
each table. The data definition is what we call the
Schema or the skeleton of the database and the
Relations with some information at some point in
time is the Instance or the flesh of the database.
Schemas
 Schema describes how data is to be structured,
defined at setup/Design time (also called
"metadata")

 Since it is defined during the database
development phase, the schema rarely changes
unless system maintenance demands a change to
the definition of a relation.
 Database Schema (Intension): specifies
name of relation and the collection of the
attributes (specifically the Name of attributes).
 refer to a description of database (or
intension)
 specified during database design
 should not be changed unless during
maintenance
 Schema Diagrams
 convention to display some aspect of a
schema visually
 Schema Construct
 refers to each object in the schema (e.g.
STUDENT)
E.g.: STUDENT
(FName,LName,Id,Year,Dept,Sex)
Instances
 Instance: is the collection of data in the
database at a particular point of time (snap-
shot).
 Also called State or Snap Shot or
Extension of the database
 Refers to the actual data in the database at
a specific point in time
 State of database is changed any time we
add, delete or update an item.
 Valid state: the state that satisfies the
structure and constraints specified in the
schema and is enforced by DBMS
 Since Instance is actual data of database at
some point in time, changes rapidly
 To define a new database, we specify its
database schema to the DBMS (database is
empty)
 database is initialized when we first load it with
data
Database Design
Database design is the process of producing the
different kinds of specification for the data to be
stored in the database. It is one of the middle
phases of information systems development when
the system uses a database approach. In the design
phase we describe how the data should be
perceived at different levels and, finally, how it is
going to be stored in a computer system.
Information System with Database application
consists of several tasks which include:
 Planning of Information systems Design
 Requirements Analysis,
 Design (Conceptual, Logical and Physical
Design)
 Tuning
 Implementation
 Operation and Support
From these different phases, the prime interest of a
database system will be the Design part which is
again sub divided into other three sub-phases.
These sub-phases are:
1. Conceptual Design
2. Logical Design, and
3. Physical Design
 In general, one has to go back and forth
between these tasks to refine a database
design, and decisions in one task can influence
the choices in another task.
 In developing a good design, one should answer
such questions as:
 What are the relevant Entities for the
Organization
 What are the important features of each
Entity
 What are the important Relationships
 What are the important queries from the
user
 What are the other requirements of the
Organization and the Users
The Three levels of Database Design

[Figure: Conceptual Design, then Logical Design, then Physical Design]
Conceptual Database Design
 Conceptual design is the process of
constructing a model of the information used in
an enterprise, independent of any physical
considerations.
 It is the source of information for the logical
design phase.
 Mostly uses an Entity Relationship Model to
describe the data at this level.
 After the completion of Conceptual Design one
has to go for refinement of the schema, which
is verification of Entities, Attributes, and
Relationships
Logical Database Design
 Logical design is the process of constructing a
model of the information used in an enterprise
based on a specific data model (e.g. relational,
hierarchical or network or object), but
independent of a particular DBMS and other
physical considerations.
 Normalization process
 Collection of Rules to be maintained
 Discover new entities in the process
 Revise attributes based on the rules and
the discovered Entities
Physical Database Design
 Physical design is the process of producing a
description of the implementation of the
database on secondary storage. -- defines
specific storage or access methods used by
database
 Describes the storage structures and access
methods used to achieve efficient access to
the data.
 Tailored to a specific DBMS system --
Characteristics are function of DBMS and
operating systems
 Includes estimate of storage space
Conceptual Database Design
 Conceptual design revolves around discovering
and analyzing organizational and user data
requirements
 The important activities are to identify
 Entities
 Attributes
 Relationships
 Constraints
 And based on these components develop the
ER model using
 ER diagrams
The Entity Relationship (E-R) Model
 Entity-Relationship modeling is used to
represent conceptual view of the database
 The main components of ER Modeling are:
o Entities
 Corresponds to entire table, not row
 Represented by Rectangle
o Attributes
 Represents the property used to
describe an entity or a relationship
 Represented by Oval
o Relationships
 Represents the association that exist
between entities
 Represented by Diamond
o Constraints
 Represent the constraint in the data
Before working on the conceptual design of
the database, one has to know and answer
the following basic questions.
 What are the entities and relationships in the
enterprise?
 What information about these entities and
relationships should we store in the database?
 What are the integrity constraints that hold?
Constraints on each data with respect to
update, retrieval and store.
 Represent this information pictorially in ER
diagrams, then map ER diagram into a
relational schema.
Developing an E-R Diagram
 Designing conceptual model for the database is not
a linear process but an iterative activity where
the design is refined again and again.
 To identify the entities, attributes, relationships,
and constraints on the data, different methods are
used during the analysis phase. These
include information gathered by:
 Interviewing end users individually and in a
group
 Questionnaire survey
 Direct observation
 Examining different documents
 The basic E-R model is graphically depicted and
presented for review.
 The process is repeated until the end users and
designers agree that the E-R diagram is a fair
representation of the organization’s activities and
functions.
 Checking for Redundant Relationships in the ER
Diagram. Relationships between entities indicate
access from one entity to another - it is therefore
possible to access one entity occurrence from
another entity occurrence even if there are other
entities and relationships that separate them - this
is often referred to as 'navigation' of the ER
diagram
 The last phase in ER modeling is validating an ER
Model against requirement of the user.
Graphical Representations in ER
Diagramming

 Entity is represented by a RECTANGLE
containing the name of the entity.
[Figure: Strong Entity and Weak Entity symbols]
 Connected entities are called relationship
participants
 Attributes are represented by OVALS and are
connected to the entity by a line.
[Figure: ovals for Multi-valued Attribute, Composite Attribute and Attribute]
 A derived attribute is indicated by a DOTTED
LINE. (……..)
 PRIMARY KEYS are underlined.
 Relationships are represented by DIAMOND
shaped symbols
 Weak Relationship is a relationship between
Weak and Strong Entities
 Strong Relationship is a relationship
between two strong Entities
[Figure: diamond symbols for Strong Relationship and Weak Relationship]
Example 1: Build an ER Diagram for the following
information:
 A student record management system will have
the following two basic data object categories
with their own features or properties: Students
will have an Id, Name, Dept, Age, GPA and
Course will have an Id, Name, Credit Hours
 Whenever a student enrolls in a course in a
specific Academic Year and Semester, the
Student will have a grade for the course
[ER diagram: Students (Id, Name, Dept, DoB, Age, Gpa) and Courses (Id, Name, Credit) are linked by the relationship Enrolled_In, which has the attributes Academic Year, Semester and Grade]
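One possible way to map Example 1 onto relational tables is sketched below with Python's built-in sqlite3 module. The many-to-many relationship Enrolled_In becomes its own table whose primary key combines the two foreign keys with the academic year and semester. The exact column names, data types and sample rows are assumptions. The derived attribute Age is left out and DoB is stored instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    dept TEXT,
    dob  TEXT,
    gpa  REAL
);
CREATE TABLE course (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    credit_hours INTEGER
);
CREATE TABLE enrolled_in (
    student_id    INTEGER NOT NULL REFERENCES student(id),
    course_id     INTEGER NOT NULL REFERENCES course(id),
    academic_year INTEGER NOT NULL,
    semester      INTEGER NOT NULL,
    grade         TEXT,
    PRIMARY KEY (student_id, course_id, academic_year, semester)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO student VALUES (1, 'Abebe', 'IT', '2001-05-14', 3.4)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(10, "Database", 4), (20, "Networking", 3)],
)
conn.executemany(
    "INSERT INTO enrolled_in VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 2015, 1, "A"), (1, 20, 2015, 1, "B")],
)

# Grades of student 1, found by matching key values across the three tables.
grades = conn.execute("""
    SELECT c.name, e.grade
    FROM enrolled_in AS e
    JOIN course AS c ON c.id = e.course_id
    WHERE e.student_id = 1
    ORDER BY c.name
""").fetchall()
```

Placing Grade, Academic Year and Semester on enrolled_in reflects that they describe the enrollment, not the student or the course alone.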

Example 2: Build an ER Diagram for the following
information:
 A Personnel record management system will
have the following two basic data object
categories with their own features or
properties: Employee will have an Id, Name,
DoB, Age, Tel and Department will have an Id,
Name, Location
 Whenever an Employee is assigned in one
Department, the duration of his stay in the
respective department should be registered.
Structural Constraints on Relationship
1. Constraints on Relationship / Multiplicity /
Cardinality Constraints
 Multiplicity constraint is the number or range of
possible occurrence of an entity type/relation that
may relate to a single occurrence/tuple of an entity
type/relation through a particular relationship.
 Mostly used to ensure appropriate enterprise
constraints.
One-to-one relationship:
 A customer is associated with at most one loan via
the relationship borrower
 A loan is associated with at most one customer via
borrower

E.g.: Relationship Manages between STAFF and
BRANCH
The multiplicity of the relationship is:
 One branch can only have one manager
 One employee could manage either one or no
branches
[Diagram: Employee (1..1) Manages (0..1) Branch]
One-To-Many Relationships

Debre Markos Polytechnic college Dessie A.


56
Design a Database

 In the one-to-many relationship a loan is associated


with at most one customer via borrower, a
customer is associated with several (including 0)
loans via borrower

E.g.: Relationship Leads between STAFF and
PROJECT
The multiplicity of the relationship is:
 One staff may lead one or more project(s)
 One project is led by one staff
[Diagram: Employee (1..1) Leads (0..*) Project]
Many-To-Many Relationship
 A customer is associated with several (possibly 0)
loans via borrower
 A loan is associated with several (possibly 0)
customers via borrower

E.g.: Relationship Teaches between
INSTRUCTOR and COURSE
The multiplicity of the relationship is:
 One Instructor teaches one or more
Course(s)
 One Course is taught by zero or more
Instructor(s)
[Diagram: Instructor (0..*) Teaches (1..*) Course]

Participation of an Entity Set in a
Relationship Set
Participation constraint of a relationship is involved
in identifying and setting the mandatory or optional
feature of an entity occurrence to take a role in a
relationship. There are two distinct participation
constraints with this respect, namely: Total
Participation and Partial Participation
 Total participation: every tuple in the entity or
relation participates in at least one relationship by
taking a role. This means, every tuple in a relation
will be attached with at least one other tuple. The
entity with total participation in a relationship will
be connected to the relationship using a double
line.
 Partial participation: some tuple in the entity or
relation may not participate in the relationship.
This means, there is at least one tuple from that
Relation not taking any role in that specific
relationship. The entity with partial participation in
a relationship will be connected to the relationship
using a single line.
 E.g. 1: Participation of EMPLOYEE in “belongs
to” relationship with DEPARTMENT is total

since every employee should belong to a
department.
Participation of DEPARTMENT in “belongs to”
relationship with EMPLOYEE is total since
every department should have more than
one employee.
Employee Belongs To Department

 E.g. 2: Participation of EMPLOYEE in “manages”


relationship with DEPARTMENT, is partial
participation since not all employees are
managers.
Participation of DEPARTMENT in “Manages”
relationship with EMPLOYEE is total since
every department should have a manager.
Employee Manages Department
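The difference between total and partial participation can be checked mechanically for a given set of occurrences. The sketch below assumes a relationship stored as a list of (employee, department) pairs. The function name and the sample identifiers are hypothetical.

```python
def participation(entity_ids, relationship_pairs, position):
    """Classify participation of an entity in a relationship.

    position is 0 or 1: which side of each pair the entity occupies.
    Returns "total" if every entity occurrence takes part in at least
    one pair, otherwise "partial".
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_ids) <= participating else "partial"


# Hypothetical occurrences for the Manages relationship of E.g. 2.
employees = ["E1", "E2", "E3"]
departments = ["D1", "D2"]
manages = [("E1", "D1"), ("E3", "D2")]  # E2 manages no department

employee_side = participation(employees, manages, 0)
department_side = participation(departments, manages, 1)
```

With this data the EMPLOYEE side is partial (E2 is not a manager) and the DEPARTMENT side is total (every department has a manager), which matches E.g. 2.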

Problem in ER Modeling
The Entity-Relationship Model is a conceptual data
model that views the real world as consisting of
entities and relationships. The model visually
represents these concepts by the Entity-
Relationship diagram. The basic constructs of the
ER model are entities, relationships, and attributes.
Entities are concepts, real or abstract, about which
information is collected. Relationships are
associations between the entities. Attributes are
properties which describe the entities.
While designing the ER model one could face a
design problem called a connection
trap. Connection traps are problems arising
from misinterpreting certain relationships.
There are two types of connection traps:
1. Fan trap:
Occurs where a model represents a relationship
between entity types, but the pathway between
certain entity occurrences is ambiguous.
May exist where two or more one-to-many (1:M)
relationships fan out from an entity. The
problem could be avoided by restructuring the
model so that there would be no 1:M
relationships fanning out from a single entity
and all the semantics of the relationship is
preserved.
Example:
[Diagram: Employee (1..*) Works For (1..1) Branch (1..1) IsAssigned (1..*) Car]
Semantic description of the problem:
[Figure: occurrence diagram linking employees Emp1 to Emp7, branches Bra1 to Bra4 and cars Car1 to Car7]

Problem: Which car (Car1 or Car3 or Car5) is used
by Employee 6 (Emp6) working in Branch 1 (Bra1)?
Thus from this ER Model one can not tell which car
is used by which staff since a branch can have
more than one car and also a branch is populated
by more than one employee. Thus we need to
restructure the model to avoid the connection trap.
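The ambiguity can be reproduced with a few lines of Python. In this sketch only the facts that Emp6 works in Bra1 and that Car1, Car3 and Car5 belong to Bra1 come from the example. Every other pairing, and the names works_for, is_assigned and used_by, are assumptions made for illustration.

```python
# Fan-trap structure: both relationships fan out from Branch.
works_for = {"Emp5": "Bra1", "Emp6": "Bra1", "Emp7": "Bra2"}       # employee -> branch
is_assigned = {"Car1": "Bra1", "Car3": "Bra1", "Car5": "Bra1",     # car -> branch
               "Car2": "Bra2"}


def cars_reachable(employee):
    """Cars reachable from an employee by navigating through the branch."""
    branch = works_for[employee]
    return sorted(car for car, b in is_assigned.items() if b == branch)


# Restructured model: Branch Has Car, Car Used By Employee.
used_by = [("Car1", "Emp5"), ("Car3", "Emp6"), ("Car5", "Emp6")]   # assumed pairs


def cars_used_by(employee):
    """Cars linked directly to an employee in the restructured model."""
    return sorted(car for car, emp in used_by if emp == employee)


ambiguous = cars_reachable("Emp6")
resolved = cars_used_by("Emp6")
```

In the first structure the path through the branch returns every car of Bra1 and cannot say which one Emp6 uses. In the restructured model the direct link answers the question.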
To avoid the Fan Trap problem we can go for
restructuring of the E-R Model. This will result in the
following E-R Model.
[Diagram: Branch (1..1) Has (1..*) Car (1..*) Used By (1..*) Employee]
Semantic description:
[Figure: occurrence diagram linking branches Bra1 to Bra4, cars Car1 to Car7 and employees Emp1 to Emp7]

2. Chasm Trap:
Occurs where a model suggests the existence
of a relationship between entity types, but the
path way does not exist between certain entity
occurrences.
May exist when there are one or more
relationships with a minimum multiplicity or
cardinality of zero forming part of the pathway
between related entities.
Example:
[Diagram: Branch (1..1) Has (1..*) Employee (0..1) Manages (0..*) Project]
If we have a set of projects that are not currently
active, then we cannot assign a project
manager to these projects. So there are projects
with no project manager, making the
participation have a minimum value of zero.
Problem:
How can we identify which BRANCH is
responsible for which PROJECT? We know that
whether the PROJECT is active or not there is a
responsible BRANCH. But which branch is a
question to be answered, and since we have a
minimum participation of zero between
employee and PROJECT we can’t identify the
BRANCH responsible for each PROJECT.
The solution for this Chasm Trap problem is to
add another relationship between the extreme
entities (BRANCH and PROJECT)
[Diagram: Branch (1..1) Has (1..*) Employee (0..1) Manages (0..*) Project, plus Branch (1..1) Responsible for (1..*) Project]

Database Normalization Basics


If you've been working with databases for a while, chances are
you've heard the term normalization. Perhaps someone's asked
you "Is that database normalized?" or "Is that in BCNF?" All too
often, the reply is "Uh, yeah." Normalization is often brushed
aside as a luxury that only academics have time for. However,
knowing the principles of normalization and applying them to
your daily database design tasks really isn't all that complicated
and it could drastically improve the performance of your DBMS.

In this article, we'll introduce the concept of normalization and


take a brief look at the most common normal forms. Future
articles will provide in-depth explorations of the normalization
process.
What is Normalization?
Normalization is the process of efficiently organizing data in a
database. There are two goals of the normalization process:
eliminating redundant data (for example, storing the same data
in more than one table) and ensuring data dependencies make
sense (only storing related data in a table). Both of these are
worthy goals as they reduce the amount of space a database
consumes and ensure that data is logically stored.
The Normal Forms
The database community has developed a series of guidelines for
ensuring that databases are normalized. These are referred to as
normal forms and are numbered from one (the lowest form of
normalization, referred to as first normal form or 1NF) through
five (fifth normal form or 5NF). In practical applications, you'll
often see 1NF, 2NF, and 3NF along with the occasional 4NF.
Fifth normal form is very rarely seen and won't be discussed in
this article.

Before we begin our discussion of the normal forms, it's


important to point out that they are guidelines and guidelines
only. Occasionally, it becomes necessary to stray from them to
meet practical business requirements. However, when variations
take place, it's extremely important to evaluate any possible
ramifications they could have on your system and account for
possible inconsistencies. That said, let's explore the normal
forms.
Modification Anomalies
In relational database design, we not only want to create a
structure that stores all of the data, but we also want to do it in a
way that minimizes potential errors when we work with the data.
The default language for accessing data from a relational
database is SQL. In particular, SQL can be used to manipulate
data in the following ways: insert new data, delete unwanted
data, and update existing data. Similarly, in an un-normalized
design, there are 3 problems that can occur when we work with
the data:
INSERT ANOMALY: This refers to the situation when it is
impossible to insert certain types of data into the database.
DELETE ANOMALY: The deletion of data leads to
unintended loss of additional data, data that we had wished to
preserve.
UPDATE ANOMALY: This refers to the situation where
updating the value of a column leads to database inconsistencies
(i.e., different rows on the table have different values).
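The update and delete anomalies can be seen in a small un-normalized table. The sketch below is in Python. The table layout (employee, department and department location in one row) and all values are assumptions made for illustration.

```python
# Hypothetical un-normalized table: the department location is repeated
# in every employee row.
rows = [
    {"emp": "Emp1", "dept": "Sales", "dept_location": "Dessie"},
    {"emp": "Emp2", "dept": "Sales", "dept_location": "Dessie"},
    {"emp": "Emp3", "dept": "IT", "dept_location": "Debre Markos"},
]


def inconsistent_departments(table):
    """Departments that appear with more than one location."""
    locations = {}
    for row in table:
        locations.setdefault(row["dept"], set()).add(row["dept_location"])
    return sorted(d for d, locs in locations.items() if len(locs) > 1)


def known_departments(table):
    """Departments the table still has any information about."""
    return sorted({row["dept"] for row in table})


# UPDATE ANOMALY: the location is changed in only one of the Sales rows.
updated = [dict(r) for r in rows]
updated[0]["dept_location"] = "Addis Ababa"

# DELETE ANOMALY: removing the only IT employee also removes the IT location.
after_delete = [r for r in rows if r["emp"] != "Emp3"]
```

An insert anomaly follows from the same layout: a new department with no employees yet has no row in which its location could be recorded.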
To address the 3 problems above, we go through the process of
normalization. When we go through the normalization process,
we increase the number of tables in the database, while
decreasing the amount of data stored in each table. There are
several different levels of database normalization:
 1st Normal Form (1NF)
 2nd Normal Form (2NF)
 3rd Normal Form (3NF)
 Boyce-Codd Normal Form (BCNF)
 4th Normal Form (4NF)
 5th Normal Form (5NF)
The opposite of normalization is denormalization, where we
want to combine multiple tables together into a larger table.
Denormalization is most frequently associated with designing
the fact table in a data warehouse.
1st Normal Form Definition
A table is in first normal form if it satisfies the following
conditions:
• It contains only atomic values
• There are no repeating groups
An atomic value is a value that cannot be divided. For example,
in [TABLE_PRODUCT] from the 1st Normal Form example below,
the value in the [Color] column of the first row can be divided
into "red" and "green"; hence [TABLE_PRODUCT] is not in 1NF.
A repeating group means that a table contains two or more
columns that store the same kind of attribute. For example, a
table that records data on a book and its author(s) with the
following columns: [Book ID], [Author 1], [Author 2], [Author 3]
is not in 1NF because [Author 1], [Author 2], and [Author 3] all
repeat the same attribute.
1st Normal Form Example
How do we bring an unnormalized table into first normal form?
Consider the following example:

This table is not in first normal form because the [Color] column
can contain multiple values. For example, the first row includes
values "red" and "green."
To bring this table to first normal form, we split the table into
two tables and now we have the resulting tables:

Now first normal form is satisfied, as the columns in each table
all hold just one value.
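The split described above can be sketched in a few lines of Python. Only the "red" and "green" values in the first row come from the text; the column names, the price column, and the remaining rows are assumptions made for illustration.

```python
# Hypothetical rows for [TABLE_PRODUCT]; only the "red, green" value of the
# first row is taken from the text, the rest is illustrative.
unnormalized = [
    {"product_id": 1, "color": "red, green", "price": 15.99},
    {"product_id": 2, "color": "yellow", "price": 23.99},
]


def to_first_normal_form(rows):
    """Split rows into a product table and a product-color table."""
    products = [
        {"product_id": row["product_id"], "price": row["price"]} for row in rows
    ]
    # One row per (product, color) pair, so every value is atomic.
    product_colors = [
        {"product_id": row["product_id"], "color": color.strip()}
        for row in rows
        for color in row["color"].split(",")
    ]
    return products, product_colors


products, product_colors = to_first_normal_form(unnormalized)
for row in product_colors:
    print(row)
```

Each product now appears once in the product table, and the color table holds one atomic color per row.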
2nd Normal Form Definition
A table is in second normal form if it satisfies the following
conditions:
• It is in first normal form
• All non-key attributes are fully functionally dependent on the
primary key
In a table, if attribute B is functionally dependent on A, but is
not functionally dependent on any proper subset of A, then B is
considered fully functionally dependent on A. Hence, in a 2NF
table, no non-key attribute can depend on only a part of
the primary key. Note that if the primary key is not a composite
key, all non-key attributes are always fully functionally dependent
on the primary key. A table that is in 1st normal form and
has a single-column primary key is automatically in
2nd normal form.
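The definition of full functional dependency can be tested against sample data. The sketch below is only a check on the rows it is given (real dependencies come from the business rules, not from a sample), and the order-item table and its columns are hypothetical.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always imply equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[column] for column in lhs)
        right = tuple(row[column] for column in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


def is_fully_dependent(rows, key, attribute):
    """True if attribute depends on the whole key and on no proper subset of it."""
    if not determines(rows, key, [attribute]):
        return False
    proper_subsets = (
        subset for size in range(1, len(key)) for subset in combinations(key, size)
    )
    return not any(determines(rows, subset, [attribute]) for subset in proper_subsets)


# Hypothetical table with the composite key (order_id, product_id).
order_items = [
    {"order_id": 1, "product_id": 10, "quantity": 2, "product_name": "Pen"},
    {"order_id": 1, "product_id": 11, "quantity": 1, "product_name": "Ink"},
    {"order_id": 2, "product_id": 10, "quantity": 5, "product_name": "Pen"},
]
key = ["order_id", "product_id"]

quantity_ok = is_fully_dependent(order_items, key, "quantity")
name_ok = is_fully_dependent(order_items, key, "product_name")
print(quantity_ok, name_ok)
```

Here quantity needs the whole key, while product_name is already determined by product_id alone, so the table would not be in 2NF.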
2nd Normal Form Example
Consider the following example:

This table has a composite primary key [Customer ID, Store ID].
The non-key attribute is [Purchase Location]. In this case,
[Purchase Location] only depends on [Store ID], which is only
part of the primary key. Therefore, this table does not satisfy
second normal form.
To bring this table to second normal form, we break the table
into two tables, and now we have the following:

What we have done is to remove the partial functional
dependency that we initially had. Now, in the table
[TABLE_STORE], the column [Purchase Location] is fully
dependent on the primary key of that table, which is [Store ID].
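As a rough illustration of this split, the sketch below builds the two tables with Python's sqlite3 module and joins them back together. The rows are hypothetical, and table_purchase is an assumed name for the table that keeps [Customer ID] and [Store ID]; only [TABLE_STORE] is named in the text.

```python
import sqlite3

# Hypothetical rows: (customer_id, store_id, purchase_location).
original = [
    (1, 1, "Los Angeles"),
    (1, 3, "San Francisco"),
    (2, 1, "Los Angeles"),
    (3, 2, "New York"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_store (
        store_id          INTEGER PRIMARY KEY,
        purchase_location TEXT NOT NULL
    );
    CREATE TABLE table_purchase (
        customer_id INTEGER NOT NULL,
        store_id    INTEGER NOT NULL REFERENCES table_store (store_id),
        PRIMARY KEY (customer_id, store_id)
    );
""")

# Each store's location is stored once; INSERT OR IGNORE skips the repeats.
conn.executemany(
    "INSERT OR IGNORE INTO table_store VALUES (?, ?)",
    [(store, location) for _, store, location in original],
)
conn.executemany(
    "INSERT INTO table_purchase VALUES (?, ?)",
    [(customer, store) for customer, store, _ in original],
)

# Joining the two tables recovers the original rows.
rejoined = conn.execute("""
    SELECT p.customer_id, p.store_id, s.purchase_location
    FROM table_purchase AS p
    JOIN table_store AS s ON s.store_id = p.store_id
    ORDER BY p.customer_id, p.store_id
""").fetchall()
store_count = conn.execute("SELECT COUNT(*) FROM table_store").fetchone()[0]
print(rejoined == sorted(original), store_count)
```

No information is lost by the split: the join returns the same rows, while each location is kept in exactly one place.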
3rd Normal Form Definition
A table is in third normal form if it satisfies the following
conditions:
• It is in second normal form
• There is no transitive functional dependency of a non-key
attribute on the primary key
By transitive functional dependency, we mean we have the
following relationships in the table: B is functionally dependent
on A, and C is functionally dependent on B. In this case, C is
transitively dependent on A via B.
3rd Normal Form Example
Consider the following example:

In the table above, [Book ID] determines [Genre ID], and [Genre
ID] determines [Genre Type]. Therefore, [Book ID] determines
[Genre Type] via [Genre ID] and we have a transitive functional
dependency, so this structure does not satisfy third normal
form.
To bring this table to third normal form, we split the table into
two as follows:

Now all non-key attributes are fully functionally dependent only
on the primary key. In [TABLE_BOOK], both [Genre ID] and
[Price] are only dependent on [Book ID]. In [TABLE_GENRE],
[Genre Type] is only dependent on [Genre ID].
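The effect of this split can be sketched with plain Python dictionaries standing in for [TABLE_BOOK] and [TABLE_GENRE]. All rows and values below are hypothetical.

```python
# Hypothetical rows: (book_id, genre_id, genre_type, price).
books_unnormalized = [
    (1, 1, "Gardening", 25.99),
    (2, 2, "Sports", 14.99),
    (3, 1, "Gardening", 10.00),
]

# [TABLE_BOOK]: book_id -> (genre_id, price)
table_book = {
    book_id: (genre_id, price)
    for book_id, genre_id, _genre_type, price in books_unnormalized
}
# [TABLE_GENRE]: genre_id -> genre_type, stored once per genre
table_genre = {
    genre_id: genre_type
    for _book_id, genre_id, genre_type, _price in books_unnormalized
}


def genre_of(book_id):
    """Look up a book's genre type through its genre_id."""
    genre_id, _price = table_book[book_id]
    return table_genre[genre_id]


# Renaming a genre is now a single change instead of one change per book.
table_genre[1] = "Home and Garden"
print(genre_of(1), genre_of(3))
```

Because the genre type lives in one row only, the update anomaly described earlier cannot occur for it.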

Boyce-Codd Normal Form (BCNF or 3.5NF)
The Boyce-Codd Normal Form, also referred to as the "third and a
half (3.5) normal form", adds one more requirement:
• Meet all the requirements of the third normal form.
• Every determinant must be a candidate key.
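The determinant rule can be checked mechanically once the functional dependencies are written down. The sketch below uses the common formal statement of the rule (the determinant of every non-trivial dependency must determine all attributes of the relation, i.e. be a superkey); the relation and its dependencies are a hypothetical teaching example.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes` under `dependencies`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, dependencies):
    """List the non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in dependencies
        if not set(rhs) <= set(lhs) and closure(lhs, dependencies) != set(relation)
    ]


# Hypothetical relation: each instructor teaches exactly one course, and a
# student has exactly one instructor per course.
relation = ("student", "course", "instructor")
dependencies = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]

violations = bcnf_violations(relation, dependencies)
print(violations)
```

In this example [instructor] determines [course] without being a key, so the relation is not in BCNF even though it satisfies third normal form.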

Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional
requirement:
• Meet all the requirements of the Boyce-Codd normal form.
• A relation is in 4NF if it has no non-trivial multi-valued
dependencies other than those determined by a candidate key.
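A multi-valued dependency shows up when two independent lists are stored for the same key in one table. The sketch below is a simplified check that only handles a table with three columns, and the employee, skill, and language data are hypothetical.

```python
from itertools import product


def satisfies_mvd(rows, x, y, z):
    """True if, for every x value, the (y, z) pairs form a full cross product."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    return all(
        pairs == set(product({a for a, _ in pairs}, {b for _, b in pairs}))
        for pairs in groups.values()
    )


# Hypothetical table: skills and languages are independent facts about an
# employee, so every combination has to be stored.
rows = [
    {"employee": "Abel", "skill": "SQL", "language": "Amharic"},
    {"employee": "Abel", "skill": "SQL", "language": "English"},
    {"employee": "Abel", "skill": "Python", "language": "Amharic"},
    {"employee": "Abel", "skill": "Python", "language": "English"},
]
has_mvd = satisfies_mvd(rows, "employee", "skill", "language")

# 4NF decomposition: one table per independent fact.
employee_skill = {(r["employee"], r["skill"]) for r in rows}
employee_language = {(r["employee"], r["language"]) for r in rows}

# Joining the two tables on employee rebuilds the original rows.
rejoined = {
    (emp, skill, lang)
    for emp, skill in employee_skill
    for emp2, lang in employee_language
    if emp == emp2
}
print(has_mvd, len(employee_skill), len(employee_language))
```

After the split, adding a new skill means inserting one row rather than one row per language.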

NB: Remember, these normalization guidelines are cumulative.
For a database to be in 2NF, it must first fulfill all the criteria of
a 1NF database.

Debre Markos Polytechnic College
Department of Database Administration
Mid exam for Design a Database

I. Write true or false.
1) The main limitation of the database approach is that it requires
technical personnel.
2) A DBMS provides for creating, updating, and storing data
rather than retrieving data in a database.
3) The authorization service provides for creating accounts and
account restrictions in a database.
4) DDL commands are used to set up a database.
5) Metadata is a stored collection of information about a
database.
II. Choose the best answer from the given
alternatives.
1. Which one of the following is not true about a database?
A. A database is organized data that are related.
B. Databases are relevant to every business.
C. A DBMS is a software package which is used to
create and manipulate databases.
D. A database is a storeroom for a
collection of computerized data files.
E. None
2. Which one of the following is not a characteristic of a
database?
A. How to organize data
B. Accessing data is alternatively
C. Efficient and effective data retrieval
D. All
3. Which one of the following is not true about the limitations of
the manual approach?
A. Prone to error
B. Difficult to modify
C. Hard to compile information
D. None
4. Which one of the following is incorrect about the traditional
approach?
A. It is a centralized computerized data
B. Application dependent
C. There is a data duplication problem
D. Limited data sharing
5. Which one of the following is not true about the database
approach?
A. Data can be shared
B. Redundancy can be increased
C. Inconsistency can be avoided
D. Improved decision support
III. Matching
   “A”                     “B”
1. Physical design         A. Concise description of data, data types, and relationships between the data
2. Logical design          B. The structure of the data in a database
3. Conceptual design       C. Storing the database for the selected DBMS
4. Schema                  D. Accessing the database by different users simultaneously
5. Transaction services    E. Implementing the specific data model
IV. Give short answers accordingly.
1. List at least three DBMS facilities.
2. List at least four DBMS components.
3. List the stages of the database development life cycle in sequence.
4. List at least three major database users.
5. Mention and discuss the database administrator’s functions in a
database.
