
Database Management System (DBMS)

Database Management System (DBMS) is a software package that provides EFFICIENT,
CONVENIENT and SAFE MULTI-USER (many people/programs accessing the same database, or even
the same data, simultaneously) storage of and access to MASSIVE amounts of PERSISTENT (data
outlives the programs that operate on it) data. A DBMS also provides a systematic method for creating,
updating, storing and retrieving data in a database, and offers services for controlling data
access, enforcing data integrity, managing concurrency control, and recovery. With this in mind, a
full-scale DBMS should provide at least the following services to the user.
1. Data storage, retrieval and update in the database.
2. A user accessible catalogue.
3. Transaction support service: ALL-or-NONE transactions, which minimize data inconsistency.
4. Concurrency Control Services: simultaneous access and update of the database by different
users should be handled correctly.
5. Recovery Services: a mechanism for recovering the database after a failure must be available.
6. Authorization Services (Security): must support the implementation of access and
authorization services for the database administrator and users.
7. Support for Data Communication: should provide the facility to integrate with data transfer
software or data communication managers.
8. Integrity Services: enforcing rules about the data and the changes made to it, the correctness
and consistency of stored data, and the quality of data based on business constraints.
9. Services to promote data independence between the data and the application.
10. Utility services: a set of utility facilities such as
 Importing data.
 Statistical analysis support.
 Index reorganization.
 Garbage collection.

DBMS and Components of DBMS Environment


A DBMS is a software package used to design, manage, and maintain databases. Each DBMS should
have facilities to define the database, manipulate the content of the database and control the database.
These facilities will help the designer, the user as well as the database administrator to discharge their
responsibility in designing, using and managing the database. It provides the following facilities:
 Data Definition Language (DDL):
 Language used to define each data element required by the organization.
 Commands for setting up the schema, or the intension, of the database.
 These commands are used to set up a database and to create, delete and alter tables, with
facilities for handling constraints.
 Data Manipulation Language (DML):
 Is the core set of commands used by end-users and programmers to store, retrieve, and access
the data in the database, e.g. SQL.
 Since the required data or Query by the user will be extracted using this type of language,
it is also called "Query Language".
 Data Dictionary:
 Due to the fact that a database is a self-describing system, this tool, the Data Dictionary, is
used to store and organize information about the data stored in the database.
 Data Control Language:
 Database is a shared resource that demands control of data access and usage. The database
administrator should have the facility to control the overall operation of the system.
 Data Control Languages are commands that will help the Database Administrator to
control the database.
 The commands include granting or revoking privileges to access the database or particular
objects within the database, and committing (storing) or rolling back (removing) database transactions.
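A minimal SQL sketch of the three language facilities above (the table, column and user names are hypothetical, not taken from the text):

    -- DDL: define a table (the schema, or intension, of part of the database)
    CREATE TABLE Employee (
        EmpId  INT PRIMARY KEY,
        FName  VARCHAR(15),
        Salary DECIMAL(8,2)
    );

    -- DML: store and retrieve data
    INSERT INTO Employee VALUES (1, 'Abebe', 4500.00);
    SELECT FName, Salary FROM Employee WHERE Salary > 4000;

    -- DCL: the DBA grants and revokes privileges on database objects
    GRANT SELECT, INSERT ON Employee TO clerk;
    REVOKE INSERT ON Employee FROM clerk;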
The DBMS is a software package that helps to design, manage, and use data using the database
approach. Taking a DBMS as a system, one can describe it with respect to its environment or other
systems interacting with the DBMS. The DBMS environment has five components. To design and use
a database, there will be the interaction or integration of Hardware, Software, Data, Procedure and
People.
1. Hardware: the components that one can touch and feel. These components comprise
various types of personal computers, mainframe or server computers used in a multi-
user system, network infrastructure, and other peripherals required by the system.
2. Software: the collection of commands and programs used to manipulate the hardware to
perform a function. These include components like the DBMS software, application
programs, operating systems, network software, language software and other relevant
software.
3. Data: since the goal of any database system is to have better control of the data and to make
the data useful, data is the most important component to the user of the database. There are two
categories of data in any database system: operational data and metadata. Operational data
is the data actually stored in the system to be used by the user. Metadata is the data that is
used to store information about the database itself. The structure of the data in the database
is called the schema, which is composed of the Entities, Properties of entities, and
relationship between entities.
4. Procedure: the rules and regulations on how to design and use a database. It includes
procedures such as how to log on to the DBMS, how to use the facilities, how to start and stop
transactions, how to make backups, how to handle hardware and software failures, and how to change
the structure of the database.
5. People: this component is composed of the people in the organization that are responsible or
play a role in designing, implementing, managing, administering and using the resources in
the database. This component ranges from people with a high level of knowledge about
the database and the design technology to others with no knowledge of the system beyond
using the data in the database.

Roles in Database Design and Use

As people are one of the components in the DBMS environment, there is a group of roles played by
different stakeholders in the design and operation of a database system.

1. Database Administrator (DBA)


 Responsible to oversee, control and manage the database resources (the database itself, the
DBMS and other related software).
 Authorizing access to the database.
 Coordinating and monitoring the use of the database.
 Responsible for determining and acquiring hardware and software resources.
 Accountable for problems like poor security or poor performance of the system.
 Is involved in all steps of database development.
We can have further classifications of this role in big organizations having huge amounts of data and
many user requirements.
1. Data Administrator (DA): is responsible for the management of data resources. This role is involved in
database planning, development, and maintenance of standards, policies and procedures at the
conceptual and logical design phases.
2. Database Administrator (DBA): is a more technically oriented role, responsible
for the physical realization of the database. It is involved in physical design, implementation,
security and integrity control of the database.
2. Database Designer (DBD)
 Identifies the data to be stored and chooses the appropriate structures to represent and store the
data.
 Should understand the user requirement and should choose how the user views the database.
 Involved in the design phase before the implementation of the database system.
We have two kinds of database designers, one involved in the logical and conceptual design
and another involved in physical design.
1. Logical and Conceptual DBD
 Identifies data (entity, attributes and relationship) relevant to the organization.
 Identifies constraints on each data.
 Understand data and business rules in the organization.
 Sees the database independent of any data model at the conceptual level and considers
one specific data model at the logical design phase.
2. Physical DBD
 Take logical design specification as input and decide how it should be physically
realized.
 Map the logical data model on the specified DBMS with respect to tables and
integrity constraints. (DBMS dependent designing).
 Select specific storage structure and access path to the database.
 Design security measures required on the database.
3. Application Programmer and Systems Analyst
 System analyst determines the user requirement and how the user wants to view the database.
 The application programmer implements these specifications as programs; code, test, debug,
document and maintain the application program.
 Determines the interface on how to retrieve, insert, update and delete data in the database.
 The application could use any high level programming language according to the availability,
the facility and the required service.
4. End Users
Workers whose job requires accessing the database frequently for various purposes. There are
different groups of users in this category.
1. Naive Users:
 Sizable proportion of users.
 Unaware of the DBMS.
 Only access the database based on their access level and demand.
 Use standard and pre-specified types of queries.
2. Sophisticated Users
 Are familiar with the structure of the database and the facilities of the DBMS.
 Have complex requirements.
 Have higher level queries.
 Are most of the time engineers, scientists, business analysts, etc.
3. Casual Users
 Users who access the database occasionally.
 Need different information from the database each time.
 Use sophisticated database queries to satisfy their needs.
 Are most of the time middle to high level managers.
Chapter Two

ANSI-SPARC Architecture

The purpose and origin of the Three-Level database architecture

 All users should be able to access the same data. This is important since the database has a
shared-data feature where all the data is stored in one location and all users will have their own
customized way of interacting with the data.
 A user's view is unaffected or immune to changes made in other views. Since the requirement of
one user is independent of the other, a change made in one user’s view should not affect other
users.
 Users should not need to know physical database storage details. As there are naïve users of the
system, hardware level or physical details should be a black-box for such users.
 DBA should be able to change database storage structures without affecting the users' views. A
change in file organization, access method should not affect the structure of the data which in
turn will have no effect on the users.
 Internal structure of database should be unaffected by changes to physical aspects of storage.
 DBA should be able to change conceptual structure of database without affecting all users. In
any database system, the DBA will have the privilege to change the structure of the database,
like adding tables, adding and deleting an attribute, changing the specification of the objects in
the database.
All the above and much other functionality are possible due to the three-level ANSI-SPARC
architecture.

Database Development Life Cycle.


The database development process comprises a series of phases. The major phases in information
engineering are:

1. Planning

2. Analysis

3. Design

4. DBMS Selection

5. Implementation

6. Maintenance
1. Database planning
The database-planning phase begins when a customer requests to develop a database project. It is a set
of tasks or activities which decide the resources required in the database development and the time limits
of different activities. During the planning phase, four major activities are performed.
 Review and approve the database project request.
 Prioritize the database project request.
 Allocate resources such as money, people and tools.
 Arrange a development team to develop the database project.
Database planning should also include the development of standards that govern how data will be
collected, how the format should be specified, what necessary documentation will be needed.

2. Requirements Analysis

Requirements analysis is done in order to understand the problem, which is to be solved. It is a very
important activity for the development of a database system. The person responsible for the
requirements analysis is often called the "Analyst".
In requirements analysis phase, the requirements and expectations of the users are collected and
analyzed. The collected requirements help to understand the system that does not yet exist. There are
two major activities in requirements analysis.

 Problem understanding or analysis


 Requirement specifications.

3. Design
The database design is the major phase of information engineering. In this phase, the information
models that were developed during analysis are used to design a conceptual schema for the database
and to design transactions and applications.
 In conceptual schema design, the data requirements collected in Requirement Analysis phase
are examined and a conceptual database schema is produced.
 In transaction and application design, the database applications analyzed in Requirement
Analysis phase are examined and specifications of these applications are produced. There are
two major steps in design phase:
 Database Design
 Process Design

[Figure: Three-level ANSI-SPARC Architecture of a Database. Each user (User 1, User 2, ..., User n) has its own external view at the External level; the Conceptual level holds the conceptual schema; the Internal level holds the internal schema; the physical data organization maps the internal schema onto the stored database.]

ANSI-SPARC Architecture and Database Design Phases


[Figure: ANSI-SPARC architecture and database design phases. Logical/conceptual database design produces the external schemas and the conceptual schema; physical database design produces the internal (physical) schema.]

External Level: Users' view of the database. It describes that part of database that is relevant to a
particular user. Different users have their own customized view of the database independent of
other users.
Conceptual Level: Community view of the database. It describes what data is stored in database and
relationships among the data.
Internal Level: Physical representation of the database on the computer. It describes how the data is
stored in the database.
The following example can be taken as an illustration for the difference between the three levels in
the ANSI-SPARC database Architecture. Where:
• The first level is concerned with the groups of users and their respective data requirements,
independent of one another.
• The second level describes the whole content of the database, where each piece of
information is represented only once.
• The third level describes how the data is actually stored and organized on the computer.
External View 1: Sno, FName, LName, Age, Salary
External View 2: Staff_No, LName, Bno
Conceptual level: Staff_No, FName, LName, DOB, Salary, Bno

Internal level

struct STAFF
{
    int   Staff_No;
    int   Branch_No;
    char  FName[15];
    char  LName[15];
    Date  Date_of_Birth;   /* Date is assumed to be a user-defined date type */
    float Salary;
    struct STAFF *next;    /* link to the next stored STAFF record */
};
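For contrast with the internal-level record structure above, a minimal SQL sketch (the view names are hypothetical) shows how the two external views of this example could be derived from the conceptual-level Staff relation; the Age attribute of View 1 would be derived from DOB rather than stored:

    -- Conceptual level: the community view of the Staff relation
    CREATE TABLE Staff (
        Staff_No INT PRIMARY KEY,
        FName    VARCHAR(15),
        LName    VARCHAR(15),
        DOB      DATE,
        Salary   DECIMAL(10,2),
        Bno      INT
    );

    -- External View 1 (Sno, FName, LName, Age, Salary); Age derived from DOB
    CREATE VIEW External_View_1 AS
        SELECT Staff_No AS Sno, FName, LName, DOB, Salary
        FROM Staff;

    -- External View 2 (Staff_No, LName, Bno)
    CREATE VIEW External_View_2 AS
        SELECT Staff_No, LName, Bno
        FROM Staff;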

Differences between Three Levels of ANSI-SPARC Architecture

The ANSI-SPARC architecture defines DBMS schemas at three levels:

Internal schema: at the internal level to describe physical storage structures and access paths.
Typically uses a physical data model.
Conceptual schema: at the conceptual level to describe the structure and constraints for the
whole database for a community of users. It uses a conceptual or an
implementation data model.
External schema: at the external level to describe the various user views. It usually uses the
same data model as the conceptual level.

Data Independence

Logical Data Independence:


 Refers to immunity of external schemas to changes in conceptual schema.
 Conceptual schema changes e.g. addition/removal of entities should not require changes to
external schema or rewrites of application programs.
 The capacity to change the conceptual schema without having to change the external
schemas and their application programs.

Physical Data Independence

 The ability to modify the physical schema without changing the logical schema
 Applications depend on the logical schema.
 In general, the interfaces between the various levels and components should be well
defined so that changes in some parts do not seriously influence others.
 The capacity to change the internal schema without having to change the conceptual
schema.
 Refers to immunity of conceptual schema to changes in the internal schema.
 Internal schema changes e.g. using different file organizations, storage structures/devices
should not require change to conceptual or external schemas.
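A minimal SQL sketch of both kinds of independence (assuming the hypothetical Staff relation used earlier; the index name is also hypothetical):

    -- Physical data independence: adding an index is an internal-level change;
    -- the conceptual schema and existing queries remain unchanged.
    CREATE INDEX idx_staff_bno ON Staff (Bno);

    -- Logical data independence: adding a column changes the conceptual schema,
    -- but external views that do not refer to Tel are unaffected.
    ALTER TABLE Staff ADD COLUMN Tel VARCHAR(20);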

[Figure: Data Independence and the ANSI-SPARC Three-level Architecture. The external/conceptual mappings between the external schemas and the conceptual schema provide logical data independence; the conceptual/internal mapping between the conceptual schema and the internal schema provides physical data independence.]


The distinction between a Data Definition Language (DDL) and a Data Manipulation
Language (DML)

Database Languages

Data Definition Language (DDL)

 Allows DBA or user to describe and name entities, attributes and relationships required for
the application.
 Specification notation for defining the database schema.

Data Manipulation Language (DML)

 Provides basic data manipulation operations on data held in the database.


 Language for accessing and manipulating the data organized by the appropriate data model.
 DML is also known as a query language:
o Procedural DML: user specifies what data is required and how to get the data.
o Non-Procedural DML: user specifies what data is required but not how it is to
be retrieved.
SQL is the most widely used non-procedural query language.
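A brief illustration of the non-procedural style (table and column names are hypothetical): the query states only what data is required, and the DBMS decides how to retrieve it.

    SELECT FName, LName
    FROM   Staff
    WHERE  Salary > 5000;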
Fourth Generation Language (4GL)
 Query Languages
 Forms Generators
 Report Generators
 Graphics Generators
 Application Generators

4. DBMS selection
In this phase an appropriate DBMS is selected to support the information system. A number of
factors are involved in DBMS selection; they may be technical or economic factors. The
technical factors are concerned with the suitability of the DBMS for the information system. The
following technical factors are considered.

 Type of DBMS such as relational, object-oriented etc


 Storage structure and access methods that the DBMS supports.
 User and programmer interfaces available.
 Type of query languages.
 Development tools etc.

5. Implementation

After the design phase and selecting a suitable DBMS, the database system is implemented. The
purpose of this phase is to construct and install the information system according to the plan and
design as described in previous phases. Implementation involves a series of steps leading to an
operational information system that includes creating database definitions (such as tables,
indexes etc), developing applications, testing the system, developing operational procedures and
documentation, training the users and populating the database. In the context of information
engineering, it involves two steps.

 Database definitions.
 Creating applications.

6. Operational Maintenance

Once the database system is implemented, the operational maintenance phase of the database
system begins. The operational maintenance is the process of monitoring and maintaining the
database system. Maintenance includes activities such as adding new fields, changing the size of
existing fields, adding new tables, and so on. As the database system requirements change, it
becomes necessary to add new tables or remove existing tables and to reorganize some files by
changing primary access methods or by dropping old indexes and constructing new ones. Some
queries or transactions may be rewritten for better performance. Database tuning or
reorganization continues throughout the life of the database as the requirements keep
changing.

Chapter Three
Data Model
A specific DBMS has its own specific Data Definition Language, but this type of
language is too low level to describe the data requirements of an organization
in a way that is readily understandable by a variety of users. We need a higher-
level language. Such a higher-level language is called a data model.

Data Model: a set of concepts to describe the structure of a database, and


certain constraints that the database should obey.
A Data model is a description of the way that data is stored in a database.
A data model helps to understand the relationships between entities and to create
the most effective structure to hold data.
Data Model is a collection of tools or concepts for describing
 Data
 Data relationships
 Data semantics
 Data constraints
The main purpose of Data Model is to represent the data in an understandable
way.
Categories of data models include:
 Object-based
 Record-based
 Physical
Record-based Data Models
Consist of a number of fixed format records. Each record type defines a fixed
number of fields; each field is typically of a fixed length.
 Hierarchical Data Model
 Network Data Model
 Relational Data Model

1. Hierarchical Model
 The simplest data model.
 Record type is referred to as node or segment.
 The top node is the root node.
 Nodes are arranged in a hierarchical structure as sort of upside-down tree.
 A parent node can have more than one child node.
 A child node can only have one parent node.
 The relationship between parent and child is one-to-many.
 Relationships are established by creating physical links between stored records
(each is stored with a predefined access path to other records).
 To add new record type or relationship, the database must be redefined
and then stored in a new form.

[Figure: example hierarchical structure with Department as the root node and Employee, Job, Time Card and Activity as dependent nodes.]
Advantages of Hierarchical Data Model
 Hierarchical Model is simple to construct and operate on.
 Corresponds to a number of natural hierarchically organized domains e.g.,
assemblies in manufacturing, personnel organization in companies
 Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT,
GET NEXT WITHIN PARENT etc.
Disadvantages of Hierarchical Data Model
 Navigational and procedural nature of processing
 Database is visualized as a linear arrangement of records
 Little scope for "query optimization"
2. Network Model
 Allows record types to have more than one parent, unlike the hierarchical
model.
 The network data model sees records as set members.
 Each set has an owner and one or more members.
 Does not allow many-to-many relationships between entities.
 Like the hierarchical model, the network model is a collection of physically linked
records.
 Allows member records to have more than one owner.

[Figure: example network structure relating the Department, Job, Employee, Activity and Time Card record types, where a record type may have more than one owner.]
Advantages of Network Data Model
 Network Model is able to model complex relationships and represents
semantics of add/delete on the relationships.
 Can handle most situations for modeling using record types and
relationship types.
 Language is navigational; uses constructs like FIND, FIND member, FIND
owner, FIND NEXT within set, GET etc. Programmers can do optimal
navigation through the database.
Disadvantages of Network Data Model
 Navigational and procedural nature of processing
 Database contains a complex array of pointers that thread through a set of
records.
 Little scope for automated "query optimization”
3. Relational Data Model
 Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational
Model of Data for Large Shared Data Banks').
 Its terminology originates from the branch of mathematics called set theory and the
concept of a relation.
 Can define more flexible and complex relationships.
 Viewed as a collection of tables called “Relations” equivalent to collection of
record types.
 Relation → a two-dimensional table.
 Stores information or data in the form of tables → rows and columns.
 A row of the table is called a tuple → equivalent to a record.
 A column of a table is called an attribute → equivalent to a field.
 Data value is the value of the Attribute.
 Records are related by the data stored jointly in the fields of records in two
tables or files. The related tables contain information that creates the
relation.
 The tables seem to be independent but are related somehow.
 No physical consideration of the storage is required by the user.
 Many tables are merged together to come up with a new virtual view of the
relationship.
Alternative terminologies:
Relation = Table = File
Tuple = Row = Record
Attribute = Column = Field

 The rows represent records (collections of information about separate


items).
 The columns represent fields (particular attributes of a record).
 Conducts searches by using data in specified columns of one table to find
additional data in another table.
 In conducting searches, a relational database matches information from a
field in one table with information in a corresponding field of another table
to produce a third table that combines requested data from both tables.
Chapter Four
Relational Data Model
Properties of Relational Databases
 Each row of a table is uniquely identified by a Primary Key composed of
one or more columns.
 Each tuple in a relation must be unique.
 Group of columns that uniquely identifies a row in a table is called a
Candidate Key.
 Entity Integrity Rule of the model states that no component of the
primary key may contain a NULL value.
 A column or combination of columns that matches the primary key of
another table is called a Foreign Key. Foreign key is used for cross-
reference tables.
 The Referential Integrity Rule of the model states that, for every foreign
key value in a table there must be a corresponding primary key value in
another table in the database or it should be NULL.
 All tables are logical entities.
 A table is either a base table (Named Relation) or a view (Unnamed
Relation).
 Only Base Tables are physically stored.
 Views are derived from base tables with SQL instructions like: [SELECT,
FROM, WHERE, ORDER BY].
 It is the collection of tables.
 Each entity in one table.
 Attributes are fields (columns) in table.
 Order of rows and columns is immaterial.
 Entries with repeating groups are said to be un-normalized.
 Entries are single-valued.
 Each column (field or attribute) has a distinct name
All values in a column represent the same attribute and have the same data
format.
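A minimal SQL sketch (table and column names are hypothetical) showing how a primary key, a foreign key and the entity/referential integrity rules described above are declared:

    CREATE TABLE Department (
        DeptId   INT PRIMARY KEY,          -- entity integrity: the key cannot be NULL
        DeptName VARCHAR(30) NOT NULL
    );

    CREATE TABLE Employee (
        EmpId  INT PRIMARY KEY,
        FName  VARCHAR(15),
        LName  VARCHAR(15),
        DeptId INT,                         -- foreign key: NULL or a value existing in Department
        FOREIGN KEY (DeptId) REFERENCES Department (DeptId)   -- referential integrity
    );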
Building Blocks of the Relational Data Model
The building blocks of the relational data model are:
 Entities: real world physical or logical object.
 Attributes: properties used to describe each Entity or real world object.
 Relationship: the association between Entities.
 Constraints: rules that should be obeyed while manipulating the data.
1. The Entities (persons, places, things etc.) which the organization has to
deal with. Relations can also describe relationships.
 The name given to an entity should always be a singular noun descriptive
of each item to be stored in it. E.g.: student NOT students.
 Every relation has a schema, which describes the columns or fields; the
relation itself corresponds to our familiar notion of a table.
 A relation is a collection of tuples, each of which contains values for a fixed
number of attributes.
 Existence Dependency: the dependence of an entity on the existence of one
or more entities.
 Weak entity : an entity that can not exist without the entity with which it
has a relationship – it is indicated by a double rectangle
2. The Attributes - the items of information which characterize and describe
these entities.
 Attributes are pieces of information ABOUT entities. The analysis must of
course identify those which are actually relevant to the proposed
application. Attributes will give rise to recorded items of data in the
database.
 At this level we need to know such things as:
 Attribute name (be explanatory words or phrases)
 The domain from which attribute values are taken (A DOMAIN is a
set of values from which attribute values may be taken.) Each
attribute has values taken from a domain. For example, the
domain of Name is string and that for salary is real.
 Whether the attribute is part of the entity identifier (attributes
which just describe an entity and those which help to identify it
uniquely).
 Whether it is permanent or time-varying (which attributes may
change their values over time).
 Whether it is required or optional for the entity (whose values will
sometimes be unknown or irrelevant).
Types of Attributes
(1) Simple (atomic) Vs Composite attributes
 Simple : contains a single value (not divided into sub parts)
E.g. Age, gender
 Composite: Divided into sub parts (composed of other attributes)
E.g. Name, address
(2) Single-valued Vs multi-valued attributes
 Single-valued: have only single value (the value may change but has
only one value at one time).
E.g. Name, Sex, Id. No. color_of_eyes
 Multi-Valued: have more than one value.
E.g. Address, dependent-name
Person may have several college degrees
(3) Stored vs. Derived Attribute
 Stored: not possible to derive or compute.
E.g. Name, Address
 Derived: The value may be derived or computed from the values of
other attributes.
E.g. Age (current year – year of birth)
Length of employment (current date - start date)
Profit (earning-cost)
G.P.A (grade point/credit hours)
(4) Null Values
 NULL applies to attributes which are not applicable or which do not
have values.
 You may enter the value NA (meaning not applicable).
 Value of a key attribute can not be null.
Default value - assumed value if no explicit value.
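A minimal SQL sketch (column names are hypothetical) showing how single-valued, required, optional and default-valued attributes are declared, and how a derived attribute such as Age is computed rather than stored:

    CREATE TABLE Person (
        PersonId  INT PRIMARY KEY,               -- key attribute: NULL is not allowed
        FirstName VARCHAR(20) NOT NULL,          -- required attribute
        LastName  VARCHAR(20) NOT NULL,
        Sex       CHAR(1),                       -- optional: NULL when unknown
        BirthYear INT,
        Country   VARCHAR(20) DEFAULT 'Ethiopia' -- default value when none is given
    );

    -- Derived attribute: Age is computed from BirthYear instead of being stored.
    SELECT FirstName, LastName,
           EXTRACT(YEAR FROM CURRENT_DATE) - BirthYear AS Age
    FROM   Person;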
Entity versus Attributes
When designing the conceptual specification of the database, one should pay
attention to the distinction between an Entity and an Attribute.
 Consider designing a database of employees for an organization.
 Should address be an attribute of Employees or an entity (connected to
Employees by a relationship)?
 If we have several addresses per employee, address must be an
entity (attributes cannot be set-valued/multi valued).
 If the structure (city, Woreda, Kebele, etc) is important, e.g. want to
retrieve employees in a given city, address must be modeled as an entity
(attribute values are atomic)
3. The Relationships between entities which exist and must be taken into
account when processing information. In any business processing one
object may be associated with another object due to some event. Such kind
of association is what we call a Relationship between entity objects.
 One external event or process may affect several related entities.
 Related entities require setting of LINKS from one part of the database to
another.
 A relationship should be named by a word or phrase which explains its
function.
 Role names are different from the names of entities forming the
relationship: one entity may take on many roles; the same role may be
played by different entities.
 For each Relationship, one can talk about the Number of Entities and the
Number of Tuples participating in the association. These two concepts
are called Degree and Cardinality of a relationship respectively.

Degree of a Relationship
 An important point about a relationship is how many entities participate
in it. The number of entities participating in a relationship is called the
Degree of the relationship.
 Among the Degrees of relationship, the following are the basic:
 Unary/Recursive Relationship: Tuples/records of a single entity are
related with each other.
 Binary Relationships: Tuples/records of two entities are associated in a
relationship.
 Ternary Relationship: Tuples/records of three different entities are
associated.
 And a generalized one, the N-ary Relationship: tuples from an arbitrary
number of entity sets participate in a relationship.

Cardinality of a Relationship
 Another important concept about relationship is the number of
instances/tuples that can be associated with a single instance from one
entity in a single relationship. The number of instances participating or
associated with a single instance from an entity in a relationship is called
the Cardinality of the relationship. The major cardinalities of a
relationship are:
 ONE-TO-ONE: one tuple is associated with only one other tuple.
 E.g. Building – Location: a single building will be located in a
single location, and a single location will only accommodate a single
building.
 ONE-TO-MANY, one tuple can be associated with many other tuples, but
not the reverse.
 E.g. Department-Student as one department can have multiple
students.
 MANY-TO-ONE, many tuples are associated with one tuple but not the
reverse.
 E.g. Employee – Department: as many employees belong to a single
department.
 MANY-TO-MANY: one tuple is associated with many other tuples and from
the other side, with a different role name one tuple will be associated with
many tuples
 E.g. Student – Course: a student can take many courses, and a
single course can be attended by many students.
4. Relational Constraints/Integrity Rules
 Relational Integrity
 Domain Integrity: No value of the attribute should be beyond the
allowable limits.
 Entity Integrity: In a base relation, no attribute of a Primary Key can
assume a value of NULL.
 Referential Integrity: If a Foreign Key exists in a relation, either the
Foreign Key value must match a Candidate Key value in its home relation
or the Foreign Key value must be NULL.
 Enterprise Integrity: Additional rules specified by the users or database
administrators of a database are incorporated.
 Key constraints
If tuples need to be unique in the database, then we need to make each
tuple distinct. To do this we need to have relational keys that uniquely identify
each tuple.
 Super Key: an attribute or set of attributes that uniquely identifies a tuple
within a relation.
 Candidate Key: a super key such that no proper subset of that collection
is a Super Key within the relation.
A candidate key has two properties:
1. Uniqueness
2. Irreducibility
If a super key has only one attribute, it is automatically a Candidate key.
If a candidate key consists of more than one attribute it is called Composite
Key.
 Primary Key: the candidate key that is selected to identify tuples uniquely
within the relation.
The entire set of attributes in a relation can be considered as a primary key in
the worst case.
 Foreign Key: an attribute, or set of attributes, within one relation that
matches the candidate key of some relation.
A foreign key is a link between different relations to create the view or the
unnamed relation
 Relational Views
Relations are perceived as a Table from the users’ perspective. Actually, there
are two kinds of relation in a relational database. The two categories or types of
Relations are Named and Unnamed Relations. The basic difference is in how
the relation is created, used and updated:
1. Base Relation
A Named Relation corresponding to an entity in the conceptual schema,
whose tuples are physically stored in the database.
2. View (Unnamed Relation)
A View is the dynamic result of one or more relational operations operating
on the base relations to produce another virtual relation that does not
actually exist as presented. So a view is a virtually derived relation that does
not necessarily exist in the database but can be produced upon request by a
particular user at the time of request. The virtual table or relation can be
created from single or different relations by extracting some attributes and
records with or without conditions.
Purpose of a view
 Hides unnecessary information from users: since only part of the base
relation (Some collection of attributes, not necessarily all) are to be included
in the virtual table.
 Provide powerful flexibility and security: since unnecessary information will
be hidden from the user there will be some sort of data security.
 Provide customized view of the database for users: each user is going to be
interfaced with their own preferred data set and format by making use of the
Views.
 A view of one base relation can be updated.
 Update on views derived from various relations is not allowed since it may
violate the integrity of the database.
 Update on views with aggregation and summary is not allowed, since
aggregation and summary results are computed from a base relation and
do not actually exist.
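A minimal SQL sketch of these purposes (assuming the hypothetical Staff relation used earlier; the view name, branch number and user name are also hypothetical): the view hides the Salary attribute, restricts the rows to one branch, and access is granted on the view rather than on the base table:

    CREATE VIEW BranchStaff AS
        SELECT Staff_No, FName, LName, Bno
        FROM   Staff
        WHERE  Bno = 10;        -- hides Salary and shows only branch 10

    GRANT SELECT ON BranchStaff TO clerk;   -- access to the view, not the base table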

Schemas and Instances and Database State


When a database is designed using a Relational data model, all the data is
represented in a form of a table. In such definitions and representation, there
are two basic components of the database. The two components are the
definition of the Relation or the Table and the actual data stored in each table.
The data definition is what we call the Schema or the skeleton of the database
and the Relations with some information at some point in time is the Instance
or the flesh of the database.
Schemas
 Schema describes how data is to be structured, defined at setup/Design
time (also called "metadata").
 Since it is defined during the database development phase, the schema is
rarely changed unless there is a need for system
maintenance which demands a change to the definition of a relation.
 Database Schema (Intension): specifies name of relation and the
collection of the attributes (specifically the Name of attributes).
 Refer to a description of database (or intention).
 Specified during database design.
 Should not be changed unless during maintenance.
 Schema Diagrams
 Convention to display some aspect of a schema visually.
 Schema Construct
 Refers to each object in the schema (e.g. STUDENT).
 E.g. STUDENT (FName, LName, Id, Year, Dept, Sex)

Instances
 Instance: is the collection of data in the database at a particular point of
time (snap-shot).
 Also called State or Snap Shot or Extension of the database.
 Refers to the actual data in the database at a specific point in time.
 State of database is changed any time we add, delete or update an
item.
 Valid state: the state that satisfies the structure and constraints
specified in the schema and is enforced by DBMS.
 Since an instance is the actual data of the database at some point in time, it changes
rapidly.
 To define a new database, we specify its database schema to the DBMS
(database is empty).
 Database is initialized when we first load it with data.
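A brief SQL sketch of the distinction (names are hypothetical): the CREATE statement defines the schema (intension), while the INSERT statements build one particular instance (extension); every further insert, delete or update moves the database to a new state:

    -- Schema (intension): defined once at design time
    CREATE TABLE Student (
        Id        INT PRIMARY KEY,
        FName     VARCHAR(20),
        LName     VARCHAR(20),
        StudyYear INT,           -- the Year attribute (renamed because YEAR can be a reserved word)
        Dept      VARCHAR(20),
        Sex       CHAR(1)
    );

    -- Instance (extension): the data in the database at this point in time
    INSERT INTO Student VALUES (1, 'Abebe', 'Kebede', 2, 'CS', 'M');
    INSERT INTO Student VALUES (2, 'Almaz', 'Abera',  3, 'IS', 'F');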
Chapter Five

Database design

Database design is the process of coming up with different kinds of specification for the data to
be stored in the database. The database design part is one of the middle phases we have in
information systems development where the system uses a database approach. Design is the part
on which we would be engaged to describe how the data should be perceived at different levels
and finally how it is going to be stored in a computer system.
Information System with Database application consists of several tasks which include:
 Planning of Information systems Design
 Requirements Analysis,
 Design (Conceptual, Logical and Physical Design)
 Testing
 Implementation
 Operation and Support
From these different phases, the prime interest of a database system will be the Design part,
which is again subdivided into three sub-phases.
These sub-phases are:
1. Conceptual Design
2. Logical Design, and
3. Physical Design
 In general, one has to go back and forth between these tasks to refine a database design, and
decisions in one task can influence the choices in another task.
 In developing a good design, one should answer such questions as:
 What are the relevant Entities for the Organization?
 What are the important features of each Entity?
 What are the important Relationships?
 What are the important queries from the user?
 What are the other requirements of the Organization and the Users?
The Three levels of Database Design

Conceptual Design

Logical Design

Physical Design

Conceptual Database Design

 Conceptual design is the process of constructing a model of the information used in an


enterprise, independent of any physical considerations.
 It is the source of information for the logical design phase.
 Mostly uses an Entity Relationship Model to describe the data at this level.
 After the completion of Conceptual Design one has to go for refinement of the schema,
which is verification of Entities, Attributes, and Relationships.

Logical Database Design

 Logical design is the process of constructing a model of the information used in an enterprise
based on a specific data model (e.g. relational, hierarchical or network or object), but
independent of a particular DBMS and other physical considerations.
 Normalization process
 Collection of Rules to be maintained.
 Discover new entities in the process.
 Revise attributes based on the rules and the discovered Entities

Physical Database Design


 Physical design is the process of producing a description of the implementation of the
database on secondary storage; it defines the specific storage structures and access methods used by the database.
 Describe the storage structures and access methods used to achieve efficient access to
the data.
 Tailored to a specific DBMS system, characteristics are function of DBMS and
operating systems.
 Includes estimate of storage space

Conceptual Database Design

 Conceptual design revolves around discovering and analyzing organizational and user data
requirements.
 The important activities are to identify
 Entities
 Attributes
 Relationships
 Constraints
 And based on these components develop the ER model using ER diagrams

The Entity Relationship (E-R) Model

 Entity-Relationship modeling is used to represent conceptual view of the database.


 The main components of ER Modeling are:
 Entities
 Corresponds to entire table, not row.
 Represented by Rectangle
 Attributes
 Represents the property used to describe an entity or a relationship
 Represented by Oval
 Relationships
 Represents the association that exist between entities
 Represented by Diamond
 Constraints
 Represent the constraint in the data
Before working on the conceptual design of the database, one has to know and answer the
following basic questions.
 What are the entities and relationships in the enterprise?
 What information about these entities and relationships should we store in the database?
 What are the integrity constraints that hold? Constraints on each data with respect to update,
retrieval and store.
 Represent this information pictorially in ER diagrams, then map ER diagram into a relational
schema.

Developing an E-R Diagram

 Designing the conceptual model for the database is not a single linear process but an iterative
activity where the design is refined again and again.
 To identify the entities, attributes, relationships, and constraints on the data, there are
different set of methods used during the analysis phase.
 These include information gathered by…
 Interviewing end users individually and in a group
 Questionnaire survey
 Direct observation
 Examining different documents
 The basic E-R model is graphically depicted and presented for review.
 The process is repeated until the end users and designers agree that the ER diagram is a fair
representation of the organization’s activities and functions.
 Checking the Redundant Relationships in the ER Diagram. Relationships between entities
indicate access from one entity to another - it is therefore possible to access one entity
occurrence from another entity occurrence even if there are other entities and relationships
that separate them - this is often referred to as 'Navigation' of the ER diagram.
 The last phase in ER modeling is validating an ER Model against requirement of the user.

Graphical Representations in ER Diagramming

 Entity is represented by a RECTANGLE containing the name of the entity.

[Figure: rectangle symbols for a Strong Entity (single rectangle) and a Weak Entity (double rectangle).]

 Connected entities are called relationship participants.


 Attributes are represented by OVALS and are connected to the entity by a line.

[Figure: oval symbols for an Attribute, a Multi-Valued Attribute, a Composite Attribute and a Key Attribute.]

 A derived attribute is indicated by a DOTTED LINE. (………)

 PRIMARY KEYS are underlined.


 Relationships are represented by DIAMOND shaped symbols


 Weak Relationship is a relationship between Weak and Strong Entities.
 Strong Relationship is a relationship between two strong Entities
[Figure: diamond symbols for a Strong Relationship and a Weak Relationship.]

Example 1: Build an ER Diagram for the following information:


 A student record management system will have the following two basic data object
categories with their own features or properties. Students will have an Id, Name, Dept, Age,
GPA and Course will have an Id, Name, Credit Hours
 Whenever a student enrolls in a course in a specific Academic Year and Semester, the
student will have a grade for the course.

[Figure: ER diagram for Example 1: the entities Student (ID, Name, Dept, DoB, Age, GPA) and Course (ID, Name, Credit) are connected by the relationship Enrolled_in, which carries the attributes Academic_Year, Semester and Grade.]

Example 2: Build an ER Diagram for the following information:


 A Personnel record management system will have the following two basic data object
categories with their own features or properties. Employee will have an Id, Name, DoB, Age,
Tel and Department will have an Id, Name, Location
 Whenever an Employee is assigned in one Department, the duration of his stay in the
respective department should be registered.

Structural Constraints on Relationship

1. Constraints on Relationship / Multiplicity/ Cardinality Constraints


 Multiplicity constraint is the number or range of possible occurrences of an entity
type/relation that may relate to a single occurrence/tuple of an entity type/relation through a
particular relationship.
 Mostly used to ensure appropriate enterprise constraints.

One-to-one relationship

 A customer is associated with at most one loan via the relationship borrower
 A loan is associated with at most one customer via borrower
E.g. Relationship Manages between Staff and Branch
 The multiplicity of the relationship is:
 One branch can only have one manager.
 One employee could manage either one branch or none.

Employee (1..1) Manages (0..1) Branch

One-To-Many Relationships

 In the one-to-many relationship a loan is associated with at most one customer via borrower,
a customer is associated with several (including 0) loans via borrower

E.g. Relationship Leads between Staff and Project.


 The multiplicity of the relationship:
 One staff member may lead one or more projects.
 One project is led by one staff member.

Employee (1..1) Leads (0..*) Project
Many-To-Many Relationship

 A customer is associated with several (possibly 0) loans via borrower.


 A loan is associated with several (possibly 0) customers via borrower.

E.g.: Relationship Teaches between Instructors and Course.


 The multiplicity of the relationship
 One Instructor teaches one or more Course(s).
 One course is taught by zero or more Instructor(s).

Instructor (0..*) Teaches (0..*) Course
Logical Database Design

Logical design is the process of constructing a model of the information used in an enterprise
based on a specific data model (e.g. relational, hierarchical or network or object), but
independent of a particular DBMS and other physical considerations.
 Normalization process
 Collection of Rules to be maintained.
 Discover new entities in the process.
 Revise attributes based on the rules and the discovered Entities
The first step before applying the rules in relational data model is converting the conceptual
design to a form suitable for relational logical model, which is in a form of tables.

Converting ER Diagram to Relational Tables

Three basic rules to convert ER into tables or relations:


1. For a relationship with One-to-One Cardinality:
 All the attributes are merged into a single table, or alternatively one can post the primary key
or candidate key of one of the relations to the other as a foreign key.
2. For a relationship with One-to-Many Cardinality:
 Post the primary key or candidate key from the “one” side as a foreign key attribute to the
“many” side. E.g.: For a relationship called “Belongs To” between Employee (Many) and
Department (One)
3. For a relationship with Many-to-Many Cardinality:
 Create a new table (which is the associative entity) and post primary key or candidate key
from each entity as attributes in the new table along with some additional attributes (if
applicable)
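A minimal SQL sketch of rule 3 (names are hypothetical, based on the earlier Student/Course example): the many-to-many Enrolled_in relationship becomes a new table holding the primary keys of both entities plus the relationship's own attributes:

    CREATE TABLE Student (
        StudId INT PRIMARY KEY,
        Name   VARCHAR(30)
    );

    CREATE TABLE Course (
        CourseId    INT PRIMARY KEY,
        Name        VARCHAR(30),
        CreditHours INT
    );

    -- Associative table for the many-to-many relationship
    CREATE TABLE Enrolled_in (
        StudId       INT,
        CourseId     INT,
        AcademicYear INT,
        Semester     INT,
        Grade        CHAR(2),
        PRIMARY KEY (StudId, CourseId, AcademicYear, Semester),
        FOREIGN KEY (StudId)   REFERENCES Student (StudId),
        FOREIGN KEY (CourseId) REFERENCES Course (CourseId)
    );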
After converting the ER diagram in to table forms, the next phase is implementing the process of
normalization, which is a collection of rules each table should satisfy.
Normalization

A relational database is merely a collection of data, organized in a particular manner. As the


father of the relational database approach, Codd created a series of rules called normal forms
that help define that organization
One of the best ways to determine what information should be stored in a database is to clarify
what questions will be asked of it and what data would be included in the answers.
Database normalization is a series of steps followed to obtain a database design that allows for
consistent storage and efficient access of data in a relational database. These steps reduce data
redundancy and the risk of data becoming inconsistent.
NORMALIZATION is the process of identifying the logical associations between data items
and designing a database that will represent such associations but without suffering the update
anomalies which are;
1. Insertion Anomalies
2. Deletion Anomalies
3. Modification Anomalies
Normalization may reduce system performance since data will be cross referenced from many
tables. Thus de-normalization is sometimes used to improve performance, at the cost of reduced
consistency guarantees.
Normalization is normally considered good if it results in a lossless decomposition.
All the normalization rules will eventually remove the update anomalies that may occur during
data manipulation after the implementation. The types of problems that can occur in an
insufficiently normalized table are called update anomalies, and they include:

(1) Insertion anomalies

An "insertion anomaly" is a failure to place information about a new database entry into all
the places in the database where information about that new entry needs to be stored. In a
properly normalized database, information about a new entry needs to be inserted into only
one place in the database; in an inadequately normalized database, information about a new
entry may need to be inserted into more than one place and, human fallibility being what it is,
some of the needed additional insertions may be missed.

(2) Deletion anomalies

A "deletion anomaly" is a failure to remove information about an existing database entry


when it is time to remove that entry. In a properly normalized database, information about an
old, to-be-gotten-rid-of entry needs to be deleted from only one place in the database; in an
inadequately normalized database, information about that old entry may need to be deleted
from more than one place, and, human fallibility being what it is, some of the needed
additional deletions may be missed.

(3) Modification anomalies

A modification of a database involves changing some value of the attribute of a table. In a


properly normalized database table, whatever information is modified by the user, the
change will be effected and used accordingly.
The purpose of normalization is to reduce the chances for anomalies to occur in a database.

Example of problems related with Anomalies


EmpID FName LName SkillID Skill SkillType School SchoolAdd Skill level
12 Abebe Kebede 2 SQL Database AAU Sidist_killo 5
16 Lemma Alemu 5 C++ Programming NAC Saris 6
28 Mesfin Taye 2 SQL Database AAU Sidist_killo 10
25 Abera Belay 6 VB.6 Programming Helico Piazza 8
65 Almaz Abera 2 SQL Database Helico Piazza 9
24 Teddy Tamiru 8 Oracle Database NAC Saris 5
51 Selam Dereje 4 Prolog Programming Jimma Jimma_city 8
94 Taye Gizaw 3 Cisco Networking AAU Sidist_killo 7
18 Belay Abebe 1 IP Programming Jimma Jimma_city 4
13 Yared Tesfu 7 Java Programming AAU Sidist_killo 6
Deletion Anomalies:

If the employee with ID 16 is deleted, then all information about the skill C++ and the type of
skill is deleted from the database. Then we will not have any information about C++ and
its skill type.

Insertion Anomalies:

What if we have a new employee with a skill called Pascal? We cannot decide whether
Pascal is allowed as a value for skill, and we have no clue about the type of skill that
Pascal should be categorized as.

Modification Anomalies:

What if the address for Helico is changed from Piazza to Mexico? We need to look for
every occurrence of Helico and change the value of SchoolAdd from Piazza to Mexico,
which is prone to error.
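A brief illustration (assuming the example table above is stored as a single relation named EmployeeSkill): because the school address is repeated in many rows, the correction must reach every copy, and missing one leaves the data inconsistent:

    -- Every row that repeats the Helico address must be updated;
    -- missing any copy leaves the database inconsistent.
    UPDATE EmployeeSkill
    SET    SchoolAdd = 'Mexico'
    WHERE  School = 'Helico';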
A database management system can work only with the information that we put explicitly
into its tables for a given database and into its rules for working with those tables, where
such rules are appropriate and possible.

Functional Dependency (FD)

Before moving to the definition and application of normalization, it is important to have an


understanding of "functional dependency."

Data Dependency
The logical associations between data items that point the database designer in the direction of a
good database design are referred to as determinant or dependent relationships.

Two data items A and B are said to be in a determinant or dependent relationship if certain
values of data item B always appear with certain values of data item A. If data item A is the
determinant data item and B the dependent data item, then the direction of the association is from
A to B and not vice versa.

The essence of this idea is that if the existence of something, call it A, implies that B must exist
and have a certain value, and then we say that "B is functionally dependent on A." We also
often express this idea by saying that "A determines B," or that "B is a function of A," or that "A
functionally governs B." Often, the notions of functionality and functional dependency are
expressed briefly by the statement, "If A, then B." It is important to note that the value B must be
unique for a given value of A, i.e., any given value of A must imply just one and only one value
of B, in order for the relationship to qualify for the name "function." (However, this does not
necessarily prevent different values of A from implying the same value of B.)
X → Y holds if whenever two tuples have the same value for X, they must have the same value
for Y.
The notation is A → B, which is read as: B is functionally dependent on A.
In general, a functional dependency is a relationship among attributes. In relational databases,
we can have a determinant that governs one other attribute or several other attributes.
FDs are derived from the real-world constraints on the attributes.

Example
Dinner Type of Wine
Meat Red
Fish White
Cheese Rose

Since the type of Wine served depends on the type of Dinner, we say Wine is functionally
dependent on Dinner.
Dinner → Wine
Dinner Type of Wine Type of Fork
Meat Red Meat fork
Fish White Fish fork
Cheese Rose Cheese fork

Since both Wine type and Fork type are determined by the Dinner type, we say Wine is
functionally dependent on Dinner and Fork is functionally dependent on Dinner.
Dinner → Wine
Dinner → Fork
Partial Dependency
If an attribute which is not a member of the primary key is dependent on some part of the
primary key (if we have composite primary key) then that attribute is partially functionally
dependent on the primary key.
Let {A, B} be the primary key and C a non-key attribute.
Then if {A, B} → C and also B → C or A → C,
then C is partially functionally dependent on {A, B}.

Full Dependency

If an attribute which is not a member of the primary key is dependent not on some part of the
primary key but on the whole key (when we have a composite primary key), then that attribute is
fully functionally dependent on the primary key.
Let {A, B} be the primary key and C a non-key attribute.
Then if {A, B} → C holds but neither A → C nor B → C holds (i.e., neither A alone nor B alone
can determine C),
then C is fully functionally dependent on {A, B}.

Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of the following form: "If A
implies B, and if also B implies C, then A implies C."
Example:
If Mr X is a Human, and if every Human is an Animal, then Mr X must be an Animal.
A generalized way of describing transitive dependency is:
If A functionally governs B, AND
if B functionally governs C,
THEN A functionally governs C,
provided that neither B nor C determines A (i.e., B ↛ A and C ↛ A).
In the usual notation:
{(A → B) AND (B → C)} ⟹ A → C, provided that B ↛ A and C ↛ A

Steps of Normalization:

We have various levels or steps in normalization called Normal Forms. The level of complexity,
the strength of the rules, and the degree of decomposition increase as we move from a lower-level
normal form to a higher one.
A table in a relational database is said to be in a certain normal form if it satisfies certain
constraints.
Each normal form represents a stronger condition than the previous one.
Normalization towards a logical design consists of the following steps:
 Un-Normalized Form:
Identify all data elements
 First Normal Form:
Find the key with which you can find all data
 Second Normal Form:
Remove part-key dependencies. Make all data dependent on the whole key.
 Third Normal Form
Remove non-key dependencies. Make all data dependent on nothing but the key.
For most practical purposes, databases are considered normalized if they adhere to third normal
form.

First Normal Form (1NF)

1NF requires that all column values in a table are atomic (e.g., a number is an atomic value,
while a list or a set is not). We have two ways of achieving this:
1. Putting each repeating group into a separate table and connecting them with a primary
key-foreign key relationship (see the SQL sketch after this list).
2. Moving these repeating groups to a new row by repeating the common attributes; if so,
find the key with which you can find all data.
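Below is a minimal SQL sketch of option 1, assuming hypothetical table and column names (EMPLOYEE and EMP_SKILL here are illustrative only): the repeating skill details are moved into a separate table that refers back to the employee through a foreign key.

CREATE TABLE EMPLOYEE (
    EmpID INT PRIMARY KEY,      -- unique identifier for each employee
    FName VARCHAR(30),
    LName VARCHAR(30)
);

CREATE TABLE EMP_SKILL (
    EmpID      INT REFERENCES EMPLOYEE(EmpID),  -- foreign key back to EMPLOYEE
    SkillID    INT,
    Skill      VARCHAR(30),
    SkillLevel INT,
    PRIMARY KEY (EmpID, SkillID)                -- one row per employee-skill pair; every cell is atomic
);
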
Definition: a table (relation) is in 1NF
If
 There are no duplicated rows in the table. Unique identifier.
 Each cell is single-valued (i.e., there are no repeating groups).
 Entries in a column (attribute, field) are of the same kind.

Example for First Normal form (1NF)


UNNORMALIZED
EmpID FName LName SkillID Skill SkillType School SchoolAdd Skill level
12 Abebe Kebede 2 SQL Database AAU Sidist_killo 5
6 VB.6 Programming Helico Piazza 8
16 Lemma Alemu 5 C++ Programming NAC Saris 6
1 IP Programming Jimma Jimma_city 4
28 Mesfin Taye 2 SQL Database AAU Sidist_killo 10
65 Almaz Abera 2 SQL Database Helico Piazza 9
4 Prolog Programming Jimma Jimma_city 8
7 Java Programming AAU Sidist_killo 6
24 Teddy Tamiru 8 Oracle Database NAC Saris 5
94 Taye Gizaw 3 Cisco Networking AAU Sidist_killo 7
FIRST NORMAL FORM (1NF)

Remove all repeating groups. Distribute the multi-valued attributes into different rows and
identify a unique identifier for the relation so that it can be said to be a relation in a relational
database.
EmpID FName LName SkillID Skill SkillType School SchoolAdd Skill level
12 Abebe Kebede 2 SQL Database AAU Sidist_killo 5
12 Abebe Kebede 6 VB.6 Programming Helico Piazza 8
16 Lemma Alemu 5 C++ Programming NAC Saris 6
16 Lemma Alemu 1 IP Programming Jimma Jimma_city 4
28 Mesfin Taye 2 SQL Database AAU Sidist_killo 10
65 Almaz Abera 2 SQL Database Helico Piazza 9
65 Almaz Abera 4 Prolog Programming Jimma Jimma_city 8
65 Almaz Abera 7 Java Programming AAU Sidist_killo 6
24 Teddy Tamiru 8 Oracle Database NAC Saris 5
94 Taye Gizaw 3 Cisco Networking AAU Sidist_killo 7
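
The 1NF relation above (option 2) can also be declared as a single table; since no single attribute identifies a row, the key with which all data can be found is the composite {EmpID, SkillID}. A minimal sketch in SQL, with illustrative column types and with "Skill level" renamed SkillLevel:

CREATE TABLE EMP_SKILL_1NF (
    EmpID      INT,
    FName      VARCHAR(30),
    LName      VARCHAR(30),
    SkillID    INT,
    Skill      VARCHAR(30),
    SkillType  VARCHAR(30),
    School     VARCHAR(30),
    SchoolAdd  VARCHAR(30),
    SkillLevel INT,
    PRIMARY KEY (EmpID, SkillID)   -- composite primary key; no repeating groups remain
);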

SECOND NORMAL FORM (2NF)

No partial dependency of a non-key attribute on part of the primary key. This will result in a set
of relations in Second Normal Form.
Any table that is in 1NF and has a single-attribute (i.e., a non-composite) primary key is
automatically in 2NF.
Definition: a table (relation) is in 2NF
If
 It is in 1NF and
 All non-key attributes are dependent on the entire primary key, i.e., no partial
dependency.
Example for 2NF:
EMP_PROJ
EmpID EmpName ProjNo ProjName ProjLoc ProjFund ProjMangID Incentive

EMP_PROJ rearranged
EmpID ProjNo EmpName ProjName ProjLoc ProjFund ProjMangID Incentive
Business rule: Whenever an employee participates in a project, he/she will be entitled for an
incentive.
This schema is in 1NF since we don’t have any repeating groups or attributes with the multi-
valued property. To convert it to 2NF we need to remove all partial dependencies of non-key
attributes on part of the primary key.
{EmpID, ProjNo} → EmpName, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive
But in addition to this we have the following dependencies
FD1: {EmpID} → EmpName
FD2: {ProjNo} → ProjName, ProjLoc, ProjFund, ProjMangID
FD3: {EmpID, ProjNo} → Incentive
As we can see, some non-key attributes are partially dependent on some part of the primary key.
This can be witnessed by analyzing the first two functional dependencies (FD1 and FD2). Thus,
each functional dependency, together with its dependent attributes, should be moved to a new
relation in which the determinant becomes the primary key.
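FD1 can be witnessed in the populated table as well: within EMP_PROJ, no EmpID should appear with two different EmpName values. A minimal check in SQL, reusing the column names of the example (the query is only an illustration, not part of the decomposition):

SELECT EmpID
FROM   EMP_PROJ
GROUP  BY EmpID
HAVING COUNT(DISTINCT EmpName) > 1;   -- an empty result is consistent with EmpID -> EmpName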

EMPLOYEE
EmpID EmpName

PROJECT
ProjNo ProjName ProjLoc ProjFund ProjMangID

EMP_PROJ
EmpID ProjNo Incentive
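
The 2NF decomposition above can be written as SQL table definitions. The column types are illustrative assumptions; the foreign keys in EMP_PROJ record how the relations stay linked after the split:

CREATE TABLE EMPLOYEE (
    EmpID   INT PRIMARY KEY,
    EmpName VARCHAR(50)
);

CREATE TABLE PROJECT (
    ProjNo     INT PRIMARY KEY,
    ProjName   VARCHAR(50),
    ProjLoc    VARCHAR(50),
    ProjFund   DECIMAL(12,2),
    ProjMangID INT
);

CREATE TABLE EMP_PROJ (
    EmpID     INT REFERENCES EMPLOYEE(EmpID),
    ProjNo    INT REFERENCES PROJECT(ProjNo),
    Incentive DECIMAL(10,2),          -- depends on the whole key {EmpID, ProjNo}
    PRIMARY KEY (EmpID, ProjNo)
);
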
THIRD NORMAL FORM (3NF)

Eliminate columns dependent on another non-primary-key attribute: if attributes do not contribute
to a description of the key, remove them to a separate table. This level avoids update and delete
anomalies.
Definition: a Table (Relation) is in 3NF
If
 It is in 2NF and
 There are no transitive dependencies between a primary key and non-primary key
attributes.
Example for (3NF)
Assumption: Students of the same batch (same year) live in one building or dormitory.
STUDENT
StudID Stud_F_Name Stud_L_Name Dept Year Dormitory
125/97 Abebe Kebede Info Sc 1 401
654/95 Lemma Alemu Geog 3 403
842/95 Mesfin Taye Comp. Sc 3 403
165/97 Abera Belay Info Sc 1 401
985/95 Almaz Abera Geog 3 403

This schema is in 2NF since the primary key is a single attribute.
Let’s take StudID, Year and Dormitory and see the dependencies.
StudID → Year AND Year → Dormitory
And neither Year nor Dormitory can determine StudID.
Then, transitively, StudID → Dormitory.
To convert it to 3NF we need to remove all transitive dependencies of non-key attributes on
other non-key attributes.
The non-primary-key attributes that depend on each other will be moved to another table and
linked with the main table using a candidate key-foreign key relationship.
STUDENT
StudID Stud_F_Name Stud_L_Name Dept Year
125/97 Abebe Kebede Info Sc 1
654/95 Lemma Alemu Geog 3
842/95 Mesfin Taye Comp. Sc 3
165/97 Abera Belay Info Sc 1
985/95 Almaz Abera Geog 3

DORM
Year Dormitory
1 401
3 403
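
The same decomposition can be sketched in SQL. The column types are illustrative; Year in STUDENT acts as a foreign key into DORM, so Dormitory is reached only through DORM (note that Year is a reserved word in some SQL dialects and may need to be quoted or renamed there):

CREATE TABLE DORM (
    Year      INT PRIMARY KEY,
    Dormitory VARCHAR(10)
);

CREATE TABLE STUDENT (
    StudID      VARCHAR(10) PRIMARY KEY,
    Stud_F_Name VARCHAR(30),
    Stud_L_Name VARCHAR(30),
    Dept        VARCHAR(20),
    Year        INT REFERENCES DORM(Year)   -- Dormitory is no longer stored here; it follows from Year
);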

Generally, even though there are four additional levels of normalization, a table is said to
be normalized if it reaches 3NF. A database with all tables in 3NF is said to be a normalized
database.
A mnemonic for remembering the rationale for normalization up to 3NF could be the following:
1. No Repeating or Redundancy: no repeating fields in the table.
2. The Fields Depend Upon the Key: the table should solely depend on the key.
3. The Whole Key: no partial-key dependency.
4. And Nothing But the Key: no inter-data (transitive) dependency.
5. So Help Me Codd: since Codd came up with these rules.

Other Levels of Normalization

1. Boyce-Codd Normal Form (BCNF):
Def: A table is in BCNF if it is in 3NF and if every determinant is a candidate key.
2. Fourth Normal Form (4NF)
Isolate independent multiple relationships: no table may contain two or more independent 1:N or
M:N relationships that are not directly related. To bring the model into 4NF, ensure that all M:N
relationships are resolved independently if they are indeed independent.
Def: A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.
3. Fifth Normal Form (5NF)
Isolate semantically related multiple relationships: there may be practical constraints on
information that justify separating logically related many-to-many relationships. The result is a
model limited to only simple (elemental) facts, as expressed in ORM.
Def: A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF
and if every join dependency in the table is a consequence of the candidate keys of the table.
4. Domain-Key Normal Form (DKNF)
Models are free from all modification anomalies.
Def: A table is in DKNF if every constraint on the table is a logical consequence of the
definition of keys and domains.
The underlying ideas in normalization are simple enough. Through normalization we want to
design for our relational database a set of tables that:
(1) contain all the data necessary for the purposes that the database is to serve,
(2) have as little redundancy as possible,
(3) accommodate multiple values for types of data that require them,
(4) permit efficient updates of the data in the database, and
(5) avoid the danger of losing data unknowingly.

Pitfalls of Normalization

 Requires data to see the problems.
 May reduce the performance of the system.
 Is time consuming.
 Is difficult to design and apply.
 Is prone to human error.
