Database 3rd Semester All Chapters
1. Manual Approach
2. Traditional File Based Approach
3. Database Approach
1. Manual Approach
In the manual approach, data storage and retrieval follow the traditional way of
handling information, where cards and paper are used for the purpose. Data storage
and retrieval are performed by human labour.
Files, for as many events and objects as the organization has, are used to store information.
• Each of the files containing various kinds of information is labelled and stored in
one or more cabinets.
• The cabinets could be kept in safe places for security purpose based on the
sensitivity of the information contained in it.
• Insertion and retrieval are done by searching first for the right cabinet, then for the
right file, and then for the information.
• One could have an indexing system to facilitate access to the data.
Having this in mind, a full scale DBMS should at least have the following services to
provide to the user.
➢ Data storage, retrieval and update in the database
➢ A user accessible catalogue
➢ Transaction support service: ALL or NONE transactions, which minimize data
inconsistency.
➢ Concurrency Control Services: access and update on the database by different
users simultaneously should be implemented correctly.
➢ Recovery Services: a mechanism for recovering the database after a failure must
be available.
➢ Authorization Services (Security): must support the implementation of access and
authorization service to database administrator and users.
➢ Support for Data Communication: should provide the facility to integrate with
data transfer software or data communication managers.
➢ Integrity Services: rules about data and the change that took place on the data,
correctness and consistency of stored data, and quality of data based on business
constraints.
➢ Services to promote data independence between the data and the applications
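The ALL or NONE behaviour of the transaction support service can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, its rows, and the simulated failure are hypothetical and chosen only for illustration; a production DBMS offers the same guarantee through its own transaction commands.

```python
import sqlite3

# In-memory database; the 'account' table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount, fail=False):
    """Move 'amount' from src to dst as one all-or-none unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            if fail:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
    except RuntimeError:
        pass  # the partial update has already been rolled back

def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()

transfer(conn, 1, 2, 30, fail=True)   # fails halfway: nothing is changed
after_failure = balances(conn)
transfer(conn, 1, 2, 30)              # succeeds: both updates are applied
after_success = balances(conn)
```

After the failed transfer the balances are unchanged, so the database never shows money that has left one account without arriving in the other.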
A DBMS is a software package used to design, manage, and maintain databases. Each
DBMS should have facilities to define the database, manipulate the content of the
database, and control the database. These facilities help the designer, the user, and
the database administrator to discharge their responsibilities in designing, using, and
managing the database.
It provides the following facilities:
The DBMS is a software package that helps to design, manage, and use data using the
database approach. Taking a DBMS as a system, one can describe it with respect to its
environment or the other systems interacting with it. The DBMS environment has
five components. Designing and using a database involves the interaction or integration
of Hardware, Software, Data, Procedure and People.
1. Hardware: the components that one can touch and feel. These comprise
various types of personal computers, mainframes or server computers
to be used in a multi-user system, network infrastructure, and other
peripherals required in the system.
3. Data: since the goal of any database system is to have better control of the data
and making data useful, Data is the most important component to the user of the
database. There are two categories of data in any database system: that is
Operational and Metadata. Operational data is the data actually stored in the
system to be used by the user. Metadata is the data that is used to store
information about the database itself. The structure of the data in the database is
called the schema, which is composed of the Entities, Properties of entities, and
relationship between entities and business constraints.
4. Procedure: these are the rules and regulations on how to design and use a
database. They include procedures such as how to log on to the DBMS, how to use
its facilities, how to start and stop the DBMS, how to make backups, how to treat
hardware and software failures, and how to change the structure of the database.
5. People: this component is composed of the people in the organization who are
responsible for, or play a role in, designing, implementing, managing, administering
and using the resources in the database. It ranges from people with a high level
of knowledge about the database and the design technology to others with
no knowledge of the system beyond using the data in the database.
2. Analysis: concentrates on fact finding about the problem or the
opportunity. Feasibility analysis, requirement determination and structuring, and
selection of the best design method are also performed at this phase.
As people are one of the components of the DBMS environment, different stakeholders
play a group of roles in the design and operation of a database system.
End Users
Workers whose jobs require accessing the database frequently for various purposes.
There are different groups of users in this category.
• Naïve Users:
a. Sizable proportion of users
b. Unaware of the DBMS
c. Only access the database based on their access level and demand
d. Use standard and pre-specified types of queries.
• Sophisticated Users
a. Users familiar with the structure of the Database and facilities of the
DBMS.
b. Have complex requirements
c. Have higher level queries
d. Are most of the time engineers, scientists, business analysts, etc
• Casual Users
a. Users who access the database occasionally.
Fundamentals of Database Systems Organized By Melkam A.
b. Need different information from the database each time.
c. Use sophisticated database queries to satisfy their needs.
d. Are most of the time middle to high level managers.
These users can be again classified as “Actors on the Scene” and “Workers behind the
Scene”.
STORED DATABASE
2. The conceptual (logical) level – has a conceptual schema which describes the
structure of the database for users. It hides the details of the physical storage
structures, and concentrates on describing entities, data types, relationships, user
operations and constraints. Usually a representational data model is used to
describe the conceptual schema.
3. The External or View level – includes external schemas or user views. Each
external schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group. Represented
using the representational data model.
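The idea of an external schema can be sketched with an SQL view, run here through Python's built-in sqlite3 module. The employee table, its rows, and the view name employee_directory are hypothetical; the point is only that one user group sees part of the conceptual schema while the rest stays hidden.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: one base table (names and rows are hypothetical).
conn.execute(
    "CREATE TABLE employee (empid TEXT PRIMARY KEY, empname TEXT, depid TEXT, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [("E1", "Abebe", "D1", 5000.0), ("E2", "Sara", "D2", 7000.0)])

# External level: a view for a user group that should not see salaries.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT empid, empname, depid FROM employee")

cur = conn.execute("SELECT * FROM employee_directory ORDER BY empid")
view_columns = [d[0] for d in cur.description]  # column names visible through the view
view_rows = cur.fetchall()
```

Users who query employee_directory never see the salary column, even though it is stored in the underlying table.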
Classification of DBMSs
1. Data Model Classification
o Relational data model
o Object data model
o Hierarchical data model
o Network data model and Object relational data model
2. Number of Users
o Single User systems
o Multi User systems
3. Number of Sites
o Centralized – data is stored at single site.
o Distributed – database and DBMS software are stored over many sites connected
by a network
o Homogeneous – use same DBMS software at multiple sites.
4. Cost
o Low-end systems under $3000
o High-end systems, over $100,000
CHAPTER THREE
Database Modelling
The ER-model is used to model the logical view of the system from data perspective
which consists of components:
Entity sets/types
Entities with the same basic attributes are grouped or typed into an entity type. For
example, the entity type EMPLOYEE and PROJECT. An entity may belong to more
than one entity type.
For example, a staff member working in a particular department can pursue higher
education part-time.
Hence the same person is a LECTURER in one instance and a STUDENT in another
instance.
Strong Entity: A strong (independent) entity is one that does not rely on other entities for
identification. A strong entity is one whose existence does not depend on another entity.
Example: Consider the example "student takes course". Here student is a strong entity.
In this example, course is considered a weak entity because, if there are no students to
take a particular course, then that course cannot be offered. The COURSE entity depends
on the STUDENT entity.
Weak (dependent) Entity: A weak entity is one whose existence depends on another entity.
It is one that relies on other entities for identification. In many cases, a weak entity does
not have a primary key.
Example: Consider the example "customer borrows loan". Here loan is a weak entity.
For every loan, there should be at least one customer. The entity loan depends on the
entity customer; hence loan is a weak entity.
Associative Entity Type: A weak entity type that depends on two or more entity types for
its primary key.
An individual occurrence of an entity set is also known as an instance (object).
Example: 1. Age of a person can be derived from the date of birth of the person. In this
example, age is the derived attribute.
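A derived attribute is computed when needed instead of being stored. The following minimal sketch derives age from a stored date of birth; the function name age_on and the sample dates are hypothetical.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

# One day before the 24th birthday the derived age is still 23.
example_age = age_on(date(2000, 6, 15), date(2024, 6, 14))
```

Because age changes over time while the date of birth does not, storing only the date of birth avoids values that become stale.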
2. Specialization: a process in which a group of entities is divided into subgroups
based on their characteristics.
It is a top-down approach, in which one higher-level entity is broken down into
lower-level entities. It maximizes the differences between the members of an entity by
identifying the unique characteristics or attributes of each member.
It defines one or more subclasses for the superclass and also forms the superclass/subclass
relationship.
For example
C. Category or Union
A category represents a single superclass/subclass relationship with more than one
superclass. Its participation can be total or partial.
For example, in car booking, a car owner can be a person, a bank (which holds possession
of a car) or a company. The category (subclass) Owner is a subset of the union of the three
superclasses Company, Bank, and Person.
A Category member must exist in at least one of its super classes.
D. Aggregation
Aggregation is a process that represents a relationship between a whole object and its
component parts. It abstracts a relationship between objects and views that relationship
as an object. In this process, two related entities are treated as a single entity.
In the above example, the relationship between College and Course acts as an entity in
a relationship with Student.
EMPLOYEE is the superclass for each of these subclasses. These are called
superclass/subclass relationships:
CHAPTER FIVE
Functional dependency and normalization
In this chapter, you will learn:
✓ What normalization is and what role it plays in the database design process
✓ About the normal forms 1NF, 2NF, 3NF
✓ How normal forms can be transformed from lower normal forms to higher normal
forms
✓ How normalization and ER modeling are used concurrently to produce a good
database design
The above table is not normalized. We will see the problems that we face when a table is
not normalized.
Update anomaly: In the above table we have two rows for employee Rick, as he belongs
to two departments of the company. If we want to update Rick's address, we have to
update it in both rows. If the address is updated in one row but not in the other, then
according to the database Rick has two different addresses, which is incorrect and
leaves the data inconsistent.
Insert anomaly: Suppose a new employee joins the company, who is under training and
currently not assigned to any department then we would not be able to insert the data into
the table if Emp_dep field doesn’t allow nulls.
Delete anomaly: Suppose that at some point the company closes the department D890.
Deleting the rows that have Emp_dep as D890 would also delete the
information of employee Maggie, since she is assigned only to this department.
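The update and delete anomalies can be reproduced with a minimal sketch using Python's built-in sqlite3 module. Only the names Rick, Maggie, Emp_dep and D890 come from the text; the other column names, the addresses, and Rick's department codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unnormalized design: employee facts are repeated once per department.
# Column names other than emp_dep, and all addresses, are hypothetical.
conn.execute("CREATE TABLE employee (emp_name TEXT, emp_address TEXT, emp_dep TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Rick", "Old Street", "D001"),
    ("Rick", "Old Street", "D002"),
    ("Maggie", "Lake Road", "D890"),
])

# Update anomaly: changing only one of Rick's rows leaves him with two addresses.
conn.execute("UPDATE employee SET emp_address = 'New Street' "
             "WHERE emp_name = 'Rick' AND emp_dep = 'D001'")
rick_addresses = sorted(r[0] for r in conn.execute(
    "SELECT DISTINCT emp_address FROM employee WHERE emp_name = 'Rick'"))

# Delete anomaly: closing D890 also removes everything known about Maggie.
conn.execute("DELETE FROM employee WHERE emp_dep = 'D890'")
maggie_rows = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE emp_name = 'Maggie'").fetchone()[0]
```

Both problems come from storing employee facts and department membership in the same rows, which is the redundancy that normalization removes.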
To overcome these anomalies we need to normalize the data. In the next section we will
discuss normalization.
Database tables and Normalization
Normalization is a process for evaluating and correcting table structures to minimize data
redundancies. It works through a series of stages called normal forms:
✓ First normal form (1NF)
✓ Second normal form (2NF)
✓ Third normal form (3NF)
Prof x    ITec 01, ITec 02
Prof y    ITec 03
First normal form says each cell of a table should contain exactly one value. But in the
above table there are two values (ITec 01 and ITec 02) in one cell, so it is better to
write them in separate rows.
Prof x ITec 01
Prof x ITec 02
Prof y ITec 03
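The conversion to first normal form can be sketched in a few lines of Python. The data are the professor and course values from the example; the function name to_first_normal_form is hypothetical.

```python
# Non-1NF table: one cell holds several course codes.
unnormalized = [
    ("Prof x", ["ITec 01", "ITec 02"]),
    ("Prof y", ["ITec 03"]),
]

def to_first_normal_form(rows):
    """Emit one (professor, course) row per value so that every cell is atomic."""
    return [(prof, course) for prof, courses in rows for course in courses]

normalized = to_first_normal_form(unnormalized)
```

Each multivalued cell becomes as many rows as it has values, with the professor name repeated.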
3NF: Third normal form (3NF) is based on the concept of transitive dependency. A
functional dependency X → Y in a relation schema R is a transitive dependency if there
exists a set of attributes Z in R that is neither a candidate key nor a subset of any key of
R, and both X → Z and Z → Y hold.
According to Codd’s original definition, a relation schema R is in 3NF if:
It satisfies 2NF
No nonprime attribute of R is transitively dependent on the primary key.
Example: In the relation EMP_DEPT, the dependency Ssn → Dmgr_ssn is transitive
through Dnumber because both the dependencies Ssn → Dnumber and
Dnumber → Dmgr_ssn hold and Dnumber is neither a key itself nor a subset of the key
of EMP_DEPT. Intuitively, we can see that the dependency of Dmgr_ssn on Dnumber is
undesirable in EMP_DEPT since Dnumber is not a key of EMP_DEPT.
The relation schema EMP_DEPT is in 2NF, since no partial dependencies on a key exist.
However, EMP_DEPT is not in 3NF because of the transitive dependency of Dmgr_ssn
(and also Dname) on Ssn via Dnumber. We can normalize EMP_DEPT by decomposing
it into the two 3NF relation schemas ED1 and ED2 shown above. Intuitively, we see that
ED1 and ED2 represent independent facts about employees and departments, both of
which are entities in their own right. A NATURAL JOIN operation on ED1 and ED2 will
recover the original relation EMP_DEPT without generating spurious tuples.
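The decomposition and the lossless natural join can be checked with a minimal Python sketch. It assumes, for illustration only, that ED1 keeps Ssn and Dnumber and that ED2 keeps Dnumber, Dname and Dmgr_ssn; other employee attributes are left out and all row values are hypothetical.

```python
# EMP_DEPT rows as (Ssn, Dnumber, Dname, Dmgr_ssn); the values are hypothetical.
emp_dept = {
    ("111", 5, "Research", "333"),
    ("333", 5, "Research", "333"),
    ("444", 4, "Administration", "444"),
}

# ED1 keeps the employee facts; ED2 keeps the facts determined by Dnumber.
ed1 = {(ssn, dnumber) for ssn, dnumber, _, _ in emp_dept}
ed2 = {(dnumber, dname, mgr) for _, dnumber, dname, mgr in emp_dept}

# Natural join of ED1 and ED2 on the common attribute Dnumber.
rejoined = {
    (ssn, dnumber, dname, mgr)
    for ssn, dnumber in ed1
    for d2, dname, mgr in ed2
    if dnumber == d2
}
```

In ED2 each department appears once, so its name and manager are stored in a single place, and the join reproduces exactly the original tuples with no spurious ones.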
Summary of Normal Forms
Normal form | Test | Remedy (normalization)
First (1NF) | Relation should have no multivalued attributes or nested relations. | Form new relations for each multivalued attribute or nested relation.
Second (2NF) | For relations where the primary key contains multiple attributes, no nonkey attribute should be functionally dependent on a part of the primary key. | Decompose and set up a new relation for each partial key with its dependent attribute(s). Make sure to keep a relation with the original primary key and any attributes that are fully functionally dependent on it.
CHAPTER SIX
Relational Algebra and Structured Query Language
Agenda
✓ Relational algebra
✓ Relational calculus (reading Assignment)
✓ Structured Query Language (SQL)
Relational algebra
Relational Algebra is a procedural query language that consists of a set of operations that
take one or two relations as input and produce a new relation as a result/output.
The relational algebra is a theoretical language with operations that work on one or more
relations to define another relation without changing the original relation.
Relational algebra operations are performed recursively on a relation. The output of these
operations is a new relation, which might be formed from one or more input relations.
The algebra operations enable a user to specify retrieval requests on a relational model.
A relation produced by one operation can be further manipulated using other operations
of the relational algebra. A sequence of relational algebra operations that produces a new
relation forms a relational algebra expression.
Knowledge of relational algebra allows us to understand query execution and
optimization in a relational database management system.
Fundamental Operations of Relational Algebra
Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3
σ column 2 = '1' (A X B)
column 1 column 2
1 1
1 1
Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3
A ∪ B gives
column 1 column 2
1 1
1 2
1 3
A ∩ B
Table A ∩ B
column 1 column 2
1 1
Set Difference Operation (-): The result of the set difference operation on R and S
denoted by R − S is the set of elements in R but not in S.
A-B
Table A – B
column 1 column 2
1 2
Note: For the set operations (Union, Intersection, and Set difference) the two relational
operands R and S must have same type of tuples, this condition is known as Union
Compatibility.
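The three set operations can be reproduced with Python sets of tuples, using the rows of Table A and Table B from the examples. The helper union_compatible is a hypothetical and deliberately simplified check: it compares only the number of columns, whereas full union compatibility also requires matching domains.

```python
# Relations as sets of tuples (column 1, column 2), taken from Table A and Table B.
A = {(1, 1), (1, 2)}
B = {(1, 1), (1, 3)}

def union_compatible(r, s):
    """Simplified check: every tuple in both relations has the same number of columns."""
    return len({len(t) for t in r | s}) <= 1

assert union_compatible(A, B)
union = sorted(A | B)         # tuples in A or in B, duplicates removed
intersection = sorted(A & B)  # tuples in both A and B
difference = sorted(A - B)    # tuples in A but not in B
```

Because relations are sets, the tuple (1, 1) that appears in both tables is listed only once in the union.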
Join Operations: Join operation combines two relations to form a new relation. The
tables should be joined based on a common column. The common column should be
compatible in terms of domain.
Inner Joins:
Outer join:
1. Natural Join: The natural join performs an equi join of the two relations R and S over
all common attributes. One occurrence of each common attribute is eliminated from the
result. In other words a natural join will remove duplicate attribute.
Notation: R ⋈ S
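A natural join can be sketched in Python by representing each tuple as a dictionary from attribute name to value. The function natural_join and the sample staff and dept rows are hypothetical and serve only to show the matching on common attributes.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (attribute -> value)."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()
            # Equi-join condition over all common attributes.
            if all(t1[a] == t2[a] for a in common):
                # Merging the dicts keeps a single copy of each common attribute.
                result.append({**t1, **t2})
    return result

# Hypothetical sample rows; S3 refers to a department that is not in dept.
staff = [
    {"staff_no": "S1", "dept_no": "D1"},
    {"staff_no": "S2", "dept_no": "D2"},
    {"staff_no": "S3", "dept_no": "D9"},
]
dept = [
    {"dept_no": "D1", "dept_name": "Sales"},
    {"dept_no": "D2", "dept_name": "IT"},
]
joined = natural_join(staff, dept)
```

The staff member S3 has no matching department and therefore does not appear in the result, and dept_no occurs only once in each joined tuple.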
Example of Equi Join: Given the two relations STAFF and DEPT, produce a list of staff
and the departments they work in.
1. Left Outer Join: A left outer join is a join in which tuples from R that do not have
matching values in the common column of S are also included in the result relation.
2. Right Outer Join: Right outer join is a join in which tuples from S that do not have
matching values in the common column of R are also included in the result relation.
3. Full Outer Join: Full outer join is a join in which tuples from R that do not have
matching values in the common columns of S still appear and tuples in S that do not have
matching values in the common columns of R still appear in the resulting relation.
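The three outer joins can be sketched in plain Python, with tuples as dictionaries and None standing for the missing (null) values. The function names and the sample rows are hypothetical; only the relation names PEOPLE and MENU and the join column Food follow the text.

```python
def left_outer_join(r, s, key):
    """Left outer join on one common column; unmatched tuples of r are padded with None."""
    s_only = [a for a in (s[0].keys() if s else []) if a != key]
    result = []
    for t1 in r:
        matches = [t2 for t2 in s if t2[key] == t1[key]]
        if matches:
            result.extend({**t1, **t2} for t2 in matches)
        else:
            result.append({**t1, **{a: None for a in s_only}})
    return result

def right_outer_join(r, s, key):
    """Right outer join: the left outer join with the operands swapped."""
    return left_outer_join(s, r, key)

def full_outer_join(r, s, key):
    """Full outer join: the left outer join plus the unmatched tuples of s."""
    result = left_outer_join(r, s, key)
    r_keys = {t[key] for t in r}
    r_only = [a for a in (r[0].keys() if r else []) if a != key]
    for t2 in s:
        if t2[key] not in r_keys:
            result.append({**{a: None for a in r_only}, **t2})
    return result

# Hypothetical sample rows.
people = [{"Name": "Alice", "Food": "Pizza"}, {"Name": "Bob", "Food": "Salad"}]
menu = [{"Food": "Pizza", "Day": "Monday"}, {"Food": "Soup", "Day": "Tuesday"}]

left = left_outer_join(people, menu, "Food")
right = right_outer_join(people, menu, "Food")
full = full_outer_join(people, menu, "Food")
```

In this sketch Bob survives the left outer join with no Day, Soup survives the right outer join with no Name, and the full outer join keeps both.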
Consider two relations, PEOPLE and MENU, and determine their full outer, left outer, and
right outer joins.
From this table, note that all the tuples from the left table (in our case the
PEOPLE relation) appear in the result.
2. The right outer join of PEOPLE and MENU on Food is represented in the relational
The result of the right outer join is also shown in the previous page.
3. The full outer join of PEOPLE and MENU on Food is represented in the relational
From this table, it is clear that tuples from both the PEOPLE and the MENU relations
appear in the result.
Chapter 6 (cont.…)
Ideally, a database language should allow a user to create the database and relation
structures; it should allow a user to perform basic data management tasks, such as the
insertion, modification and deletion of data from the relations; and it should allow a user
to perform both simple and complex queries to transform the raw data into information.
In addition, a database language must perform these tasks with minimal user effort, and
its command structure and syntax must be relatively easy to learn.
❖ A Data Definition Language (DDL) for defining the database structure, and
❖ A Data Manipulation Language (DML) for retrieving and updating data.
SQL contains only these definitional and manipulative commands; it does not contain
flow control commands. In other words, there are no IF..THEN..ELSE, GO TO, DO ...
WHILE or other commands to provide a flow of control.
1. Data Definition
The SQL data definition language allows us to create or destroy database objects
(schemas, domains, tables, views and indexes).
CREATE SCHEMA
DROP SCHEMA
CREATE DOMAIN
ALTER DOMAIN
DROP DOMAIN
CREATE TABLE
ALTER TABLE
DROP TABLE
CREATE VIEW
The SCHEMA is a collection of database objects that are in some way related to one
another. (All objects in a database are described in one schema or another).
The objects in a schema can be tables, views, domains, character sets, assertions (rules),
etc.
At present CREATE and DROP SCHEMA are not yet widely implemented.
So, this might be the first step before starting to create any object.
Examples:
Syntax
The CREATE TABLE command in the SQL statement is used to specify a new relation
in a database by giving it a name and listing its attributes.
Syntax
- <data_type> is the SQL supported data types: CHAR (n), VARCHAR (n), INT,
Examples
CREATE TABLE new_table (name CHAR(30) NOT NULL, age INTEGER NOT NULL
CHECK (age > 0), sex CHAR(1));
2. Create a table with fields: name, age, sex and city; and make the default value of city
‘Addis Ababa’ (expecting many records will hold this value) and a rule for male
members to be of age>30.
CREATE TABLE temp (name CHAR(30) NOT NULL, age INTEGER NOT NULL
CHECK (age > 0), sex CHAR(1) CHECK (sex IN ('M', 'F')),
city CHAR(30) DEFAULT 'Addis Ababa', PRIMARY KEY (name),
CHECK (sex <> 'M' OR age > 30));
3. The following table defines a foreign key rule for the employee table whose target is
the department table.
CREATE TABLE department (depid CHAR(5) NOT NULL, depname CHAR(40),
budget DECIMAL(14, 2), PRIMARY KEY (depid), UNIQUE (depname));
CREATE TABLE employee (empid CHAR(5) NOT NULL, empname CHAR(40), depid
CHAR(5), salary DECIMAL(10, 2), PRIMARY KEY (empid), FOREIGN KEY (depid)
REFERENCES department (depid) ON UPDATE CASCADE ON DELETE CASCADE);
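The effect of the foreign key rule can be observed with a minimal sketch using Python's built-in sqlite3 module. The sample rows are hypothetical, the column types are simplified to DECIMAL, and the PRAGMA line is specific to SQLite, which enforces foreign keys only when asked to; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""CREATE TABLE department (
    depid CHAR(5) NOT NULL, depname CHAR(40), budget DECIMAL(14, 2),
    PRIMARY KEY (depid), UNIQUE (depname))""")
conn.execute("""CREATE TABLE employee (
    empid CHAR(5) NOT NULL, empname CHAR(40), depid CHAR(5), salary DECIMAL(10, 2),
    PRIMARY KEY (empid),
    FOREIGN KEY (depid) REFERENCES department (depid)
        ON UPDATE CASCADE ON DELETE CASCADE)""")
# Hypothetical sample rows.
conn.execute("INSERT INTO department VALUES ('D1', 'Sales', 100000)")
conn.execute("INSERT INTO employee VALUES ('E1', 'Abebe', 'D1', 5000)")

# ON UPDATE CASCADE: changing the department key propagates to employee.depid.
conn.execute("UPDATE department SET depid = 'D2' WHERE depid = 'D1'")
depid_after_update = conn.execute("SELECT depid FROM employee").fetchone()[0]

# ON DELETE CASCADE: deleting the department also deletes its employees.
conn.execute("DELETE FROM department WHERE depid = 'D2'")
employees_left = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# A row that references a non-existent department is rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('E9', 'Sara', 'D7', 4000)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

ON DELETE CASCADE is a strong choice, since removing a department silently removes its employees; rules such as SET NULL or the default restriction are common alternatives when that is not wanted.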
Examples
1. Change the field width of the empid column inside the employee table.
2. Add a new default value of 'AA' for the empid field inside the employee table.
3. Tell the system that empid is the primary key inside the employee table.
The Data Manipulation Language (DML) is part of the SQL syntax for executing queries
to insert, retrieve, update, and delete records. The statements are;
SELECT statement
It is the most frequently used SQL command. The general form of the SELECT statement
is:
The above order of the clauses in the processing of SELECT cannot be changed. The
only two mandatory clauses are the first two: SELECT and FROM and the remainder are
optional. The result of a query on a table is another table.
The result table for the above two SELECT statements is:
SELECT empname, 'Yearly salary =' AS title, salary * 12 AS yearly FROM employee;
This command displays the following table. Note the title and yearly column names in
the following table heading.
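The query with a constant column and a computed column can be run with Python's built-in sqlite3 module. The employee rows are hypothetical, and an ORDER BY clause is added here only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid TEXT PRIMARY KEY, empname TEXT, depid TEXT, salary REAL)")
# Hypothetical sample rows with monthly salaries.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [("E1", "Abebe", "D1", 1000.0), ("E2", "Sara", "D2", 1500.0)])

cur = conn.execute(
    "SELECT empname, 'Yearly salary =' AS title, salary * 12 AS yearly "
    "FROM employee ORDER BY empid")
headings = [d[0] for d in cur.description]  # column names of the result table
rows = cur.fetchall()
```

The AS keyword supplies the headings title and yearly, which would otherwise be derived from the expressions themselves.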
The above examples show the use of the SELECT statement to retrieve all rows from a
table.
To retrieve only some of the rows, use the WHERE clause, which consists of the keyword
WHERE followed by a search condition that specifies the rows to be retrieved.
The ORDER BY clause consists of a list of column identifiers that the result is to be
sorted on, separated by commas.
3. List all employees by empname and those with similar names by department Id.
ASC stands for ascending sort order and DESC stands for descending (reverse) sort order.
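Row selection and sorting can be tried together in a minimal sketch using Python's built-in sqlite3 module. The employee rows and the salary threshold of 950 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid TEXT PRIMARY KEY, empname TEXT, depid TEXT, salary REAL)")
# Hypothetical sample rows; two employees share the name Sara.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("E1", "Sara", "D2", 1500.0),
    ("E2", "Abebe", "D1", 1000.0),
    ("E3", "Sara", "D1", 900.0),
])

# WHERE restricts the rows; DESC sorts from the highest salary down.
well_paid = conn.execute(
    "SELECT empid FROM employee WHERE salary > 950 ORDER BY salary DESC").fetchall()

# Sort by name, then by department id for employees with the same name.
by_name = conn.execute(
    "SELECT empname, depid FROM employee ORDER BY empname ASC, depid ASC").fetchall()
```

The second column in ORDER BY is consulted only when two rows tie on the first one, which is how the two employees named Sara end up ordered by department.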
• The INSERT INTO statement is used to insert new rows into a table.
Syntax
You can also specify the columns for which you want to insert data:
Syntax
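Both forms of INSERT INTO can be sketched with Python's built-in sqlite3 module. The employee table and its rows are hypothetical and are used only to show the difference between the two forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid TEXT PRIMARY KEY, empname TEXT, depid TEXT, salary REAL)")

# Form 1: supply a value for every column, in the order the table defines them.
conn.execute("INSERT INTO employee VALUES ('E1', 'Abebe', 'D1', 1000.0)")

# Form 2: name the columns; unlisted columns receive NULL (or their default value).
conn.execute("INSERT INTO employee (empid, empname) VALUES ('E2', 'Sara')")

inserted = conn.execute("SELECT * FROM employee ORDER BY empid").fetchall()
```

Naming the columns makes the statement independent of the column order in the table, which helps when the table structure changes later.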