Database by NJM
DATABASE FUNDAMENTALS
Class: Comp. Sc A/L By: NGANFOR JESPA
The term database refers to a collection of related data from which the users can efficiently
retrieve the desired information. In addition to the storage and retrieval of data, certain other
operations can also be performed on a database. These operations include adding, updating
and deleting data. All these operations on a database are performed using a database
management system (DBMS). Essentially, a DBMS is a computerized record-keeping system.
In this topic we will be introduced to the basic terminology used in a database management
system (such as normalization, entities, attributes, keys, relational database management
systems, structured query language).
Table of Contents
I. INTRODUCTION TO DATABASES
II. DATABASE MANAGEMENT SYSTEM (DBMS)
III. DATABASE MODELS
IV. DATABASE ABSTRACTION LEVELS
V. DATABASE USERS
VI. CENTRALIZED DATABASE vs DISTRIBUTED DATABASE
VII. ENTITY-RELATION MODEL
VIII. RELATIONAL DATABASE
IX. DATA INTEGRITY
X. DATABASE NORMALIZATION
XI. INTRODUCTION TO QUERIES
Topic: Database Fundamentals By NGANFOR JESPA
I. INTRODUCTION TO DATABASES
Data can be anything such as a number, a person's name, images, sounds and so on. Hence,
data can be defined as a set of isolated and unrelated raw facts (represented by values), which
have little or no meaning because they lack a context for evaluation (e.g. ‘Monica’, ‘35’,
‘Chef’ …). When data are processed and converted into a meaningful and useful form, the
result is known as information. Hence, information can be defined as an organized and
validated collection of data. For example, 'Monica is 35 years old and she is a chef'.
Strictly speaking, data refers to the values physically recorded in the database, whereas
information refers to the conclusion or meaning drawn from them. In the database context,
however, the two terms are often used interchangeably.
Data validation is the process of ensuring that a program operates on clean, correct and
useful data. It uses routines, often called "validation rules", "validation constraints" or
"check routines", that check the correctness, meaningfulness and security of data that
are input to the system.
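As a minimal sketch of a validation rule, the hypothetical Python function below checks one input record before it would be stored. The field names (name, age) and the accepted age range are assumptions chosen for illustration only; they do not come from any particular DBMS.

```python
def validate_person(record):
    """Return a list of validation errors for a record (empty list means valid)."""
    errors = []
    name = record.get("name")
    age = record.get("age")
    # Presence and type check: name must be a non-empty string.
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")
    # Type and range check: age must be an integer in an assumed plausible range.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        errors.append("age must be an integer between 0 and 150")
    return errors


good = validate_person({"name": "Monica", "age": 35})
bad = validate_person({"name": "", "age": -4})
print(good)
print(bad)
```

A record is accepted only when the returned list is empty; otherwise the messages tell the user what to correct.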
I.3.3 Metadata
Metadata is literally "data about data." This term refers to information about data itself --
perhaps the origin, size, formatting or other characteristics of a data item. In the database
field, metadata is essential to understanding and interpreting the contents of a data warehouse
(central repositories of integrated data from one or more disparate sources).
A data dictionary is a collection of descriptions of the data objects or items in a data model
for the benefit of programmers and others who need to refer to them. When developing
programs that use the data model, a data dictionary can be consulted to understand where a
data item fits in the structure, what values it may contain, and basically what the data item
means in real-world terms.
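Many DBMSs keep this kind of metadata in a form that can itself be queried. The sketch below uses SQLite through Python's standard sqlite3 module as an assumed example system; the Employee table and its columns are hypothetical and exist only so that there is something to describe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, created only to have a structure to describe.
conn.execute(
    "CREATE TABLE Employee (Code INTEGER PRIMARY KEY, Name TEXT NOT NULL, Salary REAL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = [
    {"field": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(Employee)"
    )
]
for entry in dictionary:
    print(entry)
```

Each printed entry plays the role of one data dictionary line: the name of a data item, its type, and the constraints on it.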
In the structure of a database, the smallest component under which data is entered is the field.
All fields in the same table have unique names; several data fields make up a record,
several records make up a file, and several files make up a database.
A file is a named collection of logically related records. Depending on the database
software, a file can also be referred to as a table.
- The fields can be Code, Deptt, Name, Address, City and Phone (see Figure 1).
- Fields Code, Deptt, Name, Address, City and Phone for a particular employee form a
record. Figure 1 contains five records (0101–0109) and each record has six fields.
- The collection of all the employee records of a company forms the employee table.
A DBMS is a collection of programs that manages the database structure and controls access
to the data stored in the database. The DBMS serves as the intermediary between the end
user and the database by translating user requests into the complex code needed to fulfil
them. The end user interacts with the DBMS through an application program.
Data Model: A data model is a representation of a real world situation about which data is
to be collected and stored in a database. A data model depicts the dataflow and logical
interrelationships among different data elements.
Database system: Database system is a general term that refers to the combination of a
database, a DBMS and a data model. A database system consists of Data (the database),
Software, Hardware and Users. It allows users to Store, Update, Retrieve, Organize and
Protect their data.
Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server,
FileMaker, Oracle, dBase, Clipper, and FoxPro.
A data model is a collection of concepts and rules for describing the structure of the
database. The structure of the database means the data types, the constraints and the
relationships used to describe and store the data. Database models may be grouped into two
categories:
• Conceptual models focus on the logical nature of the data representation and are
concerned with what is represented in the database. Conceptual models include the
Entity-Relationship (ER) model and the Object-Oriented (OO) model.
• Implementation models emphasize how information is represented in the database, that is,
how the data structures are implemented to represent what is modelled. Implementation
models include the hierarchical database model, the network database model, and the
relational database model.
A three-level architecture for database systems is proposed to achieve the characteristics of the
database approach. The goal of this architecture is to separate the applications from the physical
database, so that the actual details of how data is organized are hidden from the users.
1) External level or view: The view level may be considered the "who" part of the
picture. At this highest level, there exist a number of views, each of which defines a part
of the actual database. Each view is provided for a user or a group of users, so that it
simplifies the interaction between the user and the system.
2) Conceptual (logical) level: At the conceptual level, the emphasis lies on the "what" part
of the picture. It describes the logical structure of the whole database. The entire
database is described using simple logical concepts such as objects, their properties
and relationships. Thus the complexity of the implementation details of the data is
hidden from the users (abstraction).
3) Internal (physical) level: The physical level emphasizes the "how" and "where" parts of
the data storage. It describes where the data is actually stored, how it is stored and
how to access it.
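The external and conceptual levels can be sketched with a table and a view. The example below assumes SQLite (via Python's sqlite3 module) as the DBMS; the view name EmployeeDirectory and the sample rows are hypothetical. The internal level is whatever storage layout SQLite chooses, which the code never has to mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the logical structure of the whole table.
conn.execute(
    "CREATE TABLE Employee (Code INTEGER PRIMARY KEY, Name TEXT, Deptt TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [(101, "Prince", "RD01", 15000), (102, "Pankaj", "RD01", 8000)],
)
# External level: a view that exposes only part of the data (Salary is left out).
conn.execute("CREATE VIEW EmployeeDirectory AS SELECT Code, Name, Deptt FROM Employee")

directory = conn.execute("SELECT * FROM EmployeeDirectory ORDER BY Code").fetchall()
print(directory)
```

A user who is given only EmployeeDirectory works with a simplified picture of the database and never sees the Salary column.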
V. DATABASE USERS
A database user is a principal at the database level. Every database user is a member of
the public role.
End user: People whose jobs require access to the database for querying, updating and
generating reports.
Application developer: Writes software that allows end users to interface with the database
system. This kind of user needs to be familiar with the DBMS to accomplish their task.
the requests coming to the system, therefore could easily become a bottleneck. But since all
the data reside in a single place, it is easier to maintain and back up data. Furthermore, it is
easier to maintain data integrity, because once data is stored in a centralized database,
outdated data is no longer available in other places.
The conceptual model can be represented using Entity-Relationship model (E-R model).
The E-R model views the real world as a set of basic objects (known as entities), their
characteristics (known as attributes) and associations among these objects (known as
relationships).
1) An entity is any object in the system that we want to model and store information
about. Entities are usually recognizable concepts, either concrete or abstract, such as
persons, places, things, or events which have relevance to the database.
2) An attribute is an item of information which is stored about an entity. There exist
different types of attributes:
(a) Simple and composite attributes. A simple attribute is the smallest semantic unit of
data and is atomic (no internal structure). A composite attribute can be subdivided
into parts, e.g., address (street, city, state, zip).
(b) Single-valued and multivalued attributes. A single-valued attribute has a single value
for a particular entity. A multivalued attribute has multiple values for a particular entity;
e.g., degrees or courses that a student can have or take.
Example: In a business enterprise, the entities may be Product, Representative and Customer. The
attributes of Product can be Name and Price. The attributes of Representative are Name, Region
and Phone. The attributes of Customer are Name, City and Age. The relationship between
Customer and Product is represented by “Buys”, and the relationship between Representative
and Product is represented by “Sells”.
[Figure: E-R diagram for the example above; surviving attribute labels: Price, Name, Phone, Name, Region, City, Age, Date]
a) One-to-one Relationship (1:1)
One-to-one relationships occur when each entry in the first table has one, and only one,
counterpart in the second table. One-to-one relationships are rarely used because it is often
more efficient to simply put all of the information in a single table.
E.g. if a man marries only one woman and a woman marries only one man, it is a
one-to-one (1:1) relationship.
Fig 5: One-to-One
b) One-to-many Relationship (1: M)
One-to-many relationships are the most common type of database relationship. They
occur when each record in the first table corresponds to one or more records in the second
table but each record in the second table corresponds to only one record in the first table.
For example, the relationship between a Teachers table and a Students table in an
elementary school database would likely be a one-to-many relationship, because each
student has only one teacher, but each teacher may have multiple students.
Fig 6: One-to-Many
The crow's foot represents the "many" side of the relationship.
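In a relational database, a one-to-many relationship is usually stored by placing a foreign key on the "many" side. The sketch below assumes SQLite through Python's sqlite3 module; the column names TeacherID and StudentID and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teachers (TeacherID INTEGER PRIMARY KEY, Name TEXT)")
# Each student row carries the key of exactly one teacher (the "many" side).
conn.execute(
    """CREATE TABLE Students (
        StudentID INTEGER PRIMARY KEY,
        Name TEXT,
        TeacherID INTEGER REFERENCES Teachers(TeacherID))"""
)
conn.execute("INSERT INTO Teachers VALUES (1, 'Teacher A')")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(10, "Student X", 1), (11, "Student Y", 1)],
)

# One teacher, many students.
count = conn.execute("SELECT COUNT(*) FROM Students WHERE TeacherID = 1").fetchone()[0]
print(count)
```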
c) Many-to-many Relationship (M:M)
Many-to-many relationships occur when each record in the first table corresponds to one or
more records in the second table and each record in the second table corresponds to one or
more records in the first table. For example, one teacher teaches many students and a student
is taught by many teachers.
goal of a logical data model is to arrange the data in such a form that it is consistent, non-
redundant and supports operations for data manipulation.
The main organization unit in a relational data model is the relation. A relation can be
represented as a table but the definition of the relation is not necessarily equal to the
definition of the table and vice versa.
c) Primary Key:
A field or a set of fields that uniquely identifies each record in a table is known as a primary
key. This implies that no two records in the relation can have the same value for the primary key.
For example, an employee number uniquely identifies a member of staff within a company.
An IP address uniquely addresses a PC on the internet. A primary key is mandatory. That is,
each entity occurrence must have a value for its primary key.
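The uniqueness rule can be observed directly. In the sketch below (SQLite via Python's sqlite3 module, with a hypothetical Staff table), the DBMS refuses a second record that reuses an existing primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (EmployeeNumber INTEGER NOT NULL PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Staff VALUES (1, 'Monica')")

try:
    # Same primary key value as the existing record: must be rejected.
    conn.execute("INSERT INTO Staff VALUES (1, 'Another Monica')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(duplicate_rejected, row_count)
```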
d) Foreign Key:
A field of a table that references the primary key of another table is referred to as foreign
key. Figure 13.3 illustrates how a foreign key constraint is related to a primary key constraint.
Here, the field Item_Code in the PURCHASE table references the field Item_Code in the
ITEM relation. Thus, the attribute Item_Code in the PURCHASE relation is the foreign key.
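A foreign key constraint can be sketched as follows, assuming SQLite through Python's sqlite3 module. The ITEM and PURCHASE tables and the Item_Code field follow the text; the other columns (Item_Name, Purchase_No) and the sample values are hypothetical. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE ITEM (Item_Code TEXT PRIMARY KEY, Item_Name TEXT)")
conn.execute(
    """CREATE TABLE PURCHASE (
        Purchase_No INTEGER PRIMARY KEY,
        Item_Code TEXT REFERENCES ITEM(Item_Code))"""
)
conn.execute("INSERT INTO ITEM VALUES ('I01', 'Pen')")
conn.execute("INSERT INTO PURCHASE VALUES (1, 'I01')")  # refers to an existing item

try:
    # 'I99' does not exist in ITEM, so this purchase would be an orphan.
    conn.execute("INSERT INTO PURCHASE VALUES (2, 'I99')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```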
For example, you have a database holding your CD collection. One of the entities is called
tracks, which holds details of the tracks on a CD. This has a composite key of CD name,
track number.
Application exercise
For each of the following entities, list possible primary keys. Then, suggest secondary keys,
if any: Student, Course, Unit, Result, Classroom, Lecturer, Department, and Attendance
General steps:
There is no direct representation of a M:N relationship in the relational model. You will need
to turn each M:N relationship between two entities into a separate relation (table) of its
own. This relation will usually have as its own primary key the combination of two foreign
keys – each of these will be the primary key of one of the relations involved in this
relationship.
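The rule above can be sketched for the teacher/student example, assuming SQLite through Python's sqlite3 module. The table name Teaches, the column names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Teacher (TeacherID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT);
    -- The M:N relationship becomes its own table; its primary key is the
    -- combination of the two foreign keys.
    CREATE TABLE Teaches (
        TeacherID INTEGER REFERENCES Teacher(TeacherID),
        StudentID INTEGER REFERENCES Student(StudentID),
        PRIMARY KEY (TeacherID, StudentID)
    );
    INSERT INTO Teacher VALUES (1, 'T1'), (2, 'T2');
    INSERT INTO Student VALUES (10, 'S1'), (11, 'S2');
    INSERT INTO Teaches VALUES (1, 10), (1, 11), (2, 10);
    """
)

# Student 10 is taught by two teachers; teacher 1 teaches two students.
teachers_of_s1 = [
    row[0]
    for row in conn.execute(
        "SELECT t.Name FROM Teacher t JOIN Teaches x ON x.TeacherID = t.TeacherID "
        "WHERE x.StudentID = 10 ORDER BY t.Name"
    )
]
print(teachers_of_s1)
```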
NB: If a relationship has attributes then they need to go into a table. Where to put them
depends on the type of the relationship. In a 1:1 or 1:M relationship, put them in the same place
as the foreign key (on the M side in 1:M). In an M:N relationship, put them in the new table
you create for the relationship.
X. DATABASE NORMALIZATION
Normalization is a process in which data attributes within a data model are organized to
increase the cohesion of entity types. In other words, the goal of data normalization is to
reduce and even eliminate data redundancy, an important consideration for application
developers. There are two goals of the normalization process:
- eliminating redundant data (for example, storing the same data in more than one
table) and
- ensuring data dependencies make sense (only storing related data in a table).
Both of these are worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored.
X.1 Dependencies
In order to be able to normalize a relation, we must first understand the concept of
dependency between attributes within a relation. There exist various types of dependencies:
A table is in first normal form (1NF) if it has no repeating fields or groups (no
field may hold more than one value). To achieve this, we have to:
a) Eliminate duplicative columns from the same table.
b) Create separate tables for each group of related data and identify each row with a
unique column or set of columns (the primary key).
Example Let’s consider the following relation:
Students (Surname, LastName, Knowledge)
A table is in 2NF when it is in 1NF and all of its non-key attributes are fully dependent
on its primary key (no partial dependency). That is, if X → A holds, then there should not be
any proper subset Y of X for which Y → A also holds. To achieve this, we should:
a) Remove subsets of data that apply to multiple rows of a table and place them in
separate tables.
b) Create relationships between these new tables and their predecessors through the use
of foreign keys.
Example: Let’s consider the following relation:
We see here in the Student_Project relation that the primary key attributes are Stu_ID and
Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must
depend on both and not on either of the prime key attributes individually. But we find that
Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID
independently. This is called partial dependency, which is not allowed in Second Normal Form.
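One possible way to remove the partial dependencies is to split the relation into three tables, sketched below with SQLite through Python's sqlite3 module. The table names Student and Project and the sample rows are assumptions for illustration; the attribute names come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stu_Name depends only on Stu_ID, Proj_Name only on Proj_ID.
    CREATE TABLE Student (Stu_ID INTEGER PRIMARY KEY, Stu_Name TEXT);
    CREATE TABLE Project (Proj_ID INTEGER PRIMARY KEY, Proj_Name TEXT);
    -- What remains of Student_Project is just the pairing of the two keys.
    CREATE TABLE Student_Project (
        Stu_ID INTEGER REFERENCES Student(Stu_ID),
        Proj_ID INTEGER REFERENCES Project(Proj_ID),
        PRIMARY KEY (Stu_ID, Proj_ID)
    );
    INSERT INTO Student VALUES (1, 'Alice');
    INSERT INTO Project VALUES (7, 'Library System'), (8, 'Payroll');
    INSERT INTO Student_Project VALUES (1, 7), (1, 8);
    """
)

# The original four-column relation can be rebuilt with a join when needed.
rebuilt = conn.execute(
    """SELECT sp.Stu_ID, sp.Proj_ID, s.Stu_Name, p.Proj_Name
       FROM Student_Project sp
       JOIN Student s ON s.Stu_ID = sp.Stu_ID
       JOIN Project p ON p.Proj_ID = sp.Proj_ID
       ORDER BY sp.Proj_ID"""
).fetchall()
print(rebuilt)
```

Each name is now stored once, however many projects a student takes part in.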
BCNF (Boyce-Codd Normal Form) is a stricter extension of Third Normal Form. It is sometimes
referred to as the "third and a half (3.5) normal form", since it adds one more requirement:
- Meet all the requirements of the third normal form.
- Every determinant must be a candidate key.
A database query is used to extract data from the database in a readable format according to
the user's request. For instance, if you have an employee table, you might issue a SQL
statement that returns the employee who is paid the most. This request to the database for
usable employee information is a typical query that can be performed in a relational database.
The best-known and most widely used query language is SQL (Structured Query Language).
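The "employee who is paid the most" query mentioned above could be written as in the sketch below, which runs the SQL through Python's sqlite3 module. The EMPLOYEE columns and the two sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Code INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(101, "Prince", 15000), (102, "Pankaj", 8000)],
)

# The inner query finds the highest salary; the outer query returns who earns it.
top = conn.execute(
    "SELECT Name, Salary FROM EMPLOYEE "
    "WHERE Salary = (SELECT MAX(Salary) FROM EMPLOYEE)"
).fetchall()
print(top)
```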
• Data Definition Language: DDL is used to create and delete database and its objects.
These commands are primarily used by the DBA during the building and removal
phases of a database project. The most important DDL statements in SQL are as
follows:
- CREATE TABLE: To create a new table.
- ALTER TABLE: To modify the structure of a table.
- DROP TABLE: To delete a table.
• Data Manipulation Language: DML is used to retrieve, insert, modify and delete
database information. These commands will be used by all database users during the
routine operation of the database. The most important DML statements in SQL are the
following:
- INSERT: To insert data into a table.
- UPDATE: To update data in a table.
- DELETE: To delete data from a table.
- SELECT: To retrieve data from a table.
NOTE: All SQL queries must be terminated by a semicolon (;) even if the statement extends
over many lines.
The CREATE TABLE command is used to define the structure of the table.
Syntax:
CREATE TABLE <tablename> (
<field1> <data type>,
Example:
CREATE TABLE EMPLOYEE(
Code NUMBER(5),
• The table and column names must start with a letter followed by letters, numbers or
underscores.
• Avoid using SQL keywords as names for tables or columns (such as SELECT,
CREATE, and INSERT).
• For each column, a name and a data type must be specified and the column name
must be unique within the table definition.
• Each column definition should be separated with a comma.
The ALTER TABLE command allows a user to change the structure of an existing table.
Syntax:
ALTER TABLE <tablename>
<ADD | MODIFY | DROP column(s)>;
Examples:
1) ALTER TABLE EMPLOYEE
   ADD Email CHAR(25);
   This command will add a new column, named Email, having a maximum width of 25
   characters, to the EMPLOYEE table.
2) ALTER TABLE EMPLOYEE
   MODIFY Name CHAR(25);
   This command will change the maximum width of the Name column to 25 characters in
   the EMPLOYEE table.
3) ALTER TABLE EMPLOYEE
   DROP Deptt;
   This command will delete the Deptt column from the EMPLOYEE table.
The DROP TABLE command removes the table definition (with all records).
Syntax:
DROP TABLE <tablename>;
Example:
DROP TABLE EMPLOYEE;
The above SQL command will delete the EMPLOYEE table.
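The DDL commands can be tried out with SQLite through Python's sqlite3 module, as in the sketch below. This is an assumed setup, not the dialect of the examples above: SQLite writes ADD COLUMN, uses its own type names (TEXT, INTEGER), and has no MODIFY clause, so only CREATE, ADD and DROP TABLE are shown. The helper columns() is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def columns(table):
    """Return the column names of a table (empty list if the table does not exist)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn.execute("CREATE TABLE EMPLOYEE (Code INTEGER, Deptt TEXT, Name TEXT)")
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN Email TEXT")
after_add = columns("EMPLOYEE")

conn.execute("DROP TABLE EMPLOYEE")
after_drop = columns("EMPLOYEE")

print(after_add)
print(after_drop)
```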
The INSERT command is used to insert or add rows (records) into the specified table.
Syntax:
INSERT INTO <tablename> (column1, column2, ..., columnN)
VALUES (value1, value2, ..., valueN);
Examples:
1) INSERT INTO EMPLOYEE (Code, Deptt, Name, Address, Salary)
   VALUES (101, 'RD01', 'Prince', 'Park Way', 15000);
   Example 1 will add a new record at the bottom of the EMPLOYEE table consisting of the
   values in parentheses.
2) INSERT INTO EMPLOYEE
   VALUES (102, 'RD01', 'Pankaj', 'Pitampura', 26062700, 8000);
Note that for each of the listed columns, a matching value must be specified. In case no
column list is specified, then a value must be given for each column and in the same order as
specified in the CREATE TABLE command.
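Both forms of INSERT can be run as in the sketch below (SQLite via Python's sqlite3 module). The table definition is an assumption: the six-column layout, and in particular a Phone column in fifth position, is inferred from the values in example 2 and is not taken from the original CREATE TABLE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column order; Phone is a guess at the column holding 26062700.
conn.execute(
    """CREATE TABLE EMPLOYEE (
        Code INTEGER, Deptt TEXT, Name TEXT, Address TEXT, Phone INTEGER, Salary INTEGER)"""
)
# With a column list: the unlisted column (Phone) is left NULL.
conn.execute(
    "INSERT INTO EMPLOYEE (Code, Deptt, Name, Address, Salary) "
    "VALUES (101, 'RD01', 'Prince', 'Park Way', 15000)"
)
# Without a column list: one value per column, in the order of the table definition.
conn.execute(
    "INSERT INTO EMPLOYEE VALUES (102, 'RD01', 'Pankaj', 'Pitampura', 26062700, 8000)"
)

rows = conn.execute("SELECT Code, Phone, Salary FROM EMPLOYEE ORDER BY Code").fetchall()
print(rows)
```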
XI.4.5 UPDATE Command
The UPDATE command is used for modifying attribute values of records in a table.
Syntax:
UPDATE <tablename>
SET column1 = value1
[, column2 = value2]
... ... ... ... ...
[, columnN = valueN]
[WHERE <condition>];
Note that components specified inside the square brackets [] are optional.
Examples:
1) UPDATE EMPLOYEE
   SET Salary = Salary + 1000;
   Example 1 will update (in our case, increment) the Salary field by 1000 for all the records.
2) UPDATE EMPLOYEE
   SET Salary = Salary + 1000
   WHERE Deptt = 'RD01';
   Example 2 will increment the Salary column by 1000 for only those rows that comply
   with the condition specified in the WHERE clause (Deptt = 'RD01').
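The effect of the WHERE clause can be checked with a small run, sketched here with SQLite through Python's sqlite3 module. The three sample rows, including the 'MK02' department, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Code INTEGER PRIMARY KEY, Deptt TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(101, "RD01", 15000), (102, "RD01", 8000), (103, "MK02", 9000)],
)

# Only the rows of department RD01 are incremented.
cursor = conn.execute("UPDATE EMPLOYEE SET Salary = Salary + 1000 WHERE Deptt = 'RD01'")
changed = cursor.rowcount
salaries = dict(conn.execute("SELECT Code, Salary FROM EMPLOYEE"))
print(changed, salaries)
```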
XI.4.6 DELETE Command
The DELETE command is used to delete all or selected records from the specified table.
Syntax:
DELETE FROM <tablename>
[WHERE <condition>];
Example:
DELETE FROM EMPLOYEE
WHERE Salary > 8000;
NOTE: If WHERE condition is not used in the DELETE command, then all the records from
the specified table will be deleted.
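The difference between a DELETE with and without WHERE is sketched below, using SQLite through Python's sqlite3 module and three hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Code INTEGER PRIMARY KEY, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [(101, 15000), (102, 8000), (103, 7000)],
)

# With WHERE: only the matching rows are removed.
conn.execute("DELETE FROM EMPLOYEE WHERE Salary > 8000")
remaining = [row[0] for row in conn.execute("SELECT Code FROM EMPLOYEE ORDER BY Code")]

# Without WHERE: every row is removed, but the empty table still exists.
conn.execute("DELETE FROM EMPLOYEE")
final_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(remaining, final_count)
```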
The SELECT statement is used to query the database and retrieve selected data.
Syntax:
SELECT <column1, column2, column3,...., columnN>
FROM <tablename>
[WHERE <condition>]
[GROUP BY <column1, column2, column3,...., columnN>]
[HAVING <condition>]
[ORDER BY <column1, column2, column3,...., columnN [ASC|DESC]>];
To select all the columns of a table, use * instead of column list with SELECT.
Examples:
1) SELECT Code, Name, Salary
   FROM EMPLOYEE;
   The SELECT statement selects the values of the three specified columns from the
   EMPLOYEE table. This operation is called projection.
2) SELECT *
   FROM EMPLOYEE
   WHERE Salary > 7500;
   The SELECT statement selects all those rows from the EMPLOYEE table in which the
   Salary column contains a value greater than 7500. This operation is called selection.
3) SELECT *
   FROM EMPLOYEE
   WHERE Salary > 7500
   ORDER BY Name DESC;
   The SELECT statement displays the result in descending order of the attribute Name.
The ORDER BY clause specifies a sorting order in which the result tuples of a query are to
be displayed; DESC specifies a descending order. By default, ORDER BY arranges the
result set in ascending order (whether one uses ASC or not).
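Selection and ordering can be combined as in the sketch below, run with SQLite through Python's sqlite3 module. The three sample rows are hypothetical ('Amit' is added only so that one row fails the Salary condition).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Code INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(101, "Prince", 15000), (102, "Pankaj", 8000), (103, "Amit", 7000)],
)

# Selection (WHERE) followed by descending order on Name.
descending = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM EMPLOYEE WHERE Salary > 7500 ORDER BY Name DESC"
    )
]
# Without ASC or DESC, ORDER BY sorts in ascending order.
ascending = [
    row[0]
    for row in conn.execute("SELECT Name FROM EMPLOYEE WHERE Salary > 7500 ORDER BY Name")
]
print(descending, ascending)
```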
Now, let us join these two tables in our SELECT statement as follows:
FROM CUSTOMERS, ORDERS
WHERE CUSTOMERS.ID = ORDERS.CUST_ID;
Result:
3 kaushik 23 1500
2 Khilan 25 1560
4 Chaitali 25 2060
Here, it is noticeable that the join is performed in the WHERE clause. Several operators can be
used to join tables, such as =, <, >, <>, <=, >=, !=, BETWEEN, LIKE, and NOT.
However, the most common operator is the equals sign.
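A join performed in the WHERE clause can be reproduced as in the sketch below (SQLite via Python's sqlite3 module). The column names NAME, AGE, AMOUNT and OID, the extra customer with no orders, and the ORDER BY clause are assumptions added to make the example self-contained; only CUSTOMERS.ID, ORDERS.CUST_ID and the three result rows come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER);
    CREATE TABLE ORDERS (OID INTEGER PRIMARY KEY, CUST_ID INTEGER, AMOUNT INTEGER);
    INSERT INTO CUSTOMERS VALUES
        (2, 'Khilan', 25), (3, 'kaushik', 23), (4, 'Chaitali', 25), (5, 'NoOrders', 30);
    INSERT INTO ORDERS VALUES (1, 3, 1500), (2, 2, 1560), (3, 4, 2060);
    """
)

# The equality in WHERE keeps only the customer/order pairs that belong together.
joined = conn.execute(
    "SELECT ID, NAME, AGE, AMOUNT FROM CUSTOMERS, ORDERS "
    "WHERE CUSTOMERS.ID = ORDERS.CUST_ID ORDER BY AMOUNT"
).fetchall()
print(joined)
```

The customer with no matching order does not appear in the result, because no row of ORDERS satisfies the join condition for it.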
QUERY APPLICATION
2) Cartesian product:
3) Natural join
Query:
select *
from E as E(enr, ename, dept), D as D(dnr, dname)
where dept = dnr
Solution:
enr  ename  dept  dnr  dname
1    Bill   A     A    Marketing
2    Sarah  C     C    Legal
3    John   A     A    Marketing
Query:
Select *
from E, D
where dept = D.nr
Solution:
nr  name   dept  nr  name
1   Bill   A     A   Marketing
2   Sarah  C     C   Legal
3   John   A     A   Marketing