Chapter01 Updated
Introduction to Database Systems
DBASE SYS 1
Objectives
Topics
- What is Data?
- Data Models and Evolution
- The Relational Data Model
- What is a DBMS?
- Structure of a DBMS
- Data Abstraction
- Query Languages
- Host Languages
- Concurrency Control and Transactions
- Non-Relational Databases
What is data?
- Data are individual facts, statistics, or items of information, often numeric. In
  a more technical sense, data are a set of values of qualitative or quantitative
  variables about one or more persons or objects, while a datum is a single
  value of a single variable. [Wikipedia.org, Jan 10, 2022]
- In computing, data is information that has been translated into a form that is
  efficient for movement or processing. Relative to today’s computers and
  transmission media, data is information converted into binary digital form. It
  is acceptable for data to be used as a singular subject or a plural subject. Raw
  data is a term used to describe data in its most basic digital format. [Jack
  Vaughan, techtarget.com]
How is data stored?
Common Data Storage Measurements
UNIT          VALUE
bit           1 bit
byte          8 bits
kilobyte      1,024 bytes
megabyte      1,024 kilobytes
gigabyte      1,024 megabytes
terabyte      1,024 gigabytes
petabyte      1,024 terabytes
exabyte       1,024 petabytes
zettabyte     1,024 exabytes
yottabyte     1,024 zettabytes
brontobyte    1,024 yottabytes
Source: https://searchdatamanagement.techtarget.com/definition/data
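Because each unit in the table is 1,024 times the previous one, the size of any unit in bytes is a power of 1,024. A minimal sketch of that arithmetic follows; the names UNITS, bytes_in, and bits_in are illustrative only.

```python
# Unit names in the order listed in the table; each is 1,024 times the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte", "brontobyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit`, using the 1,024 multiplier."""
    return 1024 ** UNITS.index(unit)


def bits_in(unit: str) -> int:
    """Return the number of bits in one `unit` (8 bits per byte)."""
    return 8 * bytes_in(unit)


print(bytes_in("kilobyte"))  # 1024
print(bytes_in("gigabyte"))  # 1073741824
print(bits_in("megabyte"))   # 8388608
```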
Data Models
- A data model is a collection of concepts and constructs
  for describing data.
- A schema is a description of a particular collection of
  data, using a given data model.
- The relational data model is the most widely used
  model today. Non-relational data models emerged as a
  newer alternative after about 2000.
- Main concept: the relation, basically a table with rows and
  columns.
- Every relation has a schema, which describes the columns, or
  fields.
Data Models (cont’d)
- The data model of the DBMS hides many storage details.
  Semantic models assist in the database design process.
- Semantic models allow an initial description of data in
  the “real world”.
- A DBMS does not directly support all the features of a
  semantic model.
- Most widely used: the Entity-Relationship (ER) model.
The Relational Data Model
- Central construct: the RELATION, a set of records.
- Data is described through a SCHEMA specifying the name of the
  relation, and the name and type of each field:
    Students(sid: string, name: string, login: string,
             age: integer, gpa: real)
- Actual data: an INSTANCE of the relation, a set of tuples:
    {<53666, Tida, tida@cs, 18, 3.4>,
     <53688, Sopon, sopon@ee, 18, 3.2>,
     <53650, Ploypilin, ploy@math, 19, 3.8>, ...}
- Integrity constraints (conditions every instance must satisfy) can
  also be specified.
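The schema, instance, and integrity-constraint ideas above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the string/integer/real types are mapped to SQLite's TEXT/INTEGER/REAL, and the CHECK constraint on gpa is a hypothetical integrity constraint that does not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Schema: relation name plus the name and type of each field.
# The CHECK constraint is a hypothetical integrity constraint (an assumption).
conn.execute("""
    CREATE TABLE Students (
        sid   TEXT PRIMARY KEY,
        name  TEXT,
        login TEXT,
        age   INTEGER,
        gpa   REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
    )
""")

# Instance: a set of tuples that conform to the schema.
students = [
    ("53666", "Tida", "tida@cs", 18, 3.4),
    ("53688", "Sopon", "sopon@ee", 18, 3.2),
    ("53650", "Ploypilin", "ploy@math", 19, 3.8),
]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", students)

# A tuple that violates the constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Students VALUES ('53700', 'Test', 'test@cs', 20, 9.9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(row_count, rejected)
```

The rejected insert leaves the three valid tuples in place, which is the sense in which every instance must satisfy the constraint.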
What Is a DBMS?
- A database is a very large, integrated collection of data
  describing the activities of organizations.
- It models the real world:
  - Entities (e.g., students, courses)
  - Relationships (e.g., Narisa is taking 2301375)
- A Database Management System (DBMS) is a
  software package designed to store and
  manage databases.
Structure of a DBMS (cont.)
[Figure: structure of a DBMS. Web forms, application front ends, and the SQL
interface send SQL commands to the query evaluation engine (parser, optimizer,
plan execution).]
Data Abstraction
DBMS Schemas: Internal, Conceptual, External
Internal Level
- The internal schema defines the physical storage structure of the database.
  It is a very low-level representation of the entire database and contains
  multiple occurrences of multiple types of internal record. In ANSI
  terminology, an internal record is also called a “stored record”.
- Facts about the internal schema:
  - The internal schema is the lowest level of data abstraction.
  - It keeps information about the actual representation of the entire
    database, such as the actual storage of the data on disk in the form of records.
  - The internal view tells us what data is stored in the database and how.
  - It never deals with the physical devices. Instead, the internal schema views a
    physical device as a collection of physical pages.
Conceptual/Logical Level
- The conceptual schema describes the structure of the whole database for the
  community of users. This schema hides information about the physical storage
  structures and focuses on describing data types, entities, relationships, etc.
- This logical level comes between the user level and the physical storage view.
  There is only a single conceptual view of a single database.
- Facts about the conceptual schema:
  - It defines all database entities, their attributes, and their relationships.
  - It includes security and integrity information.
  - At the conceptual level, the data available to a user must be contained in or
    derivable from the physical level.
External/View Level
- An external schema describes the part of the database that a specific user is
  interested in. It hides the unrelated details of the database from the user. There
  may be any number (“n”) of external views for each database.
- Each external view is defined using an external schema, which consists of
  definitions of the various types of external record in that specific view.
- An external view is the content of the database as it is seen by a particular
  user. For example, a user from the sales department will see only sales-related
  data.
- Facts about the external schema:
  - The external level relates only to the data viewed by specific end users.
  - This level includes a number of external schemas.
  - The external schema level is nearest to the user.
  - The external schema describes the segment of the database needed by a
    certain user group and hides the remaining details of the database from that
    user group.
Goal of the 3-Level Schema of a Database
Advantages of Database Schema
Disadvantages of Database Schema
Example: University Database
- Conceptual (logical) schema:
    Students(sid: string, name: string, login: string,
             age: integer, gpa: real)
    Courses(cid: string, cname: string, credits: integer)
    Enrolled(sid: string, cid: string, grade: string)
  - Describes data in terms of the data model of the DBMS.
- Physical schema:
  - Relations stored as unordered files.
  - Index on the first column of Students.
- External (view) schema:
  - Depends on the users, such as students, faculty, and staff.
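The three levels of the university example can be sketched in SQLite via Python's sqlite3 module. The table definitions follow the conceptual schema above; the index name, the CourseInfo view, the course name, and the sample rows are all assumptions made for illustration, and an index is only a rough stand-in for the physical schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Conceptual (logical) schema: the three relations from the example.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    CREATE INDEX idx_students_sid ON Students (sid);

    -- External schema: a hypothetical view for users who only need enrollment counts.
    CREATE VIEW CourseInfo AS
        SELECT c.cid, c.cname, COUNT(e.sid) AS enrollment
        FROM Courses c LEFT JOIN Enrolled e ON e.cid = c.cid
        GROUP BY c.cid, c.cname;
""")

# Sample rows (assumed): one course with two enrolled students.
conn.execute("INSERT INTO Courses VALUES ('2301375', 'Database Systems', 3)")
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("53666", "2301375", "A"), ("53688", "2301375", "B")])

course_info = conn.execute("SELECT * FROM CourseInfo").fetchall()
print(course_info)  # [('2301375', 'Database Systems', 2)]
```

Users of the view see enrollment counts without seeing grades or how the tables are stored, which is the point of separating the external level from the other two.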
Query Language
- A query language (QL) is any computer programming language that requests
  and retrieves data from databases and information systems by sending
  queries. The user enters structured, formal commands, which the system uses
  to find and extract data from host databases.
- Query languages are primarily designed for creating, accessing, and modifying
  data in a database management system (DBMS). Typically, a QL requires users
  to input a structured command that is close to an English-language construct.
- For example, the SQL query: SELECT * FROM Students;
- The simple programming context makes query languages among the easiest
  programming languages to learn. There are several variants of QL, and they are
  widely implemented in database-centered services, such as extracting data
  from deductive and OLAP databases and providing API-based access to remote
  applications and services.
Host Languages
- Data are seldom manipulated without some intended purpose. For instance,
  consider a LIBRARY database consisting of information about the books in a
  library. If a student wishes to access these data, it is probably with the
  intention of finding a certain book, for which the student has some
  information, such as the title. On the other hand, if a librarian wishes to
  access the information, it may be for other purposes, such as determining
  when the book was added to the library, or how much it cost. These issues
  probably don’t interest the student.
- The language that is used for database application programming is the host
  language for the DBMS. A host language may be a programming language such
  as C, C#, C++, Python, Lisp, PHP, or R, or it may be an application-level
  language, such as Microsoft Access or Visual Basic.
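A minimal sketch of the LIBRARY scenario with Python as the host language and SQLite as the DBMS. The Books table, its column names, and the sample row are assumptions made for illustration; the slides do not define them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical LIBRARY table; the column names are assumptions for illustration.
conn.execute("CREATE TABLE Books (title TEXT, shelf TEXT, added TEXT, cost REAL)")
conn.execute("INSERT INTO Books VALUES ('Database Basics', 'QA76', '2021-03-01', 45.0)")


def find_shelf(title):
    """Student's need: locate a book by its title."""
    row = conn.execute("SELECT shelf FROM Books WHERE title = ?", (title,)).fetchone()
    return row[0] if row else None


def acquisition_record(title):
    """Librarian's need: when the book was added and how much it cost."""
    return conn.execute(
        "SELECT added, cost FROM Books WHERE title = ?", (title,)
    ).fetchone()


print(find_shelf("Database Basics"))
print(acquisition_record("Database Basics"))
```

The host language supplies control flow and parameters (the ? placeholders), while the query language does the data access; both functions read the same table for different purposes.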
SQL Command Example
Tables: Employee, Department
SQL
    SELECT Manager
    FROM Employee, Department
    WHERE Employee.Name = 'Clark Kent'
      AND Employee.Dept = 'Physics'
Query Language
- Data Definition Language (DDL): like type definitions in C or Pascal
- Data Manipulation Language (DML):
  - Query (SELECT)
  - Update:
      UPDATE <relation name>
      SET <attribute> = <new-value>
      WHERE <condition>
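The DDL and DML pieces above can be exercised together in SQLite. This is a sketch under assumptions: the slides do not show the columns of Employee and Department, so Employee(Name, Dept) and Department(Dept, Manager) and all sample rows are hypothetical. The query also adds a join condition (Employee.Dept = Department.Dept) so that only the manager of the employee's own department is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the two relations (column names and rows are assumptions).
conn.executescript("""
    CREATE TABLE Employee   (Name TEXT, Dept TEXT);
    CREATE TABLE Department (Dept TEXT, Manager TEXT);
    INSERT INTO Employee   VALUES ('Clark Kent', 'Physics'), ('Lois Lane', 'Chemistry');
    INSERT INTO Department VALUES ('Physics', 'Manager A'), ('Chemistry', 'Manager B');
""")

# DML query: the last condition ties each employee to their own department.
query = """
    SELECT Manager
    FROM Employee, Department
    WHERE Employee.Name = 'Clark Kent'
      AND Employee.Dept = 'Physics'
      AND Employee.Dept = Department.Dept
"""
managers_before = conn.execute(query).fetchall()

# DML update, following the UPDATE ... SET ... WHERE template.
conn.execute("UPDATE Department SET Manager = 'Manager C' WHERE Dept = 'Physics'")
managers_after = conn.execute(query).fetchall()

print(managers_before, managers_after)  # [('Manager A',)] [('Manager C',)]
```

Without the join condition, the two-table FROM clause pairs the employee with every department, so the query would return every manager.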
Concurrency Control
Transaction: An Execution of a DB Program
Ensuring Atomicity
- The DBMS ensures atomicity (the all-or-nothing property) even if
  the system crashes in the middle of a transaction. There is no
  midway, i.e., transactions do not occur partially. Each
  transaction is considered one unit and either runs to
  completion or is not executed at all. Atomicity involves the
  following two operations:
  - Abort: If a transaction aborts, the changes made to the database
    are not visible.
  - Commit: If a transaction commits, the changes made are
    visible.
Example
Consider the following transaction T consisting of T1 and T2: transfer of 100
from account X to account Y.
If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but
before write(Y)), then the amount has been deducted from X but not added to Y. This results in an
inconsistent database state. Therefore, the transaction must be executed in its entirety to
ensure the correctness of the database state.
Consistency
- This means that integrity constraints must be maintained so that the database
  is consistent before and after the transaction. It refers to the correctness of a
  database. Referring to the example on the previous slide:
- The total amount before and after the transaction must be maintained.
    Total before T occurs = 500 + 200 = 700.
    Total after T occurs = 400 + 300 = 700.
  Therefore, the database is consistent. Inconsistency occurs if T1 completes
  but T2 fails, leaving T incomplete.
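The transfer example and its consistency check can be sketched with sqlite3. The balances 500 and 200 come from the totals above; the Accounts table, the function names, and the simulated failure are assumptions. The sketch treats T1 as the debit of X and T2 as the credit of Y.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()


def balances():
    """Return the current balances as a dict, e.g. {'X': 500, 'Y': 200}."""
    return dict(conn.execute("SELECT name, balance FROM Accounts"))


def total():
    """Integrity check: the sum of all balances must stay the same."""
    return conn.execute("SELECT SUM(balance) FROM Accounts").fetchone()[0]


def transfer(amount, fail_after_debit=False):
    """Move `amount` from X to Y as one transaction (T1 = debit X, T2 = credit Y)."""
    try:
        conn.execute("UPDATE Accounts SET balance = balance - ? WHERE name = 'X'", (amount,))
        if fail_after_debit:
            raise RuntimeError("simulated failure between T1 and T2")
        conn.execute("UPDATE Accounts SET balance = balance + ? WHERE name = 'Y'", (amount,))
        conn.commit()    # commit: both writes become permanent together
    except RuntimeError:
        conn.rollback()  # abort: the debit of X is undone


transfer(100, fail_after_debit=True)
after_failure = balances()   # unchanged, because the partial work was rolled back
transfer(100)
after_success = balances()
print(after_failure, after_success, total())
```

After the failed attempt the balances are still 500 and 200; after the successful one they are 400 and 300, and the total is 700 in both cases.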
Isolation
- This property ensures that multiple transactions can occur concurrently
  without leading to inconsistency of the database state. Transactions occur
  independently without interference. Changes made in a particular
  transaction are not visible to any other transaction until that
  change has been written to memory or committed. This property ensures
  that executing transactions concurrently results in a state that is
  equivalent to the state achieved if they were executed serially in
  some order.
Let X = 500, Y = 500.
Consider two transactions T and T''.
Isolation (cont.)
Suppose T has been executed until Read(Y) and then T'' starts. As a result, the operations
interleave: T'' reads the correct value of X but an incorrect value of Y, and the
sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of transaction
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a discrepancy of 50 units. Hence, transactions must
take place in isolation, and changes should be visible only after they have been made to the
main memory.
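The interleaving can be simulated in plain Python. The individual operations of T and T'' are not listed in the text, so this sketch assumes that T multiplies X by 100 and then subtracts 50 from Y, and that T'' reads X and Y and adds them. Those assumptions reproduce the figures 50,500 and 50,450 quoted above.

```python
def run(schedule):
    """Execute a list of (transaction, step) pairs against a shared database."""
    db = {"X": 500, "Y": 500}
    sums = {}
    for txn, step in schedule:
        if txn == "T" and step == "update X":
            db["X"] = db["X"] * 100          # assumed operation of T
        elif txn == "T" and step == "update Y":
            db["Y"] = db["Y"] - 50           # assumed operation of T
        elif txn == "T''" and step == "sum":
            sums["T''"] = db["X"] + db["Y"]  # T'' reads X and Y and adds them
    sums["final"] = db["X"] + db["Y"]        # sum once every step has run
    return sums


# T'' runs in the middle of T: it sees the new X but the old Y.
interleaved = run([("T", "update X"), ("T''", "sum"), ("T", "update Y")])
# T'' runs after T has finished: it sees a consistent state.
serial = run([("T", "update X"), ("T", "update Y"), ("T''", "sum")])

print(interleaved)  # {"T''": 50500, 'final': 50450}
print(serial)       # {"T''": 50450, 'final': 50450}
```

Only the serial schedule gives T'' a sum that matches the final state, which is what isolation is meant to guarantee for concurrent execution as well.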
Durability
- This property ensures that once a transaction has completed execution, its
  updates and modifications to the database are written to disk and
  persist even if a system failure occurs. These updates become permanent
  and are stored in non-volatile memory. The effects of the transaction are thus
  never lost.
- The ACID properties, taken together, provide a mechanism to ensure the correctness
  and consistency of a database: each transaction is a group of
  operations that acts as a single unit, produces consistent results, acts in isolation
  from other operations, and makes updates that are durably stored.
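A rough illustration of durability with a file-backed SQLite database. Closing and reopening the connection only approximates a system failure, and the file name and table are assumptions; the point is that committed work is found again on reopening while uncommitted work is not.

```python
import os
import sqlite3
import tempfile

# A file-backed database stands in for non-volatile storage.
db_path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO Accounts VALUES ('X', 400)")
conn.commit()                                   # committed: must survive
conn.execute("INSERT INTO Accounts VALUES ('Y', 300)")
conn.close()                                    # closed without commit: discarded

# Reopening the file plays the role of a restart after a failure.
conn = sqlite3.connect(db_path)
survivors = conn.execute("SELECT name, balance FROM Accounts").fetchall()
conn.close()

print(survivors)  # [('X', 400)]
```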
Non-Relational Database
Types of NoSQL Databases
- Column stores – Relational databases store all the data in a
  particular table’s rows together on disk, making retrieval of a
  particular row fast. Column-family databases generally serialize
  all values of a particular column together on disk, which makes
  retrieval of a large amount of a specific attribute fast.
- Document stores – These databases store records as
  “documents”, where a document can generally be thought of as
  a grouping of key-value pairs (it has nothing to do with storing
  actual documents such as a Word document).
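The difference between the layouts can be pictured with plain Python data structures. This is only a conceptual sketch: real systems serialize these layouts to disk, and the records reuse three fields from the Students example for illustration.

```python
# Three student records as (sid, name, gpa), reused from the Students example.
records = [("53666", "Tida", 3.4), ("53688", "Sopon", 3.2), ("53650", "Ploypilin", 3.8)]

# Row-oriented layout: the values of one row sit together.
row_store = [list(r) for r in records]

# Column-oriented layout: all values of one column sit together.
column_store = {
    "sid":  [r[0] for r in records],
    "name": [r[1] for r in records],
    "gpa":  [r[2] for r in records],
}

# Document layout: each record is a self-describing group of key-value pairs.
documents = [{"sid": s, "name": n, "gpa": g} for s, n, g in records]

# Fetching one whole row touches a single entry of the row store.
one_row = row_store[1]

# Aggregating one attribute touches a single list of the column store.
avg_gpa = sum(column_store["gpa"]) / len(column_store["gpa"])

print(one_row, round(avg_gpa, 2), documents[0])
```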
Types of NoSQL Databases (cont’d)
- Key-value stores – These databases pair keys to values.
  An analogy is a file system where the path acts as the
  key and the file’s contents act as the value.
- Graph stores – These excel at dealing with
  interconnected data. Graph databases consist of
  connections, or edges, between nodes. Both nodes and
  their edges can store additional properties such as
  key-value pairs.
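Both models can be pictured with plain Python data structures. This is a conceptual sketch only: the paths, node names, and properties are made up for illustration, and real key-value and graph stores add persistence, indexing, and query facilities.

```python
# Key-value store: the path is the key, the file's contents are the value.
kv_store = {"/home/tida/notes.txt": "buy textbooks"}
kv_store["/home/tida/todo.txt"] = "enroll in 2301375"

# Graph store: nodes and edges, each carrying key-value properties.
nodes = {
    "narisa":  {"type": "student"},
    "2301375": {"type": "course"},
}
edges = [("narisa", "2301375", {"relation": "takes"})]


def neighbors(node):
    """Return the nodes reachable from `node` by following one outgoing edge."""
    return [dst for src, dst, _props in edges if src == node]


print(kv_store["/home/tida/notes.txt"])  # buy textbooks
print(neighbors("narisa"))               # ['2301375']
```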