Chapter 1
Overview of Database Systems
Good database design will get you through poor
programming better than good programming will get you
through poor database design….
Outline
• Data, Database and DBMS
• Conventional File Processing
• Historical Perspective
• Database Systems Environment
• Database Users
Data vs. Information
• Data:
– Raw facts that can be recorded
– Building blocks of information
• Information:
– Data processed to reveal meaning
• Accurate, relevant, and timely information is key
to good decision making
• Good decision making is key to survival in global
environment
Database and DBMS
• A database is a very large, integrated collection of data.
• It contains end-user data and metadata
• Metadata: Data that describes the data.
• A database models a real-world enterprise.
For example, a university database might contain information about the following:
– Entities (e.g., students, courses, lecturers, etc.)
– Relationships (e.g., Elsie is taking DCIT305)
• A Database Management System (DBMS) is a software package
designed to store and manage databases.
• A DBMS interacts with end-users, other applications, and the
database itself to capture and analyze data.
• A general-purpose DBMS allows the definition, creation,
querying, update, and administration of databases.
Databases
• Web indexes
• Train timetables
• Library catalogues
• Airline bookings
• Medical records
• Credit card details
• Bank accounts
• Student records
• Stock control
• Customer histories
• Personnel records
• Stock market prices
• Product catalogues
• Discussion boards
• Telephone directories
• and so on…
Figure 1.1 - The DBMS Manages the Interaction
Between the End User and the Database
Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
Typical DBMS Functionality
• Define a particular database in terms of its data types,
structures, and constraints
• Construct or Load the initial database contents on a secondary
storage medium
• Manipulate the database:
– Retrieval: Querying, generating reports
– Modification: Insertions, deletions and updates to its content
– Accessing the database through Web applications
• Share a database: allows multiple users and programs to
access the database simultaneously
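The define, construct, and manipulate functions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the in-memory SQLite database stands in for a DBMS, and the Course table, its constraints, and the sample rows are hypothetical. Sharing among multiple users is not shown.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute("""
    CREATE TABLE Course (
        CourseID   TEXT PRIMARY KEY,
        CourseName TEXT NOT NULL,
        CreditHrs  INTEGER CHECK (CreditHrs > 0)
    )
""")

# Construct: load the initial contents (sample rows are made up).
conn.executemany(
    "INSERT INTO Course VALUES (?, ?, ?)",
    [("DCIT305", "Database Systems", 3), ("DCIT303", "Data Structures", 3)],
)

# Manipulate: a modification followed by a retrieval.
conn.execute("UPDATE Course SET CreditHrs = 4 WHERE CourseID = 'DCIT305'")
rows = conn.execute(
    "SELECT CourseID, CreditHrs FROM Course ORDER BY CourseID"
).fetchall()
print(rows)
```

The same three steps apply to any relational DBMS; only the connection call and some data type names would differ.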
Database Applications
– Banking: all transactions
– Airlines: reservations, schedules
– Universities: registration, grades
– Sales: customers, products, purchases
– Manufacturing: production, inventory, orders, supply chain
– Human resources: employee records, salaries, tax
deductions
• These examples are called traditional database applications
• More Recent Applications:
– Internet Applications (YouTube, iTunes, etc.)
– Geographic Information Systems (GIS)
– Data Warehouses - a large store of data accumulated from a wide range of sources
within a company and used to guide management decisions.
• Databases touch all aspects of our lives
A database can be of any size and complexity
For example:
• A list of names and addresses
• IRS
(assume it has 10 million taxpayers and each taxpayer files 5
forms with 400 characters of information per form = 20 Gbytes)
• Amazon.com
(15 million people visit per day; about 100 people are
responsible for database update)
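The IRS size estimate can be checked with a few lines of arithmetic. The figures come from the example above; the one-byte-per-character storage and the decimal gigabyte are assumptions.

```python
# Figures taken from the IRS example.
taxpayers = 10_000_000
forms_per_taxpayer = 5
chars_per_form = 400

# Assumption: one character occupies one byte.
total_bytes = taxpayers * forms_per_taxpayer * chars_per_form

# Assumption: decimal gigabytes (10**9 bytes).
total_gbytes = total_bytes / 10**9
print(total_gbytes)
```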
Why Study Databases?
• Shift from computation to information
– at the “low end”: webspace
– at the “high end”: scientific applications
• Datasets increasing in diversity and volume.
– Digital libraries, interactive video, etc.
– ... need for DBMS exploding
• DBMS encompasses most of CS
– OS, languages, AI, multimedia, logic
Conventional File Processing
• Traditionally, manual file systems were composed of a collection
of file folders kept in a file cabinet; organization within
folders was based on the data’s expected use (ideally logically
related)
• Finding and using data in growing collections of file folders
became time-consuming and cumbersome
• Initially, computer files were similar in design to manual files
(see Figure 1.2)
• Programs belonging to a specific application are assigned files
in the most appropriate structure for the application. File
descriptions are stored within each application program that
accesses a given file. These files are convenient for the
specific application. (see Figure 1.3)
Figure 1.2 - Contents of Customer File
Basic File Terminology
Figure 1.3 - Three file processing systems at a company
File descriptions are stored within each application program that
accesses a given file. Any change to a file structure requires changes
to the file descriptions for all programs that access the file.
File Systems
• Must write special programs to answer each question a user
may want to ask about data
• Must protect data from inconsistent changes made by
different users accessing data concurrently
• Must cope with system crashes to ensure data consistency
• Need to enforce security policies in which different users have
permission to access different subsets of the data
Database systems offer solutions to all the above problems
Drawbacks of File Systems
• As the number of applications grows, the following problems
occur:
– Program-Data Dependence (see Fig.) – file descriptions are stored
within each application that accesses the file, so change to file
structure requires changes to file descriptions in all programs.
– Data Redundancy (Duplication of data) – wasteful, inconsistent, loss
of metadata integrity (same data has different names in different files,
or same name may be used for different data in different files).
– Limited Data Sharing – users have little opportunity to share data
outside their own applications.
– Lengthy Development Times – little opportunity to re-use previous
development efforts.
– Excessive Program Maintenance – factors above combine to create
heavy maintenance load
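Program-data dependence can be made concrete with a small sketch. The fixed-width record layout, the field widths, the sample customer, and the function names are all hypothetical; the point is only that the file description lives inside the program.

```python
# Hypothetical fixed-width customer record: name (10 chars) + phone (8 chars).
OLD_RECORD = "Ama Mensah" + "02441234"

def billing_program(record):
    # The file description (field positions) is hard-coded in the program.
    return {"name": record[0:10].strip(), "phone": record[10:18].strip()}

# The file structure changes: the name field grows to 15 characters.
NEW_RECORD = "Ama Mensah".ljust(15) + "02441234"

def billing_program_v2(record):
    # Every program that reads the file has to be edited like this.
    return {"name": record[0:15].strip(), "phone": record[15:23].strip()}

stale = billing_program(NEW_RECORD)     # old layout applied to the new file
fixed = billing_program_v2(NEW_RECORD)  # layout updated by hand
print(stale["phone"], fixed["phone"])
```

The unmodified program does not fail loudly; it silently reads a truncated phone number, which is part of why maintenance in file-based systems is costly.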
Contrasting Database and File Systems
Historical Perspective
• Early 1960s
– In the early 1960s, Charles W. Bachman (at GE) designed the
Integrated Data Store (IDS), the “first” DBMS.
– Formed basis for network data model
• Late 1960s
– IBM, not wanting to be left out, created a database system
of their own, known as IMS.
– Information Management System (IMS), used even today
in some major installations
– IMS formed the basis for hierarchical data model
• Difficulties: hard to access data (navigational, record-at-a-time
procedures), limited data independence, no widely
accepted theoretical model (unlike relational)
Historical Perspective 2
• 1970s
– Edgar Codd, at IBM’s San Jose Research Laboratory,
proposed relational data model - all data represented in
the form of tables.
– It sparked the rapid development of several DBMSs based
on relational model - Oracle, DB2, Ingres, etc.
– INGRES used a query language known as QUEL, which in
turn pressured IBM to develop SQL in 1974, which was
more advanced
– Database systems matured as an academic discipline
– The benefits of DBMS were widely recognized, and the use
of DBMSs for managing corporate data became standard
practice.
Historical Perspective 3
• 1980s
– Relational data model consolidated its position as
dominant DBMS paradigm, and database systems
continued to gain widespread use.
– SQL became ANSI and ISO standards in 1986 and
1987, respectively. SQL quickly replaced QUEL as the more
functional query language.
• Challenge
– RDBMSs were an efficient way to handle structured data.
Then, processing speeds got faster, and “unstructured”
data (art, photographs, music, etc.) became more
commonplace. Unstructured data is non-relational, and RDBMSs
simply were not designed to handle this kind of data.
Historical Perspective 4
• Late 1980s till now
– Considerable research into more powerful query language
and richer data model, with emphasis on supporting
complex analysis of data from all parts of an enterprise
– Object-oriented databases appeared, but some organisations
have to handle large amounts of both structured and
unstructured data, so object-relational databases were developed.
– Several vendors, e.g., IBM’s DB2, Oracle 8, Informix UDS,
extended their systems with the ability to store new data
types such as images and text, and to ask more complex
queries
– Data warehouses have been developed by many vendors
to consolidate data from several databases, and for
carrying out specialized analysis
DBMS TYPES
• Hierarchical – prehistoric – IMS
• Network – historic – IDMS, ADABAS; led to object-oriented
• Relational – current – 95% of the market – Oracle,
Informix, SQL Server, Progress, IBM DB2, etc.
• Object-oriented – current – a lot of hoo-ha but a very narrow
market, mainly CAD and engineering – Objectivity,
Versant, Jasmine
• Object-relational – current/future – SQL3, Informix
UDO, Oracle 9, IBM DB2.
The Database System Environment
• A database system is composed of four
main parts:
1. Hardware
2. Software
• Operating system software
• DBMS software
• Application programs and utility software
3. Users
4. Data
Figure 1.4 - Simplified database system environment
A UNIVERSITY example
• A UNIVERSITY database for maintaining information
concerning students, courses, and grades in a university
environment
• We have:
STUDENT : stores data on each student
Attributes are Name, StdID#, Major, etc
COURSE : stores data on each course
Attributes are CourseName, Course#, CreditHrs, Dept. etc
SECTION : stores data on each section of each course
Attributes: SectionID, Course#, Semester, Year, Instructor
GRADE_REPORT : stores the grades that students receive
Attributes: StdID#, SectionID, Grade
PREREQUISITE : stores the prerequisites
Attributes: Course#, Prerequisite#
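The five UNIVERSITY files above could be declared as tables along the following lines. This is a sketch, not the schema of any particular system: SQLite is used through Python's sqlite3 module, the column types and keys are assumptions, and the "#" attribute names are spelled StdID, CourseNo, and PrerequisiteNo because "#" is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        StdID  INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL,
        Major  TEXT
    );
    CREATE TABLE COURSE (
        CourseNo   TEXT PRIMARY KEY,
        CourseName TEXT NOT NULL,
        CreditHrs  INTEGER,
        Dept       TEXT
    );
    CREATE TABLE SECTION (
        SectionID  INTEGER PRIMARY KEY,
        CourseNo   TEXT REFERENCES COURSE(CourseNo),
        Semester   TEXT,
        Year       INTEGER,
        Instructor TEXT
    );
    CREATE TABLE GRADE_REPORT (
        StdID     INTEGER REFERENCES STUDENT(StdID),
        SectionID INTEGER REFERENCES SECTION(SectionID),
        Grade     TEXT,
        PRIMARY KEY (StdID, SectionID)
    );
    CREATE TABLE PREREQUISITE (
        CourseNo       TEXT REFERENCES COURSE(CourseNo),
        PrerequisiteNo TEXT REFERENCES COURSE(CourseNo),
        PRIMARY KEY (CourseNo, PrerequisiteNo)
    );
""")

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)
```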
Database manipulation
• Database manipulation involves querying and
updating
• Examples of querying are:
Retrieve a transcript
List the prerequisites of the “Database” course
• Examples of updating are:
Enter a grade of “A” for “Smith” in “Database” course
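The prerequisite query and the grade update can be written in SQL roughly as follows. The cut-down tables, the column names (CourseNo, PrerequisiteNo, StdID), and every sample row are assumptions made so that the sketch runs on its own with Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE (CourseNo TEXT PRIMARY KEY, CourseName TEXT);
    CREATE TABLE PREREQUISITE (CourseNo TEXT, PrerequisiteNo TEXT);
    CREATE TABLE STUDENT (StdID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SECTION (SectionID INTEGER PRIMARY KEY, CourseNo TEXT);
    CREATE TABLE GRADE_REPORT (StdID INTEGER, SectionID INTEGER, Grade TEXT);

    -- Made-up sample data.
    INSERT INTO COURSE VALUES ('DCIT305', 'Database');
    INSERT INTO COURSE VALUES ('DCIT303', 'Data Structures');
    INSERT INTO PREREQUISITE VALUES ('DCIT305', 'DCIT303');
    INSERT INTO STUDENT VALUES (17, 'Smith');
    INSERT INTO SECTION VALUES (85, 'DCIT305');
    INSERT INTO GRADE_REPORT VALUES (17, 85, NULL);
""")

# Query: list the prerequisites of the "Database" course.
prereqs = conn.execute("""
    SELECT p.CourseName
    FROM COURSE c
    JOIN PREREQUISITE r ON r.CourseNo = c.CourseNo
    JOIN COURSE p ON p.CourseNo = r.PrerequisiteNo
    WHERE c.CourseName = 'Database'
""").fetchall()

# Update: enter a grade of "A" for "Smith" in the "Database" course.
conn.execute("""
    UPDATE GRADE_REPORT SET Grade = 'A'
    WHERE StdID = (SELECT StdID FROM STUDENT WHERE Name = 'Smith')
      AND SectionID IN (
          SELECT s.SectionID FROM SECTION s
          JOIN COURSE c ON c.CourseNo = s.CourseNo
          WHERE c.CourseName = 'Database')
""")
grade = conn.execute(
    "SELECT Grade FROM GRADE_REPORT WHERE StdID = 17"
).fetchone()[0]
print(prereqs, grade)
```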
Characteristics of the Database
• In the database approach, a single repository of data is
maintained that is defined once then accessed by various
users
• The characteristics of a DB system are:
1. Self-describing nature of a DB
2. Insulation between programs and data
3. Support of multiple views of the data
4. Sharing of data and multiuser transaction processing
1. Self-describing nature of a DB system
• A database system contains not only the database
itself but also a complete definition of the database
structure and constraints, stored in the DBMS catalog
• The information stored in the catalog is called metadata,
and it describes the structure of the primary
database.
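One way to see the self-describing property is to ask a DBMS for its own catalog. The sketch below assumes SQLite, whose catalog table is sqlite_master and whose PRAGMA table_info lists a table's columns; other systems expose the same information under different names. The Student table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Name TEXT, IDNO INTEGER, Major TEXT)")

# The catalog records the name and definition of every table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column; fields 1 and 2 are the
# column name and its declared type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Student)")]
print(catalog[0][0], columns)
```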
Figure 1.5 - Example of a simplified Meta-data
Figure 1.6 - Illustrating Metadata with MS Access
2. Insulation between programs and data
• In file processing, any changes to the structure of a
file may require changing all programs that access
the file
• In a database system, the structure of data files is
stored in the DBMS catalog separately from the
access programs
• This is called program-data independence
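Program-data independence can be illustrated by changing a table's structure while a program that uses it stays untouched. The sketch assumes SQLite via Python's sqlite3 module; the Course table, its sample row, and the credit_hours function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course (CourseID TEXT, CreditHrs INTEGER)")
conn.execute("INSERT INTO Course VALUES ('DCIT305', 3)")

def credit_hours(course_id):
    # The program names the columns it needs; it holds no description of
    # how the DBMS lays the data out.
    row = conn.execute(
        "SELECT CreditHrs FROM Course WHERE CourseID = ?", (course_id,)
    ).fetchone()
    return row[0]

before = credit_hours("DCIT305")

# Structure change: a new column is added to the table.
conn.execute("ALTER TABLE Course ADD COLUMN Instructor varchar(25)")

after = credit_hours("DCIT305")  # same program, not edited
print(before, after)
```

Independence has limits: dropping or renaming a column that the program names would still break it.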
3. Support of multiple views of the data
• Each user may see a different view of the
database, which describes only the data of
interest to that user
• It may also contain some virtual data that is
derived from the database files but is not
explicitly stored
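A view with derived (virtual) data might look like the following sketch. It assumes SQLite via Python's sqlite3 module; the numeric Points column, the STUDENT_AVERAGE view, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GRADE_REPORT (StdID INTEGER, SectionID INTEGER, Points REAL);
    INSERT INTO GRADE_REPORT VALUES (17, 85, 4.0);
    INSERT INTO GRADE_REPORT VALUES (17, 92, 3.0);
    INSERT INTO GRADE_REPORT VALUES (8, 85, 2.0);

    -- A view for one kind of user: one row per student, with an average
    -- that is computed on demand rather than stored.
    CREATE VIEW STUDENT_AVERAGE AS
        SELECT StdID, AVG(Points) AS AvgPoints
        FROM GRADE_REPORT
        GROUP BY StdID;
""")

view_rows = conn.execute(
    "SELECT StdID, AvgPoints FROM STUDENT_AVERAGE ORDER BY StdID"
).fetchall()
print(view_rows)
```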
4. Sharing of data and multi-user transaction processing
• Allowing a set of concurrent users to retrieve from
and to update the database.
• Concurrency control within the DBMS guarantees
that each transaction is correctly executed or
aborted
– For example, when several reservation clerks try to assign
a seat on an airplane flight
– (these types of applications are generally called online
transaction processing (OLTP))
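The seat-assignment case can be sketched as below. This is a sequential simulation on a single connection, so it shows only the outcome a DBMS must guarantee (one clerk gets the seat), not the locking or versioning used internally. The Seat table, the seat number, the passenger names, and assign_seat are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Seat (SeatNo TEXT PRIMARY KEY, Passenger TEXT)")
conn.execute("INSERT INTO Seat VALUES ('12A', NULL)")
conn.commit()

def assign_seat(seat_no, passenger):
    # The availability check and the write happen in one statement, so a
    # seat that is already taken is never overwritten.
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE Seat SET Passenger = ? "
            "WHERE SeatNo = ? AND Passenger IS NULL",
            (passenger, seat_no),
        )
    return cur.rowcount == 1

first = assign_seat("12A", "Elsie")   # first clerk
second = assign_seat("12A", "Smith")  # second clerk, same seat
holder = conn.execute(
    "SELECT Passenger FROM Seat WHERE SeatNo = '12A'"
).fetchone()[0]
print(first, second, holder)
```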
Overall System Structure
• A typical DBMS has a layered architecture.
• The figure does not show the concurrency
control and recovery components.
• This is one of several possible architectures;
each system has its own variations.
Overall System Structure
• A transaction is any one execution of a user program
in a DBMS (executing the same program several
times will generate several transactions). This is the
basic unit of change as seen by the DBMS: partial
transactions are not allowed
• Transaction-management component ensures that
the database remains in a consistent (correct) state
despite system failures (e.g., power failures and
operating system crashes) and transaction failures.
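The rule that partial transactions are not allowed can be demonstrated with a transfer that fails halfway. The sketch assumes SQLite via Python's sqlite3 module; the Account table, the CHECK constraint, the balances, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Name TEXT PRIMARY KEY, "
    "Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("A", "B", 500)  # would overdraw A, so the CHECK fails
balances = conn.execute(
    "SELECT Name, Balance FROM Account ORDER BY Name"
).fetchall()
print(ok, balances)
```

The credit to B is executed first and then undone by the rollback, so neither half of the failed transfer is visible afterwards.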
Overall System Structure
• Storage manager provides the interface
between the low-level data stored in the
database and the application programs and
queries submitted to the system.
• The storage manager is responsible for the
efficient storing, retrieving and updating of
data
Advantages of a DBMS
• Data independence
• Efficient data access
• Data integrity & security
• Data administration
• Concurrent access, crash recovery
• Reduced application development time
• So why not use them always?
– Expensive/complicated to set up & maintain
– This cost & complexity must be offset by need
Types of Databases
A DBMS can support many different types of databases. Databases
can be classified according to the number of users, database
location(s), and expected type and extent of use.
1. Classification according to Number of Users
• Single-user: Supports only one user at a time
• Desktop: Single-user database running on a personal computer
• Multi-user: Supports multiple users at the same time
• Workgroup: Multi-user database that supports a small group of
users (usually fewer than 50) or a single department
• Enterprise: Multi-user database that supports a large group of
users (across many departments) or an entire organization
Types of Databases (cont’d)
2. Location might also be used to classify the
database.
• Centralized:
– Supports data located at a single site
• Distributed:
– Supports data distributed across several sites
Types of Databases (cont’d)
3. Extent of data usage
The most popular way of classifying databases today,
however, is based on how they will be used and on the
time sensitivity of the information gathered from them.
• Transactional (or production):
– A database that is designed primarily to support a
company’s day-to-day operations is classified as an
operational database (sometimes referred to as a
transactional or production database).
• Data warehouse:
– Stores data used to generate information required to make
tactical or strategic decisions
– Often used to store historical data
Types of Databases (cont’d)
4. Extensible Markup Language (XML)
• Databases can also be classified to reflect the degree to which the
data are structured. Unstructured data are data that exist in the
format in which they were collected.
• Therefore, unstructured data exist in a format that does not lend itself
to the processing that yields information.
• Database types mentioned thus far focus on storage and management
of highly structured data. However, corporations also use unstructured
data, such as company e-mails, memos, documents such as
procedures and rules, Web pages, etc.
• Unstructured and semi-structured data storage and management
needs are being addressed through a generation of databases known
as XML databases.
• Extensible Markup Language (XML) is a special language used to
represent and manipulate data elements in a textual format.
• An XML database supports the storage and management of XML data.
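As a small illustration of data represented in a textual XML format, the sketch below parses a record with Python's standard xml.etree.ElementTree module. The element names, the id attribute, and the sample values are hypothetical, and a real XML database would offer far more than parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical student record expressed as XML text.
doc = """<students>
  <student id="1001"><name>Elsie</name><course>DCIT305</course></student>
</students>"""

root = ET.fromstring(doc)

# Extract the data elements from each student entry.
records = [
    (s.get("id"), s.findtext("name"), s.findtext("course"))
    for s in root.findall("student")
]
print(records)
```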
The table compares the features of some well-
known database management systems.
Database Users
• Users are differentiated by the way they expect to
interact with the system
• Database administrators:
– Responsible for authorizing access to the database,
coordinating and monitoring its use, acquiring software and
hardware resources, and monitoring the
efficiency of operations.
• Database Designers:
– Responsible for defining the content, the structure, the
constraints, and the functions or transactions against the database.
They must communicate with the end-users and understand
their needs.
• End Users
• System Analysts
Database Administrator (DBA)
• Coordinates all the activities of the database system; the
DBA has a good understanding of the enterprise’s
information resources and needs.
• Database administrator's duties include:
– Schema definition: design logical/physical schemas
– Storage structure and access method definition
– Schema and physical organization modification
– Handling security and granting user authority to access
database
– Specifying integrity constraints
– Acting as liaison with users
– Data availability, crash recovery
– Monitoring performance and responding to changes in
requirements
End Users
• Casual: access the database occasionally, using a sophisticated query
language when needed (e.g., a manager).
• Naïve: they make up a large section of the end-user
population. They invoke one of the permanent application
programs that have been written previously, or learn only a
few facilities that they use repeatedly
– E.g. people accessing database over the web, bank tellers, clerical staff
• Sophisticated: These include business analysts, scientists,
engineers, others thoroughly familiar with the system
capabilities.
• Application programmers/Specialized users: These write
specialized database applications that do not fit into the
traditional data processing framework. Build data entry &
analysis tools and web services on top of DBMSs
SQL - A language for Relational DBs
• SQL (a.k.a. “Sequel”)
– Stands for Structured Query Language
• ANSI (American National Standards Institute) standard
computer language for accessing and manipulating
database systems.
Two sub-languages:
• Data Definition Language (DDL)
– create, modify, delete tables (relations)
– specify constraints
– administer users, security, etc.
• Data Manipulation Language (DML)
– Specify queries to find records that satisfy criteria
– add, modify, remove records
DML Examples
• List students whose names start with “K” and whose GPA is above 3.25.
SELECT *
FROM Student
WHERE Student.Name LIKE 'K%' AND
Student.GPA > 3.25;
• List the names and ages (1 d.p.) of CS female students.
SELECT name, ROUND((DATE()-dob)/365,1) AS age
FROM student WHERE Major='CS' AND Gender='F';
• Change the credit hours of DCIT305 to 4.
UPDATE Course SET CreditHrs=4 WHERE CourseID='DCIT305';
• Remove the prerequisites recorded for DCIT303.
DELETE FROM Prerequisite WHERE CourseID='DCIT303';
DDL Examples
CREATE DATABASE CSD2014;
CREATE TABLE Student (
Name varchar(25),
IDNO int(10),
Major varchar(10),
DOB date,
Gender varchar(1));
ALTER TABLE Course ADD Instructor varchar(25);
DROP TABLE Section;
Summary
• Information is derived from data, which is stored in a database
• To implement and manage a database, use a DBMS
• Database design defines its structure
• Good DB design is important
• Databases were preceded by file systems
• Because file systems lack a DBMS, file management becomes difficult as a
file system grows
• DBMS were developed to address file systems’ inherent weaknesses
• Other benefits of DBMS include:
– recovery from system crashes,
– concurrent access,
– quick application development,
– data integrity and security.
• A DBMS typically has a layered architecture.
• DBAs hold responsible jobs and are well-paid!
• DBMS R&D is one of the broadest, most exciting areas in CS.