DATABASES
Institute of Geographical Information Systems
THE WORLD IS INCREASINGLY DRIVEN BY DATA…
• THIS CLASS TEACHES THE BASICS OF
HOW TO USE & MANAGE DATA
Road Map for Today
Data vs. Information
Metadata
File System
Database
Drawbacks of File Systems
DBMS
Data VS. Information
Data
1. A given fact; statement, number or picture
2. Represents something in real world
3. No context, Raw facts
Information
1. Data that has meaning within a context
2. Data processed and analyzed to be useful in decision making
• Summarized
• Organized
• Analyzed
Data VS. Information
Data Information
1. 2.9
2. 3.4
3. 2.8
4. 3.7
Data with Context
Data VS. Information
Data
Information
1. 10% FIN
2. 15% MKT
3. 20% MGT
Summarized data
Metadata
Descriptions of the properties or characteristics of the data,
including data types, field sizes, allowable values, and
documentation
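As a rough illustration of metadata, the sketch below asks SQLite (via Python's standard sqlite3 module) to describe a table. The table name `student` and its columns are assumptions made up for this example, not part of the course material.

```python
import sqlite3

# In-memory database; the "student" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        gpa        REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, pk flag)
metadata = conn.execute("PRAGMA table_info(student)").fetchall()
columns = [(row[1], row[2]) for row in metadata]

for name, declared_type in columns:
    print(name, declared_type)
```

The rows returned here describe the data (column names and types) rather than containing any student records, which is the sense in which they are metadata.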
What is a File?
A collection of records or documents dealing with
one organization, person, area or subject.
• Manual (paper) Files
• Computer Files
What is a Database?
An organized collection of logically related Data
• Models a real-world enterprise
• Entities (e.g., Students, Courses)
• Relationships (e.g., Alice is enrolled in 145)
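One possible way to picture entities and relationships is a small SQLite schema, sketched below with Python's sqlite3 module. The table and column names (`student`, `course`, `enrolled`) and the course title are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Entities: one table per kind of real-world thing.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Relationship: which student is enrolled in which course.
    CREATE TABLE enrolled (
        student_id INTEGER REFERENCES student(student_id),
        course_id  INTEGER REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student  VALUES (1, 'Alice');
    INSERT INTO course   VALUES (145, 'Course 145');
    INSERT INTO enrolled VALUES (1, 145);
    """
)

# "Alice is enrolled in 145" is recovered by joining through the relationship.
roster = conn.execute(
    """
    SELECT s.name, c.course_id
    FROM enrolled e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
    """
).fetchall()
print(roster)
```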
Database- Example
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate transcripts
Database Applications
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Drawbacks of using file systems
Data Redundancy and Inconsistency
• Multiple file formats, duplication of information
in different files
Difficulty in accessing data
• Need to write a new program to carry out each
new task
Data isolation
• Multiple Files and Formats
Drawbacks of using file systems
Integrity Problems
• Integrity constraints (e.g., account balance
> 0) become “buried” in program code rather
than being stated explicitly
• Hard to add new constraints or change existing
ones
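For contrast, a DBMS lets the constraint be stated explicitly next to the data. The sketch below uses SQLite through Python's sqlite3 module; the `account` table and the amounts are assumptions chosen to mirror the balance example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance > 0" is declared once, in the schema,
# instead of being buried in every program that touches accounts.
conn.execute(
    """
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        balance    REAL NOT NULL CHECK (balance > 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 100.0)")

try:
    # Withdrawing 150 would leave a negative balance.
    conn.execute("UPDATE account SET balance = balance - 150 WHERE account_id = 1")
    rejected = False
except sqlite3.IntegrityError:
    # SQLite refuses the update because it violates the CHECK constraint.
    rejected = True

balance = conn.execute(
    "SELECT balance FROM account WHERE account_id = 1"
).fetchone()[0]
print(rejected, balance)
```

Changing the rule later means altering one table definition rather than hunting through application code.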
Atomicity of updates
• Failures may leave database in an inconsistent
state with partial updates carried out e.g.,
funds transfer from one account to another
should either complete or not happen at all
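The funds-transfer case can be sketched with a database transaction. This is a minimal example using Python's sqlite3 module; the `account` table, the `transfer` helper and the amounts are hypothetical, and the credit is deliberately applied before the debit so that a failure happens halfway through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 20.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute(
        "SELECT account_id, balance FROM account ORDER BY account_id"
    ).fetchall()


# The debit fails (insufficient funds), so the earlier credit is undone too.
failed = transfer(conn, 1, 2, 500.0)
after_failed = balances(conn)

# A valid transfer applies both updates together.
succeeded = transfer(conn, 1, 2, 30.0)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```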
Drawbacks of using file systems
Concurrent Access by Multiple Users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
• Example: Two people reading a balance (say
100) and updating it by withdrawing money
(say 50 each) at the same time
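The withdrawal example can be replayed step by step. The sketch below is a simplified, single-connection simulation of the interleaving (not real multi-user concurrency), written with Python's sqlite3 module; the `account` table and helper name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO account VALUES (1, 100)")


def read_balance():
    return conn.execute(
        "SELECT balance FROM account WHERE account_id = 1"
    ).fetchone()[0]


# Uncontrolled interleaving: both users read the balance before either writes.
seen_by_a = read_balance()
seen_by_b = read_balance()
conn.execute("UPDATE account SET balance = ? WHERE account_id = 1", (seen_by_a - 50,))
conn.execute("UPDATE account SET balance = ? WHERE account_id = 1", (seen_by_b - 50,))
lost_update_balance = read_balance()  # one withdrawal has been overwritten

# Reset, then let the database perform each read-modify-write as one statement.
conn.execute("UPDATE account SET balance = 100 WHERE account_id = 1")
conn.execute("UPDATE account SET balance = balance - 50 WHERE account_id = 1")
conn.execute("UPDATE account SET balance = balance - 50 WHERE account_id = 1")
controlled_balance = read_balance()

print(lost_update_balance, controlled_balance)
```

In the first run 100 is withdrawn in total but the stored balance only drops by 50; in the second run both withdrawals are reflected.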
Drawbacks of using File Systems
Lengthy Development Times
• Programmers must design their own file formats
Excessive Program Maintenance
• Can consume as much as 80% of the information systems budget
Security problems
• Hard to provide user access to some, but not all,
data
Problems with Data Dependency
• Each application programmer must maintain
their own data
• Each application program needs to include code
for the metadata of each file
• Each application program must have its own
processing routines for reading, inserting,
updating and deleting data
• Lack of coordination and central control
• Non-standard file formats
Problems with Data Redundancy
• Waste of space to have duplicate data
• Causes more maintenance headaches
• The biggest problem: when data changes in one file but not in
its duplicates, the copies become inconsistent
• Compromises Data Integrity
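A tiny sketch of how duplicated data drifts apart, in plain Python. The two "files" are modelled as dictionaries, and the customer ID, name and addresses are invented for illustration.

```python
# Two hypothetical application files, each keeping its own copy of the address.
orders_file = {"C001": {"name": "Alice", "address": "12 Oak St"}}
billing_file = {"C001": {"name": "Alice", "address": "12 Oak St"}}

# The order-entry program records a change of address in its own file only.
orders_file["C001"]["address"] = "99 Pine Ave"

# The two copies now disagree, and nothing in either file says which is right.
inconsistent = [
    customer_id
    for customer_id in orders_file
    if orders_file[customer_id]["address"] != billing_file[customer_id]["address"]
]
print(inconsistent)
```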
Figure: Three file processing systems at Pine Valley Furniture (duplicate data)
© Prentice Hall, 2002
SOLUTION: The DATABASE Approach
• Central repository of shared data
• Data is managed by a controlling agent
• Stored in a standardized, convenient form
Requires a Database Management System
(DBMS)
What is a DBMS?
A Database Management System (DBMS) is a piece of
software designed to store and manage databases
Software system used to define, create, maintain and
provide controlled access to the database and its metadata
DBMS
Figure: Applications #1, #2 and #3 reach the database containing centralized shared data through the DBMS
DBMS manages data resources like an operating system manages hardware resources
Components of Database System
The database system can be divided into four
components.
Users : Users may be of various types, such as DB
Administrator, System Developer and End users
Database Application : Database Application may
be Personal, Departmental, Enterprise etc.
DBMS : Software that allows users to define, create
and manage database access, e.g., MySQL, Oracle
etc.
Database : Collection of logically related data
DBMS Goals
Database Types
Based on number of users
• Single-user DBMS
• Multi-user DBMS
Based on site location
• Centralized DBMS
• Parallel DBMS
• Distributed DBMS
• Client/Server DBMS
User Number Based?
Single User System
• Database resides on one computer and is only accessed by
one user at a time
• This one user may design, maintain, and write database
programs
Multi-user System
• Due to large amount of data management, most systems are
multi-user
• In this situation the data are both integrated and shared
What is an Integrated Database?
A database is integrated when the same information
is not recorded in two places.
For example, both the Library Department and
the Account Department of the college database
may need student addresses. Even though both
departments may access different portions of the
database, the students' addresses should only
reside in one place.
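The college example might look like the following in SQLite (Python's sqlite3 module). The table names `student`, `library_loan` and `account_fee`, and all sample values, are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The address is stored exactly once, in the student table.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT,
        address    TEXT
    );
    -- Each department keeps only its own data plus a reference to the student.
    CREATE TABLE library_loan (
        loan_id    INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES student(student_id),
        book       TEXT
    );
    CREATE TABLE account_fee (
        fee_id     INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES student(student_id),
        amount     REAL
    );
    INSERT INTO student      VALUES (1, 'Alice', '12 Oak St');
    INSERT INTO library_loan VALUES (1, 1, 'Database Systems');
    INSERT INTO account_fee  VALUES (1, 1, 250.0);
    """
)

# One update to the single stored address...
conn.execute("UPDATE student SET address = '99 Pine Ave' WHERE student_id = 1")

# ...is seen by both departments, because both look it up in the same place.
library_address = conn.execute(
    "SELECT s.address FROM library_loan l JOIN student s ON s.student_id = l.student_id"
).fetchone()[0]
accounts_address = conn.execute(
    "SELECT s.address FROM account_fee f JOIN student s ON s.student_id = f.student_id"
).fetchone()[0]
print(library_address, accounts_address)
```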
Centralized Database System
• Physically confined to a single location
• Data is accessed from the multiple sites using computer network
• Site: single processor together with its associated data storage devices and other peripherals
• Database is maintained at the central location
Drawbacks of Centralized Database System
• When the central site computer or database
system goes down, then all users are blocked
from using the system until the system comes
back
• Communication costs from the terminals to the
central site can be expensive
Parallel Database System
• Multiple CPUs
• CPUs and data storage disks in parallel
Parallel Database System
Advantages
• Useful for the applications that query extremely large
databases (in terabytes, that is, on the order of 10^12 bytes)
• or that must process an extremely large number of
transactions per second
• High throughput
• Fast response time
Disadvantages
• Increased startup cost associated with initiating a
single process and the startup-time
• Slowdowns as a new process competes with existing
processes for commonly held resources, such as
shared data storage disks and so on
Distributed Database System
• Single logical database split into several fragments
• Fragment: stored on one or more computers under the control of a separate DBMS
• Machines are spread (or distributed) geographically
• Computers connected by a communication network
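To make the idea of fragments concrete, here is a toy sketch in Python. Two separate in-memory SQLite databases stand in for two sites; a real distributed DBMS would do this routing and merging itself over a network. The site names, the `customer` table and the helpers `make_fragment` and `query_all_sites` are all hypothetical.

```python
import sqlite3


def make_fragment(rows):
    """Create one site's fragment of the logical customer table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
    )
    db.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    db.commit()
    return db


# Each site stores only the rows for its own region, under its own database engine.
sites = {
    "site_north": make_fragment([(1, "Alice", "north")]),
    "site_south": make_fragment([(2, "Bob", "south"), (3, "Carol", "south")]),
}


def query_all_sites(sites):
    """Present the fragments as a single logical table by merging the results."""
    rows = []
    for db in sites.values():
        rows.extend(db.execute("SELECT customer_id, name, region FROM customer"))
    return sorted(rows)


all_customers = query_all_sites(sites)
print(all_customers)
```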
Distributed Database System
Advantages
• Greater efficiency and better performance
• A single database (on server) can
be shared across several distinct
client (application) systems.
• As data volumes and transaction
rates increase, users can grow
the system incrementally
• It causes less impact on ongoing
operations when adding new
locations
• Distributed database system
provides local autonomy
Disadvantages
• Recovery from failure is more complex
Client/Server Systems
• The client–server model of computing is a
distributed application structure that
partitions tasks or workloads between the
providers of a resource or service, called
Servers, and service requesters, called Clients
Servers
• A server is software that offers “services” to
other software.
• For instance, a Web server provides Web
pages that are requested by a browser
• Databases usually behave as servers
• Some machines are optimized to host server
software; they are also commonly referred to
as servers
Copyright © 2012 Pearson Education, Inc.
Chapter3.35
Publishing as Prentice Hall
Clients
• Clients are software that request services
• A browser, for instance, requests a Web page
to load and view
• An application client can request data from a
database
Client Server Database System
• “Front End”: The Applications and Tools of DBMS run on one or more client platforms
• “Back End”: DBMS Software resides on the server
Figure: several Clients connected through a Network to the Database Server
Client Server Database System
Advantages
• Less expensive
• More flexible
• Fast response time and high throughput
• Better DBMS performance
• Better interfaces
• High availability
• Overall improved ease of use to the user
Disadvantages
• High Programming cost
• Lack of management tools for diagnosis,
performance monitoring and tuning and security
control, for the DBMS, Client and Operating
Systems and Networking Environments
Functions Served by Databases
• There are several different functions a
database can serve
• Three of them are:
– Transaction Database
– Management Information System
– Business Intelligence
Transaction Databases
• These are databases that are optimized to
collect and process business transactions such
as sales
• They need to be fast and efficient
• They often need to be available 24 hours a
day, 7 days a week
Management Information Systems
• Management information systems are
optimized to process the transaction
information, creating summaries and reports
that are useful to business managers
• They often work with a copy of the transaction
data so as not to slow down the transaction
database
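A minimal sketch of reporting against a copy of the transaction data, using Python's sqlite3 module and its `Connection.backup` method. The `sale` table and the sample sales are invented for illustration.

```python
import sqlite3

# The transaction database collects individual sales as they happen.
transactions = sqlite3.connect(":memory:")
transactions.execute(
    "CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, product TEXT, amount REAL)"
)
transactions.executemany(
    "INSERT INTO sale (product, amount) VALUES (?, ?)",
    [("chair", 40.0), ("table", 120.0), ("chair", 60.0)],
)
transactions.commit()

# Reports run against a separate copy so they do not slow down the live system.
reporting = sqlite3.connect(":memory:")
transactions.backup(reporting)

# A summary of the kind a manager might read: sales count and revenue per product.
summary = reporting.execute(
    """
    SELECT product, COUNT(*), SUM(amount)
    FROM sale
    GROUP BY product
    ORDER BY product
    """
).fetchall()
print(summary)
```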
Business Intelligence
• Business Intelligence moves beyond
Management Information Systems
• It provides tools for “Mining” data to look for
patterns and trends that might help the
business to improve its offerings or service
Quiz 1
• Briefly explain the components of a database
system.