Week 1: Introduction: Database System Concepts


Adapted for SEN2104 – DBMS

Week 1: Introduction

Database System Concepts


©Silberschatz, Korth and Sudarshan
Database Management System (DBMS)

▪ DBMS contains information about a particular enterprise


▪ Collection of interrelated data
▪ Set of programs to access the data
▪ An environment that is both convenient and efficient to use
▪ Database Applications:
▪ Banking: transactions
▪ Airlines: reservations, schedules
▪ Universities: registration, grades
▪ Sales: customers, products, purchases
▪ Online retailers: order tracking, customized recommendations
▪ Manufacturing: production, inventory, orders, supply chain
▪ Human resources: employee records, salaries, tax deductions
▪ Databases can be very large.
▪ Databases touch all aspects of our lives

University Database Example
▪ Application program examples
▪ Add new students, instructors, and courses
▪ Register students for courses, and generate class rosters
▪ Assign grades to students, compute grade point
averages (GPA) and generate transcripts
▪ In the early days, database applications were built directly on
top of file systems

Drawbacks of using file systems to store data

▪ Data redundancy and inconsistency


▪ Multiple file formats, duplication of information in different files
▪ Difficulty in accessing data
▪ Need to write a new program to carry out each new task
▪ Data isolation
▪ Multiple files and formats
▪ Integrity problems
▪ Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
▪ Hard to add new constraints or change existing ones
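
As a minimal sketch of stating an integrity constraint explicitly instead of burying it in program code, the example below uses Python's built-in sqlite3 module. The account table, its columns, and the try_insert helper are hypothetical names chosen for illustration; the check mirrors the balance > 0 example above.

```python
import sqlite3


def try_insert(conn, account_id, balance):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO account (id, balance) VALUES (?, ?)",
                (account_id, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# The rule is part of the schema, so every program writing to the table
# is subject to it; no application has to re-implement the check.
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance NUMERIC NOT NULL CHECK (balance > 0))"
)

accepted = try_insert(conn, "A-101", 500)   # satisfies balance > 0
rejected = try_insert(conn, "A-102", -20)   # violates balance > 0
print(accepted, rejected)
```

Changing the rule then means changing one line of the schema rather than hunting through every application program.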

Drawbacks of using file systems to store data (Cont.)

▪ Atomicity of updates
▪ Failures may leave database in an inconsistent state with partial
updates carried out
▪ Example: Transfer of funds from one account to another should
either complete or not happen at all
▪ Concurrent access by multiple users
▪ Concurrent access needed for performance
▪ Uncontrolled concurrent accesses can lead to inconsistencies
▪ Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the same
time
▪ Security problems
▪ Hard to provide user access to some, but not all, data

Database systems offer solutions to all the above problems
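
The two examples above (the funds transfer and the simultaneous withdrawals) can be sketched as follows, using Python's built-in sqlite3 module. This is an illustration under stated assumptions, not a production design: the account table, the make_bank, transfer and lost_update helpers, and the balance >= 0 rule (relaxed from > 0 so an account may be empty) are all hypothetical.

```python
import sqlite3


def make_bank():
    """Create a hypothetical two-account bank: A holds 100, B holds 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())


def transfer(conn, src, dst, amount):
    """Move amount from src to dst: both updates happen, or neither does."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # If this debit violates the CHECK, the credit above is undone too.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def lost_update():
    """Replay the two-withdrawal interleaving with no concurrency control."""
    balance = 100
    read_1 = balance        # first person reads 100
    read_2 = balance        # second person also reads 100
    balance = read_1 - 50   # first write stores 50
    balance = read_2 - 50   # second write overwrites it with 50 again
    return balance          # 50, although 100 was withdrawn in total


bank = make_bank()
ok = transfer(bank, "A", "B", 50)       # succeeds
failed = transfer(bank, "A", "B", 80)   # would overdraw A, so it is rolled back
print(ok, failed, balances(bank))
print(lost_update())
```

The failed transfer leaves both balances untouched even though its first update had already run, which is the all-or-nothing behaviour atomicity asks for. The lost_update function only replays the bad interleaving in plain Python; a DBMS prevents it by controlling how concurrent transactions interleave.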

Relational Model
▪ All the data is stored in various tables.
▪ Example of tabular data in the relational model (figure not reproduced: a table with its columns and rows labeled)
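
A small sketch of what "columns" and "rows" mean in practice, using Python's built-in sqlite3 module. The instructor table layout follows the DDL example later in these slides; the three rows are invented for illustration and are not taken from the original figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC)"
)
# Illustrative rows only; they do not come from the slide's figure.
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("10001", "Ada", "Physics", 95000),
        ("10002", "Ben", "Music", 40000),
        ("10003", "Cy", "Physics", 87000),
    ],
)

cursor = conn.execute("SELECT * FROM instructor ORDER BY ID")
columns = [d[0] for d in cursor.description]  # the attributes (column names)
rows = cursor.fetchall()                      # the tuples (one per instructor)

print(columns)
for row in rows:
    print(row)
```

Each column holds one attribute for every instructor, and each row holds all attributes of one instructor.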

A Sample Relational Database

Database Design
The process of designing the general structure of the database:

▪ Logical Design – Deciding on the database schema.


Database design requires that we find a “good” collection of
relation schemas.
▪ Business decision – What attributes should we record in
the database?
▪ Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
▪ Physical Design – Deciding on the physical layout of the
database

Database Design (Cont.)
▪ Is there any problem with this relation?

Design Approaches
▪ Need to come up with a methodology to ensure that each of the
relations in the database is “good”
▪ Two ways of doing so:
▪ Entity Relationship Model
▪ Models an enterprise as a collection of entities and
relationships
▪ Represented diagrammatically by an entity-relationship
diagram
▪ Normalization Theory
▪ Formalize what designs are bad, and test for them
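
To make "formalize what designs are bad, and test for them" concrete, here is a simplified sketch in plain Python. It assumes a hypothetical combined relation inst_dept in which every department sits in one building; the relation, its rows, and the helper names holds and project are illustrative and are not the relation from the slides. Normalization theory proper reasons about keys and functional dependencies on the schema; this sketch only checks a dependency on one sample instance.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs values
    return True


def project(rows, attrs):
    """Relational projection with duplicates removed, in first-seen order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical combined relation: the building is repeated per instructor.
inst_dept = [
    {"ID": "1", "name": "Ada", "dept_name": "Physics", "building": "Watson"},
    {"ID": "2", "name": "Ben", "dept_name": "Physics", "building": "Watson"},
    {"ID": "3", "name": "Cy", "dept_name": "Music", "building": "Packard"},
]

# Decomposing on dept_name -> building stores each building fact once.
instructor = project(inst_dept, ["ID", "name", "dept_name"])
department = project(inst_dept, ["dept_name", "building"])

# Update anomaly: changing the building in only one row breaks the dependency.
broken = [dict(row) for row in inst_dept]
broken[0]["building"] = "Taylor"

print(holds(inst_dept, ["dept_name"], ["building"]))  # dependency holds
print(len(department), len(instructor))
print(holds(broken, ["dept_name"], ["building"]))     # dependency violated
```

In the combined relation the fact "Physics is in Watson" is stored once per Physics instructor, so a partial update can make the data contradict itself; after decomposition it is stored in exactly one row of department.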

Data Definition Language (DDL)
▪ Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
▪ DDL compiler generates a set of table templates stored in a data dictionary
▪ Data dictionary contains metadata (i.e., data about data)
▪ Database schema
▪ Integrity constraints
▪ Primary key (ID uniquely identifies instructors)
▪ Authorization
▪ Who can access what
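
The DDL example above can be run as-is against SQLite through Python's built-in sqlite3 module, and the stored metadata can then be read back. This is a sketch under assumptions: the PRIMARY KEY clause is added here to reflect the statement that ID uniquely identifies instructors, the inserted rows are invented, and sqlite_master and PRAGMA table_info are SQLite's particular form of a data dictionary (other systems expose theirs differently). SQLite has no user authorization, so that part is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE instructor (
        ID        char(5),
        name      varchar(20),
        dept_name varchar(20),
        salary    numeric(8,2),
        PRIMARY KEY (ID)
    )
    """
)

# Metadata ("data about data") is itself stored in a queryable catalog.
stored_ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'instructor'"
).fetchone()[0]

# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
column_info = conn.execute("PRAGMA table_info(instructor)").fetchall()
column_names = [col[1] for col in column_info]
primary_key = [col[1] for col in column_info if col[5] > 0]

# The primary-key constraint is enforced by the system, not by the application.
conn.execute("INSERT INTO instructor VALUES ('10001', 'Ada', 'Physics', 95000)")
try:
    conn.execute("INSERT INTO instructor VALUES ('10001', 'Ben', 'Music', 40000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(column_names, primary_key, duplicate_rejected)
```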

Data Manipulation Language (DML)
▪ Language for accessing and manipulating the data organized
by the appropriate data model
▪ DML also known as query language
▪ Two classes of languages
▪ Pure – used for proving properties about computational
power and for optimization
▪ Relational Algebra
▪ Tuple relational calculus
▪ Domain relational calculus
▪ Commercial – used in commercial systems
▪ SQL is the most widely used commercial language
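
The relationship between a pure language and a commercial one can be sketched by answering the same question twice: once with hand-written relational-algebra operators (selection, then projection) and once with SQL. The helper names select and project, the sample rows, and the salary threshold are assumptions made for illustration.

```python
import sqlite3

ATTRS = ("ID", "name", "dept_name", "salary")
ROWS = [  # illustrative data only
    ("10001", "Ada", "Physics", 95000),
    ("10002", "Ben", "Music", 40000),
    ("10003", "Cy", "Physics", 87000),
]


def select(relation, predicate):
    """Relational-algebra selection: keep the tuples satisfying predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attrs):
    """Relational-algebra projection: keep attrs and drop duplicate tuples."""
    result = []
    for t in relation:
        projected = tuple(t[a] for a in attrs)
        if projected not in result:
            result.append(projected)
    return result


# Pure form: project name over (select salary > 70000 over instructor).
instructor = [dict(zip(ATTRS, row)) for row in ROWS]
algebra_result = project(select(instructor, lambda t: t["salary"] > 70000), ["name"])

# Commercial form: the same query written in SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC)"
)
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", ROWS)
sql_result = conn.execute(
    "SELECT DISTINCT name FROM instructor WHERE salary > 70000 ORDER BY name"
).fetchall()

print(sorted(algebra_result) == sql_result)
```

Both forms return the same set of names; the SQL version simply leaves the choice of evaluation strategy to the database system.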

SQL

▪ The most widely used commercial language


▪ To be able to compute complex functions, SQL is usually
embedded in some higher-level language
▪ Application programs generally access databases through one of:
▪ Language extensions to allow embedded SQL
▪ Application program interfaces (e.g., ODBC/JDBC), which allow
SQL queries to be sent to a database
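
As a rough analogue of the ODBC/JDBC style of access, the sketch below uses Python's database API through the built-in sqlite3 module: the program sends a parameterized SQL query to the database and finishes the computation in the host language. The median is used here because SQLite has no built-in median function. The table contents and the function names instructors_in and median_salary are hypothetical.

```python
import sqlite3
import statistics


def instructors_in(conn, dept_name):
    """Send a parameterized SQL query to the database and fetch the rows."""
    cursor = conn.execute(
        "SELECT name, salary FROM instructor WHERE dept_name = ? ORDER BY name",
        (dept_name,),  # bound as a parameter, never pasted into the SQL text
    )
    return cursor.fetchall()


def median_salary(conn, dept_name):
    """Finish in the host language a computation the SQL query does not do."""
    salaries = [salary for _, salary in instructors_in(conn, dept_name)]
    return statistics.median(salaries) if salaries else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [  # illustrative data only
        ("10001", "Ada", "Physics", 95000),
        ("10002", "Ben", "Physics", 62000),
        ("10003", "Cy", "Physics", 87000),
    ],
)

print(instructors_in(conn, "Physics"))
print(median_salary(conn, "Physics"))
print(median_salary(conn, "Music"))
```

Passing values as parameters rather than building the SQL string by hand keeps user input from being interpreted as SQL.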

Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Query Processing (Cont.)
▪ Alternative ways of evaluating a given query
▪ Equivalent expressions
▪ Different algorithms for each operation
▪ Cost difference between a good and a bad way of evaluating a
query can be enormous
▪ Need to estimate the cost of operations
▪ Depends critically on statistical information about relations
which the database must maintain
▪ Need to estimate statistics for intermediate results to compute
cost of complex expressions
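
One way to watch a system choose between alternative evaluation strategies is SQLite's EXPLAIN QUERY PLAN statement, called here through Python's built-in sqlite3 module. This is a sketch: the plan helper, the index name, and the sample rows are assumptions, and the exact wording of the plan text differs between SQLite versions, so the code only relies on the last column of each plan row.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the descriptive text of each step in SQLite's chosen plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [  # illustrative data only
        ("10001", "Ada", "Physics", 95000),
        ("10002", "Ben", "Music", 40000),
        ("10003", "Cy", "Physics", 87000),
    ],
)

query = "SELECT name FROM instructor WHERE dept_name = ? ORDER BY name"

# Without an index the only option is to scan every row of the table.
plan_before = plan(conn, query, ("Physics",))
result_before = conn.execute(query, ("Physics",)).fetchall()

# With an index on dept_name, a cheaper evaluation strategy becomes available.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
plan_after = plan(conn, query, ("Physics",))
result_after = conn.execute(query, ("Physics",)).fetchall()

print(plan_before)
print(plan_after)
print(result_before == result_after)
```

The two plans are different ways of evaluating the same query and return the same answer; only the cost differs. On a three-row table the difference is negligible, but on large relations it can be enormous, which is why the optimizer depends on statistics about the data.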

XML: Extensible Markup Language
▪ Defined by the WWW Consortium (W3C)
▪ Originally intended as a document markup language, not a
database language
▪ The ability to specify new tags, and to create nested tag structures
made XML a great way to exchange data, not just documents
▪ XML has become the basis for many new-generation data interchange
formats.
▪ A wide variety of tools is available for parsing, browsing and
querying XML documents/data
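
A short sketch of nested, user-defined tags carrying data, parsed and queried with Python's built-in xml.etree.ElementTree module. The tag names (university, department, instructor) and the values are invented for illustration; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tags are made up by the author of the data.
DOCUMENT = """\
<university>
  <department name="Physics">
    <instructor id="10001"><name>Ada</name><salary>95000</salary></instructor>
    <instructor id="10003"><name>Cy</name><salary>87000</salary></instructor>
  </department>
  <department name="Music">
    <instructor id="10002"><name>Ben</name><salary>40000</salary></instructor>
  </department>
</university>
"""

root = ET.fromstring(DOCUMENT)

# Walk the nesting: departments contain instructors.
names_by_dept = {
    dept.get("name"): [inst.findtext("name") for inst in dept.findall("instructor")]
    for dept in root.findall("department")
}

# A simple query over the whole document, regardless of nesting depth.
high_paid = [
    inst.findtext("name")
    for inst in root.iter("instructor")
    if float(inst.findtext("salary")) > 70000
]

print(names_by_dept)
print(high_paid)
```

Unlike a relational table, the structure travels with the data, which is what makes the format convenient for exchanging data between systems.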

History of Database Systems
▪ 1950s and early 1960s:
▪ Data processing using magnetic tapes for storage
▪ Tapes provided only sequential access
▪ Punched cards for input
▪ Late 1960s and 1970s:
▪ Hard disks allowed direct access to data
▪ Network and hierarchical data models in widespread use
▪ Ted Codd defines the relational data model
▪ Would win the ACM Turing Award for this work
▪ IBM Research begins System R prototype
▪ UC Berkeley begins Ingres prototype
▪ High-performance (for the era) transaction processing

History (cont.)
▪ 1980s:
▪ Research relational prototypes evolve into commercial systems
▪ SQL becomes an industry standard
▪ Parallel and distributed database systems
▪ Object-oriented database systems
▪ 1990s:
▪ Large decision support and data-mining applications
▪ Large multi-terabyte data warehouses
▪ Emergence of Web commerce
▪ Early 2000s:
▪ XML and XQuery standards
▪ Automated database administration
▪ Later 2000s:
▪ Giant data storage systems
▪ Google Bigtable, Yahoo PNUTS, Amazon, ...
