Module-1 (2)

The document outlines a course on Database Management Systems, detailing the syllabus which includes modules on database architectures, data models, relational algebra, normalization, and transaction management. It also provides information on the course faculty, textbooks, and the importance of databases in various applications. Additionally, it defines key concepts such as data, information, DBMS, and the advantages of using database systems over traditional file-based systems.

Uploaded by

rishav ray
Copyright
© All Rights Reserved

Database Management Systems

1
Course Faculty Details

Dr. Taranath N L
Associate Professor,
Dept. of CSE,
Alliance University Bengaluru

Contact No. - +91-9448658482


Email – [email protected]

2
Course Syllabus
MODULE 1: Introduction to Database and its Architectures

Overall Introduction to the syllabus, Advantages of using the DBMS approach,
Characteristics of the database Approach, Actors on the scene, Workers behind
the scene, Database system concepts and architecture: Data Models, schemas,
and Instances, Three-Schema Architecture and data independence, Database
languages and interfaces, Database system environment, Centralized and
Client/server architectures of DBMSs, Classification of Database Management
Systems

MODULE 2: Data Models

Relational Data Model Introduction, Relational Data Model and Relational
Database constraints: Relational Model Concepts and constraints, Relational
database schemas, Update Operations, Transactions and Dealing with Constraint
Violations, Conceptual Modeling, Database Design: Introduction, Data Modeling
using the ER model: High-level conceptual data models for Database Design,
Entity types, Entity sets, attributes and keys, Relationship types, Relationship
sets, roles, and structural constraints, Weak entity types, Refining the ER
Design for the Company Database, Naming conventions and Design Issues.

MODULE 3: Relational Algebra and SQL / PL programming

Relational Algebra and Relational Calculus: Select and Project operations,
Operations from set theory, Join and division, Queries in Relational algebra,
Concept of DDL, DML, DCL, Basic Structure of SQL Queries, Set operations,
3
Course Syllabus
MODULE 4: Functional Dependency and Normalization

Relational Database Design: Functional Dependency, Different
anomalies in designing a Database, Normalization using functional
dependencies, 1NF, 2NF, Boyce-Codd Normal Form, 3NF, Normalization
using multi-valued dependencies, 4NF, 5NF

MODULE 5: Transaction and locking mechanism in Databases

Transaction and System Concepts, Desirable Properties of
Transactions, Characterizing schedules based on Recoverability and
Serializability, Transaction Support in SQL, Concurrency Control
Techniques: Two-Phase Locking Techniques, Concurrency Control
Based on Timestamp Ordering, Multiversion Concurrency Control
Techniques, Granularity of data items and Multiple Granularity, Locking
Issues

4
Books
Text Books :
1. Ramez Elmasri, Shamkant B. Navathe, "Fundamentals of Database
Systems", 6th Edition, Pearson Publishers, 2013. ISBN-10: 0-136-08620-9,
ISBN-13: 978-0-136-08620-8.

2. Henry F. Korth and Abraham Silberschatz, "Database System
Concepts", 6th Edition, McGraw-Hill, 2012. ISBN-10: 0073523321,
ISBN-13: 978-0073523323.

Reference Books :
1. Ramakrishnan, "Database Management System", 3rd Edition,
McGraw-Hill, 2014. ISBN-10: 0072465638, ISBN-13: 978-0072465631.

2. Date C. J., "Introduction to Database Management", 8th Edition,
Addison Wesley, 2012. ISBN-10: 0321197844, ISBN-13: 978-0321197849.
5
What is Data?
Data is a set of values of qualitative or quantitative
variables; restated, pieces of data are individual
pieces of information.
Data is measured, collected and reported, and
analyzed, whereupon it can be visualized using graphs
or images.
Data as a general concept refers to the fact that some
existing information or knowledge is represented or
coded in some form suitable for better usage or
processing.
Raw data, i.e. unprocessed data, is a collection of
numbers or characters; data processing commonly
occurs by stages, and the "processed data" from one
stage may be considered the "raw data" of the next.
6
What is Data Processing?
Any operation or set of operations performed
upon data, whether or not by automatic means,
such as collection, recording, organization,
storage, adaptation or alteration to convert
it into useful information.
Broadly, the collection and manipulation of items of
data to produce meaningful information.

7
Data Processing System
A data processing system may involve some
combination of:
Conversion – converting data to another format.
Validation – ensuring that supplied data is "clean,
correct and useful."
Sorting – "arranging items in some sequence and/or in
different sets."
Summarization – reducing detail data to its main
points.
Aggregation – combining multiple pieces of data.
Analysis – the "collection, organization, analysis,
interpretation and presentation of data."
Reporting – listing detail or summary data or computed
information.
8
What is Information?
A collection of data which conveys some meaningful
idea is information. It may provide answers to questions
like who, which, when, why, what, and how.
OR
The raw input is data and it has no significance when it
exists in that form. When data is collated or organized
into something meaningful, it gains significance. This
meaningful organization is information.
OR
Observations and recordings are done to obtain data,
while analysis is done to obtain information

9
Traditional File Based Systems
Predecessor to the DBMS.
A file system is a hierarchical description of the
folders on a drive and information about the files
inside them.
It handles the movement, creation and deletion of
those folders and files.
A collection of application programs that perform
services for the end-users such as the production
of reports.
Each program defines and manages its own data.
File based systems were developed as better
alternatives to paper based filing systems.
10
File System v/s DBMS
        Advantages                       Disadvantages
FMS     • Simpler to use                 • Typically no multi-user access
        • Less expensive                 • Limited to smaller databases
                                         • Limited functionality
                                         • Decentralization of data
                                         • Redundancy and integrity issues
DBMS    • Greater flexibility            • Difficult to learn
        • Greater processing power       • Packaged separately from the OS
        • Ensures data integrity         • Requires skilled administrators
        • Supports simultaneous access   • Expensive
11
What is a database?
A shared collection of logically related data, and a
description of this data, designed to meet the
information needs of an organization.
It is the collection of schemas, tables, queries,
reports, views and other objects.
In other words, a database is a collection of
information that is organized so that it can easily be
accessed, managed, and updated.
The data is typically organized to model aspects of
reality in a way that supports processes requiring
information, such as modelling the availability of
rooms in hotels in a way that supports finding a hotel
with vacancies.
Often abbreviated DB.
12
What is DBMS?
A software system that enables users to define,
create, maintain, and control access to the
database.
DBMS contains information about a particular
enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and
efficient to use

13
Database Applications
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized
recommendations
Manufacturing: production, inventory, orders,
supply chain
Human resources: employee records, salaries,
tax deductions
Databases touch all aspects of our lives
14
Purpose / Benefits of Database Systems
In the early days, database applications were
built directly on top of file systems
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
• Multiple file formats, duplication of information in
different files
Difficulty in accessing data
• Need to write a new program to carry out each new
task
Data isolation: multiple files and formats
Integrity problems
• Integrity constraints (e.g. account balance > 0)
become “buried” in program code rather than being
stated explicitly
15
Purpose / Benefits of Database Systems
Drawbacks of using file systems (cont.)
Atomicity of updates
• Failures may leave database in an inconsistent state
with partial updates carried out
• Example: Transfer of funds from one account to
another should either complete or not happen at all
Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
• Example: Two people reading a balance and
updating it at the same time
Security problems
• Hard to provide user access to some, but not all,
data
Database systems offer solutions to all the above problems
16
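The atomicity example above (a funds transfer that must either complete or not happen at all) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `account` table, its columns, the starting balances and the `transfer` helper are all hypothetical, and the constraint is written as `balance >= 0` rather than the slide's `balance > 0`.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic unit (hypothetical helper)."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            # Credit first, so a failing debit leaves a partial update to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

ok = transfer(conn, "A", "B", 50)       # enough funds: both updates are kept
failed = transfer(conn, "A", "B", 500)  # overdraw: both updates are undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

Note that the integrity constraint is declared once in the schema instead of being buried in application code, which is the point made on the previous slide.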
Levels of Abstraction
Physical level: describes how a record
(e.g., instructor) is stored.
Logical level: describes data stored in
database, and the relationships among the
data.
    type instructor = record
        ID : string;
        name : string;
        dept_name : string;
        salary : integer;
    end;
View level: application programs hide
details of data types. Views can also hide
information (such as an employee’s salary)
17
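A view that hides the salary column, as described at the view level above, might look like the following sketch. It uses Python's sqlite3 module; the view name `instructor_public` and the sample row are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full instructor relation, including salary.
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT,
        name      TEXT,
        dept_name TEXT,
        salary    INTEGER
    )""")
conn.execute(
    "INSERT INTO instructor VALUES ('10101', 'A. Example', 'CSE', 65000)")

# View level: expose every column except salary (hypothetical view name).
conn.execute("""
    CREATE VIEW instructor_public AS
    SELECT ID, name, dept_name FROM instructor""")

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

A program that reads only `instructor_public` never sees the salary, even though the value is still stored in the underlying table.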
View of Data
An architecture for a database system

18
Instances and Schemas
• Similar to types and variables in programming languages
• Logical Schema – the overall logical structure of the
database
• Example: The database consists of information about a
set of customers and accounts in a bank and the
relationship between them
• Analogous to type information of a variable in a
program
• Physical schema – the overall physical structure of the
database
• Instance – the actual content of the database at a
particular point in time
• Analogous to the value of a variable
• Physical Data Independence – the ability to modify the
physical schema without changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels
19
Data Models
A collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
Relational model
Entity-Relationship data model (mainly for
database design)
Object-based data models (Object-oriented and
Object-relational)
Semistructured data model (XML)
Other older models:
• Network model
• Hierarchical model

20
Relational Model
All the data is stored in various tables.
[Figure: example of tabular data in the relational model, with columns and rows labeled]

21
A Sample Database

22
Database Languages
A database system provides a data-definition
language to specify the database schema and a
data-manipulation language to express database
queries and updates.
In practice, the data definition and data-
manipulation languages are not two separate
languages; instead they simply form parts of a
single database language, such as the widely used
SQL language.

23
Data Definition Language (DDL)
Specification notation for defining the database
schema
Example:
    create table instructor (
        ID char(5),
        name varchar(20),
        dept_name varchar(20),
        salary numeric(8,2))
DDL compiler generates a set of table templates
stored in a data dictionary
Data dictionary contains metadata (i.e., data
about data)
Database schema
Integrity constraints
• Primary key (ID uniquely identifies
instructors)
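As a rough illustration of the primary-key constraint named above, the sketch below runs the slide's `create table` statement with an added `PRIMARY KEY (ID)` clause. It uses Python's sqlite3 module as a convenient stand-in for a DBMS; the clause and the two sample rows are assumptions, not part of the original slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The slide's DDL, plus a primary-key declaration (added for illustration).
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5),
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2),
        PRIMARY KEY (ID)
    )""")

conn.execute(
    "INSERT INTO instructor VALUES ('10101', 'A. Example', 'CSE', 65000)")
try:
    # Same ID again: the DBMS, not the application, rejects the row.
    conn.execute(
        "INSERT INTO instructor VALUES ('10101', 'B. Example', 'CSE', 70000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```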
24

Data Definition Language (DDL)
• Domain Constraints. A domain of possible values must be
associated with every attribute (for example, integer types,
character types, date/time types). Declaring an attribute to
be of a particular domain acts as a constraint on the values
that it can take. Domain constraints are the most elementary
form of integrity constraint. They are tested easily by the
system whenever a new data item is entered into the
database.
• Referential Integrity. There are cases where we wish to
ensure that a value that appears in one relation for a given
set of attributes also appears in a certain set of attributes in
another relation (referential integrity).
For example, the department listed for each course
must be one that actually exists. More precisely, the dept_name
value in a course record must appear in the dept_name
attribute of some record of the department relation. Database
modifications can cause violations of referential integrity.
When a referential-integrity constraint is violated, the normal
procedure is to reject the action that caused the violation.
25
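The course/department example above can be tried out with the following sketch, again using Python's sqlite3 module. The table layouts and the sample rows are assumptions for illustration. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, which is why that statement appears first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT REFERENCES department (dept_name)
    )""")

conn.execute("INSERT INTO department VALUES ('CSE')")
# Accepted: 'CSE' appears in the department relation.
conn.execute("INSERT INTO course VALUES ('CS-101', 'Databases', 'CSE')")

try:
    # Rejected: no department record has dept_name 'History'.
    conn.execute("INSERT INTO course VALUES ('HS-201', 'History', 'History')")
    violation_rejected = False
except sqlite3.IntegrityError:
    violation_rejected = True

course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```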
Data Definition Language (DDL)
• Assertions. An assertion is any condition that the database
must always satisfy. Domain constraints and referential-
integrity constraints are special forms of assertions.
However, there are many constraints that we cannot express
by using only these special forms.
For example, “Every department must have at least five
courses offered every semester” must be expressed as an
assertion. When an assertion is created, the system tests it for
validity. If the assertion is valid, then any future modification to
the database is allowed only if it does not cause that assertion
to be violated.
• Authorization. We may want to differentiate among the
users as far as the type of access they are permitted on
various data values in the database. These differentiations
are expressed in terms of authorization, the most common
being: read authorization, which allows reading, but not
modification, of data; insert authorization, which allows
insertion of new data, but not modification of existing data;
update authorization, which allows modification, but not
26
Data Manipulation Language (DML)
Data-Manipulation Language
A data-manipulation language (DML) is a
language that enables users to access or manipulate
data as organized by the appropriate data model.
The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
There are basically two types:
• Procedural DMLs require a user to specify what
data are needed and how to get those data.
• Declarative DMLs (also referred to as
nonprocedural DMLs) require a user to specify what
data are needed
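The difference between the two styles can be made concrete with a small sketch in Python and sqlite3. The `instructor` rows below are invented for illustration. The first half spells out how to find the answer, the second half states only what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
    ("1", "Asha", "CSE", 90000),
    ("2", "Ravi", "ECE", 70000),
    ("3", "Meena", "CSE", 60000),
])

# Procedural style: the program says HOW (scan every row, test, collect, sort).
procedural = []
for name, dept, salary in conn.execute(
        "SELECT name, dept_name, salary FROM instructor"):
    if dept == "CSE" and salary > 65000:
        procedural.append(name)
procedural.sort()

# Declarative style: the query says only WHAT; the DBMS chooses the access path.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM instructor"
    " WHERE dept_name = 'CSE' AND salary > 65000"
    " ORDER BY name")]
```

Both approaches return the same names. Only the declarative form leaves the system free to use an index or a different evaluation order.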
27
Data Manipulation Language (DML)
Declarative DMLs are usually easier to learn
and use than are procedural DMLs. However, since
a user does not have to specify how to get the
data, the database system has to figure out
an efficient means of accessing data.
A query is a statement requesting the
retrieval of information. The portion of a DML that
involves information retrieval is called a query
language. Although technically incorrect, it is
common practice to use the terms query language
and data-manipulation language synonymously.

28
29
Data Dictionary
• Data elements that are defined in all tables of all
databases. Specifically, the data dictionary stores
the names, data types, display formats, internal
storage formats, and validation rules. The data
dictionary tells where an element is used, by
whom it is used, and so on.
• Tables defined in all databases. For example, the
data dictionary is likely to store the name of the
table creator, the date of creation, access
authorizations, the number of columns, and so on.
• Indexes defined for each database table. For each
index, the DBMS stores at least the index name, the
attributes used, the location, specific index
characteristics, and the creation date.
• Defined databases: who created each database, the
date of creation, where the database is located,
who the DBA is, and so on.
30
Data Dictionary Continued…..
End users and administrators of the
database.
Programs that access the database,
including screen formats, report formats,
application formats, SQL queries, and so on.
Access authorization for all users of all
databases.
Relationships among data elements: which
elements are involved, whether the
relationships are mandatory or optional, the
connectivity and cardinality, and so on.
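Most systems let you query the data dictionary like any other data. The sketch below uses SQLite's own catalog as a small example. `sqlite_master` and `PRAGMA table_info` are SQLite-specific and hold far less than the full dictionary described above. The table and index names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)")
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")

# Tables and indexes the database knows about (SQLite's catalog table).
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name").fetchall()

# Column metadata; each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(instructor)")]
```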

31
Data Dictionary Continued…..

32
SQL
The most widely used commercial
language
SQL is NOT a Turing machine equivalent
language
To be able to compute complex functions
SQL is usually embedded in some higher-
level language
Application programs generally access
databases through one of
Language extensions to allow embedded
SQL
Application program interface (e.g.,
ODBC/JDBC) which allow SQL queries to be
33
Database Design
Logical Design – Deciding on the database
schema. Database design requires that we
find a “good” collection of relation
schemas.
Business decision – What attributes should
we record in the database?
Computer Science decision – What relation
schemas should we have and how should
the attributes be distributed among the
various relation schemas?
Physical Design – Deciding on the physical
layout of the database
34
Database Design

35
Database Design

36
Design Approaches
Need to come up with a methodology to
ensure that each of the relations in the
database is “good”
Two ways of doing so:
Entity Relationship Model (Module-2)
• Models an enterprise as a collection of entities
and relationships
• Represented diagrammatically by an entity-relationship
diagram
Normalization Theory (Module-4)
• Formalize what designs are bad, and test for
them

37
Object-Relational Data Models
Relational model: flat, “atomic” values
Object Relational Data Models
Extend the relational data model by
including object orientation and constructs
to deal with added data types.
Allow attributes of tuples to have complex
types, including non-atomic values such as
nested relations.
Preserve relational foundations, in particular
the declarative access to data, while
extending modeling power.
Provide upward compatibility with existing
relational languages.
38
XML: Extensible Markup Language
Defined by the WWW Consortium (W3C)
Originally intended as a document markup
language not a database language
The ability to specify new tags, and to
create nested tag structures made XML a
great way to exchange data, not just
documents
XML has become the basis for all new
generation data interchange formats.
A wide variety of tools is available for
parsing, browsing and querying XML
documents/data
39
Database Engine
Storage manager
Query processing
Transaction manager

40
Storage Management
• A storage manager is a program module
that provides the interface between the low-level
data stored in the database and the
application programs and queries submitted
to the system.
• The storage manager is responsible for the
interaction with the file manager.
• The raw data are stored on the disk using
the file system, which is usually provided by
a conventional operating system.
• The storage manager translates the various
DML statements into low-level file-system
commands. Thus, the storage manager is
responsible for storing, retrieving, and
updating data in the database.
41
Storage Manager Components
• Authorization and integrity manager, which
tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
• Transaction manager, which ensures that the
database remains in a consistent (correct) state
despite system failures, and that concurrent
transaction executions proceed without conflicting.
• File manager, which manages the allocation of
space on disk storage and the data structures used
to represent information stored on disk.
• Buffer manager, which is responsible for fetching
data from disk storage into main memory, and
deciding what data to cache in main memory. The
buffer manager is a critical part of the database
system, since it enables the database to handle
data sizes that are much larger than the size of
main memory.
42
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

43
Query Processing
Parser: During the parse call, the database performs the
following checks: syntax check, semantic check and
shared pool check, after converting the query into
relational algebra.

44
Query Processing
Syntax check – verifies the syntactic validity of the SQL
statement. Example:
SELECT * FORM employee
Here the misspelling of FROM is reported by this
check.

Semantic check – determines whether the
statement is meaningful or not. Example: a query that
refers to a table name which does not exist is caught
by this check.

Shared pool check – every query is assigned a hash
code during its execution. This check determines whether
that hash code already exists in the shared pool; if it
does, the database will not take
additional steps for optimization and execution.
45
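The first two checks can be observed from a client program. The sketch below uses Python's sqlite3 module as a stand-in. SQLite has no shared pool, so only the syntax and semantic checks are shown, and the `check` helper and the `employee` columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def check(sql):
    """Return the error message reported for `sql`, or None if it is accepted."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)

# Syntax check: FROM is misspelled, so the statement cannot even be parsed.
syntax_error = check("SELECT * FORM employee")

# Semantic check: the statement is well formed, but the table does not exist.
semantic_error = check("SELECT * FROM employee")

# Once the table exists, the same statement passes both checks.
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
no_error = check("SELECT * FROM employee")
```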
Query Processing
Optimizer: During the optimization stage, the database must
perform a hard parse at least once for each unique DML
statement and performs optimization during this parse. The
database never optimizes DDL unless it includes a DML
component, such as a subquery, that requires optimization.
It is a process in which multiple query execution plans
for satisfying a query are examined and the most efficient
query plan is selected for execution.
The database catalog stores the execution plans, and
the optimizer then passes the lowest-cost plan for execution.
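Many systems let you ask the optimizer which plan it chose. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. SQLite is a stand-in chosen only because it ships with Python, and its planner differs in detail from the hard-parse and catalog behaviour described above. The table, the index name and the `plan` helper are assumptions, and the exact plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)")
query = "SELECT name FROM instructor WHERE dept_name = 'CSE'"

def plan(sql):
    """Return the textual steps of the plan SQLite selected for `sql`."""
    # The last column of each EXPLAIN QUERY PLAN row is the description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)  # no index exists, so a full table scan is chosen

conn.execute("CREATE INDEX idx_dept ON instructor (dept_name)")
plan_after = plan(query)   # an index lookup is now a cheaper alternative
```

Comparing `plan_before` and `plan_after` shows the same query being satisfied by two different execution plans, depending on which access paths are available.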

Row Source Generation –
The row source generator is software that
receives the optimal execution plan from the optimizer and
produces an iterative execution plan that is usable by the
rest of the database. The iterative plan is a binary
program that, when executed by the SQL engine, produces
46
Transaction Management
What if the system fails?
What if more than one user is concurrently
updating the same data?
A transaction is a collection of operations
that performs a single logical function in a
database application
Transaction-management component
ensures that the database remains in a
consistent (correct) state despite system
failures (e.g., power failures and operating
system crashes) and transaction failures.
Concurrency-control manager controls
the interaction among the concurrent
transactions, to ensure the consistency of
the database.
47
Actors on and Behind the Scene

48
Database System Environment

49
Database Users and Administrators

Database

50
51
Database Administrator

52
Database Users
• Naive users are unsophisticated users who
interact with the system by invoking one of
the application programs that have been
written previously.
• For example, a bank teller who needs to
transfer $50 from account A to account B
invokes a program called transfer.
• This program asks the teller for the amount
of money to be transferred, the account
from which the money is to be transferred,
and the account to which the money is to
be transferred.

53
Database Users
• Application programmers are computer
professionals who write application programs.
Application programmers can choose from many
tools to develop user interfaces.
• Rapid application development (RAD) tools are
tools that enable an application programmer to
construct forms and reports without writing a
program.
• There are also special types of programming
languages that combine imperative control
structures (for example, for loops, while loops and
if-then-else statements) with statements of the
data manipulation language.
• These languages, sometimes called fourth-generation
languages, often include special
features to facilitate the generation of forms and
the display of data on the screen. Most major
commercial database systems include a fourth-generation
language.
54
Database Users
• Sophisticated users interact with the
system without writing programs.
• Instead, they form their requests in a
database query language.
• They submit each such query to a query
processor, whose function is to break down
DML statements into instructions that the
storage manager understands.
• Analysts who submit queries to explore data
in the database fall in this category.

55
Database System Internals

56
Database Architectures
The architecture of a database system is
greatly influenced by the underlying
computer system on which the database is
running:
Centralized
Client-server
Parallel (multi-processor)
Distributed

57
Centralized Architecture

58
59
Parallel Architecture

60
Distributed Architecture

61
Database Architecture….

62
2 Tier Architecture
• In a two-tier architecture, the application
resides at the client machine, where it
invokes database system functionality at
the server machine through query
language statements.
• Application program interface standards
like ODBC and JDBC are used for
interaction between the client and the
server.
• In contrast, in a three-tier architecture, the
client machine acts as merely a front end
and does not contain any direct database
calls.
• Instead, the client end communicates with
an application server, usually through a
63
3 Tier Architecture
• The application server in turn
communicates with a database system to
access data.
• The business logic of the application, which
says what actions to carry out under what
conditions, is embedded in the application
server, instead of being distributed across
multiple clients.
• Three-tier applications are more appropriate
for large applications, and for applications
that run on the World Wide Web.
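The division of responsibilities in a three-tier design can be sketched in a few lines of Python. This is only a schematic under stated assumptions: plain function calls stand in for the network hops between tiers, an in-memory SQLite database stands in for the database server, and the `ApplicationServer` class, its `withdraw` rule and the `account` table are hypothetical.

```python
import sqlite3

# Data tier: the database system (an in-memory SQLite database stands in here).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO account VALUES ('A', 100)")
db.commit()

class ApplicationServer:
    """Middle tier: holds the business logic and is the only code issuing SQL."""

    def __init__(self, conn):
        self._conn = conn

    def withdraw(self, account_id, amount):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)).fetchone()
        if row is None:
            return {"ok": False, "reason": "unknown account"}
        if amount <= 0 or amount > row[0]:  # business rule lives on the server
            return {"ok": False, "reason": "invalid amount"}
        with self._conn:  # commit the update as one transaction
            self._conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id))
        return {"ok": True, "balance": row[0] - amount}

def client_request(server, account_id, amount):
    """Client tier: a front end that forwards requests and makes no database calls."""
    return server.withdraw(account_id, amount)

server = ApplicationServer(db)
accepted = client_request(server, "A", 30)
rejected = client_request(server, "A", 500)
```

Because the withdrawal rule sits in one place on the server, changing it does not require updating every client.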

64
65
66
Physical Data Independence | Logical Data Independence
Deals with storage of data | Deals with structure or changing the data definition
Data retrieval is easy | Data retrieval is difficult
Easy to achieve | Difficult to achieve
Application program level is not changed if a change is made at the physical level | If new fields are added or deleted from the database, changes need to be made in the application program
Deals with internal schema | Deals with conceptual schema
Example: Hashing algorithms, Storage | Example: Add/Delete/Modify a
67
Classification of DBMS
• Hierarchical databases
• Network databases
• Relational databases
• Object-oriented databases
• Graph databases
• ER model databases
• Document databases
• NoSQL databases

68
Hierarchical Database

69
Network Database

70
71
72
Object Oriented Database

73
Graph Database

74
ER Model Database

75
Document Database

76
77
