Table of Contents

• Preface
• Abbreviations
• Introduction
  o Overview of DBMS
  o Overview of MMDB
• Storage Engine
  o Database Management
  o Allocator
  o Transaction
  o Concurrency
  o Logging
  o Recovery
  o Table
  o Index
• SQL Engine
  o Query Representation
  o Query Optimization
  o Query Execution
• Interface Drivers
  o ODBC
  o JDBC
• Real World Implementation
  o Case Study: CSQL

Authors
1. Prabakaran Thirumalai

Related Books
• Structured Query Language
• Python Programming/Database Programming
• Operating System Design
• Embedded Control Systems Design/Operating systems -- other real-time operating systems

Part I: Introduction to Database and Database Management Systems

• Chapter 2: Introduction to DBMS
• Chapter 3: Introduction to MMDB


Chapter 1: Overview

Contents

• 1.1 Introduction
• 1.2 Database
• 1.3 Database Management Systems
• 1.3.1 Benefits of Database Approach
• 1.4 Database System Types
• 1.4.1 Hierarchical DBMS
• 1.4.2 Network DBMS
• 1.4.3 Relational DBMS
• 1.4.4 Main Memory DBMS
• 1.4.5 Column or Vertical DBMS
• 1.4.6 Stream Processing DBMS
• 1.4.7 Object Relational DBMS
• 1.4.8 Distributed DBMS
• 1.5 Database Related Roles and Responsibilities
• 1.6 Programming Interfaces

1.1 Introduction

Database systems have become an essential component of every software application. Database systems emerged in the 1960s and took
about ten years to gain widespread usage. More and more organizations began to adopt database technology to manage their corporate
data during the mid-1970s.

The Generalized Update Access Method (GUAM) was a hierarchical database system developed in the early 1960s by Rockwell, which
built the software to manage the data usually associated with manufacturing operations. IBM introduced its Information Management
System (IMS), a hierarchical database management system, soon after that. The 1970s were the era of relational database technology.
Dr. Codd's paper on the relational model revolutionized the thinking on data systems, and the industry quickly responded to the
superiority of the relational model by adapting their products to it. During the 1980s, database systems gained a lot of ground and a
large percentage of businesses made the transition from file-oriented data systems to database systems. Leading products such as
Oracle, DB2, SQL Server, Informix and Sybase started ruling the database world with their flagship relational database management
systems (RDBMS).

The relational model matured in the 1990s and became the leading data model. Towards the end of the 90s, object-oriented databases
gained popularity. Although that transition has started, applications already built on the relational model are reluctant to move to the
object-oriented model; it is mostly new applications that use the object model, as it allows the user to design the application more
naturally than the relational model does.

Most of the leading database management systems support the object-oriented model. Many of them provide object-to-relational
mapping to achieve object model support. For example, DB2 is a relational, hierarchical (XML) and object-oriented database
management system.

Commercial DBMS

• Oracle
• DB2
• Microsoft SQL Server
• Sybase
• Informix

Open Source DBMS

• MySQL
• Postgres
• Firebird
• CSQL

1.2 Database

The word database is in common use. We often use the terms database and database management system interchangeably, which is
wrong most of the time, so we must start by defining what 'database' means.

A database is a collection of related data, designed, built and populated with data for a specific purpose.

A database can be of any size and of varying complexity. For example, a list of employee names and addresses may consist of a few
hundred to a few thousand records, depending on the size of the organization. On the other hand, there are databases with a huge
number of records, such as the database maintained by an Income Tax Department to keep track of the returns filed by taxpayers. In
India, say there are around 1 billion taxpayers; if each taxpayer files approximately 500 characters of information per form, we get a
database of 10^9 * 500 bytes = 500 gigabytes (GB) of data. To keep at least the past three years of returns, we need 1.5 terabytes (TB)
of space. This huge amount of information must be organized and managed so that users can search and update the data as needed.
There are also databases that are highly complex because of the relationships that exist between the records; a railway reservation
database is a good example of a complex database.

The database approach includes the fundamental operations that can be applied to data. Every database management system provides
the following basic operations:

• READ data contained in the database
• INSERT data into the database
• UPDATE individual parts of the data in the database
• DELETE portions of the data in the database

Database practitioners refer to these operations by the acronym CRUD:

• C—Create or add data
• R—Read data
• U—Update data
• D—Delete data

1.3 Database Management Systems

A database management system (DBMS) is a general-purpose software system that enables users to define, construct, manipulate and
share information or data and allows it to persist over long periods of time. This information or data is termed a database. It is not
necessary to use general-purpose DBMS software to store and retrieve a database; we could write our own set of programs to create
and maintain the database, which is nothing but creating our own special-purpose DBMS software. In either case, we have to develop a
considerable amount of complex software. In fact, DBMSs are very complex software systems that implement query processing,
optimization, execution, allocation, transactions, concurrency control, recovery and security.

1.3.1 Benefits of Database Approach

The database approach is adopted because of the following advantages:

1. Sharing of data with access control for multiple users
2. Multi-user transaction processing
3. Redundancy control
4. Integrity constraint enforcement
5. Standards enforcement
6. Data tier abstraction for applications

1.4 Database System Types

Database software has evolved to support different types of data models. As we try to represent real-world data requirements in a data
model, we have come up with different data models over a period of time. It turns out that we can look at data requirements and create
data models in a few different ways. They are listed below.

1.4.1 Hierarchical DBMS

A hierarchical database is similar in nature to a file system, with a root node and one or more children referencing the parent. This
gives a very fast data-access path, but at the cost of high application maintenance.

Data is organized into a tree-like structure, which allows repeating information using parent/child, one-to-many relationships. A parent
can have many children, but a child has only one parent. This model was widely used in the first mainframe database management
systems. The most common form of the hierarchical model used currently is the LDAP model. This model has gained popularity again
with the recent XML databases.

An XML database is a data persistence software system that allows data to be imported, accessed and exported in the XML format.
Two major classes of XML database exist:

• XML-enabled: these map XML to a traditional database (such as a relational database), accepting XML as input and rendering
  XML as output.
• Native XML (NXD): the internal model of such databases depends on XML and uses XML documents as the fundamental unit
  of storage.
1.4.2 Network DBMS

This model is an extension of the hierarchical data model in which each record can have multiple parent and multiple child records; in
effect, it supports many-to-many relationships. It provides a flexible way to represent objects and their relationships. But before it could
gain popularity, a new model, the relational model, was proposed, and it replaced the network database model almost as soon as it
appeared.

1.4.3 Relational DBMS

The relational model is the basis for any relational database management system (RDBMS). It defines how to create, store and retrieve
the data and how to keep the data logically consistent. A relational model has three core components: a collection of objects or
relations, operators that act on the objects or relations, and data integrity methods.

Dr. E. F. Codd, the father of the relational model, stipulated the rules and proposed this model. Data is represented as mathematical
n-ary relations, an n-ary relation being a subset of the Cartesian product of n domains. "Relation" is a mathematical term for "table",
and thus "relational" roughly means "based on tables".

The basic principle of the relational model is the Information Principle: all information is represented by data values in relations. In
accordance with this Principle, a relational database is a set of relations and the result of every query is presented as a relation.

The basic relational building block is the domain or data type. A tuple is an unordered set of attribute values. An attribute is an ordered
pair of attribute name and type name, and an attribute value is a specific valid value for the type of the attribute; this can be either a
scalar value or a more complex type. A relation is defined as a set of n-tuples. A table in a relational database, alternatively known as a
relation, is a two-dimensional structure used to hold related information, and a database consists of one or more related tables. Do not
confuse a relation with a relationship: a relation is essentially a table, whereas a relationship is a way to correlate, join, or associate two
tables.

A row in a table is a collection or instance of one thing, such as one employee or one line item on an invoice. A column contains all
the information of a single type, and the piece of data at the intersection of a row and a column, a field, is the smallest piece of
information that can be retrieved with the database’s query language.

The consistency of a relational database is enforced, not by rules built into the applications that use it, but rather by constraints,
declared as part of the logical schema and enforced by the DBMS for all applications. The relational model establishes the connections
between related data occurrences by means of logical links implemented through foreign keys.

The relational model defines operations such as select, project and join. Although these operations may not be explicit in a particular
query language, they provide the foundation on which a query language is built.
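
The select and project operators can be illustrated with a minimal, self-contained C++ sketch (not taken from any real DBMS): a
relation is held as a vector of rows, select filters rows by a predicate, and project keeps only the named attributes. The employee data
and attribute names below are made up for illustration.

    // Relation as an in-memory set of rows; select and project as simple filters.
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    using Row      = std::map<std::string, std::string>;   // attribute name -> value
    using Relation = std::vector<Row>;

    // SELECT (restriction): keep the rows that satisfy the predicate.
    template <typename Pred>
    Relation select(const Relation& r, Pred p) {
        Relation out;
        for (const Row& row : r)
            if (p(row)) out.push_back(row);
        return out;
    }

    // PROJECT: keep only the named attributes of each row.
    Relation project(const Relation& r, const std::vector<std::string>& attrs) {
        Relation out;
        for (const Row& row : r) {
            Row projected;
            for (const std::string& a : attrs) projected[a] = row.at(a);
            out.push_back(projected);
        }
        return out;
    }

    int main() {
        Relation employee = {
            {{"id", "1"}, {"name", "Ravi"},  {"dept", "Sales"}},
            {{"id", "2"}, {"name", "Meena"}, {"dept", "HR"}},
        };
        // select dept = 'Sales', then project the name attribute
        for (const Row& r : project(select(employee,
                 [](const Row& row) { return row.at("dept") == "Sales"; }), {"name"}))
            std::cout << r.at("name") << "\n";
    }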

SQL, which stands for Structured Query Language, supports the database components in virtually every modern relational database
system. SQL has been refined and improved by the American National Standards Institute (ANSI) for more than 20 years.

ANSI introduced SQL (Structured Query Language) as the standard querying language for accessing relational databases. All database
vendors developed SQL engines on top of their relational engines to interpret and execute SQL statements. This led to the emergence of
standard interfaces in programming languages as well: ODBC for C and JDBC for Java became the de facto standards for accessing the
SQL engine.

This book is mainly focused on relational database management systems (RDBMS), as the other database system types are built on top
of the RDBMS and have nothing special to gain from the main memory nature of MMDBs.

1.4.4 Main Memory DBMS

An in-memory database system (IMDB) is a memory-resident relational database that eliminates disk access by storing and
manipulating data in main memory. It is also known as a main memory database (MMDB) or real-time database (RTDB). In the case of
real-time databases, predictability matters even more than raw performance.

Disk and memory capacities continue to grow much faster than latency and bandwidth improve. Today, multi-terabyte RAM scans take
minutes and terabyte disk scans take many hours. We can now keep the whole database in memory, design our data structures and
algorithms intelligently, use multi-processors sharing a massive main memory, and spend precious disk bandwidth wisely. Database
engines need to overhaul their algorithms to deal with the fact that main memories are huge (billions of pages, trillions of bytes). Main
memory database implementations have proved that they can execute queries ten to twenty times faster than the traditional approach.
The era of main memory databases has finally arrived, and this book discusses this type of database management system.

1.4.5 Column or Vertical DBMS

Storing data column-wise as ternary relations (key, attribute, value) allows extraordinary compression, often as a bitmap. Querying
such bitmaps can reduce query times by orders of magnitude and enable whole new optimization strategies. This is because the total
disk I/O performed to execute an OLAP (Online Analytical Processing) query comes down drastically thanks to the compaction of the
data. These types of databases have started gaining popularity in OLAP applications because of the multi-fold reduction in query
execution time.
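
As a rough, hypothetical sketch of the idea (the SaleRow/SaleColumns types below are invented for illustration), the same table can be
laid out row-wise or column-wise; an OLAP-style aggregate over one attribute then touches only that attribute's contiguous array.

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    // Row-oriented layout: each record is stored as one contiguous unit.
    struct SaleRow {
        std::uint32_t orderId;
        std::string   region;
        double        amount;
    };
    using RowStore = std::vector<SaleRow>;

    // Column-oriented layout: each attribute is stored as its own contiguous array.
    struct SaleColumns {
        std::vector<std::uint32_t> orderId;
        std::vector<std::string>   region;
        std::vector<double>        amount;
    };

    int main() {
        SaleColumns sales;
        sales.orderId = {1, 2, 3};
        sales.region  = {"north", "south", "north"};
        sales.amount  = {10.0, 20.0, 30.0};

        // The aggregate reads only the 'amount' column.
        double total = 0.0;
        for (double a : sales.amount) total += a;
        std::cout << "total = " << total << "\n";
    }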

1.4.6 Stream Processing DBMS


Data is increasingly generated by instruments that monitor the environment – telescopes looking at the heavens, patient monitors
watching the life-signs of a person in the emergency room, cell-phone and credit-card systems looking for fraud and RFID scanners
watching products flow through the supply chain. In each of these cases, one wants to compare the incoming data with the history of
an object. The data structures, query operators, and execution environments for such stream processing systems are qualitatively
different from classic DBMS architectures. In essence, the arriving data items each represent a fairly complex query against the
existing database. Researchers have been building stream-processing systems, and their stream-processing ideas have started
appearing in mainstream products.

1.4.7 Object Relational DBMS


An object-relational database management system (ORDBMS) provides a relational database management system that allows
developers to integrate a database with their own custom data-types and methods. Object Views allow the developer to define an
object-oriented structure over an existing relational database table. In this way, existing applications do not need to change
immediately, and any new development can use the object-oriented definitions of the table. This makes the transition from a relational
to an object-relational database relatively easy, because object definitions can reference existing relational components. The term
object-relational database sometimes also refers to external software products running over traditional DBMSs to provide similar
features.

1.4.8 Distributed DBMS

Distributed databases bring the advantages of distributed computing to the database management domain. A distributed database is a
collection of multiple logically interrelated databases distributed over a computer network, together with distributed manager software
that manages the distributed database. The field comprises data replication, data fragmentation, distributed query processing,
distributed transaction processing, distributed database recovery, and so on.

In this book, the focus is on relational, main memory database systems.

1.5 Database Related Roles and Responsibilities

For a small personal database, such as a list of contacts, one person typically defines, constructs and manipulates the database and there
is no sharing of data. However, a huge and complex database requires many people to define, construct, manipulate and maintain it.
There are many roles involved with a database, and they are listed below.

• Database Administrators take care of the database itself, the DBMS and related software. They are responsible for authorizing
  access to the database, monitoring database usage and acquiring software and hardware resources as needed.
• Database Designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures
  to represent and store this data. They interact with specialized users and develop 'views' of the database that meet their
  application requirements.
• System Analysts / Software Engineers thoroughly understand the functionality of the DBMS so as to implement applications
  that meet their complex application requirements.
• DBMS Kernel Developers design and implement the DBMS interfaces and modules as a software package.
• DBMS Tool Developers develop tools to access and use DBMS software. Typical packages include database design,
  performance monitoring, GUI tools, etc.

This book is mainly aimed at people who perform the "DBMS Kernel Developers" role.

1.6 Programming Interfaces

Most database systems provide an interactive interface where SQL commands can be typed and given as input to the database system,
which will retrieve and display the resultant records. For example, on a computer system where the MySQL RDBMS is installed, the
command mysql starts the interactive interface. This tool is convenient for schema creation and for occasional ad hoc queries. However,
the majority of database interactions in practice are executed through programs. These programs are generally called database
applications. As more than 90% of applications involve a database, we may as well say that all applications are database applications.

There are three ways to access a database programmatically:

• Embedded SQL – embedding SQL commands in a general-purpose programming language. Database statements are embedded
  in the programming language and are identified by a preprocessor using the prefix "EXEC SQL". The preprocessor converts
  these statements into DBMS-generated code. ESQL for C/C++ and SQLJ for Java are examples.
• Native language drivers – a standard interface on top of SQL commands. These provide functions to connect to the database,
  execute statements, retrieve resultant records, etc. ODBC and JDBC are examples (see the sketch after this list).
• Proprietary languages/drivers – PL/SQL, PHP drivers, etc. MySQL provides a PHP driver to access the database.
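
A hedged C/C++ sketch of the native driver path using the standard ODBC C API (the JDBC flow in Java is analogous): the data
source name "mydsn", the credentials and the employee table are hypothetical, and error handling is reduced to the SQL_SUCCEEDED
check for brevity.

    #include <iostream>
    #include <sql.h>
    #include <sqlext.h>

    int main() {
        SQLHENV env; SQLHDBC dbc; SQLHSTMT stmt;

        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);

        // Connect to a data source name configured in the ODBC driver manager.
        SQLRETURN rc = SQLConnect(dbc, (SQLCHAR*)"mydsn", SQL_NTS,
                                  (SQLCHAR*)"user", SQL_NTS,
                                  (SQLCHAR*)"password", SQL_NTS);
        if (SQL_SUCCEEDED(rc)) {
            SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
            SQLExecDirect(stmt, (SQLCHAR*)"SELECT name FROM employee", SQL_NTS);

            SQLCHAR name[64];
            SQLLEN  ind;
            SQLBindCol(stmt, 1, SQL_C_CHAR, name, sizeof(name), &ind);
            while (SQL_SUCCEEDED(SQLFetch(stmt)))      // fetch resultant records
                std::cout << (char*)name << "\n";

            SQLFreeHandle(SQL_HANDLE_STMT, stmt);
            SQLDisconnect(dbc);
        }
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
    }
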
Chapter 2: Introduction to Database Management Systems


Contents

• 2.1 Overview
• 2.2 Driver Interfaces
• 2.3 SQL Engine
• 2.4 Transaction Engine
• 2.5 Relational Engine
• 2.6 Storage Engine
• 2.7 SELECT Execution Sequence
• 2.8 INSERT Execution Sequence

2.1 Overview

The database management system (DBMS) is the software that handles the storage and retrieval of data. Most DBMSs present today
are relational, and this book concentrates only on relational database management systems. An RDBMS has five main components:

• Interface Drivers
• SQL Engine
• Transaction Engine
• Relational Engine
• Storage Engine

Figure 1 shows the DBMS components, the memory layout and the disk files associated with a relational database management system.
From the early days of database system evolution, disk has been considered the backing store for the data, to achieve durability. The
architecture above applies to disk resident database systems (DRDB). Nowadays there are two approaches other than DRDBs:

• Main memory databases (MMDB) – data is stored in main memory.
• Network databases – data is stored in another host over the network.

Most of the components in the DRDB system architecture above are present in main memory and network databases as well.
2.2 Driver Interfaces

A user or application program initiates either a schema modification or a content modification. These application requests are broadly
classified by SQL into Data Definition Language (DDL), Data Manipulation Language (DML) and Data Control Language (DCL).
DDL deals with schema modifications; DML deals with content modifications; DCL deals with user access and privilege modifications.
If the application program is written in C/C++, it uses ODBC drivers to connect to the DBMS; if it is written in Java, it uses JDBC
drivers. Some vendors provide language-specific proprietary interfaces; for example, MySQL provides drivers for PHP, Python, etc.

These drivers are built on top of SQL. They provide methods to prepare statements, execute statements, fetch results, etc.

2.3 SQL Engine

This component is responsible for interpreting and executing the SQL query. It comprises three major components.

Compiler – builds a data structure from the SQL statement and then does semantic checking on the query, such as whether the table
exists, the fields exist, etc.

Optimizer – transforms the initial query plan (the data structure created by the compiler) into a sequence of operations, usually
pipelined together, to achieve fast execution. It refers to the metadata (dictionary) and the statistical information stored about the data
to decide which sequence of operations is likely to be faster, and based on that it creates the optimal query plan. Both cost-based and
rule-based optimizers are used in DRDBs.

Execution Engine – executes each step in the query plan chosen by the optimizer. It interacts with the relational engine to retrieve and
store records.

2.4 Transaction Engine

Transactions are sequences of operations that read or write database elements and are grouped together. A transaction should have the
following ACID properties:

Atomicity: either all or none of the effects should appear in the database after the transaction completes.

Consistency: constraints should always keep the database in a consistent state.

Isolation: a transaction should run as though no other transaction is running.

Durability: once the transaction completes, the effect of the transaction on the database must never be lost.

All the above properties are explained in detail in the Transaction chapter.

The transaction engine comprises three major components.

Concurrency Manager – responsible for concurrent, synchronized access to data. This is usually implemented using latches and locks.
Latches or mutexes are acquired and released for short-duration synchronization, and locks are used for long-duration synchronization.

Log Manager – responsible for the atomicity and durability properties of transactions. Undo logs make sure that a transaction rollback
takes the database back to the consistent state that existed when the transaction started. Redo logs make sure that all committed
transactions can be recovered in case of a crash.

Recovery Manager – responsible for recovering the database from the disk image and the redo log files. Some databases use a
technique called 'shadow paging' to maintain a consistent image of memory on disk.

2.5 Relational Engine

Relational objects such as tables, indexes and referential integrity constraints are implemented in this component. Some of its main
components are listed below.

Field – abstracts column-level information, including type, length, etc.

Catalog – maintains metadata information about the relational database objects such as tables, indexes, triggers, fields, etc.

Table – responsible for insert, update, delete, fetch and execute operations. It interacts with the allocator subsystem of the storage
engine, which in turn talks to the buffer manager to get the job done.

Index – responsible for insert, update, delete, and scan of index nodes for all index types. Popular index types are hash and tree. A hash
index is used to speed up point lookups (a predicate with equality on the primary key) and a tree index is used to speed up range
queries (a predicate with a greater-than or less-than operator on the key).
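
The point-lookup versus range-query distinction can be sketched with standard-library containers standing in for the two index types
(this is only an analogy, not the engine's actual index code): a hash map plays the hash index and an ordered map plays the tree index.

    #include <iostream>
    #include <map>
    #include <unordered_map>

    int main() {
        // key -> record id (the record id would locate the tuple in the table)
        std::unordered_map<int, long> hashIndex;   // hash index: point lookups
        std::map<int, long>           treeIndex;   // tree index: ordered, range scans

        for (int key = 1; key <= 10; ++key) {
            hashIndex[key] = key * 100;
            treeIndex[key] = key * 100;
        }

        // Point lookup: WHERE key = 7
        std::cout << "key 7 -> record " << hashIndex.at(7) << "\n";

        // Range query: WHERE key >= 4 AND key < 8
        for (auto it = treeIndex.lower_bound(4);
             it != treeIndex.end() && it->first < 8; ++it)
            std::cout << "key " << it->first << " -> record " << it->second << "\n";
    }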

Expression Engine – represents the predicate (the WHERE clause of an SQL statement) of a data retrieval operation and is responsible
for evaluating expressions, which may include arithmetic, comparison and logical expressions.

2.6 Storage Engine

This component is responsible for storing and retrieving data records. It also provides the mechanism to store metadata and control
information such as undo logs, redo logs, lock tables, etc. The important storage engine components are listed below.

Buffer Manager – responsible for loading pages from disk to memory and for managing the buffer pool based on a Least Recently
Used (LRU) algorithm. It also has a special-purpose allocator for storing control information, which is transient. The buffer pool is the
memory space used by the buffer manager to cache the disk pages associated with records, index information and metadata. Some
database systems place a space limit on the buffer pool at the individual level and some at the global level.
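
A toy buffer-pool sketch, assuming a fixed page capacity and a loadPage() stub standing in for the file manager call; dirty-page
write-back and latching are omitted. Pages are kept in most-recently-used order and the least recently used page is evicted when the
pool is full.

    #include <cstddef>
    #include <list>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    using PageId = long;
    using Page   = std::vector<char>;            // page image, e.g. 8 KB

    class BufferPool {
    public:
        explicit BufferPool(std::size_t capacity) : capacity_(capacity) {}

        // Returns the cached page, loading from "disk" and evicting if needed.
        Page& fetchPage(PageId id) {
            auto hit = table_.find(id);
            if (hit != table_.end()) {                         // page already cached
                lru_.splice(lru_.begin(), lru_, hit->second);  // move to MRU position
                return hit->second->second;
            }
            if (lru_.size() == capacity_) {                    // evict the LRU victim
                table_.erase(lru_.back().first);
                lru_.pop_back();
            }
            lru_.emplace_front(id, loadPage(id));              // major fault: disk read
            table_[id] = lru_.begin();
            return lru_.front().second;
        }

    private:
        // Stub for the file manager call that would read the page from disk.
        static Page loadPage(PageId) { return Page(8 * 1024); }

        std::size_t capacity_;
        std::list<std::pair<PageId, Page>> lru_;               // MRU at the front
        std::unordered_map<PageId, std::list<std::pair<PageId, Page>>::iterator> table_;
    };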

File Manager – a database in a DRDB is nothing but a physical file on disk. The file manager maps disk pages of the file to memory
pages and performs the actual disk I/O operations in case of major faults generated by the buffer manager module.

Process Manager – responsible for registering and deregistering database application processes and threads, and for accounting for all
the resources (transactions, locks, latches) they acquire.

2.7 SELECT Execution Sequence

This is what happens conceptually when a user issues a SELECT SQL statement (Fig 2: SQL SELECT Execution Sequence); a code
sketch of the same flow follows the list.

• User issues a transaction start request (startTrans())
• DBMS reserves one free slot for the started transaction (allocSlot())
• DBMS returns to the user
• User issues a SELECT SQL request (stmtExecute())
• DBMS interprets the request and represents it in a data structure (parse())
• DBMS checks whether the table and field names exist in the database (check())
• DBMS identifies the optimal way to execute the statement (optimize())
• DBMS executes the statement by interacting with the relational engine (execute())
• DBMS checks with the buffer manager whether the disk page holding the data is already present in memory (isPageInMemory())
• DBMS interacts with the file manager to load the page into the memory buffer if not already loaded (loadPage())
• DBMS evaluates which records satisfy the predicate (evaluate())
• DBMS takes locks on the records based on the isolation level of the transaction (lockRecord())
• DBMS retrieves the records and returns them to the application (returnRecords())
• User issues a transaction commit (commit())
• DBMS releases all the locks acquired during the transaction (releaseLocks())
• DBMS releases the transaction slot allocated for this transaction (freeSlot())
• DBMS returns to the application
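
A compilable C++ skeleton of the control flow above, using the step names from the list as stub functions that merely trace the calls; a
real engine would perform the work described in each bullet.

    #include <iostream>
    #include <string>

    struct Plan        { std::string sql; };
    struct ResultSet   { int rows = 0; };
    struct Transaction { int slot = 0; };

    Transaction startTrans()           { std::cout << "startTrans/allocSlot\n"; return {}; }
    Plan parse(const std::string& sql) { std::cout << "parse\n"; return {sql}; }
    void check(const Plan&)            { std::cout << "check table/field names\n"; }
    Plan optimize(const Plan& p)       { std::cout << "optimize\n"; return p; }
    ResultSet execute(Transaction&, const Plan&) {
        std::cout << "execute: isPageInMemory/loadPage, evaluate, lockRecord\n";
        return {};
    }
    void commit(Transaction&)          { std::cout << "commit: releaseLocks, freeSlot\n"; }

    int main() {
        Transaction txn = startTrans();
        Plan parsed = parse("SELECT f1 FROM t1 WHERE f1 = 100");
        check(parsed);                     // semantic checks on the parsed statement
        Plan plan = optimize(parsed);
        ResultSet rs = execute(txn, plan); // returnRecords happens here conceptually
        commit(txn);
        (void)rs;
    }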

2.8 INSERT Execution Sequence

This is what happens conceptually when a user issues an INSERT SQL statement (Fig 3: SQL INSERT Execution Sequence).

• User issues a transaction start request (startTrans())
• DBMS reserves one free slot for the started transaction (allocSlot())
• DBMS returns to the user
• User issues an INSERT SQL request (stmtExecute())
• DBMS interprets the request and represents it in a data structure (parse())
• DBMS checks whether the table and field names exist in the database (check())
• DBMS identifies the optimal way to execute the statement (optimize())
• DBMS executes the statement by interacting with the relational engine (execute())
• DBMS checks with the buffer manager whether the disk page where the record needs to be allocated is already present in
  memory (isPageInMemory())
• DBMS interacts with the file manager to load the page into the memory buffer if not already loaded (loadPage()), not shown in
  the diagram above
• DBMS creates undo log records for the newly inserted record (createUndoLog())
• DBMS copies the values from the application buffer to the newly allocated record (copyValues())
• DBMS creates redo log records for the newly inserted record (createRedoLog())
• DBMS takes a lock on the allocated record based on the isolation level of the transaction (lockRecord())
• DBMS checks whether indexes are available; if yes, it does index node insertion for all indexes on this table (insertIndexNode())
• DBMS checks with the buffer manager whether the index disk page where the index node needs to be allocated is already
  present in memory (isPageInMemory())
• DBMS interacts with the file manager to load the page into the memory buffer if not already loaded (loadPage()), not shown in
  the diagram above
• DBMS takes a lock on the allocated index node (lockIndexNode())
• DBMS returns to the application with the number of rows affected (returnRowsAffected())
• User issues a transaction commit (commit())
• DBMS releases all the locks acquired during the transaction (releaseLocks())
• DBMS releases the transaction slot allocated for this transaction (freeSlot())
• DBMS returns to the application

Chapter 3: Introduction to Main Memory Database Systems


Contents

• 3.1 Overview
• 3.2 Memory Segment
• 3.3 SQL Engine
• 3.4 Relational Engine
• 3.5 Transaction Engine
• 3.6 Storage Engine

3.1 Overview

Many applications, such as telecom, process control, airline reservation and stock market applications, require real-time access to data.
In addition to maintaining data consistency, these applications require timely responses for the transactions accessing the database.
Main memory databases, which became feasible recently with the increasing availability of large, cheap memory, can provide better
performance and more consistent throughput than disk-based database systems. As the data resides permanently in main memory, no
disk I/O affects the throughput. In contrast, for disk-based systems the standard deviation of the throughput is huge and real-time
applications cannot depend on such unpredictable throughput. This led to the development of a new type of database management
system, the main memory database (MMDB) system. It is also called an "in-memory database (IMDB) management system" or a
"real-time database (RTDB) management system".

As the basic underlying assumption has changed in this type of database management system, every component of the storage,
relational and SQL engines of the traditional disk-based management system had to be researched and designed afresh.

One disk block transfer takes approximately 5 milliseconds whereas a main memory access takes about 100 nanoseconds. By keeping
the whole database in memory, these disk I/Os are converted to memory accesses, thereby improving throughput many-fold. This leads
to an interesting question: "In DRDBs, if all the data is cached in the buffer pool, will they perform as well as MMDBs?" Unfortunately
the answer is no. This is because the data structures and access algorithms are designed for a disk-based system and do not work well
when the whole database is completely in memory. There are some common misconceptions about main memory databases:

Do they support durable transactions?

Yes. Though the complete database is in main memory, a backup copy is kept on disk to enable recovery in the event of a crash.

Can multiple users concurrently access the database?

Yes. Multiple users and multiple threads can access the database and are synchronized through latches and locks.

The main memory residence of data has important implications for:

• Data representation
• Data access algorithms – query processing
• Recovery
• Concurrency control

This book discusses in more detail the differences between DRDB and MMDB implementations and how an MMDB is many times
faster than a DRDB.

Figure 4 depicts a main memory database management system. It has nearly all the components that are present in a disk resident
database management system, but the implementations of the components under the SQL Engine, Relational Engine and Storage
Engine differ heavily from the DRDB components.

Fig 4: MMDB System Architecture

In the case of an MMDB, the physical entity that corresponds to the database is either a shared memory segment or a memory mapped
file. These Inter Process Communication (IPC) mechanisms allow sharing of memory across processes. There are some Java embedded
main memory databases that require only multi-threaded access; these DBMSs use heap memory as the database. For multi-process
access, we should go with either the shared memory IPC or the memory mapped IPC mechanism. Usually the control information is
stored separately from the data records to avoid corruption of the control information.

3.2 Memory Segment

Memory is divided into three main segments.

Control Segment – contains the lock table, process table, transaction table, undo logs and redo logs. These structures are transient and
are required for the operation of the database management system.

Catalog Segment – contains metadata about tables, indexes, fields, etc.

User Segment – contains the records of all the tables and the index records of those tables, if any.
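
One possible layout sketch, assuming the whole database lives in a single shared memory segment (or memory mapped file) whose
base address may differ between processes; offsets rather than raw pointers keep the structures position independent. The field names
are illustrative, not CSQL's actual layout.

    #include <cstddef>
    #include <cstdint>

    struct ControlSegment {        // transient structures
        std::size_t lockTableOffset;
        std::size_t processTableOffset;
        std::size_t transTableOffset;
        std::size_t undoLogOffset;
        std::size_t redoLogOffset;
    };

    struct CatalogSegment {        // metadata about tables, indexes, fields
        std::size_t tableCatalogOffset;
        std::size_t indexCatalogOffset;
        std::size_t fieldCatalogOffset;
    };

    struct UserSegment {           // records and index nodes of the user tables
        std::size_t firstPageOffset;
        std::size_t freePageListOffset;
    };

    struct DatabaseHeader {        // placed at the start of the mapped region
        std::uint32_t  version;
        ControlSegment control;
        CatalogSegment catalog;
        UserSegment    user;
    };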

3.3 SQL Engine

Optimization and execution of SQL statements change drastically in the case of MMDBs. Cost-based optimization is based mainly on
the number of disk I/Os rather than on CPU cycles, and the cost-based optimizer is the most complex subsystem in the whole DBMS.
Both cost-based and rule-based optimizers are used in DRDBs, but in the case of an MMDB a rule-based optimizer would suffice.

3.4 Relational Engine

All the relational operation algorithms change, as we no longer need to use two-pass algorithms to implement the operations. Moreover,
in DRDBs the first pass copies the data from disk to memory for all the fields; in an MMDB we can use pointer indirection and avoid
both the data copy and the disk I/O.

T-Tree index structures are better than the B-Trees of DRDBs in terms of space and CPU cycles for an MMDB.

3.5 Transaction Engine

As MMDB transactions no longer wait for disk I/O, contention issues and deadlocks, which hamper throughput, become more likely.
These need to be avoided and are given more importance in an MMDB than in a DRDB.

3.6 Storage Engine

In a DRDB, allocation happens in terms of disk blocks, but in an MMDB it happens in memory. This leads to a lot of optimization in
the allocation algorithms, which saves space as well as CPU cycles. The database modules work with the File Manager module in the
case of a DRDB, whereas they work with an OS IPC abstraction layer in the case of an MMDB.

Chapter 4: Storage Engine Overview

The storage engine usually takes care of data organization, access methods and transactions in a Database Management System
(DBMS). It is the lowest subsystem in the DBMS that interacts with operating system services. Generally, products interact through an
OS layer subsystem (a wrapper around OS system calls) to ease the porting effort.

Data can be stored in one of the following components of the computer system:

• Memory
• Secondary storage – disk
• Tertiary storage – tape

These components have data capacities, cost per byte and speed ranging over many orders of magnitude. Devices with the smallest
capacity offer the fastest access speed and have the highest cost per byte.

Current trend: tertiary storage has now been replaced by disks because of their huge capacity and low cost per byte. Future trend: disks
are being replaced by main memory because of its increase in capacity and decrease in cost per byte.

One disk block transfer takes approximately 5 milliseconds whereas a main memory access takes about 100 nanoseconds. By keeping
the whole database in memory, these disk I/Os are converted to memory accesses, thereby improving throughput many-fold. Memory,
being a volatile device, loses the data stored in it when the power goes off, whereas disk, being a non-volatile device, keeps its content
intact even through long power failures. This makes a main memory storage manager difficult to implement: it uses logging and
checkpointing mechanisms to cope with power failure and make the data stored in main memory persistent.

Chapter 5: Database Management

A database is a collection of objects that hold and manipulate data. Every database management system has a limit on the total number
of databases it supports; most support up to 10K databases per instance. Because of the virtual address space limitation (4 GB on a
32-bit system), an MMDB may not be able to support even 100 databases per instance. If the database size is set to 100 MB, then an
instance can support at most 4 GB / 100 MB = about 40 databases.

A database is a collection of many relational objects such as tables, views and constraints. A user owns a database; the owner can give
special access to other users.

Contents

• 1 Functions of Database Management
• 2 Validation Techniques for Stray Pointer Writes
• 3 Database vs Schema
• 4 Schema vs User
• 5 Default Database, Schema and User
• 6 Database Security
• 7 Database Administrator
• 8 Access Protection, User Accounts and Databases
• 9 Database Audits
• 10 Role Based Access Control
• 11 Types of Databases
• 12 Memory Pages
Functions of Database Management

• Create and delete databases
• Grant/revoke privileges to users to access them
• Grow/shrink a database based on its size
• Persistence of data
• Recovery of the database in case of a system crash
• Archive/restore

Validation Techniques for Stray Pointer Writes

This is one of the major concerns with MMDBs during the application development cycle. An MMDB exposes actual pointers to data,
which may lead the application to perform a stray pointer write and corrupt the data. An MMDB should provide a mechanism to detect
this and recover the database in the event of corruption. TODO: get info from the Dali white papers.

Database vs Schema

In the ANSI SQL-92 standard, a schema is defined as a collection of database objects that are owned by a single user and form a single
namespace. A namespace is a set of objects that cannot have duplicate names; for example, two tables can have the same name only if
they belong to two different schemas.

Schema vs User

There is an implicit relationship between schemas and users. This relationship is so close that many database users are unaware of it. In
some DBMSs, when you create a user, the DBMS creates both the user and a schema and grants the user all privileges on the created
schema; other DBMSs expect you to do these steps separately.

Default Database, Schema and User

Every DBMS has at least one default database, schema and user with a preset password. This is done to ensure that the DBMS is ready
to use soon after installation.

Database Security

In a multi-user database management system, the DBMS should provide techniques to enable certain users to access selected portions
of a database without gaining access to the rest of the database. This is very important when a large integrated database is to be used by
many different users within the same organization; for example, sensitive information such as salaries should be kept confidential from
most other database system users. The DBMS accomplishes this by having an authorization subsystem that is responsible for ensuring
the security of portions of the database against unauthorized access.

Another security issue, for databases that work across a network, is handled through data encryption. Encryption can also be used to
provide additional protection for sensitive portions of a database. The data is encoded using some coding algorithm; an unauthorized
user who accesses the encoded data will have difficulty decoding it, while authorized users are given the decryption algorithm with
which they can access the data.

Database Administrator

The Database Administrator (DBA) is the central authority for managing a database system. The DBA's responsibilities include
creating and deleting users, changing user passwords, granting privileges to users on database objects and revoking those privileges in
accordance with the policy of the organization. The DBA is responsible for the overall security of the database system.

Access Protection, User Accounts and Databases

1. Whenever a person or group of persons needs to access a database system, the individual or group must first apply for a user
account. The DBA will then create a new account number and password for the user if there is a legitimate need to access the database.

2. The user must log in to the DBMS by entering the account number and password whenever database access is needed.

3. The DBMS checks that the account number and password are valid; if they are, the user is permitted to use the DBMS and to access
the database. Application programs can also be considered users and can be required to supply passwords.

4. It is straightforward to keep track of database users and their accounts and passwords by creating an encrypted table or file with the
two fields account number and password. This table can easily be maintained by the DBMS. Whenever a new account is created, a new
record is inserted into the table; when an account is canceled, the corresponding record is deleted from the table.

5. The database system must also keep track of all operations on the database that are applied by a certain user throughout each login
session, which consists of the sequence of database interactions that the user performs from the time of logging in to the time of
logging off. When a user logs in, the DBMS can record the user's account number and associate it with the terminal from which the
user logged in; all operations applied from that terminal are attributed to the user's account until the user logs off. It is particularly
important to keep track of update operations that are applied to the database, so that, if the database is tampered with, the DBA can
find out which user did the tampering.

6. To keep a record of all updates applied to the database and of the particular user who applied each update, we can modify the system
log, which includes an entry for each operation applied to the database that may be required for recovery from a transaction failure or
system crash. We can expand the log entries so that they also include the account number of the user and the online terminal ID that
applied each operation recorded in the log.

Database Audits


If any tampering with the database is suspected, a database audit is performed, which consists of reviewing the log to examine all
accesses and operations applied to the database during a certain time period. When an illegal or unauthorized operation is found, the
DBA can determine the account number used to perform this operation. Database audits are particularly important for sensitive
databases that are updated by many transactions and users, such as a banking database that is updated by many bank tellers. A
database log that is used mainly for security purposes is sometimes called an audit trail.

Role Based Access Control


Types of Databases

• Control database – one
• Catalog database – one
• User database – many

Memory Pages

Pages are usually 8 KB and should be the same as the OS page size.

Chapter 7: Transaction Management

A transaction is the basic unit of work and comprises many database operations. A transaction is either committed or rolled back: that
is, it decides whether to submit all of its changes to the DBMS or to discard all of its changes and take the database back to the state it
was in when the transaction started.

Contents

• 1 Transaction Stages
• 2 Transaction Pseudo Code for Money Transfer
• 3 Transaction Properties
• 4 Correctness Principle

Transaction Stages

For recovery purposes, the DBMS should maintain the state of each transaction. A transaction's state changes are given below:

• Transaction start
• Database operations (INSERT, UPDATE, DELETE, SELECT)
• Commit or abort

Database operations include reads and modifications of data records. The effect of running a SELECT statement is a read, and the
effect of running INSERT, UPDATE and DELETE statements is data record modification.

The commit operation informs the transaction manager of the successful end of the transaction. After the commit operation, the
database should be in a consistent state and all the updates made by that transaction should be made permanent.

The rollback or abort operation informs the transaction manager that there is an error in one of the operations involved in the
transaction and that the database is in an inconsistent state. All the updates made by the transaction must be undone on the database to
bring it back to the previous consistent state, that is, the state at which the transaction started.

Transaction Pseudo Code for Money Transfer

BEGIN TRANSACTION;
    Read account1 balance
    If read failed GOTO UNDO;
    Reduce Rs 100 from account1 balance
    If reduce failed GOTO UNDO;
    Read account2 balance
    If read failed GOTO UNDO;
    Add Rs 100 to account2 balance
    If add failed GOTO UNDO;
    COMMIT;
    GOTO FINISH;
UNDO:
    ROLLBACK;
FINISH:
    RETURN;

Transaction Properties

Transaction processing guarantees four important properties, referred to as the ACID properties.

Atomicity — Each transaction is treated as all or nothing: it either commits or aborts. If a transaction commits, all its effects remain; if
it aborts, all its effects are undone.

Consistency — Transactions preserve data consistency; a transaction always takes the database from one consistent state to another
consistent state. Intermediate states may be inconsistent, but the commit or rollback operation takes the database either to a new
consistent state or back to the old consistent state.

Isolation — Transactions should be isolated from one another. Even though transactions run in parallel, updates made by one
transaction should not be visible to another, and vice versa, until the transaction commits.

Durability — A transaction commit should ensure that its updates are present in the database even if there is a subsequent system
crash.

Correctness Principle

If a transaction executes in the absence of any other transaction or system error, and it starts with the database in a consistent state, then
the database is also in a consistent state when the transaction ends. This means that if we control simultaneous transactions and system
errors, transactions always ensure correctness. The former is controlled by the concurrency manager, and the latter is controlled by the
logging and recovery managers.

Chapter 8: Concurrency Management

Contents

• 8.1 Overview
• 8.1.1 Pessimistic Concurrency
• 8.1.2 Optimistic Concurrency
• 8.2 Concurrency Problems
• 8.2.1 Lost Update Problem
• 8.2.2 Uncommitted Dependency
• 8.2.3 Inconsistent Analysis
• 8.3 Locking
• 8.3.1 Serializable Transactions
• 8.3.2 Two Phase Locking Protocol
• 8.3.2.1 Conservative 2PL
• 8.3.2.2 Strict 2PL
• 8.3.2.3 Rigorous 2PL
• 8.3.3 Lock Starvation
• 8.3.4 Deadlock
• 8.3.4.1 Deadlock Prevention
• 8.3.4.2 Deadlock Detection
• 8.3.4.3 Timeouts
• 8.4 Isolation Levels
• 8.5 Lock Granularity
• 8.5.1 Granularity Levels in DBMS
• 8.5.2 Intent Locks
• 8.5.3 Lock Escalation
• 8.6 Index and Predicate Locking
• 8.7 Timestamp Based Concurrency Control (TODO: rephrase whole section)
• 8.7.1 Timestamps
• 8.7.2 Basic Timestamp Ordering
• 8.7.3 Strict Timestamp Ordering
• 8.8 Multi Version Concurrency Control
• 8.9 Optimistic Concurrency Control
• 8.10 Architecture for Lock Manager

8.1 Overview

Concurrency is defined as the ability of multiple processes and threads to access and change the data records at the same time. The
lower the contention to access and modify data as the number of users grows, the better the concurrency, and vice versa. A process that
accesses data prevents other processes from changing that data, which reduces concurrency; likewise, a process that modifies data
prevents other processes from accessing or changing it, which also reduces concurrency.

In general, database systems use two approaches to manage concurrent data access: pessimistic and optimistic. Conflicts cannot be
avoided in either model; they differ only in when the conflicts are dealt with.

8.1.1 Pessimistic Concurrency

Pessimistic concurrency systems assume that conflict will occur and avoid conflicts by acquiring locks on data that is being read or
modified, so that no other process can modify that data. In this model, readers block writers and writers block readers.

8.1.2 Optimistic Concurrency

Optimistic concurrency systems assume that a transaction is unlikely to modify data that another transaction is modifying. This is
implemented using a versioning technique: the system maintains the previous version of a data record before it actually attempts to
change it, which allows readers to see the state of the data before the modification occurs. In this model readers do not block writers
and writers do not block readers; however, writers still block writers, which leads to conflicts.
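
A minimal sketch of the versioning idea, with hypothetical names: the writer remembers the version it read and installs its change only
if that version is still current; otherwise the transaction aborts or retries. A real engine would also keep the old version around for
readers and latch the validate-and-install step.

    #include <atomic>
    #include <string>

    struct VersionedRecord {
        std::atomic<unsigned long> version{0};
        std::string value;                       // old versions kept elsewhere for readers
    };

    // Returns false on a write-write conflict; the caller then aborts or retries.
    bool optimisticUpdate(VersionedRecord& rec, unsigned long readVersion,
                          const std::string& newValue) {
        unsigned long expected = readVersion;
        // Validate: succeed only if no other writer bumped the version meanwhile.
        if (!rec.version.compare_exchange_strong(expected, readVersion + 1))
            return false;                        // conflict detected at commit time
        rec.value = newValue;                    // install the new value
        return true;
    }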

8.2 Concurrency Problems

There are some problems that any concurrency control mechanism must address. There are three ways in which things can go wrong:
the lost update problem, uncommitted dependency and inconsistent analysis. In all three, the individual transactions are correct, but
interleaving them may produce wrong results.

8.2.1 Lost Update Problem

TODO: define this. TODO: draw diagram.

• Transaction-1 reads tuple1 at time t1
• Transaction-2 reads tuple1 at time t2
• Transaction-1 updates tuple1 at time t3
• Transaction-2 updates tuple1 at time t4

Transaction-1's update of tuple1 at time t3 is lost, as Transaction-2 overwrites the update made by Transaction-1 on tuple1 without
checking whether it has changed.

8.2.2 Uncommitted Dependency

This occurs when one transaction is allowed to read or update a tuple that has been updated by another transaction but not yet
committed by that transaction.

TODO: draw diagram.

• Transaction-1 updates tuple1 at time t1
• Transaction-2 reads tuple1 at time t2
• Transaction-1 rolls back the transaction at time t3

In the above sequence, Transaction-2 sees an uncommitted change at time t2, which is undone at time t3. Transaction-2 is operating
with the wrong value seen at time t2 and might therefore produce an incorrect result.

If Transaction-2 updates tuple1 at time t2 instead of reading it, the situation is even worse: it will lose its update on tuple1 once
Transaction-1 rolls back.

8.2.3 Inconsistent Analysis

If one transaction is calculating an aggregate summary function over a set of records while other transactions are updating some of
those records, the aggregate function may see some values before they are updated and some values after they are updated.

TODO: draw diagram.

For example, suppose Transaction-1 is calculating the total number of reservations across all theatres for a particular day while
Transaction-2 is reserving 5 seats on that day; the result of Transaction-1 will then be off by 5, because Transaction-1 reads the value
after the 5 seats have been subtracted from it.

8.3 Locking

All three of the above problems can be solved by a concurrency control technique called locking. The basic idea is to deny access to a
data record for all other transactions while a transaction is working on it.

A lock is a variable associated with a data item that describes the status of that item. Generally there is one lock for each data item
(record) in the DBMS. Locks are used to provide synchronized access to data items by concurrent transactions. There are two kinds of
locking primitives supported by Unix-like operating systems:

• pthread mutexes
• semaphores

Pthread mutexes work well with multiple threads, and semaphores work well with multiple threads as well as multiple processes.
Pthread mutexes are called binary locks as they have two states (lockState): locked and unlocked. Semaphores can be used as binary
locks as well as counting locks. In our case, binary locks will be used to provide synchronized concurrent access to the data items. Two
operations, lockItem() and unlockItem(), are used with binary locking. A transaction requests access to data item X by first issuing a
lockItem(X) operation. If lockState(X) is 1, the transaction is forced to wait until lockState(X) becomes 0. If it is zero, then lockState(X)
is set to 1, and the transaction is allowed to access data item X. When the transaction is through with the data item, it issues an
unlockItem(X) operation, which sets lockState(X) to zero so that other transactions can access X.
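
A sketch of the lockItem()/unlockItem() protocol described above, with one binary lock per data item; a condition variable replaces the
periodic retry, so a waiting transaction simply sleeps until lockState becomes 0.

    #include <condition_variable>
    #include <mutex>

    class ItemLock {
    public:
        void lockItem() {
            std::unique_lock<std::mutex> latch(m_);
            cv_.wait(latch, [this] { return lockState_ == 0; });  // wait till unlocked
            lockState_ = 1;                                       // acquire
        }
        void unlockItem() {
            { std::lock_guard<std::mutex> latch(m_); lockState_ = 0; }
            cv_.notify_one();                                     // wake one waiter
        }
    private:
        std::mutex m_;                 // short-duration latch protecting lockState_
        std::condition_variable cv_;
        int lockState_ = 0;            // 0 = unlocked, 1 = locked
    };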

The data access protocol for this locking method is:

• A read transaction will acquire a lock on the data before it reads the data.
• A write transaction will acquire a lock on the data before it writes the data.
• If a lock request is denied, the transaction goes into a wait state and retries the lock periodically until it is released by the
  transaction that acquired it.

In the above locking model,

• Readers block readers and writers
• Writers block readers and writers

Concurrency can be slightly improved on the above model by making readers not block other readers, as that will not lead to any
inconsistencies. This is achieved by introducing another type of lock, which can be shared by all readers; these locks are called shared
locks. Another type of lock, the exclusive lock, is obtained by writers to block all readers and writers from accessing the data. The data
access protocol for this locking method is:

• A read transaction will acquire a shared lock on the data before it reads the data.
• A write transaction will acquire an exclusive lock on the data before it writes the data.
• A lock request is denied for a read operation if another transaction has an exclusive lock on the data item.
• A lock request is denied for a write operation if another transaction has a shared or exclusive lock on the data item.
• If a lock request is denied, the transaction goes into a wait state and retries the lock periodically until it is released by the
  transaction that acquired it.

In the above locking model,

• Readers block writers and allow other readers
• Writers block readers and writers

The above rules shall be summarized as lock compatibility matrix

Shared Exclusive No Lock Shared Yes No Yes Exclusive No No Yes No Lock Yes Yes Yes TODO::Above lock compatibility matrix
in image

Yes -> compatible, no conflict, lock request will be granted.

No-> not compatible, there is a conflict, so lock request should be denied.
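
The shared/exclusive behaviour above maps directly onto a readers-writer lock; a minimal sketch using std::shared_mutex (C++17),
with one lock per data item.

    #include <shared_mutex>
    #include <string>

    struct DataItem {
        std::shared_mutex lock;    // one lock per data item
        std::string value;
    };

    std::string readItem(DataItem& item) {
        std::shared_lock<std::shared_mutex> s(item.lock);  // shared lock: readers coexist
        return item.value;
    }

    void writeItem(DataItem& item, const std::string& v) {
        std::unique_lock<std::shared_mutex> x(item.lock);  // exclusive lock: blocks everyone
        item.value = v;
    }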

Concurrency problems with the locking protocol:

• Lost update problem – Transaction-1 waits forever for an exclusive lock from time t3, as shared locks are held by Transaction-1
  and Transaction-2; Transaction-2 likewise waits forever for an exclusive lock from time t4. The lost update problem is solved,
  but a new problem has appeared, called "deadlock". We will look into its details later.
• Uncommitted dependency – Transaction-2 waits for a lock when it tries to read tuple1 at time t2, as it is exclusively locked by
  Transaction-1, and it waits until Transaction-1 either commits or rolls back. Locking therefore avoids the uncommitted
  dependency issue.
• Inconsistent analysis – Transaction-1 waits until Transaction-2 releases the exclusive lock on data record X and then computes
  the aggregation, giving the correct result.
[edit] 8.3.1 Serializable transactions
A given set of transactions is considered serializable if it produces the same result as executing those transactions serially, one
after the other.

 Individual transactions are correct, as each transforms a correct state of the database into another correct state.
 Executing transactions one at a time, in any serial order, is also correct, as individual transactions are independent of each other.
 An interleaved execution is correct if it is equivalent to some serial execution, that is, if it is serializable.

The concept of serializability was introduced by Eswaran and Gray, who also proved the two-phase locking theorem, briefly stated
as:

If all transactions obey the “two-phase locking protocol”, then all possible interleaved schedules are
serializable.

[edit] 8.3.2 Two Phase Locking Protocol


After releasing a lock, a transaction must never go on to acquire any more locks. A transaction that obeys this protocol thus has two
phases: a lock acquisition or “growing” phase and a lock releasing or “shrinking” phase.

Two-phase locking may limit the amount of concurrency in a schedule, because a transaction may not be able to release a lock on a
data item as soon as it is through with it. This is the price of guaranteeing serializability of all schedules without having to
check the schedules themselves.

There are a number of variations of two-phase locking (2PL). The technique described above is known as basic 2PL.
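Before looking at the variations, here is a minimal sketch of the growing/shrinking discipline; the TwoPhaseTransaction class and its members are hypothetical, not CSQL's API:

    #include <set>
    #include <stdexcept>

    // Hypothetical sketch: enforce "no lock acquired after the first release".
    class TwoPhaseTransaction {
        std::set<int> held;        // ids of data items currently locked
        bool shrinking = false;    // set once the first lock is released
    public:
        void acquire(int item) {
            if (shrinking)
                throw std::logic_error("2PL violation: lock requested in shrinking phase");
            held.insert(item);     // growing phase: locks may still be acquired
        }
        void release(int item) {
            shrinking = true;      // shrinking phase begins; no more acquisitions allowed
            held.erase(item);
        }
    };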

[edit] 8.3.2.1 Conservative 2 PL


This variation requires a transaction to lock all the items it accesses before the transaction starts, by declaring its read-set and
write-set. The read-set of a transaction is the set of all items that the transaction reads, and the write-set is the set of all items
that it writes.

If any item in the read-set or write-set cannot be locked, the transaction waits for it to be released by the other transaction; only
after acquiring all locks at once does it start. Conservative 2-phase locking is a deadlock-free protocol. However, it is difficult to
use in practice because the read-set and write-set must be declared up front, which is rarely possible.

In conservative 2-phase locking, the growing phase completes before the transaction starts and the shrinking phase starts as soon as
the transaction ends.

[edit] 8.3.2.2 Strict 2 PL


The most popular variation of 2-phase locking is strict 2-phase locking. A transaction does not release any of its exclusive locks until
it either commits or rolls back. This guarantees strict schedules, as no other transaction can read or write an item written by this
transaction until it has committed. The growing phase starts as soon as the transaction starts, and the shrinking of exclusive locks
happens when the transaction commits or rolls back.

[edit] 8.3.2.3 Rigorous 2 PL


This is a more restrictive version of the strict 2-phase locking protocol. A transaction does not release any of its shared or exclusive
locks until it either commits or rolls back. This is the easiest 2-phase locking variant to implement, but it gives less concurrency for
reads.

The growing phase starts as soon as the transaction starts, and the shrinking happens when the transaction commits or rolls back.

[edit] 8.3.3 Lock Starvation


Lock starvation occurs when a transaction cannot proceed for an indefinite period of time while other transactions in the system
continue to run normally. This can happen with unfair lock scheduling algorithms that implement priority-based locking. One
solution for starvation is a fair lock waiting scheme, such as a first-in first-out (FIFO) queue.

Starvation can also occur when the deadlock algorithm repeatedly selects the same transaction for abort, never allowing it to
finish. To avoid this, the algorithm can give higher priority to transactions that have already been aborted multiple times.
[edit] 8.3.4 Dead Lock
Deadlock occurs when each transaction in a set of two or more transactions waits for some resource that is locked by another
transaction in the same set.

For example, transaction T1 acquires resource R1 and transaction T2 acquires resource R2. If T1 then waits for R2 while T2 waits
for R1, neither will ever get its lock; this situation is termed a deadlock.

TODO Diagram with respect to time

[edit] 8.3.4.1 Dead Lock Prevention


There are many prevention protocols, but most of them are not practical in a DBMS. They are conservative 2-phase
locking, ordered data record locking, no waiting, cautious waiting, wait-die and wound-wait.

The conservative two-phase locking protocol is a deadlock prevention protocol in which all locks are acquired before the transaction
works on any data record.

Ordered data record locking also prevents deadlocks. A transaction that works on several data records must always obtain its locks
in a pre-determined order. This requires the programmer or the DBMS to be aware of the chosen ordering of data record locks, and it is
also impractical to implement in general-purpose database systems.

No Waiting Algorithm – If a transaction is unable to obtain a lock, it is immediately aborted and then restarted after a certain time
delay, without checking whether a deadlock would actually occur. This can cause transactions to abort and restart needlessly.

Cautious Waiting Algorithm – This is proposed to avoid the needless restarts of the no waiting algorithm. Suppose transaction T1 tries
to lock data record R1 but cannot, because R1 is locked by some other transaction T2 with a conflicting lock. If T2 is not itself
blocked on some other locked data record, then T1 is blocked and allowed to wait; otherwise T1 is aborted.

Wait-Die and Wound-Wait Algorithm – The other two techniques, wait-die and wound-wait, use transaction timestamps as the basis for
deciding what to do in case of a potential deadlock. A transaction timestamp is a unique identifier assigned to each transaction. These
timestamps are generally taken from a running counter that is incremented for every transaction started, so if transaction T1 starts
before transaction T2, then TS(T1) < TS(T2).

Suppose that transaction T1 tries to lock data record R1 but cannot, because R1 is locked by some other transaction T2 with
a conflicting lock. The rules followed by these schemes are as follows:

Wait-Die – If TS(T1) < TS(T2), then T1 is allowed to wait; otherwise abort T1 and restart it later with the same timestamp.

Wound-Wait – If TS(T1) < TS(T2), then abort T2 (T1 wounds T2) and restart it later with the same timestamp; otherwise T1 is allowed to
wait.

In wait-die, an older transaction is allowed to wait on a younger transaction, whereas a younger transaction requesting a lock on record
R1 held by an older transaction is aborted and restarted. The wound-wait approach does the opposite: a younger transaction is allowed to
wait on an older one, whereas an older transaction requesting a lock on record R1 held by a younger transaction preempts the younger
transaction by aborting it. Both schemes end up aborting the younger of the two transactions that may be involved in a deadlock. In
wait-die, transactions wait only on younger transactions; in wound-wait, transactions wait only on older transactions. Hence no cycle of
waiting transactions can form in either scheme, which prevents deadlocks.
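A small sketch of the two timestamp rules, assuming a requester transaction asks for a lock held by a holder transaction (the Action enum and function names are illustrative only):

    enum class Action { Wait, AbortRequester, AbortHolder };

    // Wait-Die: an older requester waits; a younger requester dies (is aborted).
    Action waitDie(unsigned tsReq, unsigned tsHolder) {
        return (tsReq < tsHolder) ? Action::Wait : Action::AbortRequester;
    }

    // Wound-Wait: an older requester wounds (aborts) the holder;
    // a younger requester waits for the older holder.
    Action woundWait(unsigned tsReq, unsigned tsHolder) {
        return (tsReq < tsHolder) ? Action::AbortHolder : Action::Wait;
    }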

[edit] 8.3.4.2 Dead Lock Detection


Deadlock detection is a more practical approach than the deadlock prevention techniques. The system first checks whether a deadlock
state actually exists before taking any action.

A simple way to detect a state of deadlock is for the system to construct and maintain a “wait-for” graph. TODO-Diagram and
explanation

If the system is in a state of deadlock, some of the transactions causing the deadlock must be aborted. Either the application or the
DBMS selects one of the transactions involved in the deadlock for rollback to get the system out of the deadlock. The selection
algorithm should avoid aborting transactions that have been running for a long time or that have performed many updates; the best
transactions to abort are SELECT or read-only transactions.
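A minimal sketch of deadlock detection over a wait-for graph using depth-first search; the adjacency-list representation here is an assumption for illustration:

    #include <map>
    #include <set>
    #include <vector>

    // Edge T1 -> T2 means "transaction T1 waits for a lock held by T2".
    using WaitForGraph = std::map<int, std::vector<int>>;

    static bool dfs(const WaitForGraph &g, int t, std::set<int> &onPath, std::set<int> &done) {
        if (onPath.count(t)) return true;        // back edge: a cycle exists => deadlock
        if (done.count(t))   return false;       // already fully explored
        onPath.insert(t);
        auto it = g.find(t);
        if (it != g.end())
            for (int next : it->second)
                if (dfs(g, next, onPath, done)) return true;
        onPath.erase(t);
        done.insert(t);
        return false;
    }

    // Returns true if the wait-for graph contains a cycle, i.e. a deadlock exists.
    bool hasDeadlock(const WaitForGraph &g) {
        std::set<int> onPath, done;
        for (const auto &entry : g)
            if (dfs(g, entry.first, onPath, done)) return true;
        return false;
    }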

[edit] 8.3.4.3 Timeouts


The simplest solution for handling deadlocks is timeouts. In this method, transactions that wait for longer than a system-defined
timeout period are assumed to be deadlocked and are aborted.

[edit] 8.4. Isolation Levels


TODO

[edit] 8.5 Lock Granularity


The size of the data item is often called the data item granularity. Fine granularity refers to small data item sizes, whereas coarse
granularity refers to large item sizes.

The larger the data item size, the lower the degree of concurrency. For example, if the data item size is a table, say Table1, a
transaction T1 that needs to lock a record X must lock the whole of Table1 containing X, because the lock is associated
with the whole data item. If another transaction T2 wants to lock a different record Y of Table1, it is forced to wait till T1
releases the lock on Table1. If the data item size is a single record, transaction T2 would be able to proceed, because it would lock
a different data item.

The smaller the data item size, the larger the number of items in the database. Because every item is associated with a lock, the system
will have a larger number of active locks, and more lock and unlock operations will be performed, causing higher overhead. In addition,
more storage space is required for the locks themselves.

For large transactions, which access many records, coarse granularity should be used; for small transactions, which access a small
number of records, fine granularity should be used.

[edit] 8.5.1 Granularity Levels in DBMS


Granularity levels are listed below, ordered from coarse to fine:

 Database
 Table
 Disk Block or Memory Page
 Record
 Record Field

Since the best granularity size depends on the given transaction, the DBMS should support multiple levels of granularity and allow the
transaction to pick the level it wants.

[edit] 8.5.2 Intent Locks


Let us take an example: database DB1 has one table, Table1, which has two pages P1 and P2. P1 holds records R1 to R10, and P2 holds
records R11 to R20.

TODO::Draw tree structure denoting above.

Scenario 1:

Transaction T1 wants to update all records in Table1, so it requests an exclusive lock on Table1. This is cheaper for T1 than
acquiring 20 separate record locks. Now suppose another transaction T2 wants to read record R5 from page P1; T2 would
request a shared record-level lock on R5. The DBMS must now check the compatibility of the requested lock with the locks already held.
One way to verify this is to traverse the tree from leaf R5 to root DB1 and check for conflicting locks.

Scenario 2:

Transaction T1 wants to read record R5 from page P1, so T1 requests a shared record-level lock on R5. Now suppose
another transaction T2 wants to update all records in Table1, so it requests an exclusive lock on Table1. The DBMS must again check
the compatibility of the requested lock with the locks already held. For this it needs to check all locks at the page level and record
level to ensure that there are no conflicting locks.

For both of the above scenarios, traversal-based lock conflict detection is very inefficient and would defeat the purpose of
multiple granularity locking.
New types of locks are introduced to make multiple granularity locking efficient. The idea behind intention locks is for a
transaction to indicate, along the path from the root to the desired node, what type of lock it will require on one of the node’s
descendants.

There are three types of intention locks.

 Intention Shared (IS)
 Intention Exclusive (IX)
 Shared Intention Exclusive (SIX)

Intention Shared (IS) Locks

Indicate that a shared lock will be requested on some descendant node.

Intention Exclusive (IX) Locks

Indicate that an exclusive lock will be requested on some descendant node.

Shared Intention Exclusive (SIX) Locks

Indicate that this node is locked in shared mode and an exclusive lock will be requested on some descendant node.

Compatibility Table

  Mode   IS    IX    S     SIX   X
  IS     Yes   Yes   Yes   Yes   No
  IX     Yes   Yes   No    No    No
  S      Yes   No    Yes   No    No
  SIX    Yes   No    No    No    No
  X      No    No    No    No    No
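The compatibility check itself is usually a constant-time table lookup. A sketch of the table above in code, with an illustrative mode ordering and naming:

    enum LockMode { IS = 0, IX, S, SIX, X };

    // Rows: mode already held on the node; columns: mode being requested.
    // true = compatible (grant), false = conflict (deny or wait).
    static const bool compatible[5][5] = {
        /* held\req   IS     IX     S      SIX    X   */
        /* IS  */  { true,  true,  true,  true,  false },
        /* IX  */  { true,  true,  false, false, false },
        /* S   */  { true,  false, true,  false, false },
        /* SIX */  { true,  false, false, false, false },
        /* X   */  { false, false, false, false, false },
    };

    bool isCompatible(LockMode held, LockMode requested) {
        return compatible[held][requested];
    }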

Locking protocol

1. The root of the tree must be locked first.
2. A node can be locked by a transaction in S or IS mode only if the parent node is already locked by that transaction in IS or IX
mode.
3. A node can be locked by a transaction in X, IX or SIX mode only if the parent of the node is already locked by that transaction in
IX or SIX mode.
4. A transaction can unlock a node only if none of the node's children are currently locked by that transaction.
5. Lock compatibility and 2-phase locking must be adhered to.

TODO::Example illustrating above protocol and compatibility table: Refer 445 of Elmasri

[edit] 8.5.3 Lock Escalation


[edit] 8.6 Index and Predicate Locking
One solution to the phantom record problem is to use index locking.

A more general technique, called predicate locking, locks access to all records that satisfy a predicate or WHERE condition.
Predicate locks have proved to be difficult to implement efficiently.

[edit] 8.7 Timestamp based concurrency control


Another concurrency control technique is based on timestamp ordering and avoids the use of locks altogether. Because no locks are used,
deadlocks cannot occur in this mechanism.

[edit] 8.7.1 Timestamps


A timestamp is a unique identifier created by the DBMS to identify a transaction. Typically, timestamp values are assigned in the order
in which the transactions are submitted to the system; generally it is the transaction start time, referred to as TS(T). These unique
identifiers can be implemented using a simple counter that is incremented each time a transaction starts. As the counter has a finite
maximum value, the algorithm should take care of resetting it when it reaches that maximum.
[edit] 8.7.2 Basic Timestamp ordering
Each data item X has two timestamps:

1. ReadTS(X) – The read timestamp of data item X; this is the largest timestamp among all the timestamps of the transactions
that have successfully read the item X.
2. WriteTS(X) – The write timestamp of data item X; this is the largest timestamp among all the timestamps of the transactions
that have successfully modified the item X.

Whenever some transaction T tries to issue readItem(X) or writeItem(X), the algorithm should compare the timestamp of T with
ReadTS(X) and WriteTS(X) to ensure that the timestamp order of the transaction execution is not violated. If this order is violated,
then transaction T is aborted and resubmitted to the system as a new transaction with a new timestamp. If T is aborted, then any
transaction T1 that may have used a value written by T must also be aborted. Similarly any transaction T2 that may have used a value
written by T1 must also be aborted and so on. This effect is known as cascading rollback and is one of the biggest problems associated
with this scheme.

The basic timestamp-ordering algorithm is summarized below.

Transaction T issues a writeItem(X) operation: if readTS(X) > TS(T) or writeTS(X) > TS(T), then abort T; else execute
writeItem(X) of T and set writeTS(X) to TS(T).

Transaction T issues a readItem(X) operation: if writeTS(X) > TS(T), then abort T; else execute readItem(X) of T and set
readTS(X) to the larger of TS(T) and the current readTS(X).
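A compact sketch of the two rules above; the Item struct and function names are assumptions for illustration, and the actual read/write of the item is elided:

    #include <algorithm>

    struct Item {
        unsigned readTS  = 0;   // largest timestamp of a transaction that read the item
        unsigned writeTS = 0;   // largest timestamp of a transaction that wrote the item
    };

    // Returns false if the transaction with timestamp ts must be aborted.
    bool tryWrite(Item &x, unsigned ts) {
        if (x.readTS > ts || x.writeTS > ts) return false;  // timestamp order violated: abort
        // ... perform writeItem(X) here ...
        x.writeTS = ts;
        return true;
    }

    bool tryRead(Item &x, unsigned ts) {
        if (x.writeTS > ts) return false;                   // item written by a younger txn: abort
        // ... perform readItem(X) here ...
        x.readTS = std::max(x.readTS, ts);
        return true;
    }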

Whenever the basic timestamp ordering algorithm detects two conflicting operations that occur in the incorrect order, it rejects the
later of the two by aborting the transaction that issued it. Like the 2-phase locking protocol, the schedules produced by this algorithm
are guaranteed to be conflict serializable.

[edit] 8.7.3 Strict Timestamp ordering


Strict timestamp ordering is a variation of basic timestamp ordering that ensures the schedules are both recoverable and conflict
serializable. In this variation, a transaction T that issues a readItem(X) or writeItem(X) such that TS(T) > writeTS(X) has its read or
write operation delayed until the transaction T1 that wrote the value of X has committed or aborted. To implement this algorithm,
locking is required. The algorithm does not cause deadlock, since T waits for T1 only if TS(T) > TS(T1).

TODO::refer white paper and add more content

[edit] 8.8 Multi Version concurrency control


TODO

[edit] 8.9 Optimistic concurrency control


In both concurrency control techniques discussed so far, locking and timestamp ordering, certain checks are made before a transaction
operates on a data item. In locking, a check is made to determine whether the item being accessed is locked. In timestamp ordering, the
transaction timestamp is checked against the read and write timestamps of the data item. Either way, this imposes overhead on
transaction execution. In optimistic concurrency control techniques, no checking is done while the transaction is executing. Updates in
the transaction are not applied directly to the data items until the transaction reaches its end; during execution, all updates are
applied to local copies of the data items, kept on a per-transaction basis. At the end of the transaction, a validation phase checks
whether any of the transaction's updates violate serializability. If serializability is not violated, the transaction is committed and
the database is updated from the local copies; otherwise the transaction is aborted and restarted later. There are three phases in
this protocol:

 Read Phase – A transaction can read values of committed data items from the database. However, updates are applied only to
local copies of the data items kept in the transaction's workspace.
 Validation Phase – Checking is performed to ensure that serializability will not be violated if the transaction's updates are
applied to the database.
 Write Phase – The transaction's updates are applied to the database if the validation phase succeeds; otherwise the
updates are discarded and the transaction is restarted.
This protocol works well when there is minimal interference between transactions on data items. If the interference is high,
transactions will be restarted often. The technique is called ‘optimistic’ because it assumes that little interference will occur and
hence that there is no need to check during transaction execution.
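A much-simplified sketch of the validation step, assuming each transaction records a read-set and a write-set and is validated against transactions that committed while it was running; all names below are illustrative:

    #include <set>
    #include <vector>

    struct TxnSets {
        std::set<int> readSet;   // items read by the transaction
        std::set<int> writeSet;  // items it intends to write at commit
    };

    // Validation phase: under this simplified rule the transaction passes only if
    // nothing it read was written by a transaction that committed in the meantime.
    bool validate(const TxnSets &txn, const std::vector<TxnSets> &committedDuringTxn) {
        for (const auto &other : committedDuringTxn)
            for (int item : other.writeSet)
                if (txn.readSet.count(item))
                    return false;   // conflict: abort and restart the transaction
        return true;                // write phase may apply txn.writeSet to the database
    }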

TODO::refer white paper and add more content

[edit] 8.10 Architecture for lock manager


Concurrency control in traditional database systems aims to maintain database consistency. Concurrency control in an MMDB is harder
because of the conflicting requirements of satisfying timing constraints and maintaining data consistency.
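Lock managers are commonly organized as a hash table keyed by data item, with each entry holding the granted requests and a queue of waiters. A structural sketch under that common assumption (not CSQL's actual layout):

    #include <deque>
    #include <unordered_map>

    enum class Mode { Shared, Exclusive };

    struct LockRequest {
        int  txnId;
        Mode mode;
        bool granted;          // false while the request sits in the wait queue
    };

    struct LockHeader {
        std::deque<LockRequest> queue;   // granted requests first, then waiters in FIFO order
    };

    // The lock table: one header per locked data item, keyed by item id.
    // A real lock manager adds a latch per bucket, lock escalation and deadlock handling.
    using LockTable = std::unordered_map<long, LockHeader>;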
