
Unit-5

Transaction Processing
Concurrency Control in DBMS
Concurrency control is a core DBMS mechanism that allows several processes or users to execute or manipulate data simultaneously without causing data inconsistency. It deals with the interleaved execution of more than one transaction.
What is Transaction?
A transaction is a collection of operations that performs a single
logical function in a database application. Each transaction is a unit
of both atomicity and consistency. Thus, we require that
transactions do not violate any database consistency constraints.
That is, if the database was consistent when a transaction started,
the database must be consistent when the transaction successfully
terminates. However, during the execution of a transaction, it may be necessary to temporarily allow inconsistency, since in a funds transfer either the debit of one account or the credit of the other must be done before the other. This temporary inconsistency, although necessary, may lead to difficulty if a failure occurs.
It is the programmer’s responsibility to define properly the various
transactions, so that each preserves the consistency of the
database. For example, the transaction to transfer funds from the
account of department A to the account of department B could be
defined to be composed of two separate programs: one that debits
account A, and another that credits account B. The execution of
these two programs one after the other will indeed preserve
consistency. However, each program by itself does not transform
the database from a consistent state to a new consistent state.
Thus, those programs are not transactions.
The concept of a transaction has been applied broadly in database
systems and applications. While the initial use of transactions was in
financial applications, the concept is now used in real-time
applications in telecommunication, as well as in the management of
long-duration activities such as product design or administrative
workflows.
A set of logically related operations is known as a transaction. The
main operations of a transaction are:
 Read(A): The read operation Read(A), or R(A), reads the value of A from the database and stores it in a buffer in main memory.
 Write(A): The write operation Write(A), or W(A), writes the value back to the database from the buffer.
(Note: a write does not always reach the database immediately; it may only update the buffer, which is why the dirty read problem can arise.)
Let us take a debit transaction from an account that consists of the
following operations:
1. R(A);
2. A=A-1000;
3. W(A);
Assume A’s value before starting the transaction is 5000.
 The first operation reads the value of A from the database and stores it in a buffer.
 The second operation decreases its value by 1000, so the buffer holds 4000.
 The third operation writes the value from the buffer to the database, so A's final value is 4000.
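The three steps above can be sketched in a few lines of Python, using an in-memory dict as a stand-in for the database and a local variable as the main-memory buffer (both are illustrative stand-ins, not a real DBMS API):

```python
# Replay of the debit transaction R(A); A = A - 1000; W(A)
database = {"A": 5000}

# 1. R(A): read A from the "database" into a main-memory buffer
buffer_a = database["A"]

# 2. A = A - 1000: modify only the buffered copy
buffer_a = buffer_a - 1000

# 3. W(A): write the buffered value back to the "database"
database["A"] = buffer_a

print(database["A"])  # 4000
```

Note that between steps 2 and 3 the database still holds 5000 while the buffer holds 4000, which is exactly where a failure becomes dangerous.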
But the transaction may also fail after executing some of its operations. The failure can be due to hardware, software, or power problems, among others. For example, if the debit transaction above fails after executing operation 2, the value of A will remain 5000 in the database, which is not acceptable to the bank. To avoid this, databases provide two important operations:
 Commit: After all instructions of a transaction are successfully
executed, the changes made by a transaction are made
permanent in the database.
 Rollback: If a transaction is not able to execute all operations
successfully, all the changes made by a transaction are undone.
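Commit and rollback can be demonstrated with Python's built-in sqlite3 module; the table and column names below are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 5000)")
conn.commit()

try:
    conn.execute("UPDATE account SET balance = balance - 1000 WHERE name = 'A'")
    # Simulate a failure after the debit but before the commit:
    raise RuntimeError("power failure")
except RuntimeError:
    conn.rollback()  # undo the uncommitted debit

balance = conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
print(balance)  # 5000: the failed transaction left no trace
```

Had the transaction reached `conn.commit()` instead, the new balance of 4000 would have become permanent.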
Properties of a Transaction
Atomicity: As a transaction is a set of logically related operations, either all of them are executed or none. The debit transaction above must execute all three operations or none: if it fails after operations 1 and 2, the new value 4000 exists only in the buffer and never reaches the database, so leaving such a partial execution in place would make the database inconsistent.
Consistency: If operations of debit and credit transactions on the
same account are executed concurrently, it may leave the database
in an inconsistent state.
 For example, with T1 (debit of Rs. 1000 from A) and T2 (credit of Rs. 500 to A) executing concurrently, the database can reach an inconsistent state.
 Let us assume the account balance of A is Rs. 5000. T1 reads A (5000) and stores the value in its local buffer space. Then T2 reads A (5000) and also stores the value in its local buffer space.
 T1 performs A = A - 1000 (5000 - 1000 = 4000) and 4000 is stored in T1's buffer space. Then T2 performs A = A + 500 (5000 + 500 = 5500) and 5500 is stored in T2's buffer space. T1 writes the value from its buffer back to the database.
 A's value is updated to 4000 in the database, and then T2 writes the value from its buffer back to the database. A's value is updated to 5500, which shows that the effect of the debit transaction is lost and the database has become inconsistent.
 To maintain consistency of the database, we need concurrency control protocols, which are discussed later in this unit.
The operations of T1 and T2 with their buffers and database have
been shown in Table 1.
T1            T1's buffer   T2           T2's buffer   Database
----------------------------------------------------------------
                                                       A=5000
R(A);         A=5000                                   A=5000
              A=5000        R(A);        A=5000        A=5000
A=A-1000;     A=4000                     A=5000        A=5000
              A=4000        A=A+500;     A=5500        A=5000
W(A);         A=4000                     A=5500        A=4000
              A=4000        W(A);        A=5500        A=5500
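The interleaving in Table 1 can be replayed as a short sketch, with a plain dict standing in for the database and one local variable per transaction's buffer (all names are illustrative, not a real DBMS API):

```python
# Step-by-step replay of the interleaving in Table 1
database = {"A": 5000}

t1_buf = database["A"]   # T1: R(A)
t2_buf = database["A"]   # T2: R(A)  -- reads the same stale value
t1_buf -= 1000           # T1: A = A - 1000  -> 4000 in T1's buffer
t2_buf += 500            # T2: A = A + 500   -> 5500 in T2's buffer
database["A"] = t1_buf   # T1: W(A)  -> database holds 4000
database["A"] = t2_buf   # T2: W(A)  -> database holds 5500, debit lost

print(database["A"])  # 5500 instead of the correct 4500
```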
Isolation: The results of a transaction should not be visible to others before the transaction commits. For example, let us assume that A's balance is Rs. 5000 and T1 debits Rs. 1000 from A; A's new balance is 4000. If T2 credits Rs. 500 to this new balance, A becomes 4500, and if T1 then fails, we must roll back T2 as well, because it used a value produced by T1. So a transaction's results are not made visible to other transactions before it commits.
Durability: Once the database has committed a transaction, the changes made by the transaction must be permanent. For example, if a person has credited $500,000 to his account, the bank cannot say that the update has been lost. To avoid this problem, multiple copies of the database are stored at different locations.
What is a Schedule?
A schedule is a series of operations from one or more transactions. A
schedule can be of two types:
Serial Schedule: When one transaction completely executes before another transaction starts, the schedule is called a serial schedule. A serial schedule is always consistent. For example, if a schedule S has debit transaction T1 and credit transaction T2, the possible serial schedules are T1 followed by T2 (T1 -> T2) or T2 followed by T1 (T2 -> T1). A serial schedule has low throughput and low resource utilization.
Concurrent Schedule: When operations of a transaction are interleaved with operations of other transactions, the schedule is called a concurrent schedule. For example, the schedule of debit and credit transactions shown in Table 1 is concurrent. But concurrency can lead to inconsistency in the database, and that concurrent schedule is in fact inconsistent.
Difference between Serial Schedule and Serializable Schedule

Serial Schedule                               Serializable Schedule
---------------------------------------------------------------------------
Transactions execute one after the other.     Transactions execute concurrently.
Serial schedules are less efficient.          Serializable schedules are more efficient.
Only one transaction executes at a time.      Multiple transactions can execute at a time.
Takes more time for execution.                Execution is faster.

Concurrency Control in DBMS


 Executing only a single transaction at a time increases the waiting time of the other transactions, which may delay the overall execution. Hence, to increase the overall throughput and efficiency of the system, several transactions are executed concurrently.
 Concurrency control is a core DBMS mechanism that allows several processes or users to execute or manipulate data simultaneously without causing data inconsistency.
 Concurrency control provides a procedure that is able to control
concurrent execution of the operations in the database.
 The fundamental goal of database concurrency control is to
ensure that concurrent execution of transactions does not result
in a loss of database consistency. The concept of serializability
can be used to achieve this goal, since all serializable schedules
preserve consistency of the database. However, not all schedules
that preserve consistency of the database are serializable.
 In general it is not possible to perform an automatic analysis of
low-level operations by transactions and check their effect on
database consistency constraints. However, there are simpler
techniques. One is to use the database consistency constraints as
the basis for a split of the database into subdatabases on which
concurrency can be managed separately.
 Another is to treat some operations besides read and write as
fundamental low-level operations and to extend concurrency
control to deal with them.
Concurrency Control Problems
There are several problems that arise when numerous transactions are executed simultaneously in a random manner. A database transaction consists of two major operations, "Read" and "Write". It is very important to manage these operations during the concurrent execution of transactions in order to maintain the consistency of the data.
Dirty Read Problem (Write-Read Conflict)
The dirty read problem occurs when one transaction updates an item and then fails due to some unexpected event, but before it can roll back, some other transaction reads the updated value. This creates an inconsistency in the database. The dirty read problem arises from a Write-Read conflict between transactions. It can be illustrated with the following scenario between two transactions T1 and T2:
1. Transaction T1 modifies a database record without committing the changes.
2. T2 reads the uncommitted data changed by T1.
3. T1 performs a rollback.
4. T2 has already read the uncommitted data of T1, which is no longer valid, thus creating inconsistency in the database.
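The four steps can be traced with the same in-memory dict used earlier (again an illustrative stand-in, not a real DBMS API):

```python
# Step trace of a dirty read (Write-Read conflict)
database = {"A": 5000}
saved = database["A"]       # value to restore if T1 rolls back

database["A"] = 4000        # 1. T1 writes an uncommitted update
t2_read = database["A"]     # 2. T2 reads the dirty (uncommitted) value
database["A"] = saved       # 3.-4. T1 fails and rolls back

# T2 now holds 4000, a value that officially never existed.
print(t2_read, database["A"])  # 4000 5000
```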
Lost Update Problem
The lost update problem occurs when two or more transactions modify the same data, and one update is overwritten or lost by another transaction. It can be illustrated with the following scenario between two transactions T1 and T2:
1. T1 reads the value of an item from the database.
2. T2 starts and reads the same database item.
3. T1 updates the value of that data and performs a commit.
4. T2 updates the same data item based on its initial read and performs a commit.
5. T1's modification is thus lost, overwritten by T2's write; this is the lost update problem.
Concurrency Control Protocols
Concurrency control protocols are the sets of rules maintained in order to solve the concurrency control problems in the database. They ensure that concurrent transactions can execute properly while maintaining database consistency. Concurrency control protocols provide concurrent transaction execution with atomicity, consistency, isolation, durability, and serializability.
 Lock-based concurrency control protocol
 Timestamp-based concurrency control protocol
Lock-based Protocol
In lock-based protocols, each transaction must acquire locks before it starts accessing or modifying data items. There are two types of locks used in databases:
 Shared Lock: A shared lock, also known as a read lock, allows multiple transactions to read a data item simultaneously. A transaction holding a shared lock can only read the data item; it cannot modify it.
 Exclusive Lock: An exclusive lock, also known as a write lock, allows a transaction to update a data item. Only one transaction can hold an exclusive lock on a data item at a time. While a transaction holds an exclusive lock on a data item, no other transaction may acquire a shared or exclusive lock on the same item.
There are two kinds of lock-based protocol commonly used in databases:
 Two-Phase Locking Protocol: Two-phase locking is a widely used technique that enforces a strict ordering of lock acquisition and release. It works in two phases.
o Growing Phase: In this phase, the transaction acquires the locks it needs before performing any modification on the data items. No lock is released during this phase; the moment the transaction releases any lock, the growing phase ends.
o Shrinking Phase: In this phase, the transaction releases its acquired locks, one by one, once it has performed its modifications. Once the transaction starts releasing locks, it cannot acquire any further locks.
 Strict Two-Phase Locking Protocol: This is almost identical to two-phase locking; the only difference is that under plain two-phase locking a transaction may release its locks before it commits, whereas under strict two-phase locking a transaction may release its locks only when it commits.
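The two-phase rule for a single transaction can be sketched as a tiny class; the class and method names are invented for illustration, and real lock managers also handle lock conflicts between transactions, which is omitted here:

```python
# A toy two-phase locking discipline: a transaction may acquire locks
# only while it has released none (growing phase); the first release
# starts the shrinking phase and forbids further acquisitions.
class TwoPhaseTransaction:
    def __init__(self):
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True    # growing phase is over
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("A")
t.lock("B")       # still growing: allowed
t.unlock("A")     # shrinking phase begins
try:
    t.lock("C")   # violates the two-phase rule
except RuntimeError as e:
    print(e)
```

Strict 2PL would additionally defer every `unlock` until commit time.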
Timestamp-based Protocol
 In this protocol each transaction has a timestamp attached to it. The timestamp is simply the time at which the transaction entered the system.
 Conflicting pairs of operations are resolved by the timestamp ordering protocol using the timestamp values of the transactions, thereby guaranteeing that the transactions take effect in timestamp order.
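As a sketch of the core idea (not the full timestamp-ordering rule set), a single data item can track the largest timestamps of the transactions that read and wrote it, and reject an operation whose transaction timestamp arrives "too late"; all names below are invented for illustration:

```python
# Minimal timestamp-ordering checks for one data item X
read_ts, write_ts = 0, 0   # largest timestamps that read / wrote X so far

def can_read(ts):
    # Reject a read if a younger (later-timestamped) transaction
    # already wrote X.
    return ts >= write_ts

def can_write(ts):
    # Reject a write if a younger transaction already read or wrote X.
    return ts >= read_ts and ts >= write_ts

read_ts = 3                 # X was last read by a transaction with timestamp 3
print(can_write(2))         # False: an older writer arrives too late
print(can_write(5))         # True: timestamp order is preserved
```

A rejected transaction is typically rolled back and restarted with a fresh timestamp.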
Advantages of Concurrency
In general, concurrency means, that more than one transaction can
work on a system. The advantages of a concurrent system are:
 Waiting Time: The time a process spends in the ready state without getting the system to execute on is its waiting time. Concurrency leads to less waiting time.
 Response Time: The time taken to get the first response from the CPU is the response time. Concurrency leads to a shorter response time.
 Resource Utilization: The amount of resources used by a system is its resource utilization. Since multiple transactions can run in parallel, concurrency leads to higher resource utilization.
 Efficiency: The amount of output produced for a given input is efficiency. Concurrency leads to higher efficiency.
Disadvantages of Concurrency
 Overhead: Implementing concurrency control requires additional
overhead, such as acquiring and releasing locks on database
objects. This overhead can lead to slower performance and
increased resource consumption, particularly in systems with high
levels of concurrency.
 Deadlocks: Deadlocks can occur when two or more transactions
are waiting for each other to release resources, causing a circular
dependency that can prevent any of the transactions from
completing. Deadlocks can be difficult to detect and resolve, and
can result in reduced throughput and increased latency.
 Reduced concurrency: Concurrency control can limit the
number of users or applications that can access the database
simultaneously. This can lead to reduced concurrency and slower
performance in systems with high levels of concurrency.
 Complexity: Implementing concurrency control can be complex,
particularly in distributed systems or in systems with complex
transactional logic. This complexity can lead to increased
development and maintenance costs.
 Inconsistency: In some cases, concurrency control can lead to
inconsistencies in the database. For example, a transaction that is
rolled back may leave the database in an inconsistent state, or a
long-running transaction may cause other transactions to wait for
extended periods, leading to data staleness and reduced
accuracy.
Concurrency Control in Distributed
Transactions
Concurrency control mechanisms provide various concepts and implementations to ensure that the execution of a transaction on any node does not violate ACID or BASE properties (depending on the database), which would cause inconsistency and mix-ups of data in the distributed system. Transactions in a distributed system are executed in "sets", where every set consists of various sub-transactions. These sub-transactions on each node must be executed serially to maintain data integrity, and the concurrency control mechanisms enforce this serial execution.
Types of Concurrency Control Mechanisms
There are two types of concurrency control mechanisms, as shown in the diagram below:

[Figure: Types of Concurrency Control Mechanisms]

Pessimistic Concurrency Control (PCC)

Pessimistic concurrency control mechanisms proceed on the assumption that most transactions will try to access the same resource simultaneously. They prevent concurrent access to a shared resource by requiring a transaction to acquire a lock on a data item before performing any operation on it.
Optimistic Concurrency Control (OCC)
The problem with pessimistic concurrency control is that while a transaction holds a lock on a resource, no other transaction can access it, which reduces the overall concurrency of the system.
Optimistic concurrency control techniques proceed on the assumption that zero or very few transactions will try to access a given resource simultaneously. A system is described as fully optimistic if it uses no locks at all and checks for conflicts at commit time. OCC has the following four phases of operation:
 Read Phase: When a transaction begins, it reads the data and logs the timestamp at which the data was read, to check for conflicts during the validation phase.
 Execution Phase: In this phase, the transaction executes all its operations, such as create, read, update, or delete.
 Validation Phase: Before committing, a validation check is performed to ensure consistency, comparing the item's last-updated timestamp with the one recorded in the read phase. If the timestamps match, the transaction is allowed to proceed to the commit phase.
 Commit Phase: During this phase, the transaction is either committed or aborted, depending on the validation check performed in the previous phase: if the timestamps match, the transaction is committed; otherwise it is aborted.
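The four phases can be sketched with a version counter in place of timestamps (a common variant of optimistic validation; all names here are invented for illustration):

```python
# Optimistic concurrency control with a per-item version counter
item = {"value": 5000, "version": 7}

# Read phase: remember the value and the version it was read at
seen_value, seen_version = item["value"], item["version"]

# Execution phase: compute privately, without touching the item
new_value = seen_value - 1000

# Validation + commit phase: commit only if no one changed the item
if item["version"] == seen_version:
    item["value"] = new_value
    item["version"] += 1
    outcome = "committed"
else:
    outcome = "aborted"   # conflict detected: restart the transaction

print(outcome, item["value"])  # committed 4000
```

Had another transaction bumped `version` in the meantime, validation would fail and the transaction would abort instead.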
Pessimistic Concurrency Control Methods
Following are the four Pessimistic Concurrency Control Methods:
Isolation Levels
Isolation levels define the degree to which data residing in the database must be isolated from other transactions while it is being modified. If a transaction T1 is operating on some data and another transaction T2 modifies that data while T1's operation is still in progress, unwanted inconsistency problems arise. The standard levels, from weakest to strongest, are: Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
Two-Phase Locking Protocol
The two-phase locking protocol is a concurrency technique used to
manage locks on data items in database. This technique consists of
2 phases:
Growing Phase: The transaction acquires all the locks on the data items it needs to execute successfully. No locks are released in this phase.
Shrinking Phase: All the locks acquired in the previous phase are released one by one, and no new locks are acquired in this phase.
Distributed Lock Manager
A distributed lock manager is a critical component of a distributed transaction system; it coordinates lock acquisition and release across transactions. It helps synchronize transactions and their operations so that data integrity is maintained.

[Figure: Distributed Lock Manager (DLM)]

Multiple Granularity Lock


A lock can be acquired at various levels of granularity: table level, row/record level, page level, or the level of any other resource. In a transaction system, a transaction can lock a whole table, or only a specific row while performing changes on it. When locks at different granularities are acquired by various transactions simultaneously, this is called multiple granularity locking.
Optimistic Concurrency Control Methods
Below are four Optimistic Concurrency Control Methods:
Timestamp Based (OCC)
In a timestamp-based concurrency technique, each transaction is assigned a unique timestamp when it begins, and the items it accessed are verified again during the commit phase. If a different transaction has produced a newer timestamp on an item, then, based on a policy defined by the system administrator, the transaction is either restarted or aborted. If the timestamp is unchanged, i.e. no other transaction has modified the item, the transaction is committed.
Example: Let’s say we have two transaction T1 and T2, they
operate on data item – A. The Timestamp concurrency technique
will keep track of the timestamp when the data was accessed by
transaction T1 first time.
Transaction   Operation   Most_recent_timestamp   Initial_timestamp of data item (A)
T1            Read(A)     12:00PM                 12:00PM
T2            Write(A)    12:15PM                 12:00PM
T1            Write(A)    12:30PM                 12:00PM

Now, let's say transaction T1 is about to commit. Before committing, it compares the initial timestamp with the most recent timestamp. In our case, T1 will not be committed, because a write operation by transaction T2 was performed:
if (Initial_timestamp == Most_recent_timestamp)
    then Commit
else
    Abort
The transaction is aborted because T2 modified the same data item at 12:15PM.
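The commit check above, made runnable; the timestamps are represented as plain strings purely for illustration:

```python
# T1's commit check from the table above
initial_timestamp = "12:00PM"       # recorded when T1 first read A
most_recent_timestamp = "12:15PM"   # updated by T2's Write(A)

if initial_timestamp == most_recent_timestamp:
    decision = "Commit"
else:
    decision = "Abort"

print(decision)  # Abort: T2 modified A after T1 read it
```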
Multi-Version Concurrency Control (MVCC)
In MVCC, every data item has multiple versions of itself. When a
transaction starts, it reads the version that is valid at the start of the
transaction. And when the transaction writes, it creates a new
version of that specific data item. That way, every transaction can
concurrently perform their operations.
Example: In a banking system, two or more users can transfer money simultaneously without blocking each other.
A similar technique is immutable data structures: every time a transaction performs a new operation, a new data item is created, so transactions do not have to worry about consistency issues.
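The versioning idea can be sketched as a toy multi-version store, where each write appends a new version and a reader sees only versions that existed when it started (the function and variable names are invented for illustration):

```python
# A toy multi-version store for one data item A
versions = [(1, 5000)]            # list of (version, value) pairs

def read_at(start_version):
    # Newest version visible to a transaction that started at start_version
    return max((v, val) for v, val in versions if v <= start_version)[1]

# T1 starts at version 1; then T2's write creates version 2
t1_start = 1
versions.append((2, 5500))

print(read_at(t1_start))   # 5000: T1 still sees its original snapshot
print(read_at(2))          # 5500: later transactions see T2's write
```

Writers never overwrite the version a reader is using, which is why readers and writers do not block each other.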
Snapshot Isolation
Under snapshot isolation, each transaction reads from a snapshot of the database, captured at the start of the transaction, that reflects a consistent state. The transaction then ensures that the data items it operated on were not changed by others while it was executing. Snapshot isolation is achieved through OCC and MVCC techniques.
Conflict-Free Replicated Data Types (CRDTs)
CRDTs are data structures that allow a transaction to perform all its operations locally and replicate the data to other nodes. After all the operations are performed, the technique provides merge methods that combine the data across distributed nodes without conflicts, eventually reaching a consistent state (the eventual consistency property).
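One of the simplest CRDTs is a grow-only counter (G-Counter): each node increments only its own slot, and merging takes the element-wise maximum, so replicas converge regardless of merge order. The node names and helper functions below are illustrative:

```python
# G-Counter CRDT: per-node counts, merged by element-wise max
def merge(a, b):
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def value(c):
    return sum(c.values())

node1 = {"n1": 3}            # node n1 incremented 3 times
node2 = {"n2": 2}            # node n2 incremented 2 times, concurrently

merged = merge(node1, node2)
print(value(merged))                    # 5
print(merge(node2, node1) == merged)    # True: merge order does not matter
```

Because merge is commutative, associative, and idempotent, replicas can exchange states in any order and still agree.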

Transaction in DBMS
In Database Management Systems (DBMS), a transaction is a
fundamental concept representing a set of logically related
operations executed as a single unit. Transactions are essential
for handling user requests to access and modify database contents,
ensuring the database remains consistent and reliable despite
various operations and potential interruptions.
In this article, we will discuss what a transaction means, various
operations of transactions, transaction states, and properties of
transactions in DBMS.
What does a Transaction mean in DBMS?
 Transaction in Database Management Systems (DBMS) can
be defined as a set of logically related operations.
 It is the result of a request made by the user to access the
contents of the database and perform operations on it.
 It consists of various operations and has various states in its
completion journey.
 It also has some specific properties that must be followed to keep
the database consistent.
Operations of Transaction
A user can make different types of requests to access and modify
the contents of a database. So, we have different types of
operations relating to a transaction. They are discussed as follows:
i) Read(X)
 A read operation is used to read the value of X from the
database and store it in a buffer in the main memory for further
actions such as displaying that value.
 Such an operation is performed when a user wishes just to see
any content of the database and not make any changes to it. For
example, when a user wants to check his/her account’s balance,
a read operation would be performed on user’s account balance
from the database.
ii) Write(X)
 A write operation is used to write a value to the database from the buffer in main memory. For a write operation, first a read operation brings the value into the buffer; then some changes are made to it, e.g. a set of arithmetic operations according to the user's request; finally, a write operation stores the modified value back in the database.
 For example, when a user requests to withdraw some money
from his account, his account balance is fetched from the
database using a read operation, then the amount to be deducted
from the account is subtracted from this value, and then the
obtained value is stored back in the database using a write
operation.
iii) Commit
 This operation is used to maintain integrity in the database. Due to a failure of power, hardware, or software, a transaction might get interrupted before all its operations are completed, which can leave the database in an inconsistent state.
 To ensure that further operations of any other transaction are performed only after the work of the current transaction is done, a commit operation is performed to make the changes made by a transaction permanent in the database.
iv) Rollback
 This operation is performed to bring the database to the last
saved state when any transaction is interrupted in between due
to any power, hardware, or software failure.
 In simple words, it can be said that a rollback operation does
undo the operations of transactions that were performed before
its interruption to achieve a safe state of the database and avoid
any kind of ambiguity or inconsistency.
Transaction Schedules
When multiple transaction requests are made at the same time, we
need to decide their order of execution. Thus, a transaction
schedule can be defined as a chronological order of execution of
multiple transactions.
There are broadly two types of transaction schedules discussed as
follows:
i) Serial Schedule
 In this kind of schedule, when multiple transactions are to be
executed, they are executed serially, i.e. at one time only one
transaction is executed while others wait for the execution of the
current transaction to be completed. This ensures consistency in
the database as transactions do not execute simultaneously.
 But, it increases the waiting time of the transactions in the queue,
which in turn lowers the throughput of the system, i.e. number of
transactions executed per time.
 To improve the throughput of the system, another kind of schedule is used, with stricter rules that help the database remain consistent even when transactions execute simultaneously.
ii) Non-Serial Schedule
 To reduce the waiting time of transactions in the waiting queue
and improve the system efficiency, we use nonserial schedules
which allow multiple transactions to start before a transaction is
completely executed. This may sometimes result in inconsistency
and errors in database operation.
 So, these errors are handled with specific algorithms to maintain
the consistency of the database and improve CPU throughput as
well.
 Non-serial schedules are also sometimes referred to as parallel
schedules, as transactions execute in parallel in these kinds of
schedules.
Serializable
 Serializability in DBMS is the property of a nonserial schedule that
determines whether it would maintain the database consistency
or not.
 A nonserial schedule that ensures the database remains consistent after the transactions are executed in the order it determines is said to be a serializable schedule.
 Serial schedules always maintain database consistency, as a transaction starts only when the execution of the previous transaction has completed.
 Thus, serial schedules are always serializable.
Transaction States
A transaction is a series of operations, so various states occur in its completion journey. They are discussed as follows:
i) Active
 It is the first stage of any transaction when it has begun to
execute. The execution of the transaction takes place in this
state.
 Operations such as insertion, deletion, or updation are performed
during this state.
 During this state, the data records are under manipulation and
they are not saved to the database, rather they remain
somewhere in a buffer in the main memory.
ii) Partially Committed
 This state of transaction is achieved when it has completed most
of the operations and is executing its final operation.
 It can be a signal to the commit operation, as after the final
operation of the transaction completes its execution, the data has
to be saved to the database through the commit operation.

 If some kind of error occurs during this state, the transaction goes
into a failed state, else it goes into the Committed state.
iii) Committed
This state of transaction is achieved when all the transaction-related
operations have been executed successfully along with the Commit
operation, i.e. data is saved into the database after the required
manipulations in this state. This marks the successful completion of
a transaction.
iv) Failed
 If any of the transaction-related operations cause an error during
the active or partially committed state, further execution of the
transaction is stopped and it is brought into a failed state. Here,
the database recovery system makes sure that the database is in
a consistent state.
v) Aborted
If the error is not resolved in the failed state, the transaction is aborted and a rollback operation is performed to bring the database to the last saved consistent state. When a transaction is aborted, the database recovery module either restarts the transaction or kills it.
The illustration below shows the various states that a transaction
may encounter in its completion journey.

[Figure: Transaction states in DBMS]

Properties of Transaction
 As transactions deal with accessing and modifying the contents of
the database, they must have some basic properties which help
maintain the consistency and integrity of the database before and
after the transaction. Transactions follow 4 properties,
namely, Atomicity, Consistency, Isolation, and Durability.
 Generally, these are referred to as ACID properties of
transactions in DBMS. ACID is the acronym used for transaction
properties. A brief description of each property of the transaction
is as follows.
i) Atomicity
 This property ensures that either all operations of a transaction
are executed or it is aborted. In any case, a transaction can never
be completed partially.
 Each transaction is treated as a single unit (like an atom).
Atomicity is achieved through commit and rollback operations,
i.e. changes are made to the database only if all operations
related to a transaction are completed, and if it gets interrupted,
any changes made are rolled back using rollback operation to
bring the database to its last saved state.
ii) Consistency
 This property of a transaction keeps the database consistent
before and after a transaction is completed.
 Execution of any transaction must ensure that after its execution,
the database is either in its prior stable state or a new stable
state.
 In other words, the result of a transaction should be the
transformation of a database from one consistent state to another
consistent state.
 Consistency here means that the changes made to the database are the
result only of the logical operations the user intended to perform,
with no ambiguity.
iii) Isolation
 This property states that two transactions must not interfere with
each other, i.e. if some data is used by a transaction for its
execution, then any other transaction can not concurrently access
that data until the first transaction has completed.
 It ensures that the integrity of the database is maintained and we
don’t get any ambiguous values. Thus, any two transactions are
isolated from each other.
 This property is enforced by the concurrency control subsystem
of DBMS.
iv) Durability
 This property ensures that the changes made to the database
after a transaction is completely executed, are durable.
 It indicates that permanent changes are made by the successful
execution of a transaction.
 In the event of any system failures or crashes, the consistent
state achieved after the completion of a transaction remains
intact. The recovery subsystem of DBMS is responsible for
enforcing this property.
Introduction to Transaction Processing
Single user system :
In this, at most one user at a time can use the system.
Multi-user system :
In this, many users can access the system concurrently.
Concurrency can be provided through :
1. Interleaved Processing –
In this, the concurrent execution of processes is interleaved in a single
CPU. The transactions are interleaved, meaning the second transaction is
started before the first one finishes, and execution can switch back and
forth among several transactions. Without concurrency control, such
interleaving can cause inconsistency in the system.
2. Parallel Processing –
It is defined as processing in which a large task is divided into
smaller tasks, and those smaller tasks execute concurrently on several
nodes. In this, the processes are concurrently executed in multiple CPUs.
Transaction :
It is a logical unit of database processing that includes one or more access
operations. (read-retrieval, write-insert or update). It is a unit of program
execution that accesses and if required updates various data items.
A transaction is a set of operations that can either be embedded within an
application program or can be specified interactively via a high-level
language such as SQL.
Example –
Consider a transaction that involves transferring $1700 from a customer’s
savings account to a customer’s checking account. This transaction involves
two separate operations: debiting the savings account by $1700 and
crediting the checking account by $1700. If one operation succeeds but the
other doesn’t, the books of the bank will not balance.
Transaction boundaries :
Transactions have begin and end boundaries. An application program may
contain several transactions, separated by the begin and end of each
transaction within the program.
Granularity of data :
 The size of data item is called its granularity.
 A data item can be an individual field (attribute), value of some record, a
record, or a whole disk block.
 These concurrency control concepts are independent of granularity.
Advantages :
 Batch processing or real-time processing available.
 Reduction in processing time, lead time and order cycle time.
 Reduction in inventory, personnel and ordering costs.
 Increase in productivity and customer satisfaction.
Disadvantages :
 High setup costs.
 Lack of standard formats.
 Hardware and software incompatibility.

ACID Properties in DBMS


This article is based on the concept of ACID properties in DBMS that
are necessary for maintaining data consistency, integrity, and
reliability while performing transactions in the database. Let’s
explore them.
A transaction is a single logical unit of work that accesses and
possibly modifies the contents of a database. Transactions access
data using read-and-write operations. To maintain consistency in a
database, before and after the transaction, certain properties are
followed. These are called ACID properties.
Atomicity:
By this, we mean that either the entire transaction takes place at
once or doesn’t happen at all. There is no midway i.e. transactions
do not occur partially. Each transaction is considered as one unit
and either runs to completion or is not executed at all. It involves
the following two operations.
— Abort : If a transaction aborts, changes made to the database
are not visible.
— Commit : If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider the following transaction T consisting of T1 and T2 :
Transfer of 100 from account X to account Y .
If the transaction fails after completion of T1 but before completion
of T2 .( say, after write(X) but before write(Y) ), then the amount
has been deducted from X but not added to Y . This results in an
inconsistent database state. Therefore, the transaction must be
executed in its entirety in order to ensure the correctness of the
database state.
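The all-or-nothing behaviour described above can be demonstrated with SQLite's transaction support. The table, account names, and amounts below are made up for illustration; the point is that `rollback()` undoes the partial debit when the failure occurs between the two writes:

```python
import sqlite3

# In-memory database with two accounts; schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 500), ('Y', 200)")
conn.commit()

try:
    # T1 begins: debit X (write(X))
    conn.execute("UPDATE account SET balance = balance - 100 WHERE name = 'X'")
    # Simulated failure between write(X) and write(Y)
    raise RuntimeError("crash before credit")
    # The credit (write(Y)) is never reached in this run:
    conn.execute("UPDATE account SET balance = balance + 100 WHERE name = 'Y'")
    conn.commit()
except RuntimeError:
    conn.rollback()  # atomicity: undo the partial debit

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'X': 500, 'Y': 200} -- the debit was rolled back
```

Had the failure not occurred, `commit()` would have made both writes durable together; either both updates survive or neither does.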
Consistency:
This means that integrity constraints must be maintained so that the
database is consistent before and after the transaction. It refers to
the correctness of a database. Referring to the example above,
The total amount before and after the transaction must be
maintained.
Total before T occurs = 500 + 200 = 700 .
Total after T occurs = 400 + 300 = 700 .
Therefore, the database is consistent . Inconsistency occurs in
case T1 completes but T2 fails. As a result, T is incomplete.
Isolation:
This property ensures that multiple transactions can occur
concurrently without leading to the inconsistency of the database
state. Transactions occur independently without interference.
Changes occurring in a particular transaction will not be visible to
any other transaction until that particular change in that transaction
is written to memory or has been committed. This property ensures
that the execution of transactions concurrently will result in a state
that is equivalent to the state achieved if they were executed serially
in some order.
Let X = 50,000 and Y = 500, and suppose transaction T withdraws 50
from Y.
Consider two transactions T and T''.
Suppose T has been executed till Read(Y) and then T'' starts. As a
result, interleaving of operations takes place, due to which T'' reads
the correct value of X but a stale value of Y, and the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction:
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a discrepancy of 50
units.
Hence, transactions must take place in isolation and changes should
be visible only after they have been made to the main memory.
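The interleaving above can be reproduced in a few lines. This sketch uses the totals from the sums shown (X = 50,000, Y = 500, with T withdrawing 50 from Y); the variable names are illustrative:

```python
# Reproducing the interleaving above: T'' reads Y before T writes its
# updated value, so the total it computes is stale by 50 units.
X, Y = 50_000, 500

# T: read(Y) and compute the new value in its local buffer.
t_local_y = Y - 50

# T'' interleaves here: it reads the current X and the *old* Y.
sum_seen_by_t2 = X + Y          # 50,500 -- based on stale Y

# T resumes: write(Y).
Y = t_local_y

sum_after_t = X + Y             # 50,450 -- the true total
print(sum_seen_by_t2 - sum_after_t)  # 50-unit discrepancy
```

Under isolation, T'' would have been forced to read Y only after T's write, and both sums would agree.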
Durability:
This property ensures that once the transaction has completed
execution, the updates and modifications to the database are stored
in and written to disk and they persist even if a system failure
occurs. These updates now become permanent and are stored in
non-volatile memory. The effects of the transaction, thus, are never
lost.
Some important points:

Property       Responsibility for maintaining the property
Atomicity      Transaction Manager
Consistency    Application programmer
Isolation      Concurrency Control Manager
Durability     Recovery Manager

The ACID properties, in totality, provide a mechanism to ensure the
correctness and consistency of a database in a way such that each
transaction is a group of operations that acts as a single unit,
produces consistent results, acts in isolation from other operations,
and updates that it makes are durably stored.
ACID properties are the four key characteristics that define the
reliability and consistency of a transaction in a Database
Management System (DBMS). The acronym ACID stands for
Atomicity, Consistency, Isolation, and Durability. Here is a brief
description of each of these properties:
1. Atomicity: Atomicity ensures that a transaction is treated as a
single, indivisible unit of work. Either all the operations within the
transaction are completed successfully, or none of them are. If
any part of the transaction fails, the entire transaction is rolled
back to its original state, ensuring data consistency and integrity.
2. Consistency: Consistency ensures that a transaction takes the
database from one consistent state to another consistent state.
The database is in a consistent state both before and after the
transaction is executed. Constraints, such as unique keys and
foreign keys, must be maintained to ensure data consistency.
3. Isolation: Isolation ensures that multiple transactions can
execute concurrently without interfering with each other. Each
transaction must be isolated from other transactions until it is
completed. This isolation prevents dirty reads, non-repeatable
reads, and phantom reads.
4. Durability: Durability ensures that once a transaction is
committed, its changes are permanent and will survive any
subsequent system failures. The transaction’s changes are saved
to the database permanently, and even if the system crashes, the
changes remain intact and can be recovered.
Overall, ACID properties provide a framework for ensuring data
consistency, integrity, and reliability in DBMS. They ensure that
transactions are executed in a reliable and consistent manner, even
in the presence of system failures, network issues, or other
problems. These properties make DBMS a reliable and efficient tool
for managing data in modern organizations.
Advantages of ACID Properties in DBMS
1. Data Consistency: ACID properties ensure that the data
remains consistent and accurate after any transaction execution.
2. Data Integrity: ACID properties maintain the integrity of the
data by ensuring that any changes to the database are
permanent and cannot be lost.
3. Concurrency Control: ACID properties help to manage multiple
transactions occurring concurrently by preventing interference
between them.
4. Recovery: ACID properties ensure that in case of any failure or
crash, the system can recover the data up to the point of failure
or crash.
Disadvantages of ACID Properties in DBMS
1. Performance: The ACID properties can cause a performance
overhead in the system, as they require additional processing to
ensure data consistency and integrity.
2. Scalability: The ACID properties may cause scalability issues in
large distributed systems where multiple transactions occur
concurrently.
3. Complexity: Implementing the ACID properties can increase the
complexity of the system and require significant expertise and
resources.
Overall, the advantages of ACID properties in DBMS outweigh the
disadvantages. They provide a reliable and consistent approach
to data management, ensuring data integrity, accuracy, and
reliability. However, in some cases, the overhead of implementing
ACID properties can cause performance and scalability issues.
Therefore, it’s important to balance the benefits of ACID
properties against the specific needs and requirements of the
system.

Concurrency Control Techniques


Concurrency control is provided in a database to:
 (i) enforce isolation among transactions.
 (ii) preserve database consistency through consistency
preserving execution of transactions.
 (iii) resolve read-write and write-read conflicts.
Various concurrency control techniques are:
1. Two-phase locking Protocol
2. Time stamp ordering Protocol
3. Multi version concurrency control
4. Validation concurrency control
Concurrency control techniques in a Database Management System
(DBMS) manage simultaneous operations without conflicts.
Techniques like lock-based protocols, timestamp ordering,
and optimistic concurrency control ensure that database
transactions remain consistent, even when multiple transactions
access the same data concurrently. These methods prevent
problems such as deadlocks, dirty reads, and lost updates.
These are briefly explained below.
1. Two-Phase Locking Protocol: Locking is an operation which secures
either permission to read or permission to write a data item. Two-phase
locking is a process used to gain ownership of shared resources in a way
that guarantees serializability. The three activities taking place in
the two-phase update algorithm are:
(i). Lock Acquisition
(ii). Modification of Data
(iii). Release Lock
The conservative variant of two-phase locking prevents deadlock in
distributed systems by releasing all the resources a transaction has
acquired whenever it cannot acquire every resource it requires without
waiting for another process to finish using a lock. This means that no
process is ever holding some shared resources while waiting for another
process to release a shared resource that it requires, so deadlock
cannot occur due to resource contention. (Basic two-phase locking by
itself does not prevent deadlock.) A transaction in the Two-Phase
Locking Protocol can assume one of two phases:
 (i) Growing Phase: In this phase a transaction can only acquire
locks but cannot release any lock. The point when a transaction
acquires all the locks it needs is called the Lock Point.
 (ii) Shrinking Phase: In this phase a transaction can only
release locks but cannot acquire any.
2. Time Stamp Ordering Protocol: A timestamp is a tag that can
be attached to any transaction or any data item, which denotes a
specific time on which the transaction or the data item had been
used in any way. A timestamp can be implemented in 2 ways. One is
to directly assign the current value of the clock to the transaction or
data item. The other is to attach the value of a logical counter that is
incremented as new timestamps are required. The timestamp of
a data item can be of 2 types:
 (i) W-timestamp(X): This means the latest time when the data
item X has been written into.
 (ii) R-timestamp(X): This means the latest time when the data
item X has been read from. These 2 timestamps are updated
each time a successful read/write operation is performed on the
data item X.
3. Multiversion Concurrency Control: Multiversion schemes keep old
versions of data items to increase concurrency.
Multiversion 2-phase locking: Each successful write results in the
creation of a new version of the data item written. Timestamps are used
to label the versions. When a read(X) operation is issued, an
appropriate version of X is selected based on the timestamp of the
transaction.
4. Validation Concurrency Control: The optimistic approach is based on
the assumption that the majority of database operations do not conflict.
The optimistic approach requires neither locking nor timestamping
techniques. Instead, a transaction is executed without restrictions
until it is committed. Using an optimistic approach, each transaction
moves through 2 or 3 phases, referred to as read, validation, and
write.
 (i) During read phase, the transaction reads the database,
executes the needed computations and makes the updates to a
private copy of the database values. All update operations of the
transactions are recorded in a temporary update file, which is not
accessed by the remaining transactions.
 (ii) During the validation phase, the transaction is validated to
ensure that the changes made will not affect the integrity and
consistency of the database. If the validation test is positive, the
transaction goes to a write phase. If the validation test is
negative, the transaction is restarted and the changes are
discarded.
 (iii) During the write phase, the changes are permanently applied
to the database.

Lock Based Concurrency Control Protocol in


DBMS
In a database management system (DBMS), lock-based concurrency
control (LBCC) is used to control the access of multiple transactions
to the same data item. This protocol helps to maintain data
consistency and integrity across multiple users.
In the protocol, transactions gain locks on data items to control their
access and prevent conflicts between concurrent transactions. This
article will look deep into the Lock Based Protocol in detail.
What is a Lock?
A Lock is a variable assigned to any data item to keep track of the
status of that data item so that isolation and non-interference are
ensured during concurrent transactions.
Lock-based concurrency control ensures that transactions in a
database can proceed safely without causing conflicts.
Lock Based Protocols
A lock is a variable associated with a data item that describes the
status of the data item to possible operations that can be applied to
it. They synchronize the access by concurrent transactions to the
database items. It is required in this protocol that all the data items
must be accessed in a mutually exclusive manner. Let me introduce
you to two common locks that are used and some terminology
followed in this protocol.
Types of Lock
1. Shared Lock (S): Shared Lock is also known as Read-only lock.
As the name suggests it can be shared between transactions
because while holding this lock the transaction does not have the
permission to update data on the data item. S-lock is requested
using lock-S instruction.
2. Exclusive Lock (X): With an exclusive lock, the data item can be
both read and written. This lock is exclusive and cannot be held
simultaneously on the same data item. X-lock is requested using the
lock-X instruction.
Lock Compatibility Matrix
A transaction may be granted a lock on an item if the requested lock
is compatible with locks already held on the item by other
transactions. Any number of transactions can hold shared locks on
an item, but if any transaction holds an exclusive(X) on the item no
other transaction may hold any lock on the item. If a lock cannot be
granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released.
Then the lock is granted.

Lock Compatibility Matrix
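The matrix amounts to a single rule (only shared/shared is compatible), which can be encoded directly. This is a sketch of the grant check only, not a real lock manager with queueing:

```python
# Compatibility of a requested lock with a lock already held,
# following the lock compatibility matrix above.
COMPATIBLE = {
    ("S", "S"): True,   # any number of readers may coexist
    ("S", "X"): False,  # a reader must wait for a writer
    ("X", "S"): False,  # a writer must wait for readers
    ("X", "X"): False,  # writers are mutually exclusive
}

def can_grant(requested, held_locks):
    """Grant iff the requested lock is compatible with every held lock."""
    return all(COMPATIBLE[(requested, h)] for h in held_locks)

print(can_grant("S", ["S", "S"]))  # True: shared locks coexist
print(can_grant("X", ["S"]))       # False: must wait for the reader
print(can_grant("X", []))          # True: no locks held on the item
```

A request that cannot be granted would be placed in a wait queue until the incompatible locks are released, as the text describes.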

Concurrency Control Protocols


Concurrency control protocols allow concurrent schedules, but
ensure that the schedules are conflict/view serializable, and are
recoverable and maybe even cascadeless. These protocols do not
examine the precedence graph as it is being created; instead, a
protocol imposes a discipline that avoids non-serializable schedules.
Different concurrency control protocols provide different trade-offs
between the amount of concurrency they allow and the amount of
overhead that they impose.
 Lock Based Protocol
o Basic 2-PL
o Conservative 2-PL
o Strict 2-PL
o Rigorous 2-PL
 Graph Based Protocol
 Time-Stamp Ordering Protocol
 Multiple Granularity Protocol
 Multi-version Protocol
Types of Lock-Based Protocols
1. Simplistic Lock Protocol
It is the simplest method for locking data during a transaction.
Simple lock-based protocols enable all transactions to obtain a lock
on the data before inserting, deleting, or updating it. It will unlock
the data item once the transaction is completed.
2. Pre-Claiming Lock Protocol
Pre-claiming Lock Protocols assess transactions to determine which
data elements require locks. Before executing the transaction, it
asks the DBMS for a lock on all of the data elements. If all locks are
given, this protocol will allow the transaction to start. When the
transaction is finished, it releases all locks. If all of the locks are not
provided, this protocol allows the transaction to be reversed and
waits until all of the locks are granted.
3. Two-phase locking (2PL)
A transaction is said to follow the Two-Phase Locking protocol if
Locking and Unlocking can be done in two phases
 Growing Phase: New locks on data items may be acquired but
none can be released.
 Shrinking Phase: Existing locks may be released but no new
locks can be acquired.
For more detail refer the published article Two-phase locking (2PL).
4. Strict Two-Phase Locking Protocol
Strict Two-Phase Locking requires that, in addition to following 2-PL,
all Exclusive (X) locks held by the transaction are held until after
the transaction commits. For more details refer to the
published article Strict Two-Phase Locking Protocol.
Upgrade / Downgrade locks
A transaction that holds a lock on an item A is allowed under certain
conditions to change the lock state from one state to another.
Upgrade: S(A) can be upgraded to X(A) if Ti is the only transaction
holding the S-lock on element A.
Downgrade: We may downgrade X(A) to S(A) when we no longer want to
write on data item A. As we were holding an X-lock on A, we need not
check any conditions.
So, by now we have been introduced to the types of locks and how to
apply them. But simply applying locks does not solve every problem. If
you have studied Process Synchronization in Operating Systems, you will
be familiar with two persistent problems: starvation and deadlock. We
will discuss them shortly; for now, note that locks must follow a set
of protocols to avoid such undesirable outcomes. Two-Phase Locking
(2-PL) uses locks to guarantee serializability, but applying simple
locking alone may not always produce serializable results, and may lead
to deadlock or inconsistency.
Problem With Simple Locking
Consider the Partial Schedule:
S.No    T1              T2
1       lock-X(B)
2       read(B)
3       B := B - 50
4       write(B)
5                       lock-S(A)
6                       read(A)
7                       lock-S(B)
8       lock-X(A)
9       ......          ......
1. Deadlock
Consider the above partial execution. T1 holds an Exclusive lock over
B, and T2 holds a Shared lock over A. At statement 7, T2 requests a
lock on B, while at statement 8, T1 requests a lock on A. As you may
notice, this creates a deadlock, as neither transaction can proceed
with its execution.

Deadlock
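A deadlock like the one above shows up as a cycle in the wait-for graph (an edge from Ti to Tj means Ti is waiting for a lock Tj holds). It can be detected with a depth-first search; this is a generic sketch, with transaction names matching the schedule above:

```python
def has_cycle(wait_for):
    """DFS over the wait-for graph; a back edge means deadlock."""
    visiting, done = set(), set()

    def dfs(t):
        visiting.add(t)
        for u in wait_for.get(t, []):
            # Reaching a node already on the current path is a cycle.
            if u in visiting or (u not in done and dfs(u)):
                return True
        visiting.discard(t)
        done.add(t)
        return False

    return any(dfs(t) for t in wait_for if t not in done)

# T1 waits for T2 (lock on B held by... T1? no: B held by T1, A by T2):
# here T2 waits for T1's lock on B, and T1 waits for T2's lock on A.
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True -> deadlock
print(has_cycle({"T1": ["T2"]}))                # False -> no deadlock
```

On detecting a cycle, a DBMS typically aborts one transaction (the victim) to break the deadlock.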
2. Starvation
Starvation is also possible if concurrency control manager is badly
designed. For example: A transaction may be waiting for an X-lock
on an item, while a sequence of other transactions request and are
granted an S-lock on the same item. This may be avoided if the
concurrency control manager is properly designed.

Timestamp based Concurrency Control


Timestamp-based concurrency control is a method used in database
systems to ensure that transactions are executed safely and
consistently without conflicts, even when multiple transactions are
being processed simultaneously. This approach relies on timestamps
to manage and coordinate the execution order of transactions. Refer
to the timestamp of a transaction T as TS(T).
What is Timestamp Ordering Protocol?
The main idea for this protocol is to order the transactions based on
their Timestamps. A schedule in which the transactions participate is
then serializable and the only equivalent serial schedule
permitted has the transactions in the order of their Timestamp
Values. Stating simply, the schedule is equivalent to the
particular Serial Order corresponding to the order of the Transaction
timestamps. An algorithm must ensure that, for each item accessed
by Conflicting Operations in the schedule, the order in which the
item is accessed does not violate the ordering. To ensure this, use
two Timestamp Values relating to each database item X.
 W_TS(X) is the largest timestamp of any transaction that
executed write(X) successfully.
 R_TS(X) is the largest timestamp of any transaction that
executed read(X) successfully.
Basic Timestamp Ordering
Every transaction is issued a timestamp based on when it enters the
system. Suppose an old transaction Ti has timestamp TS(Ti); a
new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) <
TS(Tj). The protocol manages concurrent execution such that the
timestamps determine the serializability order. The timestamp
ordering protocol ensures that any conflicting read and write
operations are executed in timestamp order. Whenever some
Transaction T tries to issue a R_item(X) or a W_item(X), the Basic TO
algorithm compares the timestamp of T with R_TS(X) &
W_TS(X) to ensure that the Timestamp order is not violated. This
describes the Basic TO protocol in the following two cases.
 Whenever a Transaction T issues a W_item(X) operation, check
the following conditions:
o If R_TS(X) > TS(T) or W_TS(X) > TS(T), then abort
and roll back T and reject the operation; else,
o Execute W_item(X) operation of T and set W_TS(X) to
TS(T).
 Whenever a Transaction T issues a R_item(X) operation, check
the following conditions:
o If W_TS(X) > TS(T), then abort and roll back T and
reject the operation; else,
o If W_TS(X) <= TS(T), then execute the R_item(X)
operation of T and set R_TS(X) to the larger of TS(T) and
current R_TS(X).
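The two rules can be written down directly. Here `r_ts`/`w_ts` hold R_TS(X) and W_TS(X) per item, rejection is modelled as an exception, and the write rule rejects when a younger transaction has read or written the item; names are illustrative:

```python
r_ts, w_ts = {}, {}   # R_TS(X) and W_TS(X) per item, defaulting to 0

class Rollback(Exception):
    """Raised when an operation would violate timestamp order."""

def read_item(ts, x):
    if w_ts.get(x, 0) > ts:            # a younger txn already wrote x
        raise Rollback(f"T{ts} aborted on read({x})")
    r_ts[x] = max(r_ts.get(x, 0), ts)  # record the latest reader

def write_item(ts, x):
    # Reject if a younger transaction has read OR written x.
    if r_ts.get(x, 0) > ts or w_ts.get(x, 0) > ts:
        raise Rollback(f"T{ts} aborted on write({x})")
    w_ts[x] = ts

write_item(1, "X")      # T1 writes X: allowed, W_TS(X) becomes 1
read_item(2, "X")       # T2 reads X: allowed, R_TS(X) becomes 2
try:
    write_item(1, "X")  # T1 writes again: R_TS(X) = 2 > 1 -> abort
except Rollback as e:
    print(e)            # T1 aborted on write(X)
```

In a full system the aborted transaction would be restarted with a fresh, larger timestamp so that it eventually succeeds.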
Whenever the Basic TO algorithm detects two conflicting operations
that occur in an incorrect order, it rejects the latter of the two
operations by aborting the Transaction that issued it. Schedules
produced by Basic TO are guaranteed to be conflict serializable.
As already discussed, using timestamps also ensures that our
schedule will be deadlock free.
One drawback of the Basic TO protocol is that Cascading
Rollback is still possible. Suppose we have a Transaction T1, and
T2 has used a value written by T1. If T1 is aborted and resubmitted to
the system, then T2 must also be aborted and rolled back. So the
problem of Cascading aborts still prevails. Let’s gist the Advantages
and Disadvantages of Basic TO protocol:
 Timestamp Ordering protocol ensures serializability since the
precedence graph will be of the form:
Precedence Graph for TS ordering
 Timestamp protocol ensures freedom from deadlock as no
transaction ever waits.
 But the schedule may not be cascade free, and may not even be
recoverable.
Strict Timestamp Ordering
A variation of Basic TO, called Strict TO, ensures that the
schedules are both strict and conflict serializable. In this variation, a
Transaction T that issues a R_item(X) or W_item(X) such that TS(T)
> W_TS(X) has its read or write operation delayed until the
Transaction T‘ that wrote the values of X has committed or aborted.
Advantages of Timestamp Ordering Protocol
 High Concurrency: Timestamp-based concurrency control
allows for a high degree of concurrency by ensuring that
transactions do not interfere with each other.
 Efficient: The technique is efficient and scalable, as it does not
require locking and can handle a large number of transactions.
 No Deadlocks: Since there are no locks involved, there is no
possibility of deadlocks occurring.
 Improved Performance: By allowing transactions to execute
concurrently, the overall performance of the database system can
be improved.
Disadvantages of Timestamp Ordering
Protocol
 Limited Granularity: The granularity of timestamp-based
concurrency control is limited to the precision of the timestamp.
This can lead to situations where transactions are unnecessarily
blocked, even if they do not conflict with each other.
 Timestamp Ordering: In order to ensure that transactions are
executed in the correct order, the timestamps need to be
carefully managed. If not managed properly, it can lead to
inconsistencies in the database.
 Timestamp Synchronization: Timestamp-based concurrency
control requires that all transactions have synchronized clocks. If
the clocks are not synchronized, it can lead to incorrect ordering
of transactions.
 Timestamp Allocation: Allocating unique timestamps for each
transaction can be challenging, especially in distributed systems
where transactions may be initiated at different locations.
