Unit 4_Concepts of Concurrency Control

Uploaded by

omvati343
DBMS (UNIT 4)

 Transaction Processing
 Transaction system
 Testing of serializability
 Recoverable schedule
 Concurrency control
 Locking techniques for concurrency control
 Time stamping protocols for concurrency control
 Validation based protocol
 Multiple granularity
 Recovery from transaction failures
 Log based recovery
 Checkpoints
 Shadow paging
 Deadlock handling

SYLLABUS
 A transaction is a set of logically related operations; it groups
several tasks into one unit of work.
 A transaction is an action or series of actions performed by a single
user to access or modify the contents of the database.
 Example: Suppose a bank employee transfers Rs 800 from X's
account to Y's account. This small transaction contains several low-level
tasks:

TRANSACTION PROCESSING
 Open_Account(X)
 Old_Balance = X.balance
 New_Balance = Old_Balance - 800
 X.balance = New_Balance
 Close_Account(X)

X'S ACCOUNT
 Open_Account(Y)
 Old_Balance = Y.balance
 New_Balance = Old_Balance + 800
 Y.balance = New_Balance
 Close_Account(Y)

Y'S ACCOUNT
 Following are the main operations of transaction:
 Read(X): Reads the value of X from the database and stores it in a
buffer in main memory.
 Write(X): Writes the value of X from the buffer back to the database.
 For example, a debit transaction on an account consists of the
following operations:
 1. R(X);
 2. X = X - 500;
 3. W(X);

OPERATIONS OF TRANSACTION:
 Let's assume the value of X before starting of the transaction is 4000.
 The first operation reads X's value from database and stores it in a buffer.
 The second operation will decrease the value of X by 500. So buffer will
contain 3500.
 The third operation will write the buffer's value to the database. So X's final
value will be 3500.
 However, a hardware, software, or power failure may cause the
transaction to fail before it has finished all the operations in the set.

CONT…..
 For example: if the debit transaction above fails after executing
operation 2, the new value exists only in the buffer and X's value remains
4000 in the database, which is not acceptable to the bank.
 To solve this problem, we have two important operations:
 Commit: It is used to save the work done permanently.
 Rollback: It is used to undo the work done.
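The effect of commit and rollback can be sketched in a few lines of Python. This is a minimal illustration, not how a real DBMS is built: the `Transaction` class, the `transfer` helper, and the dictionary standing in for the database are all hypothetical, and the sketch assumes a simplified model in which writes are staged in the buffer and reach the database only at commit.

```python
class Transaction:
    """Toy transaction: changes are staged in a private buffer until commit."""

    def __init__(self, database):
        self.database = database  # dict standing in for the database on disk
        self.buffer = {}          # main-memory buffer private to this transaction

    def read(self, item):
        # R(X): copy the value from the database into the buffer (first read only).
        if item not in self.buffer:
            self.buffer[item] = self.database[item]
        return self.buffer[item]

    def write(self, item, value):
        # Stage the new value; the database itself is not touched yet.
        self.buffer[item] = value

    def commit(self):
        # Save the work done permanently.
        self.database.update(self.buffer)
        self.buffer.clear()

    def rollback(self):
        # Undo the work done by discarding the staged changes.
        self.buffer.clear()


def transfer(database, source, target, amount):
    """Move `amount` from source to target; return True on commit, False on rollback."""
    txn = Transaction(database)
    try:
        balance = txn.read(source)
        if balance < amount:
            raise ValueError("insufficient funds")
        txn.write(source, balance - amount)
        txn.write(target, txn.read(target) + amount)
        txn.commit()
        return True
    except (KeyError, ValueError):
        txn.rollback()
        return False


accounts = {"X": 4000, "Y": 1000}
print(transfer(accounts, "X", "Y", 800), accounts)  # committed transfer
print(transfer(accounts, "X", "Z", 800), accounts)  # account Z is missing: rolled back
```

In the second call the debit from X has already been staged when the missing account is detected, yet the database is left unchanged because the buffer is discarded.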

CONT….
 A transaction is a small unit of a program and may contain several
low-level tasks. A transaction in a database system must maintain
Atomicity, Consistency, Isolation, and Durability − commonly known as
ACID properties − in order to ensure accuracy, completeness, and data
integrity.

ACID PROPERTIES
 Atomicity − This property states that a transaction must be treated as
an atomic unit, that is, either all of its operations are executed or none.
There must be no state in a database where a transaction is left partially
completed. States should be defined either before the execution of the
transaction or after the execution, abort, or failure of the transaction.
 Consistency − The database must remain in a consistent state after any
transaction. No transaction should have any adverse effect on the data
residing in the database. If the database was in a consistent state before
the execution of a transaction, it must remain consistent after the
execution of the transaction as well.

CONT….
 Durability − The database should be durable enough to hold all its latest
updates even if the system fails or restarts. If a transaction updates a chunk
of data in a database and commits, then the database will hold the
modified data. If a transaction commits but the system fails before the data
could be written on to the disk, then that data will be updated once the
system springs back into action.
 Isolation − In a database system where more than one transaction is
being executed simultaneously, the property of isolation states that each
transaction is carried out as if it were the only transaction in the
system. No transaction should affect the execution of any other
transaction.

CONT…..
 A serialization graph (also called a precedence graph) is used to test the serializability of a schedule.
 Assume a schedule S. For S, we construct the precedence graph G = (V, E), where V is a set of
vertices and E is a set of edges. V contains all the transactions participating in the
schedule. E contains all edges Ti → Tj (with Ti ≠ Tj) for which one of the
three conditions holds for some data item Q:
 Create an edge Ti → Tj if Ti executes write(Q) before Tj executes read(Q).
 Create an edge Ti → Tj if Ti executes read(Q) before Tj executes write(Q).
 Create an edge Ti → Tj if Ti executes write(Q) before Tj executes write(Q).

TESTING OF SERIALIZABILITY
If the precedence graph contains an edge Ti → Tj, then in any serial schedule equivalent to S,
Ti must appear before Tj.
If the precedence graph for schedule S contains a cycle, then S is not conflict serializable. If the
precedence graph has no cycle, then S is conflict serializable.
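The precedence-graph test can be sketched as follows. This is a minimal illustration under stated assumptions: a schedule is represented as a list of hypothetical `(transaction, operation, item)` tuples with operations `"R"` and `"W"`, and the function names are invented for this sketch.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) implied by conflicting operations."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {node: 0 for node in graph}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            if state[nxt] == 1 or (state[nxt] == 0 and visit(nxt)):
                return True
        state[node] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# T2 writes A between T1's read and T1's write: edges T1 -> T2 and T2 -> T1.
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
# T1 finishes before T2 starts: the only edge is T1 -> T2.
serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
print(is_conflict_serializable(interleaved), is_conflict_serializable(serial))
```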
 Recoverability is a property of database systems that ensures that, in
the event of a failure or error, the system can recover the database to a
consistent state. Recoverability guarantees that all committed
transactions are durable and that their effects are permanently stored in
the database, while the effects of uncommitted transactions are undone
to maintain data consistency.

RECOVERABLE SCHEDULE
 The recoverability property is enforced through the use of transaction
logs, which record all changes made to the database during transaction
processing. When a failure occurs, the system uses the log to recover
the database to a consistent state, which involves either undoing the
effects of uncommitted transactions or redoing the effects of
committed transactions.

CONT…..
 No-undo logging: This level of recoverability only guarantees that
committed transactions are durable, but does not provide the ability to
undo the effects of uncommitted transactions.
 Undo logging: This level of recoverability provides the ability to undo
the effects of uncommitted transactions but may result in the loss of
updates made by committed transactions that occur after the failed
transaction.

THERE ARE SEVERAL LEVELS OF RECOVERABILITY THAT CAN BE SUPPORTED
BY A DATABASE SYSTEM:
 Redo logging: This level of recoverability provides the ability to redo
the effects of committed transactions, ensuring that all committed
updates are durable and can be recovered in the event of failure.
 Undo-redo logging: This level of recoverability provides both undo
and redo capabilities, ensuring that the system can recover to a
consistent state regardless of whether a transaction has been
committed or not.

CONT….
 Schedules in which transactions commit only after all transactions
whose changes they read commit are called recoverable schedules. In
other words, if some transaction Tj is reading value updated or written
by some other transaction Ti, then the commit of Tj must occur after the
commit of Ti.

RECOVERABLE SCHEDULES:
EXAMPLE: CONSIDER THE FOLLOWING SCHEDULE
INVOLVING TWO TRANSACTIONS T1 AND T2.
[Schedule table not recovered from the original slide.]
THIS IS A RECOVERABLE SCHEDULE SINCE T1 COMMITS
BEFORE T2, WHICH MAKES THE VALUE READ BY T2 CORRECT.
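The recoverability condition can be checked mechanically. The sketch below is illustrative only: the schedules are hypothetical (they are not the schedule from the slide), operations are `(transaction, operation, item)` tuples with `"R"`, `"W"`, and `"C"` for commit, and aborts are ignored for simplicity.

```python
def is_recoverable(schedule):
    """Return True if every transaction commits only after the transactions
    whose writes it has read have committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # Tj -> set of transactions Ti whose writes Tj has read
    committed = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "C":
            # Tj may commit only if every Ti it read from is already committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


# T2 reads A written by T1, and T1 commits first: recoverable.
good = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
# Same reads, but T2 commits before T1: not recoverable.
bad = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]
print(is_recoverable(good), is_recoverable(bad))
```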
 Concurrency control is a very important concept in DBMS. It allows
several processes or users to execute or manipulate data simultaneously
without causing data inconsistency. Concurrency control deals with the
interleaved execution of more than one transaction.
 What is a transaction?
 A set of logically related operations is known as a transaction. The main
operations of a transaction are:
 Read(A): The read operation Read(A), or R(A), reads the value of A from the
database and stores it in a buffer in main memory.
 Write(A): The write operation Write(A), or W(A), writes the value back to the
database from the buffer.

CONCURRENCY CONTROL
 Executing one transaction at a time increases the waiting time of
the other transactions, which may delay overall execution. Hence, to
increase the overall throughput and efficiency of the system, several
transactions are executed concurrently.
 Concurrency control provides a procedure that is able to control
concurrent execution of the operations in the database.

CONCURRENCY CONTROL IN DBMS


 In a multiprogramming environment where multiple transactions can
be executed simultaneously, it is highly important to control the
concurrency of transactions. We have concurrency control protocols to
ensure atomicity, isolation, and serializability of concurrent transactions.
Concurrency control protocols can be broadly divided into two
categories −
 Lock based protocols
 Time stamp based protocols

LOCKING TECHNIQUES FOR CONCURRENCY CONTROL
 Database systems equipped with lock-based protocols use a
mechanism by which no transaction can read or write data until it
acquires an appropriate lock on it. Locks are of two kinds −
 Binary Locks − A lock on a data item can be in two states; it is either
locked or unlocked.
 Shared/exclusive − This type of locking mechanism differentiates the
locks based on their uses. If a lock is acquired on a data item to
perform a write operation, it is an exclusive lock. Allowing more than
one transaction to write on the same data item would lead the
database into an inconsistent state. Read locks are shared because no
data value is being changed.
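The shared/exclusive rule can be sketched as a small lock table. This is a hypothetical illustration: the `LockTable` class and its methods are invented for this sketch, modes are `"S"` (shared) and `"X"` (exclusive), and a denied request simply returns `False` instead of queueing the transaction.

```python
class LockTable:
    """Toy lock table: many readers or one writer per data item."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # The sole holder may re-request the lock or upgrade S to X.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # read locks are shared
            return True
        return False          # any combination involving X conflicts

    def release(self, txn, item):
        _mode, holders = self.locks.get(item, (None, set()))
        holders.discard(txn)
        if not holders:
            self.locks.pop(item, None)


table = LockTable()
print(table.acquire("T1", "A", "S"))  # first reader
print(table.acquire("T2", "A", "S"))  # second reader shares the lock
print(table.acquire("T3", "A", "X"))  # writer must wait while readers hold A
```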

LOCK-BASED PROTOCOLS
 Two-phase locking (2PL) divides the execution of a transaction into
three parts. In the first part, when the transaction starts executing, it
seeks permission for the locks it requires.
 In the second part, the transaction acquires all the locks. As soon
as the transaction releases its first lock, the third part starts.
 In the third part, the transaction cannot demand any new locks; it only
releases the acquired locks.

TWO-PHASE LOCKING 2PL


 Two-phase locking has two phases, one is growing, where all the locks
are being acquired by the transaction; and the second phase is
shrinking, where the locks held by the transaction are being released.
 When lock conversion is allowed, a transaction that already holds a
shared (read) lock on an item can upgrade it to an exclusive (write) lock;
upgrades are permitted only in the growing phase.
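Whether a transaction obeys the two-phase rule can be checked from its sequence of lock operations. A minimal sketch, assuming a hypothetical representation in which each step is an `(action, item)` pair with actions `"lock"` and `"unlock"`:

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True   # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False       # a new lock in the shrinking phase violates 2PL
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```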

CONT……
 One of the most commonly used concurrency protocols is the timestamp-based
protocol. This protocol uses either the system time or a logical counter as a
timestamp.
 Lock-based protocols manage the order between the conflicting pairs
among transactions at the time of execution, whereas timestamp-based
protocols start working as soon as a transaction is created.
 Every transaction has a timestamp associated with it, and the ordering is
determined by the age of the transaction. A transaction created at clock
time 0002 is older than all transactions that come after it. For
example, a transaction 'y' entering the system at 0004 is two seconds
younger, and priority is given to the older one.

TIMESTAMP-BASED PROTOCOLS
 In addition, every data item is given the latest read-timestamp and
write-timestamp. This lets the system know when the last read and write
operations were performed on the data item.
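The read and write timestamps kept per data item are what a scheduler compares against a transaction's own timestamp. The slides do not spell out the comparison rules, so the sketch below assumes the standard basic timestamp-ordering checks; the class name and the use of positive integers as timestamps are choices made for this illustration.

```python
class TimestampOrdering:
    """Toy basic timestamp-ordering scheduler (timestamps are positive ints)."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp of any transaction that read it
        self.write_ts = {}  # item -> largest timestamp of any transaction that wrote it

    def read(self, ts, item):
        """Return True if the read is allowed, False if the transaction must roll back."""
        if ts < self.write_ts.get(item, 0):
            return False  # a younger transaction has already overwritten the item
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Return True if the write is allowed, False if the transaction must roll back."""
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False  # a younger transaction has already read or written the item
        self.write_ts[item] = ts
        return True


scheduler = TimestampOrdering()
print(scheduler.write(2, "Q"))  # transaction with timestamp 2 writes Q
print(scheduler.read(4, "Q"))   # younger transaction 4 reads Q
print(scheduler.write(2, "Q"))  # older transaction 2 arrives too late to write again
```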

CONT….
 Validation Based Protocol is also called the Optimistic Concurrency
Control Technique. This protocol is used in a DBMS (Database
Management System) to manage concurrent transactions. It is
called optimistic because it assumes that very little
interference occurs, so there is no need for checking while the
transaction is executed.

VALIDATION BASED PROTOCOL


 In this technique, no checking is done while the transaction is being
executed. Until the end of the transaction is reached, its updates
are not applied directly to the database. All updates are
applied to local copies of data items kept for the transaction. At the end
of transaction execution,
a validation phase checks whether any of the transaction's updates violate
serializability. If there is no violation of serializability, the transaction is
committed and the database is updated; otherwise, the transaction is
aborted and then restarted.

CONT…..
 Optimistic Concurrency Control is a three-phase protocol. The three phases for
validation based protocol:

 Read Phase:
Values of committed data items from the database can be read by a transaction.
Updates are only applied to local data versions.

 Validation Phase:
Checking is performed to make sure that there is no violation of serializability when
the transaction updates are applied to the database.

 Write Phase:
On the success of the validation phase, the transaction updates are applied to the
database; otherwise, the updates are discarded and the transaction is restarted.
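The three phases can be sketched in Python. This is an illustration under stated assumptions, not the exact validation test of any particular system: the `OptimisticDB` class is hypothetical, and validation here simply rejects a transaction if any item it read was written by a transaction that committed after it began.

```python
class OptimisticDB:
    """Toy optimistic concurrency control with read, validation, and write phases."""

    def __init__(self, data):
        self.data = dict(data)
        self.commit_log = []  # one write set per committed transaction, in commit order

    def begin(self):
        # Remember how many transactions had committed when this one started.
        return {"start": len(self.commit_log), "reads": set(), "writes": {}}

    def read(self, txn, item):
        # Read phase: record the item and prefer the transaction's local copy.
        txn["reads"].add(item)
        return txn["writes"].get(item, self.data[item])

    def write(self, txn, item, value):
        # Updates go to a local copy only.
        txn["writes"][item] = value

    def commit(self, txn):
        # Validation phase: did anyone who committed since we began write what we read?
        for write_set in self.commit_log[txn["start"]:]:
            if write_set & txn["reads"]:
                return False  # local updates are discarded; the caller restarts
        # Write phase: apply the local copies to the database.
        self.data.update(txn["writes"])
        self.commit_log.append(set(txn["writes"]))
        return True


db = OptimisticDB({"A": 100})
t1, t2 = db.begin(), db.begin()
db.write(t1, "A", db.read(t1, "A") - 10)
db.write(t2, "A", db.read(t2, "A") - 20)
print(db.commit(t1), db.commit(t2), db.data)  # t2 fails validation and must restart
```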

VALIDATION BASED PROTOCOL


 1. Avoids cascading rollbacks: This validation based scheme avoids
cascading rollbacks since the final write operations to the database are
performed only after the transaction passes the validation phase. If the
transaction fails, no update is performed in the
database. So no dirty read can happen, and hence there is no possibility
of a cascading rollback.
 2. Avoids deadlock: Since a strict timestamp-based technique is used
to maintain the order of transactions, and no transaction waits for a
lock, deadlock isn't possible in this scheme.

ADVANTAGES:
 1. Starvation: Long transactions may starve, because a sequence of
conflicting short transactions can cause a long transaction to be
restarted repeatedly. To avoid starvation, conflicting
transactions must be temporarily blocked to let the
long transaction finish.

DISADVANTAGES:
 Granularity: It is the size of the data item that can be locked.
 Multiple granularity can be defined as hierarchically breaking up the
database into blocks which can be locked.
 The Multiple Granularity protocol enhances concurrency and reduces
lock overhead.
 It keeps track of what to lock and how to lock.
 It makes it easy to decide whether to lock or unlock a data
item. This type of hierarchy can be graphically represented as a tree.

MULTIPLE GRANULARITY
 A DBMS is a highly complex system with hundreds of transactions being
executed every second. The durability and robustness of a DBMS
depend on its complex architecture and its underlying hardware and
system software.
 If it fails or crashes amid transactions, the system is expected to
follow some algorithm or technique to recover lost data.

RECOVERY FROM TRANSACTION FAILURES
 A transaction has to abort when it fails to execute or when it reaches a
point from where it can’t go any further. This is called transaction failure
where only a few transactions or processes are hurt.
 Reasons for a transaction failure could be −
 Logical errors − Where a transaction cannot complete because it has some
code error or any internal error condition.
 System errors − Where the database system itself terminates an active
transaction because the DBMS is not able to execute it, or it has to stop
because of some system condition. For example, in case of deadlock or
resource unavailability, the system aborts an active transaction.

TRANSACTION FAILURE
 System Crash
 There are problems − external to the system − that may cause the system
to stop abruptly and crash. For example, an interruption
in the power supply may cause the underlying hardware or software to
fail.
 Examples may include operating system errors.
 Disk Failure
 In the early days of technology evolution, it was a common problem that
hard-disk drives or storage drives failed frequently.
 Disk failures include the formation of bad sectors, an unreachable disk,
a disk head crash, or any other failure that destroys all or part of disk
storage.

CONT….
 Log-based recovery is a technique used in database management
systems (DBMS) to recover a database to a consistent state in the event
of a failure or crash. It involves the use of transaction logs, which are
records of all the transactions performed on the database.
 In log-based recovery, the DBMS uses the transaction log to reconstruct
the database to a consistent state. The transaction log contains records
of all the changes made to the database, including updates, inserts, and
deletes. It also records information about each transaction, such as its
start and end times.
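How a log is used to rebuild a consistent state can be sketched as follows. This is a simplified illustration: the record layout (`("start", T)`, `("update", T, item, old_value, new_value)`, `("commit", T)`) and the `recover` function are assumptions made for the sketch, and the undo-then-redo order shown is one common textbook arrangement.

```python
def recover(database, log):
    """Bring `database` (a dict) to a consistent state using the log."""
    committed = {record[1] for record in log if record[0] == "commit"}

    # Undo pass: scan backwards and restore old values written by
    # transactions that never committed.
    for record in reversed(log):
        if record[0] == "update" and record[1] not in committed:
            _, _, item, old_value, _new_value = record
            database[item] = old_value

    # Redo pass: scan forwards and reapply new values written by
    # committed transactions.
    for record in log:
        if record[0] == "update" and record[1] in committed:
            _, _, item, _old_value, new_value = record
            database[item] = new_value
    return database


log = [
    ("start", "T1"),
    ("update", "T1", "X", 4000, 3500),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "Y", 100, 900),  # T2 never committed
]
# State found on disk after the crash: T1's write was lost, T2's write reached disk.
on_disk = {"X": 4000, "Y": 900}
print(recover(on_disk, log))
```

After recovery the committed update to X is present and the uncommitted update to Y has been undone.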

LOG BASED RECOVERY


 Durability: The log file provides a reliable and durable way to recover data in case of
a failure. It ensures that no committed transaction is lost due to a system crash.
 Faster Recovery: Log-based recovery is usually faster compared to other recovery
techniques, as it only needs to replay the committed transactions from the log file to
recover the database.
 Incremental Backup: Log-based recovery allows for incremental backups. Instead of
taking a full backup of the database every time, only the changes made since the last
backup are stored in the log file.
 Reduces the chances of Data Corruption: Log-based recovery reduces the chances of
data corruption by ensuring that all transactions are properly committed or aborted
before they are written to the database.

ADVANTAGES :
 Additional overhead: Maintaining the log file incurs an additional overhead
on the database system, which can reduce the performance of the system.
 Complexity: Log-based recovery is a complex process that requires careful
management and administration. If not managed properly, it can lead to
data inconsistencies or loss.
 Storage space: The log file can consume a significant amount of storage
space, especially in a database with a large number of transactions.
 Time-Consuming: The process of replaying the transactions from the log
file can be time-consuming, especially if there are a large number of
transactions to recover.

DISADVANTAGES:
 A checkpoint is a process that saves the current state of the database to
disk. This includes all transactions that have been committed, as well as any
changes that have been made to the database but not yet committed. The
DBMS also keeps a log of all transactions that have
occurred since the last checkpoint. This log is used to recover the database
in the event of a system failure or crash.
 Because the state at the checkpoint is already on disk, recovery only
needs to process the log records written since the last checkpoint, so the
database can be recovered quickly after a failure.

CHECKPOINTS
 There are two main types of checkpoints −
 Automatic Checkpoints
 Automatic checkpoints occur at regular intervals, such as every hour or
every day. The interval can be configured by the database
administrator. Automatic checkpoints are useful for large databases that
are constantly being updated, as they ensure that the database can be
recovered quickly in the event of a failure.
 For example, in SQL Server, with the default recovery interval setting,
automatic checkpoints typically occur about once a minute on active
databases; the interval can be configured.

TYPES OF CHECKPOINTS
 Manual Checkpoints
 Manual checkpoints are triggered by the database administrator, rather
than occurring at regular intervals. Manual checkpoints are useful for
smaller databases that are updated less frequently, as they allow the
administrator to choose when the checkpoint occurs.

CONT…..
 Shadow paging is one of the techniques that is used to recover from
failure. We all know that recovery means to get back the information,
which is lost. It helps to maintain database consistency in case of
failure.

SHADOW PAGING
 Now let us see the concept of shadow paging step by step −
 Step 1 − A page is a segment of memory. The page table is an index of pages. Each table entry
points to a page on the disk.
 Step 2 − Two page tables are used during the life of a transaction: the current page table
and the shadow page table. The shadow page table is a copy of the current page table.
 Step 3 − When a transaction starts, both tables look identical. The current table is
updated for each write operation.
 Step 4 − The shadow page table is never changed during the life of the transaction.
 Step 5 − When the current transaction is committed, each shadow page table entry becomes a
copy of the current page table entry, and the disk block with the old data is released.
 Step 6 − The shadow page table is stored in non-volatile memory. If a system crash
occurs, the shadow page table is copied to the current page table.
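The steps above can be sketched as a toy model. Everything here is hypothetical and simplified for illustration: `ShadowPagedDB` keeps "disk blocks" in a dictionary, handles one transaction at a time, and models non-volatile storage as ordinary Python objects.

```python
class ShadowPagedDB:
    """Toy shadow paging: writes go to fresh blocks reachable only via the current table."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))           # block number -> contents
        self.shadow_table = list(self.disk)          # page i -> block number (stable copy)
        self.current_table = list(self.shadow_table)
        self.next_block = len(self.disk)

    def read(self, page):
        return self.disk[self.current_table[page]]

    def write(self, page, value):
        # First write to a page: allocate a new block and repoint the current table only.
        if self.current_table[page] == self.shadow_table[page]:
            self.current_table[page] = self.next_block
            self.next_block += 1
        self.disk[self.current_table[page]] = value

    def commit(self):
        # Release blocks holding old data, then make the shadow table match the current one.
        for old, new in zip(self.shadow_table, self.current_table):
            if old != new:
                del self.disk[old]
        self.shadow_table = list(self.current_table)

    def crash_recover(self):
        # Discard blocks written by the unfinished transaction and restore the shadow table.
        for old, new in zip(self.shadow_table, self.current_table):
            if old != new:
                self.disk.pop(new, None)
        self.current_table = list(self.shadow_table)


db = ShadowPagedDB(["a", "b"])
db.write(0, "A")
db.crash_recover()            # crash before commit: page 0 reverts to "a"
db.write(1, "B")
db.commit()
db.crash_recover()            # crash after commit: page 1 keeps "B"
print(db.read(0), db.read(1))
```

The released or discarded blocks are the garbage that a real system has to collect, which is one of the costs of this technique.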

CONCEPT OF SHADOW PAGING


 The advantages of shadow paging are as follows −
 No need for log records.
 No undo/redo algorithm is needed.
 Recovery is faster.

ADVANTAGES
 The disadvantages of shadow paging are as follows −
 Data is fragmented or scattered.
 Garbage collection problem. Database pages containing old versions of
modified data need to be garbage collected after every transaction.
 Concurrent transactions are difficult to execute.

DISADVANTAGES
 In a database management system (DBMS), a deadlock occurs when two or
more transactions are waiting for each other to release resources, such as
locks on database objects, that they need to complete their operations. As
a result, none of the transactions can proceed, leading to a situation where
they are stuck or “deadlocked.”
 Deadlocks can happen in multi-user environments when two or more
transactions are running concurrently and try to access the same data in a
different order. When this happens, one transaction may hold a lock on a
resource that another transaction needs, while the second transaction may
hold a lock on a resource that the first transaction needs. Both transactions
are then blocked, waiting for the other to release the resource they need.

DEADLOCK HANDLING
 DBMSs often use various techniques to detect and resolve deadlocks
automatically. These techniques include timeout mechanisms, where a
transaction that has waited too long is aborted and forced to release its
locks, and deadlock detection algorithms, which periodically scan the
wait-for graph of transactions for cycles and then choose a transaction to
abort to resolve the deadlock.
 It is also possible to prevent deadlocks by careful design of
transactions, such as always acquiring locks in the same order or
releasing locks as soon as possible. Proper design of the database
schema and application can also help to minimize the likelihood of
deadlocks.

CONT…..
 In a database, a deadlock is an unwanted situation in which two or
more transactions are waiting indefinitely for one another to give up
locks. Deadlock is said to be one of the most feared complications in
a DBMS, as it can bring the whole system to a halt.
Example: let us understand the concept of deadlock with an example.
Suppose transaction T1 holds a lock on some rows in the Students
table and needs to update some rows in the Grades table.
Simultaneously, transaction T2 holds locks on those very rows (which
T1 needs to update) in the Grades table but needs to update the rows
in the Students table held by transaction T1.

CONT…..
 Now the main problem arises. Transaction T1 will wait for transaction
T2 to give up the lock, and similarly, transaction T2 will wait for
transaction T1 to give up the lock. As a consequence, all activity comes
to a halt and remains at a standstill forever unless the DBMS detects the
deadlock and aborts one of the transactions.
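Detecting such a situation amounts to finding a cycle among waiting transactions. The following is a minimal sketch under a simplifying assumption: each transaction waits for at most one other transaction, so the wait-for relation fits in a hypothetical dictionary `waits_for` mapping a waiting transaction to the one holding the lock it needs.

```python
def find_deadlock(waits_for):
    """Return the list of transactions forming a wait cycle, or None if there is none."""
    for start in waits_for:
        path, node = [], start
        # Follow the chain of waits until it ends or revisits a transaction.
        while node in waits_for and node not in path:
            path.append(node)
            node = waits_for[node]
        if node in path:
            return path[path.index(node):]
    return None


# T1 waits for T2 (Grades rows) and T2 waits for T1 (Students rows).
print(find_deadlock({"T1": "T2", "T2": "T1"}))
# T1 waits for T2, but T2 is not waiting for anyone: no deadlock.
print(find_deadlock({"T1": "T2"}))
```

Once a cycle is found, the DBMS would pick one transaction in it as the victim, abort it, and let the others proceed.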

CONT…..
DEADLOCK IN DBMS