DBMS Module 5 Notes
Module 5
Chapter 1: Concurrency Control in Databases
Example:
In a concurrent execution environment, if T1 conflicts with T2 over a data item A, the
concurrency control mechanism decides whether T1 or T2 should get A first and whether
the other transaction is rolled back or waits.
The concept of locking data items is one of the main techniques used for controlling the
concurrent execution of transactions.
A lock is a variable associated with a data item in the database. Generally there is a lock
for each data item in the database.
A lock describes the status of the data item with respect to possible operations that can be
applied to that item.
It is used for synchronizing the access by concurrent transactions to the database items.
A transaction locks an object before using it.
When an object is locked by another transaction, the requesting transaction must wait.
1. Binary Locks
A binary lock can have two states or values: locked and unlocked (or 1
and 0).
If the value of the lock on X is 1, item X cannot be accessed by a database
operation that requests the item; if it is 0, the item can be accessed when it is
requested, and the lock is then set to 1.
lock_item(X):
B: if LOCK(X) = 0 (* item is unlocked *)
then LOCK(X) ← 1 (* lock the item *)
else
begin
wait (until LOCK(X) = 0
and the lock manager wakes up the transaction);
go to B
end;
unlock_item(X):
LOCK(X) ← 0; (* unlock the item *)
if any transactions are waiting
then wakeup one of the waiting transactions;
The lock_item and unlock_item operations must be implemented as indivisible units; that
is, no interleaving should be allowed once a lock or unlock operation is started, until the
operation terminates or the transaction waits.
The wait command within the lock_item(X) operation is usually implemented by putting
the transaction in a waiting queue for item X until X is unlocked and the transaction can
be granted access to it.
Other transactions that also want to access X are placed in the same queue. Hence, the
wait command is considered to be outside the lock_item operation.
It is quite simple to implement a binary lock; all that is needed is a binary-valued variable,
LOCK, associated with each data item X in the database.
In its simplest form, each lock can be a record with three fields: <Data_item_name,
LOCK, Locking_transaction>, plus a queue for transactions that are waiting to access the
item.
If the simple binary locking scheme described here is used, every transaction must obey
the following rules:
1. A transaction T must issue the operation lock_item(X) before any
read_item(X) or write_item(X) operations are performed in T.
2. A transaction T must issue the operation
unlock_item(X) after all read_item(X) and write_item(X) operations are
completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already holds the lock
on item X.
4. A transaction T will not issue an unlock_item(X) operation unless it already holds
the lock on item X.
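As an illustration only (not part of the original notes), the lock record and the four rules above can be sketched in Python; the class name BinaryLock, the holder field, and the FIFO queue policy are assumptions of this sketch, not requirements of the protocol:

from collections import deque

class BinaryLock:
    # Sketch of one lock record <Data_item_name, LOCK, Locking_transaction>
    # plus a queue of transactions waiting to access the item.
    def __init__(self, item_name):
        self.item_name = item_name   # Data_item_name
        self.locked = False          # LOCK: False = 0 (unlocked), True = 1 (locked)
        self.holder = None           # Locking_transaction
        self.wait_queue = deque()    # transactions waiting for this item

    def lock_item(self, txn):
        # Rule 3: T must not request a lock it already holds.
        assert self.holder != txn, "transaction already holds the lock on X"
        if not self.locked:          # LOCK(X) = 0, so grant the lock
            self.locked, self.holder = True, txn
            return True
        self.wait_queue.append(txn)  # otherwise the transaction waits
        return False

    def unlock_item(self, txn):
        # Rule 4: only the current holder may unlock the item.
        assert self.holder == txn, "transaction does not hold the lock on X"
        self.locked, self.holder = False, None
        if self.wait_queue:          # wake up one waiting transaction
            self.locked, self.holder = True, self.wait_queue.popleft()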
2. Shared/Exclusive (Read/Write) Locks
A read-locked item is also called share-locked because other transactions are allowed
to read the item, whereas a write-locked item is called exclusive-locked because a
single transaction exclusively holds the lock on the item.
A method to implement read/write locks is to
- keep track of the number of transactions that hold a shared (read) lock
on an item in the lock table.
- Each record in the lock table will have four fields:
<Data_item_name, LOCK, No_of_reads, Locking_transaction(s)>.
If LOCK(X) = write-locked, the value of Locking_transaction(s) is a single transaction that
holds the exclusive (write) lock on X.
If LOCK(X) = read-locked, the value of Locking_transaction(s) is a list of one or more
transactions that hold the shared (read) lock on X.
When we use the shared/exclusive locking scheme, the system must enforce the following
rules:
1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any
read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any write_item(X)
operation is performed in T.
3. A transaction T must issue the operation unlock(X) after all read_item(X) and
write_item(X) operations are completed in T.
4. A transaction T will not issue a read_lock(X) operation if it already holds a read (shared)
lock or a write (exclusive) lock on item X.
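A minimal Python sketch of such a lock record may help; it is not from the notes, and the class name ReadWriteLock and the string-valued LOCK field are assumptions chosen for readability:

class ReadWriteLock:
    # Sketch of a shared/exclusive lock record:
    # <Data_item_name, LOCK, No_of_reads, Locking_transaction(s)>.
    def __init__(self, item_name):
        self.item_name = item_name
        self.lock = "unlocked"       # "unlocked" | "read-locked" | "write-locked"
        self.readers = set()         # Locking_transaction(s); len(readers) = No_of_reads
        self.writer = None           # the single holder when write-locked

    def read_lock(self, txn):
        # A shared lock is grantable unless the item is exclusively locked.
        if self.lock == "write-locked":
            return False             # txn must wait
        self.lock = "read-locked"
        self.readers.add(txn)
        return True

    def write_lock(self, txn):
        # An exclusive lock is grantable only if no transaction holds any lock.
        if self.lock != "unlocked":
            return False             # txn must wait
        self.lock, self.writer = "write-locked", txn
        return True

    def unlock(self, txn):
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None
        # Recompute LOCK from the remaining holders.
        if self.readers:
            self.lock = "read-locked"
        elif self.writer is None:
            self.lock = "unlocked"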
Conversion of Locks
A transaction that already holds a lock on item X is allowed, under certain conditions, to
convert the lock from one locked state to another.
For example, it is possible for a transaction T to issue a read_lock(X) and then later to
upgrade the lock by issuing a write_lock(X) operation.
- If T is the only transaction holding a read lock on X at the time it issues
the write_lock(X) operation, the lock can be upgraded; otherwise, the
transaction must wait.
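The upgrade check can be sketched as a small helper working on the ReadWriteLock record above (illustrative only; upgrade_lock is a hypothetical name):

def upgrade_lock(rw_lock, txn):
    # Upgrade txn's read lock to a write lock only if txn is the sole reader.
    if rw_lock.lock == "read-locked" and rw_lock.readers == {txn}:
        rw_lock.readers.clear()
        rw_lock.lock, rw_lock.writer = "write-locked", txn
        return True                  # upgraded
    return False                     # other readers exist: txn must wait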
Two-Phase Locking
A transaction is said to follow the two-phase locking protocol if all locking operations
(read_lock, write_lock) precede the first unlock operation in the transaction.
Such a transaction can be divided into two phases:
Expanding or growing (first) phase, during which new locks on items can be
acquired but none can be released
Shrinking (second) phase, during which existing locks can be released but no
new locks can be acquired
If lock conversion is allowed, then upgrading of locks (from read-locked to write-locked)
must be done during the expanding phase, and downgrading of locks (from write-locked
to read-locked) must be done in the shrinking phase.
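The two-phase discipline can be enforced with a single flag per transaction. The following is a minimal sketch under that assumption (the class and attribute names are illustrative):

class TwoPhaseTxn:
    # Tracks the expanding (growing) and shrinking phases of 2PL.
    def __init__(self):
        self.locks = set()
        self.shrinking = False       # becomes True at the first unlock

    def lock(self, item):
        # New locks (and upgrades) are legal only in the growing phase.
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first unlock")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True        # the first unlock starts the shrinking phase
        self.locks.discard(item)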
Transactions T1 and T2 in Figure 22.3(a) do not follow the two-phase locking protocol
because the write_lock(X) operation follows the unlock(Y) operation in T1, and similarly
the write_lock(Y) operation follows the unlock(X) operation in T2.
Basic 2PL
Technique described previously
Conservative (static) 2PL
Requires a transaction to lock all the items it accesses before the transaction
begins execution by predeclaring read-set and write-set
It is a deadlock-free protocol.
Strict 2PL
guarantees strict schedules
Transaction does not release exclusive locks until after it commits or aborts
no other transaction can read or write an item that is written by T unless T has
committed, leading to a strict schedule for recoverability
Strict 2PL is not deadlock-free
Rigorous 2PL
guarantees strict schedules
Transaction does not release any locks until after it commits or aborts
easier to implement than strict 2PL
Figure 22.5 Illustrating the deadlock problem. (a) A partial schedule of T1′ and T2′ that is in a
state of deadlock. (b) A wait-for graph for the partial schedule in (a).
Deadlock Prevention Protocols
Some deadlock prevention protocols use transaction timestamps TS(T). Suppose that
transaction Ti tries to lock an item X, but is not able to because X is locked by some
other transaction Tj with a conflicting lock. Two such schemes are:
Wait-die: If TS(Ti) < TS(Tj) (Ti is older than Tj), then Ti is allowed to wait; otherwise
(Ti is younger than Tj) abort Ti (Ti dies) and restart it later with the same timestamp.
Wound-wait: If TS(Ti) < TS(Tj) (Ti is older than Tj), then abort Tj (Ti wounds Tj) and
restart it later with the same timestamp; otherwise (Ti is younger than Tj) Ti is
allowed to wait.
Both schemes end up aborting the younger of the two transactions (the transaction that
started later) that may be involved in a deadlock, assuming that this will waste less
processing.
It can be shown that these two techniques are deadlock-free, since in wait-die,
transactions only wait for younger transactions so no cycle is created.
Similarly, in wound-wait, transactions only wait for older transactions so no cycle is
created.
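Both rules reduce to a single timestamp comparison. A minimal sketch, assuming smaller timestamps mean older transactions (the function names are illustrative):

def wait_die(ts_i, ts_j):
    # Ti requests an item held by Tj.
    if ts_i < ts_j:                  # Ti is older: it is allowed to wait
        return "Ti waits"
    return "abort Ti"                # Ti is younger: Ti dies (restart with same TS)

def wound_wait(ts_i, ts_j):
    # Ti requests an item held by Tj.
    if ts_i < ts_j:                  # Ti is older: it wounds (aborts) Tj
        return "abort Tj"
    return "Ti waits"                # Ti is younger: it is allowed to wait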
Another group of protocols that prevent deadlock do not require timestamps. These
include the
no waiting (NW) and
cautious waiting (CW) algorithms.
No waiting (NW) algorithm:
If a transaction is unable to obtain a lock, it is immediately aborted and then restarted
after a certain time delay, without checking whether a deadlock will actually occur or
not.
Because no transaction ever waits, no deadlock will occur.
However, this scheme can cause transactions to abort and restart needlessly.
Cautious waiting (CW) algorithm:
Tries to reduce the number of needless aborts/restarts.
Suppose that transaction Ti tries to lock an item X but is not able to do so because
X is locked by some other transaction Tj with a conflicting lock.
The cautious waiting rule is as follows:
If Tj is not blocked (not waiting for some other locked item), then Ti is
blocked and allowed to wait; otherwise abort Ti.
It can be shown that cautious waiting is deadlock-free, because no transaction will
ever wait for another blocked transaction.
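The cautious waiting rule needs only the blocked/not-blocked status of Tj. A sketch under that assumption (the Txn record and its blocked flag are illustrative):

from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    blocked: bool = False            # True while waiting for a locked item

def cautious_waiting(ti, tj):
    # Ti requests an item locked by Tj with a conflicting lock.
    if not tj.blocked:               # Tj is not itself waiting
        ti.blocked = True            # block Ti and allow it to wait
        return "Ti waits"
    return "abort Ti"                # Tj is blocked: abort Ti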
Deadlock Detection
An alternative approach is deadlock detection, where the system checks whether a state
of deadlock actually exists. This approach is attractive when there is little interference
among transactions, that is, when deadlocks are expected to be rare. This can happen if
the transactions are short and each transaction locks only a few items, or if the
transaction load is light.
On the other hand, if transactions are long and each transaction uses many items, or if
the transaction load is quite heavy, it may be advantageous to use a deadlock prevention
scheme.
A simple way to detect a state of deadlock is for the system to construct and maintain a
wait-for graph.
One node is created in the wait-for graph for each transaction that is currently executing.
Whenever a transaction Ti is waiting to lock an item X that is currently locked by a
transaction Tj, a directed edge (Ti → Tj) is created in the wait-for graph.
When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is
dropped from the wait-for graph. We have a state of deadlock if and only if the wait-for
graph has a cycle.
One problem with this approach is the matter of determining when the system should
check for a deadlock.
One possibility is to check for a cycle every time an edge is added to the wait-for graph,
but this may cause excessive overhead.
Criteria such as the number of currently executing transactions or the period of time
several transactions have been waiting to lock items may be used instead to check for a
cycle. Figure 22.5(b) shows the wait-for graph for the (partial) schedule shown in Figure
22.5(a).
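Cycle detection in the wait-for graph is a standard depth-first search. A sketch, assuming the graph is given as a dictionary mapping each transaction to the set of transactions it waits for:

def has_deadlock(wait_for):
    # Returns True iff the wait-for graph contains a cycle.
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited / on current path / finished
    color = {}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:    # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1, as in Figure 22.5(b): a deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # prints True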
If the system is in a state of deadlock, some of the transactions causing the deadlock must
be aborted.
Choosing which transactions to abort is known as victim selection.
The algorithm for victim selection should generally avoid selecting transactions that have
been running for a long time and that have performed many updates, and it should try
instead to select transactions that have not made many changes (younger transactions).
Timeouts
Another simple scheme to deal with deadlock is the use of timeouts.
This method is practical because of its low overhead and simplicity.
In this method, if a transaction waits for a period longer than a system-defined
timeout period, the system assumes that the transaction may be deadlocked and
aborts it regardless of whether a deadlock actually exists or not.
Starvation.
Another problem that may occur when we use locking is starvation, which occurs
when a transaction cannot proceed for an indefinite period of time while other
transactions in the system continue normally.
This may occur if the waiting scheme for locked items is unfair, giving priority to some
transactions over others
One solution for starvation is to have a fair waiting scheme, such as using a
first-come-first-served queue; transactions are enabled to lock an item in the order in
which they originally requested the lock.
Another scheme allows some transactions to have priority over others but increases
the priority of a transaction the longer it waits, until it eventually gets the highest
priority and proceeds.
Starvation can also occur because of victim selection if the algorithm selects the
same transaction as victim repeatedly, thus causing it to abort and never finish
execution.
The algorithm can use higher priorities for transactions that have been aborted
multiple times to avoid this problem.
5.13.1 Timestamps
A timestamp is a unique identifier created by the DBMS to identify a transaction; the
timestamp of transaction T is denoted TS(T). Timestamp values are assigned in the
order in which transactions are submitted to the system. One way to implement
timestamps is to use a counter that is incremented each time its value is assigned to a
transaction in this scheme. A computer counter has a finite maximum value, so the
system must periodically reset the counter to zero when no transactions are
executing for some short period of time.
Another way to implement timestamps is to use the current date/time value of
the system clock and ensure that no two timestamp values are generated
during the same tick of the clock.
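A counter-based generator is only a few lines. The sketch below is illustrative (the class and method names are assumptions):

import itertools

class TimestampGenerator:
    # Counter-based timestamps: TS order equals submission order.
    def __init__(self):
        self._counter = itertools.count(1)

    def new_ts(self):
        # A real system would periodically reset the counter, or use a
        # system-clock tick instead, as described above.
        return next(self._counter)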
5.13.2 The Timestamp Ordering Algorithm
The idea for this scheme is to order the transactions based on their
timestamps.
A schedule in which the transactions participate is then serializable, and the
only equivalent serial schedule permitted has the transactions in order of their
timestamp values. This is called timestamp ordering (TO).
The algorithm must ensure that, for each item accessed by conflicting
Operations in the schedule, the order in which the item is accessed does not
violate the timestamp order.
To do this, the algorithm associates with each database item X two timestamp
(TS) values:
1. read_TS(X). The read timestamp of item X is the largest timestamp
among all the timestamps of transactions that have successfully read
item X; that is, read_TS(X) = TS(T), where T is the youngest
transaction that has read X successfully.
2. write_TS(X). The write timestamp of item X is the largest of all the
timestamps of transactions that have successfully written item X;
that is, write_TS(X) = TS(T), where T is the youngest transaction that
has written X successfully.
Basic Timestamp Ordering (TO).
Whenever some transaction T tries to issue a read_item(X) or a write_item(X) operation,
the basic TO algorithm compares the timestamp of T with read_TS(X) and write_TS(X) to
ensure that the timestamp order of transaction execution is not violated.
If this order is violated, then transaction T is aborted and resubmitted to the system as a
new transaction with a new timestamp.
If T is aborted and rolled back, any transaction T1 that may have used a value written by T
must also be rolled back.
Similarly, any transaction T2 that may have used a value written by T1 must also be rolled
back, and so on. This effect is known as cascading rollback and is one of the problems
associated with basic TO, since the schedules produced are not guaranteed to be
recoverable.
An additional protocol must be enforced to ensure that the schedules are recoverable,
cascadeless, or strict.
The basic TO algorithm:
The concurrency control algorithm must check whether conflicting operations violate
the timestamp ordering in the following two cases:
1. Whenever a transaction T issues a write_item(X) operation, the following is checked:
a. If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T and
reject the operation. This should be done because some younger transaction with
a timestamp greater than TS(T), and hence after T in the timestamp ordering,
has already read or written the value of item X before T had a chance to write X,
thus violating the timestamp ordering.
b. If the condition in part (a) does not occur, then execute the write_item(X) operation
of T and set write_TS(X) to TS(T).
2. Whenever a transaction T issues a read_item(X) operation, the following is checked:
a. If write_TS(X) > TS(T), then abort and roll back T and reject the operation. This
should be done because some younger transaction with a timestamp greater than
TS(T), and hence after T in the timestamp ordering, has already written the value
of item X before T had a chance to read X.
b. If write_TS(X) ≤ TS(T), then execute the read_item(X) operation of T and set
read_TS(X) to the larger of TS(T) and the current read_TS(X).
Whenever the basic TO algorithm detects two conflicting operations that occur in
the incorrect order, it rejects the later of the two operations by aborting the
transaction that issued it. The schedules produced by basic TO are hence
guaranteed to be conflict serializable.
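The two checks translate directly into code. A minimal sketch, where Item is an assumed record holding read_TS(X) and write_TS(X), and ts_T stands for TS(T):

from dataclasses import dataclass

@dataclass
class Item:
    read_TS: int = 0                 # largest TS of a successful reader
    write_TS: int = 0                # largest TS of a successful writer

def to_write_item(X, ts_T):
    # Case 1 of basic TO: check a write_item(X) by transaction T.
    if X.read_TS > ts_T or X.write_TS > ts_T:
        return "abort and roll back T"   # a younger transaction accessed X first
    X.write_TS = ts_T                    # execute the write
    return "write executed"

def to_read_item(X, ts_T):
    # Case 2 of basic TO: check a read_item(X) by transaction T.
    if X.write_TS > ts_T:
        return "abort and roll back T"   # a younger transaction already wrote X
    X.read_TS = max(X.read_TS, ts_T)     # execute the read
    return "read executed"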
Strict Timestamp Ordering (TO)
A variation of basic TO called strict TO ensures that the schedules are both strict
(for easy recoverability) and (conflict) serializable.
Thomas's Write Rule
A modification of the basic TO algorithm, known as Thomas's write rule, does not
enforce conflict serializability, but it rejects fewer write operations by modifying the
checks for the write_item(X) operation as follows:
1. If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If write_TS(X) > TS(T), then do not execute the write operation but continue
processing. This is because some transaction with timestamp greater than TS(T)
and hence after T in the timestamp ordering has already written the value of X.
Thus, we must ignore the write_item(X) operation of T because it is already outdated
and obsolete. Notice that any conflict arising from this situation would be detected by
case (1).
3. If neither the condition in part (1) nor the condition in part (2) occurs, then execute
the write_item(X) operation of T and set write_TS(X) to TS(T).
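The modified write check differs from basic TO only in case (2). A sketch, reusing the Item record from the basic TO sketch above:

def thomas_write_item(X, ts_T):
    if X.read_TS > ts_T:
        return "abort and roll back T"   # case (1)
    if X.write_TS > ts_T:
        return "ignore outdated write"   # case (2): skip the write, keep going
    X.write_TS = ts_T                    # case (3): execute the write
    return "write executed"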
Multiversion Two-Phase Locking Using Certify Locks
In this multiple-mode locking scheme, there are three locking modes for an item:
read, write, and certify. Hence, the state of LOCK(X) for an item X can be one of
read-locked, write-locked, certify-locked, or unlocked.
We can describe the relationship between read and write locks in the standard
scheme by means of the lock compatibility table shown in Figure 22.6(a).
An entry of Yes means that if a transaction T holds the type of lock specified in the
column header on item X and if transaction T′ requests the type of lock specified in
the row header on the same item X, then T′ can obtain the lock because the locking
modes are compatible.
Figure 22.6: Lock compatibility tables. (a) A compatibility table for read/write locking scheme.
(b) A compatibility table for read/write/certify locking scheme.
On the other hand, an entry of No in the table indicates that the locks are not compatible,
so T′ must wait until T releases the lock.
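Such a compatibility table is naturally represented as a lookup structure. The sketch below encodes the read/write/certify table in the spirit of Figure 22.6(b) (the Python names are assumptions):

# Key: (lock held by T, lock requested by T'); value: compatible or not.
COMPATIBLE = {
    ("read", "read"): True,     ("read", "write"): True,     ("read", "certify"): False,
    ("write", "read"): True,    ("write", "write"): False,   ("write", "certify"): False,
    ("certify", "read"): False, ("certify", "write"): False, ("certify", "certify"): False,
}

def can_grant(held, requested):
    # True iff T' may obtain `requested` while T holds `held` on the same item.
    return COMPATIBLE[(held, requested)]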
The idea behind multiversion 2PL is to allow other transactions T′ to read an item X
while a single transaction T holds a write lock on X.
This is accomplished by allowing two versions for each item X; one version must always
have been written by some committed transaction.
The second version X′ is created when a transaction T acquires a write lock on the item.
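A two-version item can be sketched as follows; the certify() step, which installs the new version when the writer commits, follows the certify-lock idea above (all names are illustrative):

class VersionedItem:
    # Two versions of X for multiversion 2PL (sketch).
    def __init__(self, value):
        self.committed = value       # always written by some committed transaction
        self.local = None            # X', created by the current write-lock holder

    def write(self, new_value):
        self.local = new_value       # other transactions can still read `committed`

    def certify(self):
        # On commit, the writer upgrades its write lock to a certify lock
        # and the new version becomes the committed one.
        self.committed, self.local = self.local, None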
Validation (Optimistic) Concurrency Control Techniques
During transaction execution, all updates are applied to local copies of the data items
that are kept for the transaction.
At the end of transaction execution, a validation phase checks whether any of the
updates violate serializability.
There are three phases for this concurrency control protocol:
1. Read phase. A transaction can read values of committed data items from the
database. However, updates are applied only to local copies (versions) of the data
items kept in the transaction workspace.
2. Validation phase. Checking is performed to ensure that serializability will not be
violated if the transaction updates are applied to the database.
3. Write phase. If the validation phase is successful, the transaction updates are applied
to the database; otherwise, the updates are discarded and the transaction is restarted.
The idea behind optimistic concurrency control is to do all the checks at once; hence,
transaction execution proceeds with a minimum of overhead until the validation phase is
reached
The techniques are called optimistic because they assume that little interference will occur
and hence that there is no need to do checking during transaction execution.
The validation phase for Ti checks that, for each transaction Tj that is either
committed or is in its validation phase, one of the following conditions holds:
1. Transaction Tj completes its write phase before Ti starts its read phase.
2. Ti starts its write phase after Tj completes its write phase, and the read_set
of Ti has no items in common with the write_set of Tj.
3. Both the read_set and write_set of Ti have no items in common with the
write_set of Tj, and Tj completes its read phase before Ti completes its read
phase.
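The three conditions can be checked mechanically. A sketch, assuming each transaction carries its phase boundaries and its read/write sets (the TxnInfo record is an assumption of this sketch):

from dataclasses import dataclass

@dataclass
class TxnInfo:
    read_start: int
    read_end: int
    write_start: int
    write_end: int
    read_set: frozenset
    write_set: frozenset

def validate(ti, tj):
    # Ti passes validation against Tj if any one of the conditions holds.
    if tj.write_end < ti.read_start:                       # condition 1
        return True
    if (tj.write_end < ti.write_start
            and not (ti.read_set & tj.write_set)):         # condition 2
        return True
    if (not (ti.read_set & tj.write_set)
            and not (ti.write_set & tj.write_set)
            and tj.read_end < ti.read_end):                # condition 3
        return True
    return False                                           # validation fails: restart Ti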
Granularity of Data Items and Multiple Granularity Locking
The size of data items is often called the data item granularity.
Fine granularity refers to small item sizes, whereas coarse granularity refers to large
item sizes.
The larger the data item size is, the lower the degree of concurrency permitted.
For example, if the data item size is a disk block, a transaction T that needs to lock a
record B must lock the whole disk block X that contains B because a lock is associated
with the whole data item (block). Now, if another transaction S wants to lock a different
record C that happens to reside in the same block X in a conflicting lock mode, it is forced
to wait. If the data item size was a single record, transaction S would be able to proceed,
because it would be locking a different data item (record).
The smaller the data item size is, the larger the number of items in the database. Because
every item is associated with a lock, the system will have a larger number of active locks
to be handled by the lock manager. More lock and unlock operations will be performed,
causing higher overhead.
The best item size depends on the types of transactions involved.
If a typical transaction accesses a small number of records, it is advantageous to have the
data item granularity be one record
On the other hand, if a transaction typically accesses many records in the same file, it
may be better to have block or file granularity so that the transaction will consider all
those records as one (or a few) data items.
Figure 22.7 A granularity hierarchy for illustrating multiple granularity level locking
To make multiple granularity level locking practical, additional types of locks, called
intention locks, are needed
The idea behind intention locks is for a transaction to indicate, along the path from the root
to the desired node, what type of lock (shared or exclusive) it will require from one of the
descendants.
There are three types of intention locks:
1. Intention-shared (IS) indicates that one or more shared locks will be requested on some
descendant node(s).
2. Intention-exclusive (IX) indicates that one or more exclusive locks will be requested on
some descendant node(s).
3. Shared-intention-exclusive (SIX) indicates that the current node is locked in shared
mode but that one or more exclusive locks will be requested on some descendant
node(s).
The compatibility table of the three intention locks, and the shared and exclusive locks, is
shown in Figure 22.8.
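The table can again be encoded as a lookup structure. A sketch in the spirit of Figure 22.8 (the Python names are assumptions):

# For each held mode, the set of request modes that are compatible with it.
MGL_COMPAT = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),                    # exclusive is compatible with nothing
}

def mgl_compatible(held, requested):
    return requested in MGL_COMPAT[held]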
The multiple granularity locking (MGL) protocol consists of the following rules:
1. The lock compatibility (based on Figure 22.8) must be adhered to.
2. The root of the tree must be locked first, in any mode.
3. A node N can be locked by a transaction T in S or IS mode only if the parent
of node N is already locked by transaction T in either IS or IX mode.
4. A node N can be locked by a transaction T in X, IX, or SIX mode only if the
parent of node N is already locked by transaction T in either IX or SIX mode.
5. A transaction T can lock a node only if it has not unlocked any node (to
enforce the 2PL protocol).
6. A transaction T can unlock a node, N, only if none of the children of node N
are currently locked by T.
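Rules 2-4 amount to a check on the parent's lock modes. A minimal sketch, assuming lock_table maps (transaction, node) to the set of modes held (all names are illustrative):

def can_lock(lock_table, txn, node, parent, mode):
    # Rule 2: the root (parent is None) may be locked first, in any mode.
    if parent is None:
        return True
    parent_modes = lock_table.get((txn, parent), set())
    if mode in ("S", "IS"):          # rule 3: parent must be held in IS or IX
        return bool(parent_modes & {"IS", "IX"})
    if mode in ("X", "IX", "SIX"):   # rule 4: parent must be held in IX or SIX
        return bool(parent_modes & {"IX", "SIX"})
    return False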
The multiple granularity level protocol is especially suited when processing a mix of
transactions that include
(1) short transactions that access only a few items (records or fields) and
(2) long transactions that access entire files.