CH-4 Concurrency Control
Outline
Definition and Purpose
Problem of Concurrency
Lock-based Protocols
Example:
In a concurrent execution environment, if T1 conflicts with
T2 over a data item A, the concurrency control mechanism
decides which of T1 or T2 gets A and whether the other
transaction is rolled back or waits.
4.2 Classic Problems of Concurrency
Many users can perform different operations at the same time.
Lost update example: both transactions start at nearly the same time and
read an account balance of $5000.
T1 reduces the balance by $1000 and writes the result ($4000) at time t4;
T2 increases the balance it read by $2000 and writes its result ($7000)
after T1 has committed.
T2's write overwrites T1's update, so the account ends at $7000 instead of
the correct $6000: it wrongly gains an extra $1000.
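The interleaving above can be replayed step by step. This is only an illustrative sketch: the function and variable names are hypothetical, and the two "transactions" are simulated in a single thread rather than run concurrently.

```python
# Replay of the schedule described above (names are hypothetical).

def lost_update_schedule(initial=5000):
    balance = initial
    t1_local = balance          # T1 reads 5000
    t2_local = balance          # T2 reads 5000, before T1 writes
    balance = t1_local - 1000   # T1 writes 4000 and commits
    balance = t2_local + 2000   # T2 overwrites with 7000: T1's update is lost
    return balance

def serial_schedule(initial=5000):
    balance = initial
    balance = balance - 1000    # T1 runs to completion
    balance = balance + 2000    # then T2 runs
    return balance

print(lost_update_schedule())   # 7000
print(serial_schedule())        # 6000
```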
Uncommitted dependency problem
Write – read conflict of transaction
Occurs when one transaction is allowed to see the intermediate
results of another transaction before it is committed.
Sequence  T1                      T2                      Balance
01        Begin transaction                               5000
02        Read(CA2090)                                    5000
03        CA2090 := CA2090-1000                           5000
04        Write(CA2090)           Begin transaction       4000
05                                Read(CA2090)            4000
06        Roll back               CA2090 := CA2090+2000   4000
07                                Write(CA2090)           6000
08                                Commit                  6000
Incorrect Analysis problem
Read-write conflict between transactions.
So far we have seen problems that arise when concurrent
transactions are updating the database.
But problems can arise even when a transaction is not
updating the DB, if it is allowed to read the DB while the
database is in an inconsistent state.
This problem is often referred to as a dirty read or unrepeatable
read.
It occurs when a transaction reads several values from the DB
while other transactions are updating those values.
Initially: Acc 1 = 40; Acc2 = 50; Acc3 = 30;
Locking types
Binary Locks
Multiple-mode locks
Two phase locking
1) Binary Locks
The lock can have two states (locked, unlocked) or (1,0).
It is simple but restrictive.
A distinct lock is associated with each database item X.
The current value of the lock of item X is LOCK(X).
If LOCK(X) = 0, item X is unlocked and can be accessed when requested.
If LOCK(X) = 1, item X is locked and cannot be accessed by a requesting transaction.
Example:
Lock_item(X):
B:  if LOCK(X) = 0                  (* item is unlocked *)
        then LOCK(X) ← 1            (* lock the item *)
    else
        begin
            wait (until LOCK(X) = 0 and the lock manager wakes up the transaction);
            go to B
        end;
Unlock_item(X):
    LOCK(X) ← 0;                    (* unlock the item *)
    if any transactions are waiting
        then wake up one of the waiting transactions;
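The lock_item/unlock_item logic can be modelled in a few lines of Python. This is a toy, single-process sketch and not a real DBMS lock manager: the class name `BinaryLockManager`, the `holder` and `waiting` tables, and the convention of returning True/False instead of actually blocking are all assumptions made for illustration.

```python
from collections import deque

class BinaryLockManager:
    """Toy model of binary locks; a refused request is queued, not blocked."""

    def __init__(self):
        self.lock = {}      # item -> 0 (unlocked) or 1 (locked)
        self.holder = {}    # item -> transaction holding the lock
        self.waiting = {}   # item -> queue of waiting transactions

    def lock_item(self, txn, item):
        """Return True if the lock is granted, False if txn must wait."""
        if self.lock.get(item, 0) == 0:
            self.lock[item] = 1
            self.holder[item] = txn
            return True
        self.waiting.setdefault(item, deque()).append(txn)
        return False

    def unlock_item(self, txn, item):
        """Release the lock and return the transaction woken up, if any."""
        if self.holder.get(item) != txn:
            raise ValueError(f"{txn} does not hold the lock on {item}")
        self.lock[item] = 0
        del self.holder[item]
        queue = self.waiting.get(item)
        if queue:
            woken = queue.popleft()
            self.lock_item(woken, item)   # the woken transaction retries ("go to B")
            return woken
        return None

manager = BinaryLockManager()
print(manager.lock_item("T1", "X"))    # True: X was unlocked
print(manager.lock_item("T2", "X"))    # False: T2 has to wait
print(manager.unlock_item("T1", "X"))  # T2: it is woken up and now holds X
```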
Rules followed in binary locking by every transaction
1. A transaction T must issue lock_item(X) before any read_item(X)
or write_item(X) is performed in T.
2. A transaction T must issue unlock_item(X) after all read_item(X)
and write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) if it already holds
the lock on item X.
4. A transaction T will not issue an unlock_item(X) unless it already
holds the lock on item X.
The rules can be enforced by the lock manager module of the DBMS.
Thus, at most one transaction can hold the lock on an item, so no two
transactions can access the same item concurrently.
2) Shared/Exclusive Locking
Also called Read/Write locking; it is a multiple-mode lock.
The lock has three possible values, and data items can be locked in two
modes:
read locked (shared lock): the item is locked for reading
and can be shared for reading by other transactions.
write locked (exclusive lock): the item is locked for writing
and cannot be accessed by any other transaction.
unlocked: the item is not locked and can be accessed by any
transaction.
The three operations (read_lock, write_lock and unlock) are indivisible.
Lock requests are made to the concurrency control manager.
A transaction can proceed only after its request is granted.
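Lock-compatibility matrix (rows: mode held, columns: mode requested)
                Shared (S)   Exclusive (X)
Shared (S)      yes          no
Exclusive (X)   no           no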
Example Algorithm: Read_lock(X)
B:  if LOCK(X) = "unlocked"
        then begin
            LOCK(X) ← "read_locked";
            no_of_reads(X) ← 1
        end
    else if LOCK(X) = "read_locked"
        then no_of_reads(X) ← no_of_reads(X) + 1
    else
        begin
            wait (until LOCK(X) = "unlocked" and
                  the lock manager wakes up the transaction);
            go to B
        end;
Example Algorithm: Write_lock(X)
B:  if LOCK(X) = "unlocked"
        then LOCK(X) ← "write_locked"
    else
        begin
            wait (until LOCK(X) = "unlocked" and
                  the lock manager wakes up the transaction);
            go to B
        end;
Example Algorithm: Unlock(X)
    if LOCK(X) = "write_locked"
        then begin
            LOCK(X) ← "unlocked";
            wake up one of the waiting transactions, if any
        end
    else if LOCK(X) = "read_locked"
        then begin
            no_of_reads(X) ← no_of_reads(X) - 1;
            if no_of_reads(X) = 0
                then begin
                    LOCK(X) ← "unlocked";
                    wake up one of the waiting transactions, if any
                end
        end;
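The read_lock, write_lock and unlock operations can be sketched together as a small lock table. This is a simplified model under stated assumptions: the class name `RWLockTable` is hypothetical, waiting is represented by returning False rather than blocking, and the set of reading transactions stands in for the no_of_reads counter.

```python
class RWLockTable:
    """Toy shared/exclusive lock table; False means the request must wait."""

    def __init__(self):
        self.state = {}    # item -> "read_locked" or "write_locked" (absent = unlocked)
        self.readers = {}  # item -> set of transactions holding a read lock
        self.writer = {}   # item -> transaction holding the write lock

    def read_lock(self, txn, item):
        mode = self.state.get(item, "unlocked")
        if mode == "unlocked":
            self.state[item] = "read_locked"
            self.readers[item] = {txn}          # no_of_reads(X) becomes 1
            return True
        if mode == "read_locked":
            self.readers[item].add(txn)         # no_of_reads(X) + 1
            return True
        return False                            # write_locked: wait

    def write_lock(self, txn, item):
        if self.state.get(item, "unlocked") == "unlocked":
            self.state[item] = "write_locked"
            self.writer[item] = txn
            return True
        return False                            # any existing lock: wait

    def unlock(self, txn, item):
        """Release txn's lock; return True if the item is now unlocked."""
        mode = self.state.get(item, "unlocked")
        if mode == "write_locked" and self.writer.get(item) == txn:
            del self.writer[item]
            del self.state[item]
            return True
        if mode == "read_locked" and txn in self.readers[item]:
            self.readers[item].discard(txn)     # no_of_reads(X) - 1
            if not self.readers[item]:
                del self.state[item]
                return True
            return False
        raise ValueError(f"{txn} holds no lock on {item}")

table = RWLockTable()
print(table.read_lock("T1", "X"))   # True
print(table.read_lock("T2", "X"))   # True: read locks are shared
print(table.write_lock("T3", "X"))  # False: X is read locked
```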
Conversion of Locks
A transaction is allowed, under certain conditions, to convert a lock
from one state to another.
Upgrading:
Convert the lock from shared to exclusive by issuing
write_lock(X) after read_lock(X).
The transaction must be the only one holding a read lock on X;
otherwise it must wait.
Downgrading:
Convert the lock from exclusive to shared by issuing
read_lock(X) after write_lock(X).
Notes:
Upgrading and downgrading relax rules 4 and 5 of the
Read/Write locking scheme.
Algorithms for conversion of locks
Lock upgrade: existing read lock to write lock
if Ti has a read-lock(X) and no Tj has a read-lock(X) (i ≠ j)
    then convert read-lock(X) to write-lock(X)
else
    force Ti to wait until Tj unlocks X
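The upgrade condition amounts to a single check on the set of transactions that currently hold a read lock on X. The helper below is a hypothetical sketch; `try_upgrade` and its `readers` argument are illustrative names and not part of any DBMS interface.

```python
def try_upgrade(readers, txn):
    """Return True if txn may convert its read lock on X to a write lock now.

    readers: set of transactions currently holding a read lock on X.
    """
    if txn not in readers:
        raise ValueError("txn holds no read lock to upgrade")
    # Upgrade is allowed only if txn is the sole reader; otherwise it waits.
    return readers == {txn}

print(try_upgrade({"T1"}, "T1"))         # True: T1 is the only reader
print(try_upgrade({"T1", "T2"}, "T1"))   # False: T1 waits until T2 unlocks X
```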
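Binary locks or R/W locks by themselves do not guarantee serializability.
Example 1:
T2:
lock-S(A); read(A); unlock(A);
lock-S(B); read(B); unlock(B);
display(A+B)
Because T2 unlocks A before it locks B, another transaction can update
A and B in between, and T2 may display an inconsistent sum.
X:=X+Y; Y:=X+Y;
write_item(X); write_item(Y);
unlock(X); unlock(Y);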
Result of schedule S: X=50, Y=50 (nonserializable)
3) Two-phase Locking Protocol (2PL)
A safe locking policy based on a simple rule: a transaction is not
allowed to lock any further data item once it has unlocked some data item.
A transaction is said to follow the 2PL protocol if all of its locking
operations precede its first unlock operation.
The transaction is divided into two phases:
Phase 1: Growing (or Expanding) Phase
The transaction may obtain new locks, but may not release any lock.
Phase 2: Shrinking Phase
The transaction may release existing locks, but may not obtain any
new lock.
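Whether a single transaction obeys the two-phase rule can be checked by scanning its operations in order. The sketch below assumes a hypothetical representation in which each operation is an (action, item) pair and the lock actions are named "read_lock" and "write_lock"; it checks only the 2PL rule and nothing else.

```python
def follows_2pl(operations):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True            # the shrinking phase has started
        elif action in ("read_lock", "write_lock") and shrinking:
            return False                # a lock in the shrinking phase violates 2PL
    return True

# Locks B only after unlocking A: not two-phase.
not_two_phase = [("read_lock", "A"), ("read", "A"), ("unlock", "A"),
                 ("read_lock", "B"), ("read", "B"), ("unlock", "B")]
# Same reads, but every lock precedes the first unlock: two-phase.
two_phase = [("read_lock", "A"), ("read_lock", "B"), ("read", "A"),
             ("read", "B"), ("unlock", "A"), ("unlock", "B")]

print(follows_2pl(not_two_phase))  # False
print(follows_2pl(two_phase))      # True
```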
Requirement for 2PL:
For a transaction, the locking and unlocking phases must be mutually
exclusive: no unlocking may occur during the locking phase, and no
locking may occur during the unlocking phase.
2PL limits concurrency because a transaction may have to:
Lock an item earlier than it needs it, so that all locks are acquired
before the first unlock.
Keep holding a lock on an item it no longer needs until it has locked
every other item it needs.
The protocol guarantees serializable schedules without the need to
test them.
It can be proved that the transactions can be serialized in the order
of their lock points (i.e. the point where a transaction acquired its
final lock).
Conservative 2PL: also called Static 2PL.
It requires a transaction to lock all the items it accesses before it
begins execution (by predeclaring its read and write sets).
If any of these items cannot be locked, the transaction locks none of
them and waits until all the items are available.
Not practical, because in most cases the read and write sets cannot be
known in advance.
Conservative 2PL is a deadlock-free protocol.
Solutions for deadlock:
Deadlock Prevention Protocols.
    Using the conservative 2PL (not practical).
    Ordering the DB items.
    No Waiting Protocol (NW).
    Using the concept of transaction timestamps.
    Cautious Waiting Protocol.
Timeouts.
    Having a system-defined timeout period and a mechanism to
    abort a transaction that waits for longer than the
    predefined timeout.
I) Deadlock Prevention Protocols
1) Using the Conservative 2PL protocol:
The protocol was discussed earlier, where it was shown to be deadlock-free.
It is not practical because it needs pre-declared read and write sets.
2) Ordering the DB items:
The protocol orders the DB items and forces each transaction to
lock its items in that order.
It is impractical because it requires the programmer to be aware of the
chosen order of the items.
3) No Waiting Protocol (NW):
If a transaction is unable to obtain a lock, it is immediately aborted and
then restarted after a certain time delay without checking if a deadlock
will actually occur or not.
The protocol can cause transactions to abort and restart needlessly.
4) The concept of transaction Timestamp:
Used to decide whether a transaction involved in a possible deadlock
situation should wait, abort, or preempt another transaction.
The timestamp of a transaction T, TS(T), is a unique identifier
assigned to T based on the order in which transactions are started.
If T1 started before T2, then TS(T1) < TS(T2)
[T1 is the older and T2 is the younger]
There are two schemes to prevent deadlock: wait-die and wound-wait.
In both, Ti is the transaction requesting a lock held by Tj. Wait-die
aborts Ti if it is the younger, while wound-wait aborts Tj if it is
the younger.
Both protocols may cause some transactions to be aborted and
restarted even though they might never actually cause a deadlock.
Wait-die:
Suppose that Ti tries to lock X but is not able to because X is
locked by Tj with a conflicting lock.
The rule is:
If TS(Ti) < TS(Tj), then Ti is allowed to wait; otherwise abort
Ti (Ti dies) and restart it later with the same timestamp.
Wound-wait:
Suppose that Ti tries to lock X but is not able to because X is
locked by Tj with a conflicting lock.
The rule is:
If TS(Ti) < TS(Tj), then abort Tj (Ti wounds Tj) and restart
it later with the same timestamp; otherwise Ti is allowed to
wait.
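Both rules reduce to one timestamp comparison, which the following sketch makes explicit. The function names and the returned strings are hypothetical; `ts_i` is the timestamp of the requesting transaction Ti and `ts_j` that of the lock holder Tj.

```python
def wait_die(ts_i, ts_j):
    """Ti requests an item held by Tj: an older Ti waits, a younger Ti dies."""
    return "Ti waits" if ts_i < ts_j else "abort Ti"

def wound_wait(ts_i, ts_j):
    """Ti requests an item held by Tj: an older Ti wounds Tj, a younger Ti waits."""
    return "abort Tj" if ts_i < ts_j else "Ti waits"

# Ti is older (timestamp 1), Tj is younger (timestamp 2).
print(wait_die(1, 2), "|", wound_wait(1, 2))   # Ti waits | abort Tj
# Ti is younger (timestamp 2), Tj is older (timestamp 1).
print(wait_die(2, 1), "|", wound_wait(2, 1))   # abort Ti | Ti waits
```

In both schemes the transaction that gets aborted is always the younger one, which is why an aborted transaction keeps its original timestamp when it restarts.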
5) Cautious Waiting Protocol:
Proposed to reduce the number of needless aborts/restarts.
Suppose that Ti tries to lock X but is not able to because X is
locked by Tj with a conflicting lock.
The rule is:
If Tj is not blocked (not waiting for some other locked item),
then Ti is blocked and allowed to wait; otherwise abort Ti.
II. Timeouts
It is simple and practical due to its low overhead.
The idea is to have a system-defined timeout period and a
mechanism to abort the transaction that waits for a period longer
than the predefined timeout.
The system assumes that a transaction whose wait exceeds the timeout
may be in a deadlock.
III) Deadlock Detection Protocols
The idea is not to enforce any restrictions on executing the
transactions, but to check whether a deadlock actually exists.
It is a more practical set of protocols.
It is beneficial if the transactions are short or the load is light
(few conflicts are expected).
A wait-for graph is a simple way to detect a deadlock.
How to build it:
1. One node for each currently executing transaction.
2. If Ti is waiting to lock an item X that is currently locked by Tj, a
directed edge from Ti to Tj is created.
3. When Tj releases the lock on the item that Ti was waiting for, the
directed edge is dropped from the graph.
A deadlock exists if and only if the graph has a cycle.
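Checking a wait-for graph for a cycle can be sketched with a depth-first search. The representation is an assumption made for illustration: the graph is a dictionary mapping each transaction to the set of transactions it is waiting for, and `has_deadlock` is a hypothetical helper name.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:             # edge back into the current path: cycle
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))

print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True: T1 and T2 wait for each other
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False: T2 is not waiting
```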
The system must abort some of the transactions when a deadlock
has been detected.
Selecting a transaction to be aborted is known as Victim Selection.
The algorithm of Victim Selection should avoid selecting
transactions that have been running for a long time and that have
performed many updates.
Time    T1'                     T2'
  |     read_lock(Y);
  |     read_item(Y);
  |                             read_lock(X);
  |                             read_item(X);
  |     write_lock(X);
  v                             write_lock(Y);
Wait-for graph: T1' → T2' (T1' waits for X) and T2' → T1' (T2' waits for Y).
Figure: Deadlock detection/resolution.
4.3.2 Multiple Granularity
What is Granularity?
Granularity is the size of the data item chosen as the unit of
protection by a concurrency control program.
The size of the locked item determines the granularity of the
lock.
The data item can be as large as the entire database or
as small as a field value of a record.
When the size of the data item is small, we say the
granularity is fine; when it is large, we say it is coarse.
Fine granularity (lower in tree): high concurrency, high
locking overhead
Coarse granularity (higher in tree): low locking overhead,
low concurrency
Granularity Locking
Up to now we have considered locking (and execution) at the level
of a single item/row.
However, there are circumstances in which it is preferable to lock
at a different level (sets of tuples, a relation, or even sets of
relations).
As an extreme example, consider a transaction that needs to access
the whole database: locking tuple by tuple would be
time-consuming.
Data item granularity significantly affects concurrency control
performance: the degree of concurrency is low for coarse
granularity and high for fine granularity.
Multiple granularity locking allows data items to be of various sizes
and defines a hierarchy (tree) of data granularities, where small
granularities are nested within larger ones.
Locking can take place at following levels:
Database-level lock:- Entire database is locked
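Compatibility of lock modes (rows: mode held, columns: mode requested)
       IS    IX    S     SIX   X
IS     yes   yes   yes   yes   no
IX     yes   yes   no    no    no
S      yes   no    yes   no    no
SIX    yes   no    no    no    no
X      no    no    no    no    no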
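A compatibility check for the five modes can be written as a table lookup. The sketch assumes the standard matrix for intention locks (IS, IX), shared (S), shared with intention exclusive (SIX) and exclusive (X); the names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# For each requested mode, the set of held modes it is compatible with
# (standard multiple-granularity matrix, assumed here).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

def can_grant(requested, held_modes):
    """True if `requested` is compatible with every mode other transactions hold."""
    return all(held in COMPATIBLE[requested] for held in held_modes)

print(can_grant("IX", ["IS", "IX"]))  # True: intention modes coexist
print(can_grant("S", ["IX"]))         # False: a descendant may be written
print(can_grant("X", []))             # True: nothing else is held on the node
```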
Granularity of data items and Multiple Granularity Locking: An
example of a serializable execution:
T1                  T2                  T3
IX(db)
IX(f1)
                    IX(db)
                                        IS(db)
                                        IS(f1)
                                        IS(p11)
IX(p11)
X(r111)
                    IX(f1)
                    X(p12)
                                        S(r11j)
IX(f2)
IX(p21)
X(r211)
unlock(r211)
unlock(p21)
unlock(f2)
                                        S(f2)
Granularity of data items and Multiple Granularity Locking: An
example of a serializable execution (continued):
T1                  T2                  T3
                    unlock(p12)
                    unlock(f1)
                    unlock(db)
unlock(r111)
unlock(p11)
unlock(f1)
unlock(db)
                                        unlock(r11j)
                                        unlock(p11)
                                        unlock(f1)
                                        unlock(f2)
                                        unlock(db)
4.3.3 Timestamp-Based Protocol
These protocols use timestamps to order the execution of
transactions so that the schedule is equivalent to a serial schedule,
which guarantees serializability, instead of determining the order of
each operation in a transaction at execution time.
Definition:
Timestamp TS(T) is a unique identifier created by the DBMS to
identify the transaction.
Each transaction is issued a timestamp when it enters the system.
The timestamp values are assigned in the order in which the
transactions are submitted to the system.
The timestamps can be generated by using a logical counter that is
incremented each time its value is assigned to a transaction, or by
using the current date/time value of the system clock.
Since no locks are used, timestamp-based concurrency protocols are free
from deadlock. Starvation is still possible, however, if a transaction
is repeatedly aborted and restarted.
If an old transaction Ti has timestamp TS(Ti), a new transaction
Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).
The protocol manages concurrent execution such that the
timestamps determine the serializability order.
To ensure this behavior, the protocol maintains two timestamp values
for each data item Q:
W-timestamp(Q) is the largest timestamp (youngest transaction)
of any transaction that executed write(Q) successfully.
R-timestamp(Q) is the largest timestamp (youngest transaction)
of any transaction that executed read(Q) successfully.
Monotonicity
Ensures that timestamp values always increase.
Example
A partial schedule for several data items for transactions with
timestamps 1, 2, 3, 4, 5
T1 T2 T3 T4 T5
read(X)
read(Y)
read(Y)
write(Y)
write(Z)
read(Z)
read(X)
abort
read(X)
write(Z)
abort
write(Y)
write(Z)
4.3.4 Validation-Based Protocol
If the majority of transactions are read-only, the rate of
conflicts among transactions may be low.
The protocol requires that each transaction T executes in two or three
different phases in its lifetime (read, validation and, for update
transactions, write), depending on whether it is a read-only or an
update transaction.
The phases of concurrently executing transactions can be
interleaved, but each transaction must go through its phases
in that order.
It is a type of optimistic concurrency control.
and Ti completes its read phase before Tj complete its read phase.
Example of schedule produced using validation
T14                 T15
read(B)
                    read(B)
                    B := B - 50
                    read(A)
                    A := A + 50
read(A)
(validate)
display(A+B)
                    (validate)
                    write(B)
                    write(A)
Reading Assignment
Thomas’s Write Rule
Multiversion scheme
o Multiversion Timestamp Ordering
o Multiversion Two-Phase Locking
When we need Concurrency Control
Concurrency Control vs. Serializability Tests
Concurrency-control protocols allow concurrent schedules,
but ensure that the schedules are conflict/view serializable,
and are recoverable and cascadeless.
Concurrency control protocols generally do not examine the
precedence graph as it is being created.
Instead, a protocol imposes a discipline that avoids
nonserializable schedules.
Different concurrency control protocols provide different
tradeoffs between the amount of concurrency they allow and
the amount of overhead that they incur.
Tests for serializability help us understand why a concurrency
control protocol is correct.
The R/W locking operations can be implemented by using a lock
table consisting of records of the type
<data item name, lock, no_of_reads, locking-transaction(s)>
Timestamps (continued)
Thomas’s Write Rule
It is a variation of basic timestamp ordering (TO).
It does not enforce conflict serializability, but it rejects fewer
write operations.
The rule modifies the check for write_item(X) issued by transaction T
as follows:
1. If read_TS(X) > TS(T), then abort and roll back T and reject
the operation.
2. Else if write_TS(X) > TS(T), then do not execute the write
operation but continue processing (the write is obsolete).
3. Else execute write_item(X) and set write_TS(X) = TS(T).
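The three cases of the rule map directly onto a small function. This is a minimal sketch under assumptions: the data item is a plain dictionary with hypothetical keys `read_ts`, `write_ts` and `value`, and the returned strings only label which case applied.

```python
def write_item(item, ts, value):
    """Apply Thomas's write rule for a write by a transaction with timestamp ts."""
    if item["read_ts"] > ts:
        return "abort"            # a younger transaction already read X
    if item["write_ts"] > ts:
        return "skip"             # obsolete write: ignore it and continue
    item["value"] = value
    item["write_ts"] = ts
    return "write"

x = {"read_ts": 0, "write_ts": 0, "value": None}
print(write_item(x, 5, "a"))   # write
print(write_item(x, 3, "b"))   # skip: a write with timestamp 5 is already recorded
x["read_ts"] = 7               # a transaction with timestamp 7 has read X
print(write_item(x, 6, "c"))   # abort
```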
Multiversion Schemes
Up to now we only considered a single copy (the most recent) of each database
item.
Multiversion schemes keep old versions of data item to increase concurrency.
Multiversion Timestamp Ordering
Multiversion Two-Phase Locking
Basic Idea of multiversion schemes
Each successful write results in the creation of a new version of the data
item written.
Use timestamps to label versions.
When a read(Q) operation is issued, select an appropriate version of Q
based on the timestamp of the transaction, and return the value of the
selected version.
reads never have to wait as an appropriate version is returned immediately.
A drawback is that creation of multiple versions increases storage overhead
Garbage collection mechanism may be used…
Multiversion Timestamp Ordering
Each data item Q has a sequence of versions <Q1, Q2,...., Qm>. Each version Qk
contains three data fields:
Content -- the value of version Qk.
W-timestamp(Qk) -- timestamp of the transaction that created (wrote) version
Qk
R-timestamp(Qk) -- largest timestamp of a transaction that successfully read
version Qk
when a transaction Ti creates a new version Qk of Q, Qk's W-timestamp and R-
timestamp are initialized to TS(Ti).
R-timestamp of Qk is updated whenever a transaction Tj reads Qk, and TS(Tj) >
R-timestamp(Qk).
Multiversion Timestamp Ordering (Cont)
Suppose that transaction Ti issues a read(Q) or write(Q) operation. Let Qk
denote the version of Q whose write timestamp is the largest write timestamp less
than or equal to TS(Ti).
1. If transaction Ti issues a read(Q), then the value returned is the content
of version Qk.
2. If transaction Ti issues a write(Q)
1. if TS(Ti) < R-timestamp(Qk), then transaction Ti is rolled back.
2. if TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten
3. else a new version of Q is created.
Observe that
Reads always succeed
A write by Ti is rejected if some other transaction Tj that (in the serialization
order defined by the timestamp values) should read
Ti's write, has already read a version created by a transaction older than Ti.
Protocol guarantees serializability
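The read and write rules above can be sketched as a small class. Several details are assumptions made for illustration: the class names `Version` and `MultiversionItem`, the initial version with timestamp 0, integer timestamps, and returning False to signal that the writing transaction must be rolled back.

```python
from dataclasses import dataclass

@dataclass
class Version:
    content: object
    w_ts: int   # timestamp of the transaction that wrote this version
    r_ts: int   # largest timestamp of a transaction that read this version

class MultiversionItem:
    def __init__(self, initial, ts=0):
        self.versions = [Version(initial, ts, ts)]

    def _select(self, ts):
        # Qk: the version with the largest write timestamp <= TS(Ti).
        candidates = [v for v in self.versions if v.w_ts <= ts]
        return max(candidates, key=lambda v: v.w_ts)

    def read(self, ts):
        qk = self._select(ts)
        qk.r_ts = max(qk.r_ts, ts)     # reads always succeed
        return qk.content

    def write(self, ts, value):
        """Return False if the writing transaction must be rolled back."""
        qk = self._select(ts)
        if ts < qk.r_ts:
            return False               # a younger transaction already read Qk
        if ts == qk.w_ts:
            qk.content = value         # overwrite the transaction's own version
        else:
            self.versions.append(Version(value, ts, ts))
        return True

q = MultiversionItem(10)
print(q.read(5))        # 10
print(q.write(3, 20))   # False: the transaction with timestamp 5 already read Q
print(q.write(7, 30))   # True: a new version is created
print(q.read(6))        # 10: a reader with timestamp 6 still sees the old version
```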
Multiversion Two-Phase Locking
Differentiates between read-only transactions and update transactions
Update transactions acquire read and write locks, and hold all locks up to the
end of the transaction. That is, update transactions follow rigorous two-phase
locking.
Each successful write results in the creation of a new version of the data item
written.
each version of a data item has a single timestamp whose value is obtained
from a counter ts-counter that is incremented during commit processing.
Read-only transactions are assigned a timestamp by reading the current value of
ts-counter before they start execution; they follow the multiversion timestamp-
ordering protocol for performing reads.
Multiversion Two-Phase Locking (Cont.)
When an update transaction wants to read a data item:
it obtains a shared lock on it, and reads the latest version.
When it wants to write an item
it obtains an X lock on it; it then creates a new version of the item and
sets this version's timestamp to ∞.
When update transaction Ti completes, commit processing occurs:
Ti sets timestamp on the versions it has created to ts-counter + 1
Ti increments ts-counter by 1
Read-only transactions that start after Ti increments ts-counter will see the
values updated by Ti.
Read-only transactions that start before Ti increments the
ts-counter will see the value before the updates by Ti.
Only serializable schedules are produced.
Insert and Delete Operations
If two-phase locking is used:
A delete operation may be performed only if the transaction
deleting the tuple has an exclusive lock on the tuple to be deleted.
A transaction that inserts a new tuple into the database is given
an X-mode lock on the tuple
Insertions and deletions can lead to the phantom phenomenon.
A transaction that scans a relation (e.g., find all accounts in
Perryridge) and a transaction that inserts a tuple in the relation
(e.g., insert a new account at Perryridge) may conflict in spite of
not accessing any tuple in common.
If only tuple locks are used, non-serializable schedules can result:
the scan transaction may not see the new account, yet may be
serialized before the insert transaction.
Insert and Delete Operations (Cont.)
The transaction scanning the relation is reading information that indicates
what tuples the relation contains, while a transaction inserting a tuple
updates the same information.
The information should be locked.
One solution:
Associate a data item with the relation, to represent the information
about what tuples the relation contains.
Transactions scanning the relation acquire a shared lock on the data
item.
Transactions inserting or deleting a tuple acquire an exclusive lock on
the data item. (Note: locks on the data item do not conflict with locks
on individual tuples.)
Above protocol provides very low concurrency for insertions/deletions.
Index locking protocols provide higher concurrency while
preventing the phantom phenomenon, by requiring locks
on certain index buckets.
Weak Levels of Consistency
Some applications are willing to live with weak levels of consistency,
allowing schedules that are not serializable
E.g. a read-only transaction that wants to get an approximate total
balance of all accounts
E.g. database statistics computed for query optimization can be
approximate
Such transactions need not be serializable with respect to other
transactions
They trade accuracy for performance.