
Databases 2

Concurrency Control

Concurrency in the DBMS architecture

[Figure: layered DBMS architecture]
• Query Manager: issues select, insert, delete, update
• Transaction Manager: issues begin, commit, abort
• Concurrency Control System: ensures Consistency and Isolation; maintains the Lock Tables
• Access Method Manager and Reliability Manager: the latter ensures Atomicity and Durability
• Buffer Manager: fix, unfix; read, write
• Secondary Store Manager: DB + Log

• The Concurrency Control System
  • Manages the simultaneous execution of transactions
  • Avoids the occurrence of anomalies
  • While ensuring performance
Advantages of Concurrency

[Figure: timeline with transactions T1–T5 overlapping in time, each spanning begin (b) ... commit (c) ... end (e)]

• Goal: exploit parallelism to maximise transactions per second (TPS)
Problems due to Concurrency

Concurrent SQL transactional statements addressing the same resource:

T1: begin transaction;
    update account set balance = balance + 3 where customer = 'Smith';
    commit work
T2: begin transaction;
    update account set balance = balance + 6 where customer = 'Smith';
    commit work

In abstract form:

T1: begin transaction; D = D + 3; commit work
T2: begin transaction; D = D + 6; commit work

At the read/write level:

T1: begin transaction; var x; read(D → x); x = x + 3; write(x → D); commit work
T2: begin transaction; var y; read(D → y); y = y + 6; write(y → D); commit work
Serial Executions

T1: bot(T1); var x; r(D → x); x = x + 3; w(x → D); commit(T1)
T2: bot(T2); var y; r(D → y); y = y + 6; w(y → D); commit(T2)

Initially D0 = 100

b(T1) ... c(T1) b(T2) ... c(T2):  D = 100 → 103 → 109
b(T2) ... c(T2) b(T1) ... c(T1):  D = 100 → 106 → 109

Note: it is not important that the final value be the same
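The two serial orders can be replayed with a small script (an illustrative sketch, not DBMS code; `t1`, `t2` and the `db` dictionary are hypothetical helpers standing in for the transactions and the database):

```python
# Replay the two serial executions of T1 (D = D + 3) and T2 (D = D + 6).
# Each transaction reads D into a local variable, updates it, and
# writes it back; with no interleaving both orders end with D = 109.

def t1(db):
    x = db["D"]      # r(D -> x)
    x = x + 3
    db["D"] = x      # w(x -> D)

def t2(db):
    y = db["D"]      # r(D -> y)
    y = y + 6
    db["D"] = y      # w(y -> D)

db = {"D": 100}
t1(db); t2(db)       # serial order T1, T2
print(db["D"])       # 109

db = {"D": 100}
t2(db); t1(db)       # serial order T2, T1
print(db["D"])       # 109
```

Both serial orders reach 109, even though the intermediate states (103 vs 106) differ.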

Concurrency Control
• Concurrency is fundamental
  • Tens or hundreds of transactions per second cannot be executed serially
  • Examples: banks, ticket reservations
• Problem: concurrent execution may cause anomalies
• Concurrency needs to be controlled
Concurrent Executions
• b(T1) e(T1) b(T2) e(T2)   SERIAL
• b(T2) e(T2) b(T1) e(T1)   SERIAL
• b(T1) b(T2) e(T1) e(T2)   INTERLEAVED
• b(T1) b(T2) e(T2) e(T1)   NESTED
Execution with Lost Update

T1: UPDATE account SET balance = balance + 3 WHERE client = 'Smith'
T2: UPDATE account SET balance = balance + 6 WHERE client = 'Smith'

Initially D = 100
1  T1: r(D → x)
2  T1: x = x + 3
3  T2: r(D → y)
4  T2: y = y + 6
5  T1: w(x → D)   D = 103
6  T2: w(y → D)   D = 106 !

• Note: this anomaly does not depend merely on T2 overwriting the value produced by T1
  • w1(x), w2(x) is ok (serial)
  • r1(x), w1(x), w2(x) is ok too (serial)
  • r1(x), r2(x), w1(x), w2(x) is not ok: inconsistent updates from the same initial value
Sequence of I/O actions producing the error: r1 r2 w1 w2, or r1 r2 w2 w1
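The lost-update interleaving can be traced step by step (an illustrative sketch: plain variables stand in for the transactions' local state and the stored datum D):

```python
# Replay the lost-update interleaving r1 r2 w1 w2.
# Both transactions read D = 100 before either writes, so T2's write
# overwrites T1's update: the +3 is lost.

D = 100
x = D          # 1  T1: r(D -> x)
x = x + 3      # 2  T1: x = x + 3
y = D          # 3  T2: r(D -> y)  (still reads 100)
y = y + 6      # 4  T2: y = y + 6
D = x          # 5  T1: w(x -> D)  D = 103
D = y          # 6  T2: w(y -> D)  D = 106, T1's update is lost
print(D)       # 106, not the serial result 109
```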
Dirty Read

T1: UPDATE account SET balance = balance + 3 WHERE client = 'Smith'
T2: UPDATE account SET balance = balance + 6 WHERE client = 'Smith'

Initially D = 100
1  T1: r(D → x)
2  T1: x = x + 3
3  T1: w(x → D)   D = 103
4  T2: r(D → y)   this read is "dirty" (uncommitted value)
5  T1: rollback
6  T2: y = y + 6
7  T2: w(y → D)   D = 109 !
Nonrepeatable Read

T1: SELECT balance FROM account WHERE client = 'Smith'   (issued twice)
T2: UPDATE account SET balance = balance + 6 WHERE client = 'Smith'

Initially D = 100
1  T1: r(D → x)
2  T2: r(D → y)
3  T2: y = y + 6
4  T2: w(y → D)   D = 106
5  T1: r(D → z)   z <> x !
Phantom Update
Constraint: A+B+C=100, A=50, B=30, C=20

T1: r(A → x), r(B → y) ...
T2: r(B → s), r(C → t)
T2: s = s + 10, t = t - 10
T2: w(s → B), w(t → C)   (now B=40, C=10, A+B+C=100)
T1: r(C → z)   (but, for T1, x+y+z = A+B+C = 90!)

• So for T1 it is as if "somebody else" had updated the value of the sum
• But for T2 the update is perfectly legal (it does not change the value of the sum)
Phantom Insert
(table Tab with columns A, B)

T1: C=AVG(B: A=1)      C = select avg(B) from Tab where A=1
T2: Insert (A=1,B=2)   insert into Tab values( 1, 2 )
T1: C=AVG(B: A=1)      C = ...

• Note: this anomaly does not depend on data already present in the DB when T1 executes, but on a "phantom" tuple that is inserted by "someone else" and satisfies the conditions of a previous query of T1
Summary of Anomalies
• Lost update: r1 - r2 - w2 - w1
  • An update is applied from a state that ignores a preceding update, which is lost
• Dirty read: r1 - w1 - r2 - abort1 - w2
  • An uncommitted value is used to update the data
• Nonrepeatable read: r1 - r2 - w2 - r1
  • Someone else updates a previously read value
• Phantom update: r1 - r2 - w2 - r1
  • Someone else updates data that contributes to a previously valid constraint
• Phantom insert: r1 - w2(new data) - r1
  • Someone else inserts data that contributes to a previously read datum
Concurrency Theory vs System Implementation
• Model: an abstraction of a system, object or process, which purposely disregards details to simplify the investigation of relevant properties
• Concurrency Theory builds upon a model of transactions and concurrency control principles that helps in understanding real systems
• Real systems exploit implementation-level mechanisms (locks, snapshots) which help achieve some of the desirable properties postulated by the theory
  • Do not look for a view- or conflict-serializability checker in your DBMS
  • Look instead for lock tables, lock types, lock granting rules, snapshots, etc.
• Understand how the implementation mechanisms ensure properties modelled by the Concurrency Theory
Transactions, Operations, Schedules
• Operation: a read or write of a specific datum by a specific transaction
  • NB: the schedule notation omits the program variables
  • r1(x) = r1(x → .)
  • w1(x) = w1(. → x)
  • r1(x) and r1(y) are different operations
  • r1(x) and r2(x) are different operations
  • w1(x) and w2(x) are different operations
• Schedule: a sequence of operations performed by concurrent transactions that respects the order of operations of each transaction
  • T1: r1(x) w1(x)
  • T2: r2(z) w2(z)
  • S1: r1(x) r2(z) w1(x) w2(z)
Schedules
• How many distinct schedules exist for two transactions?
• With T1 and T2 from the previous slide: N = 6
  • r1(x) w1(x) r2(z) w2(z)   serial
  • r2(z) w2(z) r1(x) w1(x)   serial
  • r1(x) r2(z) w1(x) w2(z)   interleaved
  • r2(z) r1(x) w2(z) w1(x)   interleaved
  • r1(x) r2(z) w2(z) w1(x)   nested
  • r2(z) r1(x) w1(x) w2(z)   nested
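The six schedules can be generated mechanically as all interleavings of the two operation sequences (a sketch; `interleavings` is an illustrative helper name):

```python
# Enumerate all distinct schedules of T1: r1(x) w1(x) and
# T2: r2(z) w2(z), i.e., all interleavings that preserve each
# transaction's internal operation order.

def interleavings(a, b):
    if not a:
        return [list(b)]
    if not b:
        return [list(a)]
    return ([[a[0]] + s for s in interleavings(a[1:], b)] +
            [[b[0]] + s for s in interleavings(a, b[1:])])

T1 = ["r1(x)", "w1(x)"]
T2 = ["r2(z)", "w2(z)"]
schedules = interleavings(T1, T2)
print(len(schedules))        # 6, as on the slide
for s in schedules:
    print(" ".join(s))
```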
Principles of Concurrency Control
• Goal: to reject schedules that cause anomalies
• Scheduler: a component that accepts or rejects the operations requested by the transactions
• Serial schedule: a schedule in which the actions of each transaction occur in a contiguous sequence
• S: r0(x) r0(y) w0(x) | r1(y) r1(x) w1(y) | r2(x) r2(y) r2(z) w2(z)
  (all of T0, then all of T1, then all of T2)
Principles of Concurrency Control
• In case of non-serial schedules?
  S: r0(x) r1(y) r0(y) w0(x) r1(x) r2(x) w1(y) r2(y) r2(z) w2(z)
• Notion of Serializable schedule
  • A schedule that leaves the database in the same state as some serial schedule of the same transactions
  • This is commonly accepted as a notion of schedule correctness
  • Requires a notion of schedule equivalence to identify classes of schedules that ensure serializability
  • Different notions → different classes (with different costs of checking whether a schedule is serializable)
Principles of Concurrency Control
• Assumption
  • We initially assume that transactions are observed "a posteriori" and limited to those that have committed (commit-projection), and we decide whether the observed schedule is admissible
  • In practice (and in contrast), schedulers must make decisions while transactions are running
Basic Idea

[Figure: Venn diagram. Within All Schedules, the Serializable Schedules ("good") contain the Serial Schedules ("good but unrealistic")]

• A schedule outside the serializable set may generate anomalies: its execution leads to a database state that no serial execution would produce
Schedules: how many?
• n transactions (T1, T2, ..., Ti, ..., Tn), each with ki operations
  • T1 has k1 operations, ..., Ti has ki operations, ...
  • T1: o11 o12 ... o1k1   Ti: oi1 oi2 ... oiki   Tn: on1 on2 ... onkn
• How many serial schedules (NS) exist for n transactions?
  • NS = n!  (the number of permutations of the n transactions)
• How many distinct schedules (ND) exist for n transactions? NS << ND
  • ND = (k1 + k2 + ... + kn)! / (k1! · k2! · ... · kn!)
  • i.e., the number of permutations of all operations, divided by the product of the numbers of permutations of the operations of each transaction (for each Ti, only 1 out of the ki! permutations is valid: the one that respects the sequence of its operations)
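The two counts can be computed directly from the formulas (a sketch; `num_serial` and `num_distinct` are illustrative helper names):

```python
# Count serial (NS) and distinct (ND) schedules for n transactions
# with k_i operations each: NS = n!, ND = (sum k_i)! / prod(k_i!).

from math import factorial
from functools import reduce

def num_serial(ks):
    return factorial(len(ks))

def num_distinct(ks):
    total = factorial(sum(ks))
    per_txn = reduce(lambda acc, k: acc * factorial(k), ks, 1)
    return total // per_txn

# Two transactions with 2 operations each (previous slides):
print(num_serial([2, 2]))     # 2 serial schedules
print(num_distinct([2, 2]))   # 6 distinct schedules

# Growth is combinatorial, e.g. 5 transactions of 4 operations each:
print(num_serial([4] * 5))    # 120
print(num_distinct([4] * 5))  # 305540235000
```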
View-serializability
• Preliminary definitions:
  • ri(x) reads-from wj(x) in a schedule S when wj(x) precedes ri(x) and there is no wk(x) in S between wj(x) and ri(x)
  • wi(x) in a schedule S is a final write if it is the last write on x that occurs in S
• Two schedules are view-equivalent (Si ≈V Sj) if they have: 1) the same operations, 2) the same reads-from relationship, and 3) the same final writes
• A schedule is view-serializable if it is view-equivalent to a serial schedule of the same transactions
  • The class of view-serializable schedules is named VSR
• Mnemonically: S is view-serializable if 1) every read operation sees the same values and 2) the final value of each object is written by the same transaction, as if the transactions were executed serially in some order
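The definition can be turned into a small checker (a sketch under the assumption that operations are encoded as `(kind, txn, item)` tuples, e.g. `("r", 1, "x")` for r1(x); the function names are illustrative):

```python
# View-equivalence test: same operations, same reads-from
# relationship, same final writes.

from collections import defaultdict

def reads_from(schedule):
    """Map each read (txn, item, occurrence) to the txn it reads
    from (None = initial value)."""
    last_writer = {}             # item -> txn of the last write seen
    seen = defaultdict(int)      # occurrence counter per (txn, item)
    rf = {}
    for kind, txn, item in schedule:
        if kind == "r":
            seen[(txn, item)] += 1
            rf[(txn, item, seen[(txn, item)])] = last_writer.get(item)
        else:
            last_writer[item] = txn
    return rf

def final_writes(schedule):
    """Map each item to the txn performing the last write on it."""
    fw = {}
    for kind, txn, item in schedule:
        if kind == "w":
            fw[item] = txn
    return fw

def view_equivalent(s1, s2):
    return (sorted(s1) == sorted(s2) and
            reads_from(s1) == reads_from(s2) and
            final_writes(s1) == final_writes(s2))

# S3: w0(x) r2(x) r1(x) w2(x) w2(z)  vs serial S4
S3 = [("w",0,"x"), ("r",2,"x"), ("r",1,"x"), ("w",2,"x"), ("w",2,"z")]
S4 = [("w",0,"x"), ("r",1,"x"), ("r",2,"x"), ("w",2,"x"), ("w",2,"z")]
print(view_equivalent(S3, S4))   # True: S3 is view-serializable
```

The same helper shows that S7 (the lost update) is view-equivalent to neither serial order of T1 and T2.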
Examples of view-serializability
• S3: w0(x) r2(x) r1(x) w2(x) w2(z)
• S4: w0(x) r1(x) r2(x) w2(x) w2(z)   (serial)
  • r1(x) reads from w0(x); r2(x) reads from w0(x)
  • final writes: w2(x) on x, w2(z) on z
  • S3 is view-equivalent to the serial schedule S4 (so it is view-serializable)
• S5: w0(x) r1(x) w1(x) r2(x) w1(z)
• S6: w0(x) r1(x) w1(x) w1(z) r2(x)   (serial)
  • r1(x) reads from w0(x); r2(x) reads from w1(x)
  • final writes: w1(x) on x, w1(z) on z
  • S5 is view-equivalent to the serial schedule S6, so it is also view-serializable
Examples of view-serializability
• S7: r1(x) r2(x) w1(x) w2(x)   (corresponds to a lost update)
• S8: r1(x) r2(x) w2(x) r1(x)   (corresponds to a nonrepeatable read)
• S9: r1(x) r1(y) r2(z) r2(y) w2(y) w2(z) r1(z)   (corresponds to a phantom update)
• They are all non view-serializable (check by testing view-equivalence with T1,T2 and with T2,T1)
A More Complex Example
S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
• Is S10 serializable? Yes iff there exists a serial schedule Ss such that S10 ≈V Ss
• Let's try with S11: T0 T1 T2 T3
  • w0(x) w0(z) w0(y) r1(x) r1(z) w1(x) r2(x) w2(y) r3(z) w3(z) w3(y)
  • In S11 r2(x) reads-from w1(x), but in S10 r2(x) reads-from w0(x): not view-equivalent
• Let's try with S12: T0 T2 T1 T3
• And also with S13: T0 T2 T3 T1
A More Complex Example
S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
S12: w0(x) w0(z) w0(y) r2(x) w2(y) r1(x) r1(z) w1(x) r3(z) w3(z) w3(y)
• reads-from OK: r1(x) from w0(x), r1(z) from w0(z), r2(x) from w0(x), r3(z) from w0(z)
• final writes OK: w1(x), w3(y), w3(z)
• Therefore S10 ∈ VSR
A More Complex Example
S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
S13: w0(x) w0(z) w0(y) r2(x) w2(y) r3(z) w3(z) w3(y) r1(x) r1(z) w1(x)
• reads-from: r1(x) from w0(x), but r1(z) from w3(z)
• Different reads-from relationship: S10 is not view-equivalent to T0,T2,T3,T1
Complexity of view-serializability
• Deciding the view-equivalence of two given schedules can be done in polynomial time and space
• Deciding whether a generic schedule is in VSR is an NP-complete problem
  • It requires considering the reads-from and final writes of all possible serial schedules with the same operations: combinatorial in the general case
• Performance!! What can we trade for it?
  • ...Accuracy!
  • We look for a stricter definition that is easier to check
  • It may lead to rejecting some schedules that would be acceptable under view-serializability but not under the stricter-but-faster criterion
VSR schedules are "too many"

[Figure: Venn diagram. Within All Schedules, the VSR Schedules contain the Serial Schedules; the class in between is marked "?"]
Conflict-serializability
• Preliminary definition:
  • Two operations oi and oj (i ≠ j) are in conflict if they address the same resource and at least one of them is a write
    • read-write conflicts (r-w or w-r)
    • write-write conflicts (w-w)
• Two schedules are conflict-equivalent (Si ≈C Sj) if Si and Sj contain the same operations and, in all the conflicting pairs, the transactions occur in the same order
• A schedule is conflict-serializable iff it is conflict-equivalent to a serial schedule of the same transactions
  • The class of conflict-serializable schedules is named CSR
Relationship between CSR and VSR
• VSR ⊃ CSR: all conflict-serializable schedules are also view-serializable, but the converse is not necessarily true
• Proof that the containment is strict: there are VSR schedules not in CSR
  • Counter-example: consider r1(x) w2(x) w1(x) w3(x), which
    • is view-serializable: it is view-equivalent to T1 T2 T3 = r1(x) w1(x) w2(x) w3(x)
    • is not conflict-serializable, due to the conflicting pairs r1(x) w2(x) and w2(x) w1(x): these cannot both keep the same order in T1 T2 or in T2 T1, so no serial schedule is conflict-equivalent to it
CSR implies VSR
• CSR → VSR: conflict-equivalence ≈C implies view-equivalence ≈V
• We assume S1 ≈C S2 and prove that S1 ≈V S2. S1 and S2 must have:
  • The same final writes: if they didn't, there would be at least two writes in a different order, and since two writes are conflicting operations, the schedules would not be ≈C
  • The same reads-from relations: if not, there would be at least one pair of conflicting operations in a different order, and therefore, again, ≈C would be violated
CSR and VSR

[Figure: Venn diagram. Within All Schedules: VSR Schedules ⊃ CSR Schedules ⊃ Serial Schedules]
Testing conflict-serializability
• Is done with a conflict graph that has:
  • One node for each transaction Ti
  • One arc from Ti to Tj if there exists at least one conflict between an operation oi of Ti and an operation oj of Tj such that oi precedes oj
• Theorem: a schedule is in CSR if and only if its conflict graph is acyclic

Testing conflict-serializability
S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
• Resource-based projections:
  • x: w0 r1 r2 w1
  • y: w0 w2 w3
  • z: w0 r1 r3 w3

[Figure: conflict graph with nodes T0, T1, T2, T3; arcs T0→T1, T0→T2, T0→T3, T2→T1, T2→T3, T1→T3; the graph is acyclic]
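The conflict-graph test can be automated (a sketch with the same `(kind, txn, item)` encoding as before; `conflict_graph` and `topological_order` are illustrative helper names, the latter a standard Kahn-style sort):

```python
# Build the conflict graph of a schedule (arc Ti -> Tj for each pair
# of conflicting operations with Ti's operation first) and test
# acyclicity by computing a topological order.

def conflict_graph(schedule):
    arcs = set()
    for i, (k1, t1, x1) in enumerate(schedule):
        for k2, t2, x2 in schedule[i + 1:]:
            if x1 == x2 and t1 != t2 and "w" in (k1, k2):
                arcs.add((t1, t2))
    return arcs

def topological_order(nodes, arcs):
    """Return a serial order if the graph is acyclic, else None."""
    indeg = {n: 0 for n in nodes}
    for _, j in arcs:
        indeg[j] += 1
    order, ready = [], [n for n in nodes if indeg[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for i, j in arcs:
            if i == n:
                indeg[j] -= 1
                if indeg[j] == 0:
                    ready.append(j)
    return order if len(order) == len(nodes) else None

# S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
S10 = [("w",0,"x"), ("r",1,"x"), ("w",0,"z"), ("r",1,"z"), ("r",2,"x"),
       ("w",0,"y"), ("r",3,"z"), ("w",3,"z"), ("w",2,"y"), ("w",1,"x"),
       ("w",3,"y")]
arcs = conflict_graph(S10)
print(topological_order({0, 1, 2, 3}, arcs))  # acyclic: e.g. [0, 2, 1, 3]
```

A `None` result means the graph is cyclic and the schedule is not in CSR.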
CSR implies acyclicity of the CG
• Consider a schedule S in CSR. As such, it is ≈C to a serial schedule
• W.l.o.g. we can (re)label the transactions of S so that their order in the serial schedule is T1 T2 ... Tn
• Since the serial schedule has all conflicting pairs in the same order as schedule S, the conflict graph can only contain arcs (i, j) with i < j
• Then the graph is acyclic, as a cycle requires at least one arc (i, j) with i > j
Acyclicity of the CG implies CSR
• If S's graph is acyclic, then it induces a topological (partial) ordering on its nodes, i.e., an ordering such that the graph only contains arcs (i, j) with i < j. The same partial order exists on the transactions of S
• Any serial schedule whose transactions are ordered according to the partial order is conflict-equivalent to S, because for all conflicting pairs (i, j) it is always i < j
• In the example before: T0 < T2 < T1 < T3
• In general, there can be many compatible serial schedules (i.e., many serializations for the same acyclic graph)
  • As many as the total orders compatible with the partial topological order
Let's go back... (2)
r1(x) w2(x) w1(x) w3(x)

[Figure: two graphs over nodes T1, T2, T3. Left: the conflict graph induced by the RW, WR and WW conflicting pairs r1(x)-w2(x), w2(x)-w1(x), r1(x)-w3(x), w2(x)-w3(x), w1(x)-w3(x); it contains the cycle T1 → T2 → T1, so the schedule is not in CSR. Right: the constraints induced by reads-from and final writes only, which are compatible with the serial order T1 T2 T3]
Back to: A More Complex Example
S10: w0(x) r1(x) w0(z) r1(z) r2(x) w0(y) r3(z) w3(z) w2(y) w1(x) w3(y)
• x: w0(x) r1(x) r2(x) w1(x) → T0<T1, T0<T2, T2<T1
• y: w0(y) w2(y) w3(y) → T0<T2, T0<T3, T2<T3
• z: w0(z) r1(z) r3(z) w3(z) → T0<T1, T0<T3, T1<T3
• The graph is acyclic → serial order T0 T2 T1 T3
S12: w0(x) w0(z) w0(y) r2(x) w2(y) r1(x) r1(z) w1(x) r3(z) w3(z) w3(y)
Concurrency Control in Practice
• CSR checking would be efficient if we knew the graph from the beginning, but we don't
• A scheduler must rather work "online", i.e., decide for each requested operation whether to execute it immediately or to reject/delay it
• It is not feasible to maintain the conflict graph, update it, and check its acyclicity at each operation request
• The assumption that concurrency control can work only with the commit-projection of the schedule is unrealistic: aborts do occur
• Some simple online "decision criterion" is required for the scheduler, which must
  • avoid as many anomalies as possible
  • have negligible overhead
Arrival sequences vs a posteriori schedules
• So far the notation r1(x) w2(x) w1(x) w3(x) represented a "schedule", which is an a posteriori view of the execution of concurrent transactions in the DBMS (also called "history" in some books)
  • A schedule represents "what has happened": which operations have been executed by which transaction in which order
  • Schedules can be further restricted by the commit-projection hypothesis to the operations executed by committed transactions
• When dealing with online concurrency control, it is important also to consider "arrival sequences", i.e., sequences of operation requests emitted in order by transactions
  • With an abuse of notation, we will denote an arrival sequence in the same way as an a posteriori schedule: r1(x) w2(x) w1(x) w3(x)
  • The distinction will be clear from the context
Concurrency control approaches
• How can concurrency control be implemented "online"?
• Two main families of techniques:
  • Pessimistic
    • Based on locks, i.e., resource access control
    • If a resource is taken, make the requester wait or pre-empt the holder
  • Optimistic
    • Based on timestamps and versions
    • Serve as many requests as possible, possibly using out-of-date versions of the data
• We will compare the two families after introducing their features
• Commercial systems take the best of both worlds
Locking
• It's the most common method in commercial systems
• A transaction is well-formed w.r.t. locking if
  • read operations are preceded by r_lock (aka SHARED LOCK) and followed by unlock:  r_lock1(x) r1(x) unlock1(x)
  • write operations are preceded by w_lock (aka EXCLUSIVE LOCK) and followed by unlock:  w_lock1(x) w1(x) unlock1(x)
  • Note: unlocking can be delayed w.r.t. the end of the read/write operation
• Transactions that first read and then write an object (r1(x) w1(x)) may:
  • Acquire a w_lock already when reading:  w_lock1(x) r1(x) w1(x) ...
  • Acquire a r_lock first and then upgrade it into a w_lock (lock escalation):  r_lock1(x) r1(x) w_lock1(x) w1(x) ...
• Possible states of an object:
  • free
  • r-locked (locked by one or more readers)
  • w-locked (locked by a writer)
Behavior of the Lock Manager (Conflict Table)
• The lock manager receives the primitives from the transactions and grants resources according to the conflict table
• When a lock request is granted, the resource is acquired
• When an unlock is executed, the resource becomes available

REQUEST   FREE             R_LOCKED                W_LOCKED
r_lock    OK → R_LOCKED    OK → R_LOCKED (n++)     NO → W_LOCKED
w_lock    OK → W_LOCKED    NO → R_LOCKED           NO → W_LOCKED
unlock    ERROR            OK → DEPENDS (n--)      OK → FREE

(n: counter of the concurrent readers, incremented at each r_lock and decremented at each unlock; when an r-locked resource is unlocked, it becomes FREE if n drops to 0, otherwise it stays R_LOCKED)
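The conflict table above can be sketched as a tiny lock manager (a hypothetical API for illustration, not the interface of any specific DBMS; it grants or denies requests but does not queue waiters):

```python
# A minimal lock manager implementing the conflict table: r_lock,
# w_lock (with upgrade when the requester is the only reader), and
# unlock with a per-resource reader set acting as the n counter.

class LockManager:
    def __init__(self):
        self.readers = {}   # resource -> set of txns holding a r_lock
        self.writer = {}    # resource -> txn holding the w_lock

    def r_lock(self, txn, res):
        if res in self.writer:                        # w-locked -> NO
            return False
        self.readers.setdefault(res, set()).add(txn)  # free/r-locked -> OK
        return True

    def w_lock(self, txn, res):
        if res in self.writer:                        # w-locked -> NO
            return False
        holders = self.readers.get(res, set())
        if holders - {txn}:                           # other readers -> NO
            return False
        holders.discard(txn)                          # upgrade own r_lock
        self.writer[res] = txn                        # OK -> w-locked
        return True

    def unlock(self, txn, res):
        if self.writer.get(res) == txn:
            del self.writer[res]                      # w-locked -> FREE
        else:
            self.readers.get(res, set()).discard(txn) # n--

lm = LockManager()
print(lm.r_lock(1, "x"))   # True: x was free
print(lm.w_lock(1, "x"))   # True: upgrade, T1 is the only reader
print(lm.r_lock(2, "x"))   # False: x is w-locked, T2 must wait
lm.unlock(1, "x")
print(lm.r_lock(2, "x"))   # True: x is free again
```

A real lock manager would additionally queue denied requests (FIFO) and wake the waiters at unlock time.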
Example
• Arrival sequence: r1(x), w1(x), r2(x), r3(y), w1(y), ...
• r1(x): r1-lock(x) request → OK → x r-locked, nx=1
• w1(x): w1-lock(x) request → OK (upgrade) → x w-locked
• r2(x): r2-lock(x) request → NO because x is w-locked → T2 waits for x
• r3(y): r3-lock(y) request → OK → y r-locked, ny=1
• T3 unlock(y) → y free, ny=0
• w1(y): w1-lock(y) request → OK → y w-locked
• T1 unlock(x) → x free, nx=0
• T1 unlock(y) → y free
• r2(x) was waiting: r2-lock(x) request → OK → x r-locked, nx=1
• ...
Note: T2 is delayed; it can be executed when T1 unlocks x
How are locks implemented
• Typically by lock tables: hash tables indexing the lockable items via hashing
• Each locked item has a linked list associated with it
• Every node in the linked list represents the transaction that requested the lock, the lock mode (SL/XL) and the current status (granted/waiting)
• Every new lock request for the data item is appended as a new node to the list
• Locks can be applied on both data and index records

[Figure: a hash table whose buckets (e.g., items 4, 24, 45) point to linked lists of requesting transactions, such as T1 → T3 → T7]
Is respecting locks enough for serializability?
• Arrival sequence: r1(x), r2(x), w2(x), r1(x)
• r1(x): r1-lock(x) request → OK → r-locked, n=1
• T1 unlock(x) → free, n=0
• r2(x): r2-lock(x) request → OK → r-locked, n=1
• w2(x): w2-lock(x) request → OK (upgrade) → w-locked
• T2 unlock(x) → free, n=0
• r1(x): r1-lock(x) request → OK → r-locked, n=1 ...
• But the schedule produced by the lock manager does not eliminate T1's nonrepeatable read
• T1's behavior: 1) it releases its read lock on x too quickly; 2) it acquires another lock after releasing one lock
Two-Phase Locking (2PL)
• Requirement (two-phase rule): a transaction cannot acquire any other lock after releasing a lock
• Ensures serializability!

[Figure: number of resources locked by Ti over time: a growing phase, a plateau, then a shrinking phase ending at commit-work/rollback-work]
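The two-phase rule can be checked mechanically on a trace of lock events (a sketch; the `(action, txn, item)` event encoding and the function name are illustrative assumptions):

```python
# Check the two-phase rule on a sequence of lock/unlock events:
# no transaction may acquire a lock after its first unlock.
# Events are (action, txn, item) with action in {"lock", "unlock"}.

def is_two_phase(events):
    shrinking = set()                 # txns that already released a lock
    for action, txn, _item in events:
        if action == "unlock":
            shrinking.add(txn)
        elif txn in shrinking:        # a lock after an unlock: violation
            return False
    return True

# T1 releases x and later locks y: violates the two-phase rule
bad = [("lock", 1, "x"), ("unlock", 1, "x"), ("lock", 1, "y")]
# All locks first, then all unlocks: two-phase
good = [("lock", 1, "x"), ("lock", 1, "y"),
        ("unlock", 1, "x"), ("unlock", 1, "y")]
print(is_two_phase(bad))    # False
print(is_two_phase(good))   # True
```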
Serializability
• Consider a scheduler that
  • Only processes well-formed transactions
  • Grants locks according to the conflict table
  • Checks that all transactions apply the two-phase rule
• The class of generated schedules is called 2PL
• Result: schedules in 2PL are both view- and conflict-serializable (VSR ⊃ CSR ⊃ 2PL)
CSR, VSR and 2PL

[Figure: Venn diagram. Within All Schedules: VSR Schedules ⊃ CSR Schedules ⊃ 2PL Schedules ⊃ Serial Schedules]
2PL implies CSR
• CSR ⊃ 2PL: every 2PL schedule is also conflict-serializable
• 2PL → CSR:
  • Suppose a 2PL schedule S is not CSR
  • S's conflict graph must contain at least one cycle Ti → Tj → Ti
  • Therefore there must be two pairs of conflicting operations in reverse order; suppose they appear as follows in the schedule:
    OPhi(x), OPkj(x) ... OPuj(y), OPwi(y)
    where one of OPhi(x), OPkj(x) is a write and one of OPuj(y), OPwi(y) is a write
  • Between OPhi(x) and OPkj(x), Ti must have released its lock on x for Tj to access x
  • Later in the schedule, between OPuj(y) and OPwi(y), Ti must have acquired a lock on y for the conflict to occur
  • So Ti acquires a lock after releasing one → CONTRADICTION with the two-phase rule
• The inclusion of 2PL in VSR descends from VSR ⊃ CSR
  • This proves that all 2PL schedules are view-serializable too
2PL smaller than CSR
• CSR ⊃ 2PL: every 2PL schedule is also conflict-serializable, but the converse is not necessarily true
• Counter-example: r1(x) w1(x) r2(x) w2(x) r3(y) w1(y)
  • It violates 2PL: T1 must release its lock on x (so that T2 can access it) and later acquire a lock on y
  • However, it is conflict-serializable: T3 < T1 < T2
    • On x: r1(x) w1(x) r2(x) w2(x) → T1 < T2
    • On y: r3(y) w1(y) → T3 < T1
A visualization of the 2PL test
Resources on the Y axis and operation times on the X axis

[Figure: for r1(x) w1(x) r2(x) w2(x) r3(y) w1(y), resource x carries r1 w1 r2 w2 at times 1-4 and resource y carries r3 w1 at times 5-6: T1 must release x before time 3 and acquire y at time 6, violating the two-phase rule]
2PL and other anomalies
• Nonrepeatable read (r1 - r2 - w2 - r1): already shown
• Lost update (r1 - r2 - w2 - w1): T1 releases a lock to T2 and then tries to acquire another one
• Phantom update (r1 - r2 - w2 - r1): T1 releases a lock to T2 and then tries to acquire another one
• Phantom insert (r1 - w2(new data) - r1): T1 releases a lock to T2 and then tries to acquire another one (NOTE: T2 does not necessarily write on data already locked by T1 → requires locks on "future data", aka predicate locks)
• Dirty read (r1 - w1 - r2 - abort1 - w2): requires dealing with abort
Dirty reads are still a menace: Strict 2PL
• Up to now, we were still using the hypothesis of commit-projection (no transactions in the schedule abort)
• 2PL, as seen so far, does not protect against dirty reads of uncommitted data (and therefore neither do VSR nor CSR)
  • Releasing locks before rollbacks exposes "dirty" data
• To remove this hypothesis, we need to add a constraint to 2PL, which defines strict 2PL:
  • Locks held by a transaction can be released only after commit/rollback
  • Remember: rollback restores the state prior to the aborted updates
• This version of 2PL is used in most commercial DBMSs when a high level of isolation is required (see next: SQL isolation levels)
Strict 2PL in Practice

[Figure: number of resources locked by Ti over time: a growing phase, then a plateau, with all locks released at once at commit-work/rollback-work]

• Strict 2PL locks are also called long duration locks; 2PL locks, short duration locks
• Note: real systems may apply 2PL policies differently to read and write locks
  • Typically: long duration strict 2PL write locks, variable policies for read locks
• NOTE: long duration read locks are costly in terms of performance: real systems replace them with more complex mechanisms
How to prevent phantom inserts: predicate locks
• A phantom insert occurs when a transaction adds items to a data set previously read by another transaction
  • Example: T1: C=AVG(B: A=1); T2: Insert (A=1,B=2); T1: C=AVG(B: A=1)
  • To prevent it, T1 needs a lock on the predicate A=1, so that T2 cannot insert (A=1,B=2)
• To prevent phantom inserts, a lock should be placed also on "future data", i.e., inserted data that would satisfy a previous query
• Predicate locks extend the notion of data locks to "future data"
  • Example: suppose that transaction T = update Tab set B=1 where A>1
  • Then, the lock is on the predicate A>1
  • Other transactions cannot insert, delete, or update any tuple satisfying this predicate
• In the worst case (predicate locks not supported): the lock extends to the entire table
• If the implementation supports predicate locks: the lock is managed with the help of indexes (gap locks)
Isolation Levels in SQL:1999 (and JDBC)
• SQL defines transaction isolation levels, which specify the anomalies that should be prevented by running at that level
• The level does not affect write locks: a transaction should always get an exclusive lock on any data it modifies, and hold it until completion (strict 2PL on write locks), regardless of the isolation level. For read operations, levels define the degree of protection from the effects of modifications made by other transactions
Why long duration write locks are necessary
• Consider the following schedule (admissible if write locks are short duration) and remove the hypothesis that aborts do not occur:
  w1[x] ... w2[x] ... ((c1 or a1) and (c2 or a2) in any order)
  • T2 is allowed to write over the same object updated by T1, which has not yet completed
• If T1 aborts, e.g., w1[x] ... w2[x] ..., a1, (c2 or a2): how to process event a1?
  • If x is restored to the state before T1, T2's update is lost, so if T2 commits x has a stale value
  • If x is NOT restored and T2 also aborts, then T2's proper before state cannot be reinstalled either!
• Thus: write locks are held until the completion of the transaction to enable the proper processing of abort events
• The anomaly of the above non-commit-projection schedule is named dirty write
Isolation Levels in SQL:1999 (and JDBC)
• READ UNCOMMITTED allows dirty reads, nonrepeatable reads and phantom updates and inserts:
  • No read locks (and it ignores locks of other transactions)
• READ COMMITTED prevents dirty reads but allows nonrepeatable reads and phantom updates/inserts:
  • Read locks (complying with locks of other transactions), but without 2PL on read locks (read locks are released as soon as the read operation is performed and can be acquired again)
• REPEATABLE READ avoids dirty reads, nonrepeatable reads and phantom updates, but allows phantom inserts:
  • Long duration read locks → 2PL also for reads
• SERIALIZABLE avoids all anomalies:
  • 2PL with predicate locks to avoid phantom inserts
• Note that SQL standard isolation levels dictate minimum requirements; real systems may go beyond (e.g., in MySQL and Postgres REPEATABLE READ avoids phantom inserts too) and use different mechanisms (e.g., to avoid long duration read locks)

                  Dirty read   Nonrepeatable read   Phantoms
Read uncommitted      Y                Y               Y
Read committed        N                Y               Y
Repeatable read       N                N               Y (insert)
Serializable          N                N               N
SQL92 serializable <> serial!
• Serializable transactions don't necessarily execute serially
• The requirement is that transactions can only commit if the result would be as if they had executed serially in any order
• The locking requirements to meet this guarantee can frequently lead to a deadlock (see next slides), where one of the transactions needs to be rolled back
• Therefore, the SERIALIZABLE isolation level is used sparingly and is NOT the default in most commercial systems
SQL isolation levels and locks
• SQL isolation levels may be implemented with the appropriate use of locks
• Commercial systems make joint use of locks and of timestamp-based concurrency control mechanisms

READ UNCOMMITTED
  READ LOCKS: not required
  WRITE LOCKS: well-formed writes, long duration write locks
READ COMMITTED
  READ LOCKS: well-formed reads, short duration read locks (data and predicate)
  WRITE LOCKS: well-formed writes, long duration write locks
REPEATABLE READ
  READ LOCKS: well-formed reads, long duration data read locks, short duration predicate read locks
  WRITE LOCKS: well-formed writes, long duration write locks
SERIALIZABLE
  READ LOCKS: well-formed reads, long duration read locks (predicate and data)
  WRITE LOCKS: well-formed writes, long duration write locks
Setting transaction characteristics in SQL

<set transaction statement> ::=
    SET [ LOCAL ] TRANSACTION <transaction characteristics>
<transaction characteristics> ::=
    [ <transaction mode> [ { <comma> <transaction mode> }... ] ]
<transaction mode> ::=
    <isolation level> | <transaction access mode> | <diagnostics size>
<transaction access mode> ::= READ ONLY | READ WRITE
<isolation level> ::= ISOLATION LEVEL <level of isolation>
<level of isolation> ::=
    READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE
The impact of locking: waiting is dangerous!
• Transactions requesting locks are either granted the lock or suspended and queued (first-in first-out). There is a risk of:
  • Deadlock: two or more transactions in endless (mutual) wait
    • Typically occurs when each transaction waits for another to release a lock (in particular: r1 r2 w1 w2, see next slide)
  • Starvation: a single transaction in endless wait
    • Typically occurs due to write transactions waiting for resources that are continuously read (e.g., index roots)
Deadlock
• Occurs because concurrent transactions hold, and in turn request, resources held by other transactions
  • T1: r1(x) w1(y)
  • T2: r2(y) w2(x)

[Figure: T1 holds an SL on x and requests an XL on y; T2 holds an SL on y and requests an XL on x]

• S: r_lock1(x), r_lock2(y), r1(x), r2(y), w_lock1(y), w_lock2(x) → deadlock
Deadlock
• Lock graph: a bipartite graph in which nodes are resources or transactions, and arcs are lock requests or lock assignments
• Wait-for graph: a graph in which nodes are transactions and arcs are "waits-for" relationships
• A deadlock is represented by a cycle in the wait-for graph of transactions

[Figure: a lock graph over resources x, y and transactions T1-T4, with SL/XL holds and requests, and the corresponding wait-for graph, in which the waits-for arcs among T1, T2, T3 form a cycle]
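Cycle detection on a wait-for graph is a standard depth-first search (a sketch; the adjacency-dict encoding and the function name are illustrative):

```python
# Deadlock detection on a wait-for graph: nodes are transactions,
# arcs are waits-for relations; a deadlock is a cycle, found here
# via DFS with three node colors.

def has_cycle(waits_for):
    """waits_for: dict txn -> set of txns it waits for."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in waits_for}

    def dfs(t):
        color[t] = GREY
        for u in waits_for.get(t, set()):
            if color.get(u, WHITE) == GREY:
                return True               # back arc: cycle found
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in waits_for)

# T1 waits for T2 and T2 waits for T1 (the r1 r2 w1 w2 pattern):
print(has_cycle({1: {2}, 2: {1}}))            # True: deadlock
print(has_cycle({1: {2}, 2: {3}, 3: set()}))  # False: just a wait chain
```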
Deadlock Resolution Techniques
• Timeout
  • Transactions killed after a long wait
  • How long?
• Deadlock prevention
  • Transactions killed when they COULD BE in a deadlock
  • Heuristics
• Deadlock detection
  • Transactions killed when they ARE in a deadlock
  • Inspection of the wait-for graph
1) Timeout Method
• A transaction is killed and restarted after a given amount of waiting (assumed to be due to a deadlock)
• The simplest method, widely used in the past
• The timeout value is system-determined (sometimes it can be altered by the database administrator)
• The problem is choosing a proper timeout value
  • Too long: useless waits whenever deadlocks occur
  • Too short: unrequired kills, redo overhead
• http://davebland.com/how-often-does-sql-server-look-for-deadlocks
2) Deadlock Prevention
• Idea: killing transactions that could cause cycles
• Resource-based prevention: restrictions on lock requests
• Transactions request all resources at once, and only once
• Resources are globally sorted and must be requested “in global
order”
• Problem: it’s not easy for transactions to anticipate all requests!
• Transaction-based prevention: restrictions based on
transactions’ IDs
• Assigning IDs to transactions incrementally  transactions’ “age”
• Preventing “older” transactions from waiting for “younger” ones to
end their work
• Options for choosing the transaction to kill
• Preemptive (killing the holding transaction – wound-wait)
• Non-preemptive (killing the requesting transaction – wait-die)
• Problem: too many “killings”! (waiting probability >> deadlock probability)
Non-preemptive algorithm: Wait-Die

• Conflicts are between a requesting transaction (RT) and a conflicting transaction (CT) that holds the lock
• Wait-Die algorithm: if RT T1 is older than CT T2, then T1 waits; otherwise T1 dies. This is a non-preemptive algorithm in which RT never forces CT to abort
• Case 1: T1 (old) requests X, held by T2 (young): T1 WAITS
• Case 2: T2 (young) requests X, held by T1 (old): T2 DIES (abort + restart); it is restarted with the same timestamp (so it becomes relatively older), which avoids starvation
• The oldest transaction survives! The young transaction is killed when it (the young transaction) requests a lock held by the older transaction
Preemptive algorithm: Wound-Wait

• Wound-Wait algorithm: if RT T1 is older than CT T2, then T1 wounds T2; otherwise T1 waits. This is a preemptive algorithm
• Case 1: T1 (old) requests X, held by T2 (young): T1 PREEMPTS (T2 is wounded, i.e., killed)
• Case 2: T2 (young) requests X, held by T1 (old): T2 WAITS
• The oldest transaction survives! The young transaction is killed when the old transaction requests a lock held by the young transaction
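Both prevention schemes reduce to a comparison of timestamps. A minimal sketch (Python; “older” means smaller timestamp; function names are mine):

```python
# Both functions return the action applied to the REQUESTING transaction
# when it asks for a lock held by another transaction.

def wait_die(req_ts, holder_ts):
    # Non-preemptive: an older requester waits; a younger requester dies
    return "WAIT" if req_ts < holder_ts else "DIE"

def wound_wait(req_ts, holder_ts):
    # Preemptive: an older requester wounds (kills) the holder; a younger one waits
    return "WOUND" if req_ts < holder_ts else "WAIT"

# T1 (ts=1, old) vs T2 (ts=2, young): in both schemes the oldest survives
print(wait_die(1, 2), wait_die(2, 1))      # WAIT DIE
print(wound_wait(1, 2), wound_wait(2, 1))  # WOUND WAIT
```

Note how the two schemes agree on who ultimately survives (the oldest) and differ only in whether the requester or the holder pays the price.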
3) Deadlock Detection
• Requires an algorithm to detect cycles in the wait-for graph
• Must work with distributed resources efficiently & reliably
• An elegant solution: Obermarck’s algorithm (DB2-IBM,
published on ACM Transactions on Database Systems)
• Assumptions
• Transactions execute on a single main node (one locus of control)
• Transactions may be decomposed in “sub-transactions” running on
other nodes
• Synchronicity: when a transaction spawns a sub-transaction it
suspends work until the latter completes
• Two wait-for relationships:
• Ti waits for Tj on the same node because Ti needs a datum locked by Tj
• A sub-transaction of Ti waits for another sub-transaction of Ti running on a
different node (via external call E)

Distributed Deadlock Detection
Distributed dependency graph: external call nodes represent a sub-transaction
activating another sub-transaction at a different node
(Figure: Node A runs sub-transactions T1a and T2a; Node B runs T1b and T2b plus local transactions T3 and T4. T2a calls T2b and T1a calls T1b through external-call nodes EB at Node A and EA at Node B.)
• Representation of the status
  • at Node A: EB → T2a → T1a → EB
  • at Node B: EA → T1b → T2b → EA
  • NOTATION: the symbol → denotes 1) a “wait-for” relation among local transactions; 2) if one term is an external call, either the source is waited for by a remote transaction or the sink waits for a remote transaction
• Potential deadlock: T2a waits for T1a (data lock), which waits for T1b (call), which waits for T2b (data lock), which waits for T2a (call): a cycle!
• Problem: how to detect such an occurrence without maintaining the global view
Obermarck’s Algorithm
• Goal: detection of a potential deadlock looking only at the
local view of a node
• Method: establishing a communication protocol whereby
each node has a local projection of the global dependencies
• Nodes exchange information and update their local graph
based on the received information
• Communication is optimized to avoid that multiple nodes
detect the same potential deadlock
• Node A sends its local info to a node B only if
• A contains a transaction Ti that is waited for by another remote transaction and itself waits for a transaction Tj active on B
• i > j (this ensures a kind of message “forwarding” along a node path, where node A “precedes” node B if i > j)
• Mnemonically: I send info to you if a distributed transaction listed at me waits for a distributed transaction listed at you with a smaller index
Forwarding rule
Node A Node B T3
Call T2b
T2a EB EA

T4
T1a EB
EA
Call T1b

• Node A:
• Activation/wait sequence: Eb  T2  T1  Eb
• i=2, j=1 i>j
• A can dispatch info to B
• Node B: (only distributed transactions count)
• Activation/wait sequence: Ea  T1  T2  Ea
• i=1, j=2
• B does not dispatch info to A
i<j
Obermarck’s Algorithm
• Runs periodically at each node
• Consists of 4 steps
• Get graph info (wait dependencies among transactions
and external calls) from the “previous” nodes.
Sequences contain only node and top-level transaction
identifiers
• Update the local graph by merging the received
information
• Check the existence of cycles among transactions
denoting potential deadlocks: if found, select one
transaction in the cycle and kill it
• Send updated graph info to the “next” nodes
• Propagate also killed transactions
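A toy sketch (not Obermarck's published algorithm) of the merge-then-check step at a node, in Python. External-call endpoints are modeled as distinct in/out labels so that only genuine transaction cycles are reported; the edge lists are illustrative:

```python
def has_cycle(edges):
    """Plain DFS cycle detection over a list of (from, to) wait-for edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def dfs(n):
        visiting.add(n)
        for m in graph[n]:
            if m in visiting or (m not in done and dfs(m)):
                return True
        visiting.discard(n)
        done.add(n)
        return False

    return any(n not in done and dfs(n) for n in graph)

# Local view at Node B: EA(in) -> T1 -> T2 -> EA(out)
local_b = [("EA_in", "T1"), ("T1", "T2"), ("T2", "EA_out")]
# Info received from Node A: EB(in) -> T2 -> T1 -> EB(out)
from_a  = [("EB_in", "T2"), ("T2", "T1"), ("T1", "EB_out")]

print(has_cycle(local_b))           # False: no deadlock visible locally
print(has_cycle(local_b + from_a))  # True: merged graph has the cycle T1 <-> T2
```

The merge step (concatenating the received edges with the local ones) is what turns a locally acyclic view into a detectable potential deadlock.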
Algorithm execution, step 1: communication

(Figure: the same scenario; Node A's local graph is EB → T2a → T1a → EB, Node B's is EA → T1b → T2b → EA.)

• Distributed deadlock detection: forwarding rule
  • at Node A: EB → T2 → T1 → EB, so info is sent to Node B (i=2 > j=1)
  • at Node B: EA → T1 → T2 → EA, so info is not sent (i < j)
Algorithm execution, step 2: local graph update

• at Node B:
  • 1. EB → T2 → T1 → EB is received
  • 2. EB → T2 → T1 → EB is added to the local wait-for graph
  • 3. Deadlock detected (cycle between T1 and T2)
  • T1 or T2 or T4 is killed (rollback)
Algorithm execution: deadlock resolution

(Figure: T2 and its sub-transactions are crossed out in both nodes' graphs.)

• For example, T2 is killed
• Info about the killed transaction is forwarded to Node A

Obermarck algorithm (original paper - 1982)
Another example
• Initially: at Node A, EC → T3 → T2 → EB
             at Node B, EA → T2 → T1 → EC
             at Node C, EB → T1 → T3 → EA
Another example, continued

• Node A: EC → T3 → T2 → EB; 3 > 2, can send (to B)
• Node B: EA → T2 → T1 → EC; 2 > 1, can send (to C)
• Node C: EB → T1 → T3 → EA; 1 < 3, cannot send
Another example, continued

• Node A: EC → T3 → T2 → EB (unchanged)
• Node B: merging the info received from A with the local graph yields EC → T3 → T2 → T1 → EC, condensed as EC → T3 → T1 → EC; 3 > 1, can send (to C)
• Node C: merging the info received from B yields EA → T2 → T1 → T3 → EA; 2 < 3, cannot send
Another example, continued

• Node C: merging EC → T3 → T1 → EC, received from B, with the local graph closes a cycle between T1 and T3. Cycle detected!
Obermarck: immateriality of
conventions
• There are two arbitrary choices in the algorithm:
• Send messages only if: (1) i > j vs. (2) i < j
• Send them to: (a) the following node vs. (b) the
preceding node
• Therefore, there are four versions/variants of the
algorithm
• (1+a), (1+b), (2+a), (2+b)
• The sequence of the sent messages is different
• However, they all identify deadlocks (if present)

Deadlocks in practice
• Their probability is much less than the conflict probability
• Consider a file with n records and two transactions doing two
accesses to their records (uniform distribution); then:
• Conflict probability is O(1/n): ∑ i=1..n (1/n * 1/n) = n * (1/n * 1/n) = 1/n
• Deadlock probability is O(1/n2) : T1 conflicts with T2 AND vice versa
• Still, they do occur (once every minute in a mid-size bank)
• The probability is linear in the number of transactions, quadratic in
their length (measured by the number of lock requests)
• Shorter transactions are healthier (ceteris paribus)
• There are techniques to limit the frequency of deadlocks
• Update Lock, Hierarchical Lock, …,

Update lock
• The most frequent deadlock occurs when 2 concurrent
transactions start by reading the same resources (SL) and
then decide to write and try to upgrade their lock to XL
• To avoid this situation, systems offer the UPDATE LOCK (UL)
– asked by transactions that will read and then write
Resource status
Request free SL UL XL
SL OK OK OK No
UL OK OK No No
XL OK No No No

• Update locks are easy to implement and mitigate the most frequent
cause of collision: r1(x) r2(x) w1(x) w2(x)
• They are requested by using the SQL SELECT ... FOR UPDATE statement
Update lock

• Consider r1(x) r2(x) w1(x) w2(x)
• With SL and XL only, the locks for r1(x) w1(x) are SL, then an upgrade to XL. Sequence of lock requests: SL1 (granted), SL2 (granted), XL1 (T1 waits), XL2 (T2 waits): deadlock!
• With update locks, the locks for r1(x) w1(x) are UL, then an upgrade to XL. Sequence of lock requests: UL1 (granted), UL2 (T2 waits), XL1 (granted), ...: no deadlock!
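The compatibility matrix above translates directly into a lookup table. A sketch in Python (the matrix is transcribed from the slide; the helper function and names are mine):

```python
# COMPAT[held][requested]: can `requested` be granted while `held` is in place?
COMPAT = {
    "SL": {"SL": True,  "UL": True,  "XL": False},
    "UL": {"SL": True,  "UL": False, "XL": False},
    "XL": {"SL": False, "UL": False, "XL": False},
}

def granted(requested, held_locks):
    """A request is granted only if compatible with every lock currently held
    (an empty list means the resource is free)."""
    return all(COMPAT[h][requested] for h in held_locks)

# With SL-only reads: SL1 and SL2 are granted, then each upgrade to XL
# blocks on the other's SL -> deadly embrace
print(granted("XL", ["SL"]))   # False (holds for both T1 and T2)

# With update locks: UL1 is granted, UL2 must wait, so T1 proceeds alone
print(granted("UL", []))       # True  (UL1 on a free resource)
print(granted("UL", ["UL"]))   # False (UL2 waits; no deadlock)
```

The asymmetry of UL (readable by others, but not UL-compatible with itself) is precisely what serializes the two would-be writers.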
SELECT statement in MySQL

• FOR UPDATE: rows examined by the query are write-locked until the end of the current transaction
• FOR SHARE: sets shared locks that permit other transactions to read the examined rows, but not to update or delete them
• NOWAIT causes a FOR UPDATE or FOR SHARE query to execute immediately, returning an error if a row lock cannot be obtained due to a lock held by another transaction
• SKIP LOCKED causes a FOR UPDATE or FOR SHARE query to execute immediately, excluding from the result set the rows that are locked by another transaction
# Session 1:
mysql> CREATE TABLE t (i INT, PRIMARY KEY (i)) ENGINE = InnoDB;
mysql> INSERT INTO t (i) VALUES (1),(2),(3);
mysql> START TRANSACTION;
mysql> SELECT * FROM t WHERE i = 2 FOR UPDATE;
+---+
| i |
+---+
| 2 |
+---+

# Session 2:
mysql> START TRANSACTION;
mysql> SELECT * FROM t WHERE i = 2 FOR UPDATE NOWAIT;
ERROR 3572 (HY000): Do not wait for lock.
// execute a command only if it doesn't run into a conflict

# Session 3:
mysql> START TRANSACTION;
mysql> SELECT * FROM t FOR UPDATE SKIP LOCKED;
+---+
| i |
+---+
| 1 |
| 3 |
+---+
// you get unclaimed records (only committed records)
// useful when avoiding conflicts is more important than getting all the rows
// queue management: e.g., a queue of tasks to consume

FROM: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html
Hierarchical Locking

• Update locks prudentially extend the interval during which a resource is locked
• What to lock? An entire table? That reduces concurrency too much
• Locks can be specified with different granularities
  • e.g.: schema, table, fragment, page, tuple, field
  • Coarser granularity (file, page) vs. finer granularity (tuple, value): finer granularity increases concurrency
• Objectives:
  • Locking the minimum amount of data
  • Recognizing conflicts as soon as possible
• Method: asking locks on hierarchical resources by:
  • Requesting resources top-down until the right level is obtained
  • Releasing locks bottom-up
Intention Locking Scheme
• 5 Lock modes:
• In addition to read (SHARED) locks (SL) and write
(EXCLUSIVE) locks (XL)
• The new modes express the “intention” of locking
at lower (finer) levels of granularity
• ISL: Intention of locking a subelement of the current
element in shared mode
• IXL: Intention of locking a subelement of the current
element in exclusive mode
• SIXL: Lock of the element in shared mode with intention
of locking a subelement in exclusive mode (SL+IXL)
Hierarchical Locking Protocol
• Locks are requested starting from the root (e.g., starting
from the whole table) and going down in the hierarchy
• Locks are released starting from the locked resource and
going up in the hierarchy
• To request an SL or ISL lock on a non-root element, a
transaction must hold an equally or more restrictive lock
(ISL or IXL) on its “parent”
• To request an IXL, XL or SIXL lock on a non-root element, a
transaction must hold an equally or more restrictive lock
(SIXL or IXL) on its “parent”
• When a lock is requested on a resource, the lock manager
decides based on the rules specified in the hierarchical lock
granting table
Hierarchical lock granting table

                    Resource state
Request   free   ISL   IXL   SL    SIXL   XL
ISL       OK     OK    OK    OK    OK     No
IXL       OK     OK    OK    No    No     No
SL        OK     OK    No    OK    No     No
SIXL      OK     OK    No    No    No     No
XL        OK     No    No    No    No     No
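The granting table, transcribed into Python (the lookup helper and names are mine; an element may hold several compatible locks from different transactions):

```python
# GRANT[held][requested], transcribed from the hierarchical lock granting table
GRANT = {
    "ISL":  {"ISL": True,  "IXL": True,  "SL": True,  "SIXL": True,  "XL": False},
    "IXL":  {"ISL": True,  "IXL": True,  "SL": False, "SIXL": False, "XL": False},
    "SL":   {"ISL": True,  "IXL": False, "SL": True,  "SIXL": False, "XL": False},
    "SIXL": {"ISL": True,  "IXL": False, "SL": False, "SIXL": False, "XL": False},
    "XL":   {"ISL": False, "IXL": False, "SL": False, "SIXL": False, "XL": False},
}

def can_grant(requested, held_locks):
    """True if `requested` is compatible with all locks held on the element
    (an empty list means the element is free)."""
    return all(GRANT[h][requested] for h in held_locks)

print(can_grant("IXL", ["ISL"]))   # True: two intention locks coexist
print(can_grant("SL", ["IXL"]))    # False: someone intends to write below
print(can_grant("SIXL", ["ISL"]))  # True: shared + intention-to-write is allowed
```

Intention locks are pairwise compatible because the real conflict, if any, will surface at the finer level where the actual SL/XL locks are taken.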

Example

• Root = TableX; Page 1 (P1) contains tuples t1, t2, t3, t4 and Page 2 (P2) contains tuples t5, t6, t7, t8

• Transaction 1: read(P1), write(t3), read(t8)
• Transaction 2: read(t2), read(t4), write(t5), write(t6)

• They are NOT in r-w conflict (independently of the order)!
Lock Sequences - Transaction 1

• read(P1): ISL1(root), SL1(P1)
• write(t3): root upgraded from ISL1 to IXL1, P1 upgraded from SL1 to SIXL1, XL1(t3)
• read(t8): ISL1(P2), SL1(t8)
• Final state: IXL1(root), SIXL1(P1), XL1(t3), ISL1(P2), SL1(t8)

Lock Sequences - Transaction 2

• read(t2): ISL2(root), ISL2(P1), SL2(t2)
• read(t4): SL2(t4)
• write(t5): root upgraded from ISL2 to IXL2, IXL2(P2), XL2(t5)
• write(t6): XL2(t6)
• Final state: IXL2(root), ISL2(P1), SL2(t2), SL2(t4), IXL2(P2), XL2(t5), XL2(t6)

Lock Sequences - interleaved execution

• read2(t2): ISL2(root), ISL2(P1), SL2(t2)
• read1(P1): ISL1(root), SL1(P1) (compatible with ISL2 on P1)
• write1(t5): root upgraded from ISL1 to IXL1, IXL1(P2), XL1(t5)
• read2(t4): SL2(t4)
• All requests are compatible: no transaction waits

Lock Sequences - let's change t5 into t3

• read2(t2): ISL2(root), ISL2(P1), SL2(t2)
• read1(P1): ISL1(root), SL1(P1)
• write1(t3): root upgraded from ISL1 to IXL1, P1 upgraded from SL1 to SIXL1 (compatible with ISL2), XL1(t3)

Lock Sequences - let's change w1 into w2

• read2(t2): ISL2(root), ISL2(P1), SL2(t2)
• read1(P1): ISL1(root), SL1(P1)
• write2(t3): root upgraded from ISL2 to IXL2 (granted), then IXL2 is requested on P1: conflict with SL1! T2 waits
Concurrency Control Based on
Timestamps
• Locking is also named pessimistic concurrency control because
it assumes that collisions (transactions reading-writing the same
object concurrently) will arise
• Assumption: conflicts occur  lock the records to prevent them
• Alternative and complementary to 2PL (and to locking in
general) are optimistic concurrency control methods
• Assumption: conflicts are rare  run the transaction and validate the
operations before commit (normal validation) or before each operation
(early validation)
• Timestamp:
• Identifier that defines a total ordering of the events of a system
• Each transaction has a timestamp representing the time at
which the transaction begins so that transactions can be
ordered by “birth date”: smaller index  older transaction
• A schedule is accepted only if it reflects the serial ordering of
the transactions induced by their timestamps
TS concurrency control principles

• The scheduler maintains two counters, RTM(x) and WTM(x), for each object
  • RTM(x) = timestamp of the transaction with the highest ts that has read x (e.g., after r2(x), r3(x), r1(x): RTM = 3)
  • WTM(x) = timestamp of the transaction that did the last write on x
• The scheduler receives read/write requests tagged with the timestamp of the requesting transaction:
  • rts(x):
    • If ts < WTM(x), the request is rejected and the transaction is killed (e.g., w1(x), w3(x), r2(x))
    • Else, access is granted and RTM(x) is set to max(RTM(x), ts)
  • wts(x):
    • If ts < RTM(x) or ts < WTM(x), the request is rejected and the transaction is killed (e.g., r2(x), w1(x) or w2(x), w1(x))
    • Else, access is granted and WTM(x) is set to ts
• Many transactions are killed
Example

Assume RTM(x) = 7 (previously: r7(x)) and WTM(x) = 4 (previously: w4(x))

Request   Response                   RTM(x)   WTM(x)
                                     7        4
r6(x)     ok (6 > 4)
r8(x)     ok (8 > 4)                 8
r9(x)     ok (9 > 4)                 9
w8(x)     no (8 < 9), T8 killed
w11(x)    ok (11 > 9 and 11 > 4)              11
r10(x)    no (10 < 11), T10 killed
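The two rules can be sketched as a tiny scheduler in Python (a simplification: killed transactions are just reported, not restarted; class and method names are mine). Replaying the table above reproduces the same outcomes:

```python
class TSScheduler:
    """Basic TS check: one RTM/WTM pair per object."""
    def __init__(self):
        self.rtm, self.wtm = {}, {}

    def read(self, ts, x):
        if ts < self.wtm.get(x, 0):
            return "killed"                       # x was overwritten by a younger tx
        self.rtm[x] = max(self.rtm.get(x, 0), ts)
        return "ok"

    def write(self, ts, x):
        if ts < self.rtm.get(x, 0) or ts < self.wtm.get(x, 0):
            return "killed"                       # a younger tx already read/wrote x
        self.wtm[x] = ts
        return "ok"

# Starting state from the slide: RTM(x) = 7, WTM(x) = 4
s = TSScheduler()
s.rtm["x"], s.wtm["x"] = 7, 4
print(s.read(6, "x"))    # ok     (6 > 4)
print(s.read(8, "x"))    # ok     -> RTM = 8
print(s.read(9, "x"))    # ok     -> RTM = 9
print(s.write(8, "x"))   # killed (8 < 9)
print(s.write(11, "x"))  # ok     -> WTM = 11
print(s.read(10, "x"))   # killed (10 < 11)
```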
2PL vs. TS

• They are incomparable
• Schedule in TS but not in 2PL:
  • r1(x) w1(x) r2(x) w2(x) r0(y) w1(y)
• Schedule in 2PL but not in TS:
  • r2(x) w2(x) r1(x) w1(x)
  • This is serial!
• Schedule both in TS and in 2PL (and not serial):
  • r1(x) r2(y) w2(y) w1(x) r2(x) w2(x)
TS and CSR
• TS => CSR
• Let S be a TS schedule of T1 and T2
• Suppose S is not CSR, which implies that it contains a cycle
between T1 and T2
• S contains op1(x), op2(x) where at least one of the opi is a write
• S contains also op2(y), op1(y) where at least one of the opi is a
write
• When op1(y) arrives:
• If op1(y) is a read, T1 is killed by TS because it tries to read a value
written by a younger transaction [ts < WTM(y)] [1 < 2] 
CONTRADICTION
• If op1(y) is a write, T1 is killed no matter what op2(y) is because it
tries to write a value already read or written by a younger
transaction [ts < R/WTM(y)] [1 < 2]  CONTRADICTION
CSR, VSR, 2PL and TS

(Figure: Venn diagram of schedule classes: 2PL and TS are overlapping, incomparable subsets of CSR; Serial overlaps both; CSR ⊂ VSR ⊂ all schedules.)
TS and dirty reads
• Basic TS-based control considers only committed
transactions in the schedule, aborted transactions
are not considered (commit-projection hypothesis)
• If aborts occur, dirty reads may happen
• To cope with dirty reads, a variant of basic TS must be used (e.g., with w1(x), r2(x) and T1 aborting, r2 read a dirty value)
• A transaction Ti that issues a rts(x) such that ts > WTM(x)
(i.e., acceptable) has its read operation delayed until the
transaction T’ that wrote the values of x has committed
or aborted
• Similar to long duration write locks
• But…buffering operations introduces delays
2PL vs. TS
• The serialization order with 2PL is imposed by conflicts,
while in TS it is imposed by the timestamps
• In 2PL transactions can be actively waiting. In TS they
are killed and restarted
• The necessity of waiting for commit of transactions
causes long delays in strict 2PL
• 2PL can cause deadlocks, TS can be used to prevent
deadlocks with the wound-wait and wait-die schemes
• Restarting a transaction costs more than waiting: 2PL
wins!
• Commercial systems implement a mix of optimistic and
pessimistic concurrency control (e.g., Strict 2PL or 2PL +
Multi Version TS)
Reducing kill rate: Thomas Rule

• The scheduler has two counters, RTM(x) and WTM(x), for each object
• The scheduler receives read/write requests tagged with timestamps:
  • rts(x):
    • If ts < WTM(x), the request is rejected and the transaction is killed
    • Else, access is granted and RTM(x) is set to max(RTM(x), ts)
  • wts(x):
    • If ts < RTM(x), the request is rejected and the transaction is killed (e.g., r2(x), w1(x))
    • Else, if ts < WTM(x), the write is "obsolete" and can be skipped (e.g., w2(x), w1(x))
    • Else, access is granted and WTM(x) is set to ts
• Rationale: skipping a write on an object that has already been written by a younger transaction, without killing the writing transaction
• Does this modification affect the taxonomy of the serialization classes?
TS' (TS with Thomas Rule)

(Figure: the Venn diagram again, with a question mark: where does TS' fall?)

• Consider r1(y) r2(x) w3(y) w2(y) w3(x) w4(y)
  • on x: r2 w3; on y: r1 w3 w2 w4: not CSR! (T2 → T3 on x, T3 → T2 on y)
• w2(y) is an obsolete write and is thus skipped
• The schedule reflects the behavior of: r1(y) r2(x) w2(y) w3(y) w3(x) w4(y)
TS' (TS with Thomas Rule)

(Figure: the Venn diagram again; TS' also accepts schedules outside VSR.)

• Consider w2(x) w1(x) r2(x)
  • w1(x) is skipped by Thomas Rule
• The schedule is not in VSR:
  • vs. serial T1, T2 (w1(x) w2(x) r2(x)): different final write relation
  • vs. serial T2, T1 (w2(x) r2(x) w1(x)): different reads-from relation for r2(x)
Multiversion Concurrency Control
• Idea: writes generate new versions, reads access
the “right” version
• Writes generate new copies, each one with a new
WTM. Each object x always has N>=1 active
versions
• The i-th version of x has its write timestamp WTMi(x)
• There is a unique global RTM(x)
• Old versions are discarded when there are no
transactions that need their values

Example in theory:
TS-Multi allowing unordered writes
• Mechanism:
• rts(x) is always accepted. A copy xk is selected for
reading such that:
• If ts >= WTMN(x), then k = N
• Else take k such that WTMk(x) <= ts < WTMk+1(x)
• wts(x):
• If ts < RTM(x) the request is rejected
• Else a new version is created for timestamp ts (N is
incremented)
• WTM1(x), …, WTMN(x) are the new versions, kept sorted
from oldest to youngest
• NB: this version shows what can be done in theory but is
not the one used in the exercises
Example in theory: TS-Multi allowing unordered writes

Assume RTM(x) = 7, N = 1, WTM1(x) = 4

Request   Response                RTM(x)   Versions
r6(x)     ok                      7        WTM1(x) = 4, N = 1
r8(x)     ok                      8
r9(x)     ok                      9
w8(x)     no, T8 killed
w11(x)    ok                               WTM2(x) = 11, N = 2
r10(x)    ok on x1 (not killed)   10
r12(x)    ok on x2                12
w14(x)    ok                               WTM3(x) = 14, N = 3
w13(x)    ok (not killed)                  inserting 13 requires resorting:
                                           WTM3(x) = 13, WTM4(x) = 14, N = 4
2nd version (used in practice):
TS-Multi under Snapshot Isolation
• Mechanism:
• rts(x) is always accepted. A copy xk is selected for
reading such that:
• If ts >= WTMN(x), then k = N
• Else take k such that WTMk(x) <= ts < WTMk+1(x)
• wts(x):
• If ts < RTM(x) or ts < WTMN(x) the request is rejected
• Else a new version is created for timestamp ts (N is
incremented)
• WTM1(x), …, WTMN(x) are the new versions, kept sorted
from oldest to youngest
• NB: this version is used in real systems based, e.g., on
snapshot isolation (see later) and in the exercises
Example in practice (for the exam): TS-Multi under Snapshot Isolation

Assume RTM(x) = 7, N = 1, WTM1(x) = 4

Request   Response         RTM(x)   Versions
r6(x)     ok               7        WTM1(x) = 4, N = 1
r8(x)     ok               8
r9(x)     ok               9
w8(x)     no, T8 killed
w11(x)    ok                        WTM2(x) = 11, N = 2
r10(x)    ok on x1         10
r12(x)    ok on x2         12
w14(x)    ok                        WTM3(x) = 14, N = 3
w13(x)    no, T13 killed*

* in the exercises!!!
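A sketch of this "exercise" variant in Python (version indices are 1-based as in the slides; the class and method names are mine). Replaying the example above:

```python
import bisect

class MultiVersionTS:
    """TS-Multi, snapshot-isolation flavour: writes behind RTM(x) or behind
    the newest version are rejected; reads are always accepted."""
    def __init__(self, wtm0, rtm0):
        self.wtms = [wtm0]   # write timestamps of the N versions, sorted ascending
        self.rtm = rtm0      # unique global RTM(x)

    def read(self, ts):
        """Always accepted; returns the 1-based index k of the version read,
        chosen so that WTMk(x) <= ts < WTMk+1(x)."""
        self.rtm = max(self.rtm, ts)
        return bisect.bisect_right(self.wtms, ts)

    def write(self, ts):
        if ts < self.rtm or ts < self.wtms[-1]:
            return "killed"
        self.wtms.append(ts)  # new version, N incremented
        return "ok"

x = MultiVersionTS(wtm0=4, rtm0=7)   # the slide's starting state
print(x.read(6))    # 1: reads x1 (WTM1 = 4)
print(x.read(9))    # 1: RTM(x) becomes 9
print(x.write(8))   # killed (8 < RTM = 9)
print(x.write(11))  # ok: versions [4, 11]
print(x.read(10))   # 1: still reads x1 (4 <= 10 < 11), not killed
print(x.read(12))   # 2: reads x2 (WTM2 = 11)
print(x.write(14))  # ok: versions [4, 11, 14]
print(x.write(13))  # killed (13 < 14): the rule used in the exercises
```

Swapping the `ts < self.wtms[-1]` test for an in-order insertion would give the "in theory" variant that accepts w13(x).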
CSR, VSR, 2PL, TSmono, TSmulti

(Figure: Venn diagram; TS(multi) extends beyond VSR, while TS(mono), 2PL, Serial and CSR nest inside VSR as before.)

• Consider, on x: w1 w2 r1
  • Versions: x0 (original value), x1, x2
  • r1 reads x1 [WTMk(x) <= ts < WTMk+1(x)]
• The schedule is not in VSR (under single-version semantics r1 would read from w2):
  • vs. serial T1, T2 (w1 r1 w2): different reads-from relation for r1(x)
  • vs. serial T2, T1 (w2 w1 r1): different final write relation
Snapshot Isolation (SI)
• The realization of multi-TS gives the opportunity to introduce into
DBMSs (e.g., Oracle, MySQL, PostgreSQL, MongoDB, Microsoft SQL
Server) another isolation level, SNAPSHOT ISOLATION
• In this level, no RTM is used on the objects, only WTMs
• Every transaction reads the version consistent with its timestamp (i.e.,
the version that existed when the transaction started a.k.a. snapshot),
and defers writes to the end
• Write: when a transaction attempts to write e.g., on a row, it first checks
whether any other transactions have modified that row since it began. If
there has been a modification (i.e., if the snapshot view is no longer
valid), the transaction is rolled back or retried, depending on the
DBMS's implementation.
• Read operations in a transaction do not block write operations in other
transactions. Transactions can read data without waiting for other
transactions to complete, improving performance and concurrency.
• It is yet another case of optimistic concurrency control
Anomalies in Snapshot Isolation
• Snapshot isolation does not guarantee serializability
• T1: update Balls set Color=White where Color=Black
• T2: update Balls set Color=Black where Color=White
• Serializable executions of T1 and T2 will produce a final
configuration with balls that are either all white or all
black
• An execution under Snapshot Isolation in which the
two transactions start with the same snapshot will just
swap the two colors
• This anomaly is called write skew
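The ball-swapping anomaly can be simulated in a few lines of Python (a toy model of SI, with made-up data: each transaction computes its writes on its own snapshot, and since the write sets are disjoint, both commits pass a first-committer-wins check):

```python
balls = {"a": "black", "b": "white"}
snapshot1 = dict(balls)   # T1's snapshot at its start
snapshot2 = dict(balls)   # T2's snapshot at its start

# T1: black -> white, computed on T1's snapshot
writes1 = {k: "white" for k, v in snapshot1.items() if v == "black"}
# T2: white -> black, computed on T2's snapshot
writes2 = {k: "black" for k, v in snapshot2.items() if v == "white"}

# The write sets touch different rows, so neither transaction sees a
# conflicting modification and both are allowed to commit
assert not (writes1.keys() & writes2.keys())
balls.update(writes1)
balls.update(writes2)
print(balls)  # {'a': 'white', 'b': 'black'}: colors swapped, not all one color
```

Any serializable execution would end with all balls the same color; the swapped result is the write skew.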
Assigning timestamps in
distributed systems
• Timestamp: an indicator of the “current time”
• Assumption: no “global time” is available
• Mechanism: a system’s function gives out timestamps on requests
• Syntax: timestamp = event-id.node-id
• event-ids are unique at each node
• Note that the notion of time is “lexical”:
timestamp 5.1 “occurs before” timestamp 5.2
• Synchronization: send-receive of messages
• for a given message m, send(m) precedes receive(m)
• Algorithm (Lamport method): cannot receive a message from “the
future”, if this happens the “bumping rule” is used to bump the
timestamp of the receive event beyond the timestamp of the send
event
• Mnemonically: if I receive a message from you that has a timestamp
greater than my last emitted one I update my current timestamp to
exceed yours
Example of timestamp assignment

(Figure: two timelines exchanging messages. Node 1 events: A(1.1) B(2.1) C(3.1) D(4.1) E(5.1) F(6.1) G(8.1) H(9.1) I(10.1). Node 2 events: X(1.2) Y(2.2) Z(3.2) T(5.2) P(6.2) U(7.2) V(10.2).)

• Events Y and F represent messages received "from the present or past" (e.g., 4.2 < 5.1): OK, the local timestamp is aligned without bumping
• Events T, G and V represent messages received "from the future": the local timestamp is incremented (bumped) accordingly, leaving a "hole" in the local event sequence (e.g., 7.1 is skipped at node 1)
• Note that the timestamp of the receive events T, G and V is generated so as to exceed that of the send event: G(8.1) > U(7.2)
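The bumping rule can be sketched as follows (Python; the class is illustrative, with timestamps as (counter, node) pairs so that tuple comparison gives the "lexical" order in which 5.1 occurs before 5.2):

```python
class LamportClock:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counter = 0

    def event(self):
        """A local event (including a send): take the next local timestamp."""
        self.counter += 1
        return (self.counter, self.node_id)

    def receive(self, send_ts):
        """Bumping rule: the receive event must exceed the send timestamp."""
        self.counter = max(self.counter, send_ts[0]) + 1
        return (self.counter, self.node_id)

n1, n2 = LamportClock(1), LamportClock(2)
n1.counter = 6   # node 1 has emitted F(6.1)
n2.counter = 7   # node 2 has emitted U(7.2)
print(n1.receive((7, 2)))  # (8, 1): bumped past U(7.2), like G(8.1); 7.1 is a hole
print(n2.receive((3, 1)))  # (8, 2): a message from the past needs no bump beyond +1
```

Note that `max(counter, send) + 1` covers both cases of the slide: messages from the past just take the next local counter, while messages from the future force the jump.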
Concurrency control in some commercial DBMS

• ORACLE
  • https://docs.oracle.com/cd/B19306_01/server.102/b14220/consist.htm
• MYSQL
  • https://dev.mysql.com/doc/refman/8.0/en/locking-issues.html
• IBM DB2
  • https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.admin.perf.doc/doc/c0054923.html
• MONGODB
  • https://docs.mongodb.com/manual/faq/concurrency/