18CSC303J DBMS Unit-V
Course Learning Rationale (CLR-6): Understand the practical problems of concurrency control and gain knowledge about failures and recovery
Concurrency control
Deadlock
Required Properties of a Transaction
• Consider a transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement
– If the transaction fails after step 3 and before step 6, money will be “lost” leading to an inconsistent
database state
• Failure could be due to software or hardware
– The system should ensure that updates of a partially executed transaction are not reflected in the database
• Durability requirement — once the user has been notified that the transaction has completed (i.e., the transfer
of the $50 has taken place), the updates to the database by the transaction must persist even if there are
software or hardware failures.
• Consistency requirement — in the above example, the sum of A and B must be unchanged by the execution of
the transaction. In general, integrity constraints must be preserved:
– e.g., the sum of balances of all accounts, minus the sum of loan amounts, must
equal the value of cash-in-hand
• A transaction, when starting to execute, must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully the database must be consistent
– Erroneous transaction logic can lead to inconsistency
Required Properties of a Transaction (Cont.)
• Isolation requirement — if between steps 3 and 6 (of the fund transfer transaction),
another transaction T2 is allowed to access the partially updated database, it will see an
inconsistent database (the sum A + B will be less than it should be).
T1 T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially
– That is, one after the other.
• However, executing multiple transactions concurrently has significant benefits, as we
will see later.
Required Properties of a Transaction (Cont.)
• Isolation requirement
• Let X = 500 and Y = 500, and consider two transactions T and T''. T performs Read(X); X := X * 100;
Write(X); Read(Y); Y := Y – 50; Write(Y), while T'' reads X and Y and prints their sum.
• Suppose T has executed up to Read(Y) and then T'' starts. Because of this interleaving of operations,
T'' reads the new value of X but the old value of Y, so the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is not consistent with the sum at the end of transaction
T: (X + Y = 50,000 + 450 = 50,450).
This 50-unit discrepancy means T'' has observed an inconsistent database state. Hence, transactions must
execute in isolation, and their changes should become visible to other transactions only after the
transaction has completed.
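A minimal Python sketch (not from the slides) that replays this interleaving step by step; the operations of T (X := X * 100, Y := Y – 50) are taken from the example above, and the variable names are illustrative.

db = {"X": 500, "Y": 500}

# T executes up to and including Write(X); it has not yet written Y back.
x = db["X"]                          # Read(X)
x = x * 100                          # X := X * 100
db["X"] = x                          # Write(X)
y = db["Y"]                          # Read(Y) -- T pauses here

# T'' runs against the partially updated database.
sum_seen_by_t2 = db["X"] + db["Y"]   # 50000 + 500 = 50500 (inconsistent view)

# T resumes and finishes.
y = y - 50                           # Y := Y - 50
db["Y"] = y                          # Write(Y)
consistent_sum = db["X"] + db["Y"]   # 50000 + 450 = 50450

print(sum_seen_by_t2, consistent_sum)   # 50500 50450 -- off by the 50 units discussed above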
ACID Properties
A transaction is a unit of program execution that accesses and possibly updates various
data items. To preserve the integrity of data the database system must ensure:
• Atomicity. Either all operations of the transaction are properly reflected in the
database or none are.
• Consistency. Execution of a transaction in isolation preserves the consistency of the
database.
• Isolation. Although multiple transactions may execute concurrently, each transaction
must be unaware of other concurrently executing transactions. Intermediate
transaction results must be hidden from other concurrently executed transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution
before Ti started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it has made to the
database persist, even if there are system failures.
Transaction State
• Active – the initial state; the transaction stays in this state while it is executing
• Failed -- after the discovery that normal execution can no longer proceed.
• Aborted – after the transaction has been rolled back and the database restored
to its state prior to the start of the transaction. Two options after it has been
aborted:
– Restart the transaction
• can be done only if no internal logical error
– Kill the transaction
• System recovery
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the system. Advantages are:
– Increased processor and disk utilization, leading to better transaction throughput
• E.g., one transaction can be using the CPU while another is reading from or writing to the disk
Schedules
• Schedule – a sequence of instructions that specifies the chronological order in which instructions of
concurrent transactions are executed
– A schedule for a set of transactions must consist of all instructions of those transactions
– Must preserve the order in which the instructions appear in each individual transaction.
• A transaction that successfully completes its execution will have a commit instruction as the
last statement
• A transaction that fails to successfully complete its execution will have an abort instruction
as the last statement
Conflict Serializability
• If a schedule S can be transformed into a schedule S' by a series of swaps of non-conflicting
instructions, we say that S and S' are conflict equivalent.
• We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
Conflict Serializability (Cont.)
• Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by a series of
swaps of non-conflicting instructions; therefore, Schedule 3 is conflict serializable.
(Figure: Schedule 3 and Schedule 6 shown side by side.)
Conflict Serializability (Cont.)
• Example of a schedule that is not conflict serializable:
– We are unable to swap instructions in this schedule to obtain either the serial schedule < T3, T4 >
or the serial schedule < T4, T3 >.
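One standard way to test conflict serializability is the precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with and precedes an operation of Tj, then check for a cycle. A minimal Python sketch of this test (an illustration, not part of the slides; the schedule encoding is an assumption):

# A schedule is a list of (transaction, operation, data item) steps, in execution order.
def precedence_graph(schedule):
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            if x_i == x_j and ti != tj and "write" in (op_i, op_j):
                edges.add((ti, tj))        # ti's step precedes and conflicts with tj's
    return edges

def has_cycle(edges):
    # Depth-first search over the precedence graph.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()
    def dfs(n):
        if n in visiting:
            return True                    # back edge found: cycle
        if n in done:
            return False
        visiting.add(n)
        cycle = any(dfs(m) for m in graph.get(n, ()))
        visiting.discard(n)
        done.add(n)
        return cycle
    return any(dfs(n) for n in graph)

# The non-serializable example above: T3 and T4 access Q in an interleaved order.
s = [("T3", "read", "Q"), ("T4", "write", "Q"), ("T3", "write", "Q")]
print(has_cycle(precedence_graph(s)))      # True -> not conflict serializable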
Transaction States
Summary
• Transaction Concepts
• Properties of Transaction
• Transaction States
Concurrent Executions
– Reduced average response time for transactions: short transactions need not wait
behind long ones.
• A transaction that fails to successfully complete its execution will have an abort
instruction as the last statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.
• An example of a serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule in which T2 is followed by T1 :
Schedule 3
• Let T1 and T2 be the transactions defined previously. The following schedule is
not a serial schedule, but it is equivalent to Schedule 1.
• With concurrent transactions, all transactions share a single disk buffer and a single
log
– A buffer block can have data items updated by one or more transactions
– Can be ensured by obtaining exclusive locks on updated items and holding the locks till
end of transaction (strict two-phase locking)
• Undo of a log record <Ti, X, V1, V2> writes the old value V1 to X
• Redo of a log record <Ti, X, V1, V2> writes the new value V2 to X
• undo(Ti) restores the value of all data items updated by Ti to their old values; each time a data item X
is restored to its old value V, a special log record <Ti, X, V> is written out
• redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first
log record for Ti
• Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain either
the record <Ti commit> or <Ti abort>.
• Note that if transaction Ti was undone earlier and the <Ti abort> record was written to the log, and then
a failure occurs, on recovery Ti is redone; such a redo repeats all its original actions, including the
steps that restored old values ("repeating history").
(a) undo (T0): B is restored to 2000 and A to 1000, and log records <T0, B, 2000>, <T0, A, 1000>, <T0, abort> are
written out.
(b) redo (T0) and undo (T1): A and B are set to 950 and 2050 and C is restored to 700. Log records <T1, C, 700>, <T1, abort> are
written out.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050, respectively; then C is set to 600.
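A minimal Python sketch (not from the slides) of undo/redo applied to a log in the <Ti, X, V1, V2> format described above; the tuple layout and variable names are illustrative assumptions.

def undo(ti, log, db):
    # Scan the log backwards; restore old values and write redo-only records <Ti, X, V>.
    for rec in reversed(list(log)):
        if rec[0] == ti and len(rec) == 4:
            _, x, v_old, _ = rec
            db[x] = v_old
            log.append((ti, x, v_old))
    log.append((ti, "abort"))

def redo(ti, log, db):
    # Scan the log forwards from <Ti start>; re-apply the new values.
    started = False
    for rec in list(log):
        if rec == (ti, "start"):
            started = True
        elif started and rec[0] == ti and len(rec) == 4:
            _, x, _, v_new = rec
            db[x] = v_new

db = {"A": 1000, "B": 2000, "C": 700}          # one possible on-disk state at crash time
log = [("T0", "start"), ("T0", "A", 1000, 950), ("T0", "B", 2000, 2050), ("T0", "commit"),
       ("T1", "start"), ("T1", "C", 700, 600)]
redo("T0", log, db)                            # case (b): T0 committed, so redo it ...
undo("T1", log, db)                            # ... and undo the incomplete T1
print(db)                                      # {'A': 950, 'B': 2050, 'C': 700}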
Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes:
1. exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X
instruction.
2. shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction.
• Lock requests are made to the concurrency-control manager by the programmer. Transaction can
proceed only after the request is granted.
• Lock-compatibility matrix:
        S       X
  S     true    false
  X     false   false
• A transaction may be granted a lock on an item if the requested lock is compatible with
locks already held on the item by other transactions
• Any number of transactions can hold shared locks on an item,
– But if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on
the item.
• If a lock cannot be granted, the requesting transaction is made to wait till all incompatible
locks held by other transactions have been released. The lock is then granted.
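A minimal Python sketch (not from the slides) of this granting rule, using the compatibility matrix above; a real lock manager would make the requester wait instead of returning False, and the names here are illustrative.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

held = {}   # data item -> list of (transaction, mode) locks currently held

def can_grant(item, requester, mode):
    return all(owner == requester or COMPATIBLE[(m, mode)]
               for owner, m in held.get(item, []))

def lock(item, requester, mode):
    if can_grant(item, requester, mode):
        held.setdefault(item, []).append((requester, mode))
        return True
    return False        # the real manager would block the transaction here

print(lock("A", "T1", "S"), lock("A", "T2", "S"), lock("A", "T3", "X"))
# True True False -- any number of S locks, but the X request must wait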
Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
• Locking as above is not sufficient to guarantee serializability — if A and B get updated in-between
the read of A and B, the displayed sum would be wrong.
• A locking protocol is a set of rules followed by all transactions while requesting and releasing
locks. Locking protocols restrict the set of possible schedules.
The Two-Phase Locking Protocol
• The protocol assures serializability. It can be proved that the transactions can be serialized in the
order of their lock points (i.e., the point where a transaction acquired its final lock).
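A minimal Python sketch (not from the slides) of the two-phase rule itself: once a transaction has released any lock (shrinking phase), further acquisitions are rejected. The class and names are illustrative assumptions.

class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False    # becomes True once the lock point has passed

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: acquire after release violates 2PL")
        self.locks.add(item)      # growing phase

    def release(self, item):
        self.shrinking = True     # entering the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTxn("T1")
t.acquire("A"); t.acquire("B")    # growing phase
t.release("A")                    # lock point has passed
try:
    t.acquire("C")                # not allowed under 2PL
except RuntimeError as e:
    print(e)                      # T1: acquire after release violates 2PL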
The Two-Phase Locking Protocol (Cont.)
• There can be conflict serializable schedules that cannot be obtained if two-phase locking is used.
• However, in the absence of extra information (e.g., ordering of access to data), two-phase locking is
necessary for conflict serializability in the following sense:
– Given a transaction Ti that does not follow two-phase locking, we can find a transaction
Tj that uses two-phase locking, and a schedule for Ti and Tj that is not conflict
serializable.
Lock Conversions
• Two-phase locking with lock conversions:
– First Phase:
• can acquire a lock-S or a lock-X on an item
• can convert a lock-S to a lock-X (upgrade)
– Second Phase:
• can release a lock-S or a lock-X
• can convert a lock-X to a lock-S (downgrade)
• This protocol assures serializability. But still relies on the programmer to insert the
various locking instructions.
Automatic Acquisition of Locks
• A transaction Ti issues the standard read/write instruction, without explicit locking calls.
• The operation read(D) is processed as:
if Ti has a lock on D
then
read(D)
else begin
if necessary wait until no other
transaction has a lock-X on D
grant Ti a lock-S on D;
read(D)
end
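A minimal Python sketch (not from the slides) of the same idea, together with the corresponding write(D) path that acquires (or upgrades to) lock-X; the lock table is a single-threaded simplification and the waiting behaviour is only indicated in comments.

locks = {}   # data item -> {"mode": "S" or "X", "holders": set of transaction ids}

def auto_read(ti, d, db):
    entry = locks.get(d)
    if entry is None or ti not in entry["holders"]:
        # In a real system Ti would wait here until no other transaction holds lock-X on D.
        locks.setdefault(d, {"mode": "S", "holders": set()})["holders"].add(ti)
    return db[d]

def auto_write(ti, d, value, db):
    entry = locks.get(d)
    if entry is None or entry["mode"] != "X" or ti not in entry["holders"]:
        # In a real system Ti would wait until no other transaction holds any lock on D,
        # then acquire lock-X (upgrading an existing lock-S held by Ti).
        locks[d] = {"mode": "X", "holders": {ti}}
    db[d] = value

db = {"D": 42}
print(auto_read("T1", "D", db))    # 42; T1 now holds lock-S on D
auto_write("T1", "D", 43, db)      # T1's lock on D is upgraded to lock-X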
Automatic Acquisition of Locks (Cont.)
• Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while
executing lock-X(A) causes T3 to wait for T4 to release its lock on A. Such a situation is called a deadlock; to
handle it, one of T3 or T4 must be rolled back and its locks released.
• The potential for deadlock exists in most locking protocols. Deadlocks are a necessary evil.
• Cascading roll-back is possible under two-phase locking. To avoid this, follow a modified protocol
called strict two-phase locking -- a transaction must hold all its exclusive locks till it
commits/aborts.
• Rigorous two-phase locking is even stricter. Here, all locks are held till commit/abort. In this
protocol, transactions can be serialized in the order in which they commit.
• Deadlock prevention protocols ensure that the system will never enter into a deadlock state. Some
prevention strategies:
– Require that each transaction locks all its data items before it begins execution
(predeclaration).
– Impose partial ordering of all data items and require that a transaction can lock data
items only in the order specified by the partial order.
• Deadlocks can be detected using a wait-for graph, whose vertices are the transactions and whose edges E
are defined as follows:
• If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for
Tj to release a data item.
• When Ti requests a data item currently being held by Tj, the edge Ti → Tj is
inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a
data item needed by Ti.
• The system is in a deadlock state if and only if the wait-for graph has a cycle. Must
invoke a deadlock-detection algorithm periodically to look for cycles.
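A minimal Python sketch (not from the slides) of this detection scheme: record the wait-for edges and report a deadlock when some transaction can reach itself along the edges.

wait_for = {}    # Ti -> set of Tj that Ti is waiting for

def add_wait(ti, tj):
    wait_for.setdefault(ti, set()).add(tj)

def deadlocked():
    def reachable(start, target, seen):
        for nxt in wait_for.get(start, ()):
            if nxt == target or (nxt not in seen and reachable(nxt, target, seen | {nxt})):
                return True
        return False
    # A cycle exists iff some transaction can reach itself.
    return any(reachable(t, t, {t}) for t in wait_for)

add_wait("T1", "T2"); add_wait("T2", "T3")
print(deadlocked())          # False
add_wait("T3", "T1")
print(deadlocked())          # True -- the cycle T1 -> T2 -> T3 -> T1 is a deadlock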
Deadlock Detection (Cont.)
• More effective to roll back transaction only as far as necessary to break deadlock.
Multiple Granularity
• When a transaction locks a node in the tree explicitly, it implicitly locks all the
node's descendents in the same mode.
• Granularity of locking (level in tree where locking is done):
– fine granularity (lower in tree): high concurrency, high locking overhead
– coarse granularity (higher in tree): low locking overhead, low concurrency
• In addition to S and X lock modes, there are three additional lock modes with multiple granularity:
– intention-shared (IS): indicates explicit locking at a lower level of the tree but only with shared
locks.
– intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared locks
– shared and intention-exclusive (SIX): the sub-tree rooted by that node is locked explicitly in
shared mode and explicit locking is being done at a lower level with exclusive-mode locks.
• Intention locks allow a higher-level node to be locked in S or X mode without having to
check all descendant nodes.
• Transaction Ti can lock a node Q using the following rules:
1. The lock-compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in any mode.
3. A node Q can be locked by Ti in S or IS mode only if the parent of Q is currently locked by Ti in either
IX or IS mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the parent of Q is currently locked by Ti in
either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any node (that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are currently locked by Ti.
• Observe that locks are acquired in root-to-leaf order, whereas they are released in leaf-to-
root order.
• Lock granularity escalation: in case there are too many locks at a particular level, switch to
a coarser-granularity S or X lock
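A minimal Python sketch (not from the slides) of the parent-lock check in rules 3 and 4 above; the node names and the per-transaction lock table are illustrative assumptions.

# Which lock modes on the parent permit a given lock mode on a child node.
REQUIRED_ON_PARENT = {"S": {"IS", "IX"}, "IS": {"IS", "IX"},
                      "X": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "IX": {"IX", "SIX"}}

def can_lock(node, mode, parent_of, my_locks):
    parent = parent_of.get(node)
    if parent is None:                 # the root has no parent to check
        return True
    return my_locks.get(parent) in REQUIRED_ON_PARENT[mode]

parent_of = {"file_A": "database", "record_ra1": "file_A"}
my_locks = {"database": "IX", "file_A": "IX"}          # locks already held root-to-leaf
print(can_lock("record_ra1", "X", parent_of, my_locks))   # True: parent holds IX
print(can_lock("record_ra1", "S", parent_of, my_locks))   # True: IX on the parent also allows S/IS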
Recovery System
Recovery System
• Failure Classification
• Storage Structure
• Recovery and Atomicity
• Log-Based Recovery
• Remote Backup Systems
Failure Classification
• Transaction failure :
– Logical errors: transaction cannot complete due to some internal error condition
– System errors: the database system must terminate an active transaction due to an error
• System crash: a power failure or other hardware or software failure causes the
system to crash.
– Fail-stop assumption: non-volatile storage contents are assumed to not be corrupted by
system crash
• Database systems have numerous integrity checks to prevent corruption of disk data
• Disk failure: a head crash or similar disk failure destroys all or part of disk storage
– Destruction is assumed to be detectable: disk drives use checksums to detect failures
Recovery Algorithms
• Consider transaction Ti that transfers $50 from account A to account B
– Two updates: subtract 50 from A and add 50 to B
• Transaction Ti requires updates to A and B to be output to the database.
– A failure may occur after one of these modifications have been made but before
both of them are made.
– Modifying the database without ensuring that the transaction will commit may
leave the database in an inconsistent state
– Not modifying the database may result in lost updates if failure occurs just after
transaction commits
• Recovery algorithms have two parts
1. Actions taken during normal transaction processing to ensure enough information
exists to recover from failures
2. Actions taken after a failure to recover the database contents to a state that ensures
atomicity, consistency and durability
Storage Structure
• Volatile storage:
– does not survive system crashes
– examples: main memory, cache memory
• Nonvolatile storage:
– survives system crashes
– examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM
– but may still fail, losing data
• Stable storage:
– a mythical form of storage that survives all failures
– approximated by maintaining multiple copies on distinct nonvolatile media
– See book for more details on how to implement stable storage
Stable-Storage Implementation
• Maintain multiple copies of each block on separate disks
– copies can be at remote sites to protect against disasters such as fire or flooding.
• Failure during data transfer can still result in inconsistent copies: Block transfer can result in
– Successful completion
– Partial failure: destination block has incorrect information
– Total failure: destination block was never updated
• Protecting storage media from failure during data transfer (one solution):
– Execute the output operation as follows (assuming two copies of each block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same information onto the second physical block.
3. The output is completed only after the second write successfully completes.
Data Access
• Block movements between disk and main memory are initiated through the
following two operations:
– input(B) transfers the physical block B to main memory.
– output(B) transfers the buffer block B to the disk, and replaces the appropriate
physical block there.
• We assume, for simplicity, that each data item fits in, and is stored inside, a
single block.
Example of Data Access
Data Access (Cont.)
• Each transaction Ti has its private work-area in which local copies of all data
items accessed and updated by it are kept.
– Ti's local copy of a data item X is called xi.
• Transferring data items between system buffer blocks and a transaction's private work-area is done by:
– read(X) assigns the value of data item X to the local variable xi.
– write(X) assigns the value of local variable xi to data item X in the buffer block.
– Note: output(BX) need not immediately follow write(X). System can perform the
output operation when it deems fit.
• Transactions
– Must perform read(X) before accessing X for the first time (subsequent reads can
be from local copy)
– write(X) can be executed at any time before the transaction commits
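A minimal Python sketch (not from the slides) of these operations, with dictionaries standing in for disk blocks, buffer blocks, and Ti's private work area; block and item names are illustrative.

disk = {"BX": {"X": 100}}          # block BX on disk contains data item X
buffer = {}                        # buffer blocks in main memory
work_area = {}                     # Ti's private local copies (xi)

def input_block(b):                # input(B): bring block B into the buffer
    buffer[b] = dict(disk[b])

def read(x, block):                # read(X): copy X from the buffer block into xi
    if block not in buffer:
        input_block(block)
    work_area[x] = buffer[block][x]

def write(x, block):               # write(X): copy xi back into the buffer block
    if block not in buffer:
        input_block(block)
    buffer[block][x] = work_area[x]

def output(b):                     # output(B): write the buffer block back to disk
    disk[b] = dict(buffer[b])      # need not immediately follow write(X)

read("X", "BX"); work_area["X"] -= 50; write("X", "BX"); output("BX")
print(disk["BX"]["X"])             # 50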
Recovery and Atomicity
• To ensure atomicity despite failures, we first output information describing
the modifications to stable storage without modifying the database itself.
• We study log-based recovery mechanisms in detail
– We first present key concepts
– And then present the actual recovery algorithm
• Less used alternative: shadow-copy and shadow-paging
Log-Based Recovery
• A log is kept on stable storage.
– The log is a sequence of log records, and maintains a record of update activities on the database.
• When transaction Ti starts, it registers itself by writing a <Ti start> log record.
• Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value of X before the write
(the old value), and V2 is the value to be written to X (the new value).
• When Ti finishes its last statement, the log record <Ti commit> is written.
– (We will see later how to postpone log record output to some extent.)
• Output of updated blocks to stable storage can take place at any time before or after
transaction commit
• Order in which blocks are output can be different from the order in which they are
written.
• The deferred-modification scheme performs updates to buffer/disk only at the time of
transaction commit
– Simplifies some aspects of recovery
• The immediate-modification scheme allows updates of an uncommitted transaction to be made to the
buffer/disk before the transaction commits
• A transaction is said to have committed when its commit log record is output to stable storage
– all previous log records of the transaction must have been output already
Log                      Write              Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                         A = 950
                         B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                         C = 600
                                            BB, BC   (BC is output before T1 commits)
<T1 commit>
                                            BA       (BA is output after T0 commits)
• Note: BX denotes block containing X.
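A minimal Python sketch (not from the slides) that produces a log of exactly this form under the immediate-modification scheme; the record formatting and function names are illustrative assumptions.

log = []
db = {"A": 1000, "B": 2000, "C": 700}

def begin(ti):
    log.append(f"<{ti} start>")

def update(ti, x, new_value):
    # The update record <Ti, X, V1, V2> is written before the database item is changed.
    log.append(f"<{ti}, {x}, {db[x]}, {new_value}>")
    db[x] = new_value

def commit(ti):
    log.append(f"<{ti} commit>")

begin("T0"); update("T0", "A", 950); update("T0", "B", 2050); commit("T0")
begin("T1"); update("T1", "C", 600); commit("T1")
print("\n".join(log))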
Checkpoints
• Redoing/undoing all transactions recorded in the log can be very slow:
1. processing the entire log is time-consuming if the system has run for a long time
2. we might unnecessarily redo transactions which have already output their updates
to the database.
• Streamline recovery by periodically performing checkpointing:
1. Output all log records currently residing in main memory onto stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record <checkpoint L> onto stable storage where L is a list of all
transactions active at the time of checkpoint.
– All updates are stopped while doing checkpointing
Checkpoints (Cont.)
• During recovery we need to consider only the most recent transaction Ti that started before
the checkpoint, and transactions that started after Ti:
1. Scan backwards from end of log to find the most recent <checkpoint L> record
– Only transactions that are in L or started after the checkpoint need to be redone or
undone
– Transactions that committed or aborted before the checkpoint already have all their
updates output to stable storage.
• Some earlier part of the log may be needed for undo operations
1. Continue scanning backwards till a record <Ti start> is found for every transaction Ti in L.
– Parts of log prior to earliest <Ti start> record above are not needed for recovery, and can
be erased whenever desired.
Example of Checkpoints
(Figure: a timeline showing checkpoint time Tc, failure time Tf, and transactions T1–T4 relative to them.)
Recovery Algorithm
• Transaction rollback (during normal operation) — let Ti be the transaction to be rolled back:
– Scan the log backwards from the end, and for each log record of Ti of the form <Ti, Xj, V1, V2>, perform the
undo by writing V1 to Xj and write a compensation log record <Ti, Xj, V1>
– Once the record <Ti start> is found, stop the scan and write the log record <Ti abort>
Recovery Algorithm (Cont.)
• Undo phase: scan the log backwards from the end
1. Whenever a log record <Ti, Xj, V1, V2> is found where Ti is in undo-list, perform the same actions as for
transaction rollback: write V1 to Xj and write the log record <Ti, Xj, V1>
2. Whenever a log record <Ti start> is found where Ti is in undo-list, write a log record <Ti abort> and
remove Ti from undo-list; stop when undo-list is empty
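Only part of the algorithm survives in the text above; the following Python sketch (not from the slides) follows the standard two-pass scheme: a redo pass that repeats history from the last <checkpoint L> record and rebuilds the undo-list, then the undo pass described above. The record encoding is an illustrative assumption.

def recover(log, db):
    # Redo phase: find the last <checkpoint L> and replay everything after it.
    start, undo_list = 0, set()
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            start, undo_list = i, set(rec[1])
    for rec in log[start:]:
        kind = rec[0]
        if kind == "update":                    # <Ti, Xj, V1, V2> or redo-only <Ti, Xj, V>
            _, ti, x, *vals = rec
            db[x] = vals[-1]                    # write the new value
        elif kind == "start":
            undo_list.add(rec[1])
        elif kind in ("commit", "abort"):
            undo_list.discard(rec[1])
    # Undo phase: scan backwards, rolling back every incomplete transaction.
    for rec in reversed(list(log)):
        if not undo_list:
            break
        if rec[0] == "update" and rec[1] in undo_list and len(rec) == 5:
            _, ti, x, v_old, _ = rec
            db[x] = v_old
            log.append(("update", ti, x, v_old))   # compensation (redo-only) record
        elif rec[0] == "start" and rec[1] in undo_list:
            log.append(("abort", rec[1]))
            undo_list.discard(rec[1])

db = {"A": 1000, "B": 2000, "C": 700}            # one possible on-disk state at crash time
log = [("checkpoint", []), ("start", "T0"), ("update", "T0", "A", 1000, 950),
       ("update", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("update", "T1", "C", 700, 600)]
recover(log, db)
print(db)                                        # {'A': 950, 'B': 2050, 'C': 700}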
Log Record Buffering
• Log records are buffered in main memory, instead of being output directly to stable storage
– Log records are output to stable storage when a block of log records in the buffer is full, or a log force
operation is executed
• Log force is performed to commit a transaction by forcing all its log records (including the commit
record) to stable storage
• Several log records can thus be output using a single output operation, reducing the
I/O cost.
Log Record Buffering (Cont.)
– Transaction Ti enters the commit state only when the log record <Ti commit> has been output to
stable storage
– Before a block of data in main memory is output to the database, all log records
pertaining to data in that block must have been output to stable storage.
• This rule is called the write-ahead logging or WAL rule
– Strictly speaking WAL only requires undo information to be output
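A minimal Python sketch (not from the slides) of the WAL rule; for simplicity it forces the entire log buffer before a data block is output, which is stricter than the undo-only requirement noted above. Names and block layout are illustrative.

log_buffer, stable_log = [], []
buffer = {"BX": {"X": 50}}                      # block updated in memory by some transaction
disk = {"BX": {"X": 100}}                       # old value still on disk
log_buffer.append("<T1, X, 100, 50>")           # update record, currently only buffered

def log_force():
    stable_log.extend(log_buffer)               # output buffered log records to stable storage
    log_buffer.clear()

def output(block):
    log_force()                                 # WAL: force the log records first ...
    disk[block] = dict(buffer[block])           # ... only then write the data block

output("BX")
print(stable_log, disk["BX"])                   # ['<T1, X, 100, 50>'] {'X': 50}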