Lecture Notes-DBMS IV&v Unit
Q) What is a transaction system? List the ACID properties of transactions. Also discuss transaction
failures. (10 marks)
Transaction Concept
A transaction is a unit of program execution that accesses and possibly updates various data items.
A transaction must see a consistent database.
During transaction execution the database may be inconsistent.
When the transaction is committed, the database must be consistent.
Two main issues to deal with:
o Failures of various kinds, such as hardware failures and system crashes
o Concurrent execution of multiple transactions
ACID Properties
Atomicity: Either all operations of the transaction are properly reflected in the database or none
are.
Consistency: Execution of a transaction in isolation preserves the consistency of the database.
Isolation: Although multiple transactions may execute concurrently, each transaction must be
unaware of other concurrently executing transactions. Intermediate transaction results must be
hidden from other concurrently executing transactions. That is, for every pair of transactions Ti and
Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti
finished.
Durability: After a transaction completes successfully, the changes it has made to the database
persist, even if there are system failures.
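As a minimal sketch of atomicity and consistency, the transfer below uses Python's built-in sqlite3 module. The account table, the transfer and balances functions and the starting balances are assumptions made for illustration. The connection is used as a context manager, which commits if the block finishes normally and rolls back if it raises an exception.

```python
import sqlite3

# Hypothetical setup: two accounts A and B in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    with conn:  # commit on success, rollback on exception
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        (bal,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)
        ).fetchone()
        if bal < 0:
            # Abort: the debit above is undone by the rollback.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


def balances(conn):
    """Return the current balances as a dict, e.g. {"A": ..., "B": ...}."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer that fails part-way leaves both balances unchanged, so the sum A + B is the same before and after every call, whether it commits or aborts.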
Concurrent Executions
Multiple transactions are allowed to run concurrently in the system. Advantages are:
o increased processor and disk utilization, leading to better transaction throughput: one
transaction can be using the CPU while another is reading from or writing to the disk
o reduced average response time for transactions: short transactions need not wait behind
long ones.
Concurrency control schemes – mechanisms to achieve isolation, i.e., to control the interaction
among concurrent transactions so that they do not destroy the consistency of the
database.
Schedule – When transactions execute concurrently in an interleaved fashion, the order in which
their instructions are executed is called a schedule.
Serial Schedule – A schedule is said to be serial if the transactions are executed one after
another, with no interleaving. Otherwise it is said to be non-serial.
Example Schedules
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. The following is a
serial schedule, in which T1 is followed by T2.
Schedule 1
Let T1 and T2 be the transactions defined previously. The following schedule is not a serial
schedule, but it is equivalent to Schedule 1: the sum A + B is preserved.
When transactions are executed non-serially, the schedule may produce wrong results.
Serializability is used to determine which non-serial schedules produce correct results
and which do not.
Serializability of Schedules
Serializable Schedule -A schedule S of n transactions is serializable if it is equivalent to some serial
schedule of the same n transactions.
1. Conflict serializability
2. View serializability
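Two instructions in a schedule are said to conflict if the following three
conditions hold:
o they belong to different transactions
o they access the same data item
o at least one of them is a write operation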
Schedule 3 Schedule 1
We are unable to swap instructions in the above schedule to obtain either the serial schedule <T3, T4> or
the serial schedule <T4, T3>.
Schedule 3 below can be transformed into Schedule 1, a serial schedule where T2 follows T1, by a
series of swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable.
iii) The schedule S is conflict serializable if and only if the precedence graph has no cycles.
Since the above schedule is view equivalent to the serial schedule <T3, T4, T6>, it is view
serializable.
Every conflict serializable schedule is view serializable, but not every view
serializable schedule is conflict serializable.
Every view serializable schedule that is not conflict serializable has blind writes.
A write operation that a transaction performs without first reading the data item is called a
blind write (as in the above schedules for transactions T4 and T5).
************************************************************************************
Recoverability
If a transaction Ti fails, we have to undo the effects of Ti to ensure the atomicity property of the
transaction.
T1              T2
read(A)
write(A)
Commit
                read(A)
                Commit
The above schedule is recoverable because T2 reads the value of A written by T1 and the commit of
T1 appears before the commit of T2.
The following schedule is not recoverable because T2, which reads the value of A written by T1,
commits before T1. If T1 fails before it commits, T2 would have to be rolled back, but T2 has
already committed.
T1 T2
read(A)
write(A)
read(A)
Commit
Commit
Read(B)
2. Cascading rollback – a single transaction failure leads to a series of transaction rollbacks. Consider
the following schedule -
- If T10 fails, T11 and T12 must also be rolled back. This can lead to the undoing of a significant
amount of work.
3. Strict Schedule: In this schedule a transaction can neither read nor write an item X until the last
transaction that wrote X has committed or aborted.
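The three notions above (recoverable, free of cascading rollback, strict) can be checked mechanically. The following is a minimal sketch, not a complete implementation: the tuple encoding (txn, op, item), the function name classify and the two sample schedules are assumptions made for illustration, and aborts are not modelled.

```python
def classify(schedule):
    """Return (recoverable, cascadeless, strict) for a schedule.

    Each step is (txn, op, item) with op in {"r", "w", "c"}; item is None
    for a commit. Aborts are not modelled in this sketch.
    """
    committed = set()
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of writers whose values it read
    recoverable = cascadeless = strict = True
    for txn, op, item in schedule:
        if op == "c":
            # Recoverable: every writer this txn read from committed first.
            if any(w not in committed for w in reads_from.get(txn, ())):
                recoverable = False
            committed.add(txn)
            continue
        writer = last_writer.get(item)
        other = writer is not None and writer != txn
        dirty = other and writer not in committed
        if op == "r":
            if other:
                reads_from.setdefault(txn, set()).add(writer)
            if dirty:
                # Reading uncommitted data can cascade and is not strict.
                cascadeless = strict = False
        elif op == "w":
            if dirty:
                # Overwriting uncommitted data violates strictness only.
                strict = False
            last_writer[item] = txn
    return recoverable, cascadeless, strict


# T1 commits before T2 reads A.
s_ok = [("T1", "r", "A"), ("T1", "w", "A"), ("T1", "c", None),
        ("T2", "r", "A"), ("T2", "c", None)]
# T2 reads A written by T1 and commits before T1 does.
s_bad = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"),
         ("T2", "c", None), ("T1", "c", None)]
```

Under these assumptions classify(s_ok) reports all three properties as true, while classify(s_bad) reports all three as false, because T2 commits after reading a value whose writer has not yet committed.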
T1: read(A)
read(B)
if A=0 then B=B+1
write (B)
Q. What are schedules? Define conflict and view serializable schedules. State whether the following
schedules are conflict serializable or not. Justify your answer.
i) r1(X), r3(X), w1(X), r2(X), w3(X);
ii) r1(X), r3(X), w3(X), w1(X), r2(X);
iv) r3(X), r2(X), w3(X), r1(X), w1(X);
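The precedence-graph test for conflict serializability can be sketched in a few lines. The sketch assumes the r1(X)/w1(X) notation used in the question; the function names parse, precedence_edges and serial_order are hypothetical. An edge Ti -> Tj is added whenever an operation of Ti conflicts with a later operation of Tj, and a topological sort either yields an equivalent serial order or finds a cycle.

```python
import re


def parse(schedule):
    """Turn 'r1(X), w2(Y)' into [('r', 1, 'X'), ('w', 2, 'Y')]."""
    return [(op, int(t), item)
            for op, t, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]


def precedence_edges(ops):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    ops = parse(schedule)
    nodes = {t for _, t, _ in ops}
    edges = precedence_edges(ops)
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:  # Kahn's topological sort
        n = ready.pop(0)
        order.append(n)
        for u, v in sorted(edges):
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return order if len(order) == len(nodes) else None
```

Applied to the three schedules above, (i) and (ii) each contain the cycle T1 -> T3 -> T1 (r1(X) precedes w3(X), and r3(X) precedes w1(X)), so serial_order returns None and neither is conflict serializable. Schedule (iv) has only the edges T2 -> T3, T2 -> T1 and T3 -> T1, so it is conflict serializable and equivalent to the serial order T2, T3, T1.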
Types of failures
Failures are generally classified as transaction, system and media failures. A transaction may fail due
to:
1. A computer failure (system crash): a power failure or other hardware or software failure
causes the system to crash. If the hardware crashes, the contents of main memory may be lost.
2. Disk failure: a read/write head crash or similar disk failure destroys all or part of disk storage.
This may happen during a read or write operation of the transaction.
3. A transaction or system error:
- Some operations in the transaction may cause it to fail, such as integer
overflow, division by zero or wrong parameters (logical errors).
- A transaction may fail due to a system error such as deadlock, or the user may interrupt
the execution of the transaction.
4. Concurrency control enforcement: The concurrency control method may decide to abort the
transaction and restart it later because it violates serializability or because it is in a deadlock state.
5. Physical problems or catastrophes: A transaction may fail due to power or air
conditioning failure, fire, theft, or overwriting of a disk by mistake. A catastrophic failure is a
sudden and total failure of a system from which recovery is impossible.
Q. What is a log? How is it maintained? Discuss the salient features of the deferred database modification
and immediate database modification strategies in brief.
Log-Based Recovery
The deferred database modification scheme records all modifications in the log, but
defers all the write operations of a transaction until the transaction partially commits.
When the transaction partially commits, the information in the log is used to execute the deferred
writes.
Below we show the log for the execution of T0 followed by T1. Let A=1000, B=2000 and C=700.
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
<T1 start>
<T1, C, 600>
<T1 commit>
The recovery system uses the following procedure to handle any failure:
o Redo(Ti) – sets the values of all data items updated by Ti to the new values.
Redo(Ti) is performed when the log contains both <Ti start> and <Ti commit>.
Consider the following three cases of failure –
Case a: System fails after W(B)
Case b: System fails after W(C)
Case c: System fails after T1 commits.
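The three failure cases can be traced with a small sketch of redo-only recovery. It assumes each update record names the data item and its new value, i.e. (txn, item, new_value), and that a crash simply truncates the log; the tuple layout and the function name recover_deferred are hypothetical.

```python
# Log for T0 followed by T1, with A=1000, B=2000, C=700 initially.
LOG = [
    ("T0", "start"),
    ("T0", "A", 950),
    ("T0", "B", 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 600),
    ("T1", "commit"),
]


def recover_deferred(log, db):
    """Redo every transaction that has a commit record in the log.

    With deferred modification nothing was written before commit, so
    transactions without a commit record need no action at all.
    """
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    for rec in log:  # forward scan, so later writes win
        if len(rec) == 3 and rec[0] in committed:
            txn, item, new_value = rec
            db[item] = new_value
    return db


initial = {"A": 1000, "B": 2000, "C": 700}
case_a = recover_deferred(LOG[:3], dict(initial))  # crash after W(B)
case_b = recover_deferred(LOG[:6], dict(initial))  # crash after W(C)
case_c = recover_deferred(LOG, dict(initial))      # crash after T1 commits
```

In case a no commit record is present, so nothing is redone and the database keeps its initial values. In case b only T0 is redone (A=950, B=2050, C=700). In case c both T0 and T1 are redone (A=950, B=2050, C=600).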
In the immediate database modification scheme, database modifications may be written (output) to the
database while the transaction is still in the active state.
These modifications are called uncommitted modifications.
When a transaction fails, the recovery system uses the log, which contains the old and new values of data
items, to restore the system.
Example: consider the transactions T0 and T1 as shown in the above technique (T0 executes before T1):
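A minimal sketch of undo/redo recovery under immediate modification follows. It assumes update records of the form (txn, item, old_value, new_value) and uses the values A=1000, B=2000, C=700 from the earlier example; the record layout, the function name recover_immediate and the on-disk state after the crash are assumptions made for illustration.

```python
def recover_immediate(log, db):
    """Undo uncommitted transactions, then redo committed ones."""
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    # Undo: scan backwards and restore old values of uncommitted transactions.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _, item, old_value, _ = rec
            db[item] = old_value
    # Redo: scan forwards and reapply new values of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _, item, _, new_value = rec
            db[item] = new_value
    return db


# Hypothetical crash: T0 has committed, T1 has written C but not committed.
log = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "B", 2000, 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),
]
# Assumed disk state at the crash: B had not reached disk yet, C had.
on_disk = {"A": 950, "B": 2000, "C": 600}
recovered = recover_immediate(log, on_disk)
```

Here undo(T1) restores C to 700 and redo(T0) sets A to 950 and B to 2050, regardless of which of those writes had actually reached the disk before the crash.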
Checkpoints
When a failure occurs, the system consults the log to determine which transactions need
to be redone and which need to be undone. This procedure has the following disadvantages:
o searching the entire log is time-consuming
o we might unnecessarily redo transactions which have already output their updates to
the database.
Checkpoints are used to reduce these problems.
The system periodically performs checkpoints.
During recovery we need to consider only the most recent transaction Ti that started before
the checkpoint, and the transactions that started after Ti.
Scan backwards from the end of the log to find the most recent <checkpoint> record.
Continue scanning backwards till a record <Ti start> is found.
Only the part of the log following this start record needs to be considered. The earlier part of the
log can be ignored during recovery, and can be erased whenever desired.
For all transactions (starting from Ti or later) with no <Ti commit>, execute undo(Ti).
(Done only in case of immediate modification.)
Scanning forward in the log, for all transactions starting from Ti or later with a
<Ti commit>, execute redo(Ti).
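The scan described above can be sketched as follows. The log encoding (tuples ending in "start" or "commit", and a ("checkpoint",) record), the function name plan_recovery and the sample log are assumptions made for illustration; the sketch only decides which transactions to undo and redo, it does not apply the updates.

```python
def plan_recovery(log):
    """Return (undo_list, redo_list) using the most recent checkpoint."""
    # Scan backwards for the most recent checkpoint record.
    i = len(log) - 1
    while i >= 0 and log[i][0] != "checkpoint":
        i -= 1
    # Keep scanning backwards to the nearest <Ti start> record.
    while i >= 0 and log[i][-1] != "start":
        i -= 1
    tail = log[max(i, 0):]  # whole log if no checkpoint/start was found
    started = [rec[0] for rec in tail if rec[-1] == "start"]
    committed = {rec[0] for rec in tail if rec[-1] == "commit"}
    undo = [t for t in reversed(started) if t not in committed]
    redo = [t for t in started if t in committed]
    return undo, redo


# Hypothetical log: T0 finished before the checkpoint, T1 spans it,
# T2 started after it and never committed.
sample_log = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),
    ("checkpoint",),
    ("T1", "commit"),
    ("T2", "start"),
    ("T2", "B", 2000, 2050),
]
undo_list, redo_list = plan_recovery(sample_log)
```

For this log the scan stops at <T1 start>, so T0 is ignored entirely, T2 is placed on the undo list and T1 on the redo list.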
Lock-Based Protocols
One of the ways to ensure serializability of concurrent executions is to use lock-based protocols.
A lock is a mechanism to control concurrent access to a data item.
Lock: A lock is a variable that specifies the status of a data item with respect to the read or write
operations applied to it.
Data items can be locked in two modes:
o exclusive (X) mode – If Ti obtains an exclusive-mode lock on data item Q, then Ti can both read
and write Q.
o shared (S) mode – If Ti obtains a shared-mode lock on data item Q, then Ti can read but
cannot write Q.
Lock requests are made to the concurrency-control manager. A transaction can proceed only after
its request is granted.
Lock-compatibility matrix
i.e. if Ti holds a lock on data item Q and Tj requests a lock on Q, then the lock is granted only if the
requested mode is compatible with the mode held by Ti. Shared mode is compatible only with shared
mode, so any number of S-locks can be held on a data item at the same time, while an X-lock
excludes all other locks.
A lock is released by the unlock procedure.
T2: lock-S(A);
    read (A);
    unlock(A);
    lock-S(B);
    read (B);
    display(A+B);
    unlock(B);
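The compatibility rule and the lock/unlock requests described above can be sketched as a tiny lock table. This is an illustration under stated assumptions, not how any particular DBMS implements its lock manager: the class name LockTable and its methods are hypothetical, and a refused request simply returns False instead of queueing the transaction.

```python
# (mode already held, mode requested) -> may both be held together?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockTable:
    """Tracks which transactions hold which locks on each data item."""

    def __init__(self):
        self.held = {}  # item -> list of (txn, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with all other holders."""
        holders = self.held.setdefault(item, [])
        if all(COMPATIBLE[(m, mode)] for t, m in holders if t != txn):
            holders.append((txn, mode))
            return True
        return False  # the transaction would have to wait

    def unlock(self, txn, item):
        """Release every lock txn holds on item."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]


table = LockTable()
granted = [
    table.request("T1", "A", "S"),  # first S-lock
    table.request("T2", "A", "S"),  # second S-lock is compatible
    table.request("T3", "A", "X"),  # X-lock conflicts with both S-locks
]
```

Several shared locks on A are granted together, but the exclusive request is refused until every shared lock on A has been released.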
A locking protocol is a set of rules followed by all transactions while requesting and releasing
locks. Locking protocols restrict the set of possible schedules.
Deadlock is also possible in lock based protocols. Consider the following two transactions:
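A deadlock shows up as a cycle in the wait-for graph, where an edge Ti -> Tj means that Ti is waiting for a lock held by Tj. The sketch below is a hypothetical illustration: the transaction names, the dictionary layout and the function name has_deadlock are assumptions.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for maps each transaction to the set of transactions it waits for.
    """
    visiting, done = set(), set()

    def dfs(t):
        if t in done:
            return False
        if t in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(t)
        if any(dfs(u) for u in wait_for.get(t, ())):
            return True
        visiting.discard(t)
        done.add(t)
        return False

    return any(dfs(t) for t in list(wait_for))


# Assumed scenario: T3 holds X(B) and requests A; T4 holds S(A) and requests B.
deadlocked = {"T3": {"T4"}, "T4": {"T3"}}
# T1 waits for T2, but T2 waits for nobody, so T2 can finish and release.
waiting_only = {"T1": {"T2"}, "T2": set()}
```

In the first graph neither transaction can proceed until the other releases its lock, so one of them has to be rolled back; in the second there is waiting but no deadlock.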
Starvation is also possible in lock-based protocols. Suppose T2 holds an S-lock on a data item and another
transaction T1 requests an X-lock on the same data item; T1 has to wait for T2 to release its lock.
Meanwhile transaction T3 requests an S-lock on the same data item. The request is compatible with T2's lock,
so it is granted, and T1 must now also wait for T3. If such S-lock requests keep arriving, T1 may never
obtain its lock. This situation is called starvation.
Q. What is two-phase locking? Describe it with the help of an example. Can 2PL result in deadlock?
Justify your answer with the help of an example. Also discuss recovery with concurrent
transactions.
Q. With reference to the two-phase locking protocol (2PL), explain how the upgrading and
downgrading of locks take place. Explain with a suitable example.
Initially a transaction is in the growing phase, in which it may obtain locks but may not release any.
Once the transaction releases a lock, it enters the shrinking phase, in which it may release locks but
cannot request any new lock.
The protocol assures serializability. It can be proved that the transactions can be serialized in the order
of their lock points (i.e. the point where a transaction acquired its final lock).
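Whether a single transaction obeys the two-phase rule, and where its lock point lies, can be checked with a short sketch. The (action, item) encoding and the function name check_2pl are assumptions made for illustration; read, write and display steps are simply ignored.

```python
def check_2pl(ops):
    """Return (is_two_phase, lock_point_index) for one transaction.

    ops is a list of (action, item). The lock point is the index of the
    last lock acquisition; it is None if no lock is taken or 2PL is violated.
    """
    shrinking = False
    lock_point = None
    for i, (action, item) in enumerate(ops):
        if action in ("lock-S", "lock-X"):
            if shrinking:
                return False, None  # lock requested after an unlock
            lock_point = i
        elif action == "unlock":
            shrinking = True
    return True, lock_point


# Releases A before locking B, so it is not two-phase.
early_unlock = [("lock-S", "A"), ("read", "A"), ("unlock", "A"),
                ("lock-S", "B"), ("read", "B"), ("unlock", "B")]
# Same work with both unlocks delayed until after the last lock.
two_phase = [("lock-S", "A"), ("read", "A"), ("lock-S", "B"),
             ("read", "B"), ("unlock", "A"), ("unlock", "B")]
```

The first sequence is rejected because it requests lock-S(B) after unlock(A). The second is two-phase, with its lock point at index 2, where lock-S(B) is acquired.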
Two-phase locking does not ensure freedom from deadlocks. Consider the following schedule, which
follows 2PL but in which the transactions are deadlocked.
T5              T6              T7
LOCK-X(A)
R(A)
W(A)
UNLOCK(A)
                LOCK-X(A)
                R(A)
                W(A)
                UNLOCK(A)
                                LOCK-S(A)
                                R(A)
In the schedule above, failure of T5 after the read(A) step of T7 leads to cascading rollback of T6 and T7.
This problem of cascading rollbacks can be avoided by modifying 2PL to strict two-phase locking. In
this method a transaction must hold all its exclusive locks till it commits/aborts.
Rigorous two-phase locking is even stricter: here all locks (shared and exclusive) are held by the
transactions till they commit/abort. In this protocol transactions can be serialized in the order in which they
commit.
There can be conflict serializable schedules that cannot be obtained if two-phase locking is used.
However, in the absence of extra information (e.g., ordering of access to data), two-phase locking is
needed for conflict serializability in the following sense:
o Given a transaction Ti that does not follow two-phase locking, we can find a transaction Tj
that uses two-phase locking, and a schedule for Ti and Tj that is not conflict serializable.
– Growing Phase: - Can acquire shared and exclusive locks on data items.
- Can convert shared locks to exclusive mode (upgrade)
– Shrinking Phase: - Can release shared and exclusive locks on data items.
- Can convert exclusive locks to shared mode (downgrade)
Since in the above example T1 obtains a shared-mode lock on a1, the lock is upgraded to
exclusive mode before w(a1).
This protocol assures serializability, but it still relies on the programmer to insert the various locking
instructions.
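The rules for lock conversion can be sketched as a small validator: upgrades count as growing-phase actions and downgrades as shrinking-phase actions. The action names ("upgrade", "downgrade"), the (action, item) encoding and the function name obeys_2pl_with_conversion are assumptions made for illustration.

```python
GROWING = {"lock-S", "lock-X", "upgrade"}
SHRINKING = {"unlock", "downgrade"}


def obeys_2pl_with_conversion(ops):
    """Check one transaction against 2PL with lock conversions."""
    mode = {}  # item -> "S" or "X" currently held
    shrinking = False
    for action, item in ops:
        if action in GROWING:
            if shrinking:
                return False  # no acquiring or upgrading after shrinking began
            if action == "upgrade":
                if mode.get(item) != "S":
                    return False  # only a held S-lock can be upgraded
                mode[item] = "X"
            else:
                mode[item] = action[-1]  # "S" or "X"
        elif action in SHRINKING:
            shrinking = True
            if action == "downgrade":
                if mode.get(item) != "X":
                    return False  # only a held X-lock can be downgraded
                mode[item] = "S"
            else:
                mode.pop(item, None)
        # read/write steps are not checked in this sketch
    return True


# Read a1 under an S-lock, upgrade before writing, then release.
upgrade_then_write = [("lock-S", "a1"), ("read", "a1"), ("upgrade", "a1"),
                      ("write", "a1"), ("unlock", "a1")]
# A downgrade starts the shrinking phase, so a later upgrade is illegal.
upgrade_after_downgrade = [("lock-X", "a1"), ("downgrade", "a1"),
                           ("upgrade", "a1")]
```

The first sequence is accepted: the S-lock on a1 is upgraded before the write, while the transaction is still growing. The second is rejected because an upgrade is attempted after the downgrade has moved the transaction into its shrinking phase.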