Module 5 - NOTES
TRANSACTION MANAGEMENT
5.1 Transaction
✔ Detailed Explanation :
5.2.1 Atomicity:
● This property ensures that either all the operations of a transaction are reflected in the database or none are.
OR
● It is the responsibility of the recovery subsystem of the DBMS to ensure atomicity.
● Example:
Read (A);
A = A – 100;
Write (A);
Read (B);
B = B + 100;
Write (B);
❖ The transaction has 6 instructions: it withdraws Rs 100/- from account A and deposits it into account B.
❖ Problem :
o Now, suppose there is a power failure just after instruction 3 (Write A) has completed.
o What happens now? After the system recovers, the database will show Rs 900/- in A, but the original Rs 1000/- in B.
o Rs 100/- would seem to have evaporated into thin air because of the power failure. Clearly such a situation is not acceptable.
❖ Solution :
o The idea is to keep every value calculated by the instructions of the transaction not in stable storage (hard disk) but in volatile storage (RAM), until the transaction completes its last instruction.
o When we see that there has not been any error, we perform what is known as a COMMIT operation.
o Its job is to write every temporarily calculated value from the volatile storage on to
the stable storage.
o In this way, even if power fails at instruction 3, the post recovery image of the
database will show accounts A and B both containing Rs 1000/-, as if the failed
transaction had never occurred.
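✔ To make the COMMIT / ROLLBACK idea concrete, here is a minimal Python sketch using the standard sqlite3 module. The accounts table, the account names and the amount are illustrative assumptions, not part of these notes; the point is only that both updates become permanent at COMMIT, or neither does.
import sqlite3

# Minimal sketch of an atomic fund transfer: either both UPDATEs reach
# stable storage at COMMIT, or ROLLBACK discards the partial work.
conn = sqlite3.connect("bank.db", isolation_level=None)   # manage transactions manually
conn.execute("CREATE TABLE IF NOT EXISTS accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT OR IGNORE INTO accounts VALUES ('A', 1000), ('B', 1000)")

def transfer(amount):
    conn.execute("BEGIN")                                  # start the transaction
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'A'", (amount,))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'B'", (amount,))
        conn.execute("COMMIT")                             # write the new values to stable storage
    except Exception:
        conn.execute("ROLLBACK")                           # undo everything done so far
        raise

transfer(100)
print(conn.execute("SELECT * FROM accounts").fetchall())   # e.g. [('A', 900), ('B', 1100)]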
5.2.2 Consistency:
● To preserve the consistency of the database, the execution of a transaction should take place in isolation (that is, no other transaction should run concurrently while this transaction is running).
OR
● A transaction must take the database from one consistent state to another consistent state.
● It is the responsibility of both the DBMS and the application developers to ensure consistency.
● The DBMS can ensure consistency by enforcing all the constraints that have been specified on the database schema, such as integrity and enterprise constraints.
● Concurrently executing transactions may have to deal with the problem of sharable
resources, i.e. resources that multiple transactions are trying to read/write at the
same time.
● For example:
❖ we may have a table or a record on which two transactions are trying to read or
write at the same time.
5.2.3 Isolation
● For every pair of transactions, one transaction should start execution only after the other has finished execution.
● In case multiple transactions are executing concurrently and trying to access a sharable resource at the same time, the system should impose an ordering on their execution so that they do not create any anomaly in the value stored at the sharable resource.
● There are several ways to achieve this and the most popular one is using some kind of
locking mechanism.
● Again, if you have studied Operating Systems, you should remember semaphores: how a process uses one to mark a resource as busy before starting to use it, and how it releases the resource after the usage is over.
● Other processes intending to access that same resource must wait during this time.
Locking is almost similar.
● It states that a transaction must first lock the data item that it wishes to access, and release the lock when access to it is no longer required.
● Once a transaction locks the data item, other transactions wishing to access the same data
item must wait until the lock is released.
5.2.4 Durability:
● Once a transaction completes successfully, the changes it has made into the
database should be permanent even if there is a system failure.
● The recovery-management component of the database system ensures the durability of transactions.
OR
● Once the COMMIT is done, the changes which the transaction has made to the
database are immediately written into permanent storage.
● So, after the transaction has been committed successfully, there is no question of the changes being lost, even if the system fails later.
● Committing a transaction guarantees that the AFIM (After Image, the state of the database after the transaction) has been reached.
✔ There are the following six states in which a transaction may exist:
(i) Active: The initial state when the transaction has just started execution.
(ii) Partially Committed: At any given point of time, if the transaction is executing
properly, then it is going towards its COMMIT POINT. The values generated
during the execution are all stored in volatile storage.
(iii) Failed: If the transaction fails for some reason. The temporary values are no
longer required, and the transaction is set to ROLLBACK. It means that any
change made to the database by this transaction up to the point of the failure must
be undone. If the failed transaction has withdrawn Rs. 100/- from account A, then
the ROLLBACK operation should add Rs 100/- to account A.
(iv) Aborted: When the ROLLBACK operation is over, the database reaches the
BFIM (Before Image, the state it was in before the transaction started). The transaction is now said to have been aborted.
(v) Committed: If no failure occurs then the transaction reaches the COMMIT
POINT. All the temporary values are written to the stable storage and the
transaction is said to have been committed.
(vi) Terminated: Either committed or aborted, the transaction finally reaches this
state.
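✔ The six states and the legal moves between them can be captured as a small state machine. The sketch below is only an illustration; the names State, TRANSITIONS and advance are ours, not part of the notes.
from enum import Enum, auto

class State(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()
    COMMITTED = auto()
    TERMINATED = auto()

# Legal transitions between the six states described above.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},        # ROLLBACK restores the BFIM
    State.ABORTED: {State.TERMINATED},
    State.COMMITTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def advance(current, nxt):
    # Move a transaction to the next state, rejecting illegal moves.
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt

s = advance(State.ACTIVE, State.PARTIALLY_COMMITTED)
s = advance(s, State.COMMITTED)
s = advance(s, State.TERMINATED)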
✔ That is, the user who submits a transaction must ensure that, when run to completion by itself against a ‘consistent’ database instance, the transaction will leave the database in a consistent state.
✔ Example:
(i)
● Fund transfer between bank accounts should not change the total amount of money
in the accounts.
● To transfer money from one account to another, a transaction must debit one account, temporarily leaving the database inconsistent in a global sense; database consistency is restored when the second account is credited with the transferred amount.
(ii)
● If a faulty transfer program always credits the second account with one dollar less than the amount debited from the first account, the DBMS cannot be expected to detect inconsistencies due to such errors in the user program’s logic.
5.4.2 Isolation:
✔ This property is ensured by guaranteeing that, even though the actions of several transactions might be interleaved, the net effect is identical to executing all transactions one after the other in some serial order.
✔ Example:
● If two transactions T1 and T2 are executed concurrently, the net effect is guaranteed
to be equivalent to executing T1 followed by executing T2 or executing T2
followed by executing T1
● This scheme, based on making copies of the database (called shadow copies), assumes that only one transaction is active at a time.
● The scheme also assumes that the database is simply a file on disk.
● All updates are done on a new copy of the database, leaving the original copy, the shadow copy, untouched.
● If at any point the transaction has to be aborted, the system merely deletes the new copy.
❖ First, the operating system is asked to make sure that all pages of the new
copy of the database have been written out to disk. (Unix systems use the
fsync system call for this purpose.)
❖ After the operating system has written all the pages to disk, the database
system updates the pointer db-pointer to point to the new copy of the
database; the new copy then becomes the current copy of the database.
● Figure below depicts the scheme, showing the database state before and after
the update.
8
Transaction Management Module-5
● The transaction is said to have been committed at the point where the updated db-pointer is written to disk.
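✔ The db-pointer trick can be mimicked with ordinary files: write the new copy, force it to disk, then atomically overwrite the pointer. This is only a rough sketch of the idea; the file names and the fsync/os.replace calls are our illustrative choices, not part of the notes.
import os

# Sketch of the shadow-copy scheme: "db-pointer" is a small file whose
# contents name the current database copy.
def commit_new_copy(new_data, new_name, pointer_file="db-pointer"):
    # 1. Write the new copy and make sure every page reaches the disk.
    with open(new_name, "wb") as f:
        f.write(new_data)
        f.flush()
        os.fsync(f.fileno())
    # 2. Atomically update db-pointer so that it names the new copy;
    #    the transaction commits at this point.
    tmp = pointer_file + ".tmp"
    with open(tmp, "w") as f:
        f.write(new_name)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, pointer_file)

def current_copy(pointer_file="db-pointer"):
    with open(pointer_file) as f:
        return f.read().strip()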
Durability:
✔ The log is also used to ensure durability: if the system crashes before the changes made by a completed transaction are written to disk, the log is used to remember and restore these changes when the system restarts.
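✔ As a rough illustration of how a log provides durability, the sketch below appends each change and a COMMIT record to an append-only log and forces it to disk before the commit is acknowledged; on restart, the changes of committed transactions are replayed. The record format and the function names are our own assumptions.
import json, os

LOG_FILE = "redo.log"

def log_and_commit(txn_id, changes):
    # Append the transaction's changes plus a COMMIT record, forced to disk.
    with open(LOG_FILE, "a") as log:
        for obj, new_value in changes.items():
            log.write(json.dumps({"txn": txn_id, "obj": obj, "value": new_value}) + "\n")
        log.write(json.dumps({"txn": txn_id, "commit": True}) + "\n")
        log.flush()
        os.fsync(log.fileno())     # the records now survive a crash

def replay(database):
    # After a restart, re-apply the changes of committed transactions only.
    if not os.path.exists(LOG_FILE):
        return database
    with open(LOG_FILE) as f:
        records = [json.loads(line) for line in f]
    committed = {r["txn"] for r in records if r.get("commit")}
    for r in records:
        if "obj" in r and r["txn"] in committed:
            database[r["obj"]] = r["value"]
    return database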
✔ The actions that can be executed by transaction include reads and writes of database
objects.
✔ The simple notations for the actions of a transaction T on an object O are:
RT (O): the action of transaction T reading object O.
WT (O): the action of transaction T writing object O.
5.7.2 Schedule:
Classification :
(i) Serial schedule:
✔ Transactions are not interleaved; that is, transactions are executed from start to finish, one by one.
(ii) Complete schedule:
✔ A schedule that contains either an abort or a commit action for each transaction whose actions are listed in it.
✔ Concurrent execution of transactions is motivated by two observations:
1. While one transaction is waiting for a page to be read in from disk, the CPU can
process another transaction. This is because I/O activity can be done in parallel with CPU
activity in a computer. Overlapping I/O and CPU activity reduces the amount of time
disks and processors are idle and increases the system throughput.
2. Interleaved execution of a short transaction with a long transaction usually allows a
short transaction to complete quickly.
✔ Serial :
● However, a serial schedule is inefficient in the sense that transactions suffer from longer waiting times and response times, as well as low resource utilization.
✔ Concurrent :
● However, this creates the possibility that more than one transaction may need to access a single data item for read/write purposes, and the database could contain inconsistent values if such accesses are not handled properly.
● Let us consider there are two transactions T1 and T2, whose instruction sets
are given as follows.
T1
Read A;
A = A – 100;
Write A;
Read B;
B = B + 100;
Write B;
T2
Read A;
Temp = A * 0.1;
Read C;
C = C + Temp;
Write C;
● T2 is a new transaction which deposits to account C 10% of the amount in account A.
● If we prepare a serial schedule, then either T1 will completely finish before T2 can
begin, or T2 will completely finish before T1 can begin.
● However, if we want to create a concurrent schedule, then some context switching needs to be made, so that some portion of T1 is executed, then some portion of T2, and so on.
● For Example 2, say we have prepared the following concurrent schedule.
T1                          T2
Read A;
A = A – 100;
Write A;
                            Read A;
                            Temp = A * 0.1;
                            Read C;
                            C = C + Temp;
                            Write C;
Read B;
B = B + 100;
Write B;
● No problem here.
● We have made some Context Switching in this Schedule, the first one after executing
the third instruction of T1, and after executing the last statement of T2.
● T1 first deducts Rs 100/- from A and writes the new value of Rs 900/- into A.
● T2 reads the value of A, calculates the value of Temp to be Rs 90/- and adds the value to C.
● The remaining part of T1 is executed and Rs 100/- is added to B.
● It is clear that a proper Context Switching is very important in order to maintain the
Consistency and Isolation properties of the transactions.
● But let us take another example where a wrong Context Switching can bring about
disaster.
● Consider the following Example 3, involving the same T1 and T2.
T1                          T2
Read A;
A = A – 100;
                            Read A;
                            Temp = A * 0.1;
                            Read C;
                            C = C + Temp;
                            Write C;
Write A;
Read B;
B = B + 100;
Write B;
● This schedule is wrong, because we have made the switching at the second instruction
of T1.
● The result is very confusing.
● If we consider accounts A and B both containing Rs 1000/- each, then the result of this schedule should have left Rs 900/- in A, Rs 1100/- in B and added Rs 90/- to C (as C should be increased by 10% of the amount in A).
● But in this wrong schedule, the Context Switching is being performed before the new
value of Rs 900/- has been updated in A.
● T2 reads the old value of A, which is still Rs 1000/-, and deposits Rs 100/- in C.
● In the above example, we detected the error simply by examining the schedule and applying common sense.
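✔ The arithmetic of Examples 2 and 3 can be replayed with a few lines of Python; the function names are ours, but the instruction orders are exactly the two schedules shown above.
def example_2():
    A, B, C = 1000, 1000, 0
    A = A - 100          # T1: Read A; A = A - 100; Write A;
    temp = A * 0.1       # T2: Read A (sees 900); Temp = A * 0.1;
    C = C + temp         # T2: Read C; C = C + Temp; Write C;
    B = B + 100          # T1: Read B; B = B + 100; Write B;
    return A, B, C       # (900, 1100, 90.0) - consistent result

def example_3():
    A, B, C = 1000, 1000, 0
    new_A = A - 100      # T1: Read A; A = A - 100;  (Write A has NOT happened yet)
    temp = A * 0.1       # T2 reads the OLD value of A (1000)
    C = C + temp         # T2: Read C; C = C + Temp; Write C;
    A = new_A            # T1: Write A;
    B = B + 100          # T1: Read B; B = B + 100; Write B;
    return A, B, C       # (900, 1100, 100.0) - C received 10% of the old balance

print(example_2())
print(example_3())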
● But there must be some well-formed rules regarding how to arrange the instructions of the transactions to create error-free concurrent schedules.
● So, we go for the concept of Serializability.
5.8.2 Serializability
✔ A serializable schedule over a set S of committed transactions is a schedule whose effect
on any consistent database instance is guaranteed to be identical to that of some complete
serial schedule over S.
✔ The database instance that results from executing the given schedule is identical to the
database instance that results from executing the transactions in some serial order
● The DBMS might use a concurrency control method that ensures the executed schedule, though not itself serializable, is equivalent to some serial schedule.
● SQL gives application programmers the ability to instruct the DBMS to choose non-
serializable schedules.
(a) Conflict Serializability:
✔ Two instructions of two different transactions may want to access the same data item in order to perform a read/write operation.
✔ Conflict Serializability deals with detecting whether the instructions are conflicting in any
way, and specifying the order in which these two instructions will be executed in case
there is any conflict.
✔ A conflict arises if at least one (or both) of the instructions is a write operation.
1. If two instructions of the two concurrent transactions are both for read operation, then
they are not in conflict, and can be allowed to take place in any order.
2. If one of the instructions wants to perform a read operation and the other instruction wants to perform a write operation, then they are in conflict, hence their ordering is important. If the read instruction is performed first, then it reads the old value of the data item and, after the reading is over, the new value of the data item is written. If the write instruction is performed first, then it updates the data item with the new value and the read instruction reads the newly updated value.
3. If both the instructions are write operations, then they are in conflict but can be allowed to take place in any order, because the transactions do not read the values updated by each other. However, the value that persists in the data item after the schedule is over is the one written by the instruction that performed the last write.
✔ It may happen that we want to execute the same set of transactions in a different schedule on another day.
✔ Keeping in mind these rules, we may sometimes alter parts of one schedule (S1) to create
another schedule (S2) by swapping only the non-conflicting parts of the first schedule.
✔ The conflicting parts cannot be swapped in this way because the ordering of the
conflicting instructions is important and cannot be changed in any other schedule that is
derived from the first.
✔ If these two schedules are made of the same set of transactions, then both S1 and S2
would yield the same result if the conflict resolution rules are maintained while creating
the new schedule.
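✔ A standard way to test conflict serializability (not spelled out in these notes) is to build a precedence graph: add an edge Ti → Tj for every pair of conflicting operations in which Ti's operation comes first, then look for a cycle. The sketch below assumes a schedule given as (transaction, action, item) triples; all names are our own.
def conflict_serializable(schedule):
    # schedule: list of (txn, action, item) with action 'R' or 'W'.
    txns = {t for t, _, _ in schedule}
    edges = set()
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and oi == oj and 'W' in (ai, aj):
                edges.add((ti, tj))            # ti's operation happens first
    adj = {t: [v for u, v in edges if u == t] for t in txns}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in txns}
    def has_cycle(t):                          # depth-first search for a cycle
        colour[t] = GREY
        for v in adj[t]:
            if colour[v] == GREY or (colour[v] == WHITE and has_cycle(v)):
                return True
        colour[t] = BLACK
        return False
    return not any(colour[t] == WHITE and has_cycle(t) for t in txns)

s = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(conflict_serializable(s))   # False: edges T1->T2 and T2->T1 form a cycle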
(b) View Serializability:
✔ Let us consider that the transactions T1 and T2 are scheduled to create two different schedules S1 and S2, which we want to be View Equivalent, and both T1 and T2 want to access the same data item.
✔ The two schedules would be called View Serializable if the following rules are followed while creating the second schedule out of the first.
1. If in S1, T1 reads the initial value of the data item, then in S2 also, T1 should read the
initial value of that same data item.
2. If in S1, T1 writes a value in the data item which is read by T2, then in S2 also, T1
should write the value in the data item before T2 reads it.
3. If in S1, T1 performs the final write operation on that data item, then in S2
also, T1 should perform the final write operation on that data item.
✔ Except in these three cases, any alteration is possible while creating S2 by modifying S1.
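✔ The three rules above can be checked mechanically for two given schedules. Below is a rough sketch, again assuming schedules as (transaction, action, item) triples; the helper names are ours.
def view_equivalent(s1, s2):
    # Check the three view-equivalence conditions between schedules s1 and s2.
    def reads_from(schedule):
        # For every read, record which transaction wrote the value it sees
        # (None = the initial value), and record the final writer per item.
        last_writer, rf, final = {}, set(), {}
        for txn, action, item in schedule:
            if action == 'R':
                rf.add((txn, item, last_writer.get(item)))
            else:                                # 'W'
                last_writer[item] = txn
                final[item] = txn
        return rf, final
    rf1, final1 = reads_from(s1)
    rf2, final2 = reads_from(s2)
    # Rules 1 and 2: same "reads initial value" / "reads from" pairs.
    # Rule 3: the same transaction performs the final write on every item.
    return rf1 == rf2 and final1 == final2

s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
s2 = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(view_equivalent(s1, s2))   # False: in s2, T2 reads the initial value of A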
There are 3 main ways in which a schedule involving two consistency preserving, committed
transactions could run against a consistent database and leave it in an inconsistent state.
● Consider two transactions T1 and T2, each of which, run alone, preserves database consistency: T1 transfers $100 from A to B and T2 increments both A and B by 6%.
● Suppose that the actions are interleaved so that T1 deducts $100 from account A, then T2 reads the current values of A and B and adds 6% to each, and then T1 credits $100 to account B; the result differs from that of either serial order.
✔ Example :
● A transaction that places an order first reads A, checks that it is greater than 0, and
then decrements it.
● Transaction T1 reads A and sees the value 1.
● Suppose that Harry and Larry are two employees and their salaries must be kept
equal.
● Transaction T1 sets their salary to $2000 and transaction T2 sets their salaries to
$1000.
● If we execute these in the serial order T1 followed by T2, both receive the salary $1000; the serial order T2 followed by T1 gives each the salary $2000.
● This is acceptable from the consistency standpoint.
✔ Note that neither transaction reads a salary value before writing it; such a write is called a blind write.
✔ Consider the following interleaving of the actions of T1 and T2: T2 sets Harry’s salary to $1000, T1 sets Larry’s salary to $2000, T2 sets Larry’s salary to $1000 and commits, and finally T1 sets Harry’s salary to $2000 and commits.
✔ The result is not identical to the result of either of the two possible serial executions, and the interleaved schedule is therefore not serializable.
✔ It violates the desired consistency criterion that the two salaries must be equal. This anomaly is called the Lost Update problem.
✔ If T2 had not yet committed, we could deal with the situation by cascading the abort of
T1 and also aborting T2; this process recursively aborts any transaction that read data
written by T2 and so on.
✔ But T2 has already committed, and so we cannot undo its actions. We say that such a
schedule is Unrecoverable.
✔ Recoverable Schedule: transactions commit only after all transactions whose changes
they read commit.
✔ Avoid Cascading Aborts: if transactions read only the changes of committed transactions, not only is the schedule recoverable, but aborting a transaction can also be accomplished without cascading the abort to another transaction.
5.9 Lock-Based Concurrency Control
✔ A locking protocol is a set of rules followed by each transaction and enforced by the DBMS, ensuring that, even though the actions of several transactions might be interleaved, the net effect is identical to executing all transactions in some serial order.
✔ Different locking protocols use different types of locks, such as
1. Shared Lock
2. Exclusive Lock
✔ Shared Lock: multiple transactions can hold a shared lock on the same object (read action).
✔ Exclusive Lock: at most one transaction can hold an exclusive lock on an object (write action).
✔ Rule 1: Each Transaction must obtain a S (shared) lock on object before reading,
and an X (exclusive) lock on object before writing.
✔ Rule 2: All locks held by a transaction are released when the transaction completes.
✔ A transaction that has an exclusive lock can also read the object.
✔ A transaction that requests a lock is suspended until the DBMS is able to grant it the requested lock.
✔ The DBMS keeps track of the locks it has granted and ensures that if a transaction
holds an exclusive lock on an object, no other transaction holds a shared or exclusive
lock on the same object.
✔ Requests to acquire and release locks can be automatically inserted into
transactions by the DBMS; users need not worry about these details.
✔ In effect, the locking protocol allows only safe interleaving of transactions.
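✔ Rules 1 and 2 boil down to a tiny compatibility matrix: two shared locks are compatible, anything involving an exclusive lock is not. A minimal sketch (the names are ours):
# S/X lock compatibility: a new request is compatible with the locks already
# held only if both the request and every held lock are shared.
COMPATIBLE = {('S', 'S'): True, ('S', 'X'): False,
              ('X', 'S'): False, ('X', 'X'): False}

def can_grant(requested_mode, held_modes):
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)

print(can_grant('S', ['S', 'S']))   # True  - many readers may share the object
print(can_grant('X', ['S']))        # False - a writer must wait for the readers
print(can_grant('S', ['X']))        # False - readers must wait for the writer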
● For example, suppose T1 holds an exclusive lock on A and T2 requests a conflicting lock on A. This request cannot be granted until T1 releases its exclusive lock on A, and the DBMS therefore suspends T2.
● T1 now proceeds to obtain an exclusive lock on B, reads and writes B, and finally commits, at which time its locks are released.
● T2’s lock request is now granted, and it proceeds.
5.9.2 Deadlocks
✔ Consider the following example
✔ Deadlock Prevention
● We can prevent deadlocks by giving each transaction a priority and ensuring that
lower priority transactions are not allowed to wait for higher priority
transactions (or vice versa).
● One way to assign priorities is to give each transaction a timestamp when it
starts up.
● The lower the timestamp, the higher the transaction's priority, that is, the oldest
transaction has the highest priority.
● If a transaction Ti requests a lock and transaction Tj holds a conflicting lock, the lock manager can use one of the following two policies:
o Wait-die: If Ti has higher priority, it is allowed to wait; otherwise it is aborted.
o Wound-wait: If Ti has higher priority, abort Tj; otherwise Ti waits.
● In the wait-die scheme, lower priority transactions can never wait for higher priority transactions.
● In the wound-wait scheme, higher priority transactions never wait for lower
priority transactions. In either case no deadlock cycle can develop.
● A transaction with a lower timestamp value has higher priority; this ensures that the oldest transaction will eventually get all the locks it requires.
● The wait-die scheme is non-preemptive; only a transaction requesting a lock can be aborted.
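✔ The two policies can be written down directly from the rules above. In the sketch below, priorities are timestamps (a lower timestamp means an older and therefore higher-priority transaction); the function names and the returned strings are our own illustrative choices.
def wait_die(ts_requester, ts_holder):
    # An older (higher-priority) requester is allowed to wait; a younger one dies.
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    # An older requester wounds (aborts) the younger holder; a younger one waits.
    return "abort holder" if ts_requester < ts_holder else "wait"

# Ti (timestamp 5) requests a lock held by Tj (timestamp 9): Ti is older.
print(wait_die(5, 9))     # wait
print(wound_wait(5, 9))   # abort holder
# Ti (timestamp 9) requests a lock held by Tj (timestamp 5): Ti is younger.
print(wait_die(9, 5))     # abort requester
print(wound_wait(9, 5))   # wait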
✔ Deadlock Detection
● Since deadlocks tend to be rare in practice, rather than taking measures to prevent deadlocks, it may be better to detect and resolve them as they arise.
● In the detection approach, the DBMS must periodically check for deadlocks.
● When a transaction Ti’s lock request cannot be granted, it must wait until all transactions Tj that currently hold conflicting locks release them.
● The lock manager maintains a structure called a waits-for graph to detect deadlock
cycles.
● The nodes correspond to active transactions, and there is an arc from Ti to Tj if
(and only if) Ti is waiting for Tj to release a lock.
● The lock manager adds edges to this graph when it queues lock requests and
removes edges when it grants lock requests.
● The waits-for graph is periodically checked for cycles, which indicate deadlock.
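✔ A waits-for graph is just a set of edges plus a cycle check; a small sketch (all names are ours):
class WaitsForGraph:
    def __init__(self):
        self.edges = {}                            # txn -> set of txns it waits for

    def add_wait(self, waiter, holder):
        self.edges.setdefault(waiter, set()).add(holder)

    def remove_wait(self, waiter, holder):
        self.edges.get(waiter, set()).discard(holder)

    def has_deadlock(self):
        visiting, done = set(), set()
        def dfs(node):                             # depth-first search for a cycle
            visiting.add(node)
            for nxt in self.edges.get(node, set()):
                if nxt in visiting or (nxt not in done and dfs(nxt)):
                    return True
            visiting.discard(node)
            done.add(node)
            return False
        return any(dfs(t) for t in list(self.edges) if t not in done)

g = WaitsForGraph()
g.add_wait("T1", "T2")    # T1 waits for T2
g.add_wait("T2", "T1")    # T2 waits for T1 -> cycle
print(g.has_deadlock())   # True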
✔ The DBMS also maintains a descriptive entry for each transaction in a transaction
table, and among other things, the entry contains a pointer to a list of locks held by
the transaction.
✔ A lock table entry for an object (which can be a page or a record) contains information such as: the number of transactions currently holding a lock on the object (this can be more than one if the object is locked in shared mode), the nature of the lock (shared or exclusive), and a pointer to a queue of lock requests.
✔ The lock manager handles a lock request as follows:
1. If a shared lock is requested, the queue of requests is empty, and the object is not currently locked in exclusive mode, the lock manager grants the lock and updates the lock table entry for the object.
2. If an exclusive lock is requested, and no transaction currently holds a lock on the object
(which also implies the queue of requests is empty), the lock manager grants the lock and
updates the lock table entry.
3. Otherwise, the requested lock cannot be immediately granted, and the lock request is
added to the queue of lock requests for this object. The transaction requesting the lock is
suspended.
✔ When a transaction aborts or commits, it releases all its locks. When a lock on an
object is released, the lock manager updates the lock table entry for the object and
examines the lock request at the head of the queue for this object.
✔ If this request can now be granted, the transaction that made the request is woken
up and given the lock.
✔ Indeed, if there are several requests for a shared lock on the object at the front of the
queue, all of these requests can now be granted together.
✔ Note that if T1 has a shared lock on O, and T2 requests an exclusive lock, T2's
request is queued.
✔ Now, if T3 requests a shared lock, its request enters the queue behind that of T2,
even though the requested lock is compatible with the lock held by T1.
✔ This rule ensures that T2 does not starve, that is, wait indefinitely while a stream of
other transactions acquire shared locks and thereby prevent T2 from getting the
exclusive lock that it is waiting for.
● Suppose a transaction requests a lock; the lock manager checks and finds that no other transaction holds a lock on the object and therefore decides to grant the request.
● But in the meantime, another transaction might have requested and received a
conflicting lock!
● To prevent this, the entire sequence of actions in a lock request call (checking to see
if the request can be granted, updating the lock table, etc.) must be implemented as
an atomic operation.
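✔ Putting the lock-table rules together, here is a rough sketch of a lock manager that grants compatible requests, queues the rest in FIFO order, and performs each request under one mutex so that the check-and-grant sequence is atomic. The class and method names are illustrative, not from the notes.
import threading
from collections import deque

class LockManager:
    def __init__(self):
        self.mutex = threading.Lock()       # makes each request an atomic operation
        self.table = {}                     # obj -> {"holders": {txn: mode}, "queue": deque()}

    def _entry(self, obj):
        return self.table.setdefault(obj, {"holders": {}, "queue": deque()})

    def request(self, txn, obj, mode):      # mode is 'S' or 'X'
        with self.mutex:
            e = self._entry(obj)
            compatible = mode == 'S' and all(m == 'S' for m in e["holders"].values())
            if not e["queue"] and (not e["holders"] or compatible):
                e["holders"][txn] = mode    # grant immediately
                return "granted"
            e["queue"].append((txn, mode))  # otherwise the transaction is suspended
            return "queued"

    def release(self, txn, obj):
        with self.mutex:
            e = self._entry(obj)
            e["holders"].pop(txn, None)
            # Wake the request at the head of the queue, plus any further
            # shared requests that can now be granted together.
            while e["queue"]:
                t, m = e["queue"][0]
                ok = not e["holders"] or (m == 'S' and all(v == 'S' for v in e["holders"].values()))
                if not ok:
                    break
                e["queue"].popleft()
                e["holders"][t] = m

lm = LockManager()
print(lm.request("T1", "A", "S"))   # granted
print(lm.request("T2", "A", "X"))   # queued - conflicts with T1's shared lock
print(lm.request("T3", "A", "S"))   # queued - behind T2, so T2 does not starve
lm.release("T1", "A")               # T2 now receives the exclusive lock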
Optimistic Concurrency Control (validation):
✔ Each transaction executes in three phases: Read, Validation and Write. To validate a transaction Tj, for each committed transaction Ti with TS(Ti) < TS(Tj), one of the following three conditions must hold:
1. Ti completes (all three of its phases) before Tj begins.
2. Ti completes before Tj starts its Write phase, and Ti does not write any database object read by Tj.
3. Ti completes its Read phase before Tj completes its Read phase, and Ti does not write any database object that is either read or written by Tj.
o The third condition allows Ti and Tj to write objects at the same time, and thus
have even more overlap in time than the second condition, but the sets of
objects written by the two transactions cannot overlap. Thus, no RW, WR, or
WW conflicts are possible if any of these three conditions is met.
✔ Checking these validation criteria requires us to maintain lists of objects read and
written by each transaction.
✔ The locking overheads of lock-based approaches are replaced with the overheads
of recording read-lists and write-lists for transactions, checking for conflicts, and
copying changes from the private workspace.
Timestamps
● With each transaction Ti in the system, we associate a unique fixed timestamp, denoted by TS(Ti).
● This timestamp is assigned by the database system before the transaction Ti starts execution.
● If a transaction Ti has been assigned timestamp TS(Ti), and a new transaction Tj enters the system, then TS(Ti) < TS(Tj).
● There are two simple methods for implementing this scheme:
1. Use the value of the system clock as the timestamp; that is, a transaction's
timestamp is equal to the value of the clock when the transaction enters the
system.
2. Use a logical counter that is incremented after a new timestamp has been
assigned; that is, a transaction's timestamp is equal to the value of the counter
when the transaction enters the system.
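✔ Method 2 (the logical counter) is tiny in practice; a sketch, with a lock so that two concurrent transactions never draw the same timestamp (the names are ours):
import threading

class TimestampIssuer:
    def __init__(self):
        self._counter = 0
        self._lock = threading.Lock()

    def next_timestamp(self):
        # Each transaction receives the current counter value; the counter is then
        # incremented, so a transaction entering later always gets a larger timestamp.
        with self._lock:
            ts = self._counter
            self._counter += 1
            return ts

issuer = TimestampIssuer()
ts_t1 = issuer.next_timestamp()   # 0 - Ti enters the system first
ts_t2 = issuer.next_timestamp()   # 1 - so TS(Ti) < TS(Tj)
print(ts_t1, ts_t2)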
● Thus, if TS(Ti) < TS(Tj), then the system must ensure that the produced schedule is equivalent to a serial schedule in which transaction Ti appears before transaction Tj.
● To implement this scheme, we associate with each data item Q two timestamp values:
o W-timestamp (Q) denotes the largest timestamp of any transaction that executed write(Q) successfully.
o R-timestamp (Q) denotes the largest timestamp of any transaction that executed read(Q) successfully.
Crash Recovery
● Transactions (or units of work) against a database can be interrupted unexpectedly.
● If a failure occurs before all of the changes are completed, committed, and written to disk, the database is left in an inconsistent and possibly unusable state.
● Crash recovery is the process by which the database is moved back to a consistent
and usable state.
● This is done by rolling back incomplete transactions and completing committed
transactions that were still in memory when the crash occurred (Figure 1).
ARIES recovery proceeds in three phases:
1. ANALYSIS
During the Analysis phase, the log is scanned to identify the transactions that were active and the pages that were dirty (updated in memory but not yet written to disk) at the time of the crash.
2. REDO
During the REDO phase, the logged updates are reapplied to bring the database up to date; updates that have already reached the database need not be reapplied. Thus only the necessary REDO operations are applied during recovery.
3. UNDO
During the UNDO phase, the log is scanned backwards and the operations of
transactions that were active at the time of the crash are undone in reverse order. The
information needed for ARIES to accomplish its recovery procedure includes the log,
the Transaction Table, and the Dirty Page Table.
Before describing this topic, we need to explain some concepts:
1. Log sequence number
It refers to a pointer used to identify the log records.
2. Dirty page table
It refers to pages whose updated versions are in main memory while the corresponding disk versions have not yet been updated.
A table of such dirty pages is maintained, which is useful in reducing unnecessary REDO operations.
3. Fuzzy checkpoints.
A new type of checkpoint, the fuzzy checkpoint, allows the system to continue processing new transactions (and appending to the log) while the checkpoint is being taken, without having to write all updated pages to the database at that moment.
Media Recovery
● When a database object such as a file or a page is corrupted, a backup copy of that object is restored and brought up to date by using the log.
● Media recovery requires a control file, data files (typically restored from backup), and
online and archived redo log files containing changes since the time the data files
were backed up. Media recovery is most often used to recover from media failure,
such as the loss of a file or disk, or a user error, such as the deletion of the contents of
a table.
1)Analysis
2)Redo
3)Undo
Undoing – If a transaction fails, then the recovery manager may undo it, i.e. reverse the operations of the transaction.
Deferred update – This technique does not physically update the database on disk until a
transaction has reached its commit point. Before reaching commit, all transaction updates are
recorded in the local transaction workspace. If a transaction fails before reaching its commit
point, it will not have changed the database in any way so UNDO is not needed.
Immediate update – In the immediate update, the database may be updated by some
operations of a transaction before the transaction reaches its commit point.
Shadow Paging is a recovery technique that is used to recover databases. In this recovery technique, a database is considered to be made up of fixed-size logical units of storage referred to as pages. Pages are mapped onto physical blocks of storage with the help of a page table, which has one entry for each logical page of the database.
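✔ The page-table idea can be shown with two dictionaries: the shadow table is the saved copy, while the current table is updated as the transaction writes new physical blocks. The names and structures below are our illustrative choices, not from the notes.
# Shadow paging sketch: logical page number -> physical block number.
disk_blocks = {0: "old page 0", 1: "old page 1"}   # stand-in for physical storage
shadow_table = {0: 0, 1: 1}                        # saved before the transaction starts
current_table = dict(shadow_table)                 # working copy used by the transaction
next_free_block = 2

def write_page(logical_page, data):
    # Write to a fresh physical block and repoint the current page table;
    # the block referenced by the shadow table is never overwritten.
    global next_free_block
    disk_blocks[next_free_block] = data
    current_table[logical_page] = next_free_block
    next_free_block += 1

def commit():
    global shadow_table
    shadow_table = dict(current_table)     # installing the new table commits the transaction

def abort():
    global current_table
    current_table = dict(shadow_table)     # old blocks are untouched, so nothing to undo

write_page(1, "new page 1")
abort()
print(disk_blocks[current_table[1]])       # "old page 1" - the update was discarded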
● Full database backup – In this, the full database, including the data and the database meta-information needed to restore the whole database (including full-text catalogs), is backed up at predefined intervals.
● Differential backup – It stores only the data changes that have occurred since the last full database backup. When the same data has changed many times since the last full database backup, a differential backup stores only the most recent version of the changed data. To restore from it, we first need to restore a full database backup.
● Transaction log backup – In this, all events that have occurred in the database (a record of every single statement executed) are backed up. It is the backup of transaction log entries and contains all transactions that have happened in the database. Through this, the database can be recovered to a specific point in time. It is even possible to perform a backup from the transaction log if the data files are destroyed, and not even a single committed transaction is lost.