Unit 3_Transaction Management & Serializability

A transaction is a unit of program execution that accesses and possibly updates various data items.
E.g., a transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two main issues to deal with:
• Failures of various kinds, such as hardware failures and system crashes
• Concurrent execution of multiple transactions
To preserve the integrity of data, the database system must ensure the ACID properties:
A – Atomicity
C – Consistency
I – Isolation
D – Durability
Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
Consistency. Execution of a transaction in isolation preserves the consistency of the database.
Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions. That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished.
Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
Example of Fund Transfer
Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
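The six steps above can be sketched as one atomic transaction. A minimal sketch using Python's built-in sqlite3 module; the accounts table, the starting balances of 1000 and 2000, and the function name transfer are my own illustrative choices, not from the slides:

```python
import sqlite3

# Hypothetical accounts table with illustrative starting balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000), ('B', 2000)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Run the transfer as one atomic transaction."""
    try:
        cur = conn.cursor()
        # steps 1-3: read(A), A := A - amount, write(A)
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                    (amount, src))
        # steps 4-6: read(B), B := B + amount, write(B)
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                    (amount, dst))
        conn.commit()      # durability: changes persist only after commit
    except Exception:
        conn.rollback()    # atomicity: a failure undoes the partial update
        raise

transfer(conn, "A", "B", 50)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # A lost 50, B gained 50; the sum A + B is unchanged
```

If an exception is raised between the two UPDATEs, the rollback discards the debit from A, so no money is "lost".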
Atomicity requirement – if the transaction fails after step 3 and before step 6, money will be “lost”, leading to an inconsistent database state.
Failure could be due to software or hardware.
The system should ensure that updates of a partially executed transaction are not reflected in the database.
Durability requirement — once the user has been notified
that the transaction has completed (i.e., the transfer of the
$50 has taken place), the updates to the database by the
transaction must persist even if there are software or
hardware failures.
Consistency requirement – the sum of A and B is unchanged by the execution of the transaction.
In general, consistency requirements include:
• explicitly specified integrity constraints, such as primary keys and foreign keys
• implicit integrity constraints, e.g., the sum of balances of all accounts minus the sum of loan amounts must equal the value of cash-in-hand
A transaction must see a consistent database. During transaction execution the database may be temporarily inconsistent; when the transaction completes successfully the database must be consistent. Erroneous transaction logic can lead to inconsistency.
• Isolation requirement – if between steps 3 and 6 another transaction T2 is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be).
T1:
1. read(A)
2. A := A – 50
3. write(A)
                T2: read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)
• Isolation can be ensured trivially by running transactions serially
– that is, one after the other.
• However, executing multiple transactions concurrently has
significant benefits, as we will see later.
Transaction State
• Active – the initial state; the transaction stays in this
state while it is executing
• Partially committed – after the final statement has
been executed.
• Failed -- after the discovery that normal execution can
no longer proceed.
• Aborted – after the transaction has been rolled back
and the database restored to its state prior to the start
of the transaction. Two options after it has been
aborted:
– restart the transaction
• can be done only if no internal logical error
– kill the transaction
• Committed – after successful completion.
Concurrent Executions
• Multiple transactions are allowed to run
concurrently in the system. Advantages are:
increased processor and disk utilization, leading to better
transaction throughput
E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation, that is, to control the interaction among concurrent transactions in order to prevent them from destroying the consistency of the database.
Schedules
• Schedule – a sequence of instructions that specifies the order in which instructions of concurrent transactions are executed
– a schedule for a set of transactions must consist of all instructions of those transactions
– it must preserve the order in which the instructions appear in each individual transaction
• A transaction that successfully completes its execution will have a commit instruction as its last statement
– by default, a transaction is assumed to execute a commit instruction as its last step
• A transaction that fails to successfully complete its execution will have an abort instruction as its last statement
Schedule 1
• Let T1 transfer $50 from A to B, and T2 transfer 10% of
the balance from A to B.
• A serial schedule in which T1 is followed by T2 :
Schedule 2
• A serial schedule where T2 is followed by T1
Schedule 3
• Let T1 and T2 be the transactions defined previously.
The following schedule is not a serial schedule, but it is
equivalent to Schedule 1.
This is a concurrent schedule. In Schedules 1, 2 and 3, the sum A + B is preserved.
Schedule 4
• The following concurrent schedule does not
preserve the value of (A + B ).
This problem occurred because two transactions are working on the same resource without knowing each other’s activity.
• Dirty read: a transaction reads values written by another transaction that hasn’t committed yet. This problem, too, occurs because two transactions are working on the same resource without knowing each other’s activity.
• Phantom read: newly inserted rows appear as phantoms to the transaction. A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed as a result of another recently committed transaction.
Serializability
It is the process of finding a concurrent schedule equivalent to a serial schedule.
• Transactions are programs.
• Here we consider only two operations, READ and WRITE.
There are two forms: conflict serializability and view serializability.
Let li and lj be two consecutive instructions of different transactions. There are 4 cases we need to consider:

1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
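The four cases reduce to one rule: two instructions conflict iff they access the same item and at least one of them is a write. A minimal sketch; the (op, item) tuple encoding is my own, not from the slides:

```python
def conflicts(li, lj):
    """li, lj are (op, item) pairs with op in {'read', 'write'}.
    They conflict iff they touch the same item and at least one writes."""
    op_i, q_i = li
    op_j, q_j = lj
    return q_i == q_j and (op_i == "write" or op_j == "write")

# The four cases for the same item Q:
print(conflicts(("read", "Q"),  ("read", "Q")))   # False: reads never conflict
print(conflicts(("read", "Q"),  ("write", "Q")))  # True
print(conflicts(("write", "Q"), ("read", "Q")))   # True
print(conflicts(("write", "Q"), ("write", "Q")))  # True
```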
If a schedule S can be transformed into a schedule S’ by a series of swaps of non-conflicting instructions, we say that S and S’ are conflict equivalent. A schedule S is conflict serializable if it is conflict equivalent to a serial schedule (here S’).
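A standard way to test conflict serializability (the algorithm is standard textbook material, though not spelled out on these slides) is to build a precedence graph with an edge Ti → Tj for every conflicting pair where Ti’s instruction comes first, then check the graph for cycles. A sketch, with the schedule encoded as (txn, op, item) triples of my own devising:

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) triples in execution order.
    Builds the precedence graph and returns True iff it is acyclic."""
    edges = set()
    txns = {t for t, _, _ in schedule}
    for i, (ti, op_i, qi) in enumerate(schedule):
        for tj, op_j, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "write" in (op_i, op_j):
                edges.add((ti, tj))          # Ti must precede Tj

    def has_cycle(node, visiting, done):
        # depth-first search for a back edge
        visiting.add(node)
        for a, b in edges:
            if a == node:
                if b in visiting or (b not in done and has_cycle(b, visiting, done)):
                    return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(t, set(), set()) for t in txns)

# Schedule-3-style interleaving: serializable (equivalent to T1 then T2)
s3 = [("T1","read","A"), ("T1","write","A"), ("T2","read","A"),
      ("T2","write","A"), ("T1","read","B"), ("T1","write","B"),
      ("T2","read","B"), ("T2","write","B")]
# Schedule-4-style interleaving: not serializable (cycle T1 <-> T2)
s4 = [("T1","read","A"), ("T2","read","B"), ("T2","write","B"),
      ("T1","write","A"), ("T2","read","A"), ("T1","read","B"),
      ("T1","write","B"), ("T2","write","A")]
print(is_conflict_serializable(s3))  # True
print(is_conflict_serializable(s4))  # False
```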
• It is possible to have two schedules that produce the same outcome but that are not conflict equivalent.
• Such a schedule is not conflict equivalent to its serial schedule, yet it still moves the database to a consistent state.
• So we must consider the computation performed by transactions rather than just read and write operations.

• But this is hard to implement and computationally expensive.
• View equivalence is another schedule equivalence, based purely on read and write operations.
• Schedules S1 and S2 are said to be view equivalent if three conditions are met:
1. If Ti reads the initial value of object A in S1, it must also read the initial value of A in S2.
2. If Ti reads a value of A written by Tj in S1, it must also read the value of A written by Tj in S2.
3. For each data object A, the transaction (if any) that performs the final write on A in S1 must also perform the final write on A in S2.
If S is view equivalent to a serial schedule (here S’), then S is view serializable.
Ex: a view serializable schedule – it is view equivalent to its serial schedule.
What is the view-equivalent serial schedule for this?
The example below is view serializable but not conflict serializable.
• Every conflict serializable schedule is also view serializable, but the reverse need not be true.
• Every view serializable schedule that is not conflict serializable contains a blind write.
• Blind write: a transaction performing a write without a read operation.
If a transaction fails, we need to undo the effects of this transaction to ensure the atomicity property. Two kinds of schedules matter here:
(a) Recoverable schedule
(b) Cascadeless schedule
(a) Recoverable schedule
Consider the example of a non-recoverable schedule:
Recoverable schedule: for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
(b) Cascadeless schedule
Ex: a cascading schedule.
Cascading rollback: a single transaction failure leads to a series of transaction rollbacks.
Every cascadeless schedule is also recoverable.
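Both definitions can be checked mechanically over a schedule. A sketch assuming a schedule of (txn, op, item) triples where op is 'read', 'write', or 'commit'; the encoding is my own, not from the slides:

```python
def is_recoverable(schedule):
    """Recoverable: if Tj reads an item last written by Ti,
    then Ti commits before Tj commits."""
    last_writer = {}   # item -> txn that wrote it last
    commit_order = []  # txns in commit order
    reads = []         # (reader, writer) read-from dependencies
    for txn, op, item in schedule:
        if op == "write":
            last_writer[item] = txn
        elif op == "read":
            w = last_writer.get(item)
            if w is not None and w != txn:
                reads.append((txn, w))
        elif op == "commit":
            commit_order.append(txn)
    pos = {t: i for i, t in enumerate(commit_order)}
    return all(pos.get(w, len(pos)) < pos.get(r, len(pos) + 1)
               for r, w in reads)

def is_cascadeless(schedule):
    """Cascadeless: transactions read only committed values."""
    last_writer, committed = {}, set()
    for txn, op, item in schedule:
        if op == "write":
            last_writer[item] = txn
        elif op == "commit":
            committed.add(txn)
        elif op == "read":
            w = last_writer.get(item)
            if w is not None and w != txn and w not in committed:
                return False
    return True

# T2 reads A from T1 but commits first: not recoverable, not cascadeless.
bad = [("T1","write","A"), ("T2","read","A"), ("T2","commit","-"),
       ("T1","commit","-")]
good = [("T1","write","A"), ("T1","commit","-"), ("T2","read","A"),
        ("T2","commit","-")]
print(is_recoverable(bad), is_cascadeless(bad))    # False False
print(is_recoverable(good), is_cascadeless(good))  # True True
```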
• The isolation property may no longer be preserved, so we need a concurrency control scheme.
A lock is a mechanism to control concurrent access to a data item. The concurrency-control manager grants the lock to the transaction.
A data item can be locked in two modes:
1. Exclusive (X) mode – the data item can be both read and written. An X-lock is requested using the lock-X instruction.
2. Shared (S) mode – the data item can only be read. An S-lock is requested using the lock-S instruction.
Lock-compatibility matrix: an S-lock is compatible with other S-locks; an X-lock is not compatible with any other lock.
Ex: performing locking. If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released; the lock is then granted.
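The compatibility matrix and the wait rule can be sketched as a tiny lock table; the class and its data structures are my own illustration, not the slides' design:

```python
# S is compatible only with S; X is compatible with nothing.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockTable:
    def __init__(self):
        self.held = {}  # item -> list of (txn, mode) locks currently held

    def request(self, txn, item, mode):
        """Grant the lock if it is compatible with all locks held by
        other transactions; otherwise the requester must wait."""
        others = [m for t, m in self.held.get(item, []) if t != txn]
        if all(COMPATIBLE[(m, mode)] for m in others):
            self.held.setdefault(item, []).append((txn, mode))
            return "granted"
        return "wait"

    def release(self, txn, item):
        self.held[item] = [(t, m) for t, m in self.held.get(item, [])
                           if t != txn]

lt = LockTable()
print(lt.request("T1", "Q", "S"))  # granted
print(lt.request("T2", "Q", "S"))  # granted: S is compatible with S
print(lt.request("T3", "Q", "X"))  # wait: X conflicts with the held S-locks
lt.release("T1", "Q"); lt.release("T2", "Q")
print(lt.request("T3", "Q", "X"))  # granted once the S-locks are released
```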
Necessary condition for deadlock:
Deadlock – a state where neither of the transactions can proceed with its normal execution.
Consider the example of deadlock:
Starvation: if a transaction T never makes progress, it is said to be starved.
When a transaction Ti requests a lock on a data item Q in a particular mode M, we can avoid starvation by granting the lock only when:
• there is no other transaction holding a lock on Q in a mode that conflicts with M, and
• there is no other transaction that is waiting for a lock on Q and that made its lock request before Ti.
A locking protocol is a set of rules that each transaction in the system follows; the rules state when a transaction may lock or unlock each of its data items. A locking protocol restricts the number of possible schedules.
A schedule S is legal under a given locking protocol if S is a possible schedule for a set of transactions that follows the rules of the locking protocol.
A locking protocol ensures conflict serializability if and only if all legal schedules are conflict serializable.
The Two-Phase Locking Protocol
This is a protocol which ensures conflict serializable schedules.
• Phase 1: Growing phase
– the transaction may obtain locks
– the transaction may not release locks
• Phase 2: Shrinking phase
– the transaction may release locks
– the transaction may not obtain locks
• Initially a transaction is in the growing phase and acquires locks as needed. Once the transaction releases a lock, it enters the shrinking phase and can issue no more lock requests.
• Unlock instructions do not need to appear at the end of the transaction.
T1: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B);

T1 (rewritten to follow two-phase locking):
lock-S(A);
read (A);
lock-S(B);
read (B);
unlock(A);
unlock(B);
display(A+B);
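The difference between the two versions of T1 is exactly the two-phase property: the first releases a lock and then acquires another; the second takes all its locks before releasing any. A small checker, with operations encoded as plain strings (my own encoding):

```python
def is_two_phase(ops):
    """ops: sequence of 'lock-S(A)', 'lock-X(A)', 'unlock(A)', ... strings.
    Two-phase: no lock request may follow an unlock."""
    shrinking = False
    for op in ops:
        if op.startswith("unlock"):
            shrinking = True                # entered the shrinking phase
        elif op.startswith("lock") and shrinking:
            return False                    # lock after unlock: violation
    return True

t1_v1 = ["lock-S(A)", "read(A)", "unlock(A)",
         "lock-S(B)", "read(B)", "unlock(B)", "display(A+B)"]
t1_v2 = ["lock-S(A)", "read(A)", "lock-S(B)", "read(B)",
         "unlock(A)", "unlock(B)", "display(A+B)"]
print(is_two_phase(t1_v1))  # False: lock-S(B) comes after unlock(A)
print(is_two_phase(t1_v2))  # True
```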
Lock point: the point in the schedule where the transaction has obtained its final lock (the end of the growing phase).
• Two-phase locking does not ensure freedom from deadlocks.
• Cascading rollback may occur under two-phase locking.
(a) Strict two-phase locking (avoids cascading rollback): locking is two-phase, and all exclusive-mode locks taken by a transaction are held until that transaction commits.
(b) Rigorous two-phase locking: locking is two-phase, and all locks taken by a transaction are held until the transaction commits.
T1: read(A1);
read(A2);
....
read(An);
write(A1)

T2: read(A1);
read(A2);
display(A1+A2);
Two-phase locking with lock conversions:
– First Phase:
•can acquire a lock-S on item
•can acquire a lock-X on item
•can convert a lock-S to a lock-X (upgrade)
– Second Phase:
•can release a lock-S
•can release a lock-X
•can convert a lock-X to a lock-S (downgrade)
Timestamp-Based Protocols
• Another method for determining the serializability order.
• The protocol manages concurrent execution such that the timestamps determine the serializability order.
• A timestamp is assigned by the database system before the transaction starts execution.
Assume transactions arrive in the order T1, T2, T3.
• If an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).
For implementing this we use one of two simple methods:
• the value of the system clock, or
• a logical counter.
The timestamps determine the serializability order, i.e., they ensure that Ti appears before Tj.

The protocol maintains, for each data item Q, two timestamp values:
• W-timestamp(Q) is the largest timestamp of any transaction that executed write(Q) successfully.
• R-timestamp(Q) is the largest timestamp of any transaction that executed read(Q) successfully.
• The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
• The protocol operates as follows. Suppose a transaction Ti issues a read(Q):
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)).
READ(B) example (figure): some reads are accepted and some rejected according to the rules above.
Suppose a transaction Ti issues a write(Q)
1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is
producing was needed previously, and the system assumed
that that value would never be produced.
 Hence, the write operation is rejected, and Ti is rolled
back.
2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write
an obsolete value of Q.
 Hence, this write operation is rejected, and Ti is rolled
back.
3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
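The read and write rules can be folded into one small protocol object. A sketch of the rules as stated above; representing a rollback with an exception is my own implementation choice:

```python
class Rollback(Exception):
    pass

class TimestampOrdering:
    def __init__(self):
        self.r_ts = {}  # item -> largest TS of a successful read(Q)
        self.w_ts = {}  # item -> largest TS of a successful write(Q)

    def read(self, ts, item):
        if ts < self.w_ts.get(item, 0):
            raise Rollback("read of an already-overwritten value")
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)

    def write(self, ts, item):
        if ts < self.r_ts.get(item, 0):
            raise Rollback("a later transaction already read the old value")
        if ts < self.w_ts.get(item, 0):
            raise Rollback("write of an obsolete value")
        self.w_ts[item] = ts

proto = TimestampOrdering()
proto.write(2, "Q")       # accepted: W-timestamp(Q) = 2
try:
    proto.read(1, "Q")    # rejected: TS(T1) < W-timestamp(Q)
except Rollback as e:
    print("read rejected:", e)
proto.read(3, "Q")        # accepted: R-timestamp(Q) = 3
try:
    proto.write(2, "Q")   # rejected: TS < R-timestamp(Q)
except Rollback as e:
    print("write rejected:", e)
```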
WRITE(B) example (figure): some writes are accepted and some rejected according to the rules above.
A schedule that is possible under the timestamp protocol; assume TS(T25) < TS(T26).
• The timestamp-ordering protocol ensures freedom from deadlock.
• However, schedules may not be cascade-free and may not be recoverable.
• Problem with timestamp-ordering protocol:
• Suppose Ti aborts, but Tj has read a data item
written by Ti , then Tj must abort; if Tj had been
allowed to commit earlier, the schedule is not
recoverable.
• Further, any transaction that has read a data item written by Tj must abort; this can lead to cascading rollback, that is, a chain of rollbacks.
• Solution 1:
• A transaction is structured such that its writes are all
performed at the end of its processing
• All writes of a transaction form an atomic action; no
transaction may execute while a transaction is being
written
• A transaction that aborts is restarted with a new
timestamp
• Solution 2: Limited form of locking: wait for data to be
committed before reading it
• Solution 3: Use commit dependencies to ensure
recoverability
Consider the following example. Don’t worry, we have a solution: the Thomas Write Rule.
• Modified version of the timestamp-ordering protocol
in which obsolete write operations may be ignored
under certain circumstances.

• When Ti attempts to write data item Q, if TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
– Rather than rolling back Ti as the timestamp-ordering protocol would have done, this write operation can be ignored.
• Otherwise this protocol is the same as the timestamp
ordering protocol.
• Thomas' Write Rule allows greater potential concurrency.
– Allows some view-serializable schedules that are not
conflict-serializable
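Thomas' rule changes only one branch of the write check: an obsolete write is silently skipped instead of causing a rollback. A standalone sketch; the function name, the dict-based timestamp tables, and the string return values are my own illustration:

```python
def thomas_write(ts, item, r_ts, w_ts):
    """Write check under Thomas' Write Rule.
    r_ts/w_ts map item -> largest read/write timestamp seen so far.
    Returns 'rollback', 'ignored', or 'written'."""
    if ts < r_ts.get(item, 0):
        return "rollback"   # a later transaction already read the old value
    if ts < w_ts.get(item, 0):
        return "ignored"    # obsolete write: skip it instead of rolling back
    w_ts[item] = ts
    return "written"

r_ts, w_ts = {}, {"Q": 5}
print(thomas_write(3, "Q", r_ts, w_ts))  # ignored (plain TS ordering would roll back)
print(thomas_write(7, "Q", r_ts, w_ts))  # written; W-timestamp(Q) becomes 7
r_ts["Q"] = 9
print(thomas_write(8, "Q", r_ts, w_ts))  # rollback: TS < R-timestamp(Q)
```

Skipping the obsolete write is what admits some view-serializable schedules that are not conflict serializable.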
Types of failure
1. Transaction failure
• Logical error (due to an internal condition): bad input, data not found, resource limit exceeded.
• System error: the system entered an undesirable state (e.g., deadlock).
2. System crash: hardware malfunction, or a bug in the database software or OS, causing loss of the content of volatile storage.
3. Disk failure: head crash or failure during a data-transfer operation.
•Volatile storage:
•does not survive system crashes
•examples: main memory, cache memory
•Nonvolatile storage:
•survives system crashes
•examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM
•but may still fail, losing data
•Stable storage:
•a mythical form of storage that survives all
failures
•approximated by maintaining multiple copies on
distinct nonvolatile media
Blocks: fixed-length storage units.
• Physical blocks are the blocks residing on the disk.
• Buffer blocks are the blocks residing temporarily in main memory.
Block movements between disk and main memory are initiated through the following two operations:
• input(B) transfers the physical block B to main memory.
• output(B) transfers the buffer block B to the disk, and replaces the appropriate physical block there.
Example of Data Access (figure): transactions T1 and T2 each have a private work area in memory (holding local copies x1, x2, y1); input(A) and output(B) move buffer blocks A and B between disk and main memory, while read(X) and write(Y) move values between buffer blocks and the work areas.
• Each transaction Ti has its private work-area in which local copies
of all data items accessed and updated by it are kept.
– Ti's local copy of a data item X is called xi.
• Transferring data items between system buffer blocks and its
private work-area done by:
– read(X) assigns the value of data item X to the local variable xi.
– write(X) assigns the value of local variable xi to data item X in the buffer block.
– Note: output(BX) need not immediately follow write(X). The system can perform the output operation when it deems fit.
• Transactions
– Must perform read(X) before accessing X for the first time
(subsequent reads can be from local copy)
– write(X) can be executed at any time before the transaction
commits
Transaction Log
• also known as the journal log / redo log
• It is a physical file.
• It usually contains:
• a transaction identifier
• a data-item identifier (or timestamp)
• the old value
• the new value
We denote the various types of log records as:
1. <Ti start> : transaction Ti starts
2. <Ti, X, V1, V2> : written before Ti executes write(X); V1 is the old value of X and V2 is the new value
3. <Ti commit> : Ti finishes its last statement
4. <Ti abort> : Ti has aborted
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0 commit>
<T1 start>
<T1, C, 700, 600>
<T1 commit>
Deferred Database Modification
• Also known as NO-UNDO/REDO.
• An algorithm to support OS, application, power, memory and machine failures.
• While a transaction runs, changes are recorded only in the log files, not in the database.
• On commit, the changes are copied from the log to the database. This process is called re-doing (redo(Ti)), sometimes known as ROLLFORWARD.
• On rollback, the log records are simply discarded.
• Disadvantage: increased recovery time in case of system failure.
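Deferred-modification recovery can be sketched in a few lines: replay (redo) the log for committed transactions only; records of uncommitted transactions are discarded. The tuple encoding mirrors the <Ti, X, V1, V2> record format above and is otherwise my own:

```python
def redo_committed(log):
    """log: list of ('start', T), ('write', T, X, old, new), ('commit', T).
    Returns the database state after redoing committed transactions."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _old, new = rec
            db[item] = new          # redo: apply the new value
    return db

log = [("start", "T0"), ("write", "T0", "A", 1000, 950),
       ("write", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "C", 700, 600)]  # T1 never commits
print(redo_committed(log))  # {'A': 950, 'B': 2050}; C is untouched
```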
Immediate Database Modification
• Also known as UNDO/REDO.
• An algorithm to support OS, application, power, memory and machine failures.
• The transaction updates/alters the database while it is active.
• On commit, all the changes to the database are made permanent and the log records can be discarded.
• On rollback, the old values are restored using the log entries and all the changes made to the database are discarded. This process is called un-doing (undo(Ti)): original values are restored from the log for uncommitted transactions.
Log                      Write             Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                         A = 950
                         B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                         C = 600
                                           BB, BC
<T1 commit>
                                           BA
• Note: BX denotes the block containing X. BC is output before T1 commits; BA is output after T0 commits.
• The process of undoing changes using log files is frequently referred to as rollback.
• Disadvantage: frequent I/O operations while the transaction is active.
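Under immediate modification, recovery must both undo uncommitted transactions (restoring old values by scanning the log backwards) and redo committed ones. A sketch on the same hypothetical log-record tuples as before:

```python
def recover(db, log):
    """db: database state on disk at crash time (may hold uncommitted writes).
    log: list of ('start', T), ('write', T, X, old, new), ('commit', T).
    Undo uncommitted transactions backwards, then redo committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):                 # undo pass (backwards)
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, old, _new = rec
            db[item] = old                    # undo(Ti): restore old value
    for rec in log:                           # redo pass (forwards)
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _old, new = rec
            db[item] = new                    # redo(Ti): reapply new value
    return db

# Crash after T1 wrote C = 600 but before it committed:
db = {"A": 950, "B": 2050, "C": 600}
log = [("start", "T0"), ("write", "T0", "A", 1000, 950),
       ("write", "T0", "B", 2000, 2050), ("commit", "T0"),
       ("start", "T1"), ("write", "T1", "C", 700, 600)]
print(recover(db, log))  # {'A': 950, 'B': 2050, 'C': 700}
```

T1's uncommitted write to C is rolled back to 700, while T0's committed writes survive.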
Checkpoints
• A commercial RDBMS is neither purely deferred nor purely immediate: the database is updated at fixed intervals of time, irrespective of the transactions’ commit/un-commit state.
• Checkpointing: updating the database at fixed intervals of time is called checkpointing.
• At checkpoint time, the changes in the log files are applied to the database.
• During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti.
1. Scan backwards from the end of the log to find the most recent <checkpoint L> record.
– Only transactions that are in L or started after the checkpoint need to be redone or undone.
– Transactions that committed or aborted before the checkpoint already have all their updates output to stable storage.
• Some earlier part of the log may be needed for undo operations.
2. Continue scanning backwards until a record <Ti start> is found for every transaction Ti in L.
– Parts of the log prior to the earliest <Ti start> record above are not needed for recovery, and can be erased whenever desired.
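The backward scan of step 1 can be sketched as follows; the log is a flat list and ('checkpoint', L) carries the list L of transactions active at the checkpoint (these record shapes are my own, not from the slides):

```python
def transactions_to_recover(log):
    """Scan backwards to the most recent ('checkpoint', L) record.
    Return the set of transactions that must be redone or undone:
    those in L plus those that started after the checkpoint."""
    to_recover = set()
    for i in range(len(log) - 1, -1, -1):   # backwards scan
        rec = log[i]
        if rec[0] == "start":
            to_recover.add(rec[1])          # started after the checkpoint
        elif rec[0] == "checkpoint":
            to_recover.update(rec[1])       # active at the checkpoint
            return to_recover
    return to_recover                        # no checkpoint: consider all

log = [("start", "T0"), ("commit", "T0"),   # finished before the checkpoint
       ("start", "T1"),
       ("checkpoint", ["T1"]),              # T1 was active at the checkpoint
       ("start", "T2"), ("commit", "T2"),
       ("start", "T3")]
print(transactions_to_recover(log))  # {'T1', 'T2', 'T3'}; T0 can be skipped
```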
Shadow Paging
• An alternative to log-based crash-recovery techniques.
• Advantage: requires fewer disk accesses than log-based recovery.
• Pages: fixed-length partitioned blocks in the database.
• Page table: has n entries, one for each database page; each entry contains a pointer to a page on disk.