Transaction DBMS

Transaction isolation is important to ensure database consistency when transactions run concurrently. Allowing concurrency improves throughput and reduces waiting times by enabling transactions to share CPU and disk resources. However, concurrent execution can violate isolation and destroy consistency if not properly controlled. The database system uses concurrency control schemes to prevent inconsistent interactions between transactions. Serializability is a key concept where a concurrent schedule is considered serializable if it is equivalent to a serial schedule where transactions execute one at a time. Conflict serializability checks if two instructions from different transactions that access the same data conflict, and non-conflicting instructions can be reordered to test for equivalence to a serial schedule.

Transaction Isolation:

• Transaction-processing systems usually allow multiple transactions to
run concurrently.
• Allowing multiple transactions to update data concurrently causes
several complications with the consistency of the data.
• Ensuring consistency in the case of concurrent execution of
transactions requires extra work.
• It is easier to insist that transactions run serially, that is, one at a
time, each starting only after the previous one has completed.

However, there are two good reasons for allowing concurrency:

• Improved throughput and resource utilization:

A transaction consists of many steps. Some involve I/O activity; others
involve CPU activity. The CPU and the disks in a computer system can
operate in parallel. Therefore, I/O activity can be done in parallel with
processing at the CPU.

While a read or write on behalf of one transaction is in progress on one
disk, another transaction can be running in the CPU, while another disk
may be executing a read or write on behalf of a third transaction. All of
this increases the throughput of the system, that is, the number of
transactions executed in a given amount of time. Correspondingly, the
processor and disk utilization also increase.

• Reduced waiting time:

There may be a mix of transactions running on a system, some short and
some long. If transactions run serially, a short transaction may have to
wait for a preceding long transaction to complete, which can lead to
unpredictable delays in running a transaction. If the transactions are
operating on different parts of the database, it is better to let them run
concurrently, sharing the CPU cycles and disk accesses among them.
Concurrent execution reduces the unpredictable delays in running
transactions. It also reduces the average response time: the average time
for a transaction to be completed after it has been submitted.

• When several transactions run concurrently, the isolation property
may be violated, resulting in database consistency being destroyed
despite the correctness of each individual transaction.
• The database system must control the interaction among the
concurrent transactions to prevent them from destroying the
consistency of the database.
• It does so through a variety of mechanisms called
concurrency-control schemes.

Consider the example below, which has several accounts and a set of
transactions that access and update those accounts.

Let T1 and T2 be two transactions that transfer funds from one account to
another.

Transaction T1 transfers $50 from account A to account B. It is defined
as:
T1: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Transaction T2 transfers 10 percent of the balance from account A to
account B. It is defined as:

T2: read(A);
temp := A * 0.1;
A := A − temp;
write(A);
read(B);
B := B + temp;
write(B).
Suppose the current values of accounts A and B are $1000 and $2000,
respectively. Suppose also that the two transactions are executed one at a
time in the order T1 followed by T2. This execution sequence appears in
Figure 14.2.

The final values of accounts A and B, after the execution in Figure 14.2
takes place, are $855 and $2145, respectively. In this schedule, the total
amount of money in accounts A and B, that is, the sum A + B, is preserved
after the execution of both transactions.

Similarly, if the transactions are executed one at a time in the order T2
followed by T1, then the corresponding execution sequence is that of
Figure 14.3. Again, the sum A + B is preserved, and the final values of
accounts A and B are $850 and $2150, respectively.

• The execution sequences just described are called schedules. These
schedules are serial: each serial schedule consists of a sequence of
instructions from various transactions, where the instructions
belonging to one single transaction appear together in that schedule.
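
The two serial schedules can be checked with a short simulation. This is only a sketch: the function names and the dictionary standing in for the database are illustrative assumptions, and the 10 percent is computed with integer division, which is exact here because the balances are multiples of 10.

```python
def t1(db):
    """T1: transfer $50 from account A to account B."""
    a = db["A"]      # read(A)
    a = a - 50
    db["A"] = a      # write(A)
    b = db["B"]      # read(B)
    b = b + 50
    db["B"] = b      # write(B)


def t2(db):
    """T2: transfer 10 percent of A's balance from A to B."""
    a = db["A"]      # read(A)
    temp = a // 10   # A * 0.1 in integers (assumption: balance is a multiple of 10)
    a = a - temp
    db["A"] = a      # write(A)
    b = db["B"]      # read(B)
    b = b + temp
    db["B"] = b      # write(B)


def run_serial(order):
    """Run the given transactions one at a time on fresh starting balances."""
    db = {"A": 1000, "B": 2000}
    for txn in order:
        txn(db)
    return db


if __name__ == "__main__":
    for order in ([t1, t2], [t2, t1]):
        final = run_serial(order)
        names = [txn.__name__ for txn in order]
        print(names, final, "sum =", final["A"] + final["B"])
```

Both orders leave the sum A + B at 3000, although the individual balances differ between the two serial schedules.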

When the database system executes several transactions concurrently, the
corresponding schedule no longer needs to be serial.

If two transactions are running concurrently, the operating system may
execute one transaction for a little while, then perform a context switch,
execute the second transaction for some time, and then switch back to the
first transaction for some time, and so on.

With multiple transactions, the CPU time is shared among all the
transactions. Not all concurrent executions result in a correct state.

To illustrate, consider the schedule of Figure 14.5. After the execution
of this schedule, we arrive at a state where the final values of accounts
A and B are $950 and $2100, respectively. This final state is an
inconsistent state, since we have gained $50 in the process of the
concurrent execution. Indeed, the sum A + B is not preserved by the
execution of the two transactions.
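
The lost update can be reproduced with a step-by-step simulation. The figure is not reproduced in this text, so the interleaving below is an assumption: it is one ordering of the steps of T1 and T2 that yields the stated final values of $950 and $2100. The helper names (`read`, `write`, `compute`, `run`) are illustrative only.

```python
def read(item):
    """read(item): copy the database value into the transaction's local variable."""
    def step(db, local):
        local[item] = db[item]
    return step


def write(item):
    """write(item): copy the transaction's local variable back to the database."""
    def step(db, local):
        db[item] = local[item]
    return step


def compute(update):
    """A purely local computation that does not touch the database."""
    def step(db, local):
        update(local)
    return step


def take_ten_percent(local):
    local["temp"] = local["A"] // 10   # integer form of A * 0.1
    local["A"] -= local["temp"]


# Assumed interleaving: T2 runs between T1's read(A) and T1's write(A).
INTERLEAVED = [
    ("T1", read("A")),
    ("T1", compute(lambda v: v.update(A=v["A"] - 50))),
    ("T2", read("A")),
    ("T2", compute(take_ten_percent)),
    ("T2", write("A")),
    ("T2", read("B")),
    ("T1", write("A")),                 # overwrites T2's update of A
    ("T1", read("B")),
    ("T1", compute(lambda v: v.update(B=v["B"] + 50))),
    ("T1", write("B")),
    ("T2", compute(lambda v: v.update(B=v["B"] + v["temp"]))),
    ("T2", write("B")),                 # overwrites T1's update of B
]


def run(schedule):
    db = {"A": 1000, "B": 2000}
    local = {}                          # one private variable set per transaction
    for txn, step in schedule:
        step(db, local.setdefault(txn, {}))
    return db


if __name__ == "__main__":
    final = run(INTERLEAVED)
    print(final, "sum =", final["A"] + final["B"])
```

Each transaction works on a stale local copy, so one update of A and one update of B are overwritten and the sum grows from 3000 to 3050.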

If control of concurrent execution is left entirely to the operating
system, many schedules are possible, including ones that leave the
database in an inconsistent state.

It is the job of the database system to ensure that any schedule that is
executed will leave the database in a consistent state. The
concurrency-control component of the database system carries out
this task.
Serializability:

• A non-serial schedule is called a serializable schedule if it can be
transformed into an equivalent serial schedule.
OR
• If a non-serial schedule and a serial schedule produce the same result,
then the non-serial schedule is called a serializable schedule.

Serializability is a property of schedules. A schedule is the
chronological order in which the instructions of a set of transactions
are executed, and a transaction is a sequence of instructions that
performs one logical operation on the database.

• Serializability ensures that a non-serial schedule is equivalent to a
serial schedule. It lets transactions execute concurrently while leaving
the database in the same state as if they had run one at a time, without
interleaving.

Serializability is a way to check whether the concurrent execution of two
or more transactions maintains database consistency.

There are two types of serializability:

1. Conflict Serializability
2. View Serializability

Conflict Serializability:

Let us consider a schedule S in which there are two consecutive
instructions, I and J, of transactions Ti and Tj, respectively (i ≠ j).

• If I and J refer to different data items, then we can swap I and J
without affecting the results of any instruction in the schedule.
• If I and J refer to the same data item Q, then the order of the two
steps may matter.
• As we are dealing with only read and write instructions, there are
four cases that we need to consider:

1. I = read(Q), J = read(Q). The order of I and J does not matter, since the
same value of Q is read by Ti and Tj, regardless of the order.

2. I = read(Q), J = write(Q). If I comes before J, then Ti does not read the
value of Q that is written by Tj in instruction J. If J comes before I, then
Ti reads the value of Q that is written by Tj. Thus, the order of I and J
matters.

3. I = write(Q), J = read(Q). The order of I and J matters for reasons
similar to those of the previous case.

4. I = write(Q), J = write(Q). Since both instructions are write operations,
the order of these instructions does not affect either Ti or Tj. However,
the value obtained by the next read(Q) instruction of S is affected, since
the result of only the latter of the two write instructions is preserved
in the database.

Thus, only when both I and J are read instructions does the order of the
two instructions not matter.

We say that I and J conflict if they are operations by different
transactions on the same data item, and at least one of these
instructions is a write operation.
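
The conflict rule translates directly into a small predicate. This is a minimal sketch; the `Op` record and its field names are assumptions made for illustration.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "read" or "write"
    item: str   # data item, e.g. "A"


def conflicts(i: Op, j: Op) -> bool:
    """True if i and j belong to different transactions, access the same
    data item, and at least one of them is a write."""
    return (
        i.txn != j.txn
        and i.item == j.item
        and "write" in (i.kind, j.kind)
    )


if __name__ == "__main__":
    print(conflicts(Op("T1", "write", "A"), Op("T2", "read", "A")))  # same item, one write
    print(conflicts(Op("T2", "write", "A"), Op("T1", "read", "B")))  # different items
```

Only the read/read case and the different-item case are free of conflict, which matches the four cases listed above.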

Consider schedule 3 in Figure 14.6.

The write(A) instruction of T1 conflicts with the read(A) instruction of
T2. However, the write(A) instruction of T2 does not conflict with the
read(B) instruction of T1, because the two instructions access different
data items.

Let I and J be consecutive instructions of a schedule S. If I and J are
instructions of different transactions and I and J do not conflict, then we
can swap the order of I and J to produce a new schedule S'.

S is equivalent to S', since all instructions appear in the same order in
both schedules except for I and J, whose order does not matter.

Since the write(A) instruction of T2 in schedule 3 of Figure 14.6 does not
conflict with the read(B) instruction of T1, we can swap these instructions
to generate an equivalent schedule, schedule 5, in Figure 14.7.

We continue to swap nonconflicting instructions as follows:
• Swap the read(B) instruction of T1 with the read(A) instruction of T2.
• Swap the write(B) instruction of T1 with the write(A) instruction of T2.
• Swap the write(B) instruction of T1 with the read(A) instruction of T2.

The final result of these swaps, schedule 6 of Figure 14.8, is a serial
schedule.

Schedule 6 is exactly the same as schedule 1, but it shows only the read
and write instructions. Thus, we have shown that schedule 3 is equivalent
to a serial schedule (schedule 1).

Conflict equivalent: If a schedule S can be transformed into a schedule S'
by a series of swaps of nonconflicting instructions, we say that S and S'
are conflict equivalent.

Conflict serializable: A schedule S is conflict serializable if it is
conflict equivalent to a serial schedule.

Schedule 3 is conflict serializable, since it is conflict equivalent to the
serial schedule 1.

Consider schedule 7 of Figure 14.9; it consists of only the significant
operations (that is, the read and write) of transactions T3 and T4. This
schedule is not conflict serializable, since it is not equivalent to either
the serial schedule <T3,T4> or the serial schedule <T4,T3>.
Testing for Serializability:

We can test for conflict serializability using a precedence graph. The
same test is also the first step when checking view serializability,
since every conflict serializable schedule is also view serializable.

Consider a schedule S. We construct a directed graph, called a
precedence graph, from S. This graph consists of a pair G = (V, E),
where V is a set of vertices and E is a set of edges. The set of vertices
consists of all the transactions participating in the schedule. The set of
edges consists of all edges Ti → Tj for which one of three conditions
holds:

1. Ti executes write(Q) before Tj executes read(Q).

2. Ti executes read(Q) before Tj executes write(Q).

3. Ti executes write(Q) before Tj executes write(Q).

If an edge Ti → Tj exists in the precedence graph, then, in any serial
schedule S' equivalent to S, Ti must appear before Tj.

Example: The precedence graph for schedule 1 in Figure 14.10a contains
the single edge T1 → T2, since all the instructions of T1 are executed
before the first instruction of T2 is executed.

Similarly, Figure 14.10b shows the precedence graph for schedule 2 with
the single edge T2 → T1, since all the instructions of T2 are executed
before the first instruction of T1 is executed.

The precedence graph for schedule 4 appears in Figure 14.11. It contains
the edge T1 → T2, because T1 executes read(A) before T2 executes
write(A). It also contains the edge T2 → T1, because T2 executes read(B)
before T1 executes write(B).

If the precedence graph for S has a cycle, then schedule S is not conflict
serializable. If the graph contains no cycles, then the schedule S is
conflict serializable.

If we draw the precedence graph for schedule 3, it does not contain any
cycles. Hence schedule 3 is conflict serializable.

The precedence graphs for schedules 1 and 2 (Figure 14.10) do not contain
cycles. The precedence graph for schedule 4 (Figure 14.11), on the other
hand, contains a cycle, indicating that this schedule is not conflict
serializable.
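
The precedence-graph test can be sketched in a few lines. The function names are illustrative, and because the figures are not reproduced in this text, the read/write orderings used for schedules 3 and 4 are assumptions chosen to agree with the swaps and edges described above.

```python
from itertools import combinations


def precedence_graph(schedule):
    """schedule: list of (txn, kind, item) in execution order.
    Returns the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    # combinations() keeps input order, so the first op of each pair runs earlier.
    for (ti, ki, qi), (tj, kj, qj) in combinations(schedule, 2):
        if ti != tj and qi == qj and "write" in (ki, kj):
            edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True          # back edge: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# Assumed read/write orderings of schedules 3 and 4.
SCHEDULE_3 = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"),
    ("T1", "read", "B"), ("T1", "write", "B"),
    ("T2", "read", "B"), ("T2", "write", "B"),
]
SCHEDULE_4 = [
    ("T1", "read", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"), ("T2", "read", "B"),
    ("T1", "write", "A"), ("T1", "read", "B"), ("T1", "write", "B"),
    ("T2", "write", "B"),
]

if __name__ == "__main__":
    for name, s in (("schedule 3", SCHEDULE_3), ("schedule 4", SCHEDULE_4)):
        print(name, sorted(precedence_graph(s)), is_conflict_serializable(s))
```

When the graph is acyclic, any topological order of its vertices gives a serial schedule that is conflict equivalent to S.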

View Serializability and View Serializable Schedules

If a non-serial schedule is view equivalent to some serial schedule, then
the schedule is called a view serializable schedule. Like conflict
serializability, it is used to ensure that a concurrent schedule leaves
the database consistent.

View equivalence

The three conditions that two schedules S1 and S2 must satisfy to be view
equivalent are:

1. Initial reads must match: for each data item, a transaction that reads
the initial value of the item in S1 must also read its initial value in S2.

Example: If transaction T1 reads the initial value of A from the database
in schedule S1, then in schedule S2, T1 must also read the initial value
of A.

2. Final writes must match: for each data item, the transaction that
performs the final write in S1 must also perform the final write in S2.

Example: If transaction T1 performs the last write of A in S1, then in S2,
T1 must also perform the final write of A.

3. Intermediate reads must match: if a transaction reads a value written
by another transaction in S1, it must read the value written by that same
transaction in S2.

Example: If T1 reads a value of A written by T2 in S1, then in S2, T1
must also read the value of A written by T2.

Checking whether a schedule is view equivalent to some serial schedule is
the test for view serializability.

Example: We have a schedule S with two transactions, T1 and T2, running
concurrently.

Schedule S:

T1        T2
R(x)
W(x)
          R(x)
          W(x)
R(y)
W(y)
          R(y)
          W(y)

Form its view equivalent schedule S' by moving the read and write
operations of T1 on y ahead of the operations of T2 on x, so that the
operations of each transaction appear together. S':

T1        T2
R(x)
W(x)
R(y)
W(y)
          R(x)
          W(x)
          R(y)
          W(y)

Since a view equivalent serial schedule exists, S is a view serializable
schedule.

Every conflict serializable schedule is also view serializable, but the
converse is not always true.

Testing for View Serializability

1. First, check for conflict serializability. If the schedule is conflict
serializable, it is also view serializable.

2. Otherwise, check for a blind write. If there is a blind write, then the
schedule may still be view serializable, so check it by looking for a
view equivalent serial schedule. If there is no blind write, then a
schedule that is not conflict serializable can never be view
serializable.

A blind write is a write to a data item by a transaction that has not
read that item.

Example:

We have a schedule S with three transactions, T1, T2, and T3, running
concurrently.

Schedule S:

T1        T2        T3
R(x)
          W(x)
W(x)
                    W(x)

Its precedence graph contains the edges T1 → T2 (T1 reads x before T2
writes x) and T2 → T1 (T2 writes x before T1 writes x), as well as
T1 → T3 and T2 → T3.

Since the graph contains a cycle between T1 and T2, the schedule is not
conflict serializable. However, there are blind writes present (W(x) in T2
and W(x) in T3), so we check for view serializability.

One of its view equivalent schedules is:

S':

T1        T2        T3
R(x)
W(x)
          W(x)
                    W(x)
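
For small schedules, view serializability can be checked by brute force: summarise each schedule by its reads-from relation and its final writers, then compare against every serial order. The sketch below assumes the blind-write schedule is R1(x), W2(x), W1(x), W3(x); the function names are illustrative. Testing view serializability is NP-complete in general, so this exhaustive search is only practical for a handful of transactions.

```python
from itertools import permutations


def view_summary(schedule):
    """Return (reads_from, final_writer) for a list of (txn, kind, item)."""
    last_write = {}     # item -> (txn, k): latest write is the k-th by txn
    writes = {}         # (txn, item) -> number of writes seen so far
    reads = {}          # (txn, item) -> number of reads seen so far
    reads_from = set()
    for txn, kind, item in schedule:
        key = (txn, item)
        if kind == "read":
            reads[key] = reads.get(key, 0) + 1
            # None as the source means the read sees the initial value.
            reads_from.add((txn, item, reads[key], last_write.get(item)))
        else:
            writes[key] = writes.get(key, 0) + 1
            last_write[item] = (txn, writes[key])
    final_writer = {item: w[0] for item, w in last_write.items()}
    return reads_from, final_writer


def view_equivalent(s1, s2):
    return view_summary(s1) == view_summary(s2)


def view_serializable(schedule):
    """Return a serial order that is view equivalent to schedule, or None."""
    txns = sorted({txn for txn, _, _ in schedule})
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_equivalent(schedule, serial):
            return order
    return None


# Assumed layout of the blind-write example.
S = [
    ("T1", "read", "x"),
    ("T2", "write", "x"),
    ("T1", "write", "x"),
    ("T3", "write", "x"),
]

if __name__ == "__main__":
    print(view_serializable(S))
```

The search finds the serial order T1, T2, T3: T1 still reads the initial value of x and T3 still performs the final write, so the cycle in the precedence graph does not prevent view serializability.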

Transaction Isolation and Atomicity

• If a transaction Ti fails, for whatever reason, we need to undo the
effect of this transaction to ensure the atomicity property of the
transaction.
• In a system that allows concurrent execution, the atomicity
property requires that any transaction Tj that is dependent on Ti
(that is, Tj has read data written by Ti) is also aborted.
• To achieve this, we need to place restrictions on the type of
schedules permitted in the system.

Recoverable Schedules:

Consider the partial schedule 9 in Figure 14.14, in which T7 is a
transaction that performs only one instruction: read(A). We call this a
partial schedule because we have not included a commit or abort
operation for T6. Notice that T7 commits immediately after executing the
read(A) instruction. Thus, T7 commits while T6 is still in the active state.
Now suppose that T6 fails before it commits. T7 has read the value of
data item A written by T6. Therefore, we say that T7 is dependent on T6.
Because of this, we must abort T7 to ensure atomicity. However, T7 has
already committed and cannot be aborted. Thus, we have a situation
where it is impossible to recover correctly from the failure of T6.
Schedule 9 is an example of a nonrecoverable schedule.

Recoverable schedule: A recoverable schedule is one where, for each pair
of transactions Ti and Tj such that Tj reads a data item previously
written by Ti, the commit operation of Ti appears before the commit
operation of Tj. For the example of schedule 9 to be recoverable, T7
would have to delay committing until after T6 commits.
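
The definition can be turned into a simple check over a schedule that also records commit operations. This is a sketch under assumptions: the tuple layout, the function name, and the encoding of schedule 9 (only the steps described in the text) are illustrative, and aborts are not modelled.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, kind, item), kind in {"read", "write", "commit"}.
    Checks that every committed reader commits after the writer it read from."""
    commit_pos = {
        txn: pos
        for pos, (txn, kind, _) in enumerate(schedule)
        if kind == "commit"
    }
    last_writer = {}  # item -> transaction that most recently wrote it
    for txn, kind, item in schedule:
        if kind == "write":
            last_writer[item] = txn
        elif kind == "read":
            writer = last_writer.get(item)
            if writer is None or writer == txn:
                continue              # initial value or own write: no dependency
            if txn in commit_pos and (
                writer not in commit_pos
                or commit_pos[writer] > commit_pos[txn]
            ):
                return False          # reader committed before the writer did
    return True


# Schedule 9 as described: T7 reads A written by T6 and commits first.
SCHEDULE_9 = [
    ("T6", "read", "A"),
    ("T6", "write", "A"),
    ("T7", "read", "A"),
    ("T7", "commit", None),
]

# The repaired version: T7 delays its commit until after T6 commits.
SCHEDULE_9_FIXED = [
    ("T6", "read", "A"),
    ("T6", "write", "A"),
    ("T7", "read", "A"),
    ("T6", "commit", None),
    ("T7", "commit", None),
]

if __name__ == "__main__":
    print(is_recoverable(SCHEDULE_9), is_recoverable(SCHEDULE_9_FIXED))
```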

Cascadeless Schedules

Even if a schedule is recoverable, to recover correctly from the failure of
a transaction Ti, we may have to roll back several transactions. Such
situations occur if transactions have read data written by Ti. As an
illustration, consider the partial schedule of Figure 14.15. Transaction T8
writes a value of A that is read by transaction T9. Transaction T9 writes a
value of A that is read by transaction T10.

Suppose that, at this point, T8 fails. T8 must be rolled back. Since T9 is
dependent on T8, T9 must be rolled back. Since T10 is dependent on T9,
T10 must be rolled back. This phenomenon, in which a single transaction
failure leads to a series of transaction rollbacks, is called cascading
rollback.

Cascading rollback is undesirable, since it leads to the undoing of a
significant amount of work. It is desirable to restrict the schedules to
those where cascading rollbacks cannot occur. Such schedules are called
cascadeless schedules.

Cascadeless schedule: A cascadeless schedule is one where, for each pair
of transactions Ti and Tj such that Tj reads a data item previously
written by Ti, the commit operation of Ti appears before the read
operation of Tj. It is easy to verify that every cascadeless schedule is
also recoverable.
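
Both ideas can be sketched in code: a check that no transaction reads uncommitted data, and a computation of which transactions must be rolled back when one fails. The function names and the encoding of the Figure 14.15 schedule (taken from the description in the text) are assumptions for illustration.

```python
def is_cascadeless(schedule):
    """True if every read of another transaction's write happens only after
    that writer has committed. Ops are (txn, kind, item)."""
    committed = set()
    last_writer = {}
    for txn, kind, item in schedule:
        if kind == "commit":
            committed.add(txn)
        elif kind == "write":
            last_writer[item] = txn
        elif kind == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False          # read of uncommitted data
    return True


def rollback_set(schedule, failed):
    """Transactions that must be rolled back when `failed` fails: the failed
    transaction plus everything that directly or indirectly read from it."""
    doomed = {failed}
    changed = True
    while changed:                    # repeat until no new dependents appear
        changed = False
        last_writer = {}
        for txn, kind, item in schedule:
            if kind == "write":
                last_writer[item] = txn
            elif (kind == "read" and txn not in doomed
                  and last_writer.get(item) in doomed):
                doomed.add(txn)
                changed = True
    return doomed


# Partial schedule as described for Figure 14.15.
FIGURE_14_15 = [
    ("T8", "read", "A"), ("T8", "read", "B"), ("T8", "write", "A"),
    ("T9", "read", "A"), ("T9", "write", "A"),
    ("T10", "read", "A"),
]

if __name__ == "__main__":
    print(is_cascadeless(FIGURE_14_15))
    print(sorted(rollback_set(FIGURE_14_15, "T8")))
```

In a cascadeless schedule every such read sees only committed data, so the rollback set of a failed transaction never contains any other transaction.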

Implementation of Isolation:

There are various concurrency-control policies that we can use to ensure
that, even when multiple transactions are executed concurrently, only
acceptable schedules are generated. Some of these policies are:

Locking

One technique for the implementation of isolation is locking. With
locking, each transaction locks only those data items that it accesses.
Under such a policy, the transaction must hold locks long enough to
ensure serializability, but for a period short enough not to harm
performance excessively.

Timestamps

Another category of techniques for the implementation of isolation
assigns each transaction a timestamp, typically when it begins. For each
data item, the system keeps two timestamps. The read timestamp of a data
item holds the largest (that is, the most recent) timestamp of those
transactions that read the data item.

The write timestamp of a data item holds the timestamp of the transaction
that wrote the current value of the data item. Timestamps are used to
ensure that transactions access each data item in order of the transactions’
timestamps if their accesses conflict. When this is not possible, offending
transactions are aborted and restarted with a new timestamp.
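
The text does not spell out the exact rules, so the sketch below uses the basic timestamp-ordering rules as they are commonly stated: a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it. The class and method names are assumptions, and transaction timestamps are assumed to be positive integers.

```python
class Abort(Exception):
    """Raised when a transaction must be rolled back and restarted."""


class TimestampOrdering:
    def __init__(self, data):
        self.data = dict(data)
        self.read_ts = {q: 0 for q in data}    # largest timestamp that read q
        self.write_ts = {q: 0 for q in data}   # timestamp that wrote current q

    def read(self, ts, q):
        if ts < self.write_ts[q]:
            # The value this transaction should have seen was overwritten.
            raise Abort(f"read({q}) by timestamp {ts} arrives too late")
        self.read_ts[q] = max(self.read_ts[q], ts)
        return self.data[q]

    def write(self, ts, q, value):
        if ts < self.read_ts[q] or ts < self.write_ts[q]:
            # A younger transaction already read or wrote q.
            raise Abort(f"write({q}) by timestamp {ts} arrives too late")
        self.data[q] = value
        self.write_ts[q] = ts


if __name__ == "__main__":
    db = TimestampOrdering({"A": 1000})
    print(db.read(2, "A"))        # the transaction with timestamp 2 reads A
    try:
        db.write(1, "A", 950)     # an older transaction tries to write A
    except Abort as exc:
        print("aborted:", exc)
```

An aborted transaction would be restarted with a new, larger timestamp, as the text describes.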

Multiple Versions and Snapshot Isolation:

By maintaining more than one version of a data item, it is possible to
allow a transaction to read an old version of a data item rather than a
newer version written by an uncommitted transaction or by a transaction
that should come later in the serialization order. There are a variety of
multiversion concurrency control techniques. One is called snapshot
isolation.

Snapshot isolation:

In snapshot isolation, each transaction is given its own version, or
snapshot, of the database when it begins. It reads data from this private
version and is thus isolated from the updates made by other transactions.

If the transaction updates the database, that update appears only in its
own version, not in the actual database itself. Information about these
updates is saved so that the updates can be applied to the “real” database
if the transaction commits.

When a transaction T enters the partially committed state, it then
proceeds to the committed state only if no other concurrent transaction
has modified data that T intends to update. Transactions that, as a result,
cannot commit abort instead.

Snapshot isolation ensures that attempts to read data never need to wait
(unlike locking). Read-only transactions cannot be aborted; only those
that modify data can be aborted. Since each transaction reads its own
version or snapshot of the database, reading data does not cause
subsequent update attempts by other transactions to wait (unlike locking).

The problem with snapshot isolation is that it provides too much
isolation.

Consider two transactions T and T'. In a serializable execution, either T
sees all the updates made by T' or T' sees all the updates made by T,
because one must follow the other in the serialization order. Under
snapshot isolation, there are cases where neither transaction sees the
updates of the other. This is a situation that cannot occur in a
serializable execution.

In many cases, the data accesses by the two transactions do not conflict
and there is no problem.

If T reads some data item that T' updates and T' reads some data item that
T updates, it is possible that both transactions fail to read the update
made by the other. The result may be an inconsistent database state that,
of course, could not be obtained in any serializable execution.
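
This anomaly can be demonstrated with a toy model of snapshot isolation. The sketch is an assumption-laden illustration, not a description of any real system: the class names are invented, snapshots are full copies, and the commit check simply compares per-item version counters.

```python
class SnapshotDB:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {k: 0 for k in data}   # bumped on every committed write

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db.data)          # private copy taken at start
        self.start_version = dict(db.version)
        self.writes = {}                       # updates kept private until commit

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if a concurrent transaction already committed a change to
        # any item this transaction intends to update.
        for key in self.writes:
            if self.db.version[key] != self.start_version[key]:
                return False
        for key, value in self.writes.items():
            self.db.data[key] = value
            self.db.version[key] += 1
        return True


def write_skew_demo():
    db = SnapshotDB({"x": 1, "y": 2})
    t, t_prime = db.begin(), db.begin()
    t.write("x", t.read("y"))                  # T reads y, updates x
    t_prime.write("y", t_prime.read("x"))      # T' reads x, updates y
    assert t.commit() and t_prime.commit()     # disjoint write sets: both commit
    return db.data


if __name__ == "__main__":
    print(write_skew_demo())
```

Running T then T' serially would end with x = 2 and y = 2, and T' then T would end with x = 1 and y = 1. Under this model both commit and the values are swapped, a state that neither serial order can produce.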
