Ad Database Transaction Concept
Outline
Introduction to Transaction Processing
Transaction and System Concepts
Desirable Properties of Transactions
Characterizing Schedules based on Recoverability
Characterizing Schedules based on Serializability
Transaction Support in SQL
Introduction
Single-user vs. multiuser systems
One criterion for classifying a database system is the number of users who can use the system at the same time.
Single-User System:
A DBMS is single-user if at most one user at a time can use the system.
Multiuser System:
Many users can access the system concurrently.
Concurrency
Interleaved processing:
Concurrent execution of processes is interleaved on a single CPU.
A Transaction:
A logical unit of database processing that includes one or more access operations (read/retrieval, write/insert or update, and delete).
A transaction is a mechanism for applying the desired modifications/operations to a database. After a transaction successfully manipulates the content of the database, the resulting database instance is the most up-to-date copy of the database.
An action, or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database (i.e. a logical unit of work on the database).
A transaction (set of operations) may be specified stand-alone in a high-level language such as SQL and submitted interactively, or may be embedded within a program.
Transaction boundaries:
One way of specifying transaction boundaries is to use explicit Begin and End transaction statements in an application program.
An application program may contain several transactions separated by the Begin and End transaction boundaries.
Simple Model of a Database (for purposes of discussing transactions):
A database is a collection of named data items.
read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X.
write_item(X): Writes the value of program variable X into the database item named X.
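Read and write operations:
The basic unit of data transfer from the disk to the computer main memory is one block.
In general, a data item (what is read or written) will be the field of some record in the database, although it may be a larger unit such as a record or even a whole block.
read_item(X) command includes the following steps:
Find the address of the disk block that contains item X.
Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
Copy item X from the buffer to the program variable named X.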
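write_item(X) command includes the following steps:
Find the address of the disk block that contains item X.
Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
Copy item X from the program variable named X into its correct location in the buffer.
Store the updated block from the buffer back to disk (either immediately or at some later point in time).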
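The read_item and write_item operations can be sketched in Python. This is only an illustration under stated assumptions: the dictionaries `disk`, `item_location` and `buffers`, the block name `block_7` and the `flush` parameter are hypothetical stand-ins for the disk, the block directory and the main memory buffers, not part of any real DBMS.

```python
# Hypothetical in-memory stand-ins for the disk and the buffer pool.
disk = {"block_7": {"X": 80, "Y": 20}}             # block address -> items in that block
item_location = {"X": "block_7", "Y": "block_7"}   # item name -> block address
buffers = {}                                       # blocks currently in main memory


def fetch_block(item):
    """Find the block that holds the item and copy it into a buffer if needed."""
    address = item_location[item]
    if address not in buffers:
        buffers[address] = dict(disk[address])
    return address


def read_item(item):
    """Return the buffered value of the item (the 'program variable')."""
    address = fetch_block(item)
    return buffers[address][item]


def write_item(item, value, flush=True):
    """Update the item in the buffer; optionally store the block back on disk."""
    address = fetch_block(item)
    buffers[address][item] = value
    if flush:  # assumption: write through immediately; a DBMS may defer this
        disk[address] = dict(buffers[address])


x = read_item("X")       # X is copied from the buffer into a program variable
write_item("X", x - 5)   # the new value goes to the buffer, then to disk
```

Deferring the final store (`flush=False`) leaves the disk copy stale until the block is written later, which is why recovery has to know which writes actually reached the disk.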
Two outcomes for any transaction:
Success - the transaction commits and the database reaches a new consistent state.
A committed transaction cannot be aborted or rolled back.
Failure - the transaction aborts, and the database must be restored to the consistent state it was in before the transaction started.
Such a transaction is rolled back or undone.
An aborted transaction that is rolled back can be restarted later.
Example of transactions: (a) Transaction T1, (b) Transaction T2 [figure not reproduced]
Transactions submitted by the various users may execute concurrently and may access and update the same database items.
If this concurrent execution is uncontrolled, it may lead to problems such as an inconsistent database.
Why Concurrency Control is needed:
Concurrency control is needed to prevent the following problems from damaging database consistency:
The Lost Update Problem
This occurs when two transactions that access the
same database items have their operations
interleaved in a way that makes the value of some
database item incorrect.
For example, if X = 80, N = 5 and M = 4, the final result should be X = 79; but in the interleaving of operations the result is X = 84, because the update in T1 that removed the five seats from X was lost.
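The arithmetic in this example can be replayed with a small simulation. It is a sketch that assumes T1 reads X, subtracts N and writes X, while T2 reads X, adds M and writes X; the `run` helper and the tuple encoding of the steps are illustrative choices, not part of the slides.

```python
def run(schedule, db):
    """Execute (transaction, operation) steps; each transaction keeps a private copy of X."""
    local = {}
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["X"]      # read_item(X)
        elif op == "write":
            db["X"] = local[txn]      # write_item(X)
        else:
            local[txn] += op          # op is a number added to the private copy
    return db["X"]


N, M = 5, 4

# Serial order: T1 finishes before T2 starts.
serial = [("T1", "read"), ("T1", -N), ("T1", "write"),
          ("T2", "read"), ("T2", +M), ("T2", "write")]

# Interleaved order: T2 reads X before T1 has written it back.
interleaved = [("T1", "read"), ("T1", -N),
               ("T2", "read"), ("T2", +M),
               ("T1", "write"), ("T2", "write")]

serial_result = run(serial, {"X": 80})             # 79
interleaved_result = run(interleaved, {"X": 80})   # 84, T1's update is lost
```

In the interleaved run T2 overwrites X with a value computed from the old X, so the subtraction made by T1 disappears.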
Lost Update problem: Example
Time T1 T2 bal(X)
t1 Begin Tx 100
t5 W(balx) Commit 90
t6 Commit 90
Lost update!!
This could have been avoided by preventing T1 from reading until T2's update has been completed.
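The Temporary Update (Dirty Read) Problem
This occurs when one transaction updates a database item and then the transaction fails for some reason. The updated item is accessed by another transaction before it is changed back to its original value.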
The temporary update problem: Example
Time  T3              T4              bal(X)
t1                    Begin Tx        100
t2                    R(balx)         100
t3                    balx=balx+100   100
t4    Begin Tx        W(balx)         200
t5    R(balx)                         200
t6    balx=balx-10    Rollback        100
t7    W(balx)                         190
t8    Commit                          190
Temporary update!!
Could have been avoided if we prevent T3 from reading until after
the decision to commit or rollback T4 has been made
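The schedule above can be traced step by step in Python. This is a minimal sketch: the variable names are illustrative, and the rollback is modeled simply by restoring the value T4 saw before its write.

```python
db = {"balx": 100}             # committed database state

# T4 starts first and deposits 100, remembering the old value for a rollback.
t4_old = db["balx"]            # t2: R(balx)
t4_local = t4_old + 100        # t3: balx = balx + 100
db["balx"] = t4_local          # t4: uncommitted write, balx is now 200

# T3 reads the uncommitted value (the dirty read) and withdraws 10.
t3_local = db["balx"]          # t5: reads 200
t3_local -= 10                 # t6: 190 in T3's workspace

db["balx"] = t4_old            # t6: T4 rolls back, balx is 100 again
balance_after_rollback = db["balx"]

db["balx"] = t3_local          # t7: T3 writes 190
final_balance = db["balx"]

expected_balance = 100 - 10    # what T3 alone should have produced
```

The final balance of 190 includes a deposit that was never committed; a correct result would have been 90.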
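The Incorrect Summary Problem
If one transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.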
Figure (c): The incorrect summary problem when concurrent execution is uncontrolled [figure not reproduced]
The incorrect summary problem: Example
Time  T5              T6           bal(X)  bal(Z)  Sum
t1                    Begin Tx     100     25      0
t2    Begin Tx        Sum=0        100     25      0
t3    R(balx)                      100     25      0
t4    balx=balx-10    R(balx)      100     25      0
t5    W(balx)         Sum+=balx    90      25      100
t6    R(balz)                      90      25      100
t7    balz=balz+10                 90      25      100
t8    W(balz)                      90      35      100
t9    Commit          R(balz)      90      35      100
t10                   Sum+=balz    90      35      135
t11                   W(sum)       90      35      135
t12                   Commit       90      35      135
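The same timeline can be replayed in a few lines of Python to show why 135 is wrong. This is an illustrative trace, with variable names chosen here; T5 moves 10 from X to Z while T6 sums both balances.

```python
db = {"balx": 100, "balz": 25}

total = 0                      # t2:  T6 sets Sum = 0
t5_x = db["balx"]              # t3:  T5 R(balx)
t5_x -= 10                     # t4:  T5 balx = balx - 10
t6_x = db["balx"]              # t4:  T6 R(balx), still sees 100
db["balx"] = t5_x              # t5:  T5 W(balx)
total += t6_x                  # t5:  T6 Sum += balx (uses the old 100)
t5_z = db["balz"]              # t6:  T5 R(balz)
t5_z += 10                     # t7:  T5 balz = balz + 10
db["balz"] = t5_z              # t8:  T5 W(balz)
t6_z = db["balz"]              # t9:  T6 R(balz), sees the new 35
total += t6_z                  # t10: T6 Sum += balz

consistent_total = db["balx"] + db["balz"]   # the transfer never changes the true total
```

T6 reads X before the transfer and Z after it, so it counts the transferred 10 twice: 135 instead of 125.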
What causes a Transaction to fail?
1. A computer failure (system crash):
A hardware or software error may occur in the
computer system during transaction execution. If
the hardware crashes, the contents of the
computer’s internal memory may be lost.
2. A transaction or system error:
Some operation in the transaction may cause it to
fail, such as integer overflow or division by zero.
Transaction failure may also occur because of
erroneous parameter values or because of a
logical programming error
3. Local errors or exception conditions detected by the
transaction:
Certain conditions necessitate cancellation of the
transaction
For example, data for the transaction may not
be found
A condition, such as insufficient account
balance in a banking database, may cause a
transaction, such as a fund withdrawal from
that account, to be canceled.
A programmed abort in the transaction causes it to
fail.
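4. Concurrency control enforcement:
The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock.
5. Disk failure:
Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash. This may happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes:
This refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator.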
Transaction and System Concepts
Transaction States and Additional Operations
The System Log
Commit Point of a Transaction
Transaction states:
Active state
Partially committed state
Committed state
Failed state
Terminated state
State transition diagram illustrating the
states for transaction execution
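Since the diagram itself is not reproduced here, the transitions can be sketched as a small lookup table. This assumes the usual textbook diagram (active, partially committed, committed, failed, terminated); the event names and the `advance` and `final_state` helpers are hypothetical labels chosen for the illustration.

```python
# (current state, event) -> next state; assumed from the common textbook diagram.
TRANSITIONS = {
    ("active", "end_transaction"): "partially_committed",
    ("active", "abort"): "failed",
    ("partially_committed", "commit"): "committed",
    ("partially_committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def advance(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state!r}") from None


def final_state(events, state="active"):
    """Apply a sequence of events, starting from the active state."""
    for event in events:
        state = advance(state, event)
    return state


committed_path = final_state(["end_transaction", "commit", "terminate"])
failed_path = final_state(["end_transaction", "abort", "terminate"])
```

The table has no entry that leaves the committed state for the failed state, which mirrors the rule that a committed transaction cannot be aborted.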
Transaction operations
A transaction is an atomic unit of work that is either
completed in its entirety or not done at all. For recovery
purposes, the system needs to keep track of when the
transaction starts, terminates, and commits or aborts.
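Hence, the recovery manager keeps track of the following operations:
begin_transaction: This marks the beginning of transaction execution.
read or write: These specify read or write operations on the database items that are executed as part of a transaction.
end_transaction: This specifies that the read and write operations have ended and marks the end of transaction execution.
commit_transaction: This signals a successful end of the transaction, so that any changes executed by the transaction can be safely committed to the database and will not be undone.
rollback (or abort): This signals that the transaction has ended unsuccessfully, so that any changes the transaction may have applied to the database must be undone.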
The System Log
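Types of log records:
[start_transaction,T]: Records that transaction T has started execution.
[write_item,T,X,old_value,new_value]: Records that transaction T has changed the value of database item X from old_value to new_value.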
[read_item,T,X]: Records that transaction T has
read the value of database item X.
[commit,T]: Records that transaction T has
completed successfully, and affirms that its effect
can be committed (recorded permanently) to the
database.
[abort,T]: Records that transaction T has been
aborted.
Recovery using log records:
If the system crashes, we can recover to a consistent
database state by examining the log record and using
recovery methods.
1. Because the log contains a record of every write
operation that changes the value of some database
item, it is possible to undo the effect of these write
operations of a transaction T by tracing backward
through the log and resetting all items changed by a
write operation of T to their old_values.
2. We can also redo the effect of the write operations of
a transaction T by tracing forward through the log and
setting all items changed by a write operation of T
(that did not get done permanently) to their
new_values.
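The undo and redo passes can be sketched over a log held as a Python list. This is a simplified illustration: the tuples mirror the [write_item,T,X,old_value,new_value] record format, while the record name `start_transaction`, the sample log and the crash state in `db` are assumptions made for the example.

```python
def undo(log, txn, db):
    """Trace backward through the log, resetting items written by txn to old_value."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, old_value, _ = record
            db[item] = old_value


def redo(log, txn, db):
    """Trace forward through the log, setting items written by txn to new_value."""
    for record in log:
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, _, new_value = record
            db[item] = new_value


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 80, 75),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 20, 24),
    ("write_item", "T1", "X", 75, 70),
    ("commit", "T2"),
]

# Assumed state found on disk after a crash: T1's writes reached the disk,
# while the write of the committed transaction T2 did not.
db = {"X": 70, "Y": 20}

undo(log, "T1", db)   # T1 never committed: X goes back 70 -> 75 -> 80
redo(log, "T2", db)   # T2 committed: Y is set to 24
```

Tracing backward matters for undo: with two writes to X by T1, processing the newest record first ensures the oldest old_value (80) is the one that remains.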
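Commit Point of a Transaction:
Definition of a Commit Point:
A transaction T reaches its commit point when all its operations that access the database have been executed successfully and the effect of all the transaction operations on the database has been recorded in the log. Beyond the commit point, the transaction is said to be committed, and its effect is assumed to be permanently recorded in the database. The transaction then writes an entry [commit,T] into the log.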
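Undoing transactions
If a system failure occurs, we search back in the log for all transactions T that have written a [start_transaction,T] entry into the log but have not yet written a [commit,T] entry; these transactions may have to be rolled back to undo their effect on the database during the recovery process.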
Desirable Properties of Transactions
Transactions should possess several properties. They are often called the ACID properties and should be enforced by the concurrency control and recovery methods of the DBMS.
ACID properties:
Atomicity: A transaction is an atomic unit of processing; it is
either performed in its entirety or not performed at all.
Consistency preservation: A correct execution of the
transaction must take the database from one consistent
state to another.
Isolation: A transaction should not make its updates visible
to other transactions until it is committed; this property, when
enforced strictly, solves the temporary update problem and
makes cascading rollbacks of transactions unnecessary
Durability or permanency: Once a transaction changes the
database and the changes are committed, these changes
must never be lost because of subsequent failure.
Schedules and Recoverability
Schedules of Transactions
Characterizing Schedules Based on Recoverability
A shorthand notation for describing a schedule uses the symbols:
r: read_item
w: write_item
c: commit
a: abort
Transaction numbers are appended as a subscript to each operation in the schedule.
The database item X that is read or written follows the r and w operations in parentheses.
Example:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1
Conflicting operations
Two operations in a schedule are said to conflict if they satisfy all three of the following conditions:
They belong to different transactions.
They access the same item X.
At least one of the operations is a write_item(X).
Nonconflicting operations
The operations r1(X) and r2(X) do not conflict since they are both read operations.
r1(X) and w1(X) do not conflict because they belong to the same transaction.
w2(X) and w1(Y) do not conflict since they operate on distinct data items X and Y.
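The conflict test can be written as a one-line predicate. This is a small sketch that assumes each operation is encoded as a tuple (action, transaction number, item); the encoding and the function name `conflicts` are illustrative.

```python
def conflicts(op1, op2):
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    action1, txn1, item1 = op1
    action2, txn2, item2 = op2
    return txn1 != txn2 and item1 == item2 and "w" in (action1, action2)


read_read = conflicts(("r", 1, "X"), ("r", 2, "X"))        # both are reads
same_txn = conflicts(("r", 1, "X"), ("w", 1, "X"))         # same transaction
different_item = conflicts(("w", 2, "X"), ("w", 1, "Y"))   # distinct items
read_write = conflicts(("r", 1, "X"), ("w", 2, "X"))       # a genuine conflict
```

Only the last pair conflicts; the first three correspond to the three nonconflicting cases listed above.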
Complete schedules
A schedule S of n transactions T1, T2, ..., Tn is said to be a complete schedule if the following conditions hold:
1. The operations in S are exactly those operations in T1, T2, ..., Tn, including a commit or abort operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their order of appearance in S is the same as their order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule (theoretically, it is not necessary to determine an order between pairs of nonconflicting operations).
Condition (3) above allows two nonconflicting operations to occur in the schedule without defining which occurs first, which leads to the definition of a partial order of the operations in the n transactions.
In general, it is difficult to encounter complete
schedules in a transaction processing system,
because new transactions are continually being
submitted to the system
Hence, it is useful to define committed projection
C(S) of a schedule S, which includes only the
operations in S that belong to committed
transactions – that is transactions Ti whose
commit operation ci is in S
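The committed projection C(S) can be computed with a simple filter. This is a sketch that assumes operations are encoded as (action, transaction number, item) tuples, with None as the item for commit and abort; the sample schedule is hypothetical.

```python
def committed_projection(schedule):
    """Keep only the operations of transactions whose commit appears in the schedule."""
    committed = {txn for action, txn, _ in schedule if action == "c"}
    return [op for op in schedule if op[1] in committed]


# Hypothetical schedule: T2 commits, T1 aborts, T3 is still running.
S = [("r", 1, "X"), ("w", 1, "X"), ("r", 2, "X"), ("w", 2, "X"),
     ("r", 3, "Y"), ("c", 2, None), ("a", 1, None)]

projection = committed_projection(S)
```

Here only the three operations of T2 survive, in their original order.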
Characterizing Schedules based on
Recoverability
Schedules classified based on recoverability:
Recoverable schedule:
Once a transaction T is committed, it should never be necessary to roll back T.
The schedules that theoretically meet this criterion are called recoverable, and those that do not are called nonrecoverable.
A schedule S is recoverable if no transaction T in S commits until all transactions T' that have written an item that T reads have committed.
In other words, a schedule is recoverable if each transaction commits only after each transaction from which it has read has committed.
A transaction T2 reads from transaction T1 in a schedule S if some item X is first written by T1 and later read by T2.
In addition, T1 should not have been aborted before T2 reads X.
Consider the schedule Sa', in which two commit operations have been added to Sa:
Sa': r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1
Sa' is recoverable, even though it suffers from the lost update problem.
However, consider the two partial schedules Sc and Sd:
Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1
Sc is not recoverable because T2 reads X from T1 and then T2 commits before T1 commits.
If T1 aborts after the c2 operation in Sc, then the value of X that T2 read is no longer valid and T2 must be aborted after it has committed, which makes the schedule nonrecoverable.
For the above schedule to be recoverable, the c2 operation in Sc must be postponed until after T1 commits, as shown in Sd:
Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2 (recoverable)
If T1 aborts instead of committing, then T2 should also abort, as shown in Se, because the value of X it read is no longer valid:
Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2 (recoverable)
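The recoverability condition can be checked mechanically. The sketch below is a simplified checker written for this illustration: it parses the shorthand notation, records which transaction each read takes its value from, and rejects a schedule as soon as a transaction commits before a transaction it has read from. The helper names `parse` and `is_recoverable` are hypothetical.

```python
import re


def parse(schedule):
    """Turn 'r1(X); w1(X); c1' into [('r', 1, 'X'), ('w', 1, 'X'), ('c', 1, None)]."""
    ops = []
    for action, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((action, int(txn), item.upper() or None))
    return ops


def is_recoverable(schedule):
    writers = {}       # item -> transactions with live writes, oldest first
    reads_from = {}    # transaction -> set of transactions it has read from
    committed = set()
    for action, txn, item in parse(schedule):
        if action == "w":
            writers.setdefault(item, []).append(txn)
        elif action == "r":
            live = writers.get(item, [])
            if live and live[-1] != txn:
                reads_from.setdefault(txn, set()).add(live[-1])
        elif action == "c":
            # Every transaction this one has read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
        elif action == "a":
            # An aborted transaction's writes are undone and can no longer be read.
            for live in writers.values():
                live[:] = [t for t in live if t != txn]
    return True


results = {
    "Sa'": is_recoverable("r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1"),
    "Sc": is_recoverable("r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1"),
    "Sd": is_recoverable("r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2"),
    "Se": is_recoverable("r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2"),
}
```

Only Sc fails: T2 reads X from T1 and commits while T1 is still uncommitted.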
Characterizing Schedules based on
Serializability
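Serial schedule:
A schedule S is serial if, for every transaction T participating in the schedule, all the operations of T are executed consecutively in the schedule. Otherwise, the schedule is called a nonserial schedule.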
Non-serial schedules are schedules in which operations from a set of concurrent transactions are interleaved. The objective of serializability is to find non-serial schedules that allow transactions to execute concurrently without interfering with one another. In other words, we want to find non-serial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.
Thus, Serializable schedule:
A schedule S is serializable if it is equivalent to some serial schedule of the same n transactions.
Serialization
Objective of serialization is to find schedules that allow transactions
to execute concurrently without interfering with one another.
If two transactions only read data, order is not important.
If two transactions either read or write completely separate data
items, they do not conflict and order is not important.
If one transaction writes a data item and another reads or writes the
same data item, order of execution is important
Possible solution: Run all transactions serially.
This is often too restrictive as it limits degree of concurrency or
parallelism in system.
When are two schedules considered equivalent?
Result equivalent:
Two schedules are called result equivalent if they produce the same final state of the database.
Being serializable is not the same as being serial
Being serializable implies that the schedule is a
correct schedule
It will leave the database in a consistent state.
The interleaving is appropriate and will result in a
state as if the transactions were serially executed,
yet will achieve efficiency due to concurrent
execution.
It’s not possible to determine when a schedule
begins and when it ends.
Hence, we reduce the problem of checking the whole schedule to checking only a committed projection of the schedule (i.e. operations from only the committed transactions).
Current approach used in most DBMSs:
Use of locks with two phase locking
Determining conflict serializability
To determine serializability, first identify the pairs of conflicting operations, then check whether their order is preserved in one of the possible serial schedules.
Schedule A: r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X) (serial schedule)
Schedule B: r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y) (serial schedule)
Schedule C: r1(X); r2(X); w1(X); w2(X); w1(Y) (not serializable)
Schedule D: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y) (serializable, equivalent to schedule A)
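Testing for conflict serializability with precedence graphs: Algorithm
1. For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an edge (Ti → Tj).
3. For each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an edge (Ti → Tj).
4. For each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an edge (Ti → Tj).
5. The schedule S is serializable if and only if the precedence graph has no cycles.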
Testing serializability with Precedence Graphs
Schedule A: r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X)
Schedule B: r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y)
Schedule C: r1(X); r2(X); w1(X); w2(X); w1(Y) (not serializable)
Schedule D: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y) (serializable, equivalent to schedule A)
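The precedence graph test for schedules A to D can be sketched in Python. This is an illustrative implementation rather than a prescribed one: it adds an edge Ti → Tj for every pair of conflicting operations in which Ti's operation comes first, then looks for a cycle by repeatedly removing nodes that have no incoming edge. The function names are hypothetical.

```python
import re


def parse(schedule):
    """Turn 'r1(X); w2(X)' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(action, int(txn), item.upper())
            for action, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]


def precedence_edges(schedule):
    """Edge (Ti, Tj) for each conflicting pair where Ti's operation comes first."""
    ops = parse(schedule)
    edges = set()
    for i, (action1, txn1, item1) in enumerate(ops):
        for action2, txn2, item2 in ops[i + 1:]:
            if txn1 != txn2 and item1 == item2 and "w" in (action1, action2):
                edges.add((txn1, txn2))
    return edges


def is_conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    nodes = {txn for _, txn, _ in parse(schedule)}
    # Repeatedly remove nodes with no incoming edge; a cycle leaves nodes behind.
    while nodes:
        sources = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not sources:
            return False
        nodes -= sources
        edges = {(src, dst) for src, dst in edges if src not in sources}
    return True


schedules = {
    "A": "r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X)",
    "B": "r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y)",
    "C": "r1(X); r2(X); w1(X); w2(X); w1(Y)",
    "D": "r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y)",
}
serializable = {name: is_conflict_serializable(s) for name, s in schedules.items()}
```

Schedule C produces both T1 → T2 and T2 → T1, which is a cycle; schedule D produces only T1 → T2, the same edge as the serial schedule A.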
Transaction Support in SQL
A single SQL statement is always considered to
be atomic.
Either the statement completes execution without
error or it fails and leaves the database
unchanged.
With SQL, there may be no explicit Begin
Transaction statement.
Transaction initiation is done implicitly when
particular SQL statements are encountered.
Every transaction must have an explicit end
statement, which is either a COMMIT or
ROLLBACK.
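COMMIT and ROLLBACK can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the `account` table, the `transfer` function and the balances are hypothetical, and SQLite (unlike the implicit start described above) accepts an explicit BEGIN, which is used here so that the transaction boundaries are visible.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")


def transfer(conn, source, target, amount):
    """Move money between two accounts as one transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, source))
        (balance,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                                  (source,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient balance")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, target))
        conn.execute("COMMIT")      # both updates become permanent together
        return True
    except ValueError:
        conn.execute("ROLLBACK")    # the first update is undone
        return False


ok = transfer(conn, "A", "B", 30)        # commits: A = 70, B = 80
failed = transfer(conn, "A", "B", 500)   # rolls back: balances unchanged
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The second transfer debits account A before the check fails, yet the ROLLBACK leaves no trace of that partial update.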
Characteristics specified by a SET TRANSACTION statement in SQL:
Isolation level <isolation>, where <isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ or SERIALIZABLE. Some systems, such as SQL Server, also offer SNAPSHOT.
With SERIALIZABLE: the interleaved execution of transactions will adhere to our notion of serializability.
Potential problem with lower isolation levels:
Dirty Read:
Reading a value that was written by a transaction which failed.
Nonrepeatable Read:
Allowing another transaction to write a new value between
multiple reads of one transaction.
A transaction T1 may read a given value from a table. If
another transaction T2 later updates that value and T1 reads
that value again, T1 will see a different value.
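Consider that T1 reads the employee salary for Smith. Next, T2 updates the salary for Smith. If T1 reads Smith's salary again, it will see a different value for Smith's salary.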
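Phantoms:
New rows being read using the same read with a condition.
A transaction T1 may read a set of rows from a table, perhaps based on some condition specified in the SQL WHERE clause. If a transaction T2 then inserts a new row that also satisfies the WHERE clause condition of T1, and T1 is repeated, T1 will see a row that previously did not exist, called a phantom.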
The Read Committed Isolation Model is SQL Server’s
default behavior. In this model, the database does not
allow transactions to read data written to a table by an
uncommitted transaction. This model protects against
dirty reads, but provides no protection against phantom
reads or non-repeatable reads.
The Read Uncommitted Isolation Model offers
essentially no isolation between transactions. Any
transaction can read data written by an uncommitted
transaction. This leaves the transactions vulnerable to
dirty reads, phantom reads and non-repeatable reads.
The Repeatable Read Isolation Model goes a step further than the
Read Committed model by preventing transactions from writing data that
was read by another transaction until the reading transaction completes.
This isolation model protects against both dirty reads and non-repeatable reads.
The Serializable Isolation Model uses range locks to prevent
transactions from inserting or deleting rows in a range being read by
another transaction. The Serializable model protects against all three
concurrency problems.
The Snapshot Isolation Model also protects against all three
concurrency problems, but does so in a different manner. It provides
each transaction with a "snapshot" of the data it requests. The
transaction may then access that snapshot for all future references,
eliminating the need to return to the source table for potentially dirty data.
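Possible violation of serializability:
                    Type of Violation
Isolation level     Dirty read   Nonrepeatable read   Phantom
READ UNCOMMITTED    Yes          Yes                  Yes
READ COMMITTED      No           Yes                  Yes
REPEATABLE READ     No           No                   Yes
SERIALIZABLE        No           No                   No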