Chapter 1 - Transaction Processing and MGT
Lecture 1
Transaction Processing Concepts
Outline
Introduction to Transaction Processing
Transaction and System Concepts
Desirable Properties of Transactions
Characterizing Schedules based on Recoverability
Characterizing Schedules based on Serializability
Transaction Support in SQL
Introduction
Single user Vs multiuser systems
One criterion for classifying a database system is
according to the number of users who can use the
system at the same time
Single-User System:
A DBMS is single-user if at most one user at a time can use the
system.
Multiuser System:
Many users can access the system concurrently.
Concurrency
Interleaved processing:
Concurrent execution of processes is interleaved in a single
Introduction (cont…)
A Transaction:
Logical unit of database processing that includes one or more
access operations (read, retrieval, write, insert or update and
delete)
A transaction (set of operations) may be stand-alone, specified in
a high-level language like SQL and submitted interactively, or
may be embedded within a program.
Examples include ATM transactions, credit card approvals, flight
reservations, hotel check-in, phone calls, supermarket scanning,
academic registration and billing.
Transaction boundaries:
One way of specifying transaction boundaries is using explicit
Begin and End transaction statements in an application
program
An application program may contain several transactions
separated by the Begin and End transaction boundaries
Introduction (cont…)
Read and write operations:
Basic unit of data transfer from the disk to the computer
main memory is one block.
In general, a data item (what is read or written) will be
named X.
Introduction (cont…)
Read and Write Operations (cont.):
write_item(X) command includes the following steps:
Find the address of the disk block that contains item X.
Introduction (cont…)
Example of transactions
(a) Transaction T1
(b) Transaction T2
Introduction (cont…)
Transactions submitted by the various users may execute
concurrently and may access and update the same
database items
If this concurrent execution is uncontrolled, it may lead to
problems such as an inconsistent database
Why Concurrency Control is needed:
Concurrency control is needed to prevent the following
problems from damaging database consistency
The Lost Update Problem
This occurs when two transactions that access the same
database items have their operations interleaved in a
way that makes the value of some database item
incorrect since the update made by the first transaction
is not used by the second transaction.
In other words, the update made by the first transaction
is lost (overwritten) by the second transaction
Introduction (cont…)
The Temporary Update (Dirty Read) Problem
This occurs when one transaction updates a database
Concurrent execution is uncontrolled:
(a) The lost update problem
E.g. Account with balance A=100.
T1 reads the account A
T1 withdraws 10 from A
T1 makes the update in the Database
T2 reads the account A
T2 adds 100 on A
T2 makes the update in the Database
In the above case, if done one after the other (serially) then we have no problem.
If the execution is T1 followed by T2 then A=190
If the execution is T2 followed by T1 then A=190
But if they start at the same time and interleave in the following sequence:
Step                                           T1              T2
T1 reads the account, A=100                    Read_item(A)
T1 withdraws 10, making the balance A=90       A=A-10
T2 reads the account, A=100                                    Read_item(A)
T2 adds 100, making A=200                                      A=A+100
T1 makes the update in the database, A=90      Write_item(A)
T2 makes the update in the database, A=200                     Write_item(A)
After both transactions complete, the final value of A is 200, which
overwrites the update made by the first transaction that changed the value
from 100 to 90. Either serial order would have produced 190.
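The interleaving above can be replayed in a few lines of Python. This is only a minimal sketch: the `run` function and the way a schedule is encoded as `(transaction, operation)` pairs are assumptions made for illustration, not part of any DBMS.

```python
def run(schedule, balance):
    """Replay a schedule of (transaction, operation) steps on one account A."""
    db = {"A": balance}      # value stored in the database
    local = {}               # each transaction's private copy of A
    for txn, op in schedule:
        if op == "read":
            local[txn] = db["A"]       # Read_item(A)
        elif op == "write":
            db["A"] = local[txn]       # Write_item(A)
        else:
            local[txn] += op           # integer delta applied to the local copy
    return db["A"]

serial = [("T1", "read"), ("T1", -10), ("T1", "write"),
          ("T2", "read"), ("T2", 100), ("T2", "write")]
interleaved = [("T1", "read"), ("T1", -10), ("T2", "read"),
               ("T2", 100), ("T1", "write"), ("T2", "write")]

print(run(serial, 100))       # 190
print(run(interleaved, 100))  # 200, T1's withdrawal is lost
```

Because T2 reads A before T1 writes it, T2's final write is computed from the stale value 100 and silently replaces T1's result.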
Lost Update problem: solution
Lost update!!
This could have been avoided if we prevent T2 from reading
until T1’s update has been completed
Concurrent execution is uncontrolled:
(b) The temporary update problem.
Example: T2 adds 100 to A=100, making it 200, but then aborts before
it commits. T1 reads 200, subtracts 10 and writes 190. But the correct
balance is 90, because T2's update should never have taken effect.
T1                  T2
                    Read_item(A)
                    A=A+100
                    Write_item(A)
Read_item(A)
A=A-10
Write_item(A)
                    Abort
The temporary update problem: Example
Time   T1               T2               bal(X)
t1                      Begin Tx         100
t2                      R(balX)          100
t3                      balx=balx+100    100
t4     Begin Tx         W(balx)          200
t5     R(balX)                           200
t6     balx=balx-10     Rollback         100
t7     W(balx)                           190
t8     Commit                            190
Temporary update!!
Could have been avoided if we prevent T1 from reading until after
the decision to commit or rollback T2 has been made
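The same sequence of events can be traced step by step in Python. This is a sketch under simplifying assumptions: the database is a plain dictionary, and rollback is modelled by restoring a saved copy (`before_image`, a name chosen here for illustration).

```python
db = {"balx": 100}             # committed value of balX
before_image = dict(db)        # saved so that T2 can be rolled back

# T2 adds 100 and writes, but has not committed yet.
t2_local = db["balx"] + 100
db["balx"] = t2_local          # the uncommitted value 200 is now visible

# T1 reads the uncommitted (dirty) value and subtracts 10.
t1_local = db["balx"] - 10     # 190, based on data that will vanish

# T2 rolls back, so its write is undone.
db.update(before_image)        # balX is 100 again

# T1 writes and commits a value derived from the dirty read.
db["balx"] = t1_local
dirty_result = db["balx"]

# What T1 alone should have produced once T2's work is discarded.
correct_result = 100 - 10
print(dirty_result, correct_result)   # 190 90
```

Delaying T1's read until T2 has committed or rolled back would make T1 start from 100 and write 90.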
Concurrent execution is uncontrolled:
(c) The incorrect summary problem.
The incorrect summary problem: Example
Time   T5               T6            Bal(x)  Bal(z)  Sum
t1                      Begin Tx      100     25      0
t2     Begin Tx         Sum=0         100     25      0
t3     R(balX)                        100     25      0
t4     balx=balx-10     R(balX)       100     25      0
t5     W(balx)          Sum+=balx     100     25      100
t6     R(balZ)                        90      25      100
t7     balz=balz+10                   90      25      100
t8     W(balz)                        90      35      100
t9     Commit           R(balz)       90      35      100
t10                     Sum+=balz     90      35      135
t11                     W(sum)        90      35      135
t12                     commit        90      35      135
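The schedule in the table can be traced with a short Python sketch. The dictionary `db` and the local variable names are illustrative assumptions; each statement is annotated with the table operation it stands for.

```python
db = {"balx": 100, "balz": 25}
true_total = db["balx"] + db["balz"]   # 125; a transfer never changes this

total = 0                    # T6: Sum = 0
t5_x = db["balx"]            # T5: R(balX)
t5_x -= 10                   # T5: balx = balx - 10
t6_x = db["balx"]            # T6: R(balX), still sees 100
total += t6_x                # T6: Sum += balx
db["balx"] = t5_x            # T5: W(balx)
t5_z = db["balz"]            # T5: R(balZ)
t5_z += 10                   # T5: balz = balz + 10
db["balz"] = t5_z            # T5: W(balz), then commit
total += db["balz"]          # T6: R(balz); Sum += balz

print(total, true_total)     # 135 125
```

T6 counts the 10 units twice: once in the old value of balX and once in the new value of balZ, so its sum matches no consistent state of the database.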
The incorrect summary problem:
• Example 2: T1 wants to add the values A=10, B=20 and C=30. After
T1 has read B and before T1 completes, T2 updates the
value of B to 50. At the end of the execution of the two transactions,
T1 comes up with a sum of 60, while it should be 90 since B was
updated to 50
T1                  T2
Sum=0
Read_item(A)
Sum=Sum+A
Read_item(B)
Sum=Sum+B
                    Read_item(B)
                    B=50
Read_item(C)
Sum=Sum+C
What causes a Transaction to fail?
1. A computer failure (system crash):
A hardware or software error may occur in the
computer system during transaction execution. If
the hardware crashes, the contents of the
computer’s internal memory may be lost.
2. A transaction or system error:
Some operation in the transaction may cause it to
fail, such as integer overflow or division by zero.
Transaction failure may also occur because of
erroneous parameter values or because of a
logical programming error
What causes a Transaction to fail (Cont...)
3. Local errors or exception conditions detected by the
transaction:
Certain conditions necessitate cancellation of the
transaction
For example, data for the transaction may not
be found
A programmed abort in the transaction causes it to
fail.
4. Concurrency control enforcement:
The concurrency control method may decide to
abort the transaction, to be restarted later, because
it violates serializability or because several
transactions are in a state of deadlock
What causes a Transaction to fail (cont.):
5. Disk failure:
Some disk blocks may lose their data because of a
read or write malfunction or because of a disk
read/write head crash.
This may happen during a read or a write operation
of the transaction.
6. Physical problems and catastrophes:
This refers to an endless list of problems that
includes power or air-conditioning failure, fire, theft,
sabotage, overwriting disks or tapes by mistake,
and mounting of a wrong tape by the operator.
Transaction and System Concepts
Committed state
Failed state
Terminated State
State transition diagram illustrating the
states for transaction execution
Transaction and System Concepts (cont…)
Transaction operations
For recovery purposes, the system needs to keep track of when
the transaction starts, terminates, and commits or aborts
Recovery manager keeps track of the following operations:
begin_transaction: This marks the beginning of transaction
execution
read or write: These specify read or write operations on the
Transaction and System Concepts (cont…)
commit_transaction:
This signals a successful end of the transaction so that
Transaction and System Concepts (cont…)
The System Log
Transaction and System Concepts (cont…)
The System Log (cont):
started execution.
[write_item,T,X,old_value,new_value]: Records that
The System Log (cont):
[read_item,T,X]: Records that transaction T has
read the value of database item X.
[commit,T]: Records that transaction T has
completed successfully, and affirms that its effect
can be committed (recorded permanently) to the
database.
[abort,T]: Records that transaction T has been
aborted.
Recovery using log records:
If the system crashes, we can recover to a consistent
database state by examining the log record and using
recovery methods.
1. Because the log contains a record of every write
operation that changes the value of some database
item, it is possible to undo the effect of these write
operations of a transaction T by tracing backward
through the log and resetting all items changed by a
write operation of T to their old_values.
2. We can also redo the effect of the write operations of
a transaction T by tracing forward through the log and
setting all items changed by a write operation of T
(that did not get done permanently) to their
new_values.
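The undo and redo passes described above can be sketched in Python. The log below is a hypothetical example; the write records follow the `[write_item,T,X,old_value,new_value]` layout given earlier, while the `start_transaction` record name and the crash scenario are assumptions made for illustration.

```python
# Hypothetical log; write records are (write_item, T, X, old_value, new_value).
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 70),
    ("write_item", "T1", "Z", 5, 15),
    ("commit", "T1"),
]

def undo(db, log, txn):
    """Trace backward, restoring old_value for every write of txn."""
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] == txn:
            db[rec[2]] = rec[3]

def redo(db, log, txn):
    """Trace forward, reapplying new_value for every write of txn."""
    for rec in log:
        if rec[0] == "write_item" and rec[1] == txn:
            db[rec[2]] = rec[4]

# Assumed disk state after the crash: T2's write of Y reached disk,
# T1's write of Z did not.
db = {"X": 90, "Y": 70, "Z": 5}
committed = {rec[1] for rec in log if rec[0] == "commit"}
started = {rec[1] for rec in log if rec[0] == "start_transaction"}
for txn in sorted(started - committed):
    undo(db, log, txn)       # T2 never committed, so its effects are removed
for txn in sorted(committed):
    redo(db, log, txn)       # T1 committed, so its effects are guaranteed
print(db)                    # {'X': 90, 'Y': 50, 'Z': 15}
```

Real recovery managers add checkpoints and write-ahead rules; the sketch only shows why both old and new values are kept in each write record.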
Transaction and System Concepts (cont…)
Commit Point of a Transaction:
Definition a Commit Point:
Transaction and System Concepts (cont…)
Undoing transactions
If a system failure occurs, we search back in the log for
Desirable Properties of Transactions
Transactions should possess several properties. They are
often called the ACID properties and should be enforced by
the concurrency control and recovery methods of the DBMS.
ACID properties:
Atomicity: A transaction is an atomic unit of processing; it is
either performed in its entirety or not performed at all.
Consistency preservation: A correct execution of the
transaction must take the database from one consistent
state to another.
Isolation: A transaction should not make its updates visible
to other transactions until it is committed; this property, when
enforced strictly, solves the temporary update problem and
makes cascading rollbacks of transactions unnecessary
Durability or permanency: Once a transaction changes the
database and the changes are committed, these changes
must never be lost because of subsequent failure.
Example:
Suppose that Ti is a transaction that transfers 200 birr from account
CA2090 (which holds 5,000 birr) to SB2359 (which holds 3,500 birr) as follows:
Read(CA2090)
CA2090= CA2090-200
Write(CA2090)
Read(SB2359)
SB2359= SB2359+200
Write(SB2359)
Atomicity - either all or none of the above operations will be done. This is
ensured by the transaction management component of the DBMS
Consistency - the sum of CA2090 and SB2359 must be unchanged by the
execution of Ti, i.e. 8,500 birr. This is the responsibility of the application
programmer who codes the transaction
Isolation - when several transactions are processed concurrently
on a data item, they may create inconsistencies. Handling
such cases is the responsibility of the concurrency control
component of the DBMS
Durability - once Ti commits, its updates remain in place even when the
database is restarted after a failure. This is the responsibility of the recovery
management component of the DBMS
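The transfer can be tried out with Python's built-in `sqlite3` module. This is a sketch, not the lecture's own code: the `account` table, its columns and the overdraft check are assumptions introduced to show commit and rollback keeping the total at 8,500.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates persist or neither does."""
    try:
        cur = conn.cursor()
        cur.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, src))
        (bal,) = cur.execute("SELECT balance FROM account WHERE id = ?",
                             (src,)).fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        conn.commit()        # both updates become permanent together
        return True
    except Exception:
        conn.rollback()      # any partial update is undone
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("CA2090", 5000), ("SB2359", 3500)])
conn.commit()

ok = transfer(conn, "CA2090", "SB2359", 200)
failed = transfer(conn, "CA2090", "SB2359", 10000)   # would overdraw, rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
total = sum(balances.values())
print(ok, failed, balances, total)
```

After the failed attempt the debit that was already applied is rolled back, so the balances still reflect only the successful 200 birr transfer.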
Schedules
Schedule (or history) of transaction
When transactions are executing concurrently in an interleaved
Schedules (cont…)
A shorthand notation for describing a schedule uses the
symbols :
r : for read_item operations ,
w: write_item,
c: commit and
a: abort
Transaction numbers are appended as subscript to each
operation in the schedule
The database item X that is read or written follows the r
and w operations in parenthesis
Example:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;
Conflicting operations
Two operations in a schedule are said to conflict if they
satisfy all three of the following conditions:
They belong to different transactions
Non conflicting operations
The operations r1(x) and r2(x) do not conflict since both of
them are read operations
r1(x) and w1(x) do not conflict because they belong to the
same transaction
w2(x) and w1(y) do not conflict since they operate on
distinct data items x and y
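The conflict test can be written as a small predicate over operations in the shorthand notation. This sketch assumes the usual three conditions (different transactions, same data item, at least one write), which match the three non-conflicting examples above; `parse` and `conflict` are hypothetical helper names.

```python
import re

def parse(schedule):
    """Turn 'r1(X); w2(X); c1' into [('r', 1, 'x'), ('w', 2, 'x'), ('c', 1, None)]."""
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def conflict(a, b):
    """True if a and b are from different transactions, touch the same item,
    and at least one of them is a write."""
    return (a[1] != b[1]
            and a[2] is not None and a[2] == b[2]
            and "w" in (a[0], b[0]))

r1x, r2x, w1x, w2x, w1y = parse("r1(x); r2(x); w1(x); w2(x); w1(y)")
print(conflict(r1x, r2x))   # False: both are reads
print(conflict(r1x, w1x))   # False: same transaction
print(conflict(w2x, w1y))   # False: different items
print(conflict(r1x, w2x))   # True
```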
Complete schedules
A schedule S of n transactions T1, T2, ……..,Tn is
said to be a complete schedule if the following
conditions hold:
1. The operations in S are exactly those operations
in T1, T2, …Tn including a commit or abort
operations as the last operation for each
transaction in the schedule
2. For any pair of operations from the same
transaction Ti, their order of appearance in S is the
same as their order of appearance in Ti
3. For any two conflicting operations, one of the two
must occur before the other in the schedule
(theoretically, it is not necessary to determine an
order between pairs of non-conflicting operations)
Complete schedules (cont…)
Condition (3) above allows two non-conflicting
operations to occur in the schedule without defining
which occurs first, leading to the definition of a schedule
as a partial order of the operations in the n transactions
Complete schedules (cont…)
In general, it is difficult to encounter complete
schedules in a transaction processing system,
because new transactions are continually being
submitted to the system
Hence, it is useful to define committed projection
C(S) of a schedule S, which includes only the
operations in S that belong to committed
transactions – that is transactions Ti whose
commit operation ci is in S
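Computing a committed projection is a simple filter. The sketch below uses a hypothetical schedule in the shorthand notation; `parse` and `committed_projection` are names chosen for illustration.

```python
import re

def parse(schedule):
    """Turn 'r1(X); c1' into [('r', 1, 'x'), ('c', 1, None)]."""
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def committed_projection(ops):
    """Keep only operations of transactions whose commit appears in the schedule."""
    committed = {txn for kind, txn, _ in ops if kind == "c"}
    return [op for op in ops if op[1] in committed]

# Hypothetical schedule: T2 commits, T1 aborts, T3 is still running.
s = parse("r1(X); w1(X); r2(X); w2(X); c2; r3(Y); a1")
proj = committed_projection(s)
print(proj)   # [('r', 2, 'x'), ('w', 2, 'x'), ('c', 2, None)]
```

Only T2's operations survive, in their original order, so C(S) is well defined even while new transactions keep arriving.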
Characterizing Schedules based on Recoverability
Schedules classified based on recoverability:
Recoverable schedule:
to rollback T
The schedules that theoretically meet this criterion are called
recoverable and those that do not are non recoverable
A schedule S is recoverable if no transaction T in S commits until
all transactions T’ that have written an item that T reads have
committed
A transaction T2 reads from transaction T1 in a schedule S if
some item X is first written by T1 and later read by T2
In addition, T1 should not have been aborted before T2
problem
However, consider the two schedules Sc and Sd below:
Sc:r1(x);w1(x);r2(x);r1(y);w2(x);c2;a1
Sc is not recoverable because T2 reads X from T1 and
then T2 commits before T1 commits.
If T1 aborts after the c2 operation in Sc, then the
value of x that T2 read is no longer valid and T2 must
be aborted after it had been committed, leading to a
schedule that is not recoverable
Recoverability (cont…)
For the above schedule to be recoverable, the c2
operation in Sc must be postponed until after T1
commits as shown in Sd
Sd:r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);c1;c2
Recoverable
Recoverability (cont…)
If T1 aborts instead of committing, then T2 should also abort
as shown in Se because the X it read is no longer valid
Se:r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);a1;a2 Recoverable
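The recoverability condition can be checked mechanically. The sketch below is a simplified checker written for illustration: it tracks which transaction each read came from and rejects a commit whose source has not yet committed. It does not restore earlier values after an abort, which is enough for Sc, Sd and Se.

```python
import re

def parse(schedule):
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def is_recoverable(ops):
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions it read from
    finished = {}      # transaction -> 'c' or 'a'
    for kind, t, x in ops:
        if kind == "w":
            last_writer[x] = t
        elif kind == "r":
            w = last_writer.get(x)
            if w is not None and w != t and finished.get(w) != "a":
                reads_from.setdefault(t, set()).add(w)
        elif kind == "c":
            # T may commit only after every transaction it read from has committed.
            if any(finished.get(w) != "c" for w in reads_from.get(t, ())):
                return False
            finished[t] = "c"
        else:
            finished[t] = "a"
    return True

sc = parse("r1(x);w1(x);r2(x);r1(y);w2(x);c2;a1")
sd = parse("r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);c1;c2")
se = parse("r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);a1;a2")
print(is_recoverable(sc), is_recoverable(sd), is_recoverable(se))  # False True True
```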
Cascadeless schedule:
One where every transaction reads only the items that are written by
committed transactions. E.g.
Sf: r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X); c2;
Strict Schedules:
A schedule in which a transaction can neither read nor write an item X
until the last transaction that wrote X has committed or aborted.
E.g. Sg: w1(X,5); c1; w2(X,8);
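The difference between the two definitions shows up clearly in code: a cascadeless schedule restricts reads only, while a strict schedule restricts writes as well. The checker below is an illustrative sketch with hypothetical function names; the written values in Sg are omitted because they do not affect the test.

```python
import re

def parse(schedule):
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def _check(ops, restrict_writes):
    last_writer, finished = {}, set()
    for kind, t, x in ops:
        if kind in ("c", "a"):
            finished.add(t)
            continue
        w = last_writer.get(x)
        # The item was last written by another transaction that is still active.
        blocked = w is not None and w != t and w not in finished
        if blocked and (kind == "r" or restrict_writes):
            return False
        if kind == "w":
            last_writer[x] = t
    return True

def is_cascadeless(ops):
    return _check(ops, restrict_writes=False)

def is_strict(ops):
    return _check(ops, restrict_writes=True)

sg = parse("w1(x); c1; w2(x)")
blind = parse("w1(x); w2(x); c1; c2")   # hypothetical: no reads at all
dirty = parse("r1(x); w1(x); r2(x); c1; c2")   # hypothetical: T2 reads uncommitted x
print(is_strict(sg))                             # True
print(is_cascadeless(blind), is_strict(blind))   # True False
print(is_cascadeless(dirty))                     # False
```

The `blind` schedule is cascadeless because nothing is read, yet not strict because T2 overwrites x before T1 has finished.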
Characterizing Schedules based on Serializability
Serial schedule:
A schedule S is serial if, for every transaction T
participating in the schedule, all the operations of T
are executed consecutively in the schedule
Otherwise, the schedule is called non serial
schedule.
Serializable schedule:
A schedule S is serializable if it is equivalent to
some serial schedule of the same n transactions
Characterizing Schedules based on
Serializability (cont….)
Being serializable is not the same as being serial
Being serializable implies that the schedule is a
correct schedule
It will leave the database in a consistent state.
The interleaving is appropriate and will result in a
state as if the transactions were serially executed,
yet will achieve efficiency due to concurrent
execution.
Characterizing Schedules based on
Serializability (cont…)
It’s difficult to determine when a schedule begins
and when it ends.
Hence, we reduce the problem of checking the
whole schedule to checking only a committed
projection of the schedule (i.e. operations from
only the committed transactions.)
Current approach used in most DBMSs:
Use of locks with two phase locking
The concept of serializability of schedules is used to identify which
schedules are correct when the operations of concurrent transactions are
interleaved in the schedule
Characterizing Schedules based on
Serializability (cont….)
Result equivalent:
Two schedules are called result equivalent if they
Conflict equivalent:
Two schedules are said to be conflict equivalent if
cont..
Conflict serializable:
A schedule S is said to be conflict serializable if it is
Two schedules S and S’ are said to be view equivalent if the
following three conditions hold:
1. The same set of transactions participates in S and
S’, and S and S’ include the same operations of
those transactions.
2. If Ti reads a value of A written by Tj in S, it must also
read the value of A written by Tj in S’
3. For each data object A, the transaction that performs
the final write on A in S must also perform the final
write on A in S’
[Figure: two view-equivalent schedules S’ and S over T1: R(A), W(A); T2: W(A); T3: W(A)]
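The three conditions translate directly into a comparison of reads-from relations and final writers. The sketch below is illustrative: the function names are hypothetical, and the interleaved schedule is an assumed ordering of the operations of T1, T2 and T3.

```python
import re
from collections import Counter

def parse(schedule):
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def view_info(ops):
    """Return (reads-from triples, final writer of each item)."""
    last_writer, reads = {}, set()
    for kind, t, x in ops:
        if kind == "r":
            reads.add((t, x, last_writer.get(x)))   # None means the initial value
        elif kind == "w":
            last_writer[x] = t
    return reads, last_writer

def view_equivalent(s1, s2):
    same_ops = Counter(s1) == Counter(s2)       # condition 1
    return same_ops and view_info(s1) == view_info(s2)   # conditions 2 and 3

s = parse("r1(a); w2(a); w1(a); w3(a)")          # assumed interleaving
serial_123 = parse("r1(a); w1(a); w2(a); w3(a)")
serial_213 = parse("w2(a); r1(a); w1(a); w3(a)")
print(view_equivalent(s, serial_123))   # True
print(view_equivalent(s, serial_213))   # False: T1 would read T2's value
```

In both `s` and `serial_123`, T1 reads the initial value of A and T3 performs the final write, which is all that view equivalence inspects.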
Relationship between view and conflict equivalence:
The two are the same under the constrained write assumption,
which assumes that if T writes X, it is constrained by the
value of X it read; i.e., new X = f(old X)
Conflict serializability is stricter than view serializability.
With unconstrained write (or blind write), a schedule that is
view serializable is not necessarily conflict serializable.
Consider the following schedule of three transactions
T1: r1(X), w1(X); T2: w2(X); and T3: w3(X):
Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;
In Sa, the operations w2(X) and w3(X) are blind writes, since T2
and T3 do not read the value of X.
Sa is view serializable, since it is view equivalent to the
Determining conflict serializability
To determine serializability, first identify the pairs of
conflicting operations and check whether their order is preserved in
one of the possible serial schedules
schedule A:
r1(x); w1(x); r1(y); w1(y); r2(x); w2(x) - serial schedule
schedule B:
r2(x); w2(x); r1(x); w1(x); r1(y); w1(y) - serial schedule
schedule C:
r1(x); r2(x); w1(x); w2(x); w1(y) - (not serializable).
schedule D:
r1(x); w1(x); r2(x); w2(x); r1(y); w1(y) - (serializable, equivalent to
schedule A).
Serializability (cont…)
Testing for conflict serializability with precedence graphs:
Algorithm
For each transaction Ti participating in Schedule S, create a node
has no cycles.
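A precedence graph test can be sketched in Python as follows. Since the algorithm steps are only partly shown here, the edge rule used is the standard textbook one and should be read as an assumption: draw an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj. The schedule is conflict serializable if the resulting graph has no cycles.

```python
import re
from itertools import combinations

def parse(schedule):
    found = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    return [(kind, int(txn), item or None) for kind, txn, item in found]

def precedence_edges(ops):
    edges = set()
    for a, b in combinations(ops, 2):   # a occurs before b in the schedule
        if (a[1] != b[1] and a[2] is not None and a[2] == b[2]
                and "w" in (a[0], b[0])):
            edges.add((a[1], b[1]))
    return edges

def has_cycle(nodes, edges):
    """Repeatedly remove nodes with no incoming edge; leftovers lie on a cycle."""
    nodes, edges = set(nodes), set(edges)
    while True:
        free = {n for n in nodes if not any(v == n for _, v in edges)}
        if not free:
            return bool(nodes)
        nodes -= free
        edges = {(u, v) for u, v in edges if u not in free}

def conflict_serializable(schedule):
    ops = parse(schedule)
    return not has_cycle({t for _, t, _ in ops}, precedence_edges(ops))

a = "r1(x); w1(x); r1(y); w1(y); r2(x); w2(x)"
c = "r1(x); r2(x); w1(x); w2(x); w1(y)"
d = "r1(x); w1(x); r2(x); w2(x); r1(y); w1(y)"
print(conflict_serializable(a), conflict_serializable(c), conflict_serializable(d))
# True False True
```

For schedule C the graph contains both T1 -> T2 and T2 -> T1, so no serial order can preserve every conflicting pair.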
Testing serializability with Precedence Graphs
[Figure: precedence graphs for four schedules, labeled Serial, Serial, Not Serializable and Serializable]
Transaction Support in SQL
A single SQL statement is always considered to be atomic.
Either the statement completes execution without error or it fails and
leaves the database unchanged.
Every transaction has three characteristics: access mode, diagnostic size
and isolation level
i. Access mode:
READ ONLY or READ WRITE
If the access mode is READ ONLY, then INSERT, DELETE,
UPDATE and CREATE commands cannot be executed on the
database
The default is READ WRITE unless the isolation level of
READ UNCOMMITTED is specified, in which case READ
ONLY is assumed.
ii. Diagnostic size n, specifies an integer value n, indicating the
number of error conditions that can be held simultaneously in the
diagnostic area.
iii. Isolation level can be
READ UNCOMMITTED,
READ COMMITTED,
REPEATABLE READ or
SERIALIZABLE. The default is SERIALIZABLE.
With SQL, there is no explicit Begin Transaction
statement.
Transaction initiation is done implicitly when
particular SQL statements are encountered.
Every transaction must have an explicit end
statement, which is either a COMMIT or
ROLLBACK.
Sample SQL transaction:
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
READ WRITE
DIAGNOSTICS SIZE 5
ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
VALUES ('Robert','Smith','991004321',2,35000);
EXEC SQL UPDATE EMPLOYEE
SET SALARY = SALARY * 1.1
WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...
Potential problems with lower isolation levels: four types
iii. Overwriting Uncommitted Data: WW Conflicts
• A transaction T2 could overwrite the value of an object A,
which has already been modified by a transaction T1,
while T1 is still in progress.
T1: W(A), W(B), C
T2: W(A), W(B), C
iv. Phantoms:
New rows appear when the same conditional read is repeated.
A transaction T1 may read a set of rows from a table,
perhaps based on some condition specified in the SQL
WHERE clause.
Now suppose that a transaction T2 inserts a new row that
also satisfies the WHERE clause condition of T1, into the
table used by T1.
If T1 is repeated, then T1 will see a row that previously did
not exist, called a phantom.
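A phantom can be imitated with an in-memory table. The rows, names and the `select_dno` helper below are hypothetical; the helper stands in for a SELECT with a WHERE clause that T1 runs twice.

```python
# Hypothetical table contents.
employees = [{"name": "Ann", "dno": 2}, {"name": "Bob", "dno": 3}]

def select_dno(rows, dno):
    """Stand-in for SELECT name FROM employees WHERE dno = ?"""
    return [r["name"] for r in rows if r["dno"] == dno]

first_read = select_dno(employees, 2)          # T1 reads the matching rows
employees.append({"name": "Cal", "dno": 2})    # T2 inserts a matching row
second_read = select_dno(employees, 2)         # T1 repeats the same read

phantoms = [n for n in second_read if n not in first_read]
print(first_read, second_read, phantoms)   # ['Ann'] ['Ann', 'Cal'] ['Cal']
```

No row that T1 had already read was changed, which is why locking only the rows read so far cannot prevent phantoms.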
Transaction Support in SQL
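Possible violations of serializability:
                      Type of Violation
Isolation level       Dirty read   Nonrepeatable read   Phantom
READ UNCOMMITTED      yes          yes                  yes
READ COMMITTED        no           yes                  yes
REPEATABLE READ       no           no                   yes
SERIALIZABLE          no           no                   no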
Summary
Transaction and System Concepts
Desirable Properties of Transactions
Characterizing Schedules based on Recoverability
Characterizing Schedules based on Serializability
Transaction Support in SQL