Chapter 3 Transaction Processing Concepts
06/01/2023 1
Outline
Introduction to Transaction Processing
Transaction and System Concepts
Desirable Properties of Transactions
Characterizing Schedules based on Recoverability
Characterizing Schedules based on Serializability
Transaction Support in SQL
Introduction to Transaction Processing
A Transaction:
Logical unit of database processing that includes one or more access operations.
Read operations (database retrieval, such as SQL SELECT)
Write operations (modify database, such as SQL INSERT, UPDATE, DELETE)
Example: Transfer of $100 from a checking account to a
savings account in a BANK database
Note: Each execution of a program is a distinct transaction with different
parameters
Bank transfer program parameters: savings account number, checking account
number, transfer amount
Transaction boundaries: Begin and End transaction
Note: An application program may contain several transactions separated by
Begin and End transaction boundaries
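The transfer example can be sketched as one transaction with explicit boundaries. This is a minimal illustration using Python's built-in sqlite3 module; the account table, its columns, and the transfer function are hypothetical names chosen for the sketch, not part of the BANK database described above.

```python
import sqlite3

def transfer(conn, from_acct, to_acct, amount):
    """Move `amount` between two accounts as a single transaction (hypothetical schema)."""
    try:
        cur = conn.cursor()
        # Read operation: fetch the source balance.
        cur.execute("SELECT balance FROM account WHERE acct_no = ?", (from_acct,))
        row = cur.fetchone()
        if row is None or row[0] < amount:
            raise ValueError("unknown source account or insufficient funds")
        # Write operations: both updates belong to the same transaction.
        cur.execute("UPDATE account SET balance = balance - ? WHERE acct_no = ?",
                    (amount, from_acct))
        cur.execute("UPDATE account SET balance = balance + ? WHERE acct_no = ?",
                    (amount, to_acct))
        if cur.rowcount != 1:
            raise ValueError("unknown destination account")
        conn.commit()    # end of transaction: make both writes permanent
        return True
    except Exception:
        conn.rollback()  # end of transaction: undo any partial work
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acct_no TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("checking", 500), ("savings", 200)])
conn.commit()

ok = transfer(conn, "checking", "savings", 100)
balances = dict(conn.execute("SELECT acct_no, balance FROM account").fetchall())
```

Each call to transfer is a distinct transaction with its own parameters (the two account numbers and the amount), which mirrors the note about program executions.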
Transaction processing systems: Large multi-user database systems supporting thousands of
concurrent transactions (user processes) per minute
Two Modes of Concurrency
Interleaved processing: concurrent execution of processes is interleaved on a single CPU.
Parallel processing: processes are concurrently executed on multiple CPUs.
Basic transaction processing theory assumes interleaving
Transaction and System Concepts
A transaction is an atomic unit of work that is either completed in its
entirety or not done at all.
Transaction passes through several states:
Active state (executing read, write operations)
Partially committed state (ended but waiting for system checks to determine
success or failure)
Committed state (transaction succeeded)
Failed state (transaction failed, must be rolled back)
Terminated state (transaction leaves the system)
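The states listed above can be modelled as a small state machine. The sketch below is illustrative only: the State enum, the TRANSITIONS table, and the advance function are hypothetical names, and the allowed transitions follow the usual textbook state diagram (an assumption, since the list above names the states but not the arrows between them).

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    TERMINATED = "terminated"

# Assumed transitions, following the usual textbook state diagram.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def advance(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

# A successful transaction: active -> partially committed -> committed -> terminated.
state = State.ACTIVE
for nxt in (State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED):
    state = advance(state, nxt)
```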
For recovery purposes, the system needs to keep track of when the
transaction starts, terminates, and commits or aborts.
Recovery manager keeps track of the following operations:
begin_transaction: start of transaction execution.
read or write: read or write operations on the database items that are
executed as part of a transaction.
end_transaction: specifies that read and write transaction operations
have ended.
System may still have to check whether the changes (writes)
introduced by transaction can be permanently applied to the database
(commit transaction); or whether the transaction has to be rolled
back (abort transaction) because it violates concurrency control or
for some other reason.
commit_transaction: signals successful end of the transaction
so that any changes (updates) executed by the transaction can be
safely committed to the database and will not be undone.
The System Log File:
It keeps track of all transaction operations that affect the values of
database items in the order in which they occurred.
This information may be needed to permit recovery from
transaction failures.
The log is kept on disk, so it is not affected by any type of failure
except for disk or catastrophic failure.
Log is periodically backed up to archival storage (tape) to guard
against such catastrophic failures.
Types of records (entries) in the log file:
[start_transaction,T]: Records that transaction T has started
execution.
[write_item,T,X,old_value,new_value]: T has changed the value
of database item X from old_value to new_value.
[read_item,T,X]: T has read the value of database item X.
[commit,T]: T has completed successfully, and affirms that its
effect can be committed (recorded permanently) to the database.
[abort,T]: T has been aborted.
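As a rough illustration of these entry types, the log can be pictured as an append-only list of records. The LogRecord type and the in-memory list are assumptions made for this sketch; a real system log is kept on disk.

```python
from typing import Any, NamedTuple, Optional

class LogRecord(NamedTuple):
    kind: str                      # "start_transaction", "write_item", "read_item", "commit", "abort"
    txn: str                       # transaction identifier, e.g. "T1"
    item: Optional[str] = None     # database item, for read_item / write_item
    old_value: Any = None          # value before the write (write_item only)
    new_value: Any = None          # value after the write (write_item only)

log = []  # append-only, in the order in which operations occurred

# T1 reads X, changes it from 5 to 10, and commits.
log.append(LogRecord("start_transaction", "T1"))
log.append(LogRecord("read_item", "T1", "X"))
log.append(LogRecord("write_item", "T1", "X", 5, 10))
log.append(LogRecord("commit", "T1"))

kinds = [rec.kind for rec in log]
```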
Recovery using log records:
If the system crashes, we can recover to a consistent database state by
examining the log and using one of the techniques of DB recovery.
Because the log contains a record of every write operation that
changes the value of some database item, it is possible to undo the
effect of these write operations of a transaction T by tracing
backward through the log and resetting all items changed by a write
operation of T to their old_values.
We can also redo the effect of the write operations of a transaction
T by tracing forward through the log and setting all items changed by
a write operation of T (that did not get done permanently) to their
new_values.
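The undo and redo procedures described above can be sketched as two scans over the log. This is a simplified illustration: the log is an in-memory list of tuples in the [write_item,T,X,old_value,new_value] layout, the database is a plain dict, and the function names undo and redo are chosen for this sketch.

```python
def undo(log, db, txn):
    """Trace backward through the log, resetting items written by txn to their old values."""
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] == txn:
            _, _, item, old_value, _ = rec
            db[item] = old_value

def redo(log, db, txn):
    """Trace forward through the log, setting items written by txn to their new values."""
    for rec in log:
        if rec[0] == "write_item" and rec[1] == txn:
            _, _, item, _, new_value = rec
            db[item] = new_value

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 5, 10),
    ("write_item", "T1", "X", 10, 20),
    ("write_item", "T1", "Y", 7, 8),
]

db = {"X": 20, "Y": 8}
undo(log, db, "T1")
after_undo = dict(db)

redo(log, db, "T1")
after_redo = dict(db)
```

Scanning backward matters for undo: X was written twice, and only the backward order restores the original value 5 rather than the intermediate value 10.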
Commit Point of a Transaction
Definition:
A transaction T reaches its commit point when all its operations that
access the database have been executed successfully and the effect of all
the transaction operations on the database has been recorded in the log.
Beyond the commit point, the transaction is said to be committed, and
its effect is assumed to be permanently recorded in the database.
The transaction then writes an entry [commit,T] into the log.
Roll Back of transactions:
Needed for transactions that have a [start_transaction,T] entry in the
log but no commit entry [commit,T] in the log.
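The roll-back rule above amounts to a single pass over the log. A minimal sketch, assuming the log is a list of tuples whose first two fields are the entry type and the transaction identifier (the function name is hypothetical):

```python
def transactions_to_roll_back(log):
    """Return transactions with a start_transaction entry but no commit entry."""
    started, committed = [], set()
    for rec in log:
        if rec[0] == "start_transaction":
            started.append(rec[1])
        elif rec[0] == "commit":
            committed.add(rec[1])
    return [t for t in started if t not in committed]

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 5, 10),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 7, 8),
    ("commit", "T1"),
    # crash happens here: T2 never reached its commit point
]

pending = transactions_to_roll_back(log)
```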
Redoing transactions:
Transactions that have written their commit entry in the log must also have
recorded all their write operations in the log; otherwise they would not be
committed, so their effect on the database can be redone from the log entries.
Note that the log file must be kept on disk: at the time of a system crash,
only the log entries that have been written back to disk are considered in
the recovery process, because the contents of main memory may be lost.
Force writing a log:
Before a transaction reaches its commit point, any portion of the log that has
not been written to the disk yet must now be written to the disk.
This process is called force-writing the log file before committing a
transaction.
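At the operating-system level, force-writing roughly corresponds to flushing buffered log records and asking the OS to push them to the storage device. The sketch below uses Python's os.fsync for that purpose; the textual record format, the file name, and the function names are assumptions, and forcing the commit record itself is a design choice of this sketch rather than something stated above.

```python
import os
import tempfile

def force_write(log_file, records):
    """Append records and force them out of memory buffers onto the storage device."""
    for rec in records:
        log_file.write(rec + "\n")
    log_file.flush()               # push Python's buffer to the operating system
    os.fsync(log_file.fileno())    # ask the OS to write its buffers to the device

def commit(log_file, txn, pending_records):
    # Log records not yet on disk are forced before the commit point.
    force_write(log_file, pending_records)
    # Assumption of this sketch: the commit record is forced as well.
    force_write(log_file, [f"[commit,{txn}]"])

path = os.path.join(tempfile.mkdtemp(), "system.log")
with open(path, "a", encoding="utf-8") as f:
    commit(f, "T1", ["[start_transaction,T1]", "[write_item,T1,X,5,10]"])

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```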
Desirable Properties of Transactions
Called ACID properties:
Atomicity: A transaction is an atomic unit of processing; it is either
performed in its entirety or not performed at all.
Consistency preservation: A correct execution of the transaction must
take the database from one consistent state to another.
Isolation: A transaction should not make its updates visible to other
transactions until it is committed; this property, when enforced strictly,
solves the temporary update problem and makes cascading rollbacks of
transactions unnecessary.
Durability or permanency: Once a transaction is committed, its
changes (writes) applied to the database must never be lost because of
subsequent failure.
Schedules of Transactions
Transaction schedule (or history):
When transactions are executing concurrently in an interleaved
fashion, the order of execution of operations from the various
transactions forms what is known as a transaction schedule (or
history)
The figure on the next slide shows four possible schedules (A, B, C, D) of two
transactions T1 and T2:
Operations are ordered from top to bottom
Each schedule includes the same operations
The order of operations differs in each schedule
Formal definition of a schedule (or history) S of n transactions T1, T2, ..., Tn:
An ordering of all the operations of the transactions subject to the constraint
that, for each transaction Ti that participates in S, the operations of Ti in S must
appear in the same order in which they occur in Ti.
Note: Operations from other transactions Tj can be interleaved with the
operations of Ti in S.
Some schedules are easy to recover from after a failure, while others are not.
Some schedules produce correct results, while others produce incorrect results.
Schedules are therefore characterized by ease of recovery (recoverability)
and by correctness (serializability).
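The constraint in the formal definition can be checked mechanically: projecting a schedule onto each transaction must give back that transaction's own operation order. A small sketch, assuming operations are encoded as (op, transaction, item) tuples; the encoding and the function name are choices made for this illustration.

```python
def is_valid_schedule(schedule, transactions):
    """Check that the operations of each Ti appear in S in the same order as in Ti.

    schedule:     list of (op, txn, item) tuples, e.g. ("r", "T1", "X")
    transactions: dict mapping txn -> list of (op, item) in program order
    """
    if any(txn not in transactions for _, txn, _ in schedule):
        return False
    for txn, ops in transactions.items():
        projected = [(op, item) for op, t, item in schedule if t == txn]
        if projected != ops:
            return False
    return True

transactions = {
    "T1": [("r", "X"), ("w", "X")],
    "T2": [("r", "X"), ("w", "X")],
}

# r1(X); r2(X); w1(X); w2(X): interleaved, but each Ti keeps its own order.
interleaved = [("r", "T1", "X"), ("r", "T2", "X"), ("w", "T1", "X"), ("w", "T2", "X")]

# w1(X); r1(X); r2(X); w2(X): T1's operations are swapped, so this is not a schedule.
reordered = [("w", "T1", "X"), ("r", "T1", "X"), ("r", "T2", "X"), ("w", "T2", "X")]

valid = is_valid_schedule(interleaved, transactions)
invalid = is_valid_schedule(reordered, transactions)
```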
Characterizing Schedules based on Recoverability
Example: Schedule A below is non-recoverable because T2 reads the value of X
that was written by T1, but then T2 commits before T1 commits or aborts.
To make it recoverable, the commit of T2 (c2) must be delayed until T1 either
commits, or aborts (Schedule B)
If T1 commits, T2 can commit
If T1 aborts, T2 must also abort because it read a value that was written by T1;
this value must be undone (reset to its old value) when T1 is aborted.
This is known as cascading rollback.
Schedule A: r1(X); w1(X); r2(X); w2(X); c2; r1(Y); w1(Y); c1 (or a1)
Schedule B: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y); c1 (or a1); ….
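The recoverability condition can be tested with one pass over a schedule: a transaction may commit only after every transaction it read from has committed. The sketch below is simplified and handles only read, write, and commit operations (aborts are not modelled). Operations are (op, transaction, item) tuples, and the second schedule completes Schedule B with c1 followed by an assumed c2, since the slide leaves the tail of B open.

```python
def is_recoverable(schedule):
    """Simplified check; assumes the schedule contains only r, w, and c operations."""
    last_writer = {}   # item -> transaction that most recently wrote it
    reads_from = {}    # transaction -> set of transactions whose writes it read
    committed = set()
    for op, txn, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # txn may commit only if everything it read from has already committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True

# Schedule A: r1(X); w1(X); r2(X); w2(X); c2; r1(Y); w1(Y); c1
schedule_a = [
    ("r", "T1", "X"), ("w", "T1", "X"), ("r", "T2", "X"), ("w", "T2", "X"),
    ("c", "T2", None), ("r", "T1", "Y"), ("w", "T1", "Y"), ("c", "T1", None),
]

# Schedule B with c1, followed by an assumed c2 for illustration.
schedule_b = [
    ("r", "T1", "X"), ("w", "T1", "X"), ("r", "T2", "X"), ("w", "T2", "X"),
    ("r", "T1", "Y"), ("w", "T1", "Y"), ("c", "T1", None), ("c", "T2", None),
]

a_ok = is_recoverable(schedule_a)
b_ok = is_recoverable(schedule_b)
```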
Recoverable schedules can be further refined:
Cascadeless schedule: A schedule in which a transaction T2 cannot read
an item X until the transaction T1 that last wrote X has committed
The set of cascadeless schedules is a subset of the set of recoverable
schedules
Schedules requiring cascaded rollback: A schedule in which an
uncommitted transaction T2 must be rolled back because it read an item
that was written by a failed transaction T1
Example: Schedule B below is not cascadeless because T2 reads the value of X that was written by T1
before T1 commits
If T1 aborts (fails), T2 must also be aborted (rolled back) resulting in cascading rollback
To make it cascadeless, the r2(X) of T2 must be delayed until T1 commits (or aborts and rolls back the
value of X to its previous value) – see Schedule C
Schedule B: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y); c1 (or a1);
Schedule C: r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X); ...
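The cascadeless condition can be checked in the same style: no transaction may read an item whose last writer is another transaction that has not yet committed or aborted. This is a simplified sketch with a hypothetical function name; operations are (op, transaction, item) tuples, and Schedules B and C are written only as far as the slide shows them.

```python
def is_cascadeless(schedule):
    """Simplified check that no transaction reads another transaction's uncommitted write."""
    uncommitted_writer = {}  # item -> transaction holding the latest uncommitted write
    for op, txn, item in schedule:
        if op == "w":
            uncommitted_writer[item] = txn
        elif op == "r":
            writer = uncommitted_writer.get(item)
            if writer is not None and writer != txn:
                return False  # dirty read: a later abort of `writer` would cascade
        elif op in ("c", "a"):
            # After commit (or abort and rollback) txn's writes no longer block readers.
            for key in [k for k, t in uncommitted_writer.items() if t == txn]:
                del uncommitted_writer[key]
    return True

# Schedule B (prefix shown on the slide): r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y); c1
schedule_b = [
    ("r", "T1", "X"), ("w", "T1", "X"), ("r", "T2", "X"), ("w", "T2", "X"),
    ("r", "T1", "Y"), ("w", "T1", "Y"), ("c", "T1", None),
]

# Schedule C (prefix shown on the slide): r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X)
schedule_c = [
    ("r", "T1", "X"), ("w", "T1", "X"), ("r", "T1", "Y"), ("w", "T1", "Y"),
    ("c", "T1", None), ("r", "T2", "X"), ("w", "T2", "X"),
]

b_cascadeless = is_cascadeless(schedule_b)
c_cascadeless = is_cascadeless(schedule_c)
```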
Characterizing Schedules based on Serializability
Serial schedules are not feasible for performance reasons:
No interleaving of operations
Long transactions force other transactions to wait
System cannot switch to other transaction when a transaction is waiting for
disk I/O or any other event
Serializable schedule: A schedule S is serializable if it is equivalent to some
serial schedule of the same n transactions
There are n! serial schedules for n transactions; a serializable schedule can be
equivalent to any one of them
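The n! count comes from the number of ways to order n transactions one after another. A short sketch that enumerates the serial schedules for three made-up transactions (the operation strings and the function name are purely illustrative):

```python
from itertools import permutations
from math import factorial

def serial_schedules(transactions):
    """Yield every serial schedule: each transaction runs to completion before the next."""
    for order in permutations(transactions):
        yield [op for name in order for op in transactions[name]]

txns = {
    "T1": ["r1(X)", "w1(X)"],
    "T2": ["r2(X)", "w2(X)"],
    "T3": ["r3(Y)"],
}

all_serial = list(serial_schedules(txns))
count = len(all_serial)
```

Because the count grows factorially, checking a schedule against every serial schedule quickly becomes impractical as n increases.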
Practical approach:
Come up with methods (concurrency control protocols) to ensure
serializability
DBMS concurrency control subsystem will enforce the protocol rules
and thus guarantee serializability of schedules
Current approach used in most DBMSs:
Use of locks with two-phase locking
Testing for conflict serializability
Looks only at the read_item(X) and write_item(X) operations in a schedule
Constructs a precedence graph (serialization graph) with one node for
each transaction, plus directed edges
An edge is created from Ti to Tj if an operation in Ti appears
before a conflicting operation in Tj (two operations conflict if they belong
to different transactions, access the same item, and at least one is a write)
The schedule is conflict serializable if and only if the precedence graph has
no cycles.
Algorithm: Testing Conflict Serializability of a Schedule S
1. For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence
graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an
edge (Ti → Tj) in the precedence graph.
3. For each case in S where Tj executes a write_item(X) after Ti executes
a read_item(X), create an edge (Ti → Tj) in the precedence graph.
4. For each case in S where Tj executes a write_item(X) after Ti executes
a write_item(X), create an edge (Ti → Tj) in the precedence graph.
5. The schedule S is conflict serializable if and only if the precedence graph has
no cycles.
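The five steps above can be sketched directly: build the edges from conflicting pairs, then look for a cycle with a depth-first search. This is one possible illustration rather than a reference implementation; operations are (op, transaction, item) tuples restricted to reads and writes, and the function names are chosen for the sketch.

```python
def precedence_graph(schedule):
    """Steps 1-4: one node per transaction, one edge Ti -> Tj per conflicting pair."""
    nodes = {txn for _, txn, _ in schedule}
    edges = set()
    for i, (op_i, ti, x) in enumerate(schedule):
        for op_j, tj, y in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and x == y and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return nodes, edges

def has_cycle(nodes, edges):
    """Depth-first search with three colours to detect a directed cycle."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GRAY
        for m in adj[n]:
            if color[m] == GRAY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)

def is_conflict_serializable(schedule):
    """Step 5: conflict serializable if and only if the graph is acyclic."""
    return not has_cycle(*precedence_graph(schedule))

# r1(X); w1(X); r2(X); w2(X): only the edge T1 -> T2, so no cycle.
s1 = [("r", "T1", "X"), ("w", "T1", "X"), ("r", "T2", "X"), ("w", "T2", "X")]

# r1(X); r2(X); w1(X); w2(X): edges T1 -> T2 and T2 -> T1 form a cycle.
s2 = [("r", "T1", "X"), ("r", "T2", "X"), ("w", "T1", "X"), ("w", "T2", "X")]

s1_ok = is_conflict_serializable(s1)
s2_ok = is_conflict_serializable(s2)
```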
Transaction Support in SQL
A single SQL statement is always considered to be atomic.
Either the statement completes execution without error or it fails
and leaves the database unchanged.
With SQL, there is no explicit Begin Transaction statement.
Transaction initiation is done implicitly when particular SQL
statements are encountered.
Every transaction must have an explicit end statement, which is either
a COMMIT or ROLLBACK.
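Implicit initiation and explicit termination can be observed from a host language. The sketch below uses Python's built-in sqlite3 module in its default transaction mode, which opens a transaction implicitly before an INSERT; this is that module's behavior and only an analogy for the SQL rules above. The employee table is a made-up, cut-down example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, salary INTEGER)")

before = conn.in_transaction   # no transaction has been started yet
conn.execute("INSERT INTO employee VALUES ('991004321', 35000)")
during = conn.in_transaction   # the INSERT implicitly began a transaction
conn.rollback()                # explicit end statement: undo the insert
after = conn.in_transaction

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Replacing conn.rollback() with conn.commit() would end the transaction the other way and leave the inserted row in place.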
Characteristics specified by a SET TRANSACTION statement in SQL:
Access mode:
READ ONLY or READ WRITE.
The default is READ WRITE unless the isolation level of
READ UNCOMMITTED is specified, in which case READ
ONLY is assumed.
Diagnostics size: DIAGNOSTICS SIZE n specifies an integer value n, indicating the
number of conditions that can be held simultaneously in the diagnostics
area.
Isolation level <isolation>, where <isolation> can be READ
UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or
SERIALIZABLE. The default in the SQL standard is SERIALIZABLE, although
some DBMSs use a different default such as READ COMMITTED.
With SERIALIZABLE: the interleaved execution of transactions
will adhere to our notion of serializability.
Sample SQL transaction:
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
    READ WRITE,
    DIAGNOSTICS SIZE 5,
    ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
    INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
    VALUES ('Robert', 'Smith', '991004321', 2, 35000);
EXEC SQL UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...