Advanced DB Chapter-3
TRANSACTION PROCESSING
Chapter Outline
1. Introduction to Transaction Processing
2. Transaction and System Concepts
3. Desirable Properties of Transactions
4. Characterizing Schedules based on Recoverability
5. Characterizing Schedules based on Serializability
6. Transaction Support in SQL
1. Introduction to Transaction Processing
• Single-User System:
• At most one user at a time can use the system.
• Multiuser System:
• Many users can access the system concurrently.
• Concurrency:
• More than one transaction is allowed to run simultaneously on the same database.
• Interleaved processing:
• Concurrent execution of processes is interleaved on a single CPU.
• Parallel processing:
• Processes are executed concurrently on multiple CPUs.
Definition of transactions
A transaction (set of operations) may be specified in SQL, or may
be embedded within a program.
Transaction and System Concepts
• A transaction is an atomic unit of work that is either completed in its entirety or
not done at all.
• For recovery purposes, the system needs to keep track of when the transaction
starts, terminates, and commits or aborts.
• Transaction states:
1. Active state:
The transaction is being executed. This is the initial state of every transaction.
2. Partially committed state:
A transaction enters the partially committed state after its final operation has executed
(end_transaction). At this point, a recovery protocol must ensure that a system failure
cannot prevent its changes from being recorded permanently.
3. Committed state:
The transaction has completed all its operations successfully and its changes are
permanently recorded in the database.
4. Failed state:
A transaction is in the failed state if any of the checks made by the database recovery
system fails, or if it is aborted while active.
5. Terminated state:
The transaction leaves the system. The transaction information that is maintained in
system tables while the transaction is running is removed when the transaction terminates.
Cont...
Aborted: If any of the checks fails and the transaction has reached
the failed state, the recovery manager rolls back all its write
operations, restoring the database to the state it was in before the
transaction began.
After a transaction aborts, the database recovery module can select
one of two operations:
• Restart the transaction
• Kill the transaction
State transition diagram illustrating the states
for transaction execution
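The diagram itself did not survive in these notes. A minimal Python sketch of the transitions between the states listed earlier, assuming the usual textbook diagram. The event names and the `TRANSITIONS` table are illustrative and are not taken from the slide.

```python
# Hypothetical transition table for the transaction states.
# Keys are (current_state, event); values are the next state.
TRANSITIONS = {
    ("active", "end_transaction"): "partially_committed",
    ("active", "abort"): "failed",
    ("partially_committed", "commit"): "committed",
    ("partially_committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def run(events, state="active"):
    """Apply events in order; raise ValueError on an illegal transition."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal transition: {event!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run(["end_transaction", "commit", "terminate"]))  # terminated
print(run(["abort"]))                                   # failed
```

Read and write operations leave a transaction in the active state, so they are omitted from the table.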
Cont...
• Recovery manager keeps track of the following operations:
• begin_transaction: This marks the beginning of transaction execution.
• read or write: These specify read or write operations on the database items that
are executed as part of a transaction.
• end_transaction: This specifies that the transaction's read and write operations have
ended and marks the end of transaction execution.
• At this point it may be necessary to check whether the changes introduced by
the transaction can be permanently applied to the database or whether the
transaction has to be aborted because it violates concurrency control or for
some other reason.
• commit_transaction: This signals a successful end of the transaction so that any
changes (updates) executed by the transaction can be safely committed to the
database and will not be undone.
• rollback (or abort): This signals that the transaction has ended unsuccessfully,
so that any changes or effects that the transaction may have applied to the
database must be undone.
Cont...
• Recovery techniques use the following operators:
• undo: Similar to rollback except that it applies to a single
operation rather than to a whole transaction.
• redo: This specifies that certain transaction operations must be
redone to ensure that all the operations of a committed
transaction have been applied successfully to the database.
Desirable Properties of Transactions (1)
• Transaction: an indivisible unit of data processing
• All transactions must have the ACID properties:
• Atomicity: all or nothing
• Consistency: no constraint violations
• Isolation: no interference from other concurrent
transactions
• Durability: committed changes must not be lost due to
any kind of failure
Atomic transactions
• Fred wants to move $200 from his savings account to his checking
account.
• Transactions must be atomic (indivisible); the DBMS must ensure
atomicity.
• Everything happens, or nothing happens.
• The boundaries of a transaction (in time) are generally set by the
application; the DBMS has no means of determining the
intention of a transaction.
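A minimal sketch of Fred's $200 transfer using Python's built-in `sqlite3` module. The table layout, the starting balances, and the `fail_midway` switch are assumptions made for illustration. The application sets the transaction boundary with `with conn:`, and the DBMS undoes the debit if the second update never happens.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance REAL NOT NULL)")
# Starting balances are invented for this example.
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("savings", 500.0), ("checking", 100.0)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except RuntimeError:
        pass  # the debit above has been rolled back


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


transfer(conn, "savings", "checking", 200, fail_midway=True)
after_failure = balances(conn)   # nothing happened
transfer(conn, "savings", "checking", 200)
after_success = balances(conn)   # everything happened
print(after_failure)
print(after_success)
```

After the failed attempt both balances are unchanged. After the successful one the savings account is $200 lower and the checking account $200 higher.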
Correct transaction
• Wilma tries to withdraw $1000 from account 387.
• A transaction must leave the database in a valid (consistent)
state.
• valid state == no constraint violations
• A constraint is a declared rule specifying the valid database states.
• Constraints may be violated temporarily during a transaction,
but any violation must be corrected before the transaction completes.
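A minimal sketch of Wilma's $1000 withdrawal, again with `sqlite3`. The non-negative balance rule and the starting balance of 600.00 are assumptions chosen so that the withdrawal would violate a declared constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Declared rule (assumed for illustration): a balance may never go negative.
conn.execute("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES (387, 600.0)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Return True if the withdrawal committed, False if it was rejected."""
    try:
        with conn:
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, account_id))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the update was rolled back


def balance(conn, account_id):
    row = conn.execute("SELECT balance FROM account WHERE id = ?",
                       (account_id,)).fetchone()
    return row[0]


ok = withdraw(conn, 387, 1000)
print(ok, balance(conn, 387))
```

SQLite evaluates a `CHECK` constraint at the end of each statement rather than at commit time, so this sketch does not show a temporary violation inside a transaction. It only shows that an invalid state is never committed.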
Concurrent transactions
• Fred is withdrawing $500 from account 543.
• Wilma’s employer is depositing $1983.23 to account 543.
• These transactions are happening at the same time.
Transactions are isolated
• If two transactions occur at the same time, the cumulative effect must be
the same as if they had been done in isolation.
• Ensuring isolation is the task of concurrency control.
Durable transactions
• Wilma deposits $50,000 to account 387.
• Later, the bank’s computer crashes due to a lightning storm.
• Once a transaction's effect on the database state has been
committed, it must be permanent.
• The DBMS must ensure persistence, even in the event of system
failures.
• Sources of failure:
• computer or operating system crash
• disk failure
• fire, theft, power outage, earthquake, operator errors, …
Transaction Analysis
• To ensure transaction properties, it is sufficient to define transactions
as sequences of operations of two types:
• read_item(X): access the current value of item X.
• write_item(X): modify the value of item X.
• Example Transaction:
read_item(acct_387_balance)
acct_387_balance := acct_387_balance - 300
write_item(acct_387_balance)
Concurrency Problem
acct387 = 1700.00
Wilma's transaction:
w1: read_item(acct387)
w2: acct387 := acct387 + 900
w3: write_item(acct387)
acct387 = 2600.00 (correct result)
Possible Execution #2 (the two transactions overlap)
acct387 = 2000.00
w1: read_item(acct387)
f1: read_item(acct387)
f2: acct387 := acct387 - 300
w2: acct387 := acct387 + 900
w3: write_item(acct387)
acct387 = 2900.00
f3: write_item(acct387)
acct387 = 1700.00 (incorrect result!)
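The overlapping execution can be replayed with a small Python sketch. The `run_schedule` helper and its step names are illustrative. The serial order used for comparison (Fred first, then Wilma) is an assumption, since only the overlapping order is spelled out in full here.

```python
def run_schedule(schedule, initial):
    """Execute (transaction, step) pairs against one shared data item.

    Each transaction works on its own local copy, as with
    read_item/write_item.
    """
    db = {"acct387": initial}
    local = {}
    deltas = {"w": +900, "f": -300}  # Wilma deposits 900, Fred withdraws 300
    for txn, step in schedule:
        if step == "read":
            local[txn] = db["acct387"]
        elif step == "update":
            local[txn] += deltas[txn]
        elif step == "write":
            db["acct387"] = local[txn]
    return db["acct387"]


serial = [("f", "read"), ("f", "update"), ("f", "write"),
          ("w", "read"), ("w", "update"), ("w", "write")]
interleaved = [("w", "read"), ("f", "read"), ("f", "update"),
               ("w", "update"), ("w", "write"), ("f", "write")]

print(run_schedule(serial, 2000.00))       # 2600.0
print(run_schedule(interleaved, 2000.00))  # 1700.0
```

In the interleaved run Fred writes a value computed from a balance he read before Wilma's deposit, so her update is lost.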
Transaction Theory
Serializability and Equivalence of Transactions
Transaction Termination
Transaction Data Items
• Read and write operations also record the data item that was
involved.
• for simplicity, the same name is used for a data item in the
database, and for a copy of the data item in some application's
memory
• granularity = the size of the data items
• could be: an attribute, a record, a relation, a page/block
• granularity doesn't affect correctness of theory
• in practice, larger granularity leads to less concurrency, since
conflicts are detected (spuriously) more often
Schedules and Recoverability
• Schedule: A particular ordering of the operations from a set of
concurrent transactions.
• Only complete schedules are considered, others are not valid
(atomicity).
• Complete schedule:
1. Includes all operations from every transaction
2. commit or abort is the last operation in each transaction
3. operations from the same transaction appear in the same
order in the schedule
Characterizing Schedules based on Recoverability (1)
Schedules are classified by recoverability into four types:
1. Recoverable schedule:
• One where no committed transaction needs to be rolled back.
• A schedule S is recoverable if no transaction T in S commits
until all transactions T’ that have written an item that T
reads have committed.
2. Cascadeless schedule:
• One where every transaction reads only items that were
written by committed transactions.
3. Schedule requiring cascaded rollback:
• A schedule in which uncommitted transactions that
read an item from a failed transaction must be rolled
back.
4. Strict schedule:
• A schedule in which a transaction can neither read nor write an item X until the last
transaction that wrote X has committed or aborted.
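The definitions above can be checked mechanically. The following Python sketch is one possible reading of them, not a reference implementation. The tuple encoding of a schedule and the `classify` function are assumptions for illustration, and rollback is simplified to discarding the aborted transaction's writes.

```python
def classify(schedule):
    """Classify a schedule given as (txn, op, item) tuples.

    op is 'r' (read), 'w' (write), 'c' (commit) or 'a' (abort);
    item is None for 'c' and 'a'.
    """
    committed = set()
    writers = {}     # item -> transactions with a live write, oldest first
    reads_from = {}  # txn -> transactions whose writes it has read
    recoverable = cascadeless = strict = True

    for txn, op, item in schedule:
        if op == "c":
            # Recoverable: everything txn read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                recoverable = False
            committed.add(txn)
        elif op == "a":
            # Simplified rollback: discard the aborted transaction's writes.
            for live in writers.values():
                while txn in live:
                    live.remove(txn)
        else:
            live = writers.setdefault(item, [])
            last = live[-1] if live else None
            if last is not None and last != txn:
                if last not in committed:
                    strict = False           # touched an uncommitted write
                    if op == "r":
                        cascadeless = False  # dirty read
                if op == "r":
                    reads_from.setdefault(txn, set()).add(last)
            if op == "w":
                live.append(txn)
    return {"recoverable": recoverable,
            "cascadeless": cascadeless,
            "strict": strict}


dirty_read = [("T1", "w", "X"), ("T2", "r", "X"),
              ("T1", "c", None), ("T2", "c", None)]
early_commit = [("T1", "w", "X"), ("T2", "r", "X"),
                ("T2", "c", None), ("T1", "a", None)]
blind_write = [("T1", "w", "X"), ("T2", "w", "X"),
               ("T1", "c", None), ("T2", "c", None)]

for name, s in [("dirty_read", dirty_read), ("early_commit", early_commit),
                ("blind_write", blind_write)]:
    print(name, classify(s))
```

Under these assumptions, `dirty_read` is recoverable but not cascadeless, `early_commit` is not recoverable, and `blind_write` is cascadeless but not strict.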
Characterizing Schedules based on Serializability (1)
Five definitions are used to characterize schedules by serializability:
1. Serial schedule:
• A schedule S is serial if, for every transaction T participating in the schedule, all the
operations of T are executed consecutively in the schedule.
• Otherwise, the schedule is called a non-serial schedule.
2. Serializable schedule:
• A schedule S is serializable if it is equivalent to some serial schedule of the same n
transactions.
3. Result equivalent:
• Two schedules are called result equivalent if they produce the same final state of the
database.
4. Conflict equivalent:
• Two schedules are said to be conflict equivalent if the order of any two conflicting
operations is the same in both schedules.
5. Conflict serializable:
• A schedule S is said to be conflict serializable if it is conflict equivalent to some serial
schedule S’.
Example Transactions
• Three concurrent transactions:
T1          T2          T3
begin       begin       begin
read(X)     read(X)     read(Y)
X = X-2     X = X+3     Y = Y+1
write(X)    write(X)    write(Y)
read(Y)     end         end
Y = Y+2     commit      commit
write(Y)
end
commit
• begin/end enclose all reads and writes.
• All transactions end with a commit or an abort.
Schedule Notation
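• a more compact notation for schedules: bi, ri(X), wi(X), ei, and ci stand for the begin,
read(X), write(X), end, and commit operations of transaction Ti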
Serial Schedules
• A serial schedule is one in which the transactions do not overlap (in time).
• This is a serial schedule for the three example transactions; any other
back-to-back ordering of T1, T2, and T3 is serial as well:
b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
b2,r2(X),w2(X),e2,c2,
b3,r3(Y),w3(Y),e3,c3
Serial Schedules are Correct
• Serial schedules are correct schedules
• correct = transactions produce correct database states
• Since each transaction is consistency preserving, each
must produce a correct DB state when executed in isolation.
• In a serial schedule, transactions do not overlap, so
isolation must hold; thus, a serial schedule is a correct
schedule.
Serializability
• If a schedule can be proven to be equivalent to some serial
schedule, then that schedule must be correct.
• serializable = equivalent to some serial schedule
• There are different concepts of equivalence, each leading to
a different concept of serializability.
• We'll only consider the most conservative definition of
equivalence: conflict equivalence.
Conflict Equivalence
• Two schedules are conflict equivalent if the order of any two conflicting
operations is the same in both schedules.
• Two operations conflict if:
• they belong to different transactions
• they access the same data item (read or write)
• at least one is a write
T1: b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
T2: b2,r2(X),w2(X),e2,c2
• Find the conflicting operations:
r1(X), w2(X)
w1(X), r2(X)
w1(X), w2(X)
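The three conditions can be applied mechanically to a schedule in the compact notation. A minimal Python sketch follows; the `parse` and `conflicts` helpers and the regular expression are illustrative assumptions.

```python
import re

# Matches read/write operations such as r1(X) or w2(Y); b, e, c are ignored.
OP = re.compile(r"([rw])(\d+)\((\w+)\)")


def parse(schedule):
    """Return the read/write operations as (kind, txn, item) tuples, in order."""
    return [(kind, int(txn), item) for kind, txn, item in OP.findall(schedule)]


def conflicts(schedule):
    """List ordered pairs (earlier, later) of conflicting operations."""
    ops = parse(schedule)
    pairs = []
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # different transactions, same item, at least one write
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                pairs.append((f"{k1}{t1}({x1})", f"{k2}{t2}({x2})"))
    return pairs


s = "b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,b2,r2(X),w2(X),e2,c2"
for earlier, later in conflicts(s):
    print(earlier, "<", later)
```

For T1 followed by T2 this prints the same three pairs as listed above. The operations on Y do not appear because only T1 touches Y.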
Conflict Equivalence(cont...)
• The term "conflicting operations" can be misleading.
• The operations do not conflict in any particular schedule; rather,
they cause two schedules to be non-equivalent if their order
is different in the two schedules.
• A better term might be "conflict causing operations"
• Two operations from the same transaction cannot conflict,
since their relative order must be the same in all complete
schedules.
Example: Conflict Equivalence
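schedule 1:
b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,
b2,r2(X),w2(X),e2,c2
r1(X) < w2(X), w1(X) < r2(X), w1(X) < w2(X)
schedule 2:
b2,r2(X),w2(X),
b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1, e2,c2
w2(X) < r1(X), r2(X) < w1(X), w2(X) < w1(X)
schedule 3:
b1,r1(X),w1(X), b2,r2(X),w2(X),e2,c2, r1(Y),w1(Y),e1,c1,
r1(X) < w2(X), w1(X) < r2(X), w1(X) < w2(X)
• Schedules 1 and 3 order every conflicting pair the same way, so they are conflict
equivalent. Schedule 2 reverses each pair, so it is not conflict equivalent to either.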
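The comparison of the three schedules can be sketched in Python. The function names are illustrative, and the sketch assumes that no transaction repeats an identical operation (for example, r1(X) twice), since repeated operations would be merged in the set.

```python
import re


def rw_ops(schedule):
    """Read/write operations as (kind, txn, item) tuples, in schedule order."""
    return re.findall(r"([rw])(\d+)\((\w+)\)", schedule)


def conflict_order(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    ops = rw_ops(schedule)
    return {
        (a, b)
        for i, a in enumerate(ops)
        for b in ops[i + 1:]
        if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])
    }


def conflict_equivalent(s1, s2):
    """True if both schedules have the same operations and order every
    conflicting pair the same way."""
    same_ops = sorted(rw_ops(s1)) == sorted(rw_ops(s2))
    return same_ops and conflict_order(s1) == conflict_order(s2)


schedule1 = "b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,b2,r2(X),w2(X),e2,c2"
schedule2 = "b2,r2(X),w2(X),b1,r1(X),w1(X),r1(Y),w1(Y),e1,c1,e2,c2"
schedule3 = "b1,r1(X),w1(X),b2,r2(X),w2(X),e2,c2,r1(Y),w1(Y),e1,c1"

print(conflict_equivalent(schedule1, schedule3))  # True
print(conflict_equivalent(schedule1, schedule2))  # False
```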
Testing for Conflict Serializability
• We could test a schedule against every possible serial schedule
for the same transactions.
• this is intractable for large numbers of transactions
• Precedence graphs are a more efficient test
• graph indicates a partial order on the transactions required
by the order of the conflicting operations.
• the partial order must hold in any conflict equivalent serial
schedule
• if there is a cycle in the graph, the partial order cannot be
satisfied by any serial schedule
• if the graph has no cycles, the schedule is conflict serializable
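A minimal Python sketch of the precedence-graph test: build one edge Ti → Tj for each conflicting pair in which Ti's operation comes first, then look for a cycle with a depth-first search. The function names are illustrative, and the second schedule is a made-up example, not one from these notes.

```python
import re


def precedence_graph(schedule):
    """Edges (Ti, Tj): some operation of Ti conflicts with, and precedes,
    an operation of Tj."""
    ops = re.findall(r"([rw])(\d+)\((\w+)\)", schedule)
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((f"T{t1}", f"T{t2}"))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True  # back edge: cycle found
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


def conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


schedule3 = "b1,r1(X),w1(X),b2,r2(X),w2(X),e2,c2,r1(Y),w1(Y),e1,c1"
overlap = "b1,r1(X),b2,r2(X),w1(X),w2(X),e1,c1,e2,c2"  # hypothetical example

print(precedence_graph(schedule3), conflict_serializable(schedule3))
print(conflict_serializable(overlap))
```

Schedule 3 yields the single edge T1 → T2 and is conflict serializable. In the hypothetical `overlap` schedule, r1(X) < w2(X) gives T1 → T2 and r2(X) < w1(X) gives T2 → T1, so the graph has a cycle.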
Precedence Graph Examples: find the conflicting operations between the
transactions and draw the graph.
schedule 3:
b1,r1(X),w1(X),
b2,r2(X),w2(X),e2,c2, r1(Y),w1(Y),e1,c1,
Conflicting operations:
r1(X) < w2(X), w1(X) < r2(X), w1(X) < w2(X)
Precedence graph: T1 → T2
[Figure: further precedence graphs on T1 and T2, with an edge labeled r2(X) < w1(X)]