Transaction Management - PPTs
How does the DBMS enforce correct query execution when multiple queries and updates run in parallel? How can we improve performance by weakening consistency guarantees?
Transactions
Concurrency in a DBMS
Users submit transactions, and can think of each transaction as executing by itself.
Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions. Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.
DBMS will enforce all specified constraints. Beyond this, the DBMS does not really understand the semantics of the data. (E.g., it does not understand how the interest on a bank account is computed.)
A user's program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with what data is read from or written to the database. A transaction is the DBMS's abstract view of a user program: a sequence of reads and writes.
Example
T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END
Isolation: Transaction semantics do not depend on other concurrently executed transactions.
Durability: Effects of successfully committed transactions should persist, even when crashes occur.
T1 transfers $100 from B's account to A's account. T2 credits both accounts with a 6% interest payment. There is no guarantee that T1 will execute before T2 or vice versa if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.
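The two acceptable outcomes can be made concrete with a small simulation. This is only a sketch: the starting balances (A=1000, B=2000), the dictionary standing in for the database, and the helper name run_serial are assumptions for illustration, not part of the slides.

```python
from decimal import Decimal

def t1(db):
    """T1: transfer $100 from B's account to A's account."""
    db["A"] += 100
    db["B"] -= 100

def t2(db):
    """T2: credit both accounts with a 6% interest payment."""
    db["A"] *= Decimal("1.06")
    db["B"] *= Decimal("1.06")

def run_serial(order, initial):
    """Run whole transactions one after the other on a copy of the state."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db

# Assumed starting balances; the slides do not give any.
initial = {"A": Decimal(1000), "B": Decimal(2000)}

t1_then_t2 = run_serial([t1, t2], initial)
t2_then_t1 = run_serial([t2, t1], initial)
print("T1;T2 ->", t1_then_t2)
print("T2;T1 ->", t2_then_t1)
```

The two serial orders end in different states, and the DBMS may produce either one. What it must not produce is a state that matches neither.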
Example (Contd.)
Scheduling Transactions
Serial schedule: Schedule that does not interleave the actions of different transactions.
Easy for programmer, easy to achieve consistency
Bad for performance
Equivalent schedules: For any database state, the effect (on the objects in the database) of executing the first schedule is identical to the effect of executing the second schedule.
Serializable schedule: A schedule that is equivalent to some serial execution of the transactions.
Retains advantages of serial schedule, but addresses performance issue
Note: If each transaction preserves consistency, every serializable schedule preserves consistency.
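A rough way to see these definitions at work is to run an interleaved schedule and compare its result with the results of all serial orders. The sketch below reuses T1 and T2 from the earlier example; the starting balances, the two interleavings, and all helper names are assumptions for illustration. Note that equivalence is defined over every database state, so a match on one state is only evidence, while a mismatch on one state is enough to show that the schedule is not serializable.

```python
from fractions import Fraction
from itertools import permutations

def add(obj, amount):
    """Build an action that adds amount to one database object."""
    def action(db):
        db[obj] += amount
    return action

def scale(obj, factor):
    """Build an action that multiplies one database object by factor."""
    def action(db):
        db[obj] *= factor
    return action

# Each transaction is a list of actions (hypothetical encoding).
TXNS = {
    "T1": [add("A", 100), add("B", -100)],
    "T2": [scale("A", Fraction(106, 100)), scale("B", Fraction(106, 100))],
}

def run(schedule, initial):
    """Execute a schedule given as (transaction name, action index) pairs."""
    db = dict(initial)
    for name, i in schedule:
        TXNS[name][i](db)
    return db

def serial_outcomes(initial):
    """Final states of all serial orders of the transactions."""
    outcomes = []
    for order in permutations(TXNS):
        schedule = [(n, i) for n in order for i in range(len(TXNS[n]))]
        outcomes.append(run(schedule, initial))
    return outcomes

def matches_some_serial(schedule, initial):
    """True if the schedule ends in the same state as some serial order."""
    return run(schedule, initial) in serial_outcomes(initial)

initial = {"A": Fraction(1000), "B": Fraction(2000)}  # assumed balances
good = [("T1", 0), ("T2", 0), ("T1", 1), ("T2", 1)]   # same effect as T1;T2
bad = [("T1", 0), ("T2", 0), ("T2", 1), ("T1", 1)]    # matches no serial order

print(matches_some_serial(good, initial))
print(matches_some_serial(bad, initial))
```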
More Anomalies
T1: T2:
R(A), W(A), C
Reading Uncommitted Data (WR Conflicts, dirty reads)
Example: T1(A=A-100), T2(A=1.06*A), T2(B=1.06*B), C(T2), T1(B=B+100)
T2 reads the value of A written by T1 before T1 has completed its changes.
Notice: If T1 later aborts, T2 worked with invalid data.
Unrepeatable Reads (RW Conflicts)
T1 sees two different values of A, even though it did not change A between the reads.
Example: online bookstore
Only one copy of a book is left.
Both T1 and T2 see that 1 copy is left, then try to order.
T1 gets an error message when trying to order.
This could not have happened with serial execution.
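The bookstore scenario can be replayed step by step. The following is a minimal sketch with no real concurrency: the dictionary stock, the exception OutOfStock, and the helper names are hypothetical stand-ins, and the interleaving is written out by hand.

```python
class OutOfStock(Exception):
    """Raised when an order is placed but no copy is left."""

stock = {"book": 1}  # only one copy left

def read_copies(db):
    return db["book"]

def order(db):
    if db["book"] < 1:
        raise OutOfStock("no copies left")
    db["book"] -= 1

# Hand-written interleaving of T1 and T2 (no locking).
seen_by_t1 = read_copies(stock)   # T1 reads: 1 copy left
seen_by_t2 = read_copies(stock)   # T2 reads: 1 copy left
order(stock)                      # T2 orders the last copy
try:
    order(stock)                  # T1 orders, but the value it read is stale
    t1_result = "ordered"
except OutOfStock:
    t1_result = "error"

print(seen_by_t1, seen_by_t2, t1_result)
```

In a serial execution, whichever transaction ran second would already have seen 0 copies on its first read, so it would never have attempted the order.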
Aborted Transactions
All actions of aborted transactions have to be undone.
A dirty read can result in an unrecoverable schedule:
T1 writes A, then T2 reads A and makes modifications based on A's value.
T2 commits, and later T1 is aborted.
T2 worked with invalid data and hence has to be aborted as well; but T2 has already committed.
Overwriting Uncommitted Data (WW Conflicts)
T1's B and T2's A persist, which would not happen with any serial execution.
Example: 2 people with the same salary
T1 sets both salaries to 2000, T2 sets both to 1000.
A schedule in which T2's write of A and T1's write of B come last results in A=1000, B=2000, which is inconsistent.
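The salary example can be checked by replaying the writes. This sketch assumes one particular interleaving (T1 writes A, T2 writes A, T2 writes B, T1 writes B); the schedule encoding and the helper name run are illustrative choices, not taken from the slides.

```python
def run(schedule):
    """Apply blind writes in order; each step is (transaction, object, value)."""
    db = {"A": 0, "B": 0}
    for _txn, obj, value in schedule:
        db[obj] = value
    return db

# The two serial orders.
t1_then_t2 = [("T1", "A", 2000), ("T1", "B", 2000),
              ("T2", "A", 1000), ("T2", "B", 1000)]
t2_then_t1 = [("T2", "A", 1000), ("T2", "B", 1000),
              ("T1", "A", 2000), ("T1", "B", 2000)]

# Assumed interleaving with WW conflicts on both A and B.
interleaved = [("T1", "A", 2000), ("T2", "A", 1000),
               ("T2", "B", 1000), ("T1", "B", 2000)]

print(run(t1_then_t2))   # both salaries equal
print(run(t2_then_t1))   # both salaries equal
print(run(interleaved))  # salaries differ
```

Either serial order leaves the two salaries equal; only the interleaving breaks that invariant.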
A DBMS can support concurrent transactions while preventing anomalies by using a locking protocol.
If a transaction wants to read an object, it first requests a shared lock (S-lock) on the object.
If a transaction wants to modify an object, it first requests an exclusive lock (X-lock) on the object.
Multiple transactions can hold a shared lock on an object.
At most one transaction can hold an exclusive lock on an object.
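The compatibility rules for S-locks and X-locks fit in a few lines. The class below is a minimal single-threaded sketch, not how a real lock manager is built: LockTable, request, and release_all are hypothetical names, a denied request simply returns False instead of queueing, and letting a sole holder upgrade from S to X is an added assumption.

```python
class LockTable:
    """Tracks, per object, the lock mode and the set of holding transactions."""

    def __init__(self):
        self._locks = {}  # object -> (mode, set of transaction names)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        held = self._locks.get(obj)
        if held is None:
            self._locks[obj] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # Assumption: a sole holder may re-request or upgrade its lock.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[obj] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # any combination involving an X-lock conflicts

    def release_all(self, txn):
        """Drop every lock held by txn, e.g. at commit or abort."""
        for obj in list(self._locks):
            _mode, holders = self._locks[obj]
            holders.discard(txn)
            if not holders:
                del self._locks[obj]

table = LockTable()
print(table.request("T1", "A", "S"))  # granted
print(table.request("T2", "A", "S"))  # granted, S is compatible with S
print(table.request("T3", "A", "X"))  # denied while readers hold A
```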
Assume initially the youngest sailor is 20 years old.
T1 contains this query twice:
SELECT rating, MIN(age) FROM Sailors WHERE rating = 8 GROUP BY rating
T1 cannot lock a tuple that T2 will insert, but T1 could lock the entire Sailors table.
Now T2 cannot insert anything until T1 has completed.
Now locking the entire Sailors table seems excessive, because inserting a new sailor with rating <> 8 would not create a problem
T1 can lock the predicate [rating = 8] on Sailors
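The effect of the insert, and of a predicate lock on [rating = 8], can be sketched with plain lists. Everything here is illustrative: the sample rows, the inserted sailor (rating 8, age 18), and the helpers min_age and try_insert are assumptions, and a blocked insert simply returns False instead of waiting.

```python
def min_age(rows, rating):
    """T1's query: the minimum age among sailors with the given rating."""
    return min(r["age"] for r in rows if r["rating"] == rating)

def try_insert(rows, row, predicate_locks):
    """Insert row unless it satisfies a predicate locked by another transaction."""
    if any(pred(row) for pred in predicate_locks):
        return False  # T2 would have to wait for T1
    rows.append(row)
    return True

def initial_sailors():
    # Assumed sample data; the youngest sailor is 20 years old.
    return [{"rating": 8, "age": 20},
            {"rating": 8, "age": 35},
            {"rating": 5, "age": 25}]

def rating_is_8(row):
    """The predicate [rating = 8] that T1 locks."""
    return row["rating"] == 8

new_sailor = {"rating": 8, "age": 18}  # assumed insert by T2

# Without any lock: T1's two reads disagree.
rows = initial_sailors()
first_read = min_age(rows, 8)
try_insert(rows, dict(new_sailor), [])
second_read = min_age(rows, 8)

# With a predicate lock: the conflicting insert is blocked, others proceed.
locked_rows = initial_sailors()
blocked = try_insert(locked_rows, dict(new_sailor), [rating_is_8])
allowed = try_insert(locked_rows, {"rating": 3, "age": 19}, [rating_is_8])

print(first_read, second_read, blocked, allowed)
```

The insert with rating 3 goes through even though T1 is still running, which is the advantage of locking the predicate rather than the whole table.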
Deadlocks
Performance of Locking
Locks force transactions to wait.
Abort and restart due to deadlock wastes the work done by the aborted transaction.
In practice, deadlocks are rare, e.g., due to the lock downgrade approach.
Waiting for locks becomes a bigger problem as more transactions execute concurrently.
Allowing more concurrent transactions initially increases throughput, but at some point leads to thrashing.
Need to limit the maximum number of concurrent transactions to prevent thrashing.
Minimize lock contention by reducing the time a Xact holds locks and by avoiding hotspots (frequently accessed objects).
Declaring a Xact as READ ONLY increases concurrency.
Isolation level: trade off concurrency against exposure of a Xact to other Xacts' uncommitted changes.
Isolation Level      READ UNCOMMITTED   READ COMMITTED   REPEATABLE READ   SERIALIZABLE
Dirty Read           Maybe              No               No                No
Unrepeatable Read    Maybe              Maybe            No                No
Phantom              Maybe              Maybe            Maybe             No
SERIALIZABLE: obtains locks on (sets of) accessed objects and holds them until the end.
REPEATABLE READ: same locks as for a SERIALIZABLE Xact, but does not lock sets of objects at a higher level.
READ COMMITTED: obtains X-locks before writing and holds them until the end; obtains S-locks before reading, but releases them immediately after reading.
READ UNCOMMITTED: does not obtain S-locks for reading and is not allowed to perform any writes, so it never requests any locks.
Summary
Concurrency control is one of the most important functions provided by a DBMS. Users need not worry about concurrency.
System automatically inserts lock/unlock requests and can schedule actions of different Xacts in such a way as to ensure that the resulting execution is equivalent to executing the Xacts one after the other in some order.