0% found this document useful (0 votes)
116 views

DBMS 5-Transaction Management SQL VTU

Module 5 - 21CS53

Uploaded by

Samanth
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
116 views

DBMS 5-Transaction Management SQL VTU

Module 5 - 21CS53

Uploaded by

Samanth
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 59
Module 5 Chapter 1: Transaction Processing 5.0 Introduction 5.1 Objectives 5.2 Introduction to Transaction Processing 5.2.1 Single-User versus Multiuser Systems 5.2.2 Transactions, Database Items, Read and Write Operations, and DBMS Buffers 5.2.3 Why Concurrency Control Is Needed 5.2.4 Why Recovery Is Needed 5.3 Transaction and System Concepts 5.3.1 Transaction States and Additional Operations 5.3.2 The System Log 5.3.3 Commit Point of a Transaction: 5.3.4 DBMS specific buffer Replacement policies 5.4 Desirable Properties of Transactions 5.5 Characterizing Schedules Based on Recoverability 5.6 Characterizing Schedules Based on Serializability 5.6.1 Testing conflict serializability ofa Schedule $ 5.7 Transaetion Support in SQL 5.8 Introduction to Coneurrency Control 5.9 Two-Phase Locking Techniques for Concurreney Control 5.9.1 Types of Locks and System Lock Tables 5.9.2 Guaranteeing Serializability by Two-Phase Locking 5.10 Variations of Two-Phase Locking 5.11 Dealing with Deadlock and Starvation 5.11 Deadlock Detection. 5.13 Concurrency Control Based on Timestamp Ordering 5.13.1 Timestamps 5.13.2 The Timestamp Ordering Algorithm 5.14 Multiversion Concurrency Control Techniques 5.14.1 Multiversion Technique Based on Timestamp Ordering 5.14.2 Multiversion Two-Phase Locking Using Certify Locks 5.15 Validation (Optimistic) Concurrency Control Techniques 5.16 Granularity of Data Items and Multiple Granularity Locking 5.16.1 Granularity Level Considerations for Locking §.16.2 Multiple Granularity Level Locking 5.17 Recovery Concepts 5.17.1 Recovery Outline and Categorization of Recovery Algorithms 5.17.2 Caching (Buffering) of Disk Blocks 5.17.3 Write-Ahead Logging, Steal/No-Steal, and Force/No-Force 5.17.4 Checkpoints in the System Log and Fuzzy Checkpointing 5.17.5 Transaction Rollback and Cascading Rollback 5.17.6 Transaction Actions That Do Not Affect the Database 5.18 NO-UNDO/REDO Recovery Based on Deferred Update 5.19 Recovery Techniques Based on Immediate Update 5.20 Shadow Paging 5.21 The ARIES Recovery Algorithm Database Backup and Recovery from Catastrophie Failures 5.23 Assignment Questions Expected Outcome Further Reading 5.0 Introduction The concept of transaction provides a mechanism for describing logical units of database processing. Transaction processing systems are systems with large databases and hundreds of concurrent users executing database transactions. Examples + airline reservations + banking * credit card processing + online retail purchasing + Stock markets, supermarket checkouts, and many other applications These systems require high availability and fast response time for hundreds of concurent users. A transaction is typically implemented by a computer program which includes database commands such as retrievals, insertions, deletions, and updates. 5.1 Objectives 4 To study transaction properties 4 To study creation of schedule and maintaining schedule equivalence. “ To check whether the given schedule is serailizable or not. + To study protocols used for locking objects Differentiating between 2PL and Strict 2PL 5.2 Introduction to Transaction Processing 5.2.1 Single-User versus Multiuser Systems * One criterion for classifying a database system is according to the number of users who can use the system concurrently Single-User versus Multiuser Systems * ADBMS is + single-user ~at most one user at a time can use the system -Eg: Personal Computer System + multiuser ~ many users can use the system and hence access the database concurrently - Eg: Airline reservation database Concurrent access is possible because of Multiprogramming. Multiprogramming can be achieved by: + interleaved execution + Parallel Processing Multiprogramming operating systems execute some commands from one process, then suspend that process and execute some commands from the next process, and so on A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU again Hence, concurrent execution of processes is actually interleaved, a¢ jiustrated ir Figure 21.1 Figure 21.4 Interleaved process ing versus parallel processing of con ‘current tansactons. Figure 21.1, shows two processes A anc B, executing concurrently in an interleaved fashion Interleaving keeps the CPU busy when a process requires an input or output (I/O) ‘operation, such as reading a block from disk The CPU is switched to execute another process rather than remaining idle during /O time Interleaving also prevents a long process from delaying other processes If the computer system has multiple hardware processors (CPUs), parallel processing of multiple processes is possible, as illustrated by processes C and D in Figure 21.1 Most of the theory concerning concurrency control in databases is developed in terms of interleaved concurrency In a multiuser DBMS, the stored date items are the primary resources thal may be accessed concurrently by interactive users or application programs, which are constantly retrieving information from and modifying the database. 5.2.2 Transactions, Database Items, Read and Write Operations, and DBMS Buffers * A Transaction an executing program that forms ¢ logical unit of database processing + Itincludes one or more DB access operations such as insertion, deletion modification or retrieval operation. * It car be either embedded within an application program using begin transaction and end transaction statements Or specifiec interactively via a high level query language such as SQL + Transaction which do not update database are known as read only transactions. * Transaction which do update database are known as read write transactions. + A database is basically representec as a collection of namec date items The size of 2 date ter is called its granularity * A data item can ve a database record, but it can also be a larger unit such as a whole disk block, or even @ smaller unit such as an individual field (attribute) value of some recore in the database = Each data item has a unique name * Basic DB access operations that a transaction can include are: m(X): Reads @ DE tem named X into a program variable. + write_item(X): Writes the value of a program variable inte the DB item named X + Executing read_item(X) include the following steps: 1. Find the address of the disk block that contains tem X 2. Copy the block into a bufferin main memory 3. Copy the item X from the buffer to program variable named X + Executing write_item(X) include the following steps: 1. Find the address of the disk block that contains item X 2. Copy the disk block into a outfer in main memory 3. Copy item X from program variable named X into its correct location in ouffer 4, Store the updated disk block from buffer back to disk (either immediately orlater) + Decision of when to store @ modified disk block is handled by recovery manager of the DBMS in cooperation with operating system + ADB cache includes a number of data buffers * When the buffers are all occupied a buffer replacement policy is used to choose one of the buffers to be replaced. EG: LRU "A transaction includes read_item and write_item operations to access and update DB. @ ) Th Figure 21.2 Two sample transac sm(X); read_item(X); tions. (@) Transaction X=X-N; X=X+M; T,.(b) Transaction Ty write_item(X); write_item(X); item(Y); +N; write_item(Y); + The read-set of a transaction is the set of all items that the transaction reads + The write-setis the set of all tems that the transaction writes + For example, the read-set of T’ in Figure 21.2 is (X, Y} and its write-set is also (X. Y} 5.2.3 Why Concurrency Control Is Needed "Several problems can occur when concurrent transactions execute in an uncontrolled + Bape: + We consider an Airline reservation DB + Each records is stored for an airline flight which includes Number of reserved seats among other information. + Types of problems we may encounter: The Lost Update Problem 2. The Temporary Update (or Dirty Read) Problem 3, The Incorrect Summary Problem 4 The Unrepeatable Read Problem Th qh read_item(X); read_item(X); X=X+M, X=X-N; write_item(X); write_item(X); read_item(Y); Y=Y+N; write_item(Y); + Transaction T? + transfers N reservations from one flight whose number of reservec seats is stored in the database tem named X to another fight whose number of reserved seats is storec in the database item named Y. + Transaction T2 + reserves M seats on the first fight (X) 1. The Lost Update Problem “occurs when two transactions that access the same DB tems have their operations interleaved in a way that makes the value of some DB item incorrect * Suppose that transactions T1 and T2 are submitted at approximately the same time, anc suppose that their operations are interleaved as shown in Figure below qh I read_item(X); read_item(X); X=X4M; Time | | write item(x); read_item(Y); sce Item X has an incorrect value because ee its update by 7, is Jost (overwritten). Yan; write_item(Y); * Fina value of item X's incorrect because T2 reads the value of X before T1 changes it in the database, and hence the updated value resulting from T" is lost + Forexample X = 80 at the start (there were 80 reservations on the flight) N=5 (T1 transfers 5 seat reservations from the flight corresponding toX to the flight corresponding to Y) M=4 (12 reserves 4 seats on X) The final result should be X = 79. + The interleaving of operations shown in Figure 's X = 84 because the update in T* that removed the five seats from X was lost 2, The Temporary Update (or Dirty Read) Problem "occurs when one transaction updates a database item anc ther the transaction fails for some reason « Meanwhile the updated iterr is accessec by another transaction before it is changed back to its original value q I read _item(X); X=X-M write_item(X); Time read_item(X); X=XeM; waite iter); Transaction T, fails and must change read _item(Y); the value of X back to its old value; meanwhile T, has read the temporary incorrect value of X 3. The Incorrect Summary Problem + If one transaction is calculating an aggregate summary function on @ number of db items while other transactions are updating same of these items, the aggregate function may calculate some values before they are updated and others after they are updated Ti ie um = 0; read torial ‘sum = sum + 4 read item) Tyreads X after is subtracted and reads J-=— Ybetore Ni added a wrong summary isthe roc (of by Ni). read som) raven we en) 4. The Unrepeatable Read Problem * Transaction T reads the same item twice anc gets different values on eact read, since the item was modified by another transaction T° between the two reads. = for example, if during an airline reservatior transaction, a customer inquires about seat availability on several flights "When the customer decides on a particular fight, the transaction then reads the number of seats on thal flight a second time before completing the reservation, and it may end up reading a different value for the item. 5.2.4 Why Recovery Is Needed + Whenever a transactior 's submitted to a DBMS for execution, the system is responsible for making sure that either 1. All the operations in the transaction are completed successfully and their effect is recorded permanently in the database or 2The transactior does not have any effect on the database or any other transactions + In the first case, the transaction is said to he committed, whereas in the seconc case, the transaction is aborted «If a transaction fails after executing some of its operations out before executing all of them, the operations already executed must be undone and have no lasting effect Types of failures 1. A computer failure (system crash): + A hardware, software, or network error occurs in the computer system during transaction executior + Hardware crashes are usually mecia failures—for example, main memory failure. 2. Atransaction or system error: + Some operation in the transaction may cause it tofail, such as integer overflow or division by zero + Also occur because of erroneous parameter values 3. Local errors or exception conditions detected by the transacti + During transaction execution, certair conditions may occur that necessitate cancellation of the transaction + For example, data for the transaction may not be found 4, Concurrency control enforcement: +The concurrency control may decide to abort a transactior because itviolates serializability or several transactions are in a state of deadlock 5. Disk failure: + Some disk blocks may lose their data because of a reac or write malfunction or because of a disk readiwrite head crash 6. Physical problems and catastrophes: + refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, overwriting disks or tapes by mistake + Failures of types 1, 2, 3, and 4 are more common than those of types 5 or 6 «Whenever a failure of type 1 through 4 occurs, the system must keep sufficient information tc quickly recover from the failure. * Disk failure or other catastrophic failures of type 5 or 6 do not happen frequently, if they do occur, recovery is @ major task 5.3 Transaction and System Concepts 5.3.1 Transaction States and Additional Operations "A transaction is an atomic unit of work that should either be completed in its entirety or not done at all. For recovery purposes, the system keeps track of start of a transaction, termination, commit or aborts. + BEGIN_TRANSACTION: marks the beginning of transaction executior + READ or WRITE: specify read or write operations on the database items thal are executed as part of a transaction + END_TRANSACTION: specifies thal READ and WRITE transaction operations have ended and marks the end of transaction execution + COMMIT_TRANSACTION: signals a successful end of the transaction so that any changes (updates) executed by the transaction can be safely committed to the database and will not be undone + ROLLBACK: signals that the transaction has ended unsuccessfully, sc thal any changes or effects that the transaction may have applied tc the database must be undone 10 ee) Figure: State transition diagram illustrating the states for transaction execution A transactior goes into active state immediately after it starts execution and can execute read and write operations. When the transaction ends it moves to partially committed state At this end additional checks are done to see if the transaction can be committed or not If these checks are successful the transaction is said to have reached commit point anc enters committed state. All the changes are recorded permanently in the db. A transactior can go to the failed state if one of the checks fails or f the transaction is aborted during its active state. The transaction may then have tc be rolled ack to unde the effect of its write operation. Terminated state corresponds to the transactior leaving the system. All the information about the transaction is removed from system tables. 5.3.2 The System Log Log or Journal keeps track of all transactior operations that affect the values of database items This information may be needed to permit recovery from transaction failures, The log 's kept on disk, so it is not affectec by any type of failure except for disk or catastrophic failure one (or more) main memory buffers nold the last part of the log file, sc that log entries are first added to the main memory buffer Wher the log buffer is filed, or when certain other conditions occur, the log buffer is appended to the end of the log file on disk. u * In addition, the log is periodcally oackec Up to archival storage (tape) to guarc against ‘such catastrophic failures + The following are the types of entries—called log records—that are written to the log file and the corresponding action for each log record. + In these entries, T -efers to 2 unique transaction-id thal is generated automatically by the system for each transaction and that is usec to identify each transaction: 1. [start_transaction, T]. Indicates that transaction T has started execution. 2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changec the value of database item X from old_value to new_value 3. [read_item, T, X]. Indicates that transactior T has read the value of database item X. 4. [commit, T]. Indicates that transaction T has completed successfully, anc affirms that its effect can be committed (recorded permanently) to the database. 5. [abort T]. Indicates that transaction T has been aborted 5.3.3 Commit Point of a Transaction: + Definition a Commit Point: — A transaction T reaches its commit point when all its operations that access the database nave beer executed successfully and the effect of all the transactior operations on the database has been recordec in the log. = Beyond the commit point, the transaction is said to be committed, ane its effect is assumed to be permanently recorded in the database. — The transaction then writes an entry {commit,T] into the log, + Roll Back of transactions: - Needed for transactions that have a [start_transaction,7] entry into the log out no commit entry [commit,T] into the log. 5.3.4 DBMS specific buffer Replacement policies Domain Separation(DS) method + DBMS cache is divided into separate domains, each handles one type of disk sages and replacements within each domain are handled via basic LRU page replacement. + LRUis a static algorithm and does not adopts to dynamically changing loads because the number of available buffers for each domain is predetermined + Group LRU adds dynamically load balancing feature since it gives each domain a oriority and selects pages from lower priority level domain first for replacement. 2 Hot Set Method: = This is usefu in queries that have to scan a set of pages repeatedly. = The hot set method determines for each db processing algorithm the set of disk pages that will be accessec repeatedly and it does not replace therr until their processing is completed. The DBMIN method = uses a model known as QLSM (Query Locality sel model), which predetermines the gattern of page references for each algorithm for a particular db operation = Depending on the type of access method, the file characteristics and the algorithm used the QLSWV will estimate the number of mair memory buffers needec for each file nvolvec in the operation. 5.4 Desirable Properties of Transactions + Transactions should possess several properties, often called the ACID properties, A Atomicity: a transaction is an atomic unit of processing and itis either performec entirely or not at all C Consistency Preservation: a transaction should be consistency preserving that is it must take the database from one consistent state to another. | Isolation/Independence: A transaction should appear as thouat it is being exeoutec in isolation from other transactions, even though many transactions are executed concurrently D Durability (or Permanency): fa transaction changes the database and is committed, the changes must never be lost because of any failure. The atomicity property requires that we execute a transaction to completion. It is the responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity + The preservation of consistency is generally considered to be the responsibility of the orogrammers who write the database programs or of the DBMS module that enforces integrity constraints. + The isolation property is enforced by the concurrency control subsystem of the DBMS. If every transaction does not make its updates (write operations) visible to other transactions until it is committed, one form of isolation is enforced that solves the temporary update problem and eliminates cascading rollbacks + Durability is the responsibility of recovery subsystem 5.5 Characterizing Schedules Based on Recoverability * schedule {or history): the order of execution of operations from all the various transactions * Schedules (Histories) of Transactions: A schedule $ of n transactions Ti, Tz,.....Tn is a sequential ordering of the operations of the n transactions. — The transactions are interleaved + Two operations in a schedule are said to conflict if they satisfy all three of the following conditions: (1) they belong to different transactions; (2) they access the same item X; and (3) at least one of the operations is a write_item(X) * Conflicting operations: + 14(X) conflicts with w.(X) Read write conflict + 20%) conflicts with wi(X) + wi(X) conflicts with we(X) Write conflict + 5X) do not conflicts with r2(X) Schedules classified on recoveral " Recoverable schedule = RUBANRE CS RASA CPAS PAREEIES T4548 commits until all transactions T that have written an iter thal T reads have committed, = Example’ © Se nO); wi(X); BOX); (Y); wa(X); C2) ar; + Si 1%); Wil) 200}; (YY, WAL): LY): CH C2 = Cascadeless schedule: — One where every transaction reads only the items that are written by committed 1» Scheddléee@iriig cascaded rollback: - A schedule in which uncommitted transactions that read an item from a failec transaction must be rolled oack. + Strict Schedules - Acschedule in which a transactior can neither read or write an item X until tre last transaction that wrote X has committed. 4 5.6 Characterizing Schedules Based on Serializability "schedules that are always considerec to be correc! when concurrent transactions are executing are known as serializable schedules * Suppose that two users—for example, two airline reservations agents—submit tc the DBMS transactions T+ and T2 at approximately the same time. If no interleaving of operations is permitted, there are only two possible outcomes 1. Execute all the operations of transaction T1 (in sequence) followed by all the operations of transaction T2 (in sequence) 2. Execute all the operations of transaction 72 (in sequence) followed by all the operations of transaction T1 (in sequence) Figuro 21.5 Examples of serial and nonserial schadulas invahing transactions T; and Ta (@) Satial schedule A:T, followed! by Ty. (tb) Seri schedule. B: Tp falowed by 7 (©) Two nonserial schedules C and D with intereaving of operations. ®) T te ® T te read tem(X) read tam) K=X- WM xaXaM. write_item{X); write _item(X); sme | | fad tem); sone | | teat; mm YN; 7 X=X-N; write_item(Y); write_item(X); read_item(X); asenn KexeM; yeven writer) write_item( Schedule A Schedule B © iA Ty T th read namin) foad_itam(x}; Rex xx # Tread tom): pin a X=XEM; tes time | | ite temoxy Time read.itom() road itor) KeXeMy ‘wt ern) Y=Y4N; ba read _item(¥);, wie tert) vaVany we item( Schedule © Schedule D 15 Serial schedule: — A schedule § is serial f, for every transaction T participating in the schedule all the operations of T are executed consecutively in the schedule. + Otherwise, the schedule is called nonserial schedule. Serializable schedule: — Aschedule S is serializable i itis equivalent to some serid schedule of the same n transactions. Result equivalent: = Two schedules are called result equivalent if they produce the same final state of the database Conflict equivalent: = Two schedules are said to be conflict equivalent if the order of any two conflicting operations is the same in both schedules. Conflict serializabl = Asschedule S is said to be confit serializable if itis conflict equivalent to some serial schedule S! Being serializable is not the same as being serial Being serializable implies that the schedule is a correct schedule. — Itwillleave the database in a consistent state. The interleaving is appropriate and will resultin a state as ifthe transactions were serially executed, yet will achieve efficiency due to concurrent execution 5.6.1 Testing conflict serializability of a Schedule S For each transaction Ti participating in schedule S,create a node labelec Ti in the precedence graph For each case in S where T| executes a read_item(X) after Ti executes a write_item(X), create an edge (Ti->T]) in the precedence graph, For eact case in S where Tj executes a write_iten(X! after Ti executes a read_i ,oreate an edge (Ti->7}) in the precedence graph For eact case in S where Tj executes a write_iten(X) after Ti executes a write_item(X), create an edge (Ti->T]) in the precedence graph The schedule $ is serializable if and only ifthe precedence graph has no cycles, sm (X) 16 @ (b) x © x @ QO ® ‘ Fig: Constructing the precedence graphs for schedules A and D from fig 21.5 to test for conflict serializability. (a) Precedence graph for serial schedule A (b) Precedence graph for serial schedule B. (c) Precedence graph for schedule C (not serializable). (d) Precedence graph for schedule D (serializable, equivalent to schedule A). = Another example of serializability testing. (a) The READ and WRITE operations of three transactions T;, Ta, an¢ Ts a transaction T, transaction Tz transaction Ts read item (X); read item (Z); read item (Y); write item (X); read item (Y); tead item (Z); read_item (Y); vwrite_item (); write_item (¥); write_item (Y); read item (X); write_iter (2); write_item (X); v7 ) Time © Time transaction T; read _item (X); vwaite_item (X); read_item (Y/); write_item (Y); read _itom (X); write_iter (X); read_item (Y); wrte_tem (Y); transaction Tz read_item (Z); read_item (Y); write_item (Y); read_item (X); write_item (X); Schedule E read_item (Z); read_item (Y); vwrite_item (Y); read _item (X); write_ite (X); ‘Schedule F transaction Ty read _item (Y); read_item (2); write_item (Y); write_item (Z); transaction T; read item (Y); read item (Z); vwrite_item (Y); vite_item (2); = Precedence graph for schedule E cara e vz oa le X(T, 979, 1 97) 960 XC. V2, Ta. MTT) = Precedence graph for schedule F Equivalent serial schedules h-hh 5.7 Transaction Support in SQL "The basic definition of an SQL transaction is, itis a logica unit of work anc is guaranteed to be atomic "A single SQL statement is always considered to be atomic—either it completes execution without an error or fails and leaves the database unchanged * With SQL, there is no explici! Begin_Transaction statement. Transaction initiation is done implicitly when particular SQL statements are encountered = Every transaction must have an explicit enc statement, which is either a COMMIT or a ROLLBACK * Every transaction has certain characteristics attributed to it an¢ are specifiec by a SET TRANSACTION statement in SQL 19 + The characteristics are + The access mode - can be specified as READ ONLY or READ WRITE ~ The defaull is READ WRITE = Amode of READWRITE allows select, update, insert, delete, and create commands to be executed - Amode of READ ONLY, as the name implies is simply for data retrieval + The diagnostic area size = DIAGNOSTIC SIZE n, specifies an integer value n, which indicates the umber of conditions that can be held simultaneously in the diagnostic aree - These conditions supply feedback information (errors or exceptions) to the user or program on the n most recently executed SQL statement + The isolation level + specified using the statement ISOLATION LEVEL , where the value for can be RFAN UNCOMMITTED, RFAN COMMITTED, REPEATABLE READ, or SERIALIZABLE = The default isolation leve is SERIALIZABLE ~ The use of the term SERIALIZABLE here is based on not allowing violations that cause dirty read, unrepeatable read, and phantoms - Ifa transaction executes at 2 lower ‘solation level than SERIALIZABLE, then one or more of the following three violations may occur: 1. Dirty read. A transaction T1 may read the update of a transactior T2, which has not yet committed. If T2 fails and is aborted, then T1 would have read a value that does not exist anc is incorrect. 2, Nonrepeatable read. A transaction T1 may read a given value from a table. If another transaction 72 later updates that value and 71 reads that value again T1 will see a different value. 3. Phantoms, A transaction Ti may read a set of rows from a table, perhaps based on some condition specifies in the SQL WHERE-clause. Now suppose that a transactior TZ inserts a new row that also satisfies the WHERE-clause condition usec in T1, into the table used by 71. If Tt is repeated, then T+ will see a phantom, a row that previously did not exist 20 Table 21.1 Possible Violations Based on Isolation Levels as Defined in SOL Typo of Violation Isolation Level Read Nonrepestable Read READ UNCOMMITTED Yes READ COMMITTED Yes REPEATABLE READ No ‘SERIALIZABLE No EXEC SQL WHENEVER SQLERROR GOTO UNDO; EMEC SQL SET TRANSACTION READ WRITE DIAGNOSTIC SIZE 5 ISOLATION LEVEL SERIALIZABL; EXEC SQL INSERT INTO EMPLOYEE (Frame, Inane, Ssn, Dno, Salary) VALUES (‘Robert', ‘smith’, *991004321", 2, 35000); EXEC SQL UPDATE EMPLOYEE Se? Salary = Salary * 1.1 WHERE Dno EXEC SQL COMMIT; GOTO THE_END; UNDO: EXEC SQL ROLLBACK; ‘THE_END: The transaction consists of first inserting a new row in the EMPLOYEE table and then Updating the salary of al employees who work in department 2 Ifan error occurs on any of the SQL statements, the entire transaction is rolled back This mplies that any updatec salary (by this transaction) woul ne restorec to its orevious value and that the newly inserted row would be removed 21 Chapter 2: Concurrency Control in Databases 5.8 Introduction to Concurrency Control + Purpose of Concurrency Control = To enforce Isolation (through mutual exclusion) among conflicting transactions, - To preserve database consistency through consistency preserving execution of transactions, = To resolve read-write and write-write conficts. + Example: — In concurrent execution environment if T1 conflicts with T2 over a date item A, ther the existing concurrency contro decides if T1 or T should get the A and if the other transaction is rolled-back or waits. 5.9 Two-Phase Locking Techniques for Concurrency Control = The concept of locking data items is one of the main techniques used for controlling the concurrent execution of transactions. = Alock's a variable associatec with a date iter in the database. Generally there is a lock for each data item in the database. = A lock describes the status of the date iterr with respect to possible operations that can be applies te that item * Itis used for synchronizing the access by concurrent transactions to the database items = A transaction locks an object before using it = When an object is locked by another transaction, the requesting transaction must wait 5.9.1 Types of Locks and System Lock Tables 4. Binary Locks = Abinary lock can have two states or values: locked and unlocked (or 1 and 0). = If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests the item 22 If the value of the lock on X is, the item can be accessed when requested, and the lock value is changed to 1 We refer to the current value (or state) of the lock associated with item X as lock(X). Two operations, lock_item and unlock_item, are usec with binary locking. A transaction requests access to anitem X by first issuing a lock_item(X) operation If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) =O, it is set to 1 (the transactior locks the item) and the transaction is allowed to access item X’ When the transaction is through using the item, itissues an unlock_item(X) operation, which sets LOCK(X) back to 0 (unlocks the item) so that X may be accessed by other transactions Hence, a binary lock enforces mutual exclusion on the dataitem. lock_item(X): B: if LOCK(X) = 0 (* item is unlocked *) then LOCK(X) 1 (*lockthe item *) else begin wait (until LOCK(X) = 0 and the lock manager wakes up the transaction); gotoB end; unlock_item(X): LOCK(X) + 0; (* unlock the item *) if any transactions are waiting then wakeup one of the waiting transactions; Fig: 2.1.1 Lock and unlock operations for binary licks. The lock_item and unlock_jtem operations must be mplemented as indivisible units that is. nc interleaving shoulc be allowed once a lock or unlock operatior is started until the operation terminates or the transaction waits The wait command within the lock_item(X) operation is usually implemented by outing the transactior in a waiting queue for iterr X until X is unlocked and the transaction can be granted access to it Other transactions that also want to access X are placec in the same queue.Hence, the wait commanc is considered to be outside the lock_item operation. It is quite simple to implement a binary lock’ all that is needec is a binary-valued vatiable, LOCK, associated with each date item X in the database In its simplest form, each lock can be @ record with three fields’ plus a queue for transactions that are waiting to access the tem If the simple binary locking scheme described here is used, every transaction must obey the following rules: 4. A transaction T must ‘ssue the operation lock_item(x) before any read_item(X) or write_item(X) operations are performed in T. 2. A transaction T must issue the operation unlock_item(X) after all read_item(X) and write_item(X) operations are completed in T. 3. A transaction T will not issue a lock_item(X) operation if t already holds the lock on item X. 4. A transaction Twill not issue an unlock item(X) operation unless it already holds the lock on item X. . Shared/Exclusive (or Read/Write] Locks inary locking scheme is too restrictive for database items because at most one transaction can hold a lock on a given iterr should allow several transactions to access the same item X if they all access X for reading purposes only f a transaction is to write an iter X, it must have exclusive access to X For this purpose, a different type of lock called a multiple-mode lock is used In this scheme—called shared/exclusive or read/write locks—there are three locking operations: read_lock(X), write_lock(X), and unlock(X). 24 * A read-locked item is also called share-locked because other transactions are allowed to read the item, whereas a write4ocked item is callec exclusive-locked because 2 single transaction exclusively holds the lock on the tem + Methoc to implement read/write lockis to = keep track of the number of transactions that hold a shared (read) lock on an iter in the locktable ~ Each record in the lock table will have four fields: . "If LOCK(X)=write-locked, the value of locking transaction(s) is a single transaction thal nolds the exclusive (write) lock on X + If LOCK(X)=read-locked, the value of locking transaction(s) is a list of one or more transactions that hold the shared (read) lock on X. read lock(X): B: if LOCK(X) = “unlocked” thenbegin LOCK(x) < “read-locked”; no_of reads(X) —1 end else if LOCK(X) = “read-locked" then no_of_reads(X) — no_of_reads(X) +1 else begin wait (until LOCK(0 = “unlocked” and the lock manager wakes up the transaction); gotoB en write lock(X): B: if LOCK(X) ="unlocked” then LOCK(X) < “writelocked” else begin wait (until LOCK() = “unlocked” and the lock manager wakes up the transaction); gotoB end; 25 unlock (0: if LOCK(X) = “writelocked” then begin LOCK(¥) <—“unlocked"; wakeup one of the waiting transactions, if any end else it LOCK(X) ="teadlocked” thenbegin no_of_reads(X) <—no_of reads(X)—1; if no_of reads(X) =0 thenbegin LOCK(X) = “unlocked”, wakeup one of the waiting transactions, if any end end; = When we use the shared/exclusive locking scheme, the system must enforce the following ‘ules 1. A transaction T must issue the operation read_lock(X) or write_lock(X) before any read_item(X) operation is performed in T. 2. A transaction T must issue the operation write_lock(X) before any write_item(X) operation is performed in T. 3A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X) operations are completed in T.3 4, A transaction T will not issue a read_lock(X) operation if t already holds a reac (shared) lock or a write (exclusive) lock onitem X. Conversion of Locks * A transaction that already holds 2 lock on item Xs allowed under certain conditions te convert the lock from one locked state to another + Forexample, it's possible for a transaction Ttoissue a read_lock(X) and then later to upgrade the lock by issuing a write_lock(X) operation - If Tis the only transaction holding a reac lock on X at the time it issues the write_lock(X) operation, the lock can be upgraded:otherwise, the transaction must wait 5.9.2 Guaranteeing Serializability by Two-Phase Locking 26 = A transaction is saic to follow the two-phase locking protocol if all locking operations (read_lock, write_lock) precede the first unlock operation in the transaction = Such a transaction can be divided into two phases: = Expanding or growing (first) phase, during which new locks on tems can be acquired but none can be released = Shrinking (second) phase, during which existing locks can be released but no new locks can be acquirec + Iflock conversion is allowed, then upgrading of locks (from read-locked to write-locked) must be done during the expanding phase, and downgrading of locks (from write-locked to read-locked) must be done in the shrinking phase. * Transactions Tt and T2 in Figure 22.3(a) de not follow the two-phase locking protocol because the write_lock(X) operation follows the unlock(Y) operation in Tt, anc similarly the write_lock(Y) operation follows the unlock(X) operation in T2 @ o Time Th ie read_lock(Y); | —read_lockiX) read item(Y); | read item(X) uniock( 1); unlock); write Jock(X); | write tock): read item; | read item(), XmX4 hi 4 wnte_tem(X: uniockX), unlock), ty Th read lock(Y); read_iter(Y); unioek(); read _lock(X); read _item\X): unlock(X}; write_lock{¥), read item|¥): YaX+ ¥; vwrite_tem(Y); unlock’); write took); read item), XeX4Y wrte_tem(0); unlock); (b) Initial values: X=20, Y=30 Result serial schedule T, followed by T,: X50, Y=80 Result of serial schedule T, Figure 21.3 Transactions that de not obey two-phase locking (a) Twe transactions T1 and 72 (b) Results of possible serial schedules of 71 and T2 (0) A nonserializable schedule S that uses locks 27 = If we enforce two-phase locking, the transactions can be rewritten as 1’ and T2'as shown in Figure 22.4. * Now the schedule shown in Figure 22.3(c) is not permitted for T1_ and T2_ (with their modified order of locking and unlocking operations) under the rules of locking because T1_ wil ‘sue its write_lock(X) before it unlocks tem Y. consequently when T2_ ssues its read_lock(X), itis forced to wait until T1_ releases the lock by issuing an unlock (X) in the schedule. qT Th read_lock(Y); read_item(Y); Li write_lock(X); write_lock(¥); unlock(Y) unlock(X) Figure 22.4 , read_item(X); read_item(Y); Transartinns T," and T,’, which ara the XaX+ Y=X+Y; same as T, and 7) in Figure 22.3, but wiite_item(X);| | write item(v); follow the two-phase locking protocol unlock(x); unlock(Y); Note that they can produce a deadlock. + If every transaction in a schedule follows the two-phase locking protocol, schedule guaranteed to be serializable + Two-phase locking may limit the amount of concurrency that can occur in a schedule * Some seridizable schedules will be prohibited by two-phase locking protocol 5.10 Variations of Two-Phase Locking * Basic 2PL = Technique described previously * Conservative (static) 2PL = Requires a transaction tc lock all the items it accesses before the transaction begins execution by predectaring read-set and wrte-set = Its Deadlock-free protocol 28 + Strict 2PL — guarantees strict schedules = Transaction does not release exclusive locks unt after it commits or aborts = 0 other transactior can read or write an tem that is written by T unless T has committed, leading to a strict schedule for recoverability = Strict 2PL is not deadlock-free + Rigorous 2PL = guarantees strict schedules — Transaction does not release any locks until after it commits or aborts — easier to implement than strict 2PL 5.11 Dealing with Deadlock and Starvation * Deadlock occurs wher each transactior T in a set of two or mare transactions is waiting for some item that is locked by some other transaction T in the set * Hence, each transactior in the set is in a waiting queue. waiting for one of the other transactions in the set to release the lock on ar item. * But because the other transactior is also waiting, it will never release the lock +A simple example is shown in Figure 22.5(a), where the two transactions Tt’ and T2_'are deadlockec in a partial schedule; 71’ is in the waiting queue for X_ which is locked by 72’, while 72’ is in the waiting queue for Y, which is locked by 1’. Meanwitile, neither Tt’ nor 72’ nor any other transaction can access items Xand Y (a) qT Tr read_lock(Y); read_item( wees read_lock(X); read_item(X); write_lock(X); write_lock(Y); Figure 22.5 llustrating the deadlock problem (a) A partial schedule of T1' and 72'thattis in a state of deadlock (b) A wait-for graph for the partial schedule in (a) 29 Deadlock prevention protocols * One way to prevent deadiock is to use a deadlock prevention protocol + One deadlock prevention protocol, which is used in conservative two-phase locking, requires that every transactior lock all the items it needs in advance. If any of the items cannot be obtained none of the items are locked Rather, the transaction waits and ther tries again tc lockall the items it needs. * A second protocol, which alsc limits concurrency, involves ordering all the items in the database and making sure thal a transaction that needs several items will lock them according to that order. This requires that the programmer (or the system) is aware of the chosen order of the items = Both approaches impractical * Some of these techniques use the concept of transaction timestamp TS(7), which is a unique dentifier assigned to each transactior = The timestamps are typically based on the order in which transactions are started; hence, if transaction T1 starts before transaction Ta, then TS(T1) < TS(T2). "The older transaction (which stars frst) has the smaller timestamp valve. + Protocols based on a timestamp + Wait-cie + Wound-wait * Suppose that transaction Titries to lock an item X but is not able to because Xis locked by some other transaction Tj with a conflicting lock. The ‘ules followed by these schemes are = Wait-die. If TS(Ti) < TS(7), then (Tiolder thar Tj) Tis allowed to wait otherwise (Ti younger than Tj) abort Ti (Ti dies) and restarit later with the same timestamp. = Wound-wait. If TS(Ti) < TS(7j, then (Ti older than Tj) abort Tj (Ti wounds Tj) anc restart it later with the same timestamp; otherwise (Ti younger than T)) Tis allowed te wait. + In wait-die, an older transaction is allowed to wait for a younger transaction, whereas a younger transactior requesting an item helo by an older transaction is abortec anc restarted. +The wound. approach does the opposite: A younger transactior is allowac to wail for an older one, whereas an older transaction requesting an item helc by a younger transaction preempts the younger transaction by aborting it, 30 Both schemes end up aborting the younger of the two transactions (the transaction that startec later) that may be involved in a deadlock, assuming that this will waste less processing * It car be showr thal these two techniques are deadlock-free, since in wait-die, transactions only wait for younger transactions so no cycle is created = Similarly, in wound-wait, transactions only wait for older transactions so ne cycle is created = Another group of protocols that orevent deadlock da not require timestamps. These include the + n0 waiting (NW) anc + cautious waiting (CW) algorithms + No waiting algorithm ~ if @ transaction is unable to oblair a lock, it is immediately aborted and then restarted after a certain time delay without checking whether a deadlock will actually occur or not — no transaction ever waits, so no deadlock will occur ~ this scheme can cause transactions to abort anc restart needlessly "cautious waiting ~ tty to reduce the number of needless aborts/restarts — Suppose that transaction Titries to lock an item X but is not able to do so because X's locked by some other transaction Tjwith a conflicting lock — The cautious waiting rules are as follows: * If Tjis not dlockec (not waiting for some other locked item), then Ti is blocked and dlowed to wait; otherwise abort Ti — It.can be shown that cautious waiting is deadlock-free, because no transaction will ever wait for another blocked transaction. 5.12 Deadlock Detection. "A second, more practical approach to dealing with deadlock is deadlock detection, where the system checks if a state of deadlock actually exists. "This solution is attractive if we «now there will be little interference among the transactions—that is, if different transactions will rarely access the same items at the same time a1 This can nappen if the transactions are short anc each transaction locks only a few tems or f the transactior loac is light. On the other hand, f transactions are long and eact transaction uses many items or if the transaction loac is quite heavy, it may be advantageous ta use a deadlock prevention scheme. A simple way to detect a state of deadlock ‘s for the system to construct and maintain a wait-for graph. One node is created in the wait-for graph for each transaction that is currently executing. Whenever a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge (Ti Tj)is createc in the wait-for graph. When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is dropped from the wait-for graph. We nave a state of deadlock if and only if the wait-for graph has a cycle One problem with this approach is the matter of determining when the syste should check for a deadlock. One possibilty is to check for a cycle every time an edge is added to the wait-for graph but this may cause excessive overhead, Criterie such as the number of currently executing transactions or the perioc of time several transactions have been waiting to lock items may be used instead to check for a cycle, Figure 22.5(b) shows the wait-for graph for the (partial) schedule shown in Figure 225(a) If the system is in a state of deadlock, some of tha transactians causing the deadlock must be aborted. Choosing which transactions to aborl is known as victim selection. The algorithm for victim selection should generally avoid selecting transactions that have een running for a long time and that nave performed many updates ane it should try instead to select transactions that have not made many changes (younger transactions). Timeouts: + Another simple scheme to ded with deadlock is the use of timeouts. + This methoc is practical because ofits low overhead and simplicity, + In this method, if a transactior waits for a perioc longer than a system-defined timeout period, the system assumes that the transaction may be deadlocked and aborts itregardless of whether a deadlock actually exists oF not. 32 Starvation. — Another problem thal may occur when we usé locking is starvation, which occurs when a transaction cannot proceed for an indefinite oerioc of time while other transactions in the system continue normally. - This may occur if the waiting scheme for locked items is unfair giving priority to some transactions over others = One solution for starvation is to have a fair waiting scheme, such as using a first- come-first-served queue; transactions are enabled to lock an item in the order in which they originally requested the lock — Another scheme allows some transactions to have priority over others out increases the priority of a transaction the longer it waits, until it eventually gets the highest priority and proceeds. Starvation can also occur because of victim selection if the algorithm selects the same transactior as victim repeatedly thus causing it to abort anc never finish execution The algorithm can use higher priorities for transactions that have been aborted multiple times to avoid this problem. 5.13 Concurrency Control Based on Timestamp Ordering guarantees serializability using transaction timestamps to order transaction execution for an equivalent serial schedule 5.13.1 Timestamps timestamp is a unique identifier created by the DBMS tc identify a transaction Typically, timestamp values are assigned in the order in which the transactions are submitted to the system, $0 a timestamp can be thought of as the transaction start time. We will refer to the timestamp of transaction Tas TS(7). Concurrency control techniques basec on timestamp ordering do nol use locks:hence, deadlocks cannot occur. Timestamps can be generatec in severa ways — One possibilty is to use a counter thal is incrementec each time its value is assigned to a transaction. The transaction timestamps are numberec 1, 2, 3, 33 in this scheme. A computer counter has @ finite maximum value, so the system must periodically reset the counter to zero wher no transactions are executing for some short period of time. — Anotherway te implement timestamps is to use the current date/time value of the system clock anc ensure that no two timestamp values are generatec during the same tick of the clock. 5.13.2 The Timestamp Ordering Algorithm * The dea for this scheme is to order the transactions basec on their timestamps. * A schedule in which the transactions participate is then serializable, and the only equivalent serial schedule permitted has the transactions in order of their timestamp values. This is called timestamp ordering (TO). * The algorithm must ensure that for each item accessec by conflicting Operations in the schedule. the order in which the tem is accessed does not violate the timestamp order. * To do this, the algorithm associates with each database tem X two timestamp (TS) values 1, read_TS(X). The read timestamp of item Xis the largest timestamp among all the timestamps of transactions thal have successfully read item X—thal is, read_TS(X) = TS(7), where T's the youngest transaction that has read X successfully, 2. write_TS(X). The write timestamp of item Xs the largest of all the timestamps of transactions that have successfully written item X— that is, write_TS(X) = TS(7), where Tis the youngest transaction thal has written X successfully. Basic Timestamp Ordering (TO). "Whenever some transactior T tries to issue a read_item(X) ar a write_item(X) operation, the basic TO algorithm compares the timestamp of T with read_TS(X) and write_TS(X) to ensure that the timestamp order of transaction execution is not violated. "If this order is violated, ther transaction Tis aborted anc resubmitted to the system as @ new transaction with a new timestamp. = If Tis aborted and railed back, any transaction T’ that may have used a value written by T must also be rolled back. 34 + Similarly, any transaction 72 that may have used a value wrtter by T1 must also be rolled back, anc 8¢ on. This effect is known as cascading rollback and is one of the problems associated with basic TO, since the schedules produced are not guaranteec to be recoverable * Ar additional protocol must be enforced to ensure that the schedules are recoverable, cascadeless, or strict. + The basic TO algorithm : + The concurrency control algorithm must check whether conflicting operations violate the timestamp ordering in the following two cases 1. Whenever a transactior T ssues a write_item(X) operation, the following is checked! a. If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T and ‘eject the operation. This should be done because some younger transactior with a timestamp greater than TS(T}—and hence affer T in the timestamp ordering—has already read or written the value of item Xbefore Thad a chance to write X, thus violating the timestamp ordering. b.If the condition in part (a) does not occur, then execute the write_item(X) operation of T and set write_TS(X) to TS(T). 2. Whenever a transaction Tissues a read_item(X) operation, the following is checked a. If write_TS() > TS(7), then abort anc roll back T anc reject the operation. This should be done because some younger transaction with timestamp greater thar TS(T)—and hence after T in the timestamp ordering—has already written the value of item X before Thad a chance to read X b. If write_TS(X) = TS(7), then execute the read_item(X) operation of T anc set read_TS(X) to the larger of TS(T) and the current read_TSQ%. * Whenever the basic TO algorithm detects twe conflicting operations that occur in the incorrect order, it rejects the later of the two operations oy aborting the transaction that issuec it, The schedules produced by basic TO are nence guaranteed to be confict serializable Strict Timestamp Ordering (TO) + A variation of basic TO called strict TO ensures that the schedules are both strict (for easy recoverability) and (conflict) serializable, 35,

You might also like