Unit 5 DBMS
When several transactions execute concurrently without any rules and protocols, various problems arise that
can harm the integrity of the data in the database. These problems are known as concurrency control problems.
Therefore, several rules are designed to maintain consistency among transactions while they execute
concurrently; these rules are known as concurrency control protocols.
A transaction is a single logical unit of work that can retrieve or modify the data of a database. Executing
transactions one at a time increases the waiting time of the other transactions and delays the overall
execution. Hence, to increase throughput and reduce waiting time, transactions are executed concurrently.
Example: Suppose 5 trains have to travel between two railway stations, A and B. If the trains are set in a row
and only one train is allowed to move from station A to B while the others wait for it to reach its destination,
it will take a long time for all the trains to complete the journey. To reduce the total time, all the
trains should be allowed to move concurrently from station A to B while ensuring there is no risk of collision between them.
When several transactions execute simultaneously, there is a risk of violating the integrity of the data in the
database. Concurrency control in DBMS is the procedure of managing simultaneous
transactions while ensuring their atomicity, consistency, isolation, and serializability.
Dirty Read Problem
The dirty read problem occurs when a transaction reads data that has been updated by another transaction
that is still uncommitted. It arises when multiple uncommitted transactions execute simultaneously.
Example: Consider two transactions A and B performing read/write operations on a data item DT in the database
DB. The current value of DT is 1000. The following table shows the read/write operations in transactions A
and B.
Time A B
T1 READ(DT) ------
T2 DT=DT+500 ------
T3 WRITE(DT) ------
T4 ------ READ(DT)
T5 ------ COMMIT
T6 ROLLBACK ------
Transaction A reads the value of DT as 1000 and modifies it to 1500, which is stored in a temporary
buffer and not yet committed. Transaction B reads DT as 1500 and commits, so the value of DT is permanently
changed to 1500 in the database DB. Then a server error occurs in transaction A and it rolls back DT
to its initial value, 1000. Transaction B has committed a value based on A's uncommitted write; this is the dirty read problem.
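The schedule can be replayed in code. Below is a minimal Python sketch, with committed_db and buffer as
illustrative names (not part of any real DBMS API), showing B committing a value that A later rolls back.

# A minimal sketch of the dirty-read schedule above.
committed_db = {"DT": 1000}   # durable, committed values
buffer = {}                   # transaction A's uncommitted writes

dt_a = committed_db["DT"]     # T1: A performs READ(DT) -> 1000
dt_a = dt_a + 500             # T2: DT = DT + 500
buffer["DT"] = dt_a           # T3: WRITE(DT) -> 1500, still uncommitted

dt_b = buffer.get("DT", committed_db["DT"])  # T4: B's dirty read -> 1500
committed_db["DT"] = dt_b                    # T5: B commits 1500

buffer.clear()                # T6: A rolls back its uncommitted write
print(committed_db["DT"])     # 1500: B committed a value that A revoked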
Unrepeatable Read Problem
The unrepeatable read problem occurs when the same data item is read two or more times within the same
transaction and different values are obtained.
Example: Consider two transactions A and B performing read/write operations on a data item DT in the database
DB. The current value of DT is 1000. The following table shows the read/write operations in transactions A
and B.
Time A B
T1 READ(DT) ------
T2 ------ READ(DT)
T3 DT=DT+500 ------
T4 WRITE(DT) ------
T5 ------ READ(DT)
Transactions A and B initially read the value of DT as 1000. Transaction A modifies the value of DT from 1000
to 1500, and then transaction B reads the value again and finds it to be 1500. Transaction B thus obtains two
different values of DT from its two read operations.
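A minimal Python sketch of this schedule (db is an illustrative stand-in for the database) shows B's two
reads returning different values:

# A minimal sketch of the unrepeatable-read schedule above.
db = {"DT": 1000}

first_read = db["DT"]         # T2: B reads 1000
db["DT"] = db["DT"] + 500     # T3-T4: A updates DT and writes 1500
second_read = db["DT"]        # T5: B reads again -> 1500

print(first_read, second_read)  # 1000 1500: one transaction, two values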
Phantom Read Problem
In the phantom read problem, the same data is read through two different read operations in the same transaction.
The first read operation returns a value, but the second one returns an error saying the
data does not exist.
Example: Consider two transactions A and B performing read/write operations on a data item DT in the database
DB. The current value of DT is 1000. The following table shows the read/write operations in transactions A
and B.
Time A B
T1 READ(DT) ------
T2 ------ READ(DT)
T3 DELETE(DT) ------
T4 ------ READ(DT)
Transaction B initially reads the value of DT as 1000. Transaction A then deletes DT from the database DB,
and when transaction B reads the value again, it receives an error saying that DT does not exist in the database
DB.
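The same schedule can be sketched in Python (again with a dictionary standing in for the database); B's second
read fails because A has deleted the item:

# A minimal sketch of the phantom-read schedule above.
db = {"DT": 1000}

print(db["DT"])               # T2: B reads 1000
del db["DT"]                  # T3: A deletes DT from the database
try:
    print(db["DT"])           # T4: B's second read
except KeyError:
    print("DT does not exist")  # the error B observes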
Lost Update Problem
The lost update problem arises when two different transactions update the same data and one update
overwrites the other.
Example: Consider two transactions A and B performing read/write operations on a data item DT in the database
DB. The current value of DT is 1000. The following table shows the read/write operations in transactions A
and B.
Time A B
T1 READ(DT) ------
T2 ------ READ(DT)
T3 DT=DT+500 ------
T4 WRITE(DT) ------
T5 ------ DT=DT+300
T6 ------ WRITE(DT)
Transactions A and B both read the value of DT as 1000. Transaction A modifies its copy of DT from 1000 to
1500 and writes it. Transaction B, still working with the stale value 1000, modifies its copy to 1300 and
writes it, overwriting A's write. The final value of DT is 1300, and therefore the update made by transaction A has been lost.
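A minimal Python sketch of this schedule makes the overwrite explicit (db, dt_a, and dt_b are illustrative
names):

# A minimal sketch of the lost-update schedule above.
db = {"DT": 1000}

dt_a = db["DT"]               # T1: A reads 1000
dt_b = db["DT"]               # T2: B reads 1000, before A writes
db["DT"] = dt_a + 500         # T3-T4: A writes 1500
db["DT"] = dt_b + 300         # T5-T6: B writes 1300 from its stale copy
print(db["DT"])               # 1300: A's +500 update has been lost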
Incorrect Summary Problem
The incorrect summary problem occurs when a transaction computes an aggregate (such as a sum) over several
data items while another transaction changes the value of one of those items, so the computed summary is
inconsistent.
Example: Consider two transactions A and B performing read/write operations on two data items DT1 and DT2 in
the database DB. The current value of DT1 is 1000 and of DT2 is 2000. The following table shows the read/write
operations in transactions A and B.
Time A B
T1 READ(DT1) ------
T2 add=0 ------
T3 add=add+DT1 ------
T4 ------ READ(DT2)
T5 ------ DT2=DT2+500
T6 READ(DT2) ------
T7 add=add+DT2 ------
Transaction A reads the value of DT1 as 1000 and starts computing the sum of the
two data items DT1 and DT2 in the variable add, but in between, the value of DT2 is changed from 2000 to 2500 by
transaction B. The variable add therefore uses the modified value of DT2 and gives a resultant sum of 3500 instead of
3000.
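The inconsistent aggregate can be reproduced with a short Python sketch (db and add are illustrative names):

# A minimal sketch of the incorrect-summary schedule above.
db = {"DT1": 1000, "DT2": 2000}

add = 0
add += db["DT1"]              # T1-T3: A reads DT1 and adds it to the sum
db["DT2"] = db["DT2"] + 500   # T4-T5: B updates DT2 to 2500 mid-aggregate
add += db["DT2"]              # T6-T7: A reads the modified DT2
print(add)                    # 3500 instead of the consistent sum 3000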
Lock-Based Protocols
To attain consistency, isolation between transactions is the most important tool. Isolation is achieved by
restricting a transaction from performing read/write operations on a data item while another transaction is
using it; this is known as locking. Through lock-based protocols, the desired operations are allowed to
proceed while conflicting operations are blocked by locks.
Shared Lock (S): A shared lock allows read operations but disallows write operations on a data item; several
transactions may hold shared locks on the same item at the same time. Shared locks are also known as
read-only locks and are represented by 'S'.
Exclusive Lock (X): An exclusive lock gives the holding transaction both read and write access to a data item
while preventing every other transaction from reading or writing it. Exclusive locks are represented by 'X'.
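The compatibility rule (S is compatible with S; X is compatible with nothing) can be sketched as a tiny lock
table in Python. This is an illustrative sketch only; the function names can_grant and acquire are
hypothetical, and a real lock manager would also queue and block waiting transactions.

# An illustrative lock table enforcing S/X compatibility.
locks = {}  # data item -> {"mode": "S" or "X", "holders": set of txn ids}

def can_grant(item, mode, txn):
    """S is compatible with S; X is compatible with nothing."""
    entry = locks.get(item)
    if entry is None or entry["holders"] == {txn}:
        return True
    return mode == "S" and entry["mode"] == "S"

def acquire(item, mode, txn):
    if not can_grant(item, mode, txn):
        return False  # a real system would make the transaction wait
    entry = locks.setdefault(item, {"mode": "S", "holders": set()})
    if mode == "X":
        entry["mode"] = "X"
    entry["holders"].add(txn)
    return True

print(acquire("DT", "S", "A"))  # True: first shared lock is granted
print(acquire("DT", "S", "B"))  # True: shared locks can coexist
print(acquire("DT", "X", "C"))  # False: X conflicts with existing S locks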
Pre-claiming Lock Protocol: According to the pre-claiming lock protocol, an assessment of the
operations to be performed is made first, and a list is prepared of the data items on
which locks will be required. The transaction requests all of these locks from the system before starting the
execution of its operations. If all the locks are granted, the operations in the transaction run smoothly and the
locks are returned to the system on completion. If any of the locks is not granted, the transaction rolls back.
Two-phase Locking Protocol: This protocol divides the execution of a transaction into two phases. The
transaction starts with the growing phase, in which it acquires the locks it needs and releases none. The
moment the transaction releases its first lock, the shrinking phase begins, in which the transaction may only
release locks and cannot acquire any new ones.
Strict Two-Phase Locking Protocol: Strict 2PL is similar to 2PL. The only difference is that
strict 2PL does not allow locks to be released immediately after the corresponding operations finish; instead,
the transaction holds all its locks and releases them when the commit is triggered.
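The 2PL discipline for a single transaction can be sketched as follows. This is an illustrative sketch, not a
full scheduler; the class name TwoPhaseTransaction is hypothetical, and it enforces only the rule that no lock
may be acquired after the first release (under strict 2PL, everything is simply held until commit).

# An illustrative sketch of the two-phase locking discipline.
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock acquired after unlock")
        self.held.add(item)      # growing phase: acquire freely

    def unlock(self, item):
        self.shrinking = True    # growing phase ends at the first release
        self.held.discard(item)

    def commit(self):
        # Under strict 2PL, unlock() is never called before this point:
        # all locks are held until commit and released here together.
        self.held.clear()

t = TwoPhaseTransaction("A")
t.lock("DT1"); t.lock("DT2")     # growing phase
t.unlock("DT1")                  # shrinking phase begins
# t.lock("DT3") would now raise a 2PL violation
t.commit()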
Timestamp-based Protocols
According to this protocol, every transaction has a timestamp attached to it, based on the
time at which the transaction enters the system. In addition, every data item has a read timestamp and a
write timestamp, which record the time at which the latest read and write operations were performed on that
item, respectively.
The timestamp ordering protocol uses the timestamp values of the transactions to resolve conflicting pairs of
operations, thus ensuring serializability among transactions. The following notation is used to
define the protocol for a transaction A on the data item DT:
Terms Denotations
Timestamp of transaction A TS(A)
Read time-stamp of data-item DT R-timestamp(DT)
Write time-stamp of data-item DT W-timestamp(DT)
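Under basic timestamp ordering, a read by A is rejected if TS(A) < W-timestamp(DT), and a write by A is
rejected if TS(A) < R-timestamp(DT) or TS(A) < W-timestamp(DT); a rejected transaction is rolled back. The
Python sketch below illustrates these checks (the functions read and write and the items dictionary are
illustrative names):

# An illustrative sketch of the basic timestamp-ordering checks.
items = {"DT": {"r_ts": 0, "w_ts": 0}}  # per-item read/write timestamps

def read(txn_ts, name):
    item = items[name]
    if txn_ts < item["w_ts"]:
        return False   # DT was already written by a younger txn: reject
    item["r_ts"] = max(item["r_ts"], txn_ts)
    return True

def write(txn_ts, name):
    item = items[name]
    if txn_ts < item["r_ts"] or txn_ts < item["w_ts"]:
        return False   # a younger txn already read or wrote DT: reject
    item["w_ts"] = txn_ts
    return True

print(read(5, "DT"))   # True: R-timestamp(DT) becomes 5
print(write(3, "DT"))  # False: transaction 3 is older than the last reader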
Validation-Based Protocols
Read phase: In this phase, the transaction reads the data and stores the result of every operation in its
local buffer. No modification is made to the database.
Validation phase: In this phase, validation tests are performed to check whether the values in the local
buffer can replace the original values in the database without violating serializability.
Validation Test: The validation test is performed on a transaction B executing concurrently with a transaction A
such that TS(A) < TS(B). The transactions must satisfy one of the following conditions:
1. Finish(A) < Start(B): Transaction A finishes all its operations before transaction B
starts, so the serializability order is maintained.
2. Start(B) < Finish(A) < Validate(B): The set of data items written by transaction A must not intersect
with the set of data items read by transaction B.
Write phase: If the transaction passes the tests of the validation phase, the buffered values are copied to the
database; otherwise, the transaction rolls back.
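The test can be written down directly. The sketch below is illustrative (the function validate and the field
names start, validate, finish, read_set, and write_set are hypothetical); it checks the two conditions above
for a transaction B validated after A:

# An illustrative sketch of the validation test for TS(A) < TS(B).
def validate(a, b):
    """Return True if B may commit given the already-validated A."""
    if a["finish"] < b["start"]:
        return True            # condition 1: A finished before B started
    if b["start"] < a["finish"] < b["validate"]:
        # condition 2: A's write set must not intersect B's read set
        return not (a["write_set"] & b["read_set"])
    return False               # otherwise B must roll back

a = {"start": 1, "validate": 7, "finish": 8,
     "read_set": {"DT1", "DT2"}, "write_set": set()}
b = {"start": 2, "validate": 9, "finish": 11,
     "read_set": {"DT1", "DT2"}, "write_set": {"DT1", "DT2"}}
print(validate(a, b))  # True: A writes nothing that B reads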
Example: Consider two transactions A and B performing read/write operations on two data items DT1 and DT2 in
the database DB. The current value of DT1 is 1000 and of DT2 is 2000. The following table shows the read/write
operations in transactions A and B.
Time A B
T1 READ(DT1) ------
T2 ------ READ(DT1)
T3 ------ DT1=DT1-100
T4 ------ READ(DT2)
T5 ------ DT2=DT2+500
T6 ------ READ(DT2)
T7 VALIDATE(A) ------
T8 PRINT(DT2-DT1) ------
T9 ------ VALIDATE(B)
T10 ------ WRITE(DT1)
T11 ------ WRITE(DT2)
The schedule passes the validation test because the validation timestamp of transaction A is less than that of
transaction B, and transaction A writes no data item that transaction B reads. It should be observed that the
write operations are performed only after both transactions have been validated; all the operations before the
final writes are performed in the local buffer.
Conclusion
● Concurrency Control in DBMS is a procedure of managing simultaneous transactions ensuring their
atomicity, isolation, consistency, and serializability.
● Several problems that arise when numerous transactions execute simultaneously in a random manner are
referred to as concurrency control problems.
● The dirty read problem occurs when a transaction reads the data that has been updated by another
transaction that is still uncommitted.
● The unrepeatable read problem occurs when two or more different values of the same data are read
during the read operations in the same transaction.
● The phantom read problem occurs when data read earlier in a transaction is deleted by another transaction,
so a subsequent read operation returns an error.
● The lost update problem arises when two different transactions update the same data and one update
overwrites the other.
● The incorrect summary problem occurs when an aggregate over several data items is computed while another
transaction modifies one of them.
● To maintain consistency and serializability during the execution of concurrent transactions some rules
are made. These rules are known as concurrency control protocols.
● Lock-based protocols restrict transactions from performing conflicting read or write operations by locking
data items.
● Timestamp-based protocols associate a timestamp with every transaction and execute operations according to
timestamp order.
● Validation-based protocols first record the new values of the data in a local buffer and then decide whether
those values should be written to the database or not.