Unit 5
Transaction Processing
Concurrency Control in DBMS
Concurrency control is an important concept in DBMS. It allows
several processes or users to execute and manipulate data
simultaneously without causing data inconsistency.
Concurrency control deals with the interleaved execution of more
than one transaction.
What is Transaction?
A transaction is a collection of operations that performs a single
logical function in a database application. Each transaction is a unit
of both atomicity and consistency. Thus, we require that
transactions do not violate any database consistency constraints.
That is, if the database was consistent when a transaction started,
the database must be consistent when the transaction successfully
terminates. However, during the execution of a transaction, it may
be necessary to allow inconsistency temporarily. In a transfer of
funds from account A to account B, for example, either the
debit of A or the credit of B must be done before the other. This
temporary inconsistency, although necessary, may lead to difficulty
if a failure occurs.
It is the programmer’s responsibility to define properly the various
transactions, so that each preserves the consistency of the
database. For example, the transaction to transfer funds from the
account of department A to the account of department B could be
defined to be composed of two separate programs: one that debits
account A, and another that credits account B. The execution of
these two programs one after the other will indeed preserve
consistency. However, each program by itself does not transform
the database from a consistent state to a new consistent state.
Thus, those programs are not transactions.
The concept of a transaction has been applied broadly in database
systems and applications. While the initial use of transactions was in
financial applications, the concept is now used in real-time
applications in telecommunication, as well as in the management of
long-duration activities such as product design or administrative
workflows.
A set of logically related operations is known as a transaction. The
main operations of a transaction are:
Read(A): The read operation Read(A), or R(A), reads the value of A
from the database and stores it in a buffer in main memory.
Write(A): The write operation Write(A), or W(A), writes the value
from the buffer back to the database.
(Note: A write does not always reach the database immediately; the
change may exist only in the buffer for a while. This is where the
dirty read problem comes into the picture.)
Let us take a debit transaction from an account that consists of the
following operations:
1. R(A);
2. A=A-1000;
3. W(A);
Assume A’s value before starting the transaction is 5000.
The first operation reads the value of A from the database and
stores it in a buffer.
The second operation decreases its value by 1000, so the buffer
contains 4000.
The third operation writes the value from the buffer to the
database, so A’s final value is 4000.
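The three steps of the debit transaction can be traced with a minimal Python sketch. The dictionaries `database` and `buffer` and the helpers `read` and `write` are hypothetical stand-ins for disk storage and the main-memory buffer; they are not part of any real DBMS API.

```python
# Hypothetical in-memory stand-ins: `database` plays the role of disk storage,
# `buffer` the transaction's main-memory buffer.
database = {"A": 5000}
buffer = {}

def read(item):
    """R(item): copy the value from the database into the buffer."""
    buffer[item] = database[item]

def write(item):
    """W(item): copy the buffered value back to the database."""
    database[item] = buffer[item]

read("A")                            # 1. R(A)      buffer A = 5000
buffer["A"] -= 1000                  # 2. A=A-1000  buffer A = 4000
value_before_write = database["A"]   # the database still holds the old value
write("A")                           # 3. W(A)      database A = 4000

print(value_before_write, database["A"])  # prints: 5000 4000
```

Note that between steps 2 and 3 the buffer and the database disagree, which is why a failure at that point matters.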
However, a transaction may fail after executing only some of its
operations. The failure can be caused by hardware, software, power
loss, etc. For example, if the debit transaction discussed above fails
after executing operation 2, the value of A will remain 5000 in the
database, which is not acceptable to the bank. To handle this, a
database has two important operations:
Commit: After all instructions of a transaction are successfully
executed, the changes made by a transaction are made
permanent in the database.
Rollback: If a transaction is not able to execute all operations
successfully, all the changes made by a transaction are undone.
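As a rough illustration of commit and rollback, the sketch below keeps a transaction's changes in a private buffer and applies them only on commit. This deferred-update model and the class name `Transaction` are assumptions made for the example; real systems may instead write early and undo from a log.

```python
class Transaction:
    """Toy transaction (assumed deferred-update model):
    changes stay in a private buffer until commit."""

    def __init__(self, database):
        self.database = database
        self.buffer = {}

    def read(self, item):
        # Prefer our own uncommitted value if we already changed the item.
        return self.buffer.get(item, self.database[item])

    def write(self, item, value):
        self.buffer[item] = value

    def commit(self):
        # Make every buffered change permanent.
        self.database.update(self.buffer)
        self.buffer.clear()

    def rollback(self):
        # Undo: discard every buffered change.
        self.buffer.clear()


database = {"A": 5000}

failed = Transaction(database)
failed.write("A", failed.read("A") - 1000)
failed.rollback()                     # simulated failure after operation 2
after_rollback = database["A"]

ok = Transaction(database)
ok.write("A", ok.read("A") - 1000)
ok.commit()
after_commit = database["A"]

print(after_rollback, after_commit)   # prints: 5000 4000
```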
Properties of a Transaction
Atomicity: As a transaction is a set of logically related
operations, either all of them should be executed or none. The
debit transaction discussed above should either execute all three
operations or none. If it fails after executing operations 1 and 2,
the new value of 4000 is never written to the database, which
leads to inconsistency.
Consistency: If operations of debit and credit transactions on the
same account are executed concurrently, it may leave the database
in an inconsistent state.
For example, when T1 (a debit of Rs. 1000 from A) and T2 (a credit of
Rs. 500 to A) execute concurrently, the database can reach an
inconsistent state.
Let us assume the Account balance of A is Rs. 5000. T1 reads
A(5000) and stores the value in its local buffer space. Then T2
reads A(5000) and also stores the value in its local buffer space.
T1 performs A=A-1000 (5000-1000=4000) and 4000 is stored in
T1 buffer space. Then T2 performs A=A+500 (5000+500=5500)
and 5500 is stored in the T2 buffer space. T1 writes the value
from its buffer back to the database.
A’s value is updated to 4000 in the database and then T2 writes
the value from its buffer back to the database. A’s value is
updated to 5500 which shows that the effect of the debit
transaction is lost and the database has become inconsistent.
To maintain consistency of the database, we need concurrency
control protocols.
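The operations of T1 and T2 with their buffers and database are
shown in Table 1.
Table 1
T1           T1’s buffer space   T2           T2’s buffer space   Database
                                                                  A=5000
R(A);        A=5000                                               A=5000
                                 R(A);        A=5000              A=5000
A=A-1000;    A=4000                           A=5000              A=5000
                                 A=A+500;     A=5500              A=5000
W(A);        A=4000                           A=5500              A=4000
                                 W(A);        A=5500              A=5500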
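The interleaving of Table 1 can be replayed step by step. The following is a minimal sketch in which plain dictionaries (hypothetical names `t1_buffer`, `t2_buffer`, `database`) stand in for the two buffers and the database; it only illustrates the lost update and is not a model of a real DBMS.

```python
database = {"A": 5000}
t1_buffer = {}
t2_buffer = {}

t1_buffer["A"] = database["A"]   # T1: R(A)       T1 buffer A = 5000
t2_buffer["A"] = database["A"]   # T2: R(A)       T2 buffer A = 5000
t1_buffer["A"] -= 1000           # T1: A=A-1000   T1 buffer A = 4000
t2_buffer["A"] += 500            # T2: A=A+500    T2 buffer A = 5500
database["A"] = t1_buffer["A"]   # T1: W(A)       database A = 4000
database["A"] = t2_buffer["A"]   # T2: W(A)       database A = 5500

# Running the two transactions one after the other would give this instead.
expected_serial = 5000 - 1000 + 500

print(database["A"], expected_serial)  # prints: 5500 4500
```

The final balance differs from the serial result by exactly the Rs. 1000 debit, which is the update that was overwritten.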
Isolation: The result of a transaction should not be visible to others
before the transaction is committed. For example, let us assume
that A’s balance is Rs. 5000 and T1 debits Rs. 1000 from A. A’s new
balance will be 4000. If T2 credits Rs. 500 to A’s new balance, A will
become 4500, and after this T1 fails. Then we have to roll back T2
as well because it is using the value produced by T1. Therefore, a
transaction’s results must not be made visible to other transactions
before it commits.
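The isolation example above can be sketched as follows. The dictionaries `committed` and `working` are hypothetical: `committed` holds the last committed state, and `working` is a shared state in which uncommitted writes are (wrongly) visible to other transactions.

```python
committed = {"A": 5000}      # last committed state
working = dict(committed)    # state in which uncommitted writes leak out

# T1 debits 1000; its uncommitted result becomes visible.
working["A"] = working["A"] - 1000      # A = 4000, not yet committed

# T2 reads T1's uncommitted value (a dirty read) and credits 500.
t2_read = working["A"]
t2_result = t2_read + 500               # 4500

# T1 fails, so the committed state is restored.
working = dict(committed)

# T2 computed its result from a value that was never committed,
# so T2 has to be rolled back as well.
t2_must_roll_back = t2_read != working["A"]

print(t2_result, working["A"], t2_must_roll_back)  # prints: 4500 5000 True
```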
Durability: Once the database has committed a transaction, the
changes made by the transaction should be permanent. For example, if a
person has credited $500000 to his account, the bank can’t say that
the update has been lost. To avoid this problem, multiple copies of
the database are stored at different locations.
What is a Schedule?
A schedule is a series of operations from one or more transactions. A
schedule can be of two types:
Serial Schedule: When one transaction executes completely
before another transaction starts, the schedule is called a serial
schedule. A serial schedule is always consistent. For example, if a
schedule S has debit transaction T1 and credit transaction T2, the
possible serial schedules are T1 followed by T2 (T1->T2) or T2
followed by T1 (T2->T1). A serial schedule has low throughput and
poor resource utilization.
Concurrent Schedule: When operations of a transaction are
interleaved with operations of other transactions of a schedule, the
schedule is called a concurrent schedule. For example, the schedule of
debit and credit transactions shown in Table 1 is concurrent. However,
concurrency can lead to inconsistency in the database, and the
concurrent schedule in Table 1 is itself inconsistent.
Difference between Serial Schedule and Serializable Schedule
Serial Schedule                        Serializable Schedule
Serial schedules are less efficient.   Serializable schedules are more efficient.
Transaction in DBMS
In Database Management Systems (DBMS), a transaction is a
fundamental concept representing a set of logically related
operations executed as a single unit. Transactions are essential
for handling user requests to access and modify database contents,
ensuring the database remains consistent and reliable despite
various operations and potential interruptions.
In this article, we will discuss what a transaction means, various
operations of transactions, transaction states, and properties of
transactions in DBMS.
What does a Transaction mean in DBMS?
Transaction in Database Management Systems (DBMS) can
be defined as a set of logically related operations.
It is the result of a request made by the user to access the
contents of the database and perform operations on it.
It consists of various operations and has various states in its
completion journey.
It also has some specific properties that must be followed to keep
the database consistent.
Operations of Transaction
A user can make different types of requests to access and modify
the contents of a database. So, we have different types of
operations relating to a transaction. They are discussed as follows:
i) Read(X)
A read operation is used to read the value of X from the
database and store it in a buffer in the main memory for further
actions such as displaying that value.
Such an operation is performed when a user wishes just to see
any content of the database and not make any changes to it. For
example, when a user wants to check his/her account’s balance,
a read operation would be performed on user’s account balance
from the database.
ii) Write(X)
A write operation is used to write the value to the database from
the buffer in the main memory. Before a write, a read operation
first brings the value into the buffer, and then changes are made
to it, e.g. arithmetic operations are performed on it according to
the user’s request. A write operation then stores the modified
value back in the database.
For example, when a user requests to withdraw some money
from his account, his account balance is fetched from the
database using a read operation, then the amount to be deducted
from the account is subtracted from this value, and then the
obtained value is stored back in the database using a write
operation.
iii) Commit
This operation in transactions is used to maintain integrity in the
database. Due to a failure of power, hardware, or software,
a transaction might get interrupted before all its operations
are completed. This may leave the database in an ambiguous, i.e.
inconsistent, state.
To ensure that other transactions see the work of the current
transaction only after it is done, a commit operation is performed
to make the changes made by the transaction permanent in the
database.
iv) Rollback
This operation is performed to bring the database to the last
saved state when any transaction is interrupted in between due
to any power, hardware, or software failure.
In simple words, a rollback operation undoes the operations that
a transaction performed before its interruption, to return the
database to a safe state and avoid any kind of ambiguity or
inconsistency.
Transaction Schedules
When multiple transaction requests are made at the same time, we
need to decide their order of execution. Thus, a transaction
schedule can be defined as a chronological order of execution of
multiple transactions.
There are broadly two types of transaction schedules discussed as
follows:
i) Serial Schedule
In this kind of schedule, when multiple transactions are to be
executed, they are executed serially, i.e. at one time only one
transaction is executed while others wait for the execution of the
current transaction to be completed. This ensures consistency in
the database as transactions do not execute simultaneously.
However, it increases the waiting time of the transactions in the
queue, which in turn lowers the throughput of the system, i.e. the
number of transactions executed per unit time.
To improve the throughput of the system, another kind of
schedule is used. It follows stricter rules that help the database
remain consistent even when transactions execute
simultaneously.
ii) Non-Serial Schedule
To reduce the waiting time of transactions in the waiting queue
and improve the system efficiency, we use nonserial schedules
which allow multiple transactions to start before a transaction is
completely executed. This may sometimes result in inconsistency
and errors in database operation.
So, these errors are handled with specific algorithms to maintain
the consistency of the database and improve CPU throughput as
well.
Non-serial schedules are also sometimes referred to as parallel
schedules, as transactions execute in parallel in these kinds of
schedules.
Serializable
Serializability in DBMS is the property of a nonserial schedule that
determines whether it would maintain the database consistency
or not.
A nonserial schedule that leaves the database consistent after the
transactions are executed in the order determined by that schedule
is said to be a serializable schedule.
Serial schedules always maintain database consistency, as a
transaction starts only when the execution of the previous
transaction has been completed.
Thus, serial schedules are always serializable.
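One rough way to illustrate the idea is to compare the final result of a nonserial schedule with the results of every serial order. The sketch below does this for the debit (T1) and credit (T2) transactions used earlier; the function names are hypothetical. Comparing final values for a single starting balance is only an illustration and is weaker than the formal serializability tests used in practice.

```python
from itertools import permutations

def debit(a):   # T1: debit 1000
    return a - 1000

def credit(a):  # T2: credit 500
    return a + 500

def serial_results(transactions, start):
    """Final value of A for every serial order of the transactions."""
    results = set()
    for order in permutations(transactions):
        value = start
        for txn in order:
            value = txn(value)
        results.add(value)
    return results

def matches_some_serial_order(final_value, transactions, start):
    """True if a schedule's final value equals that of some serial order."""
    return final_value in serial_results(transactions, start)

print(serial_results([debit, credit], 5000))                    # prints: {4500}
print(matches_some_serial_order(5500, [debit, credit], 5000))   # prints: False
print(matches_some_serial_order(4500, [debit, credit], 5000))   # prints: True
```

The value 5500 is the outcome of the interleaved schedule with the lost update, so that schedule does not behave like any serial order.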
Transaction States
A transaction is a series of operations, so various states occur in
its completion journey. They are discussed as follows:
i) Active
It is the first stage of any transaction when it has begun to
execute. The execution of the transaction takes place in this
state.
Operations such as insertion, deletion, or update are performed
during this state.
During this state, the data records are under manipulation and
they are not saved to the database, rather they remain
somewhere in a buffer in the main memory.
ii) Partially Committed
This state is reached when the transaction has executed its final
operation, but its changes have not yet been saved permanently
to the database.
It can be a signal to the commit operation, as after the final
operation of the transaction completes its execution, the data has
to be saved to the database through the commit operation.
If some kind of error occurs during this state, the transaction goes
into a failed state, else it goes into the Committed state.
iii) Committed
This state of transaction is achieved when all the transaction-related
operations have been executed successfully along with the Commit
operation, i.e. data is saved into the database after the required
manipulations in this state. This marks the successful completion of
a transaction.
iv) Failed
If any of the transaction-related operations cause an error during
the active or partially committed state, further execution of the
transaction is stopped and it is brought into a failed state. Here,
the database recovery system makes sure that the database is in
a consistent state.
v) Aborted
If the error is not resolved in the failed state, then the transaction is
aborted and a rollback operation is performed to bring the database to
the last saved consistent state. When the transaction is aborted,
the database recovery module either restarts the transaction or kills
it.
The illustration below shows the various states that a transaction
may encounter in its completion journey.
[Figure: transaction states in DBMS]
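The states described above can also be written down as a small state machine. This is a sketch based only on the descriptions in this section; the names `State`, `TRANSITIONS`, and `advance` are hypothetical, and a restart after an abort is assumed to begin a new transaction, so the aborted state is treated as final here.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"

# Allowed moves, as read from the state descriptions above.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # final state
    State.ABORTED: set(),     # final state (a restart is a new transaction)
}

def advance(current, target):
    """Move to `target` if the transition is allowed, otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
print(state.value)  # prints: committed
```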
Properties of Transaction
As transactions deal with accessing and modifying the contents of
the database, they must have some basic properties which help
maintain the consistency and integrity of the database before and
after the transaction. Transactions follow 4 properties,
namely, Atomicity, Consistency, Isolation, and Durability.
Generally, these are referred to as ACID properties of
transactions in DBMS. ACID is the acronym used for transaction
properties. A brief description of each property of the transaction
is as follows.
i) Atomicity
This property ensures that either all operations of a transaction
are executed or it is aborted. In any case, a transaction can never
be completed partially.
Each transaction is treated as a single unit (like an atom).
Atomicity is achieved through commit and rollback operations,
i.e. changes are made to the database only if all operations
related to a transaction are completed, and if it gets interrupted,
any changes made are rolled back using rollback operation to
bring the database to its last saved state.
ii) Consistency
This property of a transaction keeps the database consistent
before and after a transaction is completed.
Execution of any transaction must ensure that after its execution,
the database is either in its prior stable state or a new stable
state.
In other words, the result of a transaction should be the
transformation of a database from one consistent state to another
consistent state.
Consistency here means that the changes made in the database
are the result only of the logical operations the user intended to
perform, and there is no ambiguity.
iii) Isolation
This property states that two transactions must not interfere with
each other, i.e. if some data is used by a transaction for its
execution, then any other transaction cannot concurrently access
that data until the first transaction has completed.
It ensures that the integrity of the database is maintained and we
don’t get any ambiguous values. Thus, any two transactions are
isolated from each other.
This property is enforced by the concurrency control subsystem
of DBMS.
iv) Durability
This property ensures that the changes made to the database
by a completely executed transaction are durable.
It indicates that permanent changes are made by the successful
execution of a transaction.
In the event of any system failures or crashes, the consistent
state achieved after the completion of a transaction remains
intact. The recovery subsystem of DBMS is responsible for
enforcing this property.
Introduction to Transaction Processing
Single-user system :
In this, at most one user at a time can use the system.
Multi-user system :
In this, many users can access the system concurrently.
Concurrency can be provided through :
1. Interleaved Processing –
In this, the concurrent execution of processes is interleaved on a single
CPU. The transactions are interleaved, meaning the second transaction is
started before the first one has finished, and execution can switch
back and forth between the transactions. It can also switch among more
than two transactions. This can cause inconsistency in the system.
2. Parallel Processing –
In this, a large task is divided into several smaller tasks, and the
smaller tasks execute concurrently on several nodes. The processes are
executed concurrently on multiple CPUs.
Transaction :
It is a logical unit of database processing that includes one or more access
operations (read: retrieval; write: insert or update). It is a unit of program
execution that accesses and, if required, updates various data items.
A transaction is a set of operations that can either be embedded within an
application program or be specified interactively via a high-level
language such as SQL.
Example –
Consider a transaction that involves transferring $1700 from a customer’s
savings account to a customer’s checking account. This transaction involves
two separate operations: debiting the savings account by $1700 and
crediting the checking account by $1700. If one operation succeeds but the
other doesn’t, the books of the bank will not balance.
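The transfer example can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the table name `account`, the starting balances, the `CHECK` constraint, and the helper names are assumptions made for illustration. The credit is deliberately executed before the debit so that a failed debit shows the earlier credit being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("savings", 2000), ("checking", 100)],   # hypothetical starting balances
)
conn.commit()

def transfer(conn, source, target, amount):
    """Credit and debit together; True if committed, False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit was undone too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

first = transfer(conn, "savings", "checking", 1700)    # enough money
second = transfer(conn, "savings", "checking", 1700)   # only 300 left
print(first, second, balances(conn))
```

After the second, failed transfer the balances are the same as after the first one, so the books still balance.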
Transaction boundaries :
A transaction is delimited by begin and end boundaries. An application
program may contain several transactions, separated from one another by
the begin and end boundaries of each transaction.
Granularity of data :
The size of a data item is called its granularity.
A data item can be the value of an individual field (attribute) of some
record, a record, or a whole disk block.
Transaction processing concepts are independent of granularity.
Advantages :
Batch processing or real-time processing available.
Reduction in processing time, lead time and order cycle time.
Reduction in inventory, personnel and ordering costs.
Increase in productivity and customer satisfaction.
Disadvantages :
High setup costs.
Lack of standard formats.
Hardware and software incompatibility.
S.No   T1           T2
1      lock-X(B)
2      read(B)
3      B:=B-50
4      write(B)
5                   lock-S(A)
6                   read(A)
7                   lock-S(B)
8      lock-X(A)
9      ……           ……
1. Deadlock
In the execution phase shown above, T1 holds an
exclusive lock on B, and T2 holds a shared lock on A. In
statement 7, T2 requests a lock on B, while in statement
8, T1 requests a lock on A. As you may notice, this results in a
deadlock, as neither transaction can proceed with its execution.
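The situation can be reproduced with a toy lock manager. The sketch below replays only the lock requests from the schedule above (statements 1, 5, 7 and 8) and records which transaction waits for which. The functions `compatible` and `run` are hypothetical and much simpler than a real lock manager.

```python
def compatible(requested, held):
    """A shared (S) lock coexists only with other S locks."""
    return requested == "S" and held == "S"

def run(schedule):
    """Replay lock requests; return (held locks, who waits for whom)."""
    held = {}      # item -> list of (transaction, mode)
    waiting = {}   # blocked transaction -> transaction it waits for
    for txn, mode, item in schedule:
        if txn in waiting:
            continue  # a blocked transaction issues no further requests
        blockers = [owner for owner, held_mode in held.get(item, [])
                    if owner != txn and not compatible(mode, held_mode)]
        if blockers:
            waiting[txn] = blockers[0]
        else:
            held.setdefault(item, []).append((txn, mode))
    return held, waiting

schedule = [
    ("T1", "X", "B"),   # statement 1: lock-X(B)
    ("T2", "S", "A"),   # statement 5: lock-S(A)
    ("T2", "S", "B"),   # statement 7: lock-S(B), blocked by T1
    ("T1", "X", "A"),   # statement 8: lock-X(A), blocked by T2
]
held, waiting = run(schedule)

# Each transaction waits for the other, so neither can ever continue.
deadlocked = waiting.get("T1") == "T2" and waiting.get("T2") == "T1"
print(waiting, deadlocked)  # prints: {'T2': 'T1', 'T1': 'T2'} True
```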
2. Starvation
Starvation is also possible if the concurrency control manager is badly
designed. For example, a transaction may be waiting for an X-lock
on an item while a sequence of other transactions request and are
granted an S-lock on the same item. This can be avoided if the
concurrency control manager is properly designed.
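The starvation scenario can be simulated in a few lines. In the sketch below, one transaction waits for an X-lock while a new reader requests an S-lock at every step and the oldest reader then releases its lock. The "fair" policy, in which later S requests wait behind the earlier X request, is only one assumed interpretation of a properly designed manager; the function and variable names are hypothetical.

```python
from collections import deque

def steps_until_writer_runs(fair, max_steps=100):
    """Return the step at which the waiting X-lock is granted, or None."""
    readers = deque(["R0"])     # transactions currently holding an S-lock
    queued_readers = deque()    # S requests held back by the fair policy
    for step in range(1, max_steps + 1):
        newcomer = f"R{step}"
        if fair:
            queued_readers.append(newcomer)   # wait behind the earlier X request
        else:
            readers.append(newcomer)          # S is compatible with S: grant at once
        readers.popleft()                     # the oldest reader finishes and unlocks
        if not readers:
            return step                       # no S-lock left: X-lock can be granted
    return None

print(steps_until_writer_runs(fair=False))  # prints: None
print(steps_until_writer_runs(fair=True))   # prints: 1
```

Under the unfair policy there is always at least one S-lock on the item, so the X-lock request is never granted within the simulated steps.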