12 - Concurrency - Recovery 12-4-2024
Database Recovery
Logistics
Final report: Due on 12/15/24
Extra points
Teaching evaluation screenshot (ONLY SHOW
SUBMISSION and NO SCORE)
Final exam
12/16/24, online, 6 to 8 pm
Topics
Transaction processing
Concurrency control
Database recovery
Why We Need to Understand
Transaction Processing
Multi-user database
More than one user processing the database at
the same time
Controlling read and update order
Issues
How can we prevent users from interfering with
each other’s work?
How can we safely process transactions on the
database without corrupting or losing data?
If there is a problem (e.g., power failure or system
crash), how can we recover without losing all of
our data?
Transaction Processing
A transaction is a set
of read and write operations that must
either commit or abort.
Commit point (Success)
all actions are permanently saved in the database
Changes that are not yet committed are ‘dirty’
Abort (Failure)
none of the actions are saved
Such a transaction is rolled back or undone.
ACID Properties of Transactions:
Relational Database vs. NoSQL Database
Atomicity: ‘All or nothing’ property. A transaction
either commits or aborts.
Consistency: Must transform database from one
consistent state to another.
Isolation: Partial effects of incomplete
transactions should not be visible to other
transactions.
Durability: Effects of a committed transaction
are permanent and must not be lost because of
later failure.
The dirty read problem is a violation of isolation.
Recovery tries to satisfy which property?
Example
Two users executing similar transactions
Case 1:
User A                          User B
Read salary for emp 101         Read salary for emp 101
Multiply salary by 1.03         Multiply salary by 1.04
Write salary for emp 101        Write salary for emp 101
Case 2:
User A                          User B
Read inventory for Prod P200    Read inventory for Prod P200
Decrement inventory by 5        Decrement inventory by 7
Write inventory for Prod P200   Write inventory for Prod P200
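The interleaving in Case 2 can be traced with a short Python sketch. The starting inventory of 100 is an assumed value for illustration; the slides do not give one. The sketch only shows what happens when both users read before either writes.

```python
# Illustrative trace of Case 2. The starting inventory of 100 is an
# assumed value, not taken from the slides.
inventory = {"P200": 100}

# Both users read before either one writes.
a_local = inventory["P200"]   # User A reads 100
b_local = inventory["P200"]   # User B reads 100

a_local -= 5                  # User A decrements by 5
b_local -= 7                  # User B decrements by 7

inventory["P200"] = a_local   # User A writes 95
inventory["P200"] = b_local   # User B writes 93, overwriting A's update

interleaved_result = inventory["P200"]

# Serial execution (A finishes before B starts) for comparison.
serial = {"P200": 100}
serial["P200"] -= 5
serial["P200"] -= 7
serial_result = serial["P200"]

print(interleaved_result, serial_result)  # 93 88
```

User A's decrement of 5 is missing from the interleaved result, while the serial run applies both decrements.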
Four General classes of problems
with Transaction Processing
The Lost Update Problem
A successfully completed update is overwritten by
another user.
The Dirty Read Problem (also called uncommitted
dependency)
Occurs when one transaction can see intermediate
results of another transaction before that other
transaction has committed.
The Incorrect Analysis Problem
Occurs when transaction reads several values but
second transaction updates some of them during
execution of first.
The Non-Repeatable Read Problem
A non-repeatable read occurs when a transaction reads the
same row twice but gets different data each time.
Serial Schedules and
Serializability
Can we insist on only one transaction at a
time?
Transaction throughput: The number of
transactions we can perform in a given time
period. Often reported as Transactions per
second or TPS.
Serializable
A schedule of concurrent transactions is serializable
if its result equals that of some serial order (one
after another).
Schedule: Sequence of reads/writes by set of
concurrent transactions.
Concurrency Control and Locking
Concurrency Control is a method for
controlling or scheduling the operations in
such a way that concurrent transactions can
be executed.
Concurrent transactions need to be serializable
Locking
Locking is done to data items in order to reserve
them for future operations.
A lock is a logical flag set by a transaction to alert
other transactions the data item is in use.
Timestamping.
Characteristics of Locks
Two types:
Implicit Locks are applied by the DBMS
Explicit Locks are applied by application
programs.
Lock granularity
a single data item (value)
an entire row of a table
a page (memory segment) (many rows worth)
an entire table or database
Lock types, by requirements of the transaction
An Exclusive Lock (XL) prevents any other
transaction from reading or modifying the locked
item.
A Shared Lock (SL) allows another transaction to
read the locked item but not to modify it.
Locking - Basic Rules
If transaction has shared lock on item, can
read but not update item.
If transaction has exclusive lock on item, can
both read and update item.
Reads cannot conflict, so more than one
transaction can hold shared locks
simultaneously on same item.
Exclusive lock gives transaction exclusive
access to that item.
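The basic rules above can be sketched as a small lock table. The class name LockTable and its methods are hypothetical, chosen only for illustration; a real DBMS lock manager also queues waiting transactions, which this sketch omits.

```python
# Minimal lock-table sketch. LockTable and its method names are
# hypothetical; waiting queues are deliberately left out.
class LockTable:
    def __init__(self):
        # item -> (mode, set of transaction ids holding the lock)
        self.locks = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait.

        mode is "S" (shared) or "X" (exclusive).
        """
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # Sole holder: a repeat request or an S -> X upgrade is allowed.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            # Reads cannot conflict, so shared locks can coexist.
            holders.add(txn)
            return True
        return False

    def release(self, txn, item):
        if item not in self.locks:
            return
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
print(table.request("T1", "P200", "S"))  # True
print(table.request("T2", "P200", "S"))  # True
print(table.request("T3", "P200", "X"))  # False
```

Two shared locks coexist on P200, while the exclusive request from T3 has to wait until both are released.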
Two-Phase Locking (2PL)
2PL has two phases:
A transaction acquires locks on the data items it
needs to complete the transaction. This is called
the growing phase.
Once a lock is released, no other lock may be
acquired. This is called the shrinking phase.
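The two-phase rule can be sketched as a check inside a transaction object. The class TwoPhaseTransaction is a hypothetical name, and the sketch enforces only the growing/shrinking rule for one transaction; it does not coordinate locks between transactions.

```python
# Sketch of the two-phase rule only. TwoPhaseTransaction is a
# hypothetical name; conflicts with other transactions are not modeled.
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True after the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item} after releasing a lock"
            )
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True


t = TwoPhaseTransaction("T1")
t.lock("Amy")      # growing phase
t.lock("Bill")     # growing phase
t.unlock("Amy")    # shrinking phase begins

try:
    t.lock("Carl")  # not allowed under 2PL
    violated = False
except RuntimeError:
    violated = True

print(violated)  # True
```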
Amy and Bill are two employees who will get a 5% raise this year.
Carl will get a 3% raise this year.
Deadlock
Updating two different data items (P200 and
P300).
Solutions to Deadlock
Two main ways to deal with deadlock.
Prevent it in the first place by giving each
transaction exclusive rights to acquire all locks
needed before proceeding.
Allow the deadlock to occur, then break it by
aborting one of the transactions.
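One common way to notice a deadlock before breaking it is to look for a cycle in a "waits-for" graph. The slides do not name this technique, so the sketch below is an assumption about how detection could be done; the function name find_deadlock and the transaction names are hypothetical.

```python
# Waits-for graph sketch (an assumed detection technique, not taken
# from the slides). find_deadlock is a hypothetical helper.
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each transaction to the set of transactions
    whose locks it is waiting on.
    """
    visited = set()

    def visit(txn, path):
        if txn in path:
            return path[path.index(txn):]  # cycle found
        if txn in visited:
            return None
        visited.add(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None


# A holds P200 and waits for P300; B holds P300 and waits for P200.
cycle = find_deadlock({"A": {"B"}, "B": {"A"}})
no_cycle = find_deadlock({"A": {"B"}, "B": set()})
print(cycle, no_cycle)
```

Once a cycle is found, the DBMS can abort any one transaction in it; which one to pick is a policy choice that the sketch leaves open.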
Timestamping
Orders transactions globally.
Transactions with smaller (older) timestamps get
priority in the event of conflict.
Conflict is resolved by rolling back and
restarting transaction.
No locks so no deadlock.
Timestamping
Timestamp
A unique identifier created by DBMS that
indicates relative starting time of a transaction.
Can be generated by using system clock at
time transaction started, or by incrementing
a logical counter every time a new
transaction starts.
Timestamping
Read/write proceeds only if last update on
that data item was carried out by an older
transaction.
Otherwise, transaction requesting read/write
is restarted and given a new timestamp.
Also timestamps for data items:
read-timestamp - timestamp of last transaction to
read item;
write-timestamp - timestamp of last transaction
to write item.
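The timestamp rules above can be sketched as follows. This is a minimal version of basic timestamp ordering, assuming a logical counter as the timestamp source. The names Item, Restart, read and write are hypothetical. The write check against the read-timestamp is part of the usual basic protocol and goes slightly beyond the wording on the slide.

```python
import itertools

# Logical counter used as the timestamp source (one of the two
# options named on the slide).
_counter = itertools.count(1)


def new_timestamp():
    return next(_counter)


class Restart(Exception):
    """Raised when a transaction must be rolled back and restarted."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # timestamp of last transaction to read item
        self.write_ts = 0   # timestamp of last transaction to write item


def read(ts, item):
    # A read proceeds only if the last write came from an older transaction.
    if ts < item.write_ts:
        raise Restart
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(ts, item, value):
    # A write proceeds only if no younger transaction has read or written.
    if ts < item.read_ts or ts < item.write_ts:
        raise Restart
    item.value = value
    item.write_ts = ts


t1 = new_timestamp()   # older transaction
t2 = new_timestamp()   # younger transaction
p200 = Item(100)       # 100 is an assumed starting value

write(t2, p200, 93)    # younger transaction writes first
try:
    read(t1, p200)     # older transaction arrives too late
    restarted = False
except Restart:
    restarted = True   # it would restart with a new timestamp

print(restarted)  # True
```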
Transaction Interruption
There are many situations in which a
transaction may not reach a commit or abort
point.
An operating system crash
A DBMS crash
A loss of power
A disk failure or other hardware failure
Human error that results in deletion of critical data
Database Recovery
The process of restoring the database and the
data to a consistent state.
This may include restoring lost data up to the
point of the event (e.g. system crash).
Which ACID property?
Durability: committed (saved) changes won’t be lost
Two types of storage
volatile (main memory) and nonvolatile.
Recovery Facilities
DBMS should provide following facilities to assist
with recovery:
Backup mechanism, which makes periodic backup
copies of database.
Logging facilities, which keep track of current state of
transactions and database changes.
Checkpoint facility, which enables updates to
database in progress to be made permanent.
Recovery manager, which allows DBMS to
restore database to consistent state following
a failure.
Pearson Education © 2014
Recovery Approach (Manual
Reprocessing)
Manual Reprocessing
Database is periodically backed up (a
database save), and all transactions applied since
the last save are recorded.
When the system crashes
the latest database backup is restored, and the
recorded transactions are re-applied to bring the
database back to the point just before the crash.
Several shortcomings to the Manual
Reprocessing approach:
Time required to re-apply transactions
Transactions might have other (physical)
consequences
Re-applying concurrent transactions in the same
original sequence may not be possible.
Recovery Approach (Automated
Recovery)
Log file (journal)
a file separate from the data that records all of the
changes made to the database by transactions.
Contains information about all updates to
database:
Transaction records.
Checkpoint records.
Often used for other purposes (for example,
auditing).
Log information
Before Image: A copy of the table record (or data
item) before it was changed by the transaction.
After Image: A copy of the table record (or data
item) after it was changed by the transaction.
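A single log entry with its before and after images might look like the sketch below. The WriteRecord fields, the undo and redo helpers, and the sample values are all hypothetical, meant only to show how the two images are used.

```python
from dataclasses import dataclass


# Hypothetical shape of one write entry in the log.
@dataclass
class WriteRecord:
    txn: str        # transaction that made the change
    item: str       # data item that was changed
    before: float   # before image
    after: float    # after image


def undo(db, rec):
    # Applying the before image reverses the change.
    db[rec.item] = rec.before


def redo(db, rec):
    # Applying the after image repeats the change.
    db[rec.item] = rec.after


# Assumed values for illustration.
rec = WriteRecord(txn="T1", item="P200", before=100, after=95)
db = {"P200": 95}

undo(db, rec)
after_undo = db["P200"]   # 100
redo(db, rec)
after_redo = db["P200"]   # 95
print(after_undo, after_redo)  # 100 95
```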
Sample Log File
Automated Recovery process
Operations
Rollback: Undo any partially completed
transactions (ones in progress when the crash
occurred) by applying the before images to the
database.
Rollforward: Redo the transactions by applying
the after images to the database. This is done for
transactions that were committed before the
crash.
The automated recovery process uses both
rollback and rollforward to restore the database.
Checkpointing
Checkpoints
Point of synchronization between database and log
file.
Checkpoints are taken between database saves.
The DBMS flushes all pending transactions and
writes all data to disk and transaction log.
Database can be recovered from the last
checkpoint in much less time.
Example
Physical backup of the data just before
Transaction A begins
Amy = 45
Bill = 38
Carl = 51
R_R = .05
Transaction A Begin
Transaction B Begin
Transaction C Begin
Transaction B Write: Bill Before: 38 After: 39.9
Transaction A Write: Amy Before: 45 After: 47.7
Transaction B Commit
Transaction C Write: R_R Before: .05 After: .03
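The log above can be replayed with a short sketch of rollback and rollforward. It assumes the crash happens right after the last log entry shown, and the on-disk values at crash time are assumptions chosen for illustration (an uncommitted write already on disk, a committed write not yet on disk). The recover function is a hypothetical helper, not a DBMS API.

```python
# Log entries from the example as (transaction, action, item, before, after).
log = [
    ("A", "begin", None, None, None),
    ("B", "begin", None, None, None),
    ("C", "begin", None, None, None),
    ("B", "write", "Bill", 38, 39.9),
    ("A", "write", "Amy", 45, 47.7),
    ("B", "commit", None, None, None),
    ("C", "write", "R_R", 0.05, 0.03),
]


def recover(db, log):
    committed = {txn for txn, action, *_ in log if action == "commit"}
    # Rollback: undo uncommitted writes, newest first, with before images.
    for txn, action, item, before, after in reversed(log):
        if action == "write" and txn not in committed:
            db[item] = before
    # Rollforward: redo committed writes, oldest first, with after images.
    for txn, action, item, before, after in log:
        if action == "write" and txn in committed:
            db[item] = after
    return db


# Assumed on-disk state at the moment of the crash (not given in the slides).
crashed = {"Amy": 47.7, "Bill": 38, "Carl": 51, "R_R": 0.03}
recovered = recover(dict(crashed), log)
print(recovered)
# {'Amy': 45, 'Bill': 39.9, 'Carl': 51, 'R_R': 0.05}
```

Only Transaction B committed, so Bill keeps the new value, while the writes by Transactions A and C are undone.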