Ch-5 Recovery_Systems
Failure Classification
Storage Structure
Recovery and Atomicity
Log-Based Recovery
Remote Backup Systems
Failure Classification
Transaction failure:
Logical errors: transaction cannot complete due to some internal
error condition such as bad input, data not found, overflow,…
System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock).
The transaction can be re-executed at a later time.
System crash: a power failure or other hardware or software failure
causes the system to crash.
Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
Database systems have numerous integrity checks to prevent
corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of
disk storage
Recovery Algorithms
Recovery algorithms are techniques to ensure database consistency
and transaction atomicity and durability despite failures
Focus of this chapter
Recovery algorithms have two parts:
1. Actions taken during normal transaction processing to ensure
enough information exists to recover from failures
2. Actions taken after a failure to recover the database contents to a
state that ensures atomicity, consistency and durability
Storage Structure
Volatile storage:
does not survive system crashes
examples: main memory, cache memory
Nonvolatile storage:
survives system crashes
examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM (NVRAM)
Stable storage:
a mythical form of storage that survives all failures
approximated by maintaining multiple copies on distinct nonvolatile
media
Data Access
Blocks are the units of data transfer to and from disk, and may contain several data
items.
Transactions input information from disk to main memory, and then output the
information back onto the disk.
Physical blocks are those blocks residing on the disk.
Buffer blocks are the blocks residing temporarily in main memory.
Disk buffer: the area of main memory where blocks reside temporarily.
Block movements between disk and main memory are initiated through the
following two operations:
input(B) transfers the physical block B to main memory.
output(B) transfers the buffer block B to the disk, and replaces the appropriate
physical block there.
Each transaction Ti has its private work-area in which local copies of all data items
accessed and updated by it are kept.
The system creates this work area when the transaction is initiated; the system
removes it when the transaction either commits or aborts.
Data Access (Cont.)
Transaction transfers data items between system buffer blocks and its
private work-area using the following operations :
read(X) assigns the value of data item X to the local variable xi.
write(X) assigns the value of local variable xi to data item X in
the buffer block.
Both of these commands may require an input(BX) instruction to be
issued before the assignment, if the block BX in which X resides is
not already in memory.
Transactions
Perform read(X) while accessing X for the first time;
All subsequent accesses are to the local copy.
After last access, transaction executes write(X).
output(BX) need not immediately follow write(X). System can perform
the output operation when it deems fit.
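The four operations above can be sketched as a toy model. This is only an illustration under stated assumptions: the dictionaries `disk`, `buffer` and `work_area`, the function names `input_block` and `output_block`, and the value 10 are all hypothetical, and a real buffer manager is far more involved.

```python
# Hypothetical toy model of input(B), output(B), read(X) and write(X).
disk = {"BX": {"X": 10}}   # physical blocks residing on disk
buffer = {}                # buffer blocks residing in main memory
work_area = {}             # private work area of one transaction


def input_block(b):
    """input(B): copy physical block B from disk into the buffer."""
    buffer[b] = dict(disk[b])


def output_block(b):
    """output(B): copy buffer block B to disk, replacing the physical block."""
    disk[b] = dict(buffer[b])


def read(x, b):
    """read(X): assign the value of X to the local copy in the work area."""
    if b not in buffer:    # the block holding X is not in memory yet
        input_block(b)
    work_area[x] = buffer[b][x]


def write(x, b):
    """write(X): assign the local copy to X in the buffer block."""
    if b not in buffer:
        input_block(b)
    buffer[b][x] = work_area[x]


read("X", "BX")            # first access reads X into the work area
work_area["X"] += 5        # later accesses use only the local copy
write("X", "BX")           # after the last access, write X to the buffer
after_write = (buffer["BX"]["X"], disk["BX"]["X"])  # disk is still unchanged
output_block("BX")         # the system outputs the block when it sees fit
after_output = disk["BX"]["X"]
```

Note that the disk copy changes only at `output_block`, not at `write`; that gap is exactly what the recovery schemes later in the chapter have to account for.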
Example of Data Access
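[Figure: main memory holds the buffer (buffer blocks A and B, containing X and Y) and the private work areas (x1, y1 and x2); the disk holds physical blocks A and B. input(A) and output(B) move blocks between disk and buffer; read(X) and write(Y) move values between the buffer and a work area.]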
Recovery and Atomicity
Modifying the database without ensuring that the transaction will commit
may leave the database in an inconsistent state.
Consider transaction Ti that transfers $50 from account A to account B;
the goal is either to perform all database modifications made by Ti or
none at all (to ensure atomicity).
Several output operations may be required for Ti (to output A and B).
A failure may occur after one of these modifications has been made but
before all of them are made.
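A minimal sketch of the problem, assuming illustrative starting balances of 1000 and 2000 and a hypothetical `crash_after_first_output` flag that simulates a failure between the two output operations:

```python
def transfer(disk, crash_after_first_output=False):
    """Move $50 from A to B using two separate output operations."""
    a = disk["A"] - 50
    b = disk["B"] + 50
    disk["A"] = a              # output(A) reaches the disk
    if crash_after_first_output:
        return disk            # simulated crash: output(B) never happens
    disk["B"] = b              # output(B) reaches the disk
    return disk


ok = transfer({"A": 1000, "B": 2000})
crashed = transfer({"A": 1000, "B": 2000}, crash_after_first_output=True)
```

In the crashed run A is 950 while B is still 2000, so $50 has disappeared: neither "all" nor "none" of the modifications were performed.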
Recovery and Atomicity (Cont.)
To ensure atomicity despite failures, we first output information
describing the modifications to stable storage without modifying the
database itself.
There are two approaches:
log-based recovery, and
shadow-paging
We assume (initially) that transactions run serially, that is, one after the
other.
Log-Based Recovery
A log is kept on stable storage.
The log is a sequence of log records, and maintains a record of update
activities on the database.
When transaction Ti starts, it registers itself by writing a
<Ti start> log record.
Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is
the value of X before the write, and V2 is the value to be written to X.
The log record notes that Ti has performed a write on data item X:
X had value V1 before the write, and will have value V2 after the write.
When Ti finishes its last statement, the log record <Ti commit> is written.
If Ti aborts, the log record <Ti abort> is written.
We assume for now that log records are written directly to stable storage (that
is, they are not buffered)
Two approaches using logs to ensure transaction atomicity despite failures:
Deferred database modification
Immediate database modification
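The four kinds of log record can be modelled as small immutable types. This is a sketch, not a real log format: the class names, the helper `log_transaction` and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Start:
    txn: str        # <Ti start>


@dataclass(frozen=True)
class Update:
    txn: str        # <Ti, X, V1, V2>
    item: str
    old: int        # V1: value of X before the write
    new: int        # V2: value to be written to X


@dataclass(frozen=True)
class Commit:
    txn: str        # <Ti commit>


@dataclass(frozen=True)
class Abort:
    txn: str        # <Ti abort>


LogRecord = Union[Start, Update, Commit, Abort]


def log_transaction(txn, writes, db):
    """Return the log records of a transaction that runs to commit.

    writes maps each item to its new value; db supplies the old values.
    """
    log: List[LogRecord] = [Start(txn)]
    for item, new in writes.items():
        log.append(Update(txn, item, db[item], new))
    log.append(Commit(txn))
    return log


example_log = log_transaction(
    "T0", {"A": 950, "B": 2050}, {"A": 1000, "B": 2000}
)
```

Here `example_log` holds a start record, one update record per write (each with old and new value), and a commit record, in that order.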
Deferred Database Modification
The deferred database modification scheme records all modifications
in the log, but defers all the writes until after partial commit.
Do not physically update the database on disk until after a transaction
reaches its commit point; then the updates are recorded in the
database.
Before commit, the updates are recorded persistently in the log, and
then after commit, the updates are written to the database on disk.
Transaction starts by writing <Ti start> record to log.
A write(X) operation results in a log record <Ti, X, V> being written,
where V is the new value for X
Note: old value is not needed for this scheme
The write is not performed on X at this time, but is deferred (postponed).
When Ti partially commits, <Ti commit> is written to the log
Finally, the log records are read and used to actually execute the
previously deferred writes.
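The steps above can be sketched in two phases. The tuple encoding of log records, the function names and the sample values are hypothetical choices for this illustration.

```python
def execute_deferred(txn, writes, log):
    """Phase 1: log the transaction without touching the database."""
    log.append((txn, "start"))              # <Ti start>
    for item, value in writes.items():
        log.append((txn, item, value))      # <Ti, X, V>: new value only
    log.append((txn, "commit"))             # <Ti commit> at partial commit


def apply_deferred_writes(txn, log, db):
    """Phase 2: read the log and perform the previously deferred writes."""
    for rec in log:
        if rec[0] == txn and len(rec) == 3:  # update records have 3 fields
            db[rec[1]] = rec[2]


db = {"A": 1000, "B": 2000}
log = []
execute_deferred("T0", {"A": 950, "B": 2050}, log)
db_before_apply = dict(db)                  # database is still unchanged
apply_deferred_writes("T0", log, db)
```

Because the database is untouched until the commit record is in the log, an uncommitted transaction never needs to be undone, which is why the old value can be left out of the log record.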
Deferred Database Modification (Cont.)
During recovery after a crash, a transaction needs to be redone if and only if both
<Ti start> and <Ti commit> are there in the log.
If the system crashes before the transaction completes its execution, or if it aborts,
then the information on the log is simply ignored.
Redoing a transaction Ti (redo(Ti)) sets the value of all data items updated by the
transaction to the new values.
Crashes can occur while
the transaction is executing the original updates, or
while recovery action is being taken
Example transactions T0 and T1 (T0 executes before T1).
Let A = 1000, B = 2000, and C = 700:
T0: read(A)              T1: read(C)
    A := A - 50              C := C - 100
    write(A)                 write(C)
    read(B)
    B := B + 50
    write(B)
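A minimal sketch of recovery under deferred modification, applied to T0 and T1. The tuple encoding of the log and the function name `recover_deferred` are assumptions; the crash point (after T0 commits, before T1 commits) is chosen only for illustration.

```python
def recover_deferred(log, db):
    """Redo every transaction that has both a start and a commit record."""
    started = {r[0] for r in log if r[1:] == ("start",)}
    committed = {r[0] for r in log if r[1:] == ("commit",)}
    to_redo = started & committed
    for rec in log:                      # scan the log in forward order
        if len(rec) == 3 and rec[0] in to_redo:
            db[rec[1]] = rec[2]          # set the item to its new value
    return db


# Log contents at the (assumed) moment of the crash.
crash_log = [
    ("T0", "start"),
    ("T0", "A", 950),
    ("T0", "B", 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 600),                    # T1 has no commit record
]

disk = {"A": 1000, "B": 2000, "C": 700}
recovered = recover_deferred(crash_log, dict(disk))
recovered_twice = recover_deferred(crash_log, dict(recovered))
```

T0 is redone and T1's records are ignored. Running recovery a second time gives the same result, which matters because a crash can also occur while recovery is in progress.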
Deferred Database Modification (Cont.)
[Figure not recovered: the log as it appears at three instants in time.]
Immediate Database Modification
The immediate database modification scheme allows database
updates of an uncommitted transaction to be made as the writes are
issued (while the transaction is still in the active state).
Since undoing may be needed, update log records must have both the old
value and the new value.
Update log record must be written before database item is written
Output of updated blocks can take place at any time before or after
transaction commit
Order in which blocks are output can be different from the order in
which they are written.
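A sketch of the write-ahead ordering and of why the old value is needed. The undo procedure itself is not spelled out in this section, so the `undo` function below is an assumption that simply restores the logged old values in reverse order; the tuple encoding and names are likewise hypothetical.

```python
def write_item(txn, item, new_value, db, log):
    """Log the update first, then modify the database item."""
    log.append((txn, item, db[item], new_value))  # <Ti, X, V1, V2>
    db[item] = new_value                          # write after logging


def undo(txn, db, log):
    """Assumed undo: restore old values, newest log record first."""
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] == txn:
            db[rec[1]] = rec[2]                   # put back the old value V1


db = {"C": 700}
log = [("T1", "start")]
write_item("T1", "C", 600, db, log)   # uncommitted update is already visible
value_while_active = db["C"]
undo("T1", db, log)                   # T1 never committed, so roll it back
```

If the update record were written after the database item instead of before it, a crash between the two steps would leave a modified item with no old value on record to restore.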
Immediate Database Modification Example
Log                      Write          Output
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                         A = 950
                         B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                         C = 600
                                        BB, BC
<T1 commit>
                                        BA
Note: BX denotes the block containing X.
Remote Backup Systems
Remote backup systems provide high availability by allowing transaction
processing to continue even if the primary site is destroyed.
Remote Backup Systems (Cont.)
Detection of failure: the backup site must detect when the primary site has failed.
To distinguish primary site failure from link failure, maintain several
communication links between the primary and the remote backup,
with independent modes of failure.
Heart-beat messages are sent between the sites to detect failure.
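One way to picture the detection rule is the sketch below. It assumes the backup records, per link, the time at which the last heart-beat arrived; the function name, the link names and the timeout value are all hypothetical.

```python
def primary_failed(last_heartbeat, now, timeout):
    """Declare the primary failed only if every link has gone silent.

    last_heartbeat maps a link name to the time its last heart-beat arrived.
    """
    if not last_heartbeat:
        return False                     # no links configured: cannot decide
    return all(now - t > timeout for t in last_heartbeat.values())


# One link is down but the other still delivers heart-beats: link failure.
link_failure = primary_failed({"leased_line": 10.0, "internet": 99.0},
                              now=100.0, timeout=5.0)

# Every link has been silent for longer than the timeout: assume site failure.
site_failure = primary_failed({"leased_line": 10.0, "internet": 20.0},
                              now=100.0, timeout=5.0)
```

The check is only as good as the independence of the links: if both links share a failure mode, silence on all of them can still be a link failure rather than a primary failure.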
Transfer of control:
To take over control, the backup site first performs recovery using its copy of
the database and all the log records it has received from the primary.
Thus, completed transactions are redone and incomplete
transactions are rolled back.
When the backup site takes over processing it becomes the new
primary
To transfer control back to old primary when it recovers, old primary
must receive redo logs from the old backup and apply all updates
locally.