Chapter 5: Database Recovery Techniques
Prepared by: Elisaye B.@WSU-DTC

Database systems, like any other computer system, are subject to failures, but the data stored in them must be available when required. Both backing up data and recovering from a failure can be done automatically or manually. Database recovery techniques are the techniques used to restore data lost through system crashes, transaction errors, viruses, catastrophic failures, incorrect command execution, and similar events. Recovery techniques depend heavily on a special file known as the system log. The log keeps track of all transaction operations that affect the values of database items.

Recovery Technique Based on Deferred Update
The idea behind deferred update is to defer, or postpone, any actual updates to the database on disk until the transaction completes its execution successfully and reaches its commit point. It is also called the NO-UNDO/REDO technique. It is used to recover from transaction failures caused by power, memory, or OS failures.

When a transaction executes, its updates are not applied to the database immediately. They are first recorded in the log file and are applied to the database only once the transaction commits. Applying the logged changes is called the "redo" process. If the transaction is rolled back, none of its changes are applied to the database and its changes in the log file are discarded.

If the transaction commits before the system crashes, then after the system restarts the changes recorded in the log file are applied to the database.

During transaction execution, the updates are recorded only in the log and in the cache buffers. After the transaction reaches its commit point and the log is force-written to disk, the updates are recorded in the database.

Deferred Update in a single-user system
There is no concurrent data sharing in a single-user system. The data update goes as follows:
1. A set of transactions records their updates in the log.
2. At the commit point, under the write-ahead logging (WAL) scheme, these updates are saved to the database on disk.

Deferred Update with concurrent users
This environment requires a concurrency control mechanism to guarantee the isolation property of transactions. During system recovery, transactions that were recorded in the log after the last checkpoint are redone. Two tables are required to implement this protocol:
I. Active table: All active transactions are entered in this table.
II. Commit table: Transactions to be committed are entered in this table.
During recovery, all transactions in the commit table are redone and all transactions in the active table are ignored, since none of their AFIMs (after images, that is, new values) reached the database. A commit-table transaction may be redone twice, but this does not create any inconsistency because redo is idempotent: redoing an AFIM once is equivalent to redoing the same AFIM several times.
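
The recovery rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular DBMS: the `LogRecord` layout, the `recover_deferred` function, and the use of a plain dictionary as the "database" are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical log record layout, assumed for illustration only."""
    kind: str            # "write" or "commit"
    txn: str             # transaction identifier
    item: str = ""       # database item name (write records only)
    afim: object = None  # after image, i.e. the new value (write records only)


def recover_deferred(log, database):
    """NO-UNDO/REDO recovery: redo committed transactions, ignore active ones."""
    # Commit table: transactions whose commit record reached the log.
    commit_table = {r.txn for r in log if r.kind == "commit"}
    # Active table: transactions seen in the log that never committed.
    active_table = {r.txn for r in log} - commit_table

    # Scan the log forward so that later writes to the same item win.
    for record in log:
        if record.kind == "write" and record.txn in commit_table:
            # Redo is a plain overwrite with the AFIM, so it is idempotent.
            database[record.item] = record.afim
    # Nothing is undone: writes of active transactions never reached the database.
    return commit_table, active_table
```

Running `recover_deferred` twice on the same log leaves the database in the same state as running it once, which is the idempotence property described above.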
Recovery Techniques Based on Immediate Update
Immediate update is a technique for maintaining the transaction log files of the DBMS. It is also called the UNDO/REDO technique. It is used to recover from transaction failures caused by power, memory, or OS failures.

When a transaction executes, its updates are made directly to the database, and a log file containing both the old and the new values is maintained. Once the transaction commits, all the changes are stored permanently in the database and the corresponding records in the log file are discarded. If the transaction is rolled back, the old values are restored in the database and all the changes the transaction made to the database are discarded. This is called the "undo" process.

If the transaction commits before the system crashes, then after the system restarts the changes are stored permanently in the database.

Undo/Redo Algorithm (Single-user environment)
Recovery schemes of this category apply both undo and redo for recovery. In a single-user environment no concurrency control is required, but a log is maintained under WAL. Note that at any time there is only one transaction in the system, and it is either in the commit table or in the active table. The recovery manager performs:
1. Undo of a transaction if it is in the active table.
2. Redo of a transaction if it is in the commit table.

Undo/Redo Algorithm (Concurrent execution)
Recovery schemes of this category apply both undo and redo to recover the database from failure. In a concurrent execution environment, concurrency control is required and the log is maintained under WAL. The commit table records transactions to be committed and the active table records active transactions. The recovery manager performs:
1. Undo of a transaction if it is in the active table.
2. Redo of a transaction if it is in the commit table.

Difference between Deferred update and Immediate update:
S.No. | Deferred Update | Immediate Update
1. | The changes are not applied immediately to the database. | The changes are applied directly to the database.
2. | The log file contains all the changes that are to be applied to the database. | The log file contains both old and new values.
3. | Once rollback is done, all the records of the log file are discarded and no changes are applied to the database. | Once rollback is done, the old values are restored into the database using the records of the log file.
4. | Concepts of buffering and caching are used in the deferred update method. | Concept of shadow paging is used in the immediate update method.
5. | The major disadvantage is that it requires a lot of time for recovery in case of system failure. | The major disadvantage is that there are frequent I/O operations while the transaction is active.

Shadow Paging
This recovery scheme does not require the use of a log in a single-user environment. In a multiuser environment, a log may be needed for the concurrency control method.

Shadow paging is a recovery technique in which the database is considered to be made up of fixed-size logical units of storage called pages. Pages are mapped to physical blocks of storage with the help of a page table, which has one entry for each logical page of the database. The method uses two page tables: the current page table and the shadow page table.

The entries in the current page table point to the most recent database pages on disk. When a transaction starts, the current page table is copied to create the shadow page table. The shadow page table is then saved on disk, and the current page table is used by the transaction.
Entries in the current page table may change during execution, but entries in the shadow page table never change. After the transaction completes, both tables become identical. This technique is also known as out-of-place updating.
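
The two-table mechanism described above can be sketched as follows. This is a simplified in-memory model under stated assumptions: the class name `ShadowPagedDB`, its methods, and the use of a dictionary to stand in for disk blocks are hypothetical and chosen only for illustration. A real system would also have to write the page tables to stable storage atomically.

```python
class ShadowPagedDB:
    """Toy single-user model of shadow paging (all names are illustrative)."""

    def __init__(self, pages):
        # Physical storage: block number -> page contents.
        self.blocks = dict(enumerate(pages))
        # Current page table: index is the logical page, value is the block number.
        self.current = list(self.blocks)
        self.shadow = None
        self._next_block = len(self.blocks)

    def begin(self):
        # Copy the current page table; the shadow copy is never modified.
        self.shadow = list(self.current)

    def read(self, page):
        return self.blocks[self.current[page]]

    def write(self, page, value):
        # Out-of-place update: the first write to a page allocates a fresh block,
        # so the block referenced by the shadow page table stays untouched.
        if self.current[page] == self.shadow[page]:
            self.current[page] = self._next_block
            self._next_block += 1
        self.blocks[self.current[page]] = value

    def commit(self):
        # Old blocks are now referenced only by the shadow table: free them.
        for old, new in zip(self.shadow, self.current):
            if old != new:
                del self.blocks[old]
        self.shadow = None

    def abort(self):
        # Also models crash recovery: discard new blocks, reinstate the shadow table.
        for old, new in zip(self.shadow, self.current):
            if old != new:
                del self.blocks[new]
        self.current = self.shadow
        self.shadow = None
```

In this sketch neither `commit` nor `abort` replays any logged operation: recovery consists only of choosing which page table to keep and freeing the blocks the other one referenced.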
Example of shadow paging
Advantages of shadow paging:
1. This method requires fewer disk accesses to perform an operation.
2. Recovery from a crash is inexpensive and quite fast.
3. There is no need for operations such as undo and redo.

Disadvantages of shadow paging:
1. Because an update changes a page's location on disk, it is difficult to keep related database pages close together on disk.
2. During commit, the changed blocks that are still pointed to by the shadow page table must be returned to the collection of free blocks; otherwise they become inaccessible.
3. The commit of a single transaction requires multiple blocks to be written, which decreases execution speed.
4. It is difficult to extend this technique to allow multiple transactions to run concurrently.

Algorithm for Recovery and Isolation Exploiting Semantics (ARIES)
Algorithm for Recovery and Isolation Exploiting Semantics (ARIES) is based on the Write Ahead Log (WAL) protocol. Every update operation writes a log record, which is one of the following:
1. Undo-only log record: Only the before image is logged. Thus, an undo operation can be done to retrieve the old data.
2. Redo-only log record: Only the after image is logged. Thus, a redo operation can be attempted.
3. Undo-redo log record: Both before images and after images are logged.

In ARIES, every log record is assigned a unique and monotonically increasing log sequence number (LSN). Every data page has a page LSN field that is set to the LSN of the log record corresponding to the last update on the page. WAL requires that the log record corresponding to an update reach stable storage before the data page corresponding to that update is written to disk. For performance reasons, not every log write is immediately forced to disk. A log tail is maintained in main memory to buffer log writes, and it is flushed to disk when it gets full. A transaction cannot be declared committed until its commit log record reaches disk.
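
The interplay of LSNs, the page LSN field, the in-memory log tail, and the WAL rule can be sketched as below. This is a minimal model under stated assumptions, not ARIES itself: the names `WriteAheadLog`, `BufferedPage`, `update`, `write_page_to_disk`, and `commit`, as well as the `tail_capacity` parameter, are hypothetical, and Python lists and dictionaries stand in for stable storage.

```python
class WriteAheadLog:
    """Toy log with an in-memory tail (all names are illustrative)."""

    def __init__(self, tail_capacity=4):
        self.stable_log = []   # records that have reached "stable storage"
        self.tail = []         # buffered records not yet flushed
        self.tail_capacity = tail_capacity
        self.next_lsn = 1      # LSNs are unique and monotonically increasing
        self.flushed_lsn = 0   # highest LSN known to be on stable storage

    def append(self, txn, kind, **fields):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.tail.append({"lsn": lsn, "txn": txn, "kind": kind, **fields})
        if len(self.tail) >= self.tail_capacity:
            self.flush()       # the tail is flushed when it gets full
        return lsn

    def flush(self):
        self.stable_log.extend(self.tail)
        self.tail.clear()
        self.flushed_lsn = self.next_lsn - 1


class BufferedPage:
    def __init__(self, page_id, value=None):
        self.page_id = page_id
        self.value = value
        self.page_lsn = 0      # LSN of the last update applied to this page


def update(log, page, txn, new_value):
    # Undo-redo record: both the before image and the after image are logged.
    lsn = log.append(txn, "update", page=page.page_id,
                     before=page.value, after=new_value)
    page.value = new_value
    page.page_lsn = lsn


def write_page_to_disk(log, page, disk):
    # WAL rule: the log record for the page's last update must be stable first.
    if page.page_lsn > log.flushed_lsn:
        log.flush()
    disk[page.page_id] = page.value


def commit(log, txn):
    # A transaction is committed only once its commit record is on stable storage.
    log.append(txn, "commit")
    log.flush()
```

For example, after `update(log, page, "T1", 7)` on a fresh log with a large tail capacity, the update record sits only in the tail; calling `write_page_to_disk` forces the log first, so the before image is always recoverable from stable storage by the time the page reaches disk.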
Periodically, the recovery subsystem writes a checkpoint record to the log. The checkpoint record contains the transaction table and the dirty page table. On restart, the recovery subsystem reads the master log record to find the checkpoint's LSN, reads the checkpoint record, and starts recovery from there. The recovery process consists of three phases:
1. Analysis: The recovery subsystem determines the earliest log record from which the next pass must start. It also scans the log forward from the checkpoint record to construct a snapshot of what the system looked like at the instant of the crash.
2. Redo: Starting at the earliest LSN, the log is read forward and each update is redone.
3. Undo: The log is scanned backward and updates corresponding to loser transactions are undone.

Recovery in Multi-database Systems
So far, we have implicitly assumed that a transaction accesses a single database. In some cases, a single transaction, called a multi database transaction, may require access to multiple databases. These databases may even be stored on different types of DBMSs; for example, some DBMSs may be relational, whereas others are object-oriented, hierarchical, or network DBMSs. In such a case, each DBMS involved in the multi database transaction may have its own recovery technique and transaction manager separate from those of the other DBMSs.

To maintain the atomicity of a multi database transaction, it is necessary to have a two-level recovery mechanism. A global recovery manager, or coordinator, is needed to maintain information needed for recovery, in addition to the local recovery managers and the information they maintain (log, tables). The coordinator usually follows a protocol called the two-phase commit protocol, whose two phases can be stated as follows:
Phase 1. When all participating databases signal the coordinator that the part of the multi database transaction involving each has concluded, the coordinator sends a "prepare for commit" message to each participant to get ready for committing the transaction. Each participating database receiving that message force-writes all log records and the information needed for local recovery to disk, and then sends a "ready to commit" or "OK" signal to the coordinator. If the force-writing to disk fails or the local transaction cannot commit for some reason, the participating database sends a "cannot commit" or "not OK" signal to the coordinator. If the coordinator does not receive a reply from a database within a certain time-out interval, it assumes a "not OK" response.

Phase 2. If all participating databases reply "OK", and the coordinator's vote is also "OK", the transaction is successful, and the coordinator sends a "commit" signal for the transaction to the participating databases. Because all the local effects of the transaction and the information needed for local recovery have been recorded in the logs of the participating databases, recovery from failure is now possible. Each participating database completes the transaction commit by writing a [commit] entry for the transaction in the log and permanently updating the database if needed. On the other hand, if one or more of the participating databases or the coordinator has a "not OK" response, the transaction has failed, and the coordinator sends a message to each participating database to roll back, or UNDO, the local effect of the transaction. This is done by undoing the transaction operations, using the log.
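
The two phases can be sketched as a small coordinator loop. This is a simplified, single-process illustration under stated assumptions: the `Participant` class, its `can_commit` and `responds` flags, and the `two_phase_commit` function are hypothetical names, a return value of `None` from `prepare` stands in for a reply that does not arrive within the time-out interval, and a Python list stands in for each participant's local log.

```python
class Participant:
    """Toy participating database (all names are illustrative)."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds
        self.log = []            # stands in for the local log on disk
        self.state = "active"

    def prepare(self):
        if not self.responds:
            return None          # models a missing reply (time-out)
        if not self.can_commit:
            return False         # "cannot commit" / "not OK"
        # Force-write the information needed for local recovery, then vote OK.
        self.log.append("prepared")
        return True

    def commit(self):
        self.log.append("commit")
        self.state = "committed"

    def rollback(self):
        self.log.append("rollback")
        self.state = "rolled back"


def two_phase_commit(participants, coordinator_vote=True):
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]

    # A missing reply (None) is treated as "not OK".
    decision = coordinator_vote and all(vote is True for vote in votes)

    # Phase 2: broadcast the global decision to every participant.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision
```

With two participants that both vote OK, `two_phase_commit` returns `True` and both end in the `committed` state; if any one of them votes not OK or fails to reply, every participant is rolled back, which preserves the atomicity of the multi database transaction.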