Backup & Recovery
Additional Factors
• DBMS Architecture
• Activity During Image Copy Process
– Read Only
– Read/Write
• Balancing Duration of Recovery Against the Time Required to Take Image Copy
Additional Factors
• Additionally, recovery duration depends on the architecture of the DBMS.
For example, mainframe DB2 keeps track of log range information and reads only
the required log files for any recovery operation. However, some DBMSs require
that all the log files be read to scan for information needed for recovery.
• Keep in mind that database backups taken while there is read-only activity,
or no activity, can be restored back to that point in time using only the
backup—no log files are required. This can simplify and minimize the cost
of a recovery.
• In general, the more often you make an image copy, the less time recovery takes
(fewer log files accumulate between copies, which leads to faster recovery). However,
the amount of time required to make an image copy backup must be balanced against
the need for concurrent processing during the backup process.
How Many Backups?
• The DBA must decide how many complete generations of backups (for both
database object copies and log copies) to keep.
• Keeping extra generations can help you recover from a media failure during
recovery by switching to an older backup.
– The retention period should be at least two full cycles.
– The number of copies you decide to keep must be tempered by the number of
associated logs that must also be maintained for the backups to remain viable.
Full vs. Incremental
• A full image copy backup is a complete copy of all the data in the database
object at the time the image copy was run.
• An incremental image copy backup (aka differential backup) contains only the
data that has changed since the last full or incremental image copy was made.
– The advantage of taking an incremental rather than a full backup is that it
can sometimes be made more quickly, and requires less space on disk (or
tape).
– The disadvantage is that recovery based on incremental copies can take
longer because, in some cases, the same row is updated several times
before the last changes are restored.
• For example, suppose you took a full image copy of a database object early
Monday morning at 2:00 A.M. and then took an incremental image copy at the
same time the following three mornings. The full image copy plus all three
incremental image copies need to be applied to recover the tablespace. If the
same column of the same row was updated on Tuesday to "A", Wednesday to
"B", and Thursday to "C", the recovery process would have to apply these three
changes before arriving at the final, accurate data. If a full image copy were taken
each night, the recovery process would only need to apply the latest image copy
backup, which would contain the correct value.
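The example above can be sketched in Python. This is only an illustration under simplifying assumptions, not how any particular DBMS restores data: each image copy is modeled as a dictionary of row values, and the names `full_copy`, `incrementals`, and `recover` are hypothetical.

```python
# Hypothetical model: an image copy maps a row key to its column value.
full_copy = {"row1": "original", "row2": "unchanged"}

# Each incremental copy holds only the rows changed since the previous copy.
incrementals = [
    {"row1": "A"},  # first incremental: row1 updated to "A"
    {"row1": "B"},  # second incremental: row1 updated to "B"
    {"row1": "C"},  # third incremental: row1 updated to "C"
]


def recover(base_copy, incremental_copies):
    """Restore the full copy, then apply each incremental copy in order."""
    data = dict(base_copy)
    applied = 0
    for copy in incremental_copies:
        data.update(copy)  # later copies overwrite earlier values
        applied += 1
    return data, applied


recovered, copies_applied = recover(full_copy, incrementals)
print(recovered["row1"], copies_applied)
```

The same row is rewritten three times before it reaches its final value, which is the extra recovery work that a nightly full image copy would avoid.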
Incremental Versus Full Image Copy Backups
• Favor full image copies for small database objects.
– The definition of “small” will vary from site to site and DBMS to DBMS.
• Consider using incremental image copies to reduce the batch processing
window for very large database objects that are minimally modified in
between image copy backups.
– The DBA should base the full-versus-incremental decision on the
percentage of blocks of data that have been modified, not on the
number of rows that have been modified.
• Some scenarios are not compatible with incremental image copy backups.
– Some DBMSs permit the user to disable logging during some operations
and utilities. Whenever an action is taken that adds or changes data
without logging, a full image copy is required.
• Some DBMSs provide the capability to analyze a database object to
determine if a full or incremental backup is recommended or required.
• This is typically accomplished using an option of the copy utility. If such an
option exists, the DBA can run the copy utility to examine the amount of
data that has changed since the last image copy backup was taken.
Furthermore, the DBA can set a threshold such that a full image copy is
taken when more than a specified amount of data has changed; an
incremental image copy is taken when the amount of data that has
changed is less than the threshold.
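A threshold rule of this kind might look like the following sketch. The function name `choose_copy_type`, its parameters, and the 10 percent default are assumptions made for illustration; real copy utilities expose their own options. The decision is based on the percentage of changed blocks, and any unlogged change forces a full copy.

```python
def choose_copy_type(changed_blocks, total_blocks, threshold_pct=10.0,
                     unlogged_changes=False):
    """Return "FULL" or "INCREMENTAL" for the next image copy.

    threshold_pct is an assumed, site-specific value, not a DBMS default.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if unlogged_changes:
        # Data was added or changed without logging: a full copy is required.
        return "FULL"
    changed_pct = 100.0 * changed_blocks / total_blocks
    # Exactly at the threshold, this sketch chooses an incremental copy.
    return "FULL" if changed_pct > threshold_pct else "INCREMENTAL"


print(choose_copy_type(50, 1000))   # 5% of blocks changed
print(choose_copy_type(200, 1000))  # 20% of blocks changed
```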
Merging Incremental Image Copies
• A merge utility, sometimes referred to as MERGECOPY, can be used to combine
multiple incremental image copy backups into a single incremental copy backup,
or to combine a full image copy backup with one or more incremental image copy
backups to create a new full backup.
• If your DBMS supports merging incremental copies, consider running the merge
utility to create a new full image copy directly after the creation of an incremental
copy.
• If you wait until recovery is required to run the merge, downtime will be
increased because the merge (or similar processing) will occur during the recovery
process while the database object is unavailable.
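The two uses of a merge utility can be sketched as follows. This is a simplified model in which a copy is a dictionary of changed rows; `merge_copies` is a hypothetical helper, not the MERGECOPY utility itself, and deletes are not modeled.

```python
def merge_copies(base_copy, later_copies):
    """Combine a base copy with later incremental copies (oldest first).

    If base_copy is a full copy, the result is a new full copy.
    If base_copy is an incremental copy, the result is one merged incremental.
    """
    merged = dict(base_copy)
    for copy in later_copies:
        merged.update(copy)
    return merged


full = {"r1": "v0", "r2": "v0", "r4": "v0"}
inc1 = {"r1": "v1"}
inc2 = {"r2": "v2", "r3": "v2"}

merged_incremental = merge_copies(inc1, [inc2])  # still only changed rows
new_full = merge_copies(full, [inc1, inc2])      # complete, current copy
print(merged_incremental)
print(new_full)
```

Running the merge right after each incremental copy means recovery has a single, current full copy to restore, so the merge work is not done while the database object is unavailable.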
Copying Indexes
• Some DBMSs support making backup copies of indexes.
– Indeed, some DBMSs require indexes to be backed up,
whereas index backup is optional for others.
– Index backup can be optional because the DBMS can
rebuild an index from the table data.
• You will need to examine the trade-offs of copying
indexes.
• Be sure to perform data and index backups at the same
time if you choose to back up rather than rebuild your
indexes.
– Failure to do so can result in indexes that do not match the
recovered data.
• As a DBA, though, you will need to examine the trade-offs
of copying indexes if your DBMS supports index
backup. The question DBAs must answer for each index
is "Rebuild or recover?" The more data that must be
indexed, the longer an index rebuild will require in a
recovery situation. For larger tables, backing up the
index can result in a much quicker recovery, although
at the expense of the increased time required for
backup. When multiple indexes exist on the large table,
backing them up, again, leads to faster recovery.
However, keep in mind that index backups will require
additional time to execute during your regular backup
process.
DBMS Control
• The degree of control the DBMS asserts over the backup
and recovery process differs from DBMS to DBMS.
– Some DBMSs record backup and recovery information in the
system catalog.
– That information is then used by the recovery process to
determine the logs, log backups, and database backups required
for a successful recovery.
• The more information the DBMS maintains about image
copy backups, the more the DBMS can control proper usage
during recovery.
• If your DBMS does not record backup and recovery
information in the system catalog, then the DBA must track
image copy backup files and ensure their proper usage
during a recovery.
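To make the idea concrete, here is a small sketch of how recorded backup information could drive recovery planning. It is an assumption-laden toy, not the catalog layout of any real DBMS: `BackupRecord`, `plan_recovery`, and the integer log positions are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupRecord:
    object_name: str
    kind: str          # "FULL" or "INCREMENTAL"
    log_position: int  # log position at which the copy was taken


def plan_recovery(catalog, object_name):
    """Return (copies to restore in order, log position to roll forward from)."""
    records = sorted(
        (r for r in catalog if r.object_name == object_name),
        key=lambda r: r.log_position,
    )
    fulls = [r for r in records if r.kind == "FULL"]
    if not fulls:
        raise LookupError(f"no full image copy recorded for {object_name}")
    latest_full = fulls[-1]
    later_incrementals = [
        r for r in records
        if r.kind == "INCREMENTAL" and r.log_position > latest_full.log_position
    ]
    copies = [latest_full] + later_incrementals
    return copies, copies[-1].log_position


catalog = [
    BackupRecord("CUSTOMER", "FULL", 100),
    BackupRecord("CUSTOMER", "INCREMENTAL", 250),
    BackupRecord("ORDERS", "FULL", 300),
    BackupRecord("CUSTOMER", "FULL", 400),
    BackupRecord("CUSTOMER", "INCREMENTAL", 520),
]
copies, start = plan_recovery(catalog, "CUSTOMER")
print([c.log_position for c in copies], start)
```

When the DBMS keeps no such records, the DBA has to maintain the equivalent of this list by hand and pick the right files in the right order.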
The DB2 COPY Utility
• The COPY utility is used by DB2 for OS/390 to create image copy
backups. This utility maintains a catalog of image copy information
in the system catalog. Every successful execution of the COPY utility
causes DB2 to record information in the system catalog indicating
the status of the image copy, the image copy data set name and file
details, the date and time of the backup, and log information. This
information is read by the DB2 RECOVER utility to enable
automated tablespace and index recovery. Only valid image copies,
recorded in the system catalog, can be used by DB2 for recovery.
• As time passes, image copy backups become obsolete. New backup
copies are made and database objects are recovered to various
points in time. The DB2 DBA must maintain the information in the
system catalog because outdated and unnecessary backup rows in
the system catalog can slow down the recovery process. Backup
information in the system catalog is removed by the DBA using the
DB2 MODIFY utility.
RMAN
• Oracle provides a comprehensive method for
managing backup and recovery called RMAN.
RMAN, which stands for Recovery Manager, is
a utility that establishes a connection with a
server session and manages the data
movement for backup and recovery
operations.
Using Oracle RMAN for Backup and Recovery
• RMAN is a powerful program for managing the backup and recovery of Oracle
data.
The DBA can use RMAN to specify files or archived logs to be backed up using the
RMAN BACKUP command. Doing so causes RMAN to create a backup set as
output.
A backup set is one or more data files, control files, or archived redo logs that are
written by RMAN in proprietary format. The only way to recover using the backup
set is to use the RMAN RESTORE command. Of course, the DBA can choose to use
the COPY command instead. This creates an image copy of a file that is usable
outside the scope of RMAN.
• RMAN accesses backup and recovery information from either the control file or
the optional recovery catalog. The recovery catalog is similar to the DBMS system
catalog, but it contains only backup and recovery metadata.
• RMAN generally is preferable to other Oracle backup and recovery methods
because it is easier to use and more functional. For example, RMAN provides the
ability to create incremental backups. Only full image copy backups are available
when using traditional Oracle backup and recovery methods.
Concurrent Access Issues
• Concurrent write access allows you to keep the data online during
the backup process, but it will slow down any subsequent recovery
because the DBMS has to examine the database log to ensure
accurate recovery.
• Change accumulation creates an up-to-date image copy backup by
merging existing image copies with data from the database logs.
This is similar to the merging of incremental image copies.
• Some image copy backup techniques allow only read access to the
database object. Backups that allow only read access provide faster
recovery than those that allow concurrent read-write because the
database log is not needed to ensure a proper recovery.
– However, they are more disruptive to normal processing.
• Some image copy backup techniques require the database object to
be stopped, or completely offline. This type of copy provides fast
backup because there is no contention for the tablespace.
– This is even more disruptive to normal application processing.
Backup Planning Considerations
• The need for concurrent access and
modification during the backup process
• The amount of time available for the backup
process and the impact of concurrent access
on the speed of backing up data
• The speed of the recovery utilities
• The need for access to the database logs
• The difference between a hot backup and cold
backup.
Hot vs. Cold Backup
• A cold backup is accomplished by shutting down the
database instance and backing up the relevant database
files.
• A hot backup is performed while the database instance
remains online, meaning that concurrent access is possible.
• Depending on the capabilities of the DBMS you are using,
hot backups can be problematic because:
– They can be more complex to implement.
– They can cause additional overhead in the form of higher CPU,
additional I/O, and the additional database log archivals.
– They can require the DBA to create site-specific scripts to
perform the hot backup.
– They require extensive testing to ensure that the backups are
viable for recovery.
Backup Consistency
• Be sure your backup plan creates a consistent recovery
point for the database object.
– You need to be aware of all relationships between the
database objects being backed up and other database
objects including:
• Application-enforced relationships
• Referential constraints
• Triggers
• If you use an image copy backup to recover a database
object to a previous point in time, you will need to
recover any related database objects to the same point
in time.
– Failure to do so will most likely result in inconsistent data.
Quiesce
• If your DBMS provides a QUIESCE utility:
– Use it to establish a point of consistency for all related database
objects prior to backing them up.
– QUIESCE halts modification requests to the database objects to
ensure consistency and records the point of consistency on the
database log.
• If the DBMS does not provide a QUIESCE option:
– You will need to take other steps to ensure a consistent point for
recovery.
– For example, you can place the database objects into a read-
only mode, take the database objects offline, or halt application
processes—at least those application processes that update the
related database objects.
• Some recovery options/products can find quiet points without
requiring quiesce points during backup.
When to Create a Point of Consistency
• The DBA should create a point of consistency
during daily processing.
– Before archiving the active log.
– Before copying related database objects.
– Just after creating an image copy backup.
– Just before heavy database modification.
– During quiet times.
Log Archiving and Backup
• All database changes are logged by the DBMS to a log file
commonly called the transaction log or database log.
– Log records are written for every SQL INSERT, UPDATE, and DELETE
statement that is successfully executed and committed.
• The database log to which records are currently being written is
referred to as the active log. As the number of database changes
grows, the database log will increase in size.
– When the active database log is filled, the DBMS invokes a process
known as log archival or log offloading.
– When a database log is archived, the current active log information is
moved offline to an archived log file, and the active log is reset.
• The DBA typically controls the frequency of the log archival process
by using a DBMS configuration parameter.
– Most DBMSs also provide a command to allow the DBA to manually
request a log archival process.
– And remember, each DBMS performs log archival and backup
differently.
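The archival cycle described above can be modeled with a toy class. The class name `DatabaseLog` and the record-count `capacity` are assumptions for illustration only; real DBMSs size and archive logs in their own ways.

```python
class DatabaseLog:
    """Toy model of an active log that is archived when it fills up."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed limit: records per active log
        self.active = []          # records in the current active log
        self.archives = []        # archived log files, oldest first

    def write(self, record):
        self.active.append(record)
        if len(self.active) >= self.capacity:
            self.archive()  # active log is full: offload it

    def archive(self):
        """Move the active log contents offline and reset the active log."""
        if self.active:
            self.archives.append(self.active)
            self.active = []


log = DatabaseLog(capacity=3)
for n in range(1, 8):
    log.write(f"change-{n}")
log.archive()  # manual archival request, as a DBA command might do
print(len(log.archives), log.active)
```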
Determining Your Backup Schedule
Not all data is created equal.
• How much daily activity occurs against the data?
• How often does the data change?
• How critical is the data to the business?
• Can the data be recreated easily?
• What kind of access do the users need?
– Is 24/7 access required?
• What is the cost of not having the data available
during a recovery?
– What is the dollar value associated with each minute of
downtime?
Figure: Criticality and Volatility Grading. A grid that grades data by
criticality and by volatility (static versus dynamic), with cells numbered
1 through 4.
Recovery
• Database recovery can be a very complex task.
• Recovery involves much more than simply restoring an image
of the data as it appeared at some earlier point in time.
• A database recovery involves bringing the data back to its
state at (or before) the time of the problem.
– Often a recovery involves restoring databases and then reapplying the
correct changes that occurred to that database, in the correct
sequence.
• Simply stated, a successful recovery is one where you get the
application data to the state you want it—whether that state
is how it was last week, yesterday, or just a moment ago.
– If you planned your backup strategy appropriately, you should be able
to recover from just about any type of failure you encounter.
Determining Recovery Options
• What type of failure has occurred:
media, transaction, or database
instance?
• What is the cause of the failure?
• How did the database go down: abort,
crash, normal shutdown?
• Did any operating system errors occur?
• Was the server rebooted?
• Are there any errors in the operating
system log?
• Are there any errors in the alert log?
• Was a dump produced?
• Were any trace files generated?
• How critical is the lost data?
• Have you attempted any kind of
recovery so far? If so, what steps have
already been performed?
• What types of backups exist: full,
incremental, both?
• What needs to be recovered: the full
database, a tablespace, a single table,
an index, or combinations thereof?
• Does your backup strategy support the
type of recovery required (recover-to-
current vs. point-in-time)?
• If you have cold backups, how was the
database shut down when the cold
backups were taken?
• Are all of the archived database logs
available for recovery?
• Do you have a recent logical backup
(EXPORT or UNLOAD)?
• What concurrent activities were
running when the system crashed?
• Can you bring the DBMS instance up?
• Can you access the database objects?
• What are your system availability
requirements?
• How much data must be recovered?
• Are you using raw files?
DBMS Version Migration and Recovery
• DBMS version migration can impact recoverability.
• Sometimes the DBMS vendors change the format of
image copy backup files, rendering any backups using
the old format unusable. The same could be true for
the log file: the format may have changed for a new
version, rendering old log files unusable.
– Depending on the DBMS and the particulars of the new
version, a backup taken in a prior release may not be
usable for recovery after migration.
– Alternatively, a backup taken after migration may not be
usable for recovery after falling back to an older version
of the DBMS.
General Steps for Database Object Recovery
At the most basic level, every database recovery will
involve most of these seven steps:
1. Identify the failure.
2. Analyze the situation.
3. Determine what needs to be recovered.
4. Identify dependencies between the database
objects to be recovered.
5. Locate the required image copy backup(s).
6. Restore the image copy backup(s).
7. Roll forward through the database log(s).
Types of Recovery
• Recovery to Current
• Point-in-Time (PiT) Recovery
• Transaction Recovery
Recovery to Current
• To successfully recover to current, the recovery process must be
able to reset the contents of the database to the way it looked just
at (or right before) the point of failure. To recover to current, the
recovery process must find a valid, full image copy backup and
restore that image copy. Then the recovery will roll forward through
the database log, applying all of the database changes.
• If the last full image copy is lost or destroyed, it may still be possible
to recover if a previous image copy exists. The recovery process
could start with the older backup copy, apply any incremental
copies, and then roll forward through the archived and active logs.
Of course, more database logs will be required in such a case, so
the recovery process will take longer.
• If no image copy is available as a starting point, it may be possible
to recover the database object using just the database log. If the
data was loaded and the load process was logged, recovery may be
able to proceed simply by applying log records.
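A minimal sketch of recovery to current, including the fallback cases, might look like this. The function `recover_to_current`, the tuple layouts, and the integer log positions are assumptions for illustration; incremental copies are left out to keep the example short.

```python
def recover_to_current(backups, log):
    """Restore the newest valid full copy, then roll forward through the log.

    backups: list of (log_position, data dict, is_valid) tuples.
    log: list of (log_position, key, value) change records, oldest first.
    Returns the recovered data and the number of log records applied.
    """
    valid = [b for b in backups if b[2]]
    if valid:
        start_pos, image, _ = max(valid, key=lambda b: b[0])
        data = dict(image)
    else:
        # No usable image copy: rebuild from the log alone, which only
        # works if the original load was logged.
        start_pos, data = 0, {}
    applied = 0
    for pos, key, value in log:
        if pos > start_pos:
            data[key] = value
            applied += 1
    return data, applied


log = [(5, "a", 1), (15, "a", 2), (18, "b", 1), (25, "b", 9)]
latest_ok = [(10, {"a": 1}, True), (20, {"a": 2, "b": 1}, True)]
latest_lost = [(10, {"a": 1}, True), (20, {"a": 2, "b": 1}, False)]

print(recover_to_current(latest_ok, log))
print(recover_to_current(latest_lost, log))
print(recover_to_current([], log))
```

All three calls arrive at the same data, but the older the starting point, the more log records must be applied.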
Point-in-Time (PIT) Recovery
• Another traditional type of recovery is point-in-time (PIT)
recovery, which is usually done to deal with an application-
level problem. PIT recovery is sometimes referred to as
partial recovery because only part of the existing data will
remain after recovery. Recovery to a point in time removes
the effects of all transactions that have occurred since that
specified point in time.
• To perform a PIT recovery, an image copy backup is
restored and then changes are applied by rolling forward
through the database log (or log backups). However, only
the log records up to the specified time are processed.
Sometimes the recovery point is specified as an actual date
and time; sometimes it is specified using a relative byte
address on the database log.
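The roll-forward cutoff can be illustrated with a short sketch. The function `recover_to_point_in_time`, the sample balances, and the dates are hypothetical; the same comparison would work if the positions were relative byte addresses (plain integers) rather than timestamps.

```python
from datetime import datetime


def recover_to_point_in_time(image_copy, copy_time, log, recover_to):
    """Restore image_copy, then apply log records up to recover_to only."""
    if recover_to < copy_time:
        raise ValueError("image copy is newer than the recovery point")
    data = dict(image_copy)
    for position, key, value in log:
        if copy_time < position <= recover_to:
            data[key] = value
    return data


copy_time = datetime(2024, 1, 1, 2, 0)
image_copy = {"balance": 100}
log = [
    (datetime(2024, 1, 1, 9, 0), "balance", 150),   # good change
    (datetime(2024, 1, 1, 11, 0), "balance", 0),    # application error
    (datetime(2024, 1, 1, 12, 0), "balance", 75),   # later good change
]

result = recover_to_point_in_time(
    image_copy, copy_time, log, recover_to=datetime(2024, 1, 1, 10, 0)
)
print(result)
```

Note that the later good change at 12:00 is lost along with the error, which is why PIT recovery is sometimes called partial recovery.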
Transaction Recovery
• Transaction recovery is a third type of
recovery; it addresses the shortcomings of the
traditional types of recovery: downtime and
loss of good data. Thus, transaction recovery is
an application recovery whereby the effects of
specific transactions during a specified
timeframe are removed from the database.
Third-party software is required to perform a
transaction recovery.
• Traditional types of recovery, both recovery to
current and PIT, recover at the database object
level. In direct contrast to this level of granularity,
transaction recovery allows a user to recover a
specific portion of the database based on user-
defined criteria. This can be at a transaction or
application program level. In this context, a
transaction is defined by the user's view of the
process.
• The important point is that there may or may not
be a correlation between the transactions you
are trying to fix and transactions (or units of
recovery) in the DBMS.
• Examples of user-level transaction definitions
might be
– All database updates performed by a userid since
last Wednesday at 11:50 A.M.
– All database deletes performed by the application
program named PAYROLL since 8:00 P.M. yesterday.
• Once you have identified the transaction to recover,
you have three recovery options:
• PIT recovery. You can try to identify all of the database
objects impacted by the application and perform
traditional point-in-time recovery to remove the effects
of the transactions. You would then manually rerun or
reenter work that was valid.
• UNDO recovery. Remove only the effects of the bad
transactions.
• REDO recovery. Remove all the transactions after a
given point in time, and then redo the good
transactions only.
• Let's first examine an UNDO recovery. UNDO recovery is the
simplest version of SQL-based transaction recovery because
it involves only SQL. To accomplish an UNDO recovery, the
database logs must be scanned for the identified
transaction and anti-SQL is produced. Anti-SQL reverses the
effect of SQL by
• Converting inserts into deletes
• Converting deletes into inserts
• Reversing the values of updates (e.g., UPDATE "A" to "X"
becomes UPDATE "X" to "A")
• However, certain applications may need to be brought
down for the duration of the UNDO recovery to eliminate
the potential for data anomalies causing additional failures.
• UNDO SQL, generated from the database log, can be used to get rid of
bad transactions, and the database can remain online.
Figure: UNDO Transaction Recovery (labels: Good Transaction 1, Bad Transaction,
Good Transaction 2, Recovery started, Generate UNDO SQL, Apply UNDO SQL,
UNDO Bad Transactions).
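Anti-SQL generation can be sketched as follows. The log record layout (dictionaries with `op`, `table`, `key`, `before`, and `after`), the `id` key column, and the helper names `quote` and `anti_sql` are all assumptions for illustration; real log formats differ by DBMS, and the third-party tools that read them are far more involved.

```python
def quote(value):
    """Render a Python value as a simple SQL literal (strings and numbers)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def anti_sql(log_records):
    """Build UNDO statements for the given log records, newest change first."""
    statements = []
    for rec in reversed(log_records):
        table, key = rec["table"], quote(rec["key"])
        if rec["op"] == "INSERT":
            # An insert is undone by deleting the inserted row.
            statements.append(f"DELETE FROM {table} WHERE id = {key}")
        elif rec["op"] == "DELETE":
            # A delete is undone by reinserting the row's before-image.
            cols = ", ".join(rec["before"])
            vals = ", ".join(quote(v) for v in rec["before"].values())
            statements.append(f"INSERT INTO {table} ({cols}) VALUES ({vals})")
        elif rec["op"] == "UPDATE":
            # An update is undone by restoring the old column values.
            sets = ", ".join(
                f"{col} = {quote(val)}" for col, val in rec["before"].items()
            )
            statements.append(f"UPDATE {table} SET {sets} WHERE id = {key}")
        else:
            raise ValueError(f"unknown operation: {rec['op']}")
    return statements


bad_transaction = [
    {"op": "INSERT", "table": "emp", "key": 7, "after": {"id": 7, "name": "Lee"}},
    {"op": "UPDATE", "table": "emp", "key": 3,
     "before": {"grade": "A"}, "after": {"grade": "X"}},
    {"op": "DELETE", "table": "emp", "key": 5, "before": {"id": 5, "name": "Kim"}},
]
undo_statements = anti_sql(bad_transaction)
for statement in undo_statements:
    print(statement)
```

The statements are emitted in reverse order of the original changes so that the most recent change is undone first.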