©️ VMware LLC.
An Introduction to VMware Disaster Recovery and Business Continuity
Table of contents
Introduction
Audience
Overview
Why Disaster Recovery?
Architecture
Use Cases
On-premises to Cloud (hybrid)
Cloud to Cloud (C2C)
Advantages of a Disaster Recovery Site in the Cloud
Alternate Uses of Recovery Site/SDDC
Sizing
Recovery Plan Considerations
Access Management
Terminology
Overview
Business continuity is the process of ensuring that business operations are not disrupted and that, if a disaster does occur, the
downtime to operations is minimized. Disaster recovery is one part of business continuity planning. This document covers the disaster
recovery planning aspect of business continuity as it applies to infrastructure components.
Disaster recovery should be a primary consideration when you plan an SDDC deployment, whether on-premises or in the cloud.
Datacenter availability is a major factor when planning a new datacenter or migrating an existing datacenter to the cloud.
Disaster recovery is the process of getting the business up and running again when a disaster strikes. With a carefully planned
solution, we can execute a proactive failover and avoid the impact of the disaster altogether, and in the event of a disaster, we
can perform recovery while minimizing data loss and downtime.
Design Considerations
Consider the following features and properties before you begin designing a recovery solution:
Architecture
VMware currently offers two distinct disaster recovery solutions:
VMware Site Recovery (VSR): This is a fully managed DRaaS solution that is delivered with vSphere Replication and
VMware Site Recovery. With this service, in addition to enabling the add-on, you deploy a vSphere appliance in your
on-premises vSphere environment. You can then pair sites and replicate your critical VMs running in the on-premises
environment to a cloud SDDC. The figure below shows an example architecture in which the recovery site is hosted on
VMware Cloud.
VMware Cloud Disaster Recovery (VCDR): This is VMware's on-demand disaster recovery service, delivered as an
easy-to-use SaaS solution with cloud economics that help keep your disaster recovery costs under control. The target
SDDC does not have to be created upfront; it can be created immediately before performing a recovery, while replication
continues in the steady state. The DRaaS connector is deployed as a virtual appliance that replicates the data to a
Scale-Out Cloud File System (SCFS). During recovery, this volume is mounted on the SDDC as a live-mount datastore.
Because the VMs are already stored in an ESXi-supported format, recovery is straightforward. This offering is currently
available only with VMC on AWS.
Use Cases
On-premises to Cloud (hybrid)
The majority of enterprise workloads currently run in an on-premises datacenter. While migration to the cloud is gaining
momentum, there can be various reasons to continue using the on-premises datacenter. For workloads that currently run
on-premises, the disaster recovery SDDC can still reside in the cloud, making this a hybrid use case. Currently, only
hypervisor-based replication is supported in this model.
Cloud to Cloud (C2C)
Migrating workloads to the cloud does not eliminate the need for a disaster recovery site. A disaster recovery SDDC is
still as critical as it is when running workloads on-premises. Several high availability features can be configured when running
production workloads on a cloud SDDC, providing resiliency against hardware or zone failure. However, none of these can be assumed to
Sizing
When designing a recovery SDDC, you must account for the compute, storage, and network resources needed to keep the
critical applications running while the primary site is recovered. Recovery sites can also be used in a distributed model, which
means the resources in the recovery site do not need to sit idle all the time. Depending on the available resources, we can run
the on-premises non-production workload on the recovery site.
Note: If you run the recovery site in distributed mode, you must account for the additional resources required to store the
replicated data.
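As a rough illustration of the sizing reasoning above, the following sketch totals the compute and storage that a recovery site might need. The `VM` fields, the `size_recovery_site` function, and the `replica_overhead` factor are hypothetical placeholders chosen for this example; they do not come from any VMware sizing tool, and network requirements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus: int
    memory_gb: int
    storage_gb: int
    critical: bool  # True if the VM must run at the recovery site after failover


def size_recovery_site(vms, distributed=False, replica_overhead=1.0):
    """Estimate vCPU, memory, and storage totals for a recovery site.

    Assumptions (illustrative only):
    - Critical VMs must be able to run at the recovery site after a failover.
    - In distributed mode, non-critical (non-production) VMs also run there,
      so their resources are added on top of the critical set.
    - replica_overhead scales the storage kept for replicated critical VMs,
      for example to leave room for retained recovery points.
    """
    critical = [vm for vm in vms if vm.critical]
    non_critical = [vm for vm in vms if not vm.critical]

    # Workloads that must be able to run at the recovery site at the same time.
    running = critical + non_critical if distributed else critical

    # Storage for replicated data, plus local storage for non-production VMs.
    replica_storage = sum(vm.storage_gb for vm in critical) * replica_overhead
    local_storage = sum(vm.storage_gb for vm in non_critical) if distributed else 0

    return {
        "vcpus": sum(vm.vcpus for vm in running),
        "memory_gb": sum(vm.memory_gb for vm in running),
        "storage_gb": replica_storage + local_storage,
    }
```

Calling `size_recovery_site(vms, distributed=True, replica_overhead=1.5)` on an inventory shows how much extra capacity the distributed model adds compared with a site that only holds replicas of the critical VMs.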
See the table below for the different site considerations for SDDC planning:
Access Management
Creating solution-specific roles with access boundaries is a good security practice. Create roles and permissions to ensure that
only specific users with the right privileges can execute certain tasks on the shared service during recovery, for example,
test recovery, DNS record updates, and planned failover.
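The idea of solution-specific roles with access boundaries can be sketched as a simple role-to-task mapping that is checked before a recovery task runs. The role names, task identifiers, and helper functions below are hypothetical and only illustrate the principle; they are not part of any VMware product API.

```python
# Hypothetical role names and task identifiers, for illustration only.
ROLE_PERMISSIONS = {
    "dr-tester": {"test-recovery"},
    "dns-operator": {"dns-record-update"},
    "dr-admin": {"test-recovery", "dns-record-update", "planned-failover"},
}


def is_allowed(user_roles, task):
    """Return True if any of the user's roles grants the given task."""
    return any(task in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


def run_task(user, user_roles, task):
    """Refuse tasks outside the user's access boundary before doing any work."""
    if not is_allowed(user_roles, task):
        raise PermissionError(f"{user} is not permitted to run {task}")
    # A real implementation would trigger the recovery workflow here.
    return f"{task} started by {user}"
```

Keeping each role limited to the tasks it needs means that, in this sketch, a user holding only `dr-tester` can run a test recovery but is rejected when attempting a planned failover.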
Terminology
Recovery Time Objective (RTO): RTO is the targeted duration of time and the service level within which a business process
must be restored after an IT service disruption or data loss event, such as one caused by a natural disaster.
Recovery Point Objective (RPO): RPO defines the maximum acceptable age of the data that can be recovered from the
recovery storage in case of a disaster. The lower the RPO, the closer the replica's data is to the original. However, a lower
RPO requires more bandwidth between the source and target locations, and more storage capacity in the target location,
depending on the point-in-time instances configured for the VM.
Point-in-Time Instance: You can define multiple recovery points (point-in-time instances, or PIT instances) for each virtual
machine so that when a virtual machine suffers data corruption, data integrity issues, ransomware encryption, or host OS
infections, administrators can recover and revert to a recovery point taken before the compromising issue occurred.
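To make these terms concrete, the sketch below checks whether a recovery would meet its RPO and RTO targets and picks the most recent point-in-time instance taken before a compromise. The function names and the timestamp-based model are assumptions made for this example and do not reflect how any VMware service computes these values.

```python
from datetime import datetime, timedelta


def meets_rpo(last_replica_time, disaster_time, rpo):
    """True if the newest recoverable data is no older than the RPO allows."""
    return disaster_time - last_replica_time <= rpo


def meets_rto(disaster_time, restored_time, rto):
    """True if the business process was restored within the RTO."""
    return restored_time - disaster_time <= rto


def latest_clean_pit(pit_times, compromise_time):
    """Return the most recent PIT instance taken strictly before the compromise.

    Returns None if every available instance was taken at or after the compromise.
    """
    candidates = [t for t in pit_times if t < compromise_time]
    return max(candidates) if candidates else None
```

For instance, with PIT instances at 08:00, 12:00, and 16:00 and a ransomware infection detected as starting at 13:30, `latest_clean_pit` selects the 12:00 instance, so work done between 12:00 and 13:30 is lost on revert.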