VMware Site Recovery Manager 8.2 Theory and Lab
2:
Install, Configure, Manage
Lecture Manual
VMware Site Recovery Manager 8.2
CONTENTS
2-18 Disaster Recovery Topologies
2-19 Learner Objectives
2-20 About Disaster Recovery Topologies
2-21 Active-Passive Data Centers
2-22 Bidirectional Protection
2-23 Shared Recovery Sites
2-24 Active-Active Data Centers with Stretched Storage
2-25 VMware Site Recovery (1)
2-26 VMware Site Recovery (2)
2-27 Review of Learner Objectives
2-28 Key Points
3-20 About the Site Recovery Manager Configuration Service
3-21 Service Registration with vCenter Server
3-22 About the Site Recovery Manager Solution User
3-23 vCenter Server Registration Workflow (1)
3-24 vCenter Server Registration Workflow (2)
3-25 vCenter Server Registration Workflow (3)
3-26 Starting and Stopping Services
3-27 Updating the Site Recovery Manager Appliance
3-28 Configuring Security Access
3-29 Accessing the Site Recovery Manager Client
3-30 About Site Pairing
3-31 Configuring Site Pairing (1)
3-32 Configuring Site Pairing (2)
3-33 Importing and Exporting Configurations
3-34 Lab: Configuring the Site Recovery Manager Appliance and Pairing the Sites
3-35 Review of Learner Objectives
3-36 Key Points
4-13 Placeholder VMs
4-14 Placeholder Datastores
4-15 Recovery Site Changes
4-16 Lab: Configuring Inventory Mappings
4-17 Review of Learner Objectives
4-18 Key Points
6-3 Module Lessons
6-4 vSphere Replication Overview
6-5 Learner Objectives
6-6 vSphere Replication Architecture
6-7 About vSphere Replication Appliances
6-8 About vSphere Replication Servers
6-9 About the vSphere Replication Service
6-10 About the vSphere Replication Filter
6-11 vSphere Replication Use Cases (1)
6-12 vSphere Replication Use Cases (2)
6-13 vSphere Replication Use Cases (3)
6-14 vSphere Replication: System Requirements
6-15 vSphere Replication: Advantages
6-16 vSphere Replication: Bandwidth Requirements
6-17 Review of Learner Objectives
6-18 Deploying vSphere Replication
6-19 Learner Objectives
6-20 Verifying vCenter Server FQDN Values
6-21 vSphere Replication Appliance OVF Files
6-22 Deploying vSphere Replication Appliances
6-23 Configuring vSphere Replication Appliances
6-24 vSphere Replication: Database Options
6-25 Key vSphere Replication Services
6-26 Enabling SSH on vSphere Replication Appliances
6-27 Pairing vSphere Replication Management Servers
6-28 Deploying Additional vSphere Replication Servers
6-29 Registering vSphere Replication Servers
6-30 Lab: Deploying a vSphere Replication Appliance and an Additional Server
6-31 Review of Learner Objectives
6-32 Key Points
Module 7 Replicating VMs Using vSphere Replication
7-2 Importance
7-3 Learner Objectives
7-4 Data Replication with vSphere Replication
7-5 About Replica Data Transfers
7-6 VM Replica Status
7-7 vSphere Replication Support for Encrypted VMs
7-8 Configuring Replication for VMs
7-9 Choosing Target vSphere Replication Servers
7-10 Datastore Mapping
7-11 Using Seed Disks
7-12 Replication Settings: RPO
7-13 Replication Scheduling
7-14 About RPO Violations
7-15 Replication Settings: MPIT Instances
7-16 RPO and MPIT Policies
7-17 Configuring MPIT
7-18 Recovering VMs with MPIT
7-19 Replication Settings: Additional Settings
7-20 Replicating VM Files
7-21 Disabling Replication
7-22 Lab: Enabling Replication on Virtual Machines
7-23 Review of Learner Objectives
7-24 Key Points
8-6 Array-Based Protection Groups
8-7 vSphere Replication Protection Groups
8-8 About Storage Policies
8-9 Storage Policy Protection Groups
8-10 Creating Protection Groups
8-11 Creating Array-Based Replication Protection Groups
8-12 Creating vSphere Replication Protection Groups
8-13 Viewing Placeholder VMs in the Inventory
8-14 Configuring Protection for VMs
8-15 Editing Protection Groups
8-16 Lab: Building a Protection Group
8-17 Review of Learner Objectives
8-18 Key Points
9-18 Configuring VMs in a Recovery Plan
9-19 Configuring VM Priority Groups
9-20 Configuring VM Dependencies
9-21 Shutdown Actions
9-22 Startup Actions
9-23 Configuring Power-On Options
9-24 vMotion Eligibility
9-25 IP Customization Options
9-26 Manual IP Customization
9-27 Recovery Steps
9-28 Adding Custom Recovery Steps
9-29 Suspending VMs During Recovery
9-30 Deleting Recovery Plans
9-31 Lab: Building Recovery Plans
9-32 Review of Learner Objectives
9-33 Key Points
10-15 Interaction with vSphere Technologies During Recovery
10-16 Linked-Clone and Snapshot Recovery Limitations
10-17 Review of Learner Objectives
10-18 Recovery Plan Test and Cleanup
10-19 Learner Objectives
10-20 Initiating Recovery Plan Tests
10-21 Test and Cleanup Workflows
10-22 Replicating Recent Changes During Tests
10-23 Storage During Recovery Plan Tests
10-24 Recovery Plan Test Steps
10-25 Canceling Recovery Plan Tests
10-26 Recovery Plan Test Cleanup
10-27 Force Cleanup
10-28 Recovery Plan Cleanup Steps
10-29 Lab: Performing a Recovery Plan Test and Cleanup
10-30 Review of Learner Objectives
10-31 Recovery Plan Execution
10-32 Learner Objectives
10-33 Running a Site Recovery Manager Recovery Plan
10-34 Confirming Recovery Plan Execution
10-35 Recovery Plan Execution Steps
10-36 Forced Recovery
10-37 Enabling the Forced Recovery Option
10-38 Lab: Performing a Planned Migration
10-39 Review of Learner Objectives
10-40 Reprotection and Failback
10-41 Learner Objectives
10-42 Site Recovery Manager: Reprotect Overview
10-43 Site Recovery Manager: Failback Overview
10-44 Reprotect and Failback Workflows
10-45 Reprotection Example
10-46 Preconditions for Reprotection
10-47 Initiating a Reprotect Operation
10-48 Reprotect Steps
10-49 Reprotect and Storage Policy Protection
10-50 Reprotect States
10-51 Failback Overview
10-52 Failback Example
10-53 Lab: Reprotecting and Failing Back
10-54 Review of Learner Objectives
10-55 Key Points
Module 1
Course Introduction
Digital badges contain metadata with skill tags and accomplishments, and are based on Mozilla's
Open Badges standard.
Site Recovery Manager is a disaster recovery management product that leverages the inherent
disaster recovery capabilities of the vSphere platform and replicated storage.
vSphere is deployed on protected and recovery sites, and storage replication is established between
the two sites. Administrators use Site Recovery Manager to create disaster recovery plans that
designate failover priority instructions.
If a disaster occurs, administrators are notified and must decide whether to start a failover.
Site Recovery Manager complements vSphere and integrates tightly with vCenter Server to
automate disaster protection and planned site migrations for all applications.
The diagram shows the architecture of Site Recovery Manager, including associations between
components. Systems running Site Recovery Manager and vCenter Server software must be
able to communicate with each other across sites.
Site Recovery Manager 8.2 is deeply integrated with vCenter Single Sign-On, using vCenter
Single Sign-On for authentication and SAML token acquisition, for example. This integration with
vCenter Single Sign-On allows the external certificate requirements to be relaxed significantly.
The integration of Site Recovery Manager and the separation of Platform Services Controller from
vCenter Server creates several topology possibilities. With the integration of Site Recovery
Manager with vCenter Single Sign-On, vCenter Server, and Platform Services Controller, time
synchronization among these components is important.
If the vCenter Server instances are connected to each other in linked mode, you install the
same Site Recovery Manager license on both vCenter Server instances.
The ESXi hosts at the recovery site can use different processors (Intel or AMD) than the ESXi
hosts at the protected site because recovery restarts the VMs rather than migrating them live.
Failover tests are an important feature of Site Recovery Manager because you can perform
nondisruptive testing of your recovery plans.
By testing and resolving errors observed during tests, you help ensure that disaster recovery, if
required, is performed without issues. You can also use test failovers to create a complete, isolated
replica of your data center for patch and upgrade testing.
Planned migrations are useful for disaster avoidance and data center maintenance scenarios.
Before failover and recovery, planned migration efficiently shuts down protected VMs and
performs a storage synchronization. The migration pauses if errors are encountered, allowing you
to resolve the errors.
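The pause-on-error behavior of a planned migration can be pictured with a small Python sketch. This is a conceptual model only, not Site Recovery Manager code: the step names, the `run_planned_migration` helper, and the simulated storage failure are all hypothetical and exist only to show steps running in order, pausing at the first error, and resuming after the error is resolved.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_planned_migration(steps: List[Step], start: int = 0) -> Optional[Tuple[int, str]]:
    """Run steps in order from index `start`.

    Returns None when every step succeeds. On the first failure, returns
    (index_of_failed_step, message) so the run can be resumed from that step.
    """
    for index in range(start, len(steps)):
        name, action = steps[index]
        try:
            action()
        except Exception as err:  # pause instead of continuing with later steps
            return index, f"{name}: {err}"
    return None


# Hypothetical environment state and step implementations.
log: List[str] = []
state = {"array_online": False}


def shut_down_vms() -> None:
    log.append("shutdown")


def sync_storage() -> None:
    if not state["array_online"]:
        raise RuntimeError("replication target unreachable")
    log.append("sync")


def recover_vms() -> None:
    log.append("recover")


steps: List[Step] = [
    ("Shut down protected VMs", shut_down_vms),
    ("Synchronize storage", sync_storage),
    ("Recover VMs at recovery site", recover_vms),
]

paused = run_planned_migration(steps)            # pauses at the storage sync step
state["array_online"] = True                     # the administrator resolves the error
resumed = run_planned_migration(steps, start=paused[0])  # resume from the paused step
```

The point of the sketch is that no VM is recovered until the synchronization step has succeeded, which is what distinguishes a planned migration from a disaster recovery run.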
You use the disaster recovery and forced recovery options in real disaster scenarios. These
options provide orchestrated failover of your environment to achieve the shortest possible
recovery time objective (RTO).
When an administrator wants to return workloads to the original site after a failover event, Site
Recovery Manager can automate the steps to reverse the storage replication.
Automating replication direction reversal and site mappings saves time and allows for faster
operations. Automated failback eliminates the need to set up new recovery plans by replaying
existing recovery plans.
Automated failback does not apply if the protected site is physically lost.
Each Site Recovery Manager server host supports up to a certain number of VMs, protection
groups, datastore groups, and concurrent recoveries.
The total number of array-based and vSphere Replication protection groups cannot exceed 500.
The mix of array-based replication and vSphere Replication, and the number of replicated storage
devices, significantly affects scalability and recovery time.
The maximum values for replicated datastore groups and running recovery plans are suggested
values. These values are not enforced.
For details about the operational limits for Site Recovery Manager, see Operational Limits of Site
Recovery Manager at https://docs.vmware.com/en/Site-Recovery-Manager.
You can run array-based protection groups alongside vSphere Replication protection groups in
the same Site Recovery Manager server instance. The total number of protection groups cannot
exceed 500 for both replication types combined.
For example, if you have 350 array-based protection groups, you can create an additional 150
vSphere Replication protection groups, to make a total of 500 protection groups.
Similarly, in a setup that combines array-based replication and vSphere Replication, you can
protect a maximum of 5,000 VMs, even if you combine replication types.
The protection limit for array-based replication is 5,000 VMs. The protection limit for vSphere
Replication is 2,000 VMs. However, the maximum number of VMs that you can protect by
using a combination of array-based and vSphere Replication is still 5,000 VMs, and not 7,000.
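The combined limits described above can be checked with a short Python sketch. The limit values are the ones quoted in this section. The `ProtectionPlan` record and the `limit_violations` function are hypothetical planning helpers, not part of any Site Recovery Manager API.

```python
from dataclasses import dataclass
from typing import List

# Limits quoted in this section for a single Site Recovery Manager server.
MAX_PROTECTION_GROUPS = 500  # array-based + vSphere Replication combined
MAX_ABR_VMS = 5000           # array-based replication
MAX_VR_VMS = 2000            # vSphere Replication
MAX_TOTAL_VMS = 5000         # both replication types combined


@dataclass
class ProtectionPlan:
    """Hypothetical planning record; not an SRM API object."""
    abr_groups: int
    vr_groups: int
    abr_vms: int
    vr_vms: int


def limit_violations(plan: ProtectionPlan) -> List[str]:
    """Return a description of every exceeded limit (empty list if none)."""
    problems = []
    if plan.abr_groups + plan.vr_groups > MAX_PROTECTION_GROUPS:
        problems.append("more than 500 protection groups in total")
    if plan.abr_vms > MAX_ABR_VMS:
        problems.append("more than 5,000 VMs on array-based replication")
    if plan.vr_vms > MAX_VR_VMS:
        problems.append("more than 2,000 VMs on vSphere Replication")
    if plan.abr_vms + plan.vr_vms > MAX_TOTAL_VMS:
        problems.append("more than 5,000 protected VMs in total")
    return problems


# 350 array-based + 150 vSphere Replication groups is exactly at the limit.
ok_plan = ProtectionPlan(abr_groups=350, vr_groups=150, abr_vms=3000, vr_vms=2000)

# Each replication type is within its own limit, but the total is 7,000 VMs.
too_many_vms = ProtectionPlan(abr_groups=100, vr_groups=100, abr_vms=5000, vr_vms=2000)
```

Calling `limit_violations(ok_plan)` returns an empty list, while `limit_violations(too_many_vms)` reports only the combined VM limit, which mirrors the "5,000 and not 7,000" rule.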
Site Recovery Manager APIs provide automation capabilities that enable you to programmatically
protect VMs. For example, you can automatically protect a VM during provisioning by associating
the VM with existing protection groups and recovery plans.
The Site Recovery Manager APIs are SOAP-based. Additional VMware products, such as
VMware PowerCLI™, VMware vRealize® Orchestrator™ and VMware vRealize®
Automation™, provide mechanisms to leverage these APIs.
For information about Site Recovery Manager APIs, see the Site Recovery Manager API
documentation at https://code.vmware.com/web/sdk/8.1/site-recovery-manager.
Integration with vRealize Orchestrator makes it easier for you to create, update, and remove
disaster recovery protection on VMs. This integration allows you to configure Site Recovery
Manager disaster protection when you provision VMs with vRealize Orchestrator.
The vRealize Orchestrator plug-in for Site Recovery Manager allows Site Recovery Manager
administrators to simplify the management of their Site Recovery Manager infrastructure by
extending the robust workflow automation platform of vRealize Orchestrator. You build
workflows by using the drag-and-drop capability of the workflow editor in the vRealize
Orchestrator client. vRealize Orchestrator uses the plug-in to access the functionality of Site
Recovery Manager and the Site Recovery Manager API. The prebuilt workflows simplify the
process to get you started with custom workflow creation.
NSX Data Center for vSphere 6.2 and later support a single NSX domain that spans multiple
vCenter Server instances. Therefore, you can use NSX Data Center for vSphere and Site Recovery
Manager to simplify the creation, testing, and execution of recovery plans, as well as accelerate
recovery times.
NSX Data Center for vSphere supports universal logical switches, which allow the creation of
layer 2 networks that span vCenter Server boundaries. When you use universal logical switches,
a virtual port group exists at both the protected and recovery sites, and both port groups
connect to the same layer 2 network.
When VMs are connected to port groups that are backed by universal logical switches, Site
Recovery Manager 6.1 and later automatically recognizes that manual network mapping
between the protected and recovery locations is not required.
Site Recovery Manager detects that the NSX universal logical switch is the same network on both
sites and automatically links the protected and recovery networks.
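The idea of linking networks that share a universal logical switch can be sketched in Python. This is an illustration under assumptions, not the product's implementation: the dictionaries, port group names, and switch identifiers are hypothetical, and real inventory data would come from vCenter Server and NSX.

```python
from typing import Dict, List, Optional, Tuple


def auto_map_networks(
    protected: Dict[str, Optional[str]],
    recovery: Dict[str, Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Pair port groups that are backed by the same universal logical switch.

    Each input maps a port group name to the ID of its backing universal
    logical switch, or to None when the port group is not backed by one.
    Returns (automatic mappings, protected port groups that still need a
    manual mapping).
    """
    # Index recovery-site port groups by their universal logical switch ID.
    by_switch = {switch: pg for pg, switch in recovery.items() if switch is not None}

    mapped: Dict[str, str] = {}
    manual: List[str] = []
    for pg, switch in protected.items():
        if switch is not None and switch in by_switch:
            mapped[pg] = by_switch[switch]
        else:
            manual.append(pg)
    return mapped, manual


# Hypothetical inventory: one stretched network, one site-local network per site.
protected_site = {"pg-web-a": "uls-5001", "pg-db-a": None}
recovery_site = {"pg-web-b": "uls-5001", "pg-misc-b": None}

mapped, manual = auto_map_networks(protected_site, recovery_site)
```

In this example `pg-web-a` is paired with `pg-web-b` because both sit on the same switch, and `pg-db-a` is left for a manual network mapping.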
The Site Recovery Manager appliance is new for version 8.2 and provides you with a simple way
to deploy and configure Site Recovery Manager. The appliance also reduces the total cost of
ownership by removing the requirement for a Microsoft Windows-based server and its associated
licensing costs.
The Site Recovery Manager appliance is preinstalled with all the required services, including the
embedded PostgreSQL database.
Maintenance and upgrades are simplified with upgrade paths that allow for simultaneous upgrades
of both the appliance operating system and the Site Recovery Manager software, as required.
You can use Site Recovery Manager in a number of different failover scenarios:
Bidirectional: Site Recovery Manager can provide bidirectional failover protection so that you
can run active production workloads at both sites, and fail over to the opposite site in either
direction. The spare capacity at the other site is used to run the VMs that are failed over.
Shared recovery sites: Although less common, some customers need to fail over in a given
site or campus. Examples of such circumstances include when a storage array failure occurs
or when building maintenance forces the movement of workloads to a different campus
building. In these cases, customers use Site Recovery Manager to perform failovers.
Active-active data centers: This new topology is supported with metro-distance stretched
storage solutions. Production applications run at both sites. The stretched storage provides
VMware Site Recovery for VMware Cloud on AWS protects workloads between on-premises
data centers and VMware Cloud on AWS, as well as between different instances of VMware
Cloud on AWS. The new service lets you take advantage of a consistent, vSphere-based
infrastructure and operating environment that extends from on-premises to VMware Cloud on
AWS.
Active-passive data centers provide a legacy failover model that is not popular because of the
inherently high cost of maintaining a secondary idle data center. With this model, the secondary
data center requires sufficient resources to run any workloads that are configured to fail over in the
recovery plan.
The bidirectional protection model is the most common Site Recovery Manager topology in
production. It is more cost-effective than the active-passive model because Site Recovery Manager
allows for the suspension of noncritical workloads in the recovery site during a disaster recovery.
Cost-effectiveness is dependent on how many resources are used in each site. For example, if all
workloads are deemed critical and cannot be suspended in the case of a disaster recovery, each site
must reserve sufficient resources to allow the opposite site's workloads to run. Assuming both sites
are running similar workloads, each site needs to reserve 50% of its capacity, resulting in no cost
savings over the active-passive layout. Cost-effectiveness is only a perceived value because both
data centers actively host workloads.
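The capacity arithmetic behind this discussion can be sketched in Python, assuming two equally sized sites and loads expressed as a percentage of one site's capacity. The function names and the 40%/15% figures in the last example are illustrative assumptions, not values from the course.

```python
def recovery_site_load(local_pct: float, incoming_pct: float, suspended_pct: float = 0.0) -> float:
    """Load on the recovery site during failover, as a percentage of its capacity.

    local_pct:     load the recovery site already runs
    incoming_pct:  load that fails over from the other site
    suspended_pct: local noncritical load that is suspended during recovery
    """
    if suspended_pct > local_pct:
        raise ValueError("cannot suspend more load than the site is running")
    return local_pct - suspended_pct + incoming_pct


def overcommit(load_pct: float) -> float:
    """Percentage points by which the load exceeds the site's capacity."""
    return max(0.0, load_pct - 100.0)


# All workloads critical: each site must stay at 50% to absorb the other.
all_critical = recovery_site_load(local_pct=50, incoming_pct=50)   # 100.0

# Both sites at 75% and everything fails over: 150% of capacity is needed.
no_suspension = recovery_site_load(local_pct=75, incoming_pct=75)  # 150.0

# Illustrative: only 40 points of critical load fail over and 15 points of
# noncritical local load are suspended, so the recovery site fits exactly.
with_suspension = recovery_site_load(local_pct=75, incoming_pct=40, suspended_pct=15)  # 100.0
```

The sketch makes the trade-off explicit: running each site above 50% is workable only if the plan reduces what fails over or suspends noncritical workloads at the recovery site.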
However, if some VMs can be suspended in a disaster recovery event, each site can manage higher
consumption rates, while still providing failover capability. For example, if both data centers are
equally sized and consume 75% of their capacity with standard workloads, the recovery site will
be overcommitted by 50% (75% + 75% = 150% of its capacity) during a disaster recovery. But if the number of
protected VMs configured to fail over is reduced to include only critical workloads, and
noncritical workloads on the recovery site are suspended during a disaster recovery, one data
The following considerations are important when you deploy Site Recovery Manager in a multi-
site topology:
You can register a maximum of 10 Site Recovery Manager servers to a single vCenter Server
instance.
By adhering to these guidelines, you can deploy Site Recovery Manager with a number of
different multi-site topologies to suit your requirements.
For information about other supported multi-site topologies, see
https://blogs.vmware.com/virtualblocks/2016/07/28/srm-multisite/.
The primary design consideration when implementing Site Recovery Manager with an active-
active stretched storage solution is maintaining continuous availability of workloads, while
minimizing the amount of service downtime during a disaster. Stretched storage solutions are
typically limited to a metro distance because of the requirement for extremely low latency between
storage sites.
Stretched storage implementations allow for read/write access to replicating devices from both
sites. ESXi hosts in both sites effectively have access to the same VMFS file system in either site
simultaneously, even though the file system might be hosted on different physical devices and
different storage arrays. Such highly available file systems are known as stretched datastores and
allow a VM's files to be read from both locations at any time.
During planned maintenance, VMs on a stretched datastore can be migrated using vSphere
vMotion without incurring downtime. When used in conjunction with NSX stretched layer 2
networks, end users and application owners will notice no impact. Even during real disaster
scenarios, vSphere vMotion can be used if both sites remain available during the recovery.
Automation and orchestration of failover to the cloud from on-premises data centers or other
cloud availability zones
VMware Site Recovery allows you to manage costs by removing the need for an on-premises
secondary site with its associated management costs. With VMware Cloud on AWS, you only pay
for what you need. Additional resources are provisioned on demand during a failover. Resources
in VMware Cloud are managed in exactly the same way as your on-premises ESXi hosts and
allow for seamless integration with your existing data center.
Site Recovery Manager 8.x is compatible with multiple versions of vCenter Server, allowing for
flexibility in on-premises deployments and compatibility with VMware Site Recovery.
For information about compatibility with vSphere versions, see the VMware Product
Interoperability Matrices at
http://partnerweb.vmware.com/comp_guide2/sim/interop_matrix.php.
Platform Services Controller plays a key role in a Site Recovery Manager deployment because it
provides the authentication services Site Recovery Manager requires to communicate securely
with vCenter Server. By integrating with vCenter Single Sign-On, you can log in to vSphere
services, such as vSphere Client, and Site Recovery Manager with the same credentials. Once
authenticated against one service, you are authenticated for all services.
Platform Services Controller provides this functionality by maintaining a directory of users,
services, and machine accounts that participate in the vSphere domain. This domain is managed by
vSphere and is dedicated to VMware services and products. vCenter Single Sign-On can also
integrate with external domain services, such as Microsoft Active Directory, for user
authentication. Once a user is authenticated, the Security Token Service issues a digital token to
the user client session. This token is used to grant access to other VMware services. As part of the
VMware directory service, details about all registered services are maintained in the Lookup
Service. This allows services to locate and communicate with other registered services.
The Platform Services Controller is also responsible for storing and applying licenses to VMware
assets, including vCenter Server, ESXi hosts, and Site Recovery Manager.
The simplest way to deploy a Platform Services Controller instance and a vCenter Server instance
is as a single appliance known as vCenter Server with embedded Platform Services Controller. In
this deployment model, all services provided by both server types are accessible on the same
appliance via the same IP address or Fully Qualified Domain Name (FQDN).
This model is the simplest to deploy and also to maintain and secure. In some releases of vSphere,
the model prevents you from configuring Enhanced Linked Mode because replication cannot be
configured between embedded Platform Services Controllers. Later releases of vSphere support
embedded Platform Services Controller replication, and this deployment model is recommended
in most cases.
By using a separate, external Platform Services Controller instance, you can deploy multiple
instances of vCenter Server that all register to the same vSphere domain for ease of management.
Each vCenter Server instance uses the same shared services and vCenter Single Sign-On directory
structure.
To prevent Platform Services Controller instances from becoming a single point of failure, you
can configure a highly available deployment with the use of a supported load balancer and two, or
more, replicating Platform Services Controller instances. By using a load balancer, the vCenter
Server systems can access services on either Platform Services Controller instance via a single
access point (the load balancer IP address or FQDN). The load balancer ensures that the request is
forwarded to the available Platform Services Controller instance. In the event of a failure of one of
the Platform Services Controller instances, all the requests are redirected to the remaining
Platform Services Controller instances.
Configuration of the load balancer and endpoint certificates can become very complex and prone
to configuration errors.
A simpler deployment removes the requirement for a load balancer but requires manual
intervention if a Platform Services Controller instance suffers a permanent failure. By deploying
more than one replicating Platform Services Controller, you can manually repoint a vCenter
Server instance to another available Platform Services Controller. This option reduces the
Enhanced Linked Mode is available when all vCenter Server instances belong to the same vSphere
domain. You configure Enhanced Linked Mode by registering all vCenter Server instances to the
same Platform Services Controller instance or, as in the image, by configuring replication between
all Platform Services Controller instances. By configuring replication, each Platform Services
Controller instance shares information about the VMware directory structure with all other
Platform Services Controller instances. Replication typically occurs every 30 seconds, so any
changes to the directory structure, such as the registration of a new service, reach the rest of
the domain within minutes.
When configuring Platform Services Controller replication, you must consider the path that
replications need to take in order to replicate to all other instances. For example, in the image, if a
change is made on one of the end servers, the change must replicate to the middle server first
before being replicated to the other end server because no direct replication path is possible
between the two end (left and right) servers. You can use command-line options to create
additional replication agreements between Platform Services Controller instances to avoid a single
point of failure in the replication paths.
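The replication path reasoning above can be sketched as a small graph check. This is an illustration only, not a VMware tool: the instance names are hypothetical, and replication agreements are assumed to be bidirectional. The sketch reports every Platform Services Controller instance whose loss would leave the remaining instances unable to replicate to each other.

```python
from collections import deque

def connected(nodes, agreements):
    """Return True if every node can reach every other node via agreements."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in agreements:
        # Ignore agreements that involve a node outside the set being tested.
        if a in nodes and b in nodes:
            neighbours[a].add(b)
            neighbours[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in neighbours[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes

def single_points_of_failure(nodes, agreements):
    """List instances whose removal splits the replication topology."""
    return sorted(n for n in nodes if not connected(set(nodes) - {n}, agreements))

# Hypothetical three-instance layout matching the left/middle/right example.
pscs = ["psc-left", "psc-middle", "psc-right"]
line = [("psc-left", "psc-middle"), ("psc-middle", "psc-right")]
ring = line + [("psc-left", "psc-right")]

print(single_points_of_failure(pscs, line))
print(single_points_of_failure(pscs, ring))
```

With only the two agreements of the linear layout, the middle instance is reported. Adding the third agreement between the two end instances closes the ring, and no instance is reported.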
The simplest and most common way to include Site Recovery Manager in a vSphere deployment
is to have two independent sites with their own Platform Services Controller and vCenter Server
instances. The Platform Services Controller instances can be replicated or not, or can be either
embedded or external.
A Site Recovery Manager instance is registered on each site and configured as a site pair.
In more complex deployment types, such as the one shown on the slide, every Site Recovery
Manager instance must register with a vCenter Server instance. Site Recovery Manager instances
always operate in pairs and pair only with one other Site Recovery Manager instance.
A Site Recovery Manager instance pairs only with another Site Recovery Manager instance
registered to a different vCenter Server instance. However, having both vCenter Server instances
registered with the same Platform Services Controller instance does not prevent the Site Recovery
Manager pairing operation.
The Site Recovery Manager appliance is deployed as an OVF template in the same way as all
other VMware appliances.
Updates for the Site Recovery Manager appliance include both operating system and Site
Recovery Manager software upgrades and patching. All upgrades and patches are tested together
to ensure compatibility before shipping. Using Site Recovery Manager for Windows requires the
user to maintain and patch the Windows environment separately from the Site Recovery Manager
software.
The kernel and other aspects of the Photon OS are built with an emphasis on security.
Unnecessary services and network ports are disabled by default.
The Linux kernel is tuned for performance when Photon OS runs on vSphere. Photon OS also
includes the Docker daemon to allow for the use of containerized software components, such as
storage replication adapters (SRAs).
The Site Recovery Manager Appliance Management page is organized using the following tab
structure:
Summary: Use this tab to register or unregister Site Recovery Manager to a vCenter Server
instance, download a log bundle for VMware support, and view a summary of the current Site
Recovery Manager system configuration.
Monitor Disks: Use this tab to check on the status and usage of the disk partitions used by the
appliance. The appliance is deployed with one 16 GB virtual disk. This disk is divided into two
partitions, a 2 GB swap partition and a 14 GB root partition, where the OS files and Site Recovery
Manager database are stored. A second 4 GB virtual disk is used for the support partition, where
log files and core dumps are stored.
Access: Use this tab to enable or disable SSH access, change appliance and database passwords, or
perform certificate renewal actions. You can create a new self-signed server certificate or generate
a certificate signing request.
vCenter Server registration is an important part of the Site Recovery Manager installation process.
Without a secure and reliable connection to vCenter Server, Site Recovery Manager cannot
perform the operations required to protect and recover VMs. Site Recovery Manager must be able
to obtain information about registered VMs, datastores, clusters, virtual networks, and so on.
In the registration wizard, you are prompted for the address of the Platform Services Controller
instance that manages the vCenter Server instance to which you intend to register Site Recovery
Manager. Site Recovery Manager must query the Lookup Service on Platform Services Controller
for a list of registered vCenter Server instances. The Lookup Service also provides details about
the URL and SSL certificate information that must be used to communicate with vCenter Server.
Once the details are found, you are prompted to choose from a list of vCenter Server instances.
Site Recovery Manager uses the information provided by the Lookup Service to communicate
securely with the chosen vCenter Server instance.
As part of the registration process, Site Recovery Manager adds a plug-in to vSphere Client. You
use this plug-in to connect to the Site Recovery Manager user interface. The plug-in is registered
During the vCenter Server registration process, you are prompted for the Platform Services
Controller address and credentials for an administrative account on the vCenter Single Sign-On
domain (typically administrator@vsphere.local). Site Recovery Manager requires these credentials
so that it can create a new user in the domain specifically for Site Recovery Manager operations.
Site Recovery Manager generates a certificate that is used as the credentials for the solution user
account and provides this certificate to Platform Services Controller when the user is created.
Site Recovery Manager uses this special user account to perform operations in the vCenter Single
Sign-On domain. When an actual human user initiates a task, such as a recovery workflow, Site
Recovery Manager checks its database to ensure that the user has sufficient privileges to perform
the task. Once the permissions are validated, Site Recovery Manager uses the solution user
account to execute any required operations on vCenter Server.
You are prompted to provide the following details when creating the vCenter extension:
Site Name: Display name for the Site Recovery Manager instance.
Administrator email
Local host: IP address or FQDN of the Site Recovery Manager instance.
SRM Listener port: Port used by Site Recovery Manager Server (Default: 9086).
SRM UI listener port: Port used to connect to the Site Recovery Manager web interface (Default:
443).
Extension ID:
Default extension ID: Used when only a single instance of Site Recovery Manager is registered to
the vCenter Server instance.
Custom extension ID: Used when multiple instances of Site Recovery Manager are registered.
Each Site Recovery Manager instance must use a unique extension ID.
The Extension ID is required during the pairing process. Site Recovery Manager pairs only with
another Site Recovery Manager instance that has the same Extension ID. Therefore, when you
configure the Site Recovery Manager instance at the remote site that you intend to use as a pair,
you must enter the same Extension ID when creating the vCenter Server extension.
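The Extension ID rules can be summarized in a short sketch. The class, function names, host names, and extension ID strings below are hypothetical and exist only to illustrate the rules stated in this manual: a pair requires matching Extension IDs and different vCenter Server instances, and instances registered to the same vCenter Server instance must each use a unique Extension ID.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class SrmInstance:
    name: str
    vcenter: str        # vCenter Server instance the SRM instance registers with
    extension_id: str   # illustrative value, not a real extension ID

def can_pair(local: SrmInstance, remote: SrmInstance) -> bool:
    """Pairing needs the same Extension ID on two different vCenter Servers."""
    return (local.extension_id == remote.extension_id
            and local.vcenter != remote.vcenter)

def duplicate_extension_ids(instances):
    """Return (vcenter, extension_id) combinations that are used more than once."""
    counts = Counter((i.vcenter, i.extension_id) for i in instances)
    return sorted(key for key, n in counts.items() if n > 1)

site_a = SrmInstance("srm-a", "vc-a.example.local", "ext-default")
site_b = SrmInstance("srm-b", "vc-b.example.local", "ext-default")
site_c = SrmInstance("srm-c", "vc-b.example.local", "ext-custom-1")

print(can_pair(site_a, site_b))   # same ID, different vCenter Servers
print(can_pair(site_a, site_c))   # IDs differ
print(duplicate_extension_ids([site_a, site_b, site_c]))
```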
You might be required to restart certain services after failures or under the direction of VMware
support. For example, if vCenter Server is unavailable while Site Recovery Manager is booting,
the srm-server service might fail to start correctly. When vCenter Server becomes operational, you
can restart the srm-server service instead of performing a full reboot of the appliance. You can
also check the status of the services using the Services tab and attempt to restart any services that
show a Stopped or Failed state.
Updating and patching the Site Recovery Manager appliance is a simple operation that can be
performed using either an online repository, such as the VMware website, or a CD-ROM update source.
You can download ISO files for the latest Site Recovery Manager release from the VMware
download page for Site Recovery Manager. Site Recovery Manager ISO files, provided by
VMware, include an update repository with all the required RPMs to update both the operating
system and the Site Recovery Manager software. VMware recommends using the latest supported
Site Recovery Manager release for your version of vSphere.
VMware also recommends that you take a snapshot of your appliance before the update process
begins to ensure that you can safely recover in the event of an upgrade failure or a power outage
during the upgrade process.
When you deploy the Site Recovery Manager appliance, you create passwords for the root
account, the admin account, and the database user. Using the Access tab, you can change the
password of the admin account (the account you use to log in to the UI) and the database user
password. If you change the database user account password, you must restart the srm-
vpostgres service in the Services tab. You cannot change the root password through the UI.
You can enable and disable SSH access through this interface. To maintain high security levels,
you should only enable SSH access when required and disable it when you complete any tasks that
require SSH access. By default, you cannot log in to Site Recovery Manager through SSH using
the root account. You must log in using the admin user account and use the sudo command to
perform operations that require root privileges. You must retain the root password for use with the
sudo command.
Site Recovery Manager uses a self-signed certificate by default for the Site Recovery Manager
server service. You click Change to generate a new self-signed certificate, upload a PKCS#12 CA
signed certificate, or upload a signed CSR.
When you configure Site Recovery Manager initially, a plug-in is created in vSphere Client. The
plug-in performs a simple redirect to the URL of the Site Recovery Manager user interface, which
is provided by a dedicated Tomcat server running on the Site Recovery Manager server. This
service is called the dr-client service, which can be started, stopped, or restarted in the Site
Recovery Manager configuration UI.
Site Recovery Manager pairing is an essential part of Site Recovery Manager configuration.
For example, during recovery operations, some tasks are performed on the protected site, such as
shutting down VMs. Such tasks are coordinated by the protected site Site Recovery Manager
server and its vCenter Server instance. Other tasks are performed on the recovery site, such as
mounting recovered datastores and registering VMs. These tasks are performed by the recovery
site's Site Recovery Manager and vCenter Server.
Each Site Recovery Manager instance maintains its own database, and each database records
information from the perspective of the relevant site. For example, VMs are identified in the
protected site database with a local identifier and a reference is added to the peer (remote).
Site Recovery Manager uses the Lookup Service on the Platform Services Controller instance to
locate a vCenter Server with a matching Site Recovery Manager Extension ID. Once a qualifying
vCenter Server instance is selected by the user, the Lookup Service provides details about the
remote Site Recovery Manager instance associated with the relevant extension.
If the remote Platform Services Controller instance belongs to the same vCenter Single Sign-On
domain as the local Platform Services Controller instance, Site Recovery Manager does not create
an additional solution user account because the directory entry should replicate from the source
Platform Services Controller instance. However, if the remote Platform Services Controller
instance is not participating in the same domain as the local Platform Services Controller instance,
Site Recovery Manager must create a solution user account on the remote Platform Services
Controller instance so that it can authenticate with remote domain services.
Once the remote Site Recovery Manager instance with the matching extension ID is located, the
pairing process between the Site Recovery Manager server instances takes place over port 9086.
You must ensure that port 9086 is open on the firewall between the Site Recovery Manager
servers.
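Before pairing, you might want to confirm that the firewall allows traffic on the pairing port. The following is a minimal sketch that uses only the Python standard library. The function name is hypothetical, and a successful TCP connection shows only that the port is reachable, not that the Site Recovery Manager service behind it is healthy.

```python
import socket

def port_is_open(host: str, port: int = 9086, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling port_is_open with the FQDN of the remote Site Recovery Manager server from a machine on the local Site Recovery Manager network gives a quick first check before you start the pairing wizard.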
For more details about the ports required for Site Recovery Manager pairing, see Network Ports
for Site Recovery Manager in the Site Recovery Manager Installation and Configuration Guide at
https://docs.vmware.com/en/Site-Recovery-Manager/.
The Import/Export feature in the Site Recovery Manager UI is new for version 8.2. Site Recovery
Manager 8.1 supports importing and exporting of the Site Recovery Manager configuration
through the command-line only.
You should periodically back up your Site Recovery Manager configuration before running any
recovery workflows. Backing up in this way enables you to restore your configuration easily, if
you need to reinstall Site Recovery Manager after a server failure.
Alarms
Protection groups
Network, folder, resource, and storage policy mappings, including IP customization rules
Placeholder VM information
Advanced settings
Inventory mappings provide a convenient way to specify how resources at the Site Recovery
Manager protected site are mapped to resources at the recovery site. These mappings are applied
to all members of a protection group when the group is created, and they are reapplied as needed,
for example, when new members are added. If you do not create mappings, you must specify
mappings individually for each VM that you add to a protection group. A VM cannot be protected
unless it has valid inventory mappings for networks, folders, and resource pools.
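The rule that a VM needs valid network, folder, and resource pool mappings can be expressed as a simple pre-check. This sketch is an illustration only: the dictionaries, object names, and function names are hypothetical and do not reflect how Site Recovery Manager stores mappings internally.

```python
REQUIRED_KINDS = ("network", "folder", "resource_pool")

def unmapped_resources(vm, site_mappings):
    """Return (kind, resource) pairs the VM uses that have no recovery-site mapping."""
    missing = []
    for kind in REQUIRED_KINDS:
        for resource in vm.get(kind, []):
            if resource not in site_mappings.get(kind, {}):
                missing.append((kind, resource))
    return missing

def can_protect(vm, site_mappings):
    """A VM is protectable only when nothing it uses is unmapped."""
    return not unmapped_resources(vm, site_mappings)

# Hypothetical inventory: the resource pool mapping has not been created yet.
vm = {
    "name": "web-01",
    "network": ["Web_Servers_Prod"],
    "folder": ["Production/Web"],
    "resource_pool": ["Prod-Cluster"],
}
site_mappings = {
    "network": {"Web_Servers_Prod": "Web_Servers_DR"},
    "folder": {"Production/Web": "Recovered/Web"},
    "resource_pool": {},
}

print(unmapped_resources(vm, site_mappings))
print(can_protect(vm, site_mappings))
```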
You should not specify resource mappings for resources that are not used by protected VMs.
Storage policy protection is dynamic, which means that with storage policy protection groups, Site
Recovery Manager only applies the inventory mappings at the moment that you run a recovery
plan.
VM placement decisions are made according to the inventory mappings when a recovery plan
runs, so Site Recovery Manager does not create placeholder VMs on the recovery site. You cannot
configure protection individually on the VMs in a storage policy protection group. As a
consequence, you must configure sitewide inventory mappings if you use storage policy
protection.
Network mappings relate only to VM port groups. You cannot map management network or
VMkernel port groups.
When creating network mappings, you must ensure that you choose a valid destination group on
the recovery site. Site Recovery Manager does not check whether the port group that you map to
provides the connectivity that the VM requires. Nothing prevents you from mapping to an
internal-only network that has no connection to external network interface cards (NICs).
Sometimes this mapping is the correct choice. For example, you might plan to use a VM at the
recovery site that acts as a firewall or router. The firewall or router connects the internal network
to a network that is connected to a physical NIC.
Even if you choose a network mapping that has an external NIC, nothing in Site Recovery
Manager confirms that the NIC is correct. You must have descriptive names for the port groups on
both the protected site and the recovery site.
Site Recovery Manager can protect VMs that are attached to NSX networks present on the
protected and recovery site without having to configure inventory mappings.
Configuring appropriate test networks is important for successfully testing your disaster recovery
plans. Because production VMs remain powered on and accessible during test recovery, you must
ensure that VMs recovered during the test do not conflict with production VMs.
If your testing only requires you to validate that machines start on the recovery site
and can have their IP settings customized, you can choose to use the default auto-generated test
networks during your test recoveries. VMs that are recovered to different hosts on the recovery
site cannot communicate with each other because the auto-generated test networks use standard
switches with no uplinks. Therefore, the VMs are not connected to the external physical network
resources. VMs that are recovered to the same host can still communicate with each other. This
configuration should be adequate for testing purposes in most environments and should be
considered unless interconnectivity between VMs is required during testing.
If you require full network communication between VMs during a test recovery, you should create
dedicated network port groups for use only during test recoveries. These port groups should allow
layer 2 connectivity between recovered VMs but should not allow for layer 3 connectivity to
prevent conflicts with production services still running on the protected site. You should have one
It is good practice to use only one subnet for each network port group you have configured in
vSphere. If you have configured your vSphere network to use one subnet per port group, the
subnet mapping feature provides a simple way to create IP customization rules for your protected
VMs.
As part of the port group mapping, you can add a subnet mapping rule. In the example, all VMs
connected to the Web_Servers_Prod port group with IP addresses in the 172.20.16.0/22 subnet are
assigned the same host address in the 10.20.32.0/22 subnet in the recovery site. For example, a
VM with an IP address 172.20.17.200/22 is assigned 10.20.33.200/22 in the recovery site. The
subnet in the recovery site should be dedicated for recovered VMs to avoid potential IP address
conflicts when VMs are recovered.
When you configure subnet mappings, the source subnet and destination subnet must use the same
subnet mask. This configuration ensures that an equal number of host addresses is available in the
recovery site and that only the network portion of the IP address requires customizing.
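The subnet mapping arithmetic can be checked with a short sketch based on the Python standard library ipaddress module. The function name is hypothetical. The sketch keeps the host portion of the address, replaces the network portion, and rejects subnets whose masks differ, as the rule above requires.

```python
import ipaddress

def map_ip(address: str, source_subnet: str, target_subnet: str) -> str:
    """Keep the host portion of address and swap in the target network portion."""
    src = ipaddress.ip_network(source_subnet)
    dst = ipaddress.ip_network(target_subnet)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("source and destination subnets must use the same mask")
    ip = ipaddress.ip_address(address)
    if ip not in src:
        raise ValueError(f"{address} is not in {source_subnet}")
    # Offset of the host inside the source subnet, reapplied to the target subnet.
    host_offset = int(ip) - int(src.network_address)
    return str(ipaddress.ip_address(int(dst.network_address) + host_offset))

print(map_ip("172.20.17.200", "172.20.16.0/22", "10.20.32.0/22"))  # prints 10.20.33.200
```

The offset of 172.20.17.200 within 172.20.16.0/22 is 456 addresses (1 x 256 + 200), and adding 456 to 10.20.32.0 gives 10.20.33.200, which matches the example in the text.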
VMs and data centers should be organized into folders. With folders, inventory mappings are
more specific. Data centers are superfolders. Inventory mappings can be
done without folder mappings, but organizing VMs into folders allows for better management,
planning, and permissions.
Folder hierarchies are used to organize which VMs are only local and which ones come from the
protected site. These hierarchies also help categorize VMs by purpose, recovery point objective,
recovery time objective, or other important factors.
vSphere clusters and resource pools are the most commonly mapped compute resources.
Storage policy protection groups are dynamic by nature. Using rules defined in a storage policy,
VMs can automatically be placed on appropriate storage to meet requirements. By creating a
storage policy protection group, you can ensure a VM is configured for protection by simply
attaching the appropriate storage policy and making the VM compliant. Storage policies in use on
the protected site must be mapped to similar storage policies on the recovery site to ensure that
VMs are correctly located in the recovery site vCenter Server inventory.
Other mappings for VMs that are protected as part of storage policy protection groups are not
applied to a VM until the moment of recovery. The global settings that apply to the VM when the
recovery is started are dynamically applied and cannot be statically assigned when protection is
configured.
Even if mappings cannot be split, they can be collapsed. The recovery site might have fewer
folders, networks, or resource pools than the protected site. Therefore, you can map multiple
locations at the protected site to a single place in the recovery site.
Site Recovery Manager reserves a place for protected VMs in the inventory of the recovery site by
creating a subset of VM files on the specified datastore at the recovery site. This subset of files is
called a placeholder VM. The subset of VM files is used to register the placeholder VM with the
vCenter Server system at the recovery site.
Placeholder VMs cannot be powered on. Their only function is to reserve a place in the inventory
of the recovery site. Placeholder VMs are composed of only a few files: .vmx, .vmxf, and .vmsd
files. The disk files (.vmdk) are not present. The placeholder VM files have a size of
approximately 1 KB each. Placeholder VMs should not be replicated.
When you add a virtual machine to a protection group, Site Recovery Manager creates a
placeholder VM at the recovery site. When a placeholder VM is created, Site Recovery Manager
derives its folder and compute resource assignments from inventory mappings that you establish at
the protected site. At the recovery site, you can modify folder and compute resource assignments
as necessary. If placeholder virtual machines are deleted, they can be easily recreated from the
Protection Group menu without removing replication from VMs.
After you select the datastore to contain the placeholder VMs, Site Recovery Manager reserves a
place for protected VMs in the inventory on the recovery site. Site Recovery Manager creates a subset
of VM files on the specified datastore at the recovery site. Site Recovery Manager uses that subset
to register the placeholder virtual machine with vCenter Server on the recovery site.
The placeholder datastore should be accessible to all hosts in the recovery cluster. The placeholder
datastore should not be replicated and is relatively small.
You establish placeholder datastores at both sites to enable planned migration and reprotection.
The placeholder datastore should be on shared storage available to multiple ESXi hosts at the
recovery site. The key is that the storage area selected for placeholder VMs must not be one of the
logical unit numbers replicated from the protected site.
Site Recovery Manager does not create placeholder VMs for storage policy protection groups.
You do not need to identify a placeholder datastore if you only use storage policy protection
groups.
You should update mappings manually if the names of mapped objects change. Network port
groups, folders, and resource pools at the recovery site can be renamed or deleted. Site Recovery
Manager does not prevent a vCenter Server user, with the proper security rights at the recovery
site, from making changes to the inventory structure of mapped objects. If a mapped object is
changed or deleted, the inventory mapping does not automatically update.
You should click Refresh to update folders and resource pools that were recently renamed. After
you click Refresh, Site Recovery Manager is aware of the new network port group names, but
these names must be manually updated.
The SRA is an important component of Site Recovery Manager that allows Site Recovery
Manager to issue requests to the storage or replication software.
An array manager is the software that manages your array replication. Typically, the array
manager is either part of the array software, or is an external solution that works with your storage
array to coordinate replication. You must configure Site Recovery Manager with the access details
to communicate with the array manager. An array manager can manage replication from one or
more storage arrays.
You add an array pair to Site Recovery Manager by using a compatible SRA and by providing
details about the local and remote array managers. Site Recovery Manager can then query the
array managers to retrieve details about devices being replicated between the two arrays.
Consistency group configuration on the storage can have a significant impact on the way Site
Recovery Manager operates. If Site Recovery Manager is not aware of the consistency group
configuration, it may request a failover of a device that is part of a consistency group, without
considering the impact to other devices within the group. Some SRAs are capable of reporting
consistency group information to Site Recovery Manager to help prevent such issues from
The diagram shows Site Recovery Manager architecture with array-based replication. An SRA is
added in this architecture.
If you use array-based replication, you must install an appropriate SRA on the Site Recovery
Manager server hosts at each site. An array vendor provides the SRA.
You must install an SRA specific to your storage array type or replication software type. You must
also ensure that the SRA is supported for your release of Site Recovery Manager.
To check the availability of an SRA for your type of storage, see the VMware Compatibility
Guide for Site Recovery Manager at
http://www.vmware.com/resources/compatibility/search.php?deviceCategory=sra.
SRAs play a key role in allowing Site Recovery Manager to interact with storage arrays from
different vendors. If you use arrays from different vendors, or even different array types from the
same vendor, you require a separate SRA for each vendor and array type.
The Site Recovery Manager appliance comes prebundled with the Docker service for the
management of containerized applications. SRAs are deployed as containers, which are
prepackaged standalone applications that are abstracted from the operating system. Containerized
solutions allow for consistent functionality, regardless of changes to the underlying infrastructure.
Updates to the Site Recovery Manager appliance do not impact the running of the SRA.
You should perform a rescan of newly installed SRAs on both Windows deployments and on Site
Recovery Manager appliance deployments.
Each discovered SRA is displayed in the user interface with a summary of the array models and
replication software versions supported by the SRA. You must ensure that your replication
software and arrays are supported by the SRA version you installed. If you upgrade your SRA,
you might also be required to upgrade your replication software or your array management
software.
The Summary tab displays information about support for the stretched storage functionality.
Other features that are not shown in the user interface might be supported by your SRA.
You should check your SRA documentation for details. For example, SRAs might include the
following features:
Consistency groups: If the SRA supports consistency groups, it provides information to Site
Recovery Manager about consistency group configuration on the array. Site Recovery
Multi-array discovery: Support for this feature allows a single SRA to communicate with
multiple arrays.
Dynamic access restriction: This feature allows Site Recovery Manager to request that
recovered devices only be presented to a subset of initiators in the recovery site, such as a
single cluster of hosts.
An array pair is any two storage arrays or replication software endpoints between which
replication exists. Most environments use one source array and one target array.
However, some devices might replicate from one source array to more than one target array. In
that case, you need to configure multiple array pairs to allow Site Recovery Manager to work with
all replicated devices.
If you have more than one storage array type or vendor, you need an SRA for each array type. If
you have more than one storage array of the same type and vendor, your SRA must support multi-
array discovery. For example, a separate replication solution coordinates the replication between
multiple source and destination arrays. The SRA is configured to connect to the replication
software management interface. The SRA displays multiple array pairs, and you must enable each
array pair to detect the replicated devices within each array pair.
After you add details for the local and remote array managers, Site Recovery Manager issues a
request to the selected SRA called the discoverArrays command. The SRA should respond
with details about the local array type, replication solution, array vendor, and array identifier
(typically, the serial number). The local array or replication software also informs Site Recovery
Manager about any remote storage arrays with which it has a replication agreement. If both the
local and remote array managers have replication agreements, an array pair is created.
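The pairing logic can be sketched as follows. The data structure is a hypothetical simplification and does not represent the actual format of an SRA response. Each discovered array is modeled as an identifier plus the set of peer arrays it reports, and a pair is created only when both sides report each other.

```python
from dataclasses import dataclass, field

@dataclass
class DiscoveredArray:
    array_id: str                                  # for example, a serial number
    peer_ids: set = field(default_factory=set)     # arrays it replicates with

def find_array_pairs(local_arrays, remote_arrays):
    """Return (local_id, remote_id) for arrays that report each other as peers."""
    remote_by_id = {a.array_id: a for a in remote_arrays}
    pairs = []
    for local in local_arrays:
        for peer_id in sorted(local.peer_ids):
            remote = remote_by_id.get(peer_id)
            if remote is not None and local.array_id in remote.peer_ids:
                pairs.append((local.array_id, remote.array_id))
    return pairs

# Illustrative identifiers only.
local_arrays = [DiscoveredArray("SN-A1", {"SN-B1", "SN-B2"})]
remote_arrays = [
    DiscoveredArray("SN-B1", {"SN-A1"}),
    DiscoveredArray("SN-B2"),          # reports no agreement with SN-A1
]

print(find_array_pairs(local_arrays, remote_arrays))
```

In this example, only SN-A1 and SN-B1 form a pair because SN-B2 does not report a replication agreement in return.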
After an array pair is enabled, Site Recovery Manager requests a list of devices that are replicated between
the array pair using the discoverDevices SRA command. The SRA should respond with a
list of all replicating devices, regardless of whether they are used by the vSphere environment.
Using the unique device identifiers in the SRA response, such as an NAA ID, Site Recovery
Manager queries the vCenter Server inventory to determine whether the device contains a
registered datastore or whether it is used as a raw device mapping (RDM) for a virtual machine.
This information appears in the Datastore column on the Array Pairs page in the Site Recovery
Manager UI.
The replication direction is displayed relative to the Site Recovery Manager instance that you are
logged in to. In the example, both replicated devices replicate from the array local to the
sa-vcsa-01.vclass.local vCenter Server site to the array local to the sb-vcsa-01.vclass.local
vCenter Server site. In a bidirectional architecture, devices replicate in both directions.
A virtual machine must be protected in its entirety. When a storage device is recovered, Site
Recovery Manager must ensure that any VM files on the device do not depend on files on a different
device that is not failed over. Site Recovery Manager automatically groups datastores together when
the files of a protected VM span multiple devices.
Site Recovery Manager performs this calculation periodically (every 5 minutes by default), any
time the discoverDevices SRA command is issued, and after inventory changes to protected
VMs. This task is recorded in the Tasks and Events section of vCenter Server as the Recompute
Datastore Groups task.
If the datastore group calculation changes after you create protection groups, the protection group
status might be affected. You must understand how the layout of virtual machines affects the
calculation of datastore groups so that you do not inadvertently affect the status of protected VMs
and prevent the recovery of these VMs.
To determine how Site Recovery Manager calculates datastore groups, you must consider the
layout of the VM files.
Is each VM self-contained, in its entirety, within a single datastore?
Can datastores be recovered in isolation?
In this example, each VM has a complete set of files on a single datastore. Each datastore can thus
be recovered in isolation without affecting the protection status of any VM. For example, if the
device containing Datastore 1 is recovered, every file belonging to the VM hosted on the datastore
is also recovered.
In this example, you add a virtual disk to the third VM, but you place the virtual disk file on
Datastore 2.
How does this change affect the calculation of datastore groups?
Can datastores still be recovered in isolation?
If the device containing Datastore 2 is recovered by itself, the virtual disk belonging to VM3 is
also recovered to the recovery site. This configuration leaves part of the VM running in the
production site, but the disk on which the VM depends is only available in the recovery site. The
production virtual machine, VM3, crashes due to loss of access to its virtual disk, and you cannot
power on the VM in the recovery site because of the incomplete set of files. The same issue occurs
if Datastore 3 is allowed to fail over in isolation. Therefore, to provide adequate protection to the
virtual machines, Site Recovery Manager must recover both datastores simultaneously.
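The grouping rule in this example can be sketched as a connected-components calculation: any two datastores that hold files of the same VM fall into one group. This is only an illustration under stated assumptions, not Site Recovery Manager's actual implementation. The function name `compute_datastore_groups`, the input layout, and the placement of VM1 and VM2 are hypothetical.

```python
from collections import defaultdict


def compute_datastore_groups(vm_datastores):
    """Group datastores that must fail over together.

    vm_datastores maps a VM name to the set of datastores holding its files
    (hypothetical input format). Datastores shared by one VM are merged
    into the same group with a small union-find structure.
    """
    parent = {}

    def find(ds):
        parent.setdefault(ds, ds)
        while parent[ds] != ds:
            parent[ds] = parent[parent[ds]]  # path halving
            ds = parent[ds]
        return ds

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for datastores in vm_datastores.values():
        ordered = sorted(datastores)
        for ds in ordered:
            find(ds)  # register every datastore, even if it stands alone
        for other in ordered[1:]:
            union(ordered[0], other)

    groups = defaultdict(set)
    for ds in list(parent):
        groups[find(ds)].add(ds)
    return sorted(sorted(group) for group in groups.values())


# Assumed layout: VM3 lives on Datastore 3 with an extra disk on Datastore 2.
layout = {
    "VM1": {"Datastore 1"},
    "VM2": {"Datastore 2"},
    "VM3": {"Datastore 3", "Datastore 2"},
}
groups = compute_datastore_groups(layout)
print(groups)
```

With the assumed layout, Datastore 1 remains a group of its own, while Datastore 2 and Datastore 3 are merged into a single group that must be recovered together.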
If you create individual protection groups in Site Recovery Manager using the datastore groups
shown in the previous diagram, where a virtual disk is added to the third VM but the virtual disk
file is placed on Datastore 2, the protection groups based on Datastore 2 and Datastore 3 become
invalid and require reconfiguration to include the new datastore group in a single protection group.
If a single datastore spans two extents corresponding to partitions of two different devices, the two
extents must be in a single consistency group. In addition, the SRA must report consistency group
information from the array in the device discovery stage. Otherwise, the creation of protection
groups based on this datastore is not possible even though the SRA reports that the extents that
make up this datastore are replicated.
If multiple datastores belong to the same consistency group, you must ensure that all datastores are
configured as part of the same protection group in Site Recovery Manager. Otherwise, Site
Recovery Manager might request the failover of a single device but not anticipate the impact on
other devices in the consistency group. When the SRA supports consistency groups, Site Recovery
Manager can use the information provided by the SRA to ensure that all datastores in the
consistency group are grouped together for simultaneous failover by Site Recovery Manager.
vSphere Replication appliance, which consists of the vSphere Replication management server
and the vSphere Replication server
vSphere Replication agent, which is installed on the ESXi hosts and consists of the vSphere
Replication service and vSphere Replication filter
To use vSphere Replication, you must deploy one vSphere Replication appliance at each site. The
vSphere Replication appliance is registered with the corresponding vCenter Server instance on
each site.
Additional vSphere Replication servers can be deployed for load balancing.
For a list of all the ports that must be open for vSphere Replication, see VMware knowledge base
article 2087769 at http://kb.vmware.com/kb/2087769.
The vSphere Replication appliance manages the replication infrastructure and the replication
process. To meet the replication load of your environment, a separate vSphere Replication server
appliance is available. The vSphere Replication server appliance performs the replication process.
You might need to deploy one or more vSphere Replication server appliances at the recovery site
to balance the replication load. You must register additional vSphere Replication server appliances
with the vSphere Replication appliance on the corresponding site.
You can have a maximum of one vSphere Replication appliance per vCenter Server instance. The
vSphere Replication appliance contains a vSphere Replication server. You can have up to nine
additional vSphere Replication servers registered to the vSphere Replication appliance.
The vSphere Replication appliance is responsible for mapping datastores, configuring replication,
and coordinating replication operations between protected and recovery sites. The vSphere
Replication appliance also runs recovery plan tests that are passed to it by Site Recovery Manager.
The vSphere Replication appliance coordinates between the vSphere Replication agent at the
protected site, the vSphere Replication server at the recovery site, and the ESXi hosts on both
sites.
The vSphere Replication server is a subset of the functionality of the full vSphere Replication
appliance. The vSphere Replication server neither partakes in management functions nor in
creation of policy for replication. In most scenarios, you must deploy a single vSphere Replication
appliance at each site so that vSphere Replication works.
The vSphere Replication server performs the following tasks:
Allows replication even if vCenter Server, the Site Recovery Manager server host, or the
vSphere Replication appliance is down
To meet the load-balancing needs of your environment, you might need to deploy additional
vSphere Replication server instances at each site. vSphere Replication supports a maximum of 200
virtual machines per vSphere Replication server.
The vSphere Replication service is integrated with the ESXi host agent, hostd, and manages the
replication process for virtual machines.
The vSphere Replication service writes the replication configuration for a virtual machine to its
configuration file, such as RPO, vSphere Replication server IP addresses, and the data transfer
port.
The vSphere Replication filter maintains the replication state for each replicated disk that is
attached to a virtual machine. To maintain an accurate state, the vSphere Replication filter tracks
regions of a virtual disk that are modified by the guest OS. The vSphere Replication filter
communicates this information to the vSphere Replication service.
Multiple vSphere Replication filter instances can run simultaneously. Each instance writes to an
in-memory state and a persistent-state file to store the replication state of a virtual machine. The
persistent-state file is named hbr-persistent-state-RDID, followed by a hexadecimal
number and a .psf extension. The in-memory state is flushed when the virtual device is
destroyed.
At the lowest level (inside the VMkernel), the vSphere Replication filter watches disk updates
from virtual machines. The vSphere Replication filter also tracks which blocks have changed since
the last replica was sent to the remote site. When an instance of changed blocks for replication
must be created, the vSphere Replication agent transfers the data to the vSphere Replication server
on the remote site.
vSphere Replication can be used in isolation from Site Recovery Manager in a single vCenter
Server environment. For example, you might have a number of ESXi clusters dispersed across a
campus, each with its own storage and managed by a single vCenter Server instance. vSphere
Replication can be used to replicate VMs from one ESXi cluster to a datastore accessible by
another ESXi cluster. In the case of a failure, the VM can be recovered in the secondary cluster.
Array-based replication can provide synchronous replication for the most demanding applications,
such as database servers, that require extremely low or even zero RPO values. However, it may
not be cost-effective to have all applications replicated using the array-based solution due to
licensing costs and bandwidth requirements. vSphere Replication can be used in parallel with
array-based replication to provide replication for less critical workloads and those services that do
not require such high-performance replication and low RPO values.
Site Recovery Manager and vSphere Replication can be used by cloud service providers or IT
departments to provide their customers with Disaster Recovery to the Cloud services. The shared
recovery site feature provides this capability by enabling you to protect virtual machines from
multiple vCenter Server instances at protected sites to a single vCenter Server instance at a shared
recovery site. vSphere Replication is used to replicate virtual machines. These sites replicate to a
single recovery site that is set up by the service provider. The service provider determines which
replication methods to offer through its services, that is, vSphere Replication, array-based
replication, or both.
Please note that only a single vSphere Replication appliance, which provides management
functionality, is required in the shared recovery site. It is not possible to register more than one
instance of a vSphere Replication appliance to a single vCenter Server instance. If you attempt to
register an additional vSphere Replication appliance, the existing vSphere Replication server
extension is overwritten on the vCenter Server instance and the original vSphere Replication
appliance will no longer communicate with vCenter Server. You can provide isolation for each
One of the key benefits of vSphere Replication is that it is entirely storage agnostic. vSphere
Replication uses the ESXi host to perform storage operations when replicating VM files. This
means you can have storage from different vendors, and even using different protocols, in each
site, preventing vendor lock-in. For example, you can have Fibre Channel storage from vendor A
in the protected site and replicate to an NFS device from vendor B at the recovery site.
vSphere Replication also integrates with other vSphere storage types such as vSAN and vSphere
Virtual Volumes, as either source or destination datastores. When using vSAN as a destination
datastore, you can choose storage policies for the replica virtual machine and its disks when
configuring replications. For more information, please see Using vSphere Replication with vSAN
Storage in the VMware vSphere Replication Administration Guide at
https://docs.vmware.com/en/vSphere-Replication.
vSphere Replication is configured on a per-VM basis, which means that you can replicate and
protect only a subset of VMs that are stored on a particular datastore. With array-based replication,
all VMs on a replicated datastore must be protected by Site Recovery Manager.
When determining how much bandwidth is required for vSphere Replication, you must first
determine how much data is to be protected by vSphere Replication. You may configure
replication only for a subset of your VMs. Calculate the combined disk capacity of all VMs to be
protected to determine your data set size. For example, you have 10 VMs with 100 GB of disk
capacity each, resulting in a combined data set size of 1 TB. You configure replication for 5 VMs
only, which will result in a replicated data set of 500 GB.
You should then calculate the average change rate within the replicated data set. You should
monitor each VM that you plan to replicate for a longer period of time to calculate the change rate.
For example, you monitor a VM that you are planning to replicate for a period of 24 hours and
determine that 6 GB of changed data is generated every 24 hours. You can then use the data
change rate over the 24-hour period to calculate the average change rate for the RPO period that
you plan to implement. For example, you plan to set an RPO value of 4 hours for the VM. By
dividing the data change rate that you determined for the 24-hour period by the number of RPO
periods (there are 6 x 4-hour RPO periods every 24 hours), you can determine that the average
amount of data to be transferred during each RPO period is approximately 1 GB.
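The sizing arithmetic above can be written out as a short calculation. The function names are hypothetical. The final step, which converts the per-period volume into an average link rate, is an added assumption that uses decimal units (1 GB = 8,000 megabits) and spreads the transfer evenly over the whole RPO period.

```python
def replicated_data_set_gb(vm_count, disk_gb_per_vm):
    """Combined disk capacity of the VMs selected for replication."""
    return vm_count * disk_gb_per_vm


def data_per_rpo_period_gb(daily_change_gb, rpo_hours):
    """Average changed data per RPO period, from a 24-hour measurement."""
    periods_per_day = 24 / rpo_hours
    return daily_change_gb / periods_per_day


def average_rate_mbps(data_gb, period_hours):
    """Average rate needed to move data_gb within period_hours (assumption)."""
    megabits = data_gb * 8 * 1000
    return megabits / (period_hours * 3600)


data_set = replicated_data_set_gb(vm_count=5, disk_gb_per_vm=100)
per_period = data_per_rpo_period_gb(daily_change_gb=6, rpo_hours=4)
rate = average_rate_mbps(per_period, period_hours=4)

print(f"Replicated data set: {data_set} GB")
print(f"Changed data per 4-hour RPO period: {per_period} GB")
print(f"Average rate over the period: {rate:.2f} Mbps")
```

The averaged rate is a lower bound only. Real change rates are bursty, so the measured peak periods matter more than the daily average when sizing a link.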
During the OVF deployment wizard, the vSphere Replication appliance must bind with the
vCenter Extension vService to allow it to access the vCenter Server APIs. To make the binding
status valid, the VirtualCenter.FQDN value must be set to a correct value.
The ISO file provided by VMware includes the OVF templates for the combined vSphere
Replication appliance and the separate vSphere Replication server. You must choose the correct
OVF template files to ensure that you are deploying the correct type of server.
The deployment of the vSphere Replication appliance using the Deploy OVF Template wizard requires
details common to any OVF deployment wizard such as inventory locations and compute, storage,
and network resources to be used by the deployed VM.
A number of the details requested during the deployment wizard are unique to vSphere
Replication OVF deployments. The Enable fileintegrity option determines whether the appliance
is required to report changes made to key OS files. Enabling this option does not guarantee that
files cannot be modified but any changes are logged in a log file at
/opt/vmware/support/logs/fileintegrity/fileintegrity.log for auditing
purposes.
The Disable VCTA option allows you to disable the vCloud tunneling agent, which is required
when you replicate VMs to a cloud service, such as vCloud Air powered by OVH. VMware
recommends disabling this service, unless it is required for replications to the cloud.
Reconfigure various vSphere Replication appliance settings through the web-based VAMI. These
settings are established during installation but can be modified after the appliance is deployed.
You connect to the vSphere Replication appliance by opening a browser and entering the
https://VR_appliance_IP_address:5480/ appliance URL. After the appliance is
configured and registered for the first time, you have the option to connect to the appliance
management interface through the Site Recovery plug-in in vSphere Client.
After you log in, the configuration panel on the VR tab is used to configure database settings,
vCenter Server information, and SSL certificate settings.
For increased security, change the SSL certificate that vSphere Replication uses. The vSphere
Replication appliance uses certificate-based authentication for all connections that it establishes
with vCenter Server and remote site appliances. vSphere Replication generates a standard, self-
signed SSL certificate when the appliance first boots and registers with vCenter Server.
Ensure that the vCenter Server name (or IP address) that you configure here matches the vCenter
Server name (or IP address) that you used during the Site Recovery Manager installation.
The vSphere Replication appliance contains an embedded vPostgreSQL database that you use
immediately after you deploy the appliance, without additional database configuration. If you do
not use the embedded database, you can use the vCenter Server database server to create and
support an external vSphere Replication database. Because vSphere Replication has different
schema requirements, vSphere Replication can use the same database server but not the same
database as vCenter Server. You can use an external database for easier backup or to meet your
company's database standards.
The PhotonOS Linux distribution manages services using systemd. You can start and stop
services using systemctl commands from the command line.
You can start and stop the hms and tomcat services using the virtual appliance management
interface (VAMI) by navigating to https://<VRMS_IP_or_FQDN>:5480.
Establish the vSphere Replication appliance connection at either the protected site or the recovery
site. Before vSphere Replication appliances are paired, appliances must be installed and
configured at the protected site and the recovery site.
To meet the load-balancing needs of your environment or to allow separation of duties for
different tenants, you might need to deploy one or more additional vSphere Replication servers at
each site.
Because the vSphere Replication server is managed through the Site Recovery client, normal
administration should require little or no VM-level management. Provide the IP address, default
gateway, DNS information, and a root password to complete the deployment wizard.
A vSphere Replication server should always be deployed at both site pair locations, even if the
initial replication is in only one direction. Deploying the vSphere Replication server at both sites
prepares you for multidirectional protection and the eventual support of failback. The vSphere
Replication server acts as a proxy for the primary hosts to access replica storage without knowing
details of the remote site.
The vSphere Replication server uses the same disks as a vSphere Replication management server.
However, the management services are not enabled and the server type identification changes by
using a different OVF template (.ovf file) and manifest file (.mf).
Before you register a vSphere Replication server, a vSphere Replication appliance must be
configured at each site. The vSphere Replication appliance must be configured and be in a
connected state.
After you register the vSphere Replication server, the Site Recovery Manager UI changes. A new
object in the center pane appears that represents your vSphere Replication server and the value in
the Target VR Servers text box is incremented by one.
During a full vSphere Replication synchronization, both sites read the complete disk and compute
checksums. Checksums are sent to the protected site, where they are compared with the initial data
and the set of disk blocks requiring updates is created. Blocks that have changed on the protected
site are sent to the recovery site, to overlap replication with the checksum process. In addition to
establishing the initial state of the virtual disk, a full synchronization is also performed to recover
from errors or if an unusual state is detected.
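The checksum comparison can be illustrated with a small sketch. The source does not state the block size or checksum algorithm that vSphere Replication uses, so the 4-byte blocks and SHA-256 here are assumptions chosen only to keep the example readable. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # assumed, deliberately tiny for illustration


def block_checksums(disk, block_size=BLOCK_SIZE):
    """Return one checksum per fixed-size block of the disk image."""
    return [
        hashlib.sha256(disk[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(disk), block_size)
    ]


def blocks_to_send(source_disk, target_checksums, block_size=BLOCK_SIZE):
    """Compare source blocks with the checksums received from the target.

    Returns the indexes of blocks that differ or are missing on the target.
    """
    changed = []
    for index, checksum in enumerate(block_checksums(source_disk, block_size)):
        if index >= len(target_checksums) or checksum != target_checksums[index]:
            changed.append(index)
    return changed


protected_disk = b"AAAABBBBCCCCDDDD"
recovery_disk = b"AAAAXXXXCCCC0000"

# The recovery site sends checksums; the protected site decides what to ship.
diff = blocks_to_send(protected_disk, block_checksums(recovery_disk))
print(diff)
```

Only the block indexes in `diff` need to cross the network, which is why a full synchronization against an existing copy is much cheaper than copying the whole disk.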
After full synchronization is complete, a delta-based instance of blocks is created for replication.
The recovery site virtual disk is fully consistent, rather than crash consistent. Lightweight delta
data is tracked and sent in 8 KB chunks.
After the initial synchronization of a VM to its target, vSphere Replication sends only changed
blocks from the source VM. Within the recovery point objective (RPO) window that is defined by the
administrator, vSphere Replication tracks the blocks that are changing and creates a lightweight
delta. A lightweight delta is a bundle of changed blocks to be transferred to the target disk.
As the target receives the lightweight delta from the source, the data is written to a redo log on the
target disk. To preserve the integrity of a consistent disk, new updates for the next instance are
stored in a redo log. After the sending of an instance of replication data is complete, the redo log is
merged into the base disk. This process preserves the consistency of the replica because only
complete redo logs are committed. The use of redo logs also enables vSphere Replication to create
isolated test images of VMs without affecting ongoing replication.
Pointers to changed blocks are kept in both a memory bitmap and a persistent state file in the
directory of a VM. Because memory contents are always current, the persistent state file represents
the state of the lightweight delta that is being shipped. After a lightweight delta is shipped and
acknowledged, the memory bitmap is copied to the persistent state file and the memory bitmap is
restarted for the next lightweight delta.
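The handoff between the memory bitmap and the persistent state file can be modeled roughly as follows. This is a conceptual sketch only: the class and method names are hypothetical, Python sets stand in for the bitmap and the file, and the real on-disk format is not described in the source.

```python
class ChangeTracker:
    """Toy model of changed-block tracking for one replicated disk."""

    def __init__(self):
        self.memory_bitmap = set()     # blocks changed since the last handoff
        self.persistent_state = set()  # stands in for the persistent state file

    def record_write(self, block):
        """The guest wrote to a block; the in-memory state is always current."""
        self.memory_bitmap.add(block)

    def delta_acknowledged(self):
        """The previous delta was shipped and acknowledged.

        The memory bitmap is copied to the persistent state and then
        restarted for the next lightweight delta.
        """
        self.persistent_state = set(self.memory_bitmap)
        self.memory_bitmap = set()
        return sorted(self.persistent_state)


tracker = ChangeTracker()
for block in (3, 7, 3):
    tracker.record_write(block)

next_delta = tracker.delta_acknowledged()
tracker.record_write(9)  # new writes accumulate while the delta ships

print(next_delta, sorted(tracker.memory_bitmap))
```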
The target vSphere Replication server cannot access any hosts in the recovery site cluster.
When replicating encrypted VMs, you must ensure that both vCenter Server instances connect to
the same key management server (KMS) using the same KMS cluster name. The replica VM must
be encrypted using the same key as the source VM. Keys are identified by vCenter Server using a
combination of both the KMS cluster name and the key identifier. If the vCenter Server instance in
the recovery site cannot retrieve the encryption key, the replica disk cannot be created.
Non-encrypted VMs are replicated over TCP port 31031 and the data is transferred in its
unencrypted form. However, encrypted VMs are replicated over a secure TLS connection using
port 32032 and the data transmission is encrypted end-to-end.
To provide load balancing and scale your environment, deploy multiple vSphere Replication
servers. When configuring vSphere Replication for a VM, you can manually select the vSphere
Replication server to use or the server can be automatically assigned to you.
With automatic assignment, the round-robin method is used and the vSphere Replication server
with the fewest replications is chosen. The number of VMs that a vSphere Replication
server is replicating is displayed to help you with your decision. The vSphere Replication
appliance is in the list of vSphere Replication servers. If a vSphere Replication server fails during
a replication, the replication is resumed when the server is available again. If a replication needs to
be moved to a different vSphere Replication server, the first replication is a full replication.
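Choosing the server with the fewest replications can be sketched as below. This is not the product's scheduler. The server names are invented, breaking ties by name is an assumption, and skipping servers that already hold 200 replications reflects the documented per-server maximum.

```python
MAX_REPLICATIONS_PER_SERVER = 200  # documented per-server maximum


def assign_server(replication_counts):
    """Pick the server with the fewest replications that still has capacity.

    replication_counts maps a server name to its current replication count.
    Ties are broken alphabetically (an assumption made for determinism).
    """
    candidates = {
        name: count
        for name, count in replication_counts.items()
        if count < MAX_REPLICATIONS_PER_SERVER
    }
    if not candidates:
        raise RuntimeError("every vSphere Replication server is at its limit")
    return min(candidates, key=lambda name: (candidates[name], name))


# Hypothetical site: the appliance's embedded server plus two additional servers.
counts = {"vr-appliance": 120, "vr-server-01": 80, "vr-server-02": 200}
chosen = assign_server(counts)
print(chosen)
```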
Each vSphere Replication server supports up to 200 replications. If a site expects to receive more
than 200 replicated VMs, deploy one or more additional vSphere Replication servers at that site.
Additional vSphere Replication servers require 716 MB of RAM.
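As a rough planning aid, the number of additional servers follows from the 200-replication limit per server, the vSphere Replication server embedded in the appliance, and the limit of nine additional servers per appliance. The function name is hypothetical, and the calculation ignores any headroom that you might want to keep.

```python
import math

MAX_REPLICATIONS_PER_SERVER = 200
MAX_ADDITIONAL_SERVERS = 9
RAM_PER_ADDITIONAL_SERVER_MB = 716


def additional_servers_needed(expected_replications):
    """Additional servers required beyond the appliance's embedded server."""
    total = max(1, math.ceil(expected_replications / MAX_REPLICATIONS_PER_SERVER))
    additional = total - 1
    if additional > MAX_ADDITIONAL_SERVERS:
        raise ValueError("exceeds the supported number of additional servers")
    return additional


needed = additional_servers_needed(450)
ram_mb = needed * RAM_PER_ADDITIONAL_SERVER_MB
print(f"{needed} additional servers, {ram_mb} MB of RAM in total")
```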
When you specify a target datastore, a folder with the same name as the source VM is created
to store the replica VM files. If a folder of the same name already exists, a folder with a
(1) suffix is created.
You can configure a different disk format for the replica disk than is configured on the source
disk. For example, the source disk might be thick-provisioned but you may configure a thin-
provisioned disk format for the replica disk to conserve datastore space on the target site.
The Configure datastore per disk option allows you to specify individual datastores for each
replica disk. This option might be important where VM disks are extremely high capacity or have
a high I/O requirement. By putting the high-performance disk on its own datastore and the OS
disks on another datastore, you can ensure that the VM will perform as expected when it is
recovered.
If you select Select seeds, vSphere Replication will search the target datastore for VM disks with
the same names as the source disks and offer these for use as seed disks. Using seed disks can
You can ship a copy of a VM disk to the target site on physical media and copy it to a datastore on
the remote site. This virtual disk file can then be used as a seed disk for the replication job, which
is especially useful when configuring replication for large amounts of data for the first time. For
example, suppose that you are configuring replication for 10 TB of virtual disk data. Assuming that
only about 70 percent of a link is available for replication traffic, a 10 Mbps link delivers
about 3 GB per hour and a 100 Mbps link delivers about 30 GB per hour. Replicating the 10 TB of
data over the 100 Mbps link would take about 333 hours, or
almost 14 days. It might be more cost-effective and time-efficient to ship that data physically to
the remote site and then transfer only the data that has changed since the disk copy was created.
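The link arithmetic in this example can be reproduced with a few lines. The function names are hypothetical, and decimal units are assumed (1 GB = 1,000 MB, 1 TB = 1,000 GB). The unrounded throughput is slightly higher than the rounded figures used in the text, so both results are shown.

```python
def link_gb_per_hour(link_mbps, usable_fraction=0.7):
    """Usable throughput of a link in GB per hour (decimal units)."""
    megabits_per_hour = link_mbps * usable_fraction * 3600
    return megabits_per_hour / 8 / 1000  # megabits -> megabytes -> gigabytes


def transfer_hours(data_gb, gb_per_hour):
    """Hours needed to move data_gb at the given throughput."""
    return data_gb / gb_per_hour


slow = link_gb_per_hour(10)    # roughly 3 GB per hour
fast = link_gb_per_hour(100)   # roughly 30 GB per hour
data_gb = 10 * 1000            # 10 TB of virtual disk data

rounded_hours = transfer_hours(data_gb, 30)   # figure used in the text
exact_hours = transfer_hours(data_gb, fast)

print(f"10 Mbps: {slow:.2f} GB/h, 100 Mbps: {fast:.2f} GB/h")
print(f"Rounded: {rounded_hours:.2f} h ({rounded_hours / 24:.1f} days)")
print(f"Unrounded: {exact_hours:.2f} h ({exact_hours / 24:.1f} days)")
```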
Take great care when manually specifying virtual disks as seed disks. If the virtual disk you
choose is actually a VM disk in use by a VM in the recovery site, the data will be overwritten by
the replica data from the source VM and you may suffer data loss. You will therefore be prompted
to validate that you have chosen the correct seed disks in the UI before you are allowed to
proceed.
The configurable range for an RPO in vSphere Replication is from 5 minutes to 24 hours.
For example, an RPO of 1 hour seeks to ensure that the VM loses no more than 1 hour of data
during the recovery. For smaller RPOs, less data is lost in a recovery but more network bandwidth
is consumed to keep the replica synchronized.
To connect scheduling to an RPO, vSphere Replication uses the RPO to decide how often to create
an instance. If the RPO is set for 15 minutes, the results at the replica site can never be more than
15 minutes stale. In practice, setting the RPO is difficult because the instance takes time to create.
The instance might take 5 minutes to transfer. By the time that the instance is complete, the data is
already 5 minutes old. To ensure that the data is not more than 15 minutes stale, you might need to
execute a new RPO request 5 minutes before the deadline. The estimated time that the RPO takes
is based on previous times for the VM.
vSphere Replication is designed to leverage overwrites by the guest. The assumption is that,
during an RPO window, the guest overwrites the same block many times. vSphere Replication
transfers only the last update to the block in the RPO window.
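The effect of overwrites within an RPO window can be shown with a toy model in which each write is a block number and a payload. The function name and data layout are hypothetical. The point is only that repeated writes to one block collapse into a single entry in the delta.

```python
def build_lightweight_delta(writes):
    """Collapse a sequence of (block, data) writes into one entry per block.

    Later writes replace earlier ones, so only the last update to each
    block in the window is kept for transfer.
    """
    delta = {}
    for block, data in writes:
        delta[block] = data
    return delta


# Four guest writes during one RPO window, three of them to block 10.
writes = [(10, b"a"), (11, b"b"), (10, b"c"), (10, b"d")]
delta = build_lightweight_delta(writes)

print(f"{len(writes)} writes, {len(delta)} blocks to transfer")
```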
An RPO period starts when a consistent copy of a VM has been created on the recovery site. For
example, a VM may have an RPO value of 1 hour. A replication of the VM begins at 12:00. The
replication job will replicate the state of the VM as it existed at 12:00. It takes 10 minutes to
transfer the changed data to the recovery site. The next replication must complete before 13:00
or the usable copy of the VM in the recovery site will be more than 1 hour old. vSphere
Replication uses historical data to estimate how much time the next replication will take and
schedules the start of the next replication to ensure that the replication completes before the RPO
period expires.
For this example, assume that it takes an average of 10 minutes to replicate the VM data for each
replication job. vSphere Replication will start the next replication at least 10 minutes before 13:00
to ensure that a more recent, consistent copy of the VM exists before the copy created at 12:00
becomes more than one hour old. The next replication job may begin at 12:45, meaning that the
next RPO period will expire at 13:45, and the process repeats itself.
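The timing in this example can be checked with simple date arithmetic. The function names are hypothetical and the calendar date is arbitrary. The sketch shows only the deadline calculation, not how vSphere Replication derives its duration estimate from historical data.

```python
from datetime import datetime, timedelta


def rpo_deadline(instance_time, rpo):
    """Time at which a copy taken at instance_time becomes older than the RPO."""
    return instance_time + rpo


def latest_start(instance_time, rpo, estimated_duration):
    """Latest start that still lets the next replication finish in time."""
    return rpo_deadline(instance_time, rpo) - estimated_duration


rpo = timedelta(hours=1)
estimate = timedelta(minutes=10)
first_instance = datetime(2020, 1, 1, 12, 0)

deadline = rpo_deadline(first_instance, rpo)
must_start_by = latest_start(first_instance, rpo, estimate)

# The example starts early, at 12:45, which moves the next deadline to 13:45.
actual_start = datetime(2020, 1, 1, 12, 45)
next_deadline = rpo_deadline(actual_start, rpo)

print(deadline.strftime("%H:%M"),
      must_start_by.strftime("%H:%M"),
      next_deadline.strftime("%H:%M"))
```

Because each new deadline is measured from the start of the replication that produced the latest consistent copy, starting early shortens the effective interval between replications.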
RPO scheduling is not a fixed schedule. Configuring an RPO value of 4 hours does not mean that
replications will happen every 4 hours exactly. The host will also consider other VMs that require
An RPO violation occurs when the last consistent copy of the VM is older than the configured
RPO value. An RPO violation might exist for only a few minutes when a replication job takes longer
than expected. For example, vSphere Replication might schedule a replication job with enough
time based on the previous replication behavior. However, an unpredictably high or unusual
amount of delta data might not finish replicating before the RPO period expires. As an example,
the last consistent data of a VM was created at 12:00 and the VM has an RPO value of 1 hour. If
the previous replications all took 10 minutes to complete, vSphere Replication must start the next
replication at least 10 minutes before 13:00. vSphere Replication begins the replication at 12:45.
However, if the amount of data is significantly higher than normal and takes 20 minutes to
transfer, the replication job will still be in progress at 13:00. This means that there is no usable
copy of the VM that you can recover that is less than 1 hour old and you experience an RPO
violation. When the ongoing replication job completes at 13:05, the last consistent copy of the VM
is the state of the VM at 12:45, and so the RPO violation is cleared. vSphere Replication will use
this data to ensure that the next replication begins even earlier to avoid a repeat of this situation.
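The violation in this example can be expressed as the gap between the RPO deadline of the last consistent copy and the completion time of the next replication. The function name is hypothetical and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta


def rpo_violation_window(last_copy_time, rpo, next_start, next_duration):
    """Return (start, end) of the RPO violation, or None if there is none.

    The violation starts when the last consistent copy exceeds the RPO and
    ends when the replication in progress completes.
    """
    deadline = last_copy_time + rpo
    completion = next_start + next_duration
    if completion <= deadline:
        return None
    return deadline, completion


window = rpo_violation_window(
    last_copy_time=datetime(2020, 1, 1, 12, 0),
    rpo=timedelta(hours=1),
    next_start=datetime(2020, 1, 1, 12, 45),
    next_duration=timedelta(minutes=20),
)
start, end = window
print(start.strftime("%H:%M"), end.strftime("%H:%M"), end - start)
```

If the same job had taken its usual 10 minutes, it would have completed at 12:55 and the function would return `None`.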
When configuring replication for a VM, you can enable multiple point-in-time (MPIT) instances.
vSphere Replication retains these MPIT instances in the form of a redo log or a snapshot. You
must specify the number of instances to retain daily and for how many days. A maximum of 24
instances is supported. For example, you can configure 8 instances per day for 3 days, or you can
configure 24 instances per day for 1 day.
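The retention limit can be checked as the product of instances per day and days. The function name is hypothetical. The sketch mirrors only the arithmetic and does not represent the validation performed by the product UI.

```python
MAX_RETAINED_INSTANCES = 24  # documented maximum number of retained instances


def validate_mpit(instances_per_day, days):
    """Return the total retained instances, or raise if over the limit."""
    if instances_per_day < 1 or days < 1:
        raise ValueError("instances per day and days must be at least 1")
    total = instances_per_day * days
    if total > MAX_RETAINED_INSTANCES:
        raise ValueError(
            f"{total} instances exceeds the maximum of {MAX_RETAINED_INSTANCES}"
        )
    return total


print(validate_mpit(8, 3))   # 8 instances per day for 3 days
print(validate_mpit(24, 1))  # 24 instances per day for 1 day
```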
During replication, vSphere Replication replicates all aspects of the VM to the target site,
including potential viruses and corrupted applications. A VM might suffer from a virus or
corruption and you might have configured vSphere Replication to keep MPIT instances
(snapshots). You can recover and revert the VM to a snapshot that is in a good, uncorrupt state.
If you try to retain more snapshots than your RPO allows, vSphere Replication retains all
snapshots up to the configured number of days. For example, if the MPIT retention policy is
configured to retain three snapshots during a one-day period, vSphere Replication retains only one
of the replica snapshots every eight hours. If an eight-hour RPO is configured for replication,
vSphere Replication might retain every replication during the day because approximately three
replicas are made during that day.
Beyond the retention policy, the most recent complete snapshot is always retained to provide the
most up-to-date data available for failover. This most recent complete point-in-time (PIT) instance
is always used for failover. You cannot select an earlier point in time for failover. At the time of
failover, the replicated VM disk (VMDK) file is attached to the replicated VM, and the VM is
powered on. After failing over, you can open the snapshot manager for that VM and select from
the retained historical PIT instances, such as other snapshots.
An administrator chooses how many instances to retain daily, and for how many days. To keep the
snapshot tree manageable and within supported limits, the maximum number of retained points in
time is 24. If you try to retain a number of days or instances that would put the number of
snapshots over 24, you are blocked.
Many large snapshots of a VM considerably increase the amount of time to commit a snapshot or
revert to a snapshot after a failover occurs. Therefore, use the MPIT retention policy only if
necessary. Additional storage will also be required on the target datastore to accommodate the
snapshot tree.
After a successful recovery, vSphere Replication presents the retained instances as ordinary VM
snapshots. Use vSphere Web Client to view these snapshots. Select one of these snapshots to
revert the VM to a specific point in time. vSphere Replication does not preserve the memory state
when you revert to a snapshot.
Snapshots are named and labeled with a description based on when the replica was created. Up to
24 snapshots can be preserved as points in time. Failover is always to the most recent replica, not
to earlier points in time.
For VMs with high levels of storage I/O, quiescing of the file system and applications can take
several minutes and affect VM performance. When quiescing a file system and applications for
Windows VMs, vSphere Replication requires a regular VM snapshot before replication. When you
estimate the RPO time, consider the time and resource consumption for the quiescing and for the
consolidation of the snapshots. For example, if you configure a replication of a Windows VM with
an RPO of 15 minutes and quiescing is enabled, vSphere Replication generates a VM snapshot
and consolidates it every 15 minutes.
vSphere Replication can be configured to compress the data that it transfers through the network.
Compressing the replication data that is transferred through the network saves network bandwidth
and might help reduce the amount of buffer memory used on the vSphere Replication server.
However, compressing and decompressing data requires more CPU resources on both the source
site and the server that manages the target datastore.
vSphere Replication supports end-to-end compression when the source and target ESXi hosts are
version 6.0 or later. The support of data compression for all other use cases depends on the
You might want to disable replication for a period of time or as a troubleshooting step. You must
determine whether a seed disk was used when the replication was first configured if you want to
retain a copy of the disk on the recovery site. If a seed was not originally used when replication
was first configured, vSphere Replication will attempt to delete all the replica files on the target
datastore. You can prevent this deletion by renaming the folder on the target datastore. When
vSphere Replication attempts to delete the files, the file path will no longer be valid and the
operation will fail, but the replication will be disabled. You must then manually clean up files
related to vSphere Replication in the target datastore.
Protection groups are a way of grouping VMs that would be recovered together. In many cases, a
protection group consists of the VMs that support a service or application such as email or an
accounting system. For example, an application might consist of a two-server database cluster,
three application servers, and four web servers. In most cases, it would not be beneficial to fail
over only part of this application, such as two or three of the VMs in the example. Therefore, all nine VMs
would be included in a single protection group.
Creating a protection group for each application or service has the benefit of selective testing.
Having a protection group for each application enables non-disruptive, low-risk testing of
individual applications, allowing application owners to non-disruptively test disaster recovery
plans as needed.
A protection group contains VMs with data replicated by either array-based replication or vSphere
Replication. Before a protection group can be created, replication must be configured.
The minimum required privilege for the user creating the protection group is Site Recovery
Manager > Protection Group > Create. The role named SRM Protection Groups Administrator
has this privilege.
If you created site-wide inventory mappings, they can be used to configure resource mappings for
all members of a protection group. If you do not use site-wide inventory mappings, you configure
mappings for individual VMs in a protection group.
You configure VMs and create protection groups differently depending on whether you use array-
based replication, vSphere Replication, or storage policy protection.
You cannot create protection groups that combine VMs using array-based replication with VMs
using vSphere Replication or storage policy protection. You can include a combination of array-
For array-based replication, Site Recovery Manager computes and creates datastore groups to
collect all files that are associated with protected VMs. A datastore group is the smallest unit of
storage that can be failed over or tested independently. You associate these datastore groups with
protection groups.
With array-based replication, all of the VMs and templates on the datastores in the datastore group
of the protection group are recovered together. The protection group initially contains only those
VMs that store all files on one of the datastore groups that are associated with the protection
group.
If a VMFS datastore is extended to cover two LUNs, and both of those LUNs are being
replicated, Site Recovery Manager includes both LUNs in the datastore group.
If a VM has files spread across two VMFS datastores that are on two replicated LUNs, Site
Recovery Manager includes both LUNs in the datastore group.
This process is automatic. You have control over only two things:
Ensuring that all the LUNs used by a VMFS datastore are replicated
Ensuring that all files for VMs that you want to protect are stored only on replicated LUNs
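The grouping rules above can be sketched as a connected-components problem. This is not Site Recovery Manager's actual implementation. It is a minimal union-find illustration that assumes two LUNs belong to the same datastore group when one datastore spans both, or when one VM has files on datastores backed by both. All names and the input layout are hypothetical.

```python
from typing import Dict, List, Set


def compute_datastore_groups(
    datastore_luns: Dict[str, Set[str]],
    vm_datastores: Dict[str, Set[str]],
) -> List[List[str]]:
    """Group LUNs that must fail over together (illustrative model only)."""
    parent: Dict[str, str] = {}

    def find(lun: str) -> str:
        parent.setdefault(lun, lun)
        while parent[lun] != lun:
            parent[lun] = parent[parent[lun]]  # path compression
            lun = parent[lun]
        return lun

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Rule 1: a datastore extended across several LUNs ties those LUNs together.
    for luns in datastore_luns.values():
        ordered = sorted(luns)
        for lun in ordered:
            find(lun)
        for other in ordered[1:]:
            union(ordered[0], other)

    # Rule 2: a VM with files on several datastores ties their LUNs together.
    for datastores in vm_datastores.values():
        ordered = sorted(lun for ds in datastores for lun in datastore_luns[ds])
        for other in ordered[1:]:
            union(ordered[0], other)

    groups: Dict[str, Set[str]] = {}
    for lun in list(parent):
        groups.setdefault(find(lun), set()).add(lun)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    datastores = {"ds1": {"lun1", "lun2"}, "ds2": {"lun3"}, "ds3": {"lun4"}, "ds4": {"lun5"}}
    vms = {"vm-a": {"ds2", "ds3"}, "vm-b": {"ds4"}}
    print(compute_datastore_groups(datastores, vms))
```

In this example, ds1 spans lun1 and lun2, and vm-a has files on ds2 and ds3, so the sketch yields three groups: lun1 with lun2, lun3 with lun4, and lun5 alone.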
Protection groups for vSphere Replication are different from array-based protection groups. When
you create a vSphere Replication protection group, you add the VMs that you configured for
vSphere Replication to the protection group.
Storage for VMs in vSphere Replication protection groups can be of the following types:
Direct-attached
NFS
iSCSI
Fibre Channel
vSAN
VM storage policies are an enhancement of VM storage profiles. They ensure that VMs are placed
on storage that guarantees a specific level of capacity, performance, availability, redundancy, and
so on.
When you define a storage policy, you specify storage requirements for applications that would
run on VMs. After you apply this storage policy to a VM, the VM is placed on a specific datastore
that can satisfy the storage requirements.
Datastores can be referenced by tags. Rules based on tags reference datastore tags that you
associate with specific datastores. You can apply more than one tag to a datastore.
Typically, tags serve the following purposes:
Attach a broad storage-level definition to datastores that are not represented by storage
providers, for example, VMFS and NFS datastores.
Encode policy-relevant information that is not advertised through vSphere API for Storage
Awareness, such as geographical location or administrative group.
Storage policy protection groups (SPPGs) enable the automatic protection of VMs that are
associated with a storage policy.
If a datastore is tagged and a storage policy is created that maps to tag categories containing that
tag, then the datastore is automatically associated with that storage policy. An SPPG that includes
that storage policy automatically protects VMs that reside on the datastore.
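The chain from tag to storage policy to protected VM can be sketched as follows. This is a simplified model, not the Site Recovery Manager API: it assumes a storage policy is represented by a set of tags, that a datastore matches when it carries at least one of them, and that each VM resides on a single datastore. All names are hypothetical.

```python
from typing import Dict, List, Set


def datastores_for_policy(
    policy_tags: Set[str], datastore_tags: Dict[str, Set[str]]
) -> Set[str]:
    """Datastores that carry at least one tag referenced by the storage policy."""
    return {ds for ds, tags in datastore_tags.items() if tags & policy_tags}


def protected_vms(
    policy_tags: Set[str],
    datastore_tags: Dict[str, Set[str]],
    vm_datastore: Dict[str, str],
) -> List[str]:
    """VMs that reside on a datastore associated with the storage policy."""
    matching = datastores_for_policy(policy_tags, datastore_tags)
    return sorted(vm for vm, ds in vm_datastore.items() if ds in matching)


if __name__ == "__main__":
    tags = {"ds-gold": {"replicated"}, "ds-local": {"scratch"}}
    placement = {"db01": "ds-gold", "web01": "ds-gold", "test01": "ds-local"}
    print(protected_vms({"replicated"}, tags, placement))
```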
Before creating an SPPG, you must ensure that the environment meets the following prerequisites:
Configure network mappings. If you use SPPGs and you do not configure network mappings,
testing a recovery plan uses an automatically created isolated network and succeeds with a
warning, but planned migration and disaster recovery fail.
Site Recovery Manager automatically recognizes if VMs are connected to port groups that are
backed by NSX universal logical switches and does not require manual network mapping between
the protected and recovery locations. Site Recovery Manager intelligently understands that it is
logically the same network on both sites and therefore automatically links the protected and
recovery networks. You can override auto-mapping by manually configuring network mappings
on stretched networks.
SPPGs support only array-based replication. vSphere Replication is supported only through legacy
VM-based protection groups.
SPPGs try to protect VMs associated with their storage policies only during policy association and
server startup. Site Recovery Manager does not try to periodically protect VMs that are already
associated with a policy.
For more information on the limitations of SPPGs, see Site Recovery Manager Administration at
https://docs.vmware.com/en/Site-Recovery-Manager/.
A protection group for each datastore group should be created. The protection group information
in an array-based protection group includes the following:
Array pair
Datastore group
New VMs created on the datastores in the LUN group are displayed in the protected VMs list, but
you must still configure protection for each new VM in the Site Recovery Manager UI.
Only VMs that you configured for vSphere Replication and that are not already in a protection
group appear in the list of eligible VMs.
Placeholder VMs are created in the vCenter Server inventory to indicate that VMs are protected.
However, when using SPPGs, placeholder datastores are not required. Placeholder VMs are not
created or required for this type of protection group.
If Site Recovery Manager cannot map a VM to a folder, network, or resource pool on the recovery
site, the VM status is set to Mapping Missing. A placeholder is not created for the VM.
If placeholder VMs are removed from the inventory, the protected VMs are no longer protected.
Add VMs to an array-based protection group by creating the VMs on a datastore from the
datastore groups associated with the protection group. You can also add VMs to the protection
group by using vSphere Storage vMotion. vSphere Storage vMotion moves VMs onto one of the
datastores that belong to the datastore groups that are associated with the protection group.
When you change the configuration of a VM, such as adding a device on a different datastore,
protection is not guaranteed. You might need to reconfigure protection. You can only add or
remove VMs in an SPPG if recovery has never been run on that protection group. You cannot
modify the settings of a VM that you protect in an SPPG.
If the status of a protection group is Not Configured, site-wide inventory mappings can be used to
configure protection for all unconfigured VMs in a single step.
Editing mappings for protected VMs individually enables the use of different resources on the
recovery site for different VMs.
SPPGs cannot be edited after their initial creation.
A recovery plan includes one or more protection groups. You can include a protection group in
more than one recovery plan. For example, you can create one recovery plan to handle a planned
migration of services from the protected site to the recovery site. You can have another recovery
plan to handle an unplanned event such as a power failure or natural disaster. Having these
different recovery plans enables you to decide how to perform recovery.
Having multiple recovery plans is desirable. One plan should always be for a total site failure.
Other plans might cover smaller problems, such as the failure of a storage array at the protected
site. Proper names and descriptions for protection groups help in the selection process.
Where possible, reuse protection groups across different recovery plans. For example, a total-site
recovery plan can probably include all your protection groups.
Protected sites and recovery sites are accessible at different gateway router addresses on the
Internet. Local networking might need to be configured differently as well.
When your VMs are running on the recovery site, you might need to change their network
configuration. You must start by considering what the current network infrastructure of the
recovery site is. Your VMs might require new gateway addresses and new DNS assignments. If
any of these VMs are firewall VMs, they might require a new rule set.
If your recovery site is to work only as a disaster recovery site, your job is much easier. You might
be able to configure many of the network pieces as they were in the protected site.
In addition to considering things internal to the network infrastructure of the recovery site, you
must also plan how to direct your production traffic to the recovery site. Various methods are
available, but all require planning. The most important consideration is that whatever you are
using as a router or a gateway between the two sites must be outside both sites. Otherwise, if
switches are located at the protected site and this site stops responding, your switches go down
with it.
During a recovery plan test, the recovered VMs are connected to a test network. The test network
can be automatically generated by Site Recovery Manager. You can also specify a data center
network to use as the test network. This isolated test network is managed by its own virtual switch.
In most cases, recovered VMs use this network without having to change network properties like
IP address, gateway, and so on. Site Recovery Manager defaults to an isolated test network so that
an accidental test recovery does not affect production.
VMs that must interact with one another should be failed over to the same test network. For
example, if a web server accesses information on a database, those VMs should fail over together
to the same network. This step enables testing of the functionality of the failed over VMs.
You might wish to override site-level mappings for certain recovery plans. For example, you
might have many VMs connected to a single port group in the protected site. You might have site-
level mappings to use the default auto-generated switch for test recovery. However, you might
create a recovery plan for a critical subset of VMs which must be connected to a usable network in
the recovery site for application testing during test failover. You can configure a recovery that
By ensuring that VMs are stored in a logical fashion on disk according to their protection group,
you minimize the shuffling of VMs to fit optimal layouts for Site Recovery Manager. VM disk
files with a similar priority or files belonging to the same protection group should be stored in the
same datastores. Storing these files on the same datastores minimizes the amount of replication
required to create efficient protection groups and therefore recovery plans. Ensuring that your
storage layout and VM placement are organized this way mitigates many issues.
Site Recovery Manager provides the following specific controls regarding how VMs are
recovered:
Priority groups.
VM dependencies.
Operations before power on: Run a command in the guest OS when it is powered on during a
recovery plan test or disaster recovery event.
To control the startup sequence of VMs, Site Recovery Manager provides the ability to use VM
recovery priority groups and dependencies. The default priority level of each VM is level 3. Group
1 VMs are started first, then Group 2, and so on. If dependencies exist between VMs in the same
priority group, Site Recovery Manager first powers on the VMs on which other VMs depend.
Startup sequence configuration is done as a property of the VM itself. Because the configuration is
a VM property, you have very specific control without trading off the requirement for fast
recovery offered by parallel execution and without requiring multiple recovery plans.
The shutdown sequence for recovery or migration is the reverse of the startup sequence.
Five priority groups are available in a recovery plan. The VMs in group 1 are started first. The
VMs in group 5 are started last. The default priority group is 3. All the VMs within each priority
group start in parallel.
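The ordering rules for priority groups can be sketched as follows. This is an illustration only, assuming each VM is represented by a name and an optional priority from 1 to 5. The function names are hypothetical.

```python
from typing import Dict, List, Optional

DEFAULT_PRIORITY = 3  # default priority group stated in the text


def startup_batches(vm_priorities: Dict[str, Optional[int]]) -> List[List[str]]:
    """Return VMs grouped into batches, ordered from priority group 1 to 5.

    VMs in the same batch start in parallel. A priority of None means the
    default group is used.
    """
    batches: Dict[int, List[str]] = {level: [] for level in range(1, 6)}
    for vm, priority in vm_priorities.items():
        level = DEFAULT_PRIORITY if priority is None else priority
        if level not in batches:
            raise ValueError(f"{vm}: priority must be between 1 and 5")
        batches[level].append(vm)
    return [sorted(batches[level]) for level in range(1, 6) if batches[level]]


def shutdown_batches(vm_priorities: Dict[str, Optional[int]]) -> List[List[str]]:
    """Shutdown order is the reverse of the startup order."""
    return list(reversed(startup_batches(vm_priorities)))


if __name__ == "__main__":
    plan = {"db01": 1, "app01": 2, "web01": None, "web02": None, "batch01": 5}
    print(startup_batches(plan))
    print(shutdown_batches(plan))
```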
If a VM that is eligible for stretched storage migration has a lower priority than a VM that is not
eligible for stretched storage migration, the eligible VM will not be migrated.
Configuring VM dependencies enables a user to control the sequence for starting VMs in a
priority group. Dependencies enable you to set a property for a VM that requires one or more other
VMs to be up and running before the VM starts.
The VM dependency option enables you to have a single recovery plan that contains five priority
groups. In each priority group, dependencies are created to control the start sequence of the VMs
in that group. VMs in a priority group start in parallel unless VM dependencies, such as startup
order, are defined.
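Within one priority group, dependencies turn a single parallel start into a series of waves. The sketch below shows one way to derive those waves with a topological sort. It is not how Site Recovery Manager is implemented, and the data layout (a mapping from each VM to the VMs it depends on) is an assumption for this example.

```python
from typing import Dict, List, Set


def startup_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Order the VMs of one priority group into waves.

    Each wave contains VMs whose dependencies have all started, so the VMs
    in a wave can start in parallel.
    """
    remaining = {vm: set(deps) for vm, deps in dependencies.items()}
    for vm, deps in remaining.items():
        unknown = deps - set(remaining)
        if unknown:
            raise ValueError(f"{vm} depends on unknown VMs: {sorted(unknown)}")

    waves: List[List[str]] = []
    while remaining:
        ready = sorted(vm for vm, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("circular dependency between VMs")
        waves.append(ready)
        for vm in ready:
            del remaining[vm]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


if __name__ == "__main__":
    group = {"db01": set(), "cache01": set(), "app01": {"db01"}, "web01": {"app01"}}
    print(startup_waves(group))
```

With no dependencies defined, the function returns a single wave, which corresponds to all VMs in the priority group starting in parallel.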
When considering the use of VM dependencies, be aware that the recovery time objective (RTO)
might be affected, because dependent VMs start in sequence instead of in parallel. As an example,
you have ten VMs that start in parallel and the RTO is 5 minutes. Adding a single dependency might
add 2 minutes if one VM waits for the VM on which it depends to complete booting. If every VM
depends on another VM, the effect is a sequential boot order. Your 5-minute RTO might need to be
adjusted to 25 minutes.
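The effect of dependencies on startup time can be estimated as the longest dependency chain. The following sketch uses hypothetical boot times of 2 minutes per VM and does not reproduce the 5-minute and 25-minute figures above, which include other recovery steps. The function name and inputs are assumptions.

```python
from typing import Dict, List, Tuple


def estimated_boot_minutes(
    boot_minutes: Dict[str, float], dependencies: Dict[str, List[str]]
) -> float:
    """Time until the last VM finishes booting.

    A VM starts only after every VM it depends on has finished booting.
    VMs without dependencies start immediately and in parallel.
    """
    finish: Dict[str, float] = {}

    def finish_time(vm: str, visiting: Tuple[str, ...] = ()) -> float:
        if vm in finish:
            return finish[vm]
        if vm in visiting:
            raise ValueError("circular dependency between VMs")
        deps = dependencies.get(vm, [])
        start = max((finish_time(d, visiting + (vm,)) for d in deps), default=0.0)
        finish[vm] = start + boot_minutes[vm]
        return finish[vm]

    return max((finish_time(vm) for vm in boot_minutes), default=0.0)


if __name__ == "__main__":
    vms = {f"vm{i}": 2.0 for i in range(10)}  # hypothetical boot times
    chain = {f"vm{i}": [f"vm{i - 1}"] for i in range(1, 10)}
    print(estimated_boot_minutes(vms, {}))     # all ten VMs boot in parallel
    print(estimated_boot_minutes(vms, chain))  # every VM waits for the previous one
```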
Shutdown actions are used to power off VMs at the protected site only during recovery plan
execution.
You can configure a timeout value for VM shutdown to complete. VMware Tools must be
installed to begin the shutdown. If you are executing a recovery plan, that is, disaster recovery
mode, the VM is powered off if the shutdown operation fails.
Startup actions are used to power on VMs at the recovery site during test and execution of
recovery plans. When a VM is powered on, you can select the option to wait for heartbeats from
VMware Tools for a defined period of time. You can also add an additional delay before running post-
power-on steps and starting dependent VMs. These options can increase the RTO because they
might cause unnecessary delay if the VM has a permanent error preventing VMware Tools from
starting. Take care that these settings do not adversely affect your RTO.
When a VM is powered on during a recovery plan, you can configure commands to run either on
the recovered VM or on the Site Recovery Manager server host. You can also create an action that
pauses the execution of the recovery and interactively prompts the user for input before allowing
the recovery plan to continue execution.
Site Recovery Manager can avoid any downtime for eligible VMs by performing a cross-vCenter
Server migration of the VM during planned migration. The VM must meet certain prerequisites to
enable this functionality.
During planned migrations, you must select the Enable vMotion of eligible VMs option during
the recovery plan execution. Only VMs that have been enabled for vMotion will be migrated. If a
VM that is eligible for stretched storage migration has a lower priority than, or a dependency on, a
VM that is not eligible for stretched storage migration, the eligible VM will not be migrated.
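The eligibility rule can be sketched as a filter over the VMs in a recovery plan. This is an illustrative model, not product code. It assumes that a lower priority means a higher priority group number and that only direct dependencies are considered. The `PlanVm` class and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PlanVm:
    name: str
    priority: int                      # 1 (highest) to 5 (lowest)
    vmotion_eligible: bool
    depends_on: Tuple[str, ...] = ()   # names of VMs this VM depends on


def vms_to_vmotion(vms: List[PlanVm]) -> List[str]:
    """Names of the VMs that would be migrated with vMotion under the rule above."""
    by_name = {vm.name: vm for vm in vms}
    ineligible_priorities = [vm.priority for vm in vms if not vm.vmotion_eligible]
    highest_ineligible = min(ineligible_priorities, default=None)

    selected = []
    for vm in vms:
        if not vm.vmotion_eligible:
            continue
        # Lower priority than some ineligible VM: not migrated.
        if highest_ineligible is not None and vm.priority > highest_ineligible:
            continue
        # Depends on an ineligible VM: not migrated.
        if any(not by_name[dep].vmotion_eligible for dep in vm.depends_on):
            continue
        selected.append(vm.name)
    return sorted(selected)


if __name__ == "__main__":
    plan = [
        PlanVm("db01", 1, True),
        PlanVm("app01", 2, False),
        PlanVm("cache01", 2, True, depends_on=("app01",)),
        PlanVm("web01", 3, True),
    ]
    print(vms_to_vmotion(plan))
```

In this example only db01 is selected: cache01 depends on the ineligible app01, and web01 has a lower priority than app01.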
The Auto option will use the recovery site's recovery.useIpMapperAutomatically advanced
setting to determine if the IP Subnet Mapping rule should be evaluated and applied to the VM
during recovery. This setting is true by default. If the value is set to false, the IP Subnet Mapping
rule will not be applied to any VMs with the Auto option enabled for IP customization.
The Use IP customization rules if applicable option will force Site Recovery Manager at the
recovery site to evaluate and apply the IP Subnet Mapping rule for the network to which the VM
is connected, regardless of the value of the recovery.useIpMapperAutomatically setting on the
recovery site.
The Manual IP customization option allows you to configure individual rules for the VM when
subnet mappings are not available for the network port group or if you want to override the subnet
mapping for a particular VM. You must enter IP settings that will be applied to the VM for each
site, which allows the original settings to be reapplied to the VM if it is returned to the original site
after a recovery event.
The No IP Customization option disables IP Customization for the chosen VM.
VMs with manually-defined IP customization are not subject to the IP Mapping Rule evaluation
during recovery. Manually-specified IP configuration takes precedence over IP mapping rules.
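The four IP customization options and their precedence can be summarized as a small decision function. This is a sketch of the behavior described above, not Site Recovery Manager code. The enum, the function, and the returned labels are hypothetical. Only the recovery.useIpMapperAutomatically setting name comes from the text.

```python
from enum import Enum


class IpCustomization(Enum):
    AUTO = "auto"
    USE_RULES_IF_APPLICABLE = "use-rules"
    MANUAL = "manual"
    NONE = "none"


def resolve_ip_action(
    mode: IpCustomization,
    use_ip_mapper_automatically: bool = True,  # recovery.useIpMapperAutomatically
    subnet_rule_exists: bool = True,
) -> str:
    """Return which customization is applied to a VM during recovery.

    Possible results (labels chosen for this example):
    "manual", "subnet-rule", or "none".
    """
    if mode is IpCustomization.MANUAL:
        # Manually specified settings take precedence over IP mapping rules.
        return "manual"
    if mode is IpCustomization.NONE:
        return "none"
    if mode is IpCustomization.AUTO and not use_ip_mapper_automatically:
        # Auto defers to the advanced setting on the recovery site.
        return "none"
    # AUTO with the setting enabled, or USE_RULES_IF_APPLICABLE in any case.
    return "subnet-rule" if subnet_rule_exists else "none"


if __name__ == "__main__":
    print(resolve_ip_action(IpCustomization.AUTO, use_ip_mapper_automatically=False))
    print(resolve_ip_action(IpCustomization.USE_RULES_IF_APPLICABLE,
                            use_ip_mapper_automatically=False))
```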
You configure IP settings for both the protected and recovery sites. This configuration allows a
VM to be failed back to the original protected site following recovery and have the original IP
settings reapplied in the protected site.
IP settings that can be configured include IPv4, IPv6, and DNS settings.
When configuring the IP settings for the protected site, you can click the RETRIEVE button to
query the VM for its current settings and use the retrieved settings for the protected site. VMware
Tools must be installed and running in the VM to use the RETRIEVE option. You should verify
all retrieved settings for accuracy and manually correct or enter any missing values.
Site Recovery Manager uses the Guest Operations API and requires functioning VMware Tools in
the recovered VM and the vgauth service to be enabled or running. The vgauth service is installed
as part of most VMware Tools distributions.
To view the steps performed during a particular workflow, click the Recovery Steps tab on the
recovery plan overview page in the Site Recovery client.
An example of a command that would run on a VM is a command that changes the network
settings on a VM. If your VM has a Linux-based guest OS, you might run a script with commands
to configure network settings.
The Site Recovery Manager server also provides a small set of commands that can be run.
Continuing with the example of updating DNS, use the dns_update command to update a
single DNS entry. The dns_update command is intended for use by Site Recovery Manager
administrators. These administrators must be familiar with DNS servers and DNS records, as well
as how hosts on the recovery network use DNS for name resolution.
Suspending VMs is useful in an active-active data center where noncritical workloads are run at
recovery sites. Allowing Site Recovery Manager to suspend VMs hosting noncritical workloads
frees up CPU and memory resources for the recovered VMs.
You might want to configure VMs to be suspended at both sites. However, a recovery plan can identify only VMs that
are suspended at the recovery site. To configure VMs to suspend at both sites, you must recover
your protected VMs to the opposite site and reprotect before adding VMs to suspend at the new
recovery site. After this configuration is established, the plan resumes VMs at one site and
suspends them at the other with each recovery.
Planned migration provides better support for data center migrations by ensuring no loss of data
during the migration process and recovery of application-consistent VMs at the recovery site.
During planned migration, protection groups and their VMs are temporarily displayed in a
partially recovered state.
Site Recovery Manager supports active-active stretched storage between protected and recovery
sites by using an orchestrated cross-vCenter Server vMotion to perform planned migrations,
eliminating service downtime. Disaster recovery and test recovery continue to use the existing
LUN-based recovery functionality.
A planned migration using orchestrated vMotion provides the ability to proactively migrate VMs
from a vCenter Server system to another vCenter Server system to carry out maintenance on one
site or to mitigate a possible disaster without incurring downtime for the VM application
workload.
A planned migration using orchestrated vMotion is only available when using stretched storage,
storage policy protection groups (SPPGs), and vCenter Servers in Enhanced Linked mode with
stretched networks, with the assumption that the IP addresses of the VMs do not change.
vSphere vMotion in planned migration mode can be overridden for individual VMs or for the
recovery plan (if vSphere vMotion would take too long). Along with VMs that are not compatible
The Site Recovery Manager test environment provides a perfect location for conducting OS and
application upgrades as well as patch testing. Test environments are complete copies of production
environments configured in an isolated network segment, which ensures that testing is as realistic
as possible and does not affect production workloads or replication.
Running a workflow as a disaster recovery event attempts synchronization and final protection but
continues to run regardless of errors with synchronization, VM shutdown, and so on. This
approach is designed to give the quickest recovery time during a crisis.
Running a workflow in planned migration mode executes the same workflow as the disaster
recovery event with one major difference, which is that the workflow pauses if errors are
encountered, enabling you to rectify the situation. For example, if a VM does not shut down
correctly, a planned migration pauses the workflow and enables you to manually power off the
VM. After you power off the VM, the workflow continues.
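The difference between the two modes can be sketched as an error-handling policy over the same list of steps. This is a simplified model, not the product's workflow engine. The step names, the `WorkflowResult` type, and the `run_workflow` function are assumptions for illustration.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class WorkflowResult(NamedTuple):
    completed: List[str]
    errors: List[str]
    paused_at: Optional[str]


def run_workflow(
    steps: List[Tuple[str, Callable[[], None]]], planned_migration: bool
) -> WorkflowResult:
    """Run the steps in order.

    In planned migration mode the workflow pauses at the first failing step.
    In disaster recovery mode the error is recorded and the workflow continues.
    """
    completed: List[str] = []
    errors: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            if planned_migration:
                return WorkflowResult(completed, errors, paused_at=name)
            continue
        completed.append(name)
    return WorkflowResult(completed, errors, paused_at=None)


def ok() -> None:
    """A step that succeeds."""


def failing_shutdown() -> None:
    """A step that fails, standing in for a VM that does not shut down."""
    raise RuntimeError("guest shutdown timed out")


STEPS = [
    ("synchronize storage", ok),
    ("shut down protected VMs", failing_shutdown),
    ("power on recovered VMs", ok),
]

if __name__ == "__main__":
    print(run_workflow(STEPS, planned_migration=False))
    print(run_workflow(STEPS, planned_migration=True))
```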
vSphere DPM and vSphere DRS are enabled on a recovery site cluster. After the recovery hosts
are powered on, Site Recovery Manager relies on vSphere DRS to manage the assignment of VMs
to hosts in the cluster.
After the recovery or test is complete, Site Recovery Manager reenables DPM for the cluster. But
hosts in the cluster are left in the running state so that DPM powers them down as needed. If these
services are not running, Site Recovery Manager registers VMs across available ESXi hosts to try
to distribute the potential load as evenly as possible. For example, if a protection group contains
ten VMs and the recovery site cluster contains five hosts, Site Recovery Manager registers two
VMs on each host. Site Recovery Manager registers the VMs to try to distribute the VMs during
recovery so that no single host is restarting all of them.
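The even distribution in this example can be approximated with a round-robin assignment. The text does not describe the actual placement algorithm, so the following is only an assumed model with hypothetical names.

```python
from typing import Dict, List


def distribute_vms(vms: List[str], hosts: List[str]) -> Dict[str, List[str]]:
    """Assign VMs to hosts in round-robin order so that the load is spread evenly."""
    if not hosts:
        raise ValueError("at least one host is required")
    placement: Dict[str, List[str]] = {host: [] for host in hosts}
    for index, vm in enumerate(vms):
        placement[hosts[index % len(hosts)]].append(vm)
    return placement


if __name__ == "__main__":
    vms = [f"vm{i:02d}" for i in range(1, 11)]   # ten VMs in the protection group
    hosts = [f"esxi{i}" for i in range(1, 6)]    # five hosts in the recovery cluster
    for host, assigned in distribute_vms(vms, hosts).items():
        print(host, assigned)
```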
vSphere Replication supports the protection of VMs with snapshots but you can only recover the
latest snapshot. Because vSphere Replication erases snapshot information in the recovered VM,
snapshots are no longer available after recovery.
Array-based replication supports the protection and recovery of VMs with snapshots. Although vSphere lets you
specify a custom location for storing snapshot delta files by setting the workingDir parameter
in the VM configuration file (.vmx), Site Recovery Manager does not support the use of the
workingDir parameter.
If you do not test recovery plans, an actual disaster recovery situation might not recover all VMs,
resulting in data loss.
Testing a recovery plan exercises nearly every aspect of a recovery plan, although Site Recovery
Manager makes several concessions to avoid disrupting ongoing operations on the protected and
recovery sites. Recovery plans that suspend local VMs do so for tests and for actual recoveries.
With this exception, running a test recovery does not disrupt replication or ongoing activities at
either site.
The Site Recovery Client offers several ways to initiate a recovery plan test. You can also initiate
a test through a call to the API.
To ensure complete data accuracy, all workflows attempt preliminary data synchronization before
running and potentially a final data synchronization after the protected VMs are shut down.
Workflows can be run more than once to attempt completion of recovery plans that might have
encountered an error.
When using SPPGs, test recovery is performed by using the regular test recovery workflow for
replicated devices, including stretched devices. vSphere vMotion compatibility checks are
performed for each VM on the stretched devices.
Select the Replicate recent changes to recovery site check box to ensure that the recovery site
has the latest copy of the protected VMs. Selecting this option might cause the synchronization
process to take more time.
Replication and protection of the protected environment are not affected during tests. Temporary
snapshots of replicated storage are created at the recovery site. For array-based replication, the
arrays are rescanned to discover replicated VMFS datastores.
When testing a recovery plan, you can choose an option to replicate recent changes, which is
enabled by default. Replicating recent changes provides the latest data for the testing process.
However, it also lengthens the time required to recover VMs in the recovery plan, because
replication needs to finish before the VMs are recovered.
Replication continues during the test of a recovery plan. Site Recovery Manager uses snapshots as
part of the recovery plan test process. These are either array snapshots (or clones) with array
replication or VM snapshots with vSphere Replication. This approach allows powering on and
modifying VMs recovered as part of the test while replication continues to avoid RPO violations.
When using SPPGs, test recovery is performed by using the regular test recovery workflow for
replicated devices, including stretched devices. If the array does not support creating read/write
Testing a recovery plan exercises nearly every aspect of a recovery plan and avoids disruption to
production site operations.
When you cancel a recovery plan test, you are warned that manual steps might be required. These
manual steps are used to verify that the cancel operation properly terminated all steps in the
recovery plan and restored the environment to its original state.
No recovered VMs associated with the recovery plan that was tested are left running.
Placeholder VMs are recreated on the recovery site. The placeholder VMs indicate that those
VMs are protected on the protected site and are instantiated on the recovery site when a
recovery plan is executed.
If you experience errors during the cleanup process, select the Force Cleanup check box to ignore
all errors and return the recovery plan to the ready state. If you use this option, you might need to
clean up your storage manually. You should also run another recovery plan test as soon as
possible.
The steps begin executing after you respond to the prompt that appears after a recovery plan test
finishes, or when you begin a cleanup operation.
The first step in running a recovery plan is the attempt to synchronize storage. Then, protected
VMs at the protected site are shut down. This effectively quiesces the VMs and commits final
changes to disk as the VMs complete the shutdown process. Storage is synchronized again to
replicate changes made during the shutdown of the VMs.
Replication is performed twice to minimize downtime and data loss.
When you start recovery plan execution, the Recovery dialog box warns that you are about to run
the recovery plan. Running the recovery plan results in changes to the protected VMs and to the
infrastructure of both the protected site and the recovery site data centers. Select the check box to
confirm that you understand the implications of running the recovery plan. You choose to run the
recovery plan as either a planned migration or as a disaster recovery.
If your array supports stretched storage, select the Enable vMotion of eligible VMs check box.
The Recovery dialog box provides a summary of the recovery plan information. This summary
includes the following information:
The connectivity status from the recovery site to the protected site
Both types of recovery plan execution try to synchronize storage twice during the recovery
process. Attempts to synchronize data are made to ensure application consistency. Data
synchronization executes as an early initial step in a recovery plan and again after an attempt
to shut down the protected VMs. The second synchronization attempts to ensure that data is recent
and synchronized after the VMs are quiescent.
In the case of planned migrations, a recovery stops replication after a final synchronization of the
source to the target. For disaster recoveries, VMs are restored to the most recent available state, as
determined by the recovery point objective (RPO). After the final replication is complete, Site
Recovery Manager makes changes at both sites, which requires significant time and effort to
reverse. Because of this time and effort, the privilege to test a recovery plan and the privilege to
run a recovery plan must be separately assigned.
After all the protected VMs are failed over and reported as powered on, you are ready to start
verifying that all application services restarted cleanly at the recovery site. After you complete the
verification of the failed-over application services at the recovery site, you are in a position to
report the successful failover. You report the successful failover to the business and enable the
Use forced recovery in cases where problems exist in communicating with the protected site, such
as sporadic network connectivity, a failed vCenter Server system, or storage failures. As a result,
protected VMs are unmanageable and cannot be shut down, powered off, or unregistered. In such
a case, the system state cannot be changed for extended periods. To resolve this situation, forced
recovery is used. Forced recovery does not complete the process of shutting down the VMs at the
protected site. Therefore, a split-brain scenario occurs, but the recovery might complete more
quickly.
Running forced recovery with array-based replication can affect the mirroring between the
protected and the recovery storage arrays. After you run forced recovery, you must check that
mirroring is set up correctly between the protected array and the recovery array before you can
perform further replication operations. If mirroring is not set up correctly, you must repair the
mirroring by using the storage array software.
When you enable forced recovery, outstanding changes on the protection site are not replicated to
the recovery site before the failover sequence begins. Replication of the changes occurs according
to the RPO period of the storage array. You might have the situation where a new VM or template
After enabling this option, the forced recovery option appears on the Confirmation Options page.
VMware recommends that you protect the new production site to some other site immediately
after a recovery. If the original production site is operational, protect the new production site by
using the original production site settings, effectively reversing the direction of protection.
Although protection is reestablished in the opposite direction by recreating all protection groups
and recovery plans, the process is time consuming and prone to error. To address this issue, Site
Recovery Manager provides an automated way to achieve reprotection.
The reprotect operation enables you to protect recovered VMs (after a recovery) back to the
original protected site, including reversing the direction of replication. The reprotect operation
uses the protection information that was established before a recovery to reverse the direction of
protection. Perform a reprotect operation after a recovery is complete.
Automatic reprotection of the environment is supported for protection groups that use
vSphere Replication or array-based replication, and for SPPGs.
Failback is a term for a collection of procedures that is used to restore the original configuration of
the protected site and the recovery site after a recovery. Configure and run a failback procedure
when you are ready to restore services to the protected site. When errors occur during a failback,
you must resolve those errors and repeat the failback until the process is completed successfully.
A failback procedure uses the reprotect workflow and the planned migration workflow. An
example of this is a disaster avoidance situation. The threat could be rising floodwaters from a
major storm and Site Recovery Manager is used to migrate VMs from the protected site to the
recovery site. Fortunately, the floodwater subsides before damage is done, leaving the protected
site unharmed. A recovery plan cannot be immediately failed back from the recovery site to the
original protected site. The recovery plan must first undergo a reprotect workflow. This operation
involves reversing replication and setting up the recovery plan to run in the opposite direction.
The original state depicts Site A as the protected site and Site B as the recovery site. Next, the
event happens as shown in example two. If Site A goes offline, Site Recovery Manager recovers the
protected VMs to Site B. After the recovery, the protected VMs from Site A start on Site B without
protection. At this time, Site A is abandoned and Site B is unprotected.
When Site A comes back online, a reprotect operation is run to protect the recovered VMs on Site
B as depicted in example 3. Site B becomes the new protected site and Site A becomes the new
recovery site. Site Recovery Manager reverses the direction of replication so that it runs from
Site B to Site A.
Reprotection is available only in noncatastrophic failures. The original vCenter Server systems,
ESXi hosts, Site Recovery Manager server hosts, and corresponding databases must be eventually
recoverable. Configure Site Recovery Manager so that the entire environment that is recovered can
be reprotected again back to the initial site.
When you rerun a recovery, operations that succeeded previously are skipped. For example,
successfully recovered VMs are not recovered again and continue running without interruption.
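The rerun behavior can be pictured as a loop that skips steps already marked as complete. The sketch below is illustrative only: `run_plan`, the step labels, and the `execute` callback are hypothetical names, not part of any Site Recovery Manager interface.

```python
from typing import Callable, Iterable


def run_plan(steps: Iterable[str], completed: set[str],
             execute: Callable[[str], bool]) -> list[str]:
    """Run only the steps that have not succeeded yet; return the steps attempted."""
    attempted = []
    for step in steps:
        if step in completed:
            continue  # succeeded in an earlier run, so it is not repeated
        attempted.append(step)
        if execute(step):
            completed.add(step)
    return attempted


steps = ["recover vm1", "recover vm2"]
done: set[str] = set()

# First run: the recovery of vm2 fails.
first = run_plan(steps, done, lambda step: step != "recover vm2")
# Second run: only the failed step is attempted again.
second = run_plan(steps, done, lambda step: True)
```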
For the original protected site to be deemed available, the vCenter Server instances, ESXi servers,
Site Recovery Manager server hosts, and corresponding databases must all be recoverable.
To unpair and recreate the pairing of protected and recovery sites, both sites must be available. If
you cannot restore the original protected site, you must reinstall Site Recovery Manager on the
protected and recovery sites.
When a user initiates the reprotect operation, Site Recovery Manager instructs the underlying
arrays and vSphere Replication to reverse the direction of replication.
In array-based replication, the reprotect operation uses the SRAs to communicate with the arrays
that are associated with the protection groups in the recovery plan. This communication ensures
that replication is established in the reverse direction. With array-based replication, you might
want to ensure that the reprotect has successfully reversed the direction of replication and that a
failback is successful. You must return to the Array Managers panel in vSphere Web Client and
examine the direction of replication for each relevant device.
When you perform reprotect on a recovery plan that includes an SPPG, the replication technology
that your storage arrays provide reverses the replication of all of the consistency groups that are
associated with the storage policies that the protection group contains.
After the replication is reversed, Site Recovery Manager creates placeholder VMs at the new
recovery site, that is, the original protected site. When creating placeholder VMs, Site Recovery
Manager uses the location of the original production VM to determine where to create placeholder
In the reprotect process using storage policy protection, Site Recovery Manager reverses the
direction of replication and protects the VMs that are associated with the relevant storage policies
on what was previously the recovery site. Site Recovery Manager reestablishes vSphere entity
protection and monitoring on the new protected site.
Reversing the replication of an SPPG is the same as reversing the replication of an array-based
replication protection group because it only affects the underlying storage.
If the storage arrays fail to reverse replication for consistency groups in the protection group, the
recovery plan goes into the Incomplete Reprotect state. In this state, you must resolve the storage
issues and run the reprotect operation again. Rerunning the reprotect operation on an SPPG only
affects the direction of replication of consistency groups for which a previous reprotect operation
did not complete successfully.
If reprotection fails, or succeeds partially, perform one of the following remedial actions to
complete the reprotection:
Incomplete Reprotect: If a reprotect operation fails to synchronize storage, ensure that sites
are connected, review the reprotection progress in the vSphere Web Client, and start the
reprotection task again.
If Site Recovery Manager fails to create placeholder VMs, recoveries are still possible.
Review the reprotect steps in vSphere Web Client, resolve open issues, and start the
reprotection task again.
Reprotect Interrupted: Ensure that both Site Recovery Manager server hosts are running and
start the reprotection task again.
Failback refers to the capability of running a recovery plan after an environment is migrated or
failed over to a recovery site. Failback returns the environment to its starting site. A failback is the
process of running the same recovery plan that was used to migrate the environment to the
recovery site in the first place.
Perform a reprotect operation. The recovery site becomes the protected site. The former
protected site becomes the recovery site.
Run a recovery plan to shut down the VMs on the protected site and start the VMs on the
recovery site.
To avoid interruptions in VM availability, you might run a test before you complete the
planned migration. If the test identifies errors, resolve them before you perform the planned
migration.
Perform a second reprotect to revert the protected and recovery sites to their original
configuration before the recovery.
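The failback sequence above (reprotect, run the recovery plan, reprotect again) can be modeled as transitions on a pair of site roles. This is a minimal sketch under stated assumptions: the `SitePair` class and the function names are hypothetical and only track which site holds which role.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePair:
    protected: str   # site that replication currently flows from
    recovery: str    # site that replication currently flows to
    running_at: str  # site where the VMs are powered on


def run_recovery_plan(pair: SitePair) -> SitePair:
    """Planned migration or recovery: the VMs start at the recovery site."""
    return SitePair(pair.protected, pair.recovery, running_at=pair.recovery)


def reprotect(pair: SitePair) -> SitePair:
    """Swap the site roles so that replication flows from where the VMs now run."""
    if pair.running_at != pair.recovery:
        raise ValueError("reprotect is only valid after a recovery")
    return SitePair(protected=pair.recovery, recovery=pair.protected,
                    running_at=pair.running_at)


def failback(pair: SitePair) -> SitePair:
    """Reprotect, run the plan in the reverse direction, then reprotect again."""
    return reprotect(run_recovery_plan(reprotect(pair)))


original = SitePair("Site A", "Site B", running_at="Site A")
after_failover = run_recovery_plan(original)
restored = failback(after_failover)
```

After the second reprotect, the model returns to the configuration it started from, which is the point of the final step in the list.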
Configure and run a failback when you are ready to restore services to the original protected site,
after you have brought it back online from an incident.
Planned migration and failback do not obviate the need for extensive testing of recovery plans.
Even after a successful failover, you have no guarantee that the reprotect and failback are
successful because the environment at the original site might have changed. Testing the failback
process is as important as testing the original failover.
You might have a situation where you fail back to a site whose equipment and configuration are
still in a good state after the disaster recovery event. However, if you suffer a total site loss,
additional steps must be followed before you fail back to that site. You must take steps to recreate
the environment at the lost site before beginning failback. If the equipment is completely replaced,
a reprotect and failback is not an option because the array pairs have changed and the protection
groups must be recreated.
Site Recovery Manager supports event logging. Each event includes a corresponding alarm that
Site Recovery Manager can trigger if the event occurs. This provides a way to track the health of
your system and to resolve potential issues before they affect the protection that Site Recovery
Manager provides.
In an environment with more than one vCenter Server, Site Recovery Manager displays all events
from the Site Recovery Manager Servers that are registered as extensions, even if events for a
specific vCenter Server are selected.
As a vCenter Server extension, Site Recovery Manager adds its own alarms to the alarms provided
by vCenter Server. None of the Site Recovery Manager alarms are configured by default to take
action. You must configure an alarm to enable actions for it.
Alarm events can be categorized according to functional areas:
Site Status
Recovery Events
SNMP Traps
Licensing Events
Permissions Events
To configure an alarm, create a new alarm definition and then add a specific Site Recovery
Manager event trigger to it.
During the creation process, enter the alarm name and description, and select the object to
monitor. The alarm is enabled by default. Next, select and add the specific event triggers.
Each event represents a single Site Recovery Manager instance and triggers an alarm for the
extension with which it is registered.
Add a condition that triggers the alarm, and select an argument from the drop-down list, the
operator, and the transition from warning to critical condition.
Finally, specify the actions to take when the alarm state changes, for example, sending an
email.
For alarms to send email notifications, configure the Mail settings in the vCenter Server Settings
menu.
Alarms exist on both the protected site and the recovery site. Configuring alarms on the correct
site is necessary. For example, configuring the Remote Site Down alarm to send an email
notification when the alarm triggers on the protected site alerts only when the recovery site is
down.
See Recommended Alarms for SRM Admins to Watch at
http://blogs.vmware.com/vsphere/2011/02/recommended-alarms-for-srm-admins-to-watch.html.
Site Recovery Manager generates a report for each recovery plan operation: test, cleanup,
recovery, and reprotect.
The report is accessible through the History tab. View the report by clicking View under the
Actions column. Individual reports, or the entire list, can be exported from the user interface.
History reports include who started the process as well as information about storage
synchronization.
Further detail is included in history reports, such as which protection groups and which devices,
hosts, and datastores contribute to the recovery plan that was executed. Duration detail for each
step is easily found for tweaking or audit purposes.
Reporting in traditional DR was time consuming, with thick binders and stacks of report
documents to manage. Site Recovery Manager saves time on reporting, especially when the
reports are mandatory a few times a month.
When workflows such as a recovery plan test and cleanup are performed in Site Recovery
Manager, history reports are automatically generated. These reports document items such as the
workflow name, execution times, successful operations, failures, and error messages. History
reports are useful for several reasons including internal auditing, proof of disaster recovery
protection for regulatory requirements, and troubleshooting. Reports can be exported to .html,
.xml, .csv, or a Microsoft Excel or Word document.
The default Site Recovery Manager settings should be adequate for the majority of users and
environments. However, there may be environments where timeout values may need to be
adjusted on the recommendation of a storage vendor, for example. Larger environments may also
require the adjustment of advanced settings to prevent issues when large numbers of virtual
machines are powering on simultaneously during recovery plan execution.
Advanced settings must be applied on each Site Recovery Manager server separately. It is good
practice to make the same modifications on both sites.
Advanced settings are not preserved when Site Recovery Manager is upgraded or redeployed with
an existing database.
You can configure logging levels for Site Recovery Manager components in the Advanced
Settings > Log Manager view in the Site Recovery client.
In most cases, the default logging levels should be adequate, but you may be requested by
VMware Support to increase the level for a task that is failing. For example, if IP Customization is
failing for a number of VMs, you may need to change the logManager.IPCustomizer logging level
to trivia on the recovery site Site Recovery Manager server.
To help identify the cause of any problems you encounter during the day-to-day running of Site
Recovery Manager, you might need to collect Site Recovery Manager log files to review or send
to VMware Support.
Site Recovery Manager creates several log files that contain information that can help VMware
Support diagnose problems. You can use the Site Recovery Manager log collector to simplify log
file collection.
The Site Recovery Manager Server and client use different log files.
The vRealize Operations Management Pack for Site Recovery Manager 8.2 allows administrators
to monitor the local Site Recovery Manager services in vRealize Operations Manager.
The vRealize Operations Management Pack for Site Recovery Manager provides capabilities for
monitoring the connectivity between Site Recovery Manager instances, the availability of a remote
Site Recovery Manager instance, and the status of protection groups and recovery plans in Site
Recovery Manager.
Alarms are generated when Site Recovery Manager server connectivity issues occur or protection
groups and recovery plans are in an error state. The UI provides statistics for the number of objects
related to Site Recovery Manager and how many of them have errors.
www.vmware.com/education
Task 1: Create a Protection Group for an Individual VM
Task 2: Create an Array-Based Replication Protection Group
Task 3: Edit a Protection Group
Task 4: (Optional) Add an ISO from a Nonreplicated Datastore to an ABR-Protected VM
Lab 8 Building Recovery Plans
Task 1: Create a Recovery Plan
Task 2: Configure VM Recovery Properties
Task 3: Add a Custom Step
Lab 9 Performing a Recovery Plan Test and Cleanup
Task 1: Test a Recovery Plan
Task 2: Monitor the Recovery Steps
Task 3: Verify the Recovery Site Inventory After the Test Recovery
Task 4: Clean Up the Recovery Plan Test
Lab 10 Performing a Planned Migration
Task 1: Execute a Recovery Plan Using the Planned Migration Option
Task 2: Verify Inventory Changes After Recovery
Lab 11 Reprotecting and Failing Back
Task 1: Perform a Reprotect Operation
Task 2: Fail Back the VMs to the Original Protected Site
Typographical Conventions
The steps in this lab are already performed on the recovery site vCenter Server.
In the BGInfo displayed on the desktop for this beta lab, the title of the course is
incorrect. The title will be corrected in the GA version of the course.
2. On the student desktop, open a Chrome browser window to vSphere Client (SA-VCSA-
01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
b. If the Open vmware-cip-launcher.exe? message box appears, select the Always open
these types of links in the associated app check box and click Open vmware-cip-
launcher.exe.
2. In the left pane, click the VMs and Templates icon.
3. Expand sa-vcsa-01.vclass.local.
a. Verify that the following folders are present:
• Internal_VMs containing four VMs
• Retail_Services containing three VMs
4. In the left pane, click the Datastores icon.
5. Expand sa-vcsa-01.vclass.local and confirm that the following datastores are accessible:
• Local-SA-ESXi-09
• SA-Internal-DS-01
• SA-MGMT-DS-01
• SA-Placeholder
• SA-Retail-DS-01
• SA-Retail-DS-02
• Unity-DS
• SB-MGMT-DS-01
• SB-Placeholder
• SB-Unity-DS
• SB-VR-Target-DS-01
6. If any datastores are not visible or are inaccessible, you must perform a rescan of the
storage on each cluster.
a. In the left pane, click the Hosts and Clusters icon.
b. Right-click each cluster and select Storage > Rescan Storage.
c. Ensure that all check boxes are selected and click OK.
d. Repeat for each cluster in both vCenter Server inventories.
7. Select the Hosts and Clusters tab and select sb-vcsa-01.vclass.local > SB-DC-01 > SB-
MGMT-01.
8. Power on the sb-srm-01 VM.
9. Wait one minute and then power on sb-vr-01 VM.
10. Minimize vSphere Client.
7. Click Next.
8. In the Select a name and folder window, specify the values for the VM.
a. Virtual Machine name: Enter sa-srm-01.
b. Virtual Machine folder location: Select sa-vcsa-01.vclass.local > SA-DC-01 >
SA_MGMT_VMs.
9. In the Select a compute resource window, verify that SA-MGMT-01 is selected and click
NEXT.
10. Review the details and click NEXT.
11. Accept the license agreement and click NEXT.
12. In the Configuration window, select the 2 vCPU radio button and click NEXT.
13. In the Select storage window, select Thin Provision in the Select virtual disk format
drop-down menu.
a. Verify that the SA-MGMT-DS-01 datastore is selected.
b. Click NEXT.
14. In the Select networks window, configure the properties from the drop-down menus for
each option.
a. Network 1: Select VM Network.
b. IP allocation: Select Static - Manual.
c. IP protocol: Select IPv4.
d. Click NEXT.
15. In the Customize template window, configure the values for each option.
a. Enable SSHD: Leave the default (disabled).
b. Initial root password: Enter VMware1!.
c. Initial admin user password: Enter VMware1!.
d. NTP Servers: Enter 172.20.10.10.
e. Hostname: Leave blank.
f. Initial database password: Enter VMware1!.
g. File Integrity Flag: Leave the default (disabled).
h. Default Gateway: Enter 172.20.10.10.
i. Domain Name: Enter vclass.local.
j. Domain Search Path: Enter vclass.local.
k. Domain Name Servers: Enter 172.20.10.10.
l. Network 1 IP address: Enter 172.20.10.20.
m. Network 1 Netmask: Enter 255.255.255.0.
16. Click NEXT.
17. Review the settings to ensure that the disks are thin-provisioned and that Network 1 uses
the VM Network port group.
18. Click FINISH.
19. Wait for the Deploy OVF Template task to complete and then power on the sa-srm-01
VM.
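Before deploying with static addressing, it can help to confirm that the gateway entered in the Customize template window lies in the same subnet as the appliance address. The following is a minimal sketch that uses only the Python standard library `ipaddress` module. The helper name `gateway_in_subnet` is hypothetical and the check is not part of the OVF deployment wizard.

```python
import ipaddress


def gateway_in_subnet(ip: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway address belongs to the subnet of ip/netmask."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return ipaddress.ip_address(gateway) in interface.network


# Values from step 15 of this lab.
ok = gateway_in_subnet("172.20.10.20", "255.255.255.0", "172.20.10.10")
# A gateway from a different subnet fails the check.
bad = gateway_in_subnet("172.20.10.20", "255.255.255.0", "172.20.11.10")
```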
Lab 2 Configuring the Site Recovery
Manager Appliance
Q1. What is the host name?
1. The host name is sa-srm-01.vclass.local.
Q2. Did you enter the host name during the OVF deployment?
2. No. The field was left blank.
Q3. From where does Site Recovery Manager get its host name?
3. When the host name field is left blank, Site Recovery Manager performs a reverse lookup of its IP address and retrieves its host name.
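The reverse lookup described in the answer to Q3 can be reproduced from any machine that uses the same DNS server. This is a sketch of the idea, not the appliance's actual implementation: `resolve_hostname` is a hypothetical helper, and the `lookup` parameter exists only so that the function can be exercised without a live DNS server.

```python
import socket
from typing import Callable


def resolve_hostname(ip: str, configured: str = "",
                     lookup: Callable = socket.gethostbyaddr) -> str:
    """Return the configured host name, or fall back to a reverse DNS lookup."""
    if configured:
        return configured
    # socket.gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    hostname, _aliases, _addresses = lookup(ip)
    return hostname


# Stand-in resolver that mimics the PTR record expected in the lab DNS zone.
def fake_lookup(ip: str):
    return ("sa-srm-01.vclass.local", [], [ip])


name = resolve_hostname("172.20.10.20", lookup=fake_lookup)
```

With the default `lookup`, the call depends on a PTR record for the address being present in DNS, which is why the lab can leave the Hostname field blank.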
1. In the left pane of the Site Recovery Manager appliance management interface, select
Summary.
2. In the right pane, click CONFIGURE APPLIANCE.
3. In the 1 Platform Services Controller window, enter the credentials.
a. PSC host name: Enter sa-vcsa-01.vclass.local.
b. PSC port: Leave the default (443).
c. User name: Enter [email protected].
d. Password: Enter VMware1!.
4. Click NEXT.
5. In the Security Alert pop-up box, click CONNECT.
6. In the 2 vCenter Server window, verify that sa-vcsa-01.vclass.local is selected and click
NEXT.
7. In the Security Alert pop-up box, click CONNECT.
8. In the Name and extension window, enter the values for each option.
a. Site Name: Enter SiteA-SRM.
b. Administrator email: Enter [email protected].
c. Local host: Leave the default (sa-srm-01.vclass.local).
d. Extension ID: Leave the default (Default extension ID (com.vmware.vcDr)).
9. Click NEXT.
10. In the Ready to complete window, review the details and click FINISH.
11. Wait for the task to complete.
12. In the left pane, click Services.
Q1. What is the status of the srm-server service?
1. The appliance is configured and registered, so the srm-server service must be in a Started state.
3. In the Site Recovery pane, review the information about the vCenter Server instances in
the vCenter Single Sign-On domain with registered Site Recovery Manager extensions.
Q1. How many Site Recovery Manager instances are registered in the vCenter Single
Sign-On domain and what is their status?
1. Two instances of Site Recovery Manager are registered in the vCenter Single Sign-On domain: one registered to sa-vcsa-01.vclass.local and one registered to sb-vcsa-01.vclass.local. They must both show the status OK.
Lab 3 Configuring Inventory Mappings
and Placeholder Datastores
4. In the Resource Mappings pane, ensure that sa-vcsa-01.vclass.local is selected and click +
NEW.
5. Expand the inventories of both sites to view the resource pools of SA-Prod-Cluster and
SB-Prod-Cluster.
6. Select SA-Prod-Cluster in the left panel and SB-Prod-Cluster in the right panel to map
the protected site cluster to the recovery site cluster.
7. Click ADD MAPPINGS.
8. Repeat steps 6 and 7 for the resource pools:
a. Map Internal_Services on the protected and recovery sites.
b. Map Retail_Services on the protected and recovery sites.
c. Map SiteA_Test_RP on the protected and recovery sites.
9. Click NEXT.
10. In the Reverse mappings window, select all the mappings to create the reverse mappings
from the recovery site resources to the protected site resources.
2. Verify that sa-vcsa-01.vclass.local is selected and click + NEW.
3. In the Creation Mode window, select the Automatically prepare mappings for folders
with matching names radio button and click NEXT.
4. In the Recovery folders window, select the datacenter (DC) object in each site and click
ADD MAPPINGS.
Q1. How many mappings are automatically discovered?
1. Four mappings are automatically discovered. These mappings include the data center objects that you manually selected, which are folders in themselves, and three subfolders with matching names.
Q2. How many folders are not automatically mapped from the protected site
inventory?
2. One folder called SA_MGMT_VMs is not automatically mapped because the recovery site inventory does not include a folder of the same name.
b. Click Remove.
6. In the Reverse mappings window, select all the folder mappings and then click NEXT.
7. Click FINISH.
8. Verify that the folder mappings appear correctly.
Task 3: Configure Network Mappings and an IP
Customization Rule
You configure network port group mappings and create an IP customization rule on a mapped
port group.
1. In the left pane of the Site Recovery client, click Network Mappings.
2. Click + NEW.
3. In the Creation Mode window, select the Prepare mappings manually radio button and
then click NEXT.
4. Create the mapping between the distributed port groups.
a. In the left panel, select sa-vcsa-01.vclass.local > SA-DC-01 > sa-vds-01 and select
the port group ProductionVMs.
b. In the right panel select sb-vcsa-01.vclass.local > SB-DC-01 > sb-vds-01 and select
the port group Recovered_ProductionVMs.
c. Click ADD MAPPINGS.
d. Click NEXT.
5. Create the reverse mapping by selecting the mapped port group and clicking NEXT.
6. In the Test Networks window, verify that Isolated network (auto created) is selected as
the Test Network and click NEXT.
7. Click FINISH.
8. Verify that the network mappings appear correctly.
9. Select the network mapping that you just created and click ADD.
10. Create an IP customization rule for the subnets used by the network port groups.
a. Enter 172.20.11.0/24 as the subnet for the ProductionVMs network.
b. Enter 172.20.111.0 as the subnet for the Recovered_ProductionVMs network.
You cannot customize the recovery site subnet mask. The subnet mask for each
network must be the same value.
14. In the Network Mappings pane, select sb-vcsa-01.vclass.local.
15. Click ADD next to IP Customization to create an IP Customization rule for the reverse
mapping.
16. Create an IP customization rule for the subnets used by the network port groups.
a. Enter 172.20.111.0/24 as the subnet for the Recovered_ProductionVMs network.
b. Enter 172.20.11.0 as the subnet for the ProductionVMs network.
17. Configure the settings for the recovery network.
a. Gateway: 172.20.11.10.
b. DNS addresses: 172.20.11.10.
c. DNS suffixes: vclass.local.
d. Primary WINS server: Leave blank.
e. Secondary WINS server: Leave blank.
18. Click ADD.
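The effect of a subnet-level IP customization rule can be sketched with the Python standard library `ipaddress` module. The sketch assumes that the rule keeps the host portion of each address and swaps only the network portion. It also reuses the source prefix length for the target because the lab notes that the subnet mask cannot differ. The helper `map_ip` and the sample VM address 172.20.11.25 are hypothetical.

```python
import ipaddress


def map_ip(address: str, source_subnet: str, target_network: str) -> str:
    """Move an address into the target subnet while keeping its host bits."""
    source = ipaddress.ip_network(source_subnet)
    ip = ipaddress.ip_address(address)
    if ip not in source:
        raise ValueError(f"{address} is not in {source_subnet}")
    # The target subnet reuses the prefix length of the source subnet.
    target = ipaddress.ip_network(f"{target_network}/{source.prefixlen}")
    host_bits = int(ip) - int(source.network_address)
    return str(target.network_address + host_bits)


# Forward rule: ProductionVMs to Recovered_ProductionVMs.
recovered = map_ip("172.20.11.25", "172.20.11.0/24", "172.20.111.0")
# Reverse rule: Recovered_ProductionVMs back to ProductionVMs.
failed_back = map_ip(recovered, "172.20.111.0/24", "172.20.11.0")
```

Applying the forward rule and then the reverse rule returns the original address, which is why the lab creates a rule for each direction.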
1. In the Site Recovery client, select Placeholder Datastores in the left pane.
2. In the right pane, verify that sa-vcsa-01.vclass.local is selected and click + NEW.
3. From the list of datastores, select SA-Placeholder and click ADD.
4. In the Placeholder Datastores window select sb-vcsa-01.vclass.local and click + NEW.
5. From the list of datastores, select SB-Placeholder and click ADD.
Q1. On which host or cluster is the placeholder datastore accessible?
1. The datastore is accessible on SB-Prod-Cluster, which you configured as the mapped resource for the protected site cluster, SA-Prod-Cluster.
1. On the student desktop, open the Chrome browser and connect to the Site Recovery
Manager appliance management interface at https://sa-srm-01.vclass.local:5480/configure.
2. Log in with the following credentials:
Username: admin
Password: VMware1!
3. Select Storage Replication Adapters.
4. Click NEW ADAPTER.
5. In the pop-up box, click UPLOAD.
6. Browse to C:\Materials\Downloads\EMC Unity SRAs, select unityblocksra.tar,
and click Open.
7. Wait for the upload to complete and click CLOSE.
Because beta code and beta SRA images are used, the browser might incorrectly
report a failure with the import.
9. If the image does not appear after a browser refresh, retry steps 7 and 8.
4. Select sb-vcsa-01.vclass.local to view the currently installed SRAs in SiteB.
Q1. What is the status of the SRA?
1. The SRA appears with the status Unable to find SRA at the paired site.
5. Click NEXT.
6. In the Remote array manager window, configure the options.
a. Name for the array manager: Enter SiteB-UnityVSA.
b. Management IP/Hostname: Enter 172.20.110.70.
c. Filter by name (Optional): Leave blank.
d. Username: Enter admin.
e. Password: Enter VMware1!.
7. Click NEXT.
8. In the Array pairs window, verify that the array pair SiteA-UnityVSA - SiteB-UnityVSA
is selected and that the status is Ready to be enabled, and click NEXT.
9. Verify the settings and click FINISH.
10. Check the Recent Tasks pane and verify that the Discover Replicated Devices task
completes.
Q1. Which tasks immediately follow the Discover Replicated Devices task?
1. The Recompute Datastore Groups and Recompute Device Groups tasks immediately follow the Discover Replicated Devices task. These tasks are automatically initiated by Site Recovery Manager to determine the consistency group layout and the datastore group calculation.
Q3. Are the replicated devices part of the same consistency group?
3. Yes. Both devices are part of the consistency group SA-Retail-CG-01.
Q4. How does the consistency group configuration affect the datastore group
calculation?
4. Because both devices are failed over simultaneously at the storage level, Site Recovery Manager must put both devices in a single datastore group.
12. Click DISCOVER DEVICES to launch a rescan of the storage for newly configured
replicated devices and record the tasks in the Recent Tasks pane.
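The relationship between consistency groups and datastore groups discussed in Q3 and Q4 can be sketched as a simple grouping. This is a deliberate simplification and not the actual Site Recovery Manager calculation: it considers only consistency group membership, and the device names `device-1`, `device-2`, and `device-3` and the group `Other-CG` are hypothetical.

```python
from collections import defaultdict


def datastore_groups(device_to_cg: dict[str, str]) -> list[list[str]]:
    """Place devices that share a consistency group into one datastore group."""
    groups: dict[str, list[str]] = defaultdict(list)
    for device, consistency_group in device_to_cg.items():
        groups[consistency_group].append(device)
    # Sort for a stable, readable result.
    return [sorted(devices) for _, devices in sorted(groups.items())]


layout = datastore_groups({
    "device-1": "SA-Retail-CG-01",
    "device-2": "SA-Retail-CG-01",
    "device-3": "Other-CG",  # hypothetical device outside the lab configuration
})
```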
Lab 5 Deploying vSphere Replication
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
2. In the left pane, click the Hosts and Clusters icon.
3. Select sa-vcsa-01.vclass.local.
4. In the right pane, select the Configure tab.
5. Select Settings > Advanced Settings.
6. Click the filter icon to the right of the Name column, enter FQDN in the search box, and
press Enter.
7. Verify that the value for the VirtualCenter.FQDN entry matches the actual FQDN of the
vCenter Server system sa-vcsa-01.vclass.local.
8. Open Windows Explorer and navigate to C:\Materials\Downloads.
9. Right-click VMware-vSphere_Replication-8.2.0-<xxxx>.iso and select Mount.
10. Return to vSphere Client.
11. In the left pane, right-click the SA-MGMT-01 cluster and select Deploy OVF Template...
12. In the Select an OVF Template window, select the Local File radio button and click
Choose Files.
13. Navigate to F:\bin.
F: is the DVD drive represented by the mounted vSphere Replication ISO file.
15. Click Open.
16. Click NEXT.
17. In the Select a name and folder window, enter sa-vr-01 in the Virtual machine name
text box.
18. Select sa-vcsa-01.vclass.local > SA-DC-01 > SA_MGMT_VMs for the location for the
VM.
19. Click NEXT.
20. In the Select a compute resource window, verify that SA-MGMT-01 is selected and click
NEXT.
21. In the Review details window, click NEXT.
22. In the License agreements window, select the I accept all license agreements check box
and click NEXT.
23. In the Configuration window, select the 2 vCPU radio button and click NEXT.
24. In the Select storage window, verify that the SA-MGMT-DS-01 datastore is selected.
25. In the Select virtual disk format drop-down menu, select Thin Provision and click
NEXT.
26. In the Select networks window, select VM Network in the Destination Network drop-
down menu.
27. Verify that the IP allocation value is Static - Manual and that the IP protocol value is
IPv4.
28. Click NEXT.
29. In the Customize template window, configure the template values.
You can ignore the browser prompt to save the password.
a. Password: Enter VMware1!.
b. NTP Servers: Enter 172.20.10.10.
c. Hostname: Leave blank.
d. Enable file integrity: Do not enable.
e. Disable VCTA: Select the check box.
f. Default Gateway: Enter 172.20.10.10.
g. Domain Name: Enter vclass.local.
h. Domain Search Path: Enter vclass.local.
i. Domain Name Servers: Enter 172.20.10.10.
j. Management Network IP Address: Enter 172.20.10.25.
k. Management Network Netmask: Enter 255.255.255.0.
30. In the vService bindings window, verify that the Binding status appears with a check mark
and click NEXT.
31. In the Ready to complete window, verify that the settings are correct and click FINISH.
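Before you click FINISH, it can help to sanity-check the network values entered in the Customize template window. The following is a minimal sketch that uses only the Python standard library. The check_appliance_network helper is hypothetical, and the check (the gateway must fall inside the subnet implied by the IP address and netmask) is an assumption about what makes a usable configuration, not a rule taken from the deployment wizard.

```python
import ipaddress

def check_appliance_network(ip, netmask, gateway):
    """Return True when the gateway lies in the subnet implied by the
    appliance IP address and netmask."""
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    return ipaddress.ip_address(gateway) in network

# Values entered in the Customize template window for sa-vr-01.
site_a = {"ip": "172.20.10.25", "netmask": "255.255.255.0", "gateway": "172.20.10.10"}
print(check_appliance_network(**site_a))
```

A mistyped gateway from the other site (for example, 172.20.110.10 with a Site A address) would make the function return False.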
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
6. Open a new browser tab and browse to the URL that appears in the VM console.
7. On the login page, enter the following credentials and then click Login:
Username : root
Password: VMware1!
8. On the home page, select the Configuration tab.
d. Password: Enter VMware1!.
e. VRM Host: Enter sa-vr-01.vclass.local.
f. VRM Site Name: Enter SiteA-VR.
g. vCenter Server Address: Enter sa-vcsa-01.vclass.local.
h. vCenter Server Admin Mail: Enter [email protected].
i. IP Address for Incoming Storage Traffic: Enter 172.20.10.25.
10. Verify that the settings are correct and click Save and Restart Service.
11. In the Confirm SSL Certificate warning pop-up box, click Accept.
12. Monitor the tasks in the top right corner and wait until all the tasks are complete.
13. Verify that the VRM service is running.
14. Click Logout user root and close the browser tab.
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
Q2. Which services are currently available between the sa-vcsa-01.vclass.local <---->
sb-vcsa-01.vclass.local sites as shown in the left panel?
2. Only Site Recovery Manager is available between the sites. Replication is available only within the same vCenter Server instance, as shown in the right panel.
6. In the newly opened Site Recovery client tab, click NEW SITE PAIR.
7. In the Site details window, select the sa-vcsa-01.vclass.local radio button as the first site.
8. In the Second site section, configure the settings.
a. PSC host name: Enter sb-vcsa-01.vclass.local.
b. PSC port: Leave the default (443).
c. User name: Enter [email protected].
d. Password: Enter VMware1!.
9. Click NEXT.
10. Click CONNECT in the Security Alert pop-up box.
11. In the vCenter Server and services window, select the sb-vcsa-01.vclass.local radio button.
12. Verify that the vSphere Replication sites SiteA-VR and SiteB-VR are listed as a valid
service pair and select the vSphere Replication check box for the site pair.
13. Click NEXT.
14. Click FINISH.
15. Verify that vSphere Replication is now available as a service in the sa-vcsa-01.vclass.local
<----> sb-vcsa-01.vclass.local panel.
1. On the student desktop, open a Chrome browser window to the vSphere Client (SB-
VCSA-01), using the shortcut in the Site-B Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
5. Navigate to F:\bin.
F: is the DVD drive representing the mounted vSphere Replication ISO.
7. Click Open.
8. In the Select a name and folder window, enter sb-vrs-01 in the Virtual machine name
text box and select sb-vcsa-01.vclass.local > SB-DC-01 > SB_MGMT_VMs for the
folder location.
9. Click NEXT.
10. In the Select a compute resource window, verify that SB-MGMT-01 is selected and click
NEXT.
11. In the Review details window, click NEXT.
12. In the Select storage window, verify that the SB-MGMT-DS-01 datastore is selected and
select Thin Provision from the Select virtual disk format drop-down menu.
13. Click NEXT.
14. In the Select networks window, select VM Network in the Destination Network drop-
down menu.
15. Verify that the IP allocation value is Static - Manual and that the IP protocol value is
IPv4 and click NEXT.
16. In the Customize template window, configure the settings.
a. Password: Enter VMware1!.
b. NTP Servers: Enter 172.20.110.10.
c. Hostname: Leave blank.
d. Enable file integrity: Deselect the check box.
e. Disable VCTA: Select the check box.
f. Default Gateway: Enter 172.20.110.10.
g. Domain Name: Enter vclass.local.
h. Domain Search Path: Enter vclass.local.
i. Domain Name Servers: Enter 172.20.110.10.
j. Management Network IP Address: Enter 172.20.110.26.
k. Management Network Netmask: Enter 255.255.255.0.
17. Click NEXT.
You can ignore the browser prompt to save the password.
18. Review the settings and click FINISH.
19. Wait for the Deploy OVF package task to complete.
20. Right-click the sb-vrs-01 VM and select Power > Power On.
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
2. Select Menu > Site Recovery.
3. Click OPEN Site Recovery for the sb-vcsa-01.vclass.local vCenter Server instance.
a. If prompted, log in with the following credentials:
Username: [email protected]
Password: VMware1!
4. In the Site Recovery client, click VIEW DETAILS for the sb-vcsa-01.vclass.local <---->
sa-vcsa-01.vclass.local site pair.
5. In the left pane, select Configure > Replication Servers.
Q1. How many vSphere Replication Server instances are currently registered in the
sb-vcsa-01.vclass.local site?
1. One vSphere Replication Server instance is registered. The vSphere Replication Server is embedded in the deployed vSphere Replication appliance.
Lab 6 Enabling Replication on VMs
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
4. In the VM validation window, verify that Internal-VM-01 is selected and click NEXT.
5. In the Target site window, select sb-vcsa-01.vclass.local as the recovery site.
6. Leave the Auto-assign vSphere Replication Server check box selected and click NEXT.
7. In the Target Datastore window, click the information icon next to the Disk format drop-
down menu and verify that the current disk format is Thin provision.
8. Select SB-VR-Target-DS-01 as the target datastore and click NEXT.
9. In the Replication settings window, move the Recovery Point Objective slider to 15
minutes and click NEXT.
10. In the Protection Group window, select the Do not add to protection group now radio
button and click NEXT.
11. In the Ready to complete window, review the settings and click FINISH.
12. Monitor the Recent Tasks column and wait for the Enable replication of virtual machine
task to complete.
Q1. What is the replication status of the newly configured VM?
1. The status is Not active because the VM is not powered on. vSphere Replication only replicates the data of powered-on VMs.
13. Leave the Site Recovery tab open, return to the vSphere Client tab, and power on the
Internal-VM-01 VM.
14. Return to the Site Recovery client tab and click the refresh icon.
Q2. What is the status of the powered-on VM?
2. The status changes to Initial sync until the first replication is complete. Then the status appears as OK.
15. Return to the vSphere Client tab and power on all the VMs in the Internal_Services and
Retail_Services resource pools.
16. Select Menu > Storage.
17. Select SB-VR-Target-DS-01 from the inventory and click the Files tab.
18. Select the Internal-VM-01 folder to view the replica VM files.
Q3. What size is the file called Internal-VM-01.vmdk?
3. The size is approximately 550 MB. Answers may vary. The virtual disk of the source VM is 16 GB, but the replica disk is also thin-provisioned, so the consumed space is much less than the provisioned size.
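As a quick worked check of the thin-provisioning answer, the following sketch computes how much of the provisioned disk is actually consumed. It assumes binary units (1 GB = 1024 MB) and uses the approximate 550 MB figure from the lab, so your own result may differ.

```python
def consumed_fraction(consumed_mb, provisioned_gb):
    """Fraction of the provisioned size actually consumed on the datastore."""
    return consumed_mb / (provisioned_gb * 1024)

# 550 MB consumed out of a 16 GB provisioned virtual disk.
print(f"{consumed_fraction(550, 16):.1%}")
```

Under these assumptions, the replica consumes only a few percent of the 16 GB that the source disk is provisioned with.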
1. In the Files tab for SB-VR-Target-DS-01, select the Internal-VM-04 folder to view its
files.
Q1. Which files are in the folder?
1. The only file in the folder is Internal-VM-04.vmdk, which is a copy of the virtual disk attached to the VM Internal-VM-04 in the protected site.
2. In the left pane of vSphere Client, click the Hosts and Clusters icon.
3. Right-click Internal-VM-04 and select All Site Recovery actions > Configure
Replication....
4. In the VM validation window, verify that Internal-VM-04 appears and click NEXT.
5. In the Target site window, select sb-vcsa-01.vclass.local as the target site.
Q2. To which vSphere Replication Server are replications assigned?
2. One replication is assigned to the sb-vr-01 vSphere Replication Server.
6. Select the Manually select vSphere Replication Server radio button and select sb-vrs-01
as the target server.
7. Click NEXT.
8. In the Target datastore window, select SB-VR-Target-DS-01 and select the Select seeds
check box.
A message indicates that a possible seed disk is detected.
9. Click NEXT.
Q3. What is the file path for the suggested seed disk?
3. The file path suggested for the seed disk is [SB-VR-Target-DS-01] Internal-VM-04/Internal-VM-04.vmdk. The virtual disk filename matches the name of the source virtual disk file.
10. Note the warning at the top of the window and select The selected seeds are correct
check box.
11. Click NEXT.
12. In the Replication settings window, configure the settings.
a. Leave the default RPO value of one hour.
b. Select the Enable point in time instances check box.
c. Configure the replication setting to Keep 24 instances per day for the last 1 day(s).
13. Click NEXT.
14. Select the Do not add to protection group now radio button and click NEXT.
15. Click FINISH.
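The replication settings chosen in this task (an RPO of one hour, 24 instances per day, kept for 1 day) interact with each other. The sketch below illustrates that relationship under two assumptions that are not stated in this manual: that at most 24 point-in-time instances are retained per VM, and that roughly one new instance can be created per RPO interval. The pit_plan helper and the MAX_PIT_INSTANCES constant are hypothetical names.

```python
MAX_PIT_INSTANCES = 24  # assumed limit on retained instances per VM

def pit_plan(rpo_minutes, instances_per_day, days):
    """Summarize a point-in-time retention setting.

    Returns (total_retained, achievable_per_day, within_limit).
    """
    total = instances_per_day * days
    # Roughly one new instance can appear per RPO interval.
    achievable_per_day = min(instances_per_day, (24 * 60) // rpo_minutes)
    return total, achievable_per_day, total <= MAX_PIT_INSTANCES

# Settings from this task: RPO of one hour, keep 24 per day for 1 day.
print(pit_plan(rpo_minutes=60, instances_per_day=24, days=1))
```

With an 8-hour RPO, the same retention setting could yield only about three instances per day, which is why a short RPO suits VMs that need many recovery points.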
c. Click Outgoing in the left pane.
2. In the right pane, click + NEW.
3. Select Internal-VM-02 and Internal-VM-03 from the list of registered VMs in the
protected site.
4. Click NEXT.
5. In the Target site window, leave the Auto-assign vSphere Replication Server option
enabled and click NEXT.
6. In the Target datastore window, select SB-VR-Target-DS-01 as the target datastore and
click NEXT.
7. In the Replication settings window, change the RPO setting to 8 hours and click NEXT.
8. In the Protection group window, select the Do not add to protection group now radio
button and click NEXT.
9. Click FINISH.
Lab 7 Building Protection Groups
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sa-vr-01.vclass.local/dr.
a. Click View Details for the sa-vcsa-01.vclass.local <-> sb-vcsa-01.vclass.local site
pair.
2. Select the Protection Groups tab.
3. In the right pane, click + NEW.
4. In the Name and direction window, enter MPIT_VM_PG in the Name text box.
5. Leave the default Direction and Location values and click NEXT.
6. In the Type window, select the Individual VMs (vSphere Replication) radio button and
click NEXT.
7. In the Virtual Machines window, select the Internal-VM-04 check box and click NEXT.
8. In the Recovery plan window, select the Do not add to recovery plan now check box and
click NEXT.
9. Click FINISH.
10. Select MPIT_VM_PG to view the protection group configuration.
11. Select the Virtual Machines tab.
Q1. On which host is the VM Internal-VM-04 recovered?
1. Answers may vary. You identify the host on which the VM is recovered by checking the Recovery Host column.
7. Click NEXT.
8. In the Recovery plan window, select the Do not add to recovery plan now check box and
then click NEXT.
9. Click FINISH.
4. Select all the Internal-VM-0x VMs to add to the protection group and click NEXT.
5. Click FINISH.
6. Click the Virtual Machines tab.
7. Select the Internal-VM-04 check box and click CONFIGURE PROTECTION.
8. Expand each of the sections.
Q1. Which resource is the only one for which you can override the site mappings?
1. You can override the site mappings only for the network device.
9. Under Network - NIC1, select the Override site mapping check box and select sb-vcsa-
01.vclass.local > SB-DC-01 > sb-dvs-01 > Priority_dvPg for the mapped network port
group for the NIC.
10. Click OK.
3. Leave the Site Recovery client tab open and open a Chrome browser tab to the vSphere
Client (SA-VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
2. The VM is no longer protected. An error message reads Device not found: CD/DVD drive 1. Site Recovery Manager cannot protect this VM in its entirety with the current protection group settings because a file used by the VM resides on a datastore that is not part of the protection group.
11. Select the check box next to Web-VM-Retail and click CONFIGURE PROTECTION.
12. Expand the CD/DVD drive 1 device settings.
Q3. What is the device status?
3. The device status indicates that the device is not replicated.
Lab 8 Building Recovery Plans
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sa-vr-01.vclass.local/dr.
a. Click View Details for the sa-vcsa-01.vclass.local <--> sb-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the Recovery Plans pane, click + NEW.
4. In the Name and direction window, enter Master-RP in the Name text box.
a. Verify that the direction appears as SiteA-SRM ---> SiteB-SRM and click NEXT.
5. In the Protection Groups window, select all the protection groups and click NEXT.
6. In the Test Networks window, leave Use site-level mapping enabled for all network port
groups and click NEXT.
7. Click FINISH.
Task 2: Configure VM Recovery Properties
Using the Site Recovery client, you edit the recovery plan and configure VM recovery
properties.
1. In the left pane, click the newly created Master-RP recovery plan.
Q1. In the Master-RP pane, how many VMs are listed as ready for recovery in the VM
Status pane?
1. The VM Status pane displays seven VMs with the status Ready for recovery.
8. Click YES in the pop-up box.
9. Deselect the App-VM-Retail check box if it is still selected, select the Internal-VM-04
check box, and click CONFIGURE RECOVERY.
10. Verify that the Priority Group selected is 3 (Medium).
11. Expand VM Dependencies and select View all from the drop-down menu.
Q2. Why is it not possible to configure a VM dependency between Internal-VM-04 and
App-VM-Retail or DB-VM-Retail?
2. You cannot configure a dependency between App-VM-Retail or DB-VM-Retail and Internal-VM-04 because they are in a higher priority group. VM dependencies can only be configured between VMs in the same priority group.
12. In the VM Dependencies section, select the Web-VM-Retail check box so that this VM
starts before Internal-VM-04.
13. Select the IP Customization tab.
14. Select No IP Customization from the Select an IP customization mode drop-down
menu.
15. Click OK.
16. Select Internal-VM-01 and click CONFIGURE RECOVERY.
17. On the IP Customization tab, select No IP Customization from the Select an IP
customization mode drop-down menu.
18. Click OK.
19. Repeat steps 16-18 for Internal-VM-02 and Internal-VM-03.
20. Verify the correct configuration of priorities and dependencies for the VMs in the recovery
plan.
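The interaction between priority groups and VM dependencies configured in this task can be sketched in Python (3.9 or later, for graphlib). This is an illustration only, not how Site Recovery Manager is implemented. The priority assignments for DB-VM-Retail and App-VM-Retail are inferred from the lab questions, and startup_order is a hypothetical helper.

```python
from graphlib import TopologicalSorter

def startup_order(priorities, dependencies):
    """Return VM names in power-on order.

    priorities: VM name -> priority group (1 starts first).
    dependencies: VM name -> set of VMs that must start before it.
    Raises ValueError if a dependency crosses priority groups.
    """
    for vm, deps in dependencies.items():
        for dep in deps:
            if priorities[dep] != priorities[vm]:
                raise ValueError(f"{vm} and {dep} are in different priority groups")
    order = []
    for group in sorted(set(priorities.values())):
        members = sorted(vm for vm, p in priorities.items() if p == group)
        # Within one priority group, dependencies decide the order.
        graph = {vm: dependencies.get(vm, set()) for vm in members}
        order.extend(TopologicalSorter(graph).static_order())
    return order

# Assumed layout: DB in priority 1, App in priority 2, the rest in priority 3.
priorities = {
    "DB-VM-Retail": 1,
    "App-VM-Retail": 2,
    "Web-VM-Retail": 3,
    "Internal-VM-04": 3,
    "Internal-VM-01": 3,
}
dependencies = {"Internal-VM-04": {"Web-VM-Retail"}}
print(startup_order(priorities, dependencies))
```

Adding a dependency from Internal-VM-04 to App-VM-Retail raises ValueError in this sketch, which corresponds to the restriction explained in answer 2.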
Task 3: Add a Custom Step
You configure a custom step in the recovery plan to prompt the user for confirmation before
moving to the next step.
b. In the Content text box, enter Confirm that the services provided by DB-
VM-Retail and App-VM-Retail are available.
4. Click ADD.
Lab 9 Performing a Recovery Plan Test and Cleanup
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sa-vr-01.vclass.local/dr.
a. Click View Details for the sa-vcsa-01.vclass.local <----> sb-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the left pane, click Master-RP.
4. Select the Recovery Steps tab.
5. Click TEST.
6. In the Confirmation options window, verify that the Replicate recent changes to recovery
site check box is selected and click NEXT.
7. Click FINISH.
1. On the Recovery Plan tab, monitor the tasks and steps as they are performed.
Q1. Why do all the VMs appear to power on together, even though you configured a
startup priority?
1. Initially, VMs power on with detached network devices to perform IP customization. When customization is complete, the VMs power off, and network devices are reattached. The second power-on operation follows the configured startup order.
2. Expand the categories so that you can see the tasks associated with the Power on priority 1
VMs and Power on priority 2 VMs steps.
a. Wait until all the tasks are complete for the Power on priority 2 VMs.
Q2. Did the second power-on operation for App-VM-Retail start after the Wait for
VMware Tools step was completed for the DB-VM-Retail?
2. Yes. The second power-on operation for a priority 2 VM does not start until after the priority 1 VMs start successfully.
3. Verify that the plan is waiting on step 8 (Prompt: Verify that Priority 1 and 2 VMs have
started).
a. If required, scroll up to the top of the Recovery Steps tab and click DISMISS to
acknowledge the prompt.
Q3. Does Internal-VM-04 begin to power on immediately after you acknowledge the
prompt?
3. No, Internal-VM-04 does not power on immediately because it has a dependency on Web-VM-Retail. It does not power on until Web-VM-Retail starts and VMware Tools is responding.
4. Wait for all test steps to complete and verify that the Plan Status appears as Test complete.
1. On the student desktop, open a Chrome browser window to the vSphere Client (SA-
VCSA-01), using the shortcut in the Site-A Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
Q1. Which VMs are connected to this port group?
1. All the recovered VMs are connected to the port group.
Q2. Why are the VMs not connected to the port groups for which you configured site-
level mapping?
2. The VMs are not connected to the mapped port groups because this operation is a test recovery. The port groups are configured to use an auto-generated network during test recovery operations.
Q9. What are the names of the recovered datastores in the recovery site?
9. The names of the datastores are prefixed with snap-xxxxxxxx-. This prefix indicates that the devices are resignatured after failover to ensure that no conflict exists with the production datastores.
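To spot resignatured datastores in a list of names, you could match the prefix described in the answer above. The sketch below assumes that the eight-character token is hexadecimal (the manual shows it only as xxxxxxxx) and uses a hypothetical datastore name.

```python
import re

# Assumes the eight-character token is hexadecimal; the manual only
# shows the pattern as "snap-xxxxxxxx-".
SNAP_PREFIX = re.compile(r"^snap-[0-9a-f]{8}-(?P<original>.+)$")

def original_datastore_name(name):
    """Return the original datastore name if `name` carries the
    resignature prefix, otherwise None."""
    match = SNAP_PREFIX.match(name)
    return match.group("original") if match else None

# Hypothetical datastore name for illustration.
print(original_datastore_name("snap-1a2b3c4d-SA-Retail-DS-01"))
```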
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sa-vr-01.vclass.local/dr.
a. Click View Details for the sa-vcsa-01.vclass.local <----> sb-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the left pane, click Master-RP.
4. In the Master-RP pane, select the Recovery Steps tab.
5. Click CLEANUP.
6. In the Confirmation options window, click NEXT.
7. Click FINISH.
8. Wait for the cleanup operation to complete.
9. Return to the vSphere Client tab and click the refresh icon.
Q1. What happens to the recovered datastores after the cleanup?
1. The recovered datastores are no longer available in the recovery site because they are unmounted and deleted.
2. The VMs are no longer powered on and are replaced with placeholder VMs.
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sb-vr-01.vclass.local/dr.
a. Click View Details for the sb-vcsa-01.vclass.local <----> sa-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the left pane, click Master-RP.
4. In the Master-RP pane, select the Recovery Steps tab.
5. Click RUN.
6. Verify that Planned migration is selected and select the check box to acknowledge that
you understand the process.
7. Click NEXT.
8. Click FINISH.
9. Monitor the steps as they complete.
a. When step 11 of the recovery plan is complete, click DISMISS to acknowledge the
user-defined prompt.
10. Wait for the recovery operation to complete.
11. In the Recovery Steps tab, expand step 2 to view all the tasks associated with shutting
down the protected-site VMs.
a. Verify that the VMs shut down in reverse priority order.
1. On the student desktop, open a Chrome browser window to the vSphere Client (SB-
VCSA-01), using the shortcut in the Site-B Systems folder.
a. Log in, if required, using the following credentials:
User name: [email protected]
Password: VMware1!
5. Select the Networks tab and select sb-vcsa-01.vclass.local > SB-DC-01 > sb-vds-01.
6. In the left pane, select the Recovered_Production_VMs port group and, in the right pane,
select the VMs tab.
a. Verify that six VMs are connected to the port group.
7. In the left pane, select the Priority-dvPg port group and, in the right pane, select the VMs
tab.
Q2. Why is Internal-VM-04 connected to the Priority-dvPg port group and not to the
Recovered_Production_VMs port group?
2. Internal-VM-04 is connected to the Priority-dvPg port group because you configured a network mapping override in the recovery settings for the VM.
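The network placement observed in the test and in the planned migration follows a simple precedence, sketched below in Python. The resolve_network helper, the source network name Production, and the test network label are hypothetical. The port group names come from this lab.

```python
def resolve_network(vm, source_network, site_mappings, overrides, test=False,
                    test_network="auto-generated isolated network"):
    """Pick the recovery-site network for one VM NIC.

    Precedence sketched here: the test network (test runs only), then a
    per-VM override, then the site-level mapping.
    """
    if test:
        return test_network
    if vm in overrides:
        return overrides[vm]
    return site_mappings[source_network]

# "Production" is a hypothetical name for the protected-site port group.
site_mappings = {"Production": "Recovered_Production_VMs"}
overrides = {"Internal-VM-04": "Priority-dvPg"}

print(resolve_network("Internal-VM-01", "Production", site_mappings, overrides))
print(resolve_network("Internal-VM-04", "Production", site_mappings, overrides))
```

During a test run (test=True), every VM in this sketch lands on the test network, which matches what you saw in the recovery plan test.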
Lab 11 Reprotecting and Failing Back
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sb-vr-01.vclass.local/dr.
a. Click View Details for the sb-vcsa-01.vclass.local <--> sa-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the left pane, click Master-RP and, in the right pane, select the Recovery Steps tab.
4. Verify that the Plan status is Recovery complete.
a. If so, click REPROTECT.
5. In the Confirmation options window, select the I understand that this operation cannot
be undone check box and click NEXT.
6. Click FINISH.
7. Wait for the tasks to complete.
8. Open a browser tab to the vSphere Client using the vSphere Client (SA-VCSA-01)
shortcut.
9. Select Menu > Hosts and Clusters.
10. Verify that Placeholder VMs are in the sa-vcsa-01.vclass.local inventory for each protected
VM.
1. On the student desktop, open a browser tab to the Site Recovery client at
https://sa-vr-01.vclass.local/dr.
a. Click View Details for the sa-vcsa-01.vclass.local <--> sb-vcsa-01.vclass.local site
pair.
2. Select the Recovery Plans tab.
3. In the left pane, click Master-RP.
a. Verify that the VM Status pane shows that all seven VMs are Ready for recovery.
4. Select the Recovery Steps tab.
5. Right-click the step Prompt: Verify that Priority 1 and 2 VMs have started and select
Delete Step.
6. Click RUN.
7. Select the check box to indicate that you understand the implications of running the test
and click NEXT.
8. Click FINISH.
9. Monitor the tasks as they complete.
10. When the recovery is complete, click REPROTECT to return the sites to their original
state.
a. Acknowledge the warning by selecting the check box.
b. Click OK.
c. Click FINISH to complete the operation.
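The sequence of operations across the test, planned migration, reprotect, and failback labs can be summarized as a small state machine. The sketch below is a simplification for study purposes. The state name Ready is an assumption (the labs name only Test complete and Recovery complete), and the step function and TRANSITIONS table are hypothetical.

```python
# (current plan state, action) -> next plan state
TRANSITIONS = {
    ("Ready", "TEST"): "Test complete",
    ("Test complete", "CLEANUP"): "Ready",
    ("Ready", "RUN"): "Recovery complete",
    ("Recovery complete", "REPROTECT"): "Ready",
}

def step(state, direction, action):
    """Return the (state, direction) after an action, or raise ValueError."""
    try:
        new_state = TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action} is not available in state {state!r}") from None
    if action == "REPROTECT":
        # Reprotect reverses the direction of protection.
        direction = tuple(reversed(direction))
    return new_state, direction

state, direction = "Ready", ("SiteA-SRM", "SiteB-SRM")
for action in ["TEST", "CLEANUP", "RUN", "REPROTECT", "RUN", "REPROTECT"]:
    state, direction = step(state, direction, action)
    print(action, "->", state, direction)
```

After the second REPROTECT, the sketch is back at Ready with the original SiteA-SRM to SiteB-SRM direction, which is the outcome of the failback procedure.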
Answer Key
Lab 6 Enabling Replication on VMs
Task 1: Configure a Single VM for vSphere Replication
1. The status is Not active because the VM is not powered on. vSphere Replication only
replicates the data of powered-on VMs.
2. The status changes to Initial sync until the first replication is complete. Then the status
appears as OK.
3. The size is approximately 550 MB. Answers may vary. The virtual disk of the source VM
is 16 GB. But the replica disk is also thin-provisioned, so the consumed space is much
less.
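Lab 7 Building Protection Groups
2. The VM is no longer protected. An error message reads Device not found: CD/DVD drive 1.
Site Recovery Manager cannot protect this VM in its entirety with the current protection
group settings because a file used by the VM resides on a datastore that is not part of the
protection group.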
3. The device status indicates that the device is not replicated.