Deep dive into VMware vCenter Site Recovery Manager 8.1. Part 2 – Creating a Disaster Recovery Site

Having a disaster recovery site is a must for any company. That’s why I wrote an article some time ago on how to set up Site Recovery Manager (SRM) so that it allows for creating a disaster recovery site. Today, I describe how you can actually create that site and migrate your VMs from the main site there.

VMware vCenter Site Recovery Manager Topologies

First, before I move to some hands-on stuff, let’s talk about the topologies supported by VMware vCenter Site Recovery Manager. SRM supports the following three topologies:

1. Two sites with one vCenter Server instance per Platform Services Controller (PSC)

2. Two sites with several vCenter Server instances per PSC

3. One site with several vCenter Server instances sharing one PSC

2-Site Topology with One vCenter Server Instance per PSC

For my money, it’s the smartest and the most commonly used configuration. You can use either an external PSC or one embedded in a vCenter Server instance. This topology lets you create sites in different Single Sign-On (SSO) domains, whether they are in Enhanced Linked Mode or not. The scheme below shows in detail how this topology is implemented.

2-Site Topology with One vCenter Server Instance per PSC

Today, I discuss how to deploy this very configuration. Deploying the other topologies mentioned in this section looks pretty similar. Sure, there are some differences, so if you are going to deploy one of those configurations, look through the documents VMware provides on this topic.

2-Site Topology with Multiple vCenter Server Instances per PSC

In this scenario, the PSC instances are external to the vCenter Server instances. This configuration also allows for creating sites in different SSO domains, no matter whether they are in Enhanced Linked Mode or not. Find this topology scheme below.

2-Site Topology with Multiple vCenter Server Instances per PSC

Single-Site Topology with a PSC

It is possible to deploy SRM such that all its instances are connected to vCenter Server instances that share one PSC. Even though the single-site configuration looks fairly simple, I strongly warn you against this scenario: your environment just won’t fail over if something happens to the PSC!

Single-Site Topology with a PSC
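
To make the warning about the single-site topology a bit more tangible, here is a minimal Python sketch. The VCenter class, the shared_pscs helper, and all site and PSC names are hypothetical, made up for illustration; nothing here calls a VMware API. It only checks whether the protected and the recovery side depend on the same PSC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VCenter:
    """A vCenter Server instance and the PSC it registers with (illustrative model)."""
    name: str
    site: str
    psc: str


def shared_pscs(protected, recovery):
    """Return the PSCs that both sides of an SRM pair depend on."""
    return {vc.psc for vc in protected} & {vc.psc for vc in recovery}


# Two-site topology: each vCenter Server instance has its own PSC.
two_site = shared_pscs(
    [VCenter("vc-a", "site-a", "psc-a")],
    [VCenter("vc-b", "site-b", "psc-b")],
)

# Single-site topology: both vCenter Server instances share one PSC.
single_site = shared_pscs(
    [VCenter("vc-1", "site-a", "psc-a")],
    [VCenter("vc-2", "site-a", "psc-a")],
)

print(two_site)     # set()
print(single_site)  # {'psc-a'}
```

An empty result means that no single PSC outage affects both sides; a non-empty one points at the shared component.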

Deploying Site Recovery Manager in a Two-Site Configuration with Single vCenter Server Instance per PSC

Creating a Site Pair

To start with, create a Site Pair. Without a Site Pair, SRM won’t protect the site since it won’t be aware of the replication direction.

Open the Site Recovery Manager console.

Click New site pair to start creating a site pair.

Specify the details of the site you want to pair with: the vCenter Server instance that is to be replicated, and the PSC host name and credentials.

Select the services that you want to pair.

Next, connect the certificates that are necessary for pairing.

Verify the settings and press Finish to start pairing.

Now, let’s check whether SRM was configured correctly.

Go to the Summary menu and make sure that there were no errors during pairing.

Setting up Site Recovery Manager

Now, let’s go back to where the previous part finished, namely, to setting up SRM. Site Recovery Manager pre-configuration is a long process that includes the following steps:

  • Configuring LUN replication.
  • Setting up the connection between the vCenter Server and SRM instances on both sites.
  • Connecting SRM to the Storage Replication Adapters on both sites and testing LUN replication between the sites.
  • Connecting datastores, VM port groups, and ESXi hosts on both sites.
  • Creating protection groups on the primary site. A protection group is a collection of VMs that belong to one datastore group, and it is the key component of any recovery plan. A datastore group, in turn, is an object generated automatically based on rules about the relationships between LUNs, VMFS volumes, and VMs. To keep it simple: if there is only one VMFS volume per LUN, and the VMs that share that volume do not have files on other volumes, they belong to one datastore group, which, in turn, is associated with one protection group.
  • Selecting a datastore for the placeholder VMs of the protection group, i.e., the VMFS volume on the recovery site that keeps VM metadata (*.vmsd, *.vmx, and *.vmxf files). Those files are copied to the recovery site and allow registering the VMs in vCenter on that site.
  • Specifying, for each VM in the protection group, the following parameters so that it migrates to the remote site successfully:
      • Datacenter
      • Resource pool
      • Network port group
      • Metadata storage
      • Customization specification (XML files that contain guest OS settings for the VM, e.g., IP, administrator credentials, etc.)
      • Recovery priority
      • Message for the administrator that is displayed before and after the VM is powered on
      • Scripts that run before and after the VM is powered on
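
The simplified datastore group rule from the list above can be sketched in a few lines of Python. Treat it as an illustration of the grouping idea only, not as the way SRM actually computes datastore groups; the datastore_groups function and all VM and datastore names are hypothetical. Datastores land in one group when some VM keeps files on more than one of them.

```python
def datastore_groups(vm_datastores):
    """Group datastores that are linked by VMs with files on several of them.

    vm_datastores maps a VM name to the set of datastores holding its files.
    Returns a sorted list of (datastores, vms) pairs, one per group.
    """
    parent = {}

    def find(ds):
        # Union-find lookup with path halving.
        parent.setdefault(ds, ds)
        while parent[ds] != ds:
            parent[ds] = parent[parent[ds]]
            ds = parent[ds]
        return ds

    def union(a, b):
        parent[find(a)] = find(b)

    # A VM that spans several datastores ties them together.
    for datastores in vm_datastores.values():
        if not datastores:
            continue
        first, *rest = sorted(datastores)
        find(first)
        for other in rest:
            union(first, other)

    # Collect datastores and VMs per group root.
    groups = {}
    for vm, datastores in vm_datastores.items():
        if not datastores:
            continue
        root = find(next(iter(datastores)))
        ds_set, vm_set = groups.setdefault(root, (set(), set()))
        ds_set.update(datastores)
        vm_set.add(vm)
    return sorted((sorted(ds), sorted(vms)) for ds, vms in groups.values())


inventory = {
    "vm-1": {"ds-a"},
    "vm-2": {"ds-a"},
    "vm-3": {"ds-b", "ds-c"},  # files on two volumes tie ds-b and ds-c together
    "vm-4": {"ds-c"},
}
for datastores, vms in datastore_groups(inventory):
    print(datastores, vms)
```

In this made-up inventory, vm-1 and vm-2 form one group on ds-a, while vm-3 pulls ds-b and ds-c (and therefore vm-4) into a second group.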

While working with SRM, you need certain rights that can be configured pretty easily. Here are the options which you might need to set up depending on your SRM topology:

  • Check Storage Replication Adapter status. Find how to set it up in Part 1.

  • Network mappings.

  • Folder mappings.

  • Resource mappings.

  • Storage policy mappings.

  • Placeholder datastores.

In this article, I won’t go into setting up each option. The set of options you need to configure depends on your infrastructure topology. Find more details here.

Setting up Replication Between Sites

Once you are done with configuring the connection between the main and secondary sites, you can move on to setting up replication. You can choose either forward or reverse replication (I use forward replication today). To add a configuration, press New.

At the first step of the Configure Replication wizard, you need to select the VMs that you want to protect. Here, I decided to protect only one VM, VM-TEST-1.

Select the target site and vSphere Replication server that will handle the replication.

Next, select a target datastore and specify additional settings, i.e., disk format and VM storage policy.

Configure the VM replication settings afterward: the recovery point objective (RPO), the number of point-in-time instances to keep, and how long each one is retained.
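
To get a feel for the retention settings, here is a small Python sketch of the arithmetic. The retention_summary function is hypothetical, and the default cap of 24 retained instances is my assumption (I believe it matches the vSphere Replication maximum, but check the documentation for your version). The numbers in the example are made up.

```python
def retention_summary(instances_per_day, days, limit=24):
    """Return (total instances kept, age in hours of the oldest one).

    limit is an assumed cap on retained instances; it is not read from any API.
    """
    if instances_per_day < 1 or days < 1:
        raise ValueError("instances_per_day and days must be at least 1")
    total = instances_per_day * days
    if total > limit:
        raise ValueError(f"{total} instances exceed the assumed limit of {limit}")
    return total, days * 24


total, oldest_hours = retention_summary(instances_per_day=3, days=5)
print(total, oldest_hours)  # 15 120
```

Keeping 3 instances per day for 5 days gives 15 restore points, the oldest one roughly 120 hours old. The RPO is a separate knob: it bounds how stale the newest replica may get, not how far back you can go.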

At the next step, you should add the VM to a protection group. You can add it either to a new protection group or to an existing one. I have not created any protection groups yet (I create one later), so I just opt for Do not add to protection group now.

Verify all the settings and press Finish to start configuring the replication.

Wait until the wizard finishes and refresh the Replications tab. See, the VM is good to go!

You can also set up the reverse replication using the small guide described above. Everything looks just the same… apart from replication direction!

Creating a Protection Group

Now, go to the Protection Groups tab. As the name suggests, this is where you create protection groups, an abstraction for replicating and restoring multiple VMs.

Create a protection group.

Enter a protection group name and select the replication direction.

Select the protection group type afterward. Here, I protect individual VMs.

Select the VMs that are to be included in the protection group. I use the VM I selected while configuring replication.

I did not add the protection group to the recovery plan (I am going to create one later). Once you select this option, SRM displays a warning saying that the protection group cannot be recovered without being added to a recovery plan. Just ignore this warning since the recovery plan will be created a bit later.
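
The logic behind that warning boils down to a simple set difference. Here is a minimal Python sketch; the unplanned_groups function and the PG-TEST and RP-TEST names are hypothetical and have nothing to do with the SRM API.

```python
def unplanned_groups(protection_groups, recovery_plans):
    """Return the protection groups that no recovery plan references.

    recovery_plans maps a plan name to the protection groups it contains.
    """
    planned = set()
    for groups in recovery_plans.values():
        planned.update(groups)
    return sorted(set(protection_groups) - planned)


# Right after this step: the group exists, but no plan includes it yet.
print(unplanned_groups(["PG-TEST"], {}))                        # ['PG-TEST']

# Once a recovery plan includes the group, nothing is left unplanned.
print(unplanned_groups(["PG-TEST"], {"RP-TEST": ["PG-TEST"]}))  # []
```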

Verify the settings and click Finish to initiate the protection group creation process.

Check the status of the protection group now.

Creating a Recovery Plan

Congratulations, you’re almost there! Let’s create a recovery plan now. Start the recovery plan creation process by pressing New.

Specify the name, direction, and location for the plan.

Select the protection group and protection policy.

Select the network for testing the plan.

Review the settings and click Finish.

Check the recovery plan status.

Let’s test the plan!

Now that everything seems fine, let’s test the recently created disaster recovery site.

Select the recently created recovery plan, open Recovery Steps, and click Test to run the test.

A recovery plan is a sequence of steps that, once completed, gives you a fully functioning replica of your environment on the recovery site. To run the test, follow the steps of the wizard. First, confirm whether you want the recent changes to be replicated to the recovery site.

Then, review the settings and run the test.

Provided that everything is fine, data on both sites gets synced, and a copy of the VM starts on the recovery site in the test network. Note that a test does not shut down the VM on the main site; it keeps running. Check the plan status once the test is over. Don’t forget to run a cleanup at the end; otherwise, the infrastructure just won’t work as it should!

Speaking of cleanup, it is a two-stage process. At the first step, you need to confirm removing the environment and resetting the plan to the Ready state.

Next, verify the settings and click Finish to start the cleanup.

That’s pretty much it! The recovery plan is set to the Ready for Recovery mode, meaning that your environment is protected.
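
The test and cleanup cycle described above can be pictured as a tiny state machine. The Python sketch below is a deliberately simplified model: the PlanState names and the run_action helper are my own and only loosely follow the statuses SRM shows, and the real product has more states than these three.

```python
from enum import Enum


class PlanState(Enum):
    READY = "Ready"
    TEST_COMPLETE = "Test complete"
    RECOVERY_COMPLETE = "Recovery complete"


# Allowed (state, action) pairs in this simplified model.
TRANSITIONS = {
    (PlanState.READY, "test"): PlanState.TEST_COMPLETE,
    (PlanState.TEST_COMPLETE, "cleanup"): PlanState.READY,
    (PlanState.READY, "recovery"): PlanState.RECOVERY_COMPLETE,
}


def run_action(state, action):
    """Return the next plan state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"cannot run {action!r} while the plan is {state.value!r}"
        ) from None


state = run_action(PlanState.READY, "test")

# A real recovery is refused until the test environment is cleaned up.
try:
    run_action(state, "recovery")
except ValueError as err:
    print(err)

state = run_action(state, "cleanup")
print(state.value)  # Ready
```

The point of the model is the missing arrow: there is no way from a finished test straight to a recovery, which is why the cleanup matters.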

You can also run the test migration of the environment on your main site to the disaster recovery site. I won’t discuss how it can be done in this article, but you can find the whole process here. Note that after recovering the protected environment, you need to run cleanup; otherwise, failover won’t be possible!

Conclusion

I guess this article series was pretty long to read. Nevertheless, I believe both articles matter because they discuss such an important thing as creating a disaster recovery site. With VMware vCenter Site Recovery Manager, that procedure is fairly simple. I hope you’ll have your environments running smoothly!