SRM 5 has finally been released, and it now includes built-in failback (a.k.a. reprotect). This post will try to capture some of the new SRM 5 features and then expand on the (very soon to be released) Symmetrix SRA (Storage Replication Adapter).
Before we go ahead, I also want to thank Cody Hosterman (@codyhosterman). Cody is a senior systems engineer who is responsible for many, many things, but in the context of the vSpecialist team he makes sure we get the information we need before it leaves the door. Thank you, Cody, for being so patient!
So, first things first, here are the new scalability limits within SRM 5:
Protected virtual machines (total): maximum 1000, not enforced
Protected virtual machines in a single protection group: maximum 500, not enforced
Protection groups: maximum 250, not enforced
Simultaneously running recovery plans: maximum 30, not enforced
Virtual machines protected with vSphere Replication: maximum 500, not enforced
Planned migration is quite an important feature generally speaking, but it has no real meaning when using an EMC Symmetrix array, because SRDF will ALWAYS make sure the data is consistent prior to the failover.
Some other new features which are “under the hood”
IPv6
SRM will support IPv6 for all network links.
vSphere Replication will support communication over IPv6 if underlying ESXi servers support IPv6.
Single UI – don’t need to use two clients or linked mode
IP Customization performance increase
The command line doesn’t change for bulk imports, but the actual action of customization is much faster.
In guest callouts
Since version 1.0 you could execute a script held on the SRM server; now you can also run it inside the guest
API:
Existing API on recovery site preserved
New API on both protected and recovery sides
Protected Site API set includes:
List replicated datastores / protection groups / resources / VMs
Query the status of protection for a VM or VMs
Protect or unprotect one or more VMs
Status of protection group
Recovery Side API includes:
Recovery Plan info
Start or cancel a plan; list and answer prompts
Get XML representation of historical run of plan
Get basic result information of a plan (name, start, stop, etc.)
Now, on to the Symmetrix SRA. Supported arrays and requirements:
DMX-3 and DMX-4 running Enginuity operating environment 5771 or later
VMAX running Enginuity operating environment 5874
Management of the Symmetrix array is done in-band
Solutions Enabler version 7.3.1 or later is required. There is a special SYMAPI preference that allows the SRA to discover remote devices even if the RDF state is Partitioned; this preference is new in 7.3.1, and the behavior is required by SRM 5.
It has to be the 32-bit version
Solutions Enabler is required on server running VMware SRM
Host running Solutions Enabler is required at each site
Can be the VMware SRM server if it has a connection to the storage array
Solutions Enabler in a client / server configuration
Host providing SYMAPI service needs to be configured
SSL connections between client and server recommended
Solutions Enabler Virtual Machine Appliance makes deployment of server easier
Virtual Appliances can be SE only or include SMC/SPA as well
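If you go the client/server route, the setup is roughly as follows: start the storsrvd daemon on each SYMAPI server, then point SYMCLI on the SRM server at it through the netcnfg file and the SYMCLI_CONNECT variables. A minimal sketch, assuming made-up hostnames, IPs and service names (2707 is the default storsrvd port):

On the SYMAPI server:
stordaemon start storsrvd

On the SRM server, add an entry to the netcnfg file:
SYMAPI_SITE_A - TCPIP se-a.example.com 10.0.0.50 2707 SECURE

Then tell SYMCLI to use it:
set SYMCLI_CONNECT=SYMAPI_SITE_A
set SYMCLI_CONNECT_TYPE=REMOTE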
Supported Functionality and Restrictions
SRDF/S, SRDF/A, SRDF/STAR Concurrent/Cascaded modes are supported
Support for enterprise consistency
SRDF/S ECA is supported
SRDF/A MSC is supported
Provides consistency across multiple SRDF/A groups
All SRDF/A groups in the MSC session need to be managed by a single SRM instance
TimeFinder used for testing recovery plans
TimeFinder/Mirror and TimeFinder/Clone are fully supported
TimeFinder/Snap is fully supported with SRDF/S
TimeFinder/Snap is supported with SRDF/A with restrictions
Requires Enginuity 5875 with write pacing enabled on the R1
Logging
VMware SRM maintains logs on the vCenter Server
Location determined by VMware
Adapter logs located with SRM Logs (NEW LOCATION)
Default log location: %ProgramData%\VMware\VMware vCenter Site Recovery Manager\Logs\SRAs\EMC Symmetrix
Log file name is EmcSrdfSra_<date>.log
API logs are named symapi-<date>.log
Available on the Symmetrix API server handling the client requests
Troubleshooting requires the logs listed above
Both protection and recovery side files are required
SRDF/STAR Support
Newly supported with SRA version 5.0; previously only two-site solutions were allowed
SRDF/STAR uses one of the following RDF capabilities to mirror the same production data synchronously to one remote site and asynchronously to another remote site:
Concurrent SRDF configuration: A single source (R1) device is remotely mirrored to two target (R2) devices at the same time.
Cascaded SRDF configuration: It consists of a primary site (SiteA) replicating data to a secondary site (SiteB) and the secondary site (SiteB) replicating the same data to a tertiary site (SiteC).
SRDF/STAR topology:
Workload site: the primary data center where the production workload is running.
Sync target site: the secondary site, usually located in the same region as the workload site. The production data is mirrored to this site using synchronous replication.
Async target site: the site in a distant location. The production data is mirrored to this site using asynchronous replication.
STAR Site operations: Operations performed on the workload site or target site.
Connect: Begin SRDF/Star synchronization
Protect: Enable SRDF consistency protection for a target site
Disconnect: Suspend SRDF/Star synchronization.
Unprotect: Disable SRDF/Star consistency protection to the specific target site.
Switch: Switch workload operations to a target site
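For reference, these operations correspond to symstar actions. A rough sketch of what they look like from the command line (the composite group and site names are placeholders, and the exact syntax should be verified against your Solutions Enabler release):

symstar -cg StarGrp connect -site NewJersey
symstar -cg StarGrp protect -site NewJersey
symstar -cg StarGrp disconnect -site London
symstar -cg StarGrp unprotect -site London
symstar -cg StarGrp switch -site NewJersey
symstar -cg StarGrp query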
SRDF STAR Considerations
The user is expected to set up the STAR group before any Adapter operations.
SRA supports Failover for STAR devices between the workload site and sync target site only.
The async target site is considered a bunker site, and it is assumed that it will not be connected to a controlling host.
SRA supports Test Failover for STAR devices at the sync target site only.
The STAR commands issued by the SRA may take multiple hours, depending on the amount of data to be replicated.
SRDF/STAR Concurrent Setup
The R1 devices must be configured as concurrent dynamic devices.
Create an RDF1-type composite group on the control host at the workload site.
Add devices to the composite group from those SRDF groups that represent the concurrent links for SRDF/Star configuration.
Create two SRDF group names – one SRDF group name for all synchronous links and one for all asynchronous links.
For each source SRDF group that you added to the composite group, define corresponding empty recovery SRDF groups (static or dynamic) at both of the remote sites.
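Just to illustrate the composite group part of the steps above (the group name and device numbers are placeholders, following the same symcg pattern shown later in this post; a real setup would add every device in the relevant SRDF groups):

symcg create StarGrp -type RDF1
symcg add dev 65 -cg StarGrp
symcg add dev 69 -cg StarGrp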
SRDF/STAR Cascaded Setup
The R1 devices must be configured as cascaded dynamic devices.
Create an RDF1-type composite group on the control host at the workload site.
Add devices to the composite group from those SRDF groups that represent the cascaded links for the SRDF/Star configuration.
Create one SRDF group name for all synchronous links.
For each source SRDF group that you added to the composite group, define a corresponding empty recovery SRDF group (static or dynamic) at the workload site.
SRDF/STAR Setup
Create SRDF/STAR options file specifying the names of each SRDF/Star site and the required parameters.
Perform the symstar setup operation.
Create the matching R2 or R21 composite groups needed for recovery operations at the synchronous and asynchronous target sites
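A very rough sketch of what that looks like (site names, group name and file name are placeholders; I am quoting the option names from memory of the SRDF/Star documentation, so verify them against the Solutions Enabler release you are running):

Options file star_options.txt on the workload-site SYMAPI server:
SYMCLI_STAR_WORKLOAD_SITE_NAME=NewYork
SYMCLI_STAR_SYNCTARGET_SITE_NAME=NewJersey
SYMCLI_STAR_ASYNCTARGET_SITE_NAME=London

Run setup from the workload site:
symstar -cg StarGrp setup -options star_options.txt

Build the matching groups on the sync and async target site SYMAPI servers:
symstar -cg StarGrp buildcg -site NewJersey
symstar -cg StarGrp buildcg -site London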
Device Discovery
Dynamic RDF devices
SRDF/S and SRDF/A
Adaptive Copy is still NOT supported
SRDF/STAR
STAR/Concurrent or Cascaded, Diskless Cascaded is not supported
Must be in STAR Configuration, standalone Concurrent or Cascaded configurations are not allowed
[WARNING]: Non-STAR Cascaded and Concurrent Devices are not supported. Skipping this device
Device Discovery- Consistency Groups
Consistency groups are required for all devices!
Devices are filtered out of discovery only if they have a consistency group on just one of the two sides. If they have no consistency group at all, they will still appear, but they will all be lumped into the same protection group, which can lead to an incorrect configuration.
Even single devices without dependencies must be in a group
New requirement from SRM 5.0
Match the consistency groups to protection groups
Use VSI 5.0 SRA Utilities to create groups
Only supported to create groups for SRDF/A and SRDF/S devices
STAR consistency groups must be created manually
First group on workload site created manually
Secondary and tertiary groups, on the Sync and Async site SYMAPI servers respectively, are easily created using the “symstar buildcg” command
For SRDF/A pairs, the RDF Daemon must be enabled on both SYMAPI servers
Also set “SYMAPI_USE_RDFD” to ENABLE in the options file on both servers
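For example, on each SYMAPI server (the options file lives in the SYMAPI config directory; the exact path depends on the platform):

Add to the options file:
SYMAPI_USE_RDFD = ENABLE

Start the RDF daemon and confirm it is running:
stordaemon start storrdfd
stordaemon list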
SRDF/S Algorithm
VSI scans for VMFS volumes on Symmetrix devices
All virtual machines are discovered for that VMFS
If any VM spans multiple datastores, all underlying Symmetrix devices are included in the CG
If one datastore spans multiple Symmetrix devices (they must be on the same Symmetrix), all of those devices are included in the CG
Any devices used as RDMs by the previously identified VMs are included in the CG
All devices identified in the steps above end up in one CG
All or some of the previewed CGs can be created
All or some of the CGs can be merged if desired
SRDF/A Algorithm and considerations
Very similar to SRDF/S with some special considerations
RDF operations cannot be performed on a subset of the devices contained in a single RA group with SRDF/A; all of the devices within an RA group must be part of any RDF operation.
Symmetrix devices not associated with VMware may be affected if they are in the same RA group
Avoid this when possible: dedicate certain RA groups to only VMware devices when using SRDF/A
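A quick way to see everything sharing an RA group before you build consistency groups is to list the devices in that RDF group, for example (the SID and group number here are placeholders):

symrdf -sid 000194900123 -rdfg 10 list
symcfg -sid 000194900123 list -rdfg all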
Array Manager Configuration
Array pairs must be enabled before devices can be discovered
Pairs can only be enabled if the remote array manager for that pair is also configured
i.e., set up array managers on both sides first, then enable the pairs
In a STAR configuration there is no need to enable the array pair to the Async site; only the Sync site pair is necessary
Once pairs are enabled, devices can be discovered by the SRA on the “Devices” tab
Device IDs, replication direction and consistency groups will be reported
Protection Groups and Recovery Plans
Datastore groups are defined by rules similar to those VSI uses to suggest consistency groups
Create protection group
Select datastore group(s)
Recovery Plans
Can include one or more protection groups
Note how you can enter both the protected-site and recovery-site IPs, which helps with both failover and failback.
You must still log on to the recovery site to create pairings with VSI, notwithstanding the new architecture of the SRM 5 user interface.
Test Failover – General Considerations
TimeFinder/Mirror
The adapter requires the BCV pairs to be fully established prior to test failover.
TimeFinder/Snap or Clone
The adapter doesn’t require any TF relationships between the device pairs prior to ‘test failover – start’ operation.
They must be configured in the options file, though
If a relationship exists for the input RDF2 devices, the adapter analyses all existing relationships during ‘test failover – start’ operation and creates/recreates new/existing sessions.
For Snap, if “recreate” is not supported in the microcode, the adapter terminates any activated snap sessions and creates new sessions.
Test Failover Considerations with STAR
Test Failover can only be performed on the Sync site.
Not supported with the Async site
TestFailoverWithoutLocalSnapshots is NOT allowed in conjunction with STAR (in the current release)
Will ignore setting and look for TimeFinder device pairings.
If no devices pairings are defined, test will fail
Other Test Failover advanced options are supported:
TestFailoverForce (not supported for RDF devices with the links in “Split” state, in the current release of the SRDF Adapter.)
TerminateCopySessions
These are the STAR modes allowed by the SRA for test failover (and failover) with STAR:
STAR State / Sync Target Site / Async Target Site
Protected / Protected / Protected
Tripped / PathFailed / PathFailed
Tripped / PathFailed / Protected
Tripped / Protected / PathFailed
Test Failover Example: SRDF/STAR Cascaded
Configure device pairings with VSI
Available in vCenter inventory at ESX or Cluster level
Choose the TimeFinder mode and device pairs, and save
Test Failover Example: SRDF/STAR Cascaded
Initiate test of recovery plan
“Replicate recent changes to recovery site” is a non-operation for the SRDF SRA. The issued command “SyncOnce” is accepted but not used, as SRDF constantly synchronizes storage.
Test Failover Example: SRDF/STAR Cascaded
“Force Cleanup” is required when test failover storage operations fail, due to:
Incorrect options file
Licensing issue
Etc…
“Force Cleanup” is not available upon the first “Cleanup” attempt after a test. “Cleanup” operation must fail once before this option can be selected.
Will allow process to complete regardless of storage operation errors
Itzik,
Great site; very informative!
I am running into this issue that you mentioned. It looks like the replicated devices are being filtered out in the SRM UI.
Devices are filtered out of discovery only if they have a consistency group on just one of the two sides. If they have no consistency group at all, they will still appear, but they will all be lumped into the same protection group, which can lead to an incorrect configuration.
We have not been successful in using VSI to create the consistency groups so we proceeded to create the consistency groups manually… I want to confirm if this is the correct way:
This is what we’ve done:
We have 10 replicated devices, all in a single RDF group (SRDF/A), consistency enabled. We created a composite group called “cg-protected” at the Protected site that contains the 10 R1 devices. Then we created a composite group called “cg-recovery” at the Recovery site that contains the 10 R2 devices. We also followed other prerequisites like enabling the RDF daemon on the Solutions Enabler vApp and configured the Options file.
With all that, SRM is still filtering the devices. I see this in the SRM logs:
[WARNING]: No remote group exists for this consistency group. Skipping this group.
Any idea on what else to check??? Do the composite group names need to be the same?
The names do not have to be the same but the groups have to have the same devices in them etc… Are you creating them with the -rdf_consistency flag?
This is the process you should use for creating composite groups manually (my R1 and R2 devices happen to have the same device IDs):
On the protected side SYMAPI server:
symcg create SRDFSpr -type RDF1 -rdf_consistency
symcg add dev 65 -cg SRDFSpr
symcg add dev 69 -cg SRDFSpr
symcg add dev 6d -cg SRDFSpr
symcg add dev 71 -cg SRDFSpr
symcg enable -cg SRDFSpr
On the recovery side SYMAPI server:
symcg create SRDFSre -type RDF2 -rdf_consistency
symcg add dev 65 -cg SRDFSre
symcg add dev 69 -cg SRDFSre
symcg add dev 6d -cg SRDFSre
symcg add dev 71 -cg SRDFSre
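Once both groups exist and consistency is enabled on the R1 side, it is worth sanity-checking them before re-running device discovery in SRM, for example:

symcg list
symcg show SRDFSpr
symrdf -cg SRDFSpr query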