Hi,

A lot has changed since I published my SRM failback post (http://volumes.blog/2011/01/10/srm-automatic-failback-using-emc-symmetrix-vmax/).

SRM 5 has finally been released, and it now includes built-in failback (aka ReProtect). This post will try to capture some of the new SRM 5 features and then expand on the (very soon to be released) Symmetrix SRA (Storage Replication Adapter).

Before we go ahead, I also want to thank Cody Hosterman (@codyhosterman). Cody is a Senior Systems Engineer who is responsible for many, many things, but in the context of the vSpecialist team he makes sure we get the info we need before it leaves the door. Thank you, Cody, for being so patient!

So, first things first: here are the new scalability limits in SRM 5.

Item                                                      Maximum   Enforced
Protected virtual machines total                          1000      No
Protected virtual machines in a single protection group   500       No
Protection groups                                         250       No
Simultaneous running recovery plans                       30        No
vSphere Replicated virtual machines                       500       No
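As a quick sanity check, the limits above can be encoded and compared against a planned deployment. This is only a sketch: the dictionary keys and the `check_limits` helper are hypothetical names of mine, not part of SRM, and because none of the limits are enforced the result is a list of warnings rather than errors.

```python
# Maximums copied from the table above. SRM 5 does not enforce any of them,
# so this helper only reports where a plan goes past a listed maximum.
SRM5_LIMITS = {
    "protected_vms_total": 1000,
    "protected_vms_per_protection_group": 500,
    "protection_groups": 250,
    "simultaneous_recovery_plans": 30,
    "vsphere_replicated_vms": 500,
}


def check_limits(planned):
    """Return a warning string for every planned value above its maximum."""
    warnings = []
    for key, value in planned.items():
        limit = SRM5_LIMITS.get(key)  # unknown keys are ignored
        if limit is not None and value > limit:
            warnings.append(f"{key}: planned {value} exceeds maximum {limit}")
    return warnings


for line in check_limits({"protected_vms_total": 1200, "protection_groups": 40}):
    print(line)
```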

 

Planned migration is quite an important new feature generally speaking, but it has no real significance when using an EMC Symmetrix array, because SRDF will ALWAYS make sure the data is consistent prior to the failover.

Some other new features are “under the hood”:

  • IPv6
    • SRM will support IPv6 for all network links.
    • vSphere Replication will support communication over IPv6 if the underlying ESXi servers support IPv6.
  • Single UI – no need to use two clients or linked mode
  • IP customization performance increase
    • The command line doesn’t change for bulk imports, but the actual customization is much faster.
  • In-guest callouts
    • Since 1.0 you could execute a script held on the SRM server; now you can also run one inside the guest.

API:

  • Existing API on recovery site preserved
  • New API on both protected and recovery sides
  • Protected Site API set includes:
    • List replicated datastores / protection groups / resources / VM
    • Query the status of protection for a VM or VMs
    • Protect or unprotect one or more VM
    • Status of protection group
  • Recovery Side API includes:    
    • Recovery Plan info
    • Start / cancel, list / answer prompts
    • Get XML representation of historical run of plan
    • Get basic result information of a plan (name, start, stop, etc.)

       

SRDF SRA 5.0

 

Requirements for SRDF Storage Adapter

  • DMX and VMAX storage arrays
    • DMX-1/2 running Enginuity operating environment 5671
    • DMX-3 and DMX-4 running Enginuity operating environment 5771 or later
    • VMAX running Enginuity operating environment 5874    
  • Management of Symmetrix array is done in-band
    • Solutions Enabler version 7.3.1 or later. A special SYMAPI preference allows the SRA to discover remote devices even if the RDF state is partitioned; this preference is new in 7.3.1, and the behavior is required by SRM 5.
    • Must be the 32-bit version
  • Solutions Enabler is required on server running VMware SRM
  • Host running Solutions Enabler is required at each site
    • Can be the VMware SRM Server if it has connection to the storage array
  • Solutions Enabler in a client / server configuration
    • Host providing SYMAPI service needs to be configured
    • SSL connections between client and server recommended
  • Solutions Enabler Virtual Machine Appliance makes deployment of server easier
    • Virtual Appliances can be SE only or include SMC/SPA as well

Supported Functionality and Restrictions

  • SRDF/S, SRDF/A, SRDF/STAR Concurrent/Cascaded modes are supported
  • Support for enterprise consistency
    • SRDF/S ECA is supported
    • SRDF/A MSC is supported
      • Provides consistency across multiple SRDF/A groups
      • All SRDF/A groups in the MSC session need to be managed by a single SRM instance
  • TimeFinder used for testing recovery plans
    • TimeFinder/Mirror and TimeFinder/Clone are fully supported
    • TimeFinder/Snap is fully supported with SRDF/S
    • TimeFinder/Snap is supported with SRDF/A with restrictions
      • 5875 with write pacing enabled on R1
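The TimeFinder rules above boil down to a small decision function. The sketch below is my own encoding of that list; the function name and argument names are hypothetical, and treating Enginuity levels later than 5875 as acceptable is an assumption on my part (the list names 5875 only).

```python
def timefinder_supported(tf_mode, srdf_mode, enginuity=None, write_pacing_on_r1=False):
    """Say whether a TimeFinder mode can back test failover, per the list above.

    tf_mode:   "mirror", "clone" or "snap"
    srdf_mode: "S" (synchronous) or "A" (asynchronous)
    """
    if tf_mode not in ("mirror", "clone", "snap"):
        raise ValueError(f"unknown TimeFinder mode: {tf_mode!r}")
    if srdf_mode not in ("S", "A"):
        raise ValueError(f"unknown SRDF mode: {srdf_mode!r}")

    # TimeFinder/Mirror and TimeFinder/Clone are fully supported.
    if tf_mode in ("mirror", "clone"):
        return True
    # TimeFinder/Snap is fully supported with SRDF/S.
    if srdf_mode == "S":
        return True
    # TimeFinder/Snap with SRDF/A: 5875 with write pacing enabled on R1.
    # ASSUMPTION: levels above 5875 are accepted as well.
    return enginuity is not None and enginuity >= 5875 and write_pacing_on_r1


print(timefinder_supported("snap", "A", enginuity=5875, write_pacing_on_r1=True))
```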

Logging

  • VMware SRM maintains logs on the vCenter Server
    • Location determined by VMware
  • Adapter logs located with SRM Logs (NEW LOCATION)
    • Default log location: %ProgramData%\VMware\VMware vCenter Site Recovery Manager\Logs\SRAs\EMC Symmetrix
  • Log file name is EmcSrdfSra_<date>.log
  • The API log file is symapi-<date>.log
    • Available on the Symmetrix API server handling the client requests
  • Troubleshooting requires the logs listed above
    • Both protection and recovery side files are required
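Since troubleshooting needs the adapter logs from both sites, a small helper to collect them can save some clicking. This is a minimal sketch that assumes the default log location and file name pattern given above; `find_sra_logs` is a hypothetical helper name, and the fallback `C:\ProgramData` is my assumption for hosts where the environment variable is not set.

```python
import glob
import os

# Default SRA log directory from the list above, relative to %ProgramData%.
SRA_LOG_SUBDIR = os.path.join(
    "VMware", "VMware vCenter Site Recovery Manager", "Logs", "SRAs", "EMC Symmetrix"
)


def find_sra_logs(program_data):
    """Return the EmcSrdfSra_<date>.log files in the default SRA log directory."""
    log_dir = os.path.join(program_data, SRA_LOG_SUBDIR)
    # glob.escape keeps characters such as "[" in the directory name literal.
    pattern = os.path.join(glob.escape(log_dir), "EmcSrdfSra_*.log")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    base = os.environ.get("ProgramData", r"C:\ProgramData")  # assumed fallback
    for path in find_sra_logs(base):
        print(path)
```

Run it on the SRM server at each site; the symapi-<date>.log files live on the SYMAPI server and have to be collected separately.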

       

SRDF/STAR Support

 

  • Newly supported with SRA version 5.0, previously only two site solutions were allowed
  • SRDF/STAR uses one of the following RDF capabilities to mirror the same production data synchronously to one remote site and asynchronously to another remote site:
    • Concurrent SRDF configuration: A single source (R1) device is remotely mirrored to two target (R2) devices at the same time.
    • Cascaded SRDF configuration: It consists of a primary site (SiteA) replicating data to a secondary site (SiteB) and the secondary site (SiteB) replicating the same data to a tertiary site (SiteC).
  • SRDF/STAR topology:
    • Workload site: The primary data center where the production workload is running.
    • Sync target site: The secondary site, usually located in the same region as the workload site. The production data is mirrored to this site using synchronous replication.
    • Async target site: The secondary site in a distant location. The production data is mirrored to this site using asynchronous replication.
  • STAR site operations: Operations performed on the workload site or a target site.
    • Connect: Begin SRDF/Star synchronization.
    • Protect: Enable SRDF consistency protection for a target site.
    • Disconnect: Suspend SRDF/Star synchronization.
    • Unprotect: Disable SRDF/Star consistency protection to the specific target site.
    • Switch: Switch workload operations to a target site.

SRDF STAR Considerations

  • The user is expected to set up the STAR group before running any adapter operations.
  • SRA supports failover for STAR devices between the workload site and the sync target site only.
    • The async target site is considered a bunker site, and it is assumed that no host will be connected to control it.
  • SRA supports test failover for STAR devices at the sync target site only.
  • The STAR commands in the SRA might take multiple hours, depending on the amount of data to replicate.

SRDF/STAR Concurrent Setup

  1. The R1 devices must be configured as concurrent dynamic devices.
  2. Create an RDF1-type composite group on the control host at the workload site.
  3. Add devices to the composite group from those SRDF groups that represent the concurrent links for SRDF/Star configuration.
  4. Create two SRDF group names – one SRDF group name for all synchronous links and one for all asynchronous links.
  5. For each source SRDF group that you added to the composite group, define corresponding empty recovery RDF groups (static or dynamic) at both remote sites.

SRDF/STAR Cascaded Setup

  1. The R1 devices must be configured as cascaded dynamic devices.
  2. Create an RDF1-type composite group on the control host at the workload site.
  3. Add devices to the composite group from those SRDF groups that represent the cascaded links for the SRDF/Star configuration.
  4. Create one SRDF group name for all synchronous links.
  5. For each source SRDF group that you added to the composite group, define a corresponding empty recovery SRDF group (static or dynamic) at the workload site.

SRDF/STAR Setup

  • Create an SRDF/STAR options file specifying the names of each SRDF/Star site and the required parameters.
  • Perform the symstar setup operation.
  • Create the matching R2 or R21 composite groups needed for recovery operations at the synchronous and asynchronous target sites.

Device Discovery

  • Dynamic RDF devices
  • SRDF/S and SRDF/A
  • Adaptive Copy is still NOT supported
  • SRDF/STAR
    • STAR Concurrent or Cascaded; Diskless Cascaded is not supported
    • Must be in a STAR configuration; standalone Concurrent or Cascaded configurations are not allowed and are skipped with the following log message:

[WARNING]: Non-STAR Cascaded and Concurrent Devices are not supported. Skipping this device
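To make the discovery rules above easier to reason about, here they are as a plain function. This is my own paraphrase of the list and not the SRA's actual code; the function name and the mode/topology labels are hypothetical.

```python
def discovery_verdict(mode, topology, in_star=False):
    """Return (discovered, reason) for one replicated device.

    mode:     "S", "A" or "adaptive_copy"
    topology: "two_site", "concurrent", "cascaded" or "diskless_cascaded"
    in_star:  True when the device is part of an SRDF/STAR configuration
    """
    if mode == "adaptive_copy":
        return False, "Adaptive Copy is not supported"
    if mode not in ("S", "A"):
        return False, f"unknown SRDF mode {mode!r}"
    if topology == "diskless_cascaded":
        return False, "Diskless Cascaded is not supported"
    if topology in ("concurrent", "cascaded") and not in_star:
        # Matches the warning the SRA writes to its log for such devices.
        return False, "Non-STAR Cascaded and Concurrent Devices are not supported"
    return True, "ok"


print(discovery_verdict("A", "cascaded", in_star=False))
```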

Device Discovery – Consistency Groups

  • Consistency groups are required for all devices!
    • Devices are filtered out of discovery only if they have a consistency group on just one of the two sides. If they have no consistency group on either side, they will appear but will all be grouped into the same protection group, leading to a possibly incorrect configuration.
    • Even single devices without dependencies must be in a group
    • New requirement from SRM 5.0
    • Match the consistency groups to protection groups
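The three possible outcomes described above are easy to mix up, so here they are spelled out. A minimal sketch of my reading of those bullets; the function name and the outcome strings are mine, not something the SRA prints.

```python
def cg_discovery_outcome(has_cg_on_protected, has_cg_on_recovery):
    """Describe what discovery does with a device, based on where it has a CG."""
    if has_cg_on_protected and has_cg_on_recovery:
        return "discovered"
    if has_cg_on_protected or has_cg_on_recovery:
        # A group on only one side: the device is filtered out.
        return "filtered out"
    # No group on either side: the device shows up, but every such device
    # lands in the same protection group, which is probably not what you want.
    return "discovered but grouped together (possibly incorrect configuration)"


for sides in [(True, True), (True, False), (False, False)]:
    print(sides, "->", cg_discovery_outcome(*sides))
```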

Use VSI 5.0 SRA Utilities to create groups

  • Only supported to create groups for SRDF/A and SRDF/S devices
  • STAR consistency groups must be created manually
    • First group on workload site created manually
    • Secondary and tertiary groups, on Sync and Async site SYMAPI servers respectively, are easily created using “symstar buildcg” command
  • For SRDF/A pairs, the RDF Daemon must be enabled on both SYMAPI servers
    • Also set “SYMAPI_USE_RDFD” to ENABLE in the options file on both
  • SRDF/S Algorithm
    1. VSI scans for VMFS volumes on Symmetrix devices
    2. All virtual machines are discovered for that VMFS
    3. If any VM spans multiple datastores, all underlying Symmetrix devices are in the CG
    4. If one datastore spans multiple Symmetrix devices (must be the same Symmetrix), all devices are included in the CG
    5. Any devices used as RDMs by the previous VMs are included in the CG
    6. All devices in steps 3-5 are in one CG
    7. All or some of the previewed CGs can be created
    8. All or some of the CGs can be merged if desired
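The SRDF/S grouping steps above amount to finding connected sets of devices. The sketch below is one way to read them, using a small union-find; it is not VSI's actual implementation. The input shapes (`datastores`, `vms`) and the function name are hypothetical, and chaining groups through shared datastores (VM A on DS1+DS2, VM B on DS2+DS3 ends up as one CG) is my assumption.

```python
def suggest_consistency_groups(datastores, vms):
    """Group Symmetrix devices into suggested consistency groups.

    datastores: {datastore name: [device ids backing that VMFS]}
    vms:        {vm name: {"datastores": [names], "rdms": [device ids]}}
    Returns a sorted list of sorted device-id lists, one list per suggested CG.
    """
    parent = {}

    def find(dev):
        parent.setdefault(dev, dev)
        while parent[dev] != dev:
            parent[dev] = parent[parent[dev]]  # path halving
            dev = parent[dev]
        return dev

    def link(devices):
        devices = list(devices)
        for dev in devices:
            find(dev)  # register the device even if it stays alone
        for dev in devices[1:]:
            root_a, root_b = find(devices[0]), find(dev)
            if root_a != root_b:
                parent[root_b] = root_a

    # A datastore spanning several devices keeps all of them together.
    for devices in datastores.values():
        link(devices)
    # A VM ties together every datastore it uses plus its RDM devices.
    for vm in vms.values():
        devices = list(vm.get("rdms", []))
        for name in vm.get("datastores", []):
            devices.extend(datastores[name])
        link(devices)

    groups = {}
    for dev in parent:
        groups.setdefault(find(dev), set()).add(dev)
    return sorted(sorted(members) for members in groups.values())


datastores = {"DS1": ["065"], "DS2": ["069", "06D"], "DS3": ["071"]}
vms = {
    "vm1": {"datastores": ["DS1", "DS2"]},
    "vm2": {"datastores": ["DS3"], "rdms": ["075"]},
}
print(suggest_consistency_groups(datastores, vms))
```

Here vm1 pulls devices 065, 069 and 06D into one group, and vm2 pulls 071 and its RDM 075 into another. Merging groups afterwards (the last step in the list) is left to the administrator.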

  • SRDF/A Algorithm and considerations
    • Very similar to SRDF/S with some special considerations
    • RDF operations cannot be performed on a subset of the devices contained in a single RA group with SRDF/A. This means all of the devices within an RA group must be part of any RDF operation.
    • Non-VMware associated Symmetrix devices may be affected if they are in the same RA group
      • Avoid this when possible—dedicate certain RA groups to only VMware devices when using SRDF/A
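Before relying on SRDF/A with SRM, it is worth listing which non-VMware devices share an RA group with VMware devices, since those would be dragged along by any RDF operation. A minimal sketch, assuming you already have the RA group membership in a dictionary; the function name and the input shapes are hypothetical.

```python
def ra_group_bystanders(ra_groups, vmware_devices):
    """Find non-VMware devices that share an RA group with VMware devices.

    ra_groups:      {RA group number: [device ids in that group]}
    vmware_devices: iterable of device ids used by vSphere (VMFS or RDM)
    Returns {RA group number: sorted list of affected non-VMware devices}.
    """
    vmware = set(vmware_devices)
    affected = {}
    for group, devices in ra_groups.items():
        members = set(devices)
        if members & vmware:  # group holds at least one VMware device
            others = sorted(members - vmware)
            if others:
                affected[group] = others
    return affected


print(ra_group_bystanders({10: ["065", "069", "1A0"], 11: ["2B0"]}, ["065", "069"]))
```

An empty result means the RA groups used by VMware are already dedicated, which is the layout recommended above.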

Array Manager Configuration

  • Array pairs must be enabled before devices can be discovered
  • Pairs can only be enabled if the remote array manager for that pair is also configured
    • i.e., set up array managers on both sides first, then enable the pairs
  • In a STAR configuration there is no need to enable the array pair to the Async site; only the Sync site pair is necessary

  • Once pairs are enabled devices can be discovered by the SRA on the “Devices” tab
    • Device IDs, replication direction and consistency groups will be reported

Protection Groups and Recovery Plans

  • Datastore groups are defined by similar rules as VSI uses to suggest consistency groups
  • Create protection group
    • Select datastore group(s)

Recovery Plans

  • Can include one or more protection groups

Note that you can enter both the protected-site and the recovery-site IP, which helps with both failover and failback.

Recovery Plans – VM Dependencies


Test Failover


  • Requires Virtual Storage Integrator 5.0 SRA Utilities
  • TimeFinder configuration pairing saved to options file
    • Located:
      • %ProgramData%\EMC\EmcSrdfSra\Config\EmcSrdfSraTestFailoverConfig.xml
  • To use the R2 devices themselves for test failover instead of TimeFinder copies, set the following global option to “yes”:
    • “TestFailoverWithoutLocalSnapshots”
    • Located here:
      • %ProgramData%\EMC\EmcSrdfSra\Config\EmcSrdfSraGlobalOptions.xml
  • Must still log on to Recovery site to create pairings with VSI.
    • Notwithstanding the new architecture of SRM 5 user interface

Test Failover – General Considerations

 

  • TimeFinder/Mirror
    • The adapter requires the BCV pairs to be fully established prior to test failover.
  • TimeFinder/Snap or Clone
    • The adapter doesn’t require any TF relationships between the device pairs prior to ‘test failover – start’ operation.
      • Must be configured in options file though
    • If a relationship exists for the input RDF2 devices, the adapter analyses all existing relationships during ‘test failover – start’ operation and creates/recreates new/existing sessions.
      • For Snap, if “recreate” is not supported in the microcode, the adapter terminates any activated snap sessions and creates new sessions.

Test Failover Considerations with STAR

  • Test Failover can only be performed on the Sync site.
    • Not supported with the Async site
  • TestFailoverWithoutLocalSnapshots is NOT allowed in conjunction with STAR (in the current release).
    • The SRA will ignore the setting and look for TimeFinder device pairings.
    • If no device pairings are defined, the test will fail
  • Other Test Failover advanced options are supported:
    • TestFailoverForce (not supported for RDF devices with the links in “Split” state, in the current release of the SRDF Adapter.)
    • TerminateCopySessions
  • These are the STAR modes allowed by the SRA for test failover (and failover) with STAR:
STAR State   Sync Target Site   Async Target Site
Protected    Protected          Protected
Tripped      PathFailed         PathFailed
Tripped      PathFailed         Protected
Tripped      Protected          PathFailed
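The allowed combinations above work well as a simple lookup, for example in a pre-check script that runs before a test. A minimal sketch; the names are hypothetical, and how you obtain the three state values from your environment is left out on purpose.

```python
# (STAR state, sync target site state, async target site state) combinations
# that the SRA allows for test failover and failover, copied from the table above.
ALLOWED_STAR_STATES = {
    ("Protected", "Protected", "Protected"),
    ("Tripped", "PathFailed", "PathFailed"),
    ("Tripped", "PathFailed", "Protected"),
    ("Tripped", "Protected", "PathFailed"),
}


def star_state_allowed(star_state, sync_site, async_site):
    """Return True if this combination appears in the table above."""
    return (star_state, sync_site, async_site) in ALLOWED_STAR_STATES


print(star_state_allowed("Tripped", "PathFailed", "Protected"))
```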

Test Failover Example: SRDF/STAR Cascaded

  • Configure device pairings with VSI
    • Available in vCenter inventory at ESX or Cluster level

    • Choose the TimeFinder mode and device pairs, then save


  • Initiate test of recovery plan
    • “Replicate recent changes to recovery site” is a non-operation for the SRDF SRA. The issued command “SyncOnce” is accepted but not used as SRDF constantly synchronizes storage.


  • “Force Cleanup” is required when test failover storage operations fail, for example due to:
    • Incorrect options file
    • Licensing issue
    • Etc…
  • “Force Cleanup” is not available upon the first “Cleanup” attempt after a test. “Cleanup” operation must fail once before this option can be selected.
    • Will allow process to complete regardless of storage operation errors

       

3 Comments »

  1. Itzik,
    Great site; very informative!

    I am running into this issue that you mentioned. It looks like the replicated devices are being filtered out in the SRM UI.

    ◦The devices will only be filtered out from discovery if they do not have a consistency group at one of the two sides. If they have none they will appear but will be grouped in the same protection group leading to a possibly incorrect configuration.

    We have not been successful in using VSI to create the consistency groups so we proceeded to create the consistency groups manually… I want to confirm if this is the correct way:

    This is what we’ve done:
    We have 10 replicated devices, all in a single RDF group (SRDF/A), consistency enabled. We created a composite group called “cg-protected” at the Protected site that contains the 10 R1 devices. Then we created a composite group called “cg-recovery” at the Recovery site that contains the 10 R2 devices. We also followed other prerequisites like enabling the RDF daemon on the Solutions Enabler vApp and configured the Options file.

    With all that, SRM is still filtering the devices. I see this in the SRM logs:
    [WARNING]: No remote group exists for this consistency group. Skipping this group.

    Any idea on what else to check??? Do the composite group names need to be the same?

    • The names do not have to be the same but the groups have to have the same devices in them etc… Are you creating them with the -rdf_consistency flag?

      This is the process you should use for creating composite groups manually (my R1 and R2 devices happen to have the same device IDs):

      On the protected side SYMAPI server:
      symcg create SRDFSpr -type RDF1 -rdf_consistency
      symcg add dev 65 -cg SRDFSpr
      symcg add dev 69 -cg SRDFSpr
      symcg add dev 6d -cg SRDFSpr
      symcg add dev 71 -cg SRDFSpr
      symcg enable -cg SRDFSpr

      On the recovery side SYMAPI server:
      symcg create SRDFSre -type RDF2 -rdf_consistency
      symcg add dev 65 -cg SRDFSre
      symcg add dev 69 -cg SRDFSre
      symcg add dev 6d -cg SRDFSre
      symcg add dev 71 -cg SRDFSre
