Hi,
The first part of this blog series covered SRM 5 and the Symmetrix SRA functionality up to the point of running an SRM failover test:
http://volumes.blog/2011/10/08/vmware-srm-5-with-emc-symmetrix-%e2%80%93-what%e2%80%99s-new-part-1/
This part covers the actual Failover and Failback (Reprotect).
Gold Copies for Failover
SRA 5.0 has changed to create two separate files for Gold Copy information:
- It is important to note that in this release, for the recovery side gold copy only TF/Mirror (TF Clone emulation) is supported. SNAP and CLONE are not. This will be resolved in 5.1. Protected side gold copies are supported for all TF mechanisms.
Supported with all replication modes
- SRDF/A, SRDF/S, SRDF/STAR
Same requirements as for Test Failover
- E.g., SRDF/A and TimeFinder/Snap require Enginuity 5875 and Write Pacing
New with the SRA 5.0 is the ability to create a gold copy on the Protection side as well as the Recovery side.
- Configuration of the recovery side gold copy can be done with VSI, as with SRM 4.
- The ability to create/edit the protection side gold copy options file is not yet in VSI but is planned. Manual editing is required in this case.
If one or both of these options files are configured, a gold copy will be created on the corresponding side or sides during a Failover operation. If neither is configured, no gold copy is created.
- The files that should be edited are the ones on the Recovery side SRM server
The default behavior of the adapter is to continue with Failover if gold copy creation fails. This can be changed by editing a parameter in the global options file.
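The gold copy decision described above can be sketched roughly as follows. This is only an illustrative model, not the adapter's actual code: `plan_gold_copies`, `FailoverDecision` and the `continue_on_gold_copy_failure` flag are hypothetical names standing in for the two options files and the global options parameter, and `create_gold_copy` is a placeholder for whatever storage call really creates the copy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FailoverDecision:
    sides_attempted: List[str]   # sides whose options file was configured
    sides_failed: List[str]      # sides where gold copy creation failed
    proceed_with_failover: bool  # whether Failover should continue


def plan_gold_copies(
    protected_configured: bool,
    recovery_configured: bool,
    create_gold_copy: Callable[[str], bool],
    continue_on_gold_copy_failure: bool = True,  # hypothetical flag name
) -> FailoverDecision:
    """Create gold copies on the configured sides and decide whether to go on."""
    sides = []
    if protected_configured:
        sides.append("protected")
    if recovery_configured:
        sides.append("recovery")

    # Attempt a gold copy on every configured side and record failures.
    failed = [side for side in sides if not create_gold_copy(side)]

    # Default: a failed gold copy does not stop the Failover.
    proceed = continue_on_gold_copy_failure or not failed
    return FailoverDecision(sides, failed, proceed)
```

With neither file configured, no side is attempted and the Failover simply proceeds. Setting the hypothetical flag to `False` models the stricter, non-default behavior.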
There are a few differences in behavior between Test Failover and Gold Copy
- If consistency protection is not enabled, test failover fails whereas gold copy will succeed.
Test failover performs a “consistent” split or activate whereas gold copy doesn’t need to.
- For example, if the RDF link is in the “Transmit Idle” state, the adapter cannot perform a consistent split on the BCVs or a consistent activate on clones and snaps. Therefore test failover fails, whereas the gold copy operation detects this scenario and performs a normal split.
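To make the difference concrete, here is a small sketch of the two behaviors. The function name `select_split_mode` and the string values are made up for illustration. It also assumes that a gold copy uses a consistent operation whenever one is possible, which the text above implies but does not state outright.

```python
def select_split_mode(operation: str, consistency_enabled: bool, link_state: str) -> str:
    """Return 'consistent' or 'normal', or raise if the operation has to fail.

    operation is 'test_failover' or 'gold_copy' (illustrative labels only).
    """
    if operation not in ("test_failover", "gold_copy"):
        raise ValueError(f"unknown operation: {operation}")

    # A consistent split/activate needs consistency protection and a link
    # that is not in the "Transmit Idle" state.
    consistent_possible = consistency_enabled and link_state != "Transmit Idle"

    if consistent_possible:
        # Assumption: gold copy also goes consistent when it can.
        return "consistent"

    if operation == "test_failover":
        # Test failover insists on a consistent operation, so it fails here.
        raise RuntimeError("test failover requires a consistent split or activate")

    # Gold copy detects the situation and falls back to a normal split.
    return "normal"
```

For example, `select_split_mode("gold_copy", True, "Transmit Idle")` returns `"normal"`, while the same call with `"test_failover"` raises an error.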
Planned Migration Recovery Plan execution
Previously the SRA performed an RDF swap of devices if possible after failover
- This no longer occurs with failover for SRDF/A or SRDF/S devices due to the new “reprotect” operation
STAR devices, however, will still be reconfigured to reverse replication during failover
For an array, cluster or site failure, the “Disaster Recovery” option must be used
- Rides through failures on the protected side, unlike “Planned Migration”, which will fail upon any errors
SRA will try to reconfigure the STAR environment if possible
- Certain failure scenarios will cause STAR Cascaded to become Concurrent
- In SRM 4.x the SRA performed an RDF Swap (when possible) after failover by default
- SRM 5 has increased the granularity of operations and has enforced this to be a separate operation: Reprotect
Reprotect may not be possible if there was a failure on the protected side
- Usually only fully functional in Planned Migration scenarios
- Storage operations may fail
Reprotect – SRDF/A and SRDF/S
After failover, device pairs are in the “FailedOver” state
- Replication direction (though suspended) remains A -> B
The SRA performs a swap during Reprotect and reverses replication
- Replication is resumed but in the opposite direction B -> A
- Device personalities change: R1 becomes R2 and vice versa
- Protection groups and recovery plans are automatically updated and reversed
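The state changes above can be modeled with a tiny sketch. `RdfPair`, `failover` and `reprotect` are hypothetical names for illustration and do not correspond to any SRA or Solutions Enabler API. The only state name taken from the text is “FailedOver”; "Replicating" is a placeholder for whatever state the pair is in while replication is running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdfPair:
    r1_site: str  # site currently holding the R1 (source) device
    r2_site: str  # site currently holding the R2 (target) device
    state: str    # "FailedOver" comes from the text; others are placeholders

    @property
    def direction(self) -> str:
        return f"{self.r1_site} -> {self.r2_site}"


def failover(pair: RdfPair) -> RdfPair:
    # Failover suspends replication but leaves personalities and direction alone.
    return RdfPair(pair.r1_site, pair.r2_site, "FailedOver")


def reprotect(pair: RdfPair) -> RdfPair:
    if pair.state != "FailedOver":
        raise ValueError("reprotect expects a pair in the FailedOver state")
    # Swap personalities (R1 becomes R2 and vice versa) and resume replication.
    return RdfPair(r1_site=pair.r2_site, r2_site=pair.r1_site, state="Replicating")


pair = failover(RdfPair("A", "B", "Replicating"))
print(pair.direction)             # A -> B (suspended)
print(reprotect(pair).direction)  # B -> A
```

Running `failover` again on the reprotected pair models a later failover in the opposite direction.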
Reprotect – SRDF/STAR
For STAR, replication is already reversed and reconfigured during failover
- This assumes there were no site/storage failures. If there were, manual intervention may be necessary
Protection groups and recovery plans are automatically updated and reversed
Reprotect offers Force Cleanup similar to test recovery
- It is only available after one failed reprotect
- If it is needed, this usually means manual intervention with the storage will be required to resume replication
Reprotect – Failback
- After reprotect has been successfully executed, failback can occur
- Failback is no different than failover and is executed in the same way