We have just released version 3.6 of Dell PowerStore, and it’s a big release for our customers and for us.
The Witness Server is a stand-alone third component, ideally residing in a third data center, that helps determine which array should survive in the event of a failure. A witness is also known as a tie breaker or mediator server. The Witness Server is passive in nature and records the history of the “keep alive” signals it receives, meaning it can easily be recreated if needed.
The Witness Server runs on a physical host or virtual machine and is delivered as a Linux RPM with support for both RHEL 8.5 and SLES 15 SP3.
So why are we adding Witness Server to Metro Volume? When metro replication is configured, the user specifies one site as “preferred” and the other as “non-preferred.” If the non-preferred site goes down, failover happens automatically. If the preferred site becomes unavailable, the user will need to initiate a manual failover process. With Witness Server, we automate the failover procedure and cover the following new use cases:
- Ability for automatic recovery from a preferred site failure – NEW
- Ability for automatic recovery from a preferred array failure – NEW
- Ability for automatic recovery from a link failure – supported today; the preferred system continues to serve I/O
- Ability for automatic recovery from a non-preferred site failure – supported today
- Ability for automatic recovery from a non-preferred array failure – supported today
To ensure a common understanding, I want to briefly go over the terminology.
There are two general configuration options for a Metro Volume, which spans two PowerStore clusters within metro distance.
The first is a Non-Uniform configuration:
- Hosts only have paths to the local PowerStore
- Hosts use “Local Connectivity” in the host connectivity options on PowerStore
If there is a failure, a host may lose all paths to the metro volume. A VM running on the affected host needs to be restarted, either manually or by vSphere HA when configured, on a host with active paths to the metro volume.
If the configuration is extended with cross-connects, it’s a Uniform configuration:
- Hosts can access the local PowerStore and remote PowerStore
- Depending on the host connectivity option set in PowerStore Manager, the hosts get ALUA Active-Optimized paths to the local PowerStore only, or to both the local and remote PowerStore
- A proper host connectivity option setting is important for best possible performance of a Metro Volume
Taking the previous example, when a host loses all paths to one of the two PowerStore clusters, the metro volume can still be accessed through the cross-connects, and switching the active paths in ESXi should be transparent for the running VMs.
A Metro Volume is a logical volume made up of volumes on both participating PowerStore systems. The active-active metro solution allows simultaneous host access to both volumes in a metro volume configuration.
PowerStore controls the metro volume replication in a metro session for bi-directional mirroring of writes.
Ideally, a host write first goes to the local PowerStore and is immediately synchronized to the remote PowerStore. The host gets the acknowledgement once the data is committed on both volumes.
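To make the write path more tangible, here is a minimal sketch of a synchronously mirrored write. It is only an illustration of the principle described above, not PowerStore code: the `Volume` class and the `metro_write` function are hypothetical names, and the volumes are plain in-memory dictionaries.

```python
class Volume:
    """In-memory stand-in for one side of a metro volume (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def commit(self, lba, data):
        # Store the data at the given logical block address and report success.
        self.blocks[lba] = data
        return True


def metro_write(local, remote, lba, data):
    """Acknowledge the host only after both sides have committed the write."""
    local_ok = local.commit(lba, data)
    remote_ok = remote.commit(lba, data)
    return local_ok and remote_ok


if __name__ == "__main__":
    ps1 = Volume("PowerStore-DC1")
    ps2 = Volume("PowerStore-DC2")
    print(metro_write(ps1, ps2, lba=0, data=b"hello"))
```

The point of the sketch is the return value: the host sees a successful write only when both copies hold the data.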
In a configuration without a witness, the individual metro volumes use the roles Preferred and Non-Preferred. The volume in the Preferred role is the declared winner and can continue during a failure situation. The opposite volume gets the Non-Preferred role and goes offline for the hosts when the Preferred is not reachable, regardless of whether the Preferred is still available to the hosts or not.
The new Metro Witness covered here improves the failure handling and helps PowerStore decide to keep the Non-Preferred online when it is sure that the Preferred is not available.
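The behavior described above can be summarized as a small decision table. The function below is a simplified model I wrote for illustration, assuming the two sides can no longer reach each other; `side_serving_io` and its parameters are hypothetical names and do not reflect the actual PowerStore implementation.

```python
def side_serving_io(preferred_up, non_preferred_up, witness_engaged):
    """Return which side keeps serving host I/O once the two sides
    cannot reach each other, or None if the volume is offline everywhere.

    Simplified model of the rules in the text:
    - the Preferred side continues whenever it is available;
    - the Non-Preferred side continues only if an engaged witness
      confirms that the Preferred side is not available.
    """
    if preferred_up:
        return "preferred"
    if non_preferred_up and witness_engaged:
        return "non-preferred"
    return None


if __name__ == "__main__":
    # Preferred site failure without a witness: manual promotion is needed.
    print(side_serving_io(False, True, witness_engaged=False))
    # Same failure with an engaged witness: Non-Preferred stays online.
    print(side_serving_io(False, True, witness_engaged=True))
```

The only case the witness changes in this model is a failure of the Preferred side, which matches the two use cases marked as new in the list earlier.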
It’s also worth mentioning that PowerStore supports more than a single metro volume session, and different volumes on the same array can have different metro roles.
For instance, on the PowerStore array in datacenter 1 on the left, Metro Volume 1 has the role “Preferred” while Metro Volume 2 on the same array has the role “Non-Preferred”.
On the PowerStore in datacenter 2 on the right, the roles are swapped.
Metro Witness – Installation
As a prerequisite, PowerStore must be able to reach port 443/tcp (HTTPS) on the witness server. Possible reasons for a blocked port include an ACL on a router, a physical firewall, or even just a local firewall configured in the host OS of the witness server.
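A quick way to rule out a blocked port is a plain TCP connection test. The sketch below uses only the Python standard library; `can_reach` is a hypothetical helper name, and the host name in the usage example is a placeholder. Note that it tests reachability from the machine where you run it, so run it from a host in the same network segment as the PowerStore management interfaces to get a meaningful result.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # "witness.example.com" is a placeholder for your witness server.
    print(can_reach("witness.example.com", 443))
```

A result of `False` does not tell you which of the possible causes applies, only that the TCP handshake did not complete within the timeout.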
For the installation, we can use either the rpm tool itself or a distribution-specific package management utility. For the supported Linux versions, that is yum on RHEL or zypper on SLES.
Using a package management utility is recommended because it resolves and installs all dependencies from the online OS repositories when installing the Dell witness service. When using the rpm tool, all dependencies have to be resolved manually before the dell-witness service can be installed.
If required, the package manager or rpm tool can also uninstall the witness service.
- With the rpm tool, it’s rpm -i followed by the dell-witness-service RPM
Depending on the Linux distribution used, the package management utility is yum or zypper, but the syntax is similar:
- For RHEL, it’s yum install and the name of the dell-witness-service RPM
- For SLES, it’s zypper install and the name of the dell-witness-service RPM
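If you automate the installation, the three variants above boil down to choosing a command by distribution. The following sketch only builds the command line and does not run anything; `install_command` is a hypothetical helper, and the RPM file name in the example is a placeholder, since the real file name depends on the downloaded version.

```python
def install_command(distro, rpm_path, use_package_manager=True):
    """Build the install command for the witness RPM as an argument list.

    distro: "rhel" or "sles" (the two supported distributions).
    rpm_path: path to the downloaded RPM (placeholder in the example below).
    """
    if not use_package_manager:
        # Plain rpm: dependencies must already be installed.
        return ["rpm", "-i", rpm_path]
    tools = {"rhel": "yum", "sles": "zypper"}
    if distro not in tools:
        raise ValueError(f"unsupported distribution: {distro}")
    return [tools[distro], "install", rpm_path]


if __name__ == "__main__":
    # "./dell-witness-service.rpm" is a placeholder file name.
    print(" ".join(install_command("sles", "./dell-witness-service.rpm")))
```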
This slide shows an example of the installation with zypper on a SUSE Linux system.
- The first line shows the command
- In the red box, we can see the required dependencies found by the package management utility, which is the Java OpenJDK in the shown example
The second half of the screen, below the red box, shows the retrieval of the required RPMs and the installation.
Metro Witness – Registration in PowerStore Manager
- Required steps for both PowerStore clusters
- Generate token on witness server (expires after 10 minutes)
- Add new Metro Witness in PowerStore manager
- Protection > Metro Witness > Add Witness
- Individual Name
- IP Address or FQDN for connecting to the Metro Witness
- Security Token
- An optional Description
- Confirm SSL certificate
Before we get to the screen where we can see the thumbprint in PowerStore Manager, we have to initiate the registration of the witness, which is required for both PowerStore clusters involved in the metro configuration.
I already mentioned the token required to register the witness on an earlier slide. To get the token, run the generate_token script on the witness server as shown in the example.
The token expires after 10 minutes and can be used to register the witness service in multiple PowerStore Managers.
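Because the same token has to be used on both clusters within its lifetime, it can help to keep an eye on the clock. This is a tiny sketch of that check, assuming you note the time when you ran generate_token; `token_still_valid` is a hypothetical helper, and treating exactly 10 minutes as expired is my assumption.

```python
from datetime import datetime, timedelta

# Lifetime stated in the text above.
TOKEN_LIFETIME = timedelta(minutes=10)


def token_still_valid(generated_at, now):
    """Return True while the token is younger than its 10-minute lifetime."""
    return now - generated_at < TOKEN_LIFETIME


if __name__ == "__main__":
    generated = datetime(2024, 1, 1, 12, 0, 0)
    # Registering the second cluster seven minutes later is still in time.
    print(token_still_valid(generated, datetime(2024, 1, 1, 12, 7, 0)))
```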
For the Metro Witness overview, navigate to Protection > Metro Witness in PowerStore Manager. To add a new witness, choose Add to start the wizard.
The required fields are the Name, an individual name for the witness,
the IP address or fully qualified domain name for connecting to the metro witness,
and the security token, which is the token generated by the generate_token script on the witness server.
The description is optional.
After the form is filled in, click ADD.
This leads to the screen shown in the previous slide to confirm the certificate.
After confirmation, PowerStore starts to configure the metro witness.
Metro Witness – Registration in PowerStore
The Metro Witness overview shows the registered Metro Witness service, including the connection state and the number of protected metro resources. After installation and initialization of the witness, all metro volume sessions are immediately protected by the installed metro witness.
Clicking the name of the service shows additional information that is not available in the overview, such as the connection state of the individual nodes.
Metro Witness in PowerStore Manager
Valid Connectivity states for the metro witness are
- Partially connected
- Deleting … and …
Even though each status name gives a good indication of what it means, let me elaborate on these:
- OK is the state we would like to see during normal conditions. All nodes on all appliances can communicate with the witness
- Partially connected is shown when only some nodes or appliances are successfully connected to the witness service. The status can also mean that the same metro witness is not registered on the peer PowerStore system
- Disconnected is shown when none of the nodes on any appliance can communicate with the witness
- Deleting is shown when the witness is currently being unregistered and deleted from the cluster, or when the metro witness delete fails
- Initializing is shown when the nodes are initializing the connection to the witness service, for example because the witness was added to PowerStore
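The three connection-related states can be read as an aggregation over the per-node connections. The sketch below models only that part; `witness_connection_state` is a hypothetical function, and it deliberately ignores Deleting and Initializing as well as the peer-registration case mentioned for Partially connected.

```python
def witness_connection_state(node_connected):
    """Derive the cluster-level state from per-node connection flags.

    node_connected: list of booleans, one per node across all appliances.
    """
    if not node_connected:
        raise ValueError("at least one node is required")
    if all(node_connected):
        return "OK"
    if any(node_connected):
        return "Partially connected"
    return "Disconnected"


if __name__ == "__main__":
    # Two appliances with two nodes each; one node has lost the witness.
    print(witness_connection_state([True, True, True, False]))
```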
Manage Metro Witness using PowerStore CLI (pstcli)
If you prefer to use pstcli to manage a PowerStore, there is a new resource, witness, which allows the same control as the PowerStore Manager GUI.
There are options to show, create, and delete a metro witness in PowerStore. The set option can be used to change the name or description of a configured metro witness.
At the bottom, an example shows the CLI command to register a witness by using pstcli.
The given settings are the same as those required when using the PowerStore Manager GUI.
Manage Metro Witness using PowerStore REST-API
Metro Witness in PowerStore Manager
Back in PowerStore Manager, we now see an example of the Metro session overview screen.
There are two additional columns showing the associated witness name and witness state. The column Local
Metro Witness in PowerStore Manager
- Witness is being initialized, but not engaged.
- Note: This state will be used when the witness is configured for the non-preferred system first. It will remain in Initializing until the witness is configured for the preferred system of the metro session.
- Witness is initialized, but not engaged.
- This will be the state when the metro session is fractured, as there is nothing for the witness to do in that case.
- Witness is engaged.
- This is the only state which indicates the witness will be leveraged by the metro session in a failure scenario.
- Not engaged due to incorrect witness configuration on the preferred system of the metro session.
- Possible invalid configurations are that only one PowerStore has the witness configured or the two PowerStores have different witnesses configured.
- Failed to initialize witness with metro session.
- Witness being unconfigured for session.
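The configuration check mentioned in the list above (both PowerStores must have the same witness configured) is easy to express in code. This is a minimal sketch under my own assumptions: `witness_config_valid` is a hypothetical helper, a witness is identified by a plain string, and `None` stands for "no witness configured".

```python
def witness_config_valid(preferred_witness, non_preferred_witness):
    """Return True only if both systems have the same witness configured.

    Each argument is a witness identifier, or None when no witness
    is configured on that system.
    """
    if preferred_witness is None or non_preferred_witness is None:
        # Only one (or neither) PowerStore has a witness configured.
        return False
    # Both configured: they must point at the same witness.
    return preferred_witness == non_preferred_witness


if __name__ == "__main__":
    print(witness_config_valid("DC3", "DC3"))
    print(witness_config_valid("DC3", None))
```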
This screen shows an example of the details of a metro session that is operating normally with a configured metro witness.
In addition to the already available information, highlighted with the red box, the details page also shows information about the witness and the witness state.
In this example, it’s the witness DC3 in status Engaged.
This slide shows a metro volume in fractured state and the Preferred is the winner.
The local preferred shows “Preferred – System Promoted” as PowerStore did the promotion based on the response from witness service.
When a metro session is fractured or even paused, the witness can’t be used for any decision, and the witness state turns to “Disengaged”.
This screenshot shows a metro volume in fractured state and the Non-Preferred is the winner.
It’s very similar, but notice the Preferred tag on the storage icons. We are signed in to PowerStore Manager on the system holding the Non-Preferred volume of a metro session which is in status “Fractured”.
The local Preferred State shows “Non-Preferred – System Promoted” to indicate that PowerStore made the decision to promote the Non-Preferred volume.
When the connection to the metro witness is lost, the status is visible in the Metro Witness overview.
The example shows a configured metro witness in PowerStore manager which is not reachable – indicated by a connection state “Complete connection loss”
PowerStore raises an alert which is forwarded to external monitoring when configured.
When only some connections for the cluster are lost, details are available in the Metro Witness properties.
The example indicates that both nodes for the appliance lost their connection to the metro witness.
When the connection to the witness service is lost, the witness state for the metro session changes to “Disengaged”. When the metro witness for a metro session is in status “Disengaged”, the session falls back to polarization with the preferred and non-preferred roles.
When the metro witness is disengaged and the non-preferred loses its connection to the preferred, the volume goes offline on the non-preferred and remains online on the preferred when it is available.
If preferred is not available, an operator has to manually promote the non-preferred to get it back online for host access.
You can download the software and the documentation, using the link below: