XIOS 6.1 – Hosts Paths Monitoring

Internally, I refer to XIOS 6.1 as the “Protection Anywhere” version. You may think of the Native Replication aspect or the XMS / AppSync integration aspect of it, but there is another layer: protecting you, the customers, from making mistakes. We are, after all, human.

Motivation

Storage arrays provide hosts with several paths for each exposed volume and rely on the host multipath layer to load-balance between the relevant paths and, when a path fails, to fail IOs over to healthy paths. Proper path redundancy (HA) requires at least two healthy paths residing in different failure domains (in our case, different storage controllers belonging to a given cluster).

If a path fails but the host only has paths in a single failure domain (e.g. due to host multipath misconfiguration or bugs), the host may lose all paths, resulting in a service loss (DU, data unavailability) and host application failures.

It is therefore desirable to detect a lack of redundancy across host path failure domains beforehand, especially before initiating maintenance-related HA operations that may cause path failures, e.g. NDU or replacing a storage controller.

Feature Objective

The purpose of the feature is to monitor host paths in order to detect, and notify about, a lack of redundancy between host failure domains (different storage controllers), and to prevent or stop destructive operations (e.g. NDU, Replace SC) that can cause host-side DU.

  • Monitoring is based on the collection and analysis of I-T-L (Initiator-Target-LUN) health indications
  • An I-T-L is considered Active\Healthy if IOs or a heartbeat are received for it
  • I-T-L\Path redundancy from an Initiator to a specific Volume exists if it has at least 2 healthy paths through 2 targets on different SCs
  • A lack of redundancy is indicated per Initiator (Note: we don’t provide an indication at Host level)
  • Scope for this Release (XIOS 6.1)
    • Monitor Initiators’ paths periodically (Steady state)
      • Notify\Alert if a lack of redundancy is detected
      • Indicate potential problems with Host connectivity or Multipath behavior
    • Monitor Initiators’ paths at NDU – Main purpose for this release
      • Define entry criteria for invoking ‘Cluster Upgrade’
      • Monitor Initiator path redundancy during ‘OS Upgrade’ and stop ‘OS Upgrade’ if a degradation of the Initiators’ path-redundancy state is detected
        • Prevent Host-side DU and mitigate loss of access to LUNs
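To make the redundancy rule above concrete, here is a minimal Python sketch. It is not the XIOS implementation: the `Path` record, its field names, and both function names are hypothetical, and it assumes each target port belongs to exactly one storage controller, so "2 healthy paths through 2 targets on different SCs" reduces to "healthy paths on at least 2 distinct SCs".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One I-T-L nexus (hypothetical record, for illustration only)."""
    initiator: str   # host port, e.g. an FC WWPN
    target: str      # array target port
    volume: str      # the LUN
    controller: str  # storage controller (SC) that owns the target port
    healthy: bool    # True if IOs or a heartbeat were seen on this I-T-L


def non_redundant_initiators(paths):
    """Return {initiator: [volumes without redundant healthy paths]}."""
    healthy_controllers = defaultdict(set)
    seen = set()
    for p in paths:
        key = (p.initiator, p.volume)
        seen.add(key)
        if p.healthy:
            healthy_controllers[key].add(p.controller)

    result = defaultdict(list)
    for initiator, volume in sorted(seen):
        # Redundant only if healthy paths land on two or more SCs.
        if len(healthy_controllers[(initiator, volume)]) < 2:
            result[initiator].append(volume)
    return dict(result)


def may_start_upgrade(paths):
    """Hypothetical entry check: allow the upgrade only if nothing is degraded."""
    return not non_redundant_initiators(paths)
```

The same check could in principle be re-run during the upgrade, stopping it as soon as an initiator shows up in the result. The real feature reports a per-initiator state rather than a per-volume list.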

Below you can see a screenshot of the XMS WebUI flagging the error. In this example, host / initiator “drm-pod1-esx45_fc0” has lost its redundant connectivity.

We’ve also added some new CLI commands to monitor this:

Initiator path-redundancy – CLI [1]

show-initiators [cluster-id=<id: name or index>] [duration=<seconds>] [filter=<>] [frequency=<seconds>] [prop-list=<>] [vertical=<N/A>]

Initiator-Name           Index  Port-Type  Port-Address             IG-Name  Index  path-redundancy-state
10:00:00:90:fa:a9:48:aa  1      fc         10:00:00:90:fa:a9:48:aa  LG1358   1      non-redundant

  • Removed from the display:
    • Chap-Authentication-Initiator-User-Name
    • Chap-Discovery-Initiator-User-Name
    • Chap-Authentication-Cluster-User-Name
    • Chap-Discovery-Cluster-User-Name
    • Initiator-OS
  • Display degraded initiators during an OS upgrade using a filter:
    • path_redundancy_state = os_upgrade_disconnected or os_upgrade_non_redundant

show-initiators filter=path_redundancy_state:like:os_upgrade
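If you want to script around this output, the following is a rough Python sketch that extracts the degraded initiators from captured `show-initiators` text. It assumes the seven-column layout shown in the sample above and that no field contains spaces. The function name is hypothetical, and the set of "degraded" states contains only the three values mentioned in this post, so adjust it for your release.

```python
# States named in this post; output uses hyphens, filters use underscores,
# so everything is normalized to underscores before comparison.
DEGRADED_STATES = {
    "non_redundant",
    "os_upgrade_disconnected",
    "os_upgrade_non_redundant",
}


def degraded_initiators(cli_output):
    """Return (initiator name, IG name, state) for each degraded row."""
    rows = []
    for line in cli_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything not 7 columns wide.
        if len(fields) != 7 or fields[0] == "Initiator-Name":
            continue
        name, _index, _port_type, _address, ig_name, _ig_index, state = fields
        state = state.replace("-", "_")
        if state in DEGRADED_STATES:
            rows.append((name, ig_name, state))
    return rows
```

Feeding it the sample output from this post yields a single entry for initiator `10:00:00:90:fa:a9:48:aa` in initiator group `LG1358`.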
