Conditions for Stretched Hosts Cluster Support on EMC VPLEX Metro
White Paper
Abstract
This white paper provides information about using EMC® VPLEX™ Metro distributed virtual volumes with host clusters. The paper is an overview of configuration and operational information generic to all host clustering products currently supported with VPLEX.
August 2014
Table of Contents
Executive summary
  Audience
What Is EMC VPLEX?
Validated configuration
  Supported use cases for host clusters over distance with VPLEX Metro
  Configuration requirements
Understanding the preferred VPLEX site for distributed virtual volumes
Tested scenarios
  Terminology
Conclusion
References
Executive summary
This white paper provides information about using EMC® VPLEX™ Metro distributed virtual volumes with host clusters. The paper is an overview of configuration and operational information generic to all host clustering products currently supported with VPLEX. Details specific to each host cluster can be found in the Host Connectivity Guide for the related host operating system. In addition, links to solution-specific white papers are found in References.
Audience
The reader is assumed to have knowledge of host clustering basics and of the host clustering product to be implemented with VPLEX.
What Is EMC VPLEX?
EMC VPLEX is a federation solution that can be stretched across two geographically dispersed data centers separated by synchronous distances (maximum round trip latency = 5 milliseconds). It provides simultaneous access to storage devices at two sites through creation of VPLEX distributed virtual volumes¹, supported on each side by a VPLEX Cluster.
Each VPLEX Cluster is itself highly available, scaling from two directors per VPLEX Cluster up to eight directors per VPLEX Cluster. Furthermore, each director is supported by independent power supplies, fans, and interconnects. Each VPLEX Cluster has no single point of failure.
Validated configuration
Figure 1 illustrates a typical configuration validated by EMC using host clusters with a VPLEX Metro deployment.
¹ A VPLEX virtual volume with complete, synchronized copies of data (mirrors), exposed through two geographically separated VPLEX Clusters. Distributed virtual volumes can be simultaneously accessed by servers at two separate data centers.
Figure 1. Stretched cluster with VPLEX Metro
Supported use cases for host clusters over distance with VPLEX Metro
The following table shows the supported use cases with a VPLEX configuration. The two scenarios are shown in Figure 2 and Figure 3.
Use case: Local host cluster with hosts and VPLEX in two locations (host cluster is physically extended but logically local, for example, in adjacent buildings)
Status: Supported
Use case: Geographically extended host cluster (host cluster is physically extended and logically defined as extended)
Status: Supported
Figure 2. Local host cluster with hosts and VPLEX in two locations
Figure 3. Geographically extended host cluster
In addition, the use of VPLEX distributed virtual volumes for disk heartbeating² is supported with some operating systems. With Windows Failover Clustering, EMC does not support VPLEX distributed virtual volumes for the quorum disk. The Nodes and File Share Witness is the recommended quorum model.
For best practices on distributed virtual volumes on VPLEX, see the section "Understanding the preferred VPLEX site for distributed virtual volumes."
Configuration requirements
For support of this configuration, the following requirements must be met:
• The maximum round trip latency on the Fibre Channel network between the two sites must not exceed 5 ms. The Fibre Channel network is required by inter-cluster links connecting the two VPLEX Clusters within VPLEX Metro.
• In addition to the host cluster requirements, the maximum round trip latency on the IP network between the two sites should not exceed 5 ms. The IP network supports the hosts and the VPLEX Management Console.
• Specific host cluster products may have additional requirements; see the appropriate Host Connectivity Guide and white papers for more details.
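The latency limits above lend themselves to a simple pre-deployment check. The following is a minimal sketch, not an EMC tool: the function name `check_latency` and its parameters are hypothetical, and it assumes the round trip latency of each network has already been measured by other means.

```python
from typing import List

MAX_RTT_MS = 5.0  # round trip limit stated in the requirements above


def check_latency(fc_rtt_ms: float, ip_rtt_ms: float) -> List[str]:
    """Return a list of findings; an empty list means both links are within limits."""
    findings = []
    if fc_rtt_ms > MAX_RTT_MS:
        # The paper says the Fibre Channel latency "must not exceed" 5 ms.
        findings.append(
            f"Fibre Channel RTT {fc_rtt_ms} ms exceeds {MAX_RTT_MS} ms (must not exceed)"
        )
    if ip_rtt_ms > MAX_RTT_MS:
        # The paper says the IP latency "should not exceed" 5 ms.
        findings.append(
            f"IP RTT {ip_rtt_ms} ms exceeds {MAX_RTT_MS} ms (should not exceed)"
        )
    return findings


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    for finding in check_latency(fc_rtt_ms=4.2, ip_rtt_ms=6.1):
        print(finding)
```

Product-specific requirements from the relevant Host Connectivity Guide are not modeled here and would need to be checked separately.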
The hosts forming the host clusters are on either side of the VPLEX Metro deployment.
Please refer to the next section for additional requirements for VPLEX distributed virtual volumes.
Understanding the preferred VPLEX site for distributed virtual volumes
For each distributed virtual volume, VPLEX defines a detach rule. When there is a communication failure between the two clusters in VPLEX Metro, this detach rule identifies which VPLEX Cluster in a VPLEX Metro should detach its mirror leg, thereby allowing service to continue. The detach rule effectively defines a preferred site if VPLEX Clusters lose communication with each other. The purpose of having a defined preferred site is to ensure that there is no possibility of a “split brain” caused by both VPLEX Clusters continuing to allow I/O during communication failure.
After a complete communication failure between the two VPLEX Clusters, the preferred site continues to provide service to the distributed virtual volume. The other VPLEX Cluster will suspend I/O service to the volume and is referred to as the non-preferred site. The detach rule is at the distributed virtual volume level and hence any given site could be the preferred site for some distributed virtual volumes and the non-preferred site for others. A VPLEX Metro instance can support several thousand distributed virtual volumes (see the EMC VPLEX with GeoSynchrony 5.x and Point Releases Release Notes for current limits), and each such volume has its own detach rule. It is therefore possible for the same VPLEX Cluster (and therefore the hosts connected to it) to be on the preferred site with respect to one distributed virtual volume but to be on the non-preferred site with respect to another distributed virtual volume.
² In this paper, disk heartbeat is used generically to refer to any host cluster inter-node communication method that uses shared disks. This can include quorum disks, disk heartbeat networks, shared disk I/O fencing, and so on.
It is a best practice to configure the host cluster disk resource definitions and the VPLEX detach rules in parallel. That is, the highest-priority (or owning) host cluster node for a host cluster disk resource should be at the same site as the VPLEX preferred site for the distributed virtual volume(s) in that disk resource. For example, if the preferred host (owning host) for a host cluster disk resource is located at Site 1, all the distributed virtual volumes defined to be in that cluster disk resource should have their detach rules set to have Site 1 as the preferred site. Failure to follow this best practice will increase the number of situations where manual intervention will be required to restore data availability after a system outage.
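This best practice can be audited mechanically. The sketch below is illustrative only: the function `misaligned_volumes`, the site names, and the volume names are hypothetical, and it assumes the detach rules have already been collected into a mapping from volume name to preferred site.

```python
from typing import Dict, List


def misaligned_volumes(owner_site: str, detach_rules: Dict[str, str]) -> List[str]:
    """Return the volumes of one cluster disk resource whose preferred site
    differs from the site of the owning host cluster node."""
    return sorted(
        volume for volume, preferred_site in detach_rules.items()
        if preferred_site != owner_site
    )


if __name__ == "__main__":
    # Hypothetical disk resource owned by a node at site1.
    rules = {"dd_vol_1": "site1", "dd_vol_2": "site2"}
    print(misaligned_volumes("site1", rules))  # ['dd_vol_2']
```

Any volume reported by such a check is one where an inter-site failure would leave the owning host on the non-preferred site for that volume.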
There are two conditions that can cause the VPLEX Clusters to lose communication:
• Total VPLEX Cluster failure at one site (failure of all directors in a VPLEX Cluster): A complete VPLEX Cluster failure triggers the detach rule behaviors since the surviving VPLEX Cluster does not have the ability to distinguish between interlink communication loss and VPLEX Cluster failure. As a result, distributed virtual volumes whose preferred site is the surviving VPLEX Cluster will continue to service I/O without interruption. The distributed virtual volumes whose preferred site is the failed VPLEX Cluster site will enter I/O suspension until manual intervention is performed. That is, all I/O activity to the virtual volume will be suspended by the surviving VPLEX Cluster.
• Failure of the inter-cluster communication links (VPLEX Cluster partition): The VPLEX Cluster partition case will also trigger the execution of the detach rule. Each distributed virtual volume will allow I/O to continue on its preferred site and suspend I/O on its non-preferred site.
When the VPLEX Cluster failure or VPLEX Cluster partition condition is resolved, the VPLEX Metro distributed virtual volume gets re-established, enabling I/O on both VPLEX Metro sites.
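The two conditions above can be summarized in a small state model. This is a simplified sketch under stated assumptions, not VPLEX code: the names `Failure`, `io_state`, `site1`, and `site2` are hypothetical, and only a single distributed virtual volume is modeled.

```python
from enum import Enum
from typing import Dict, Optional

SITES = ("site1", "site2")  # hypothetical site names


class Failure(Enum):
    PARTITION = "inter-cluster link failure"
    CLUSTER_DOWN = "total VPLEX Cluster failure"


def io_state(preferred: str, failure: Failure,
             failed_site: Optional[str] = None) -> Dict[str, str]:
    """Return the I/O state of one distributed virtual volume at each site."""
    state = {}
    for site in SITES:
        if failure is Failure.CLUSTER_DOWN and site == failed_site:
            state[site] = "down"
        elif site == preferred:
            # The detach rule lets the preferred site keep serving I/O.
            state[site] = "active"
        else:
            # A surviving non-preferred site cannot tell a partition from a
            # peer failure, so it suspends I/O in both cases.
            state[site] = "suspended"
    return state


if __name__ == "__main__":
    print(io_state("site1", Failure.PARTITION))
    print(io_state("site1", Failure.CLUSTER_DOWN, failed_site="site1"))
```

In the second call the preferred site itself has failed, so the volume is unavailable at both sites until an administrator intervenes, which matches the first bullet above.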
Tested scenarios
Terminology
In Table 1, the following terminology is used:
• Hosts running in the preferred site refers to hosts running on the preferred site for the Metro distributed virtual volume supporting the cluster disk resource for those hosts.
• Hosts running in the non-preferred site refers to hosts running on the non-preferred site for the Metro distributed virtual volume supporting the cluster disk resource for those hosts.
All of these scenarios assume that each host in the cluster has been configured with supported multipathing and cluster software with the required settings for failover according to documented high-availability requirements.
Table 1. Failure scenarios and impacts
Scenario: Single VPLEX back-end (BE) path failure
VPLEX behavior: VPLEX will switch to alternate paths to the same BE array and continue to provide access to the Metro distributed virtual volumes exposed to the hosts.
Host cluster impact: None.
Scenario: Single VPLEX front-end (FE) path failure
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume via alternate paths to the same VPLEX Cluster from the cluster host. The cluster host multipathing software will be expected to fail over to the alternate paths.
Host cluster impact: None.
Scenario: BE array failure (preferred site for a Metro distributed virtual volume)
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume through the non-preferred site BE array. When access to the array is restored, the storage volumes from the preferred site BE array will be resynchronized automatically.
Host cluster impact: None.
Scenario: BE array failure (non-preferred site for a Metro distributed virtual volume)
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume using the preferred site BE array. When access to the array is restored, the storage volumes from the non-preferred site BE array will be resynchronized automatically.
Host cluster impact: None.
Scenario: Single front-end switch failure (preferred site for a Metro distributed virtual volume)
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume via alternate paths to the same VPLEX Cluster from the cluster host. The cluster host multipathing software will be expected to fail over to the alternate paths.
Host cluster impact: None.
Scenario: Single front-end switch failure (non-preferred site for a Metro distributed virtual volume)
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume via alternate paths to the same VPLEX Cluster from the cluster host. The cluster host multipathing software will be expected to fail over to the alternate paths.
Host cluster impact: None.
Scenario: VPLEX director failure
VPLEX behavior: VPLEX will continue to provide access to the Metro distributed virtual volume through front-end paths available through other directors on the same VPLEX Cluster.
Host cluster impact: None.
Scenario: Complete VPLEX site failure (where the preferred site for a Metro distributed virtual volume is in the site that has failed)
VPLEX behavior: VPLEX will suspend I/O on the Metro distributed virtual volume on the non-preferred site. Once the administrator has determined that the site has failed, and that it is not a case of inter-site communication failure, the volumes on the non-preferred site can be unsuspended ("resumed") using the device resume-link-down command.
Note that this process is intentionally manual. Automated resumption of I/O would work in the site failure case, but it does not work in the VPLEX Cluster partition case. Warning: In a partition, issuing the unsuspend command automatically on the non-preferred site would cause both sites to become simultaneously read-writeable, creating a potential split brain condition.
Host cluster impact:
• Hosts running in the preferred site: I/O will fail. Configured failover events to the non-preferred site will not be successful until the volumes are unsuspended on the non-preferred site. Manual intervention in the host cluster will probably be required after the manual resumption in the VPLEX Cluster.
• Hosts running in the non-preferred site: These hosts will see the I/O as being suspended until the administrator resumes access. Depending on the timeouts configured on the hosts and host cluster, this may cause the host cluster to attempt to initiate the failover events configured for loss of disk access. Since the preferred site has failed, these events will also fail.
Scenario: Complete VPLEX site failure (where the non-preferred site for a Metro distributed virtual volume is in the site that has failed)
VPLEX behavior: VPLEX will continue to provide I/O access to the preferred site.
Host cluster impact:
• Hosts running in the preferred site: No impact.
• Hosts running in the non-preferred site: All I/O access is lost.
Scenario: Add cluster host(s) to the cluster
VPLEX behavior: After the hosts are registered and added to the appropriate VPLEX view, VPLEX will provide access to the provisioned Metro distributed virtual volumes to the newly added host.
Host cluster impact: None.
Scenario: Remove cluster host(s) from the cluster
VPLEX behavior: After the hosts are removed from the appropriate VPLEX view and deregistered, the cluster host can be removed.
Host cluster impact: No impact if all shared disk resources are administratively moved off the host before the VPLEX configuration change. The host may log I/O errors if it is removed from the VPLEX configuration before it is removed from the host cluster.
Scenario: Multiple cluster host failure(s) - power off
VPLEX behavior: None.
Host cluster impact: Dependent on the number of hosts in the cluster and the failover policies configured by the cluster administrator.
Scenario: Multiple cluster host failure(s) - network disconnect
VPLEX behavior: None.
Host cluster impact: If alternate heartbeat networks exist (2nd network, disk heartbeat), host I/O will continue. If the failed network is the only network, the cluster will go into a split brain situation. (Additional network failure events are product- or implementation-specific and outside the scope of this paper.)
Scenario: Single cluster host and a VPLEX director failure at the same site
VPLEX behavior: The surviving VPLEX directors on the VPLEX Cluster with the failed director will continue to provide access to the Metro distributed virtual volumes.
Host cluster impact: The surviving hosts will lose a path, but I/O will continue.
Scenario: Single director and back-end path failure at the same site
VPLEX behavior: The surviving VPLEX directors on the VPLEX Cluster with the failed director will continue to provide access to the virtual volumes. VPLEX will switch to alternate paths (if available) to the same back end and continue to provide access to the Metro distributed virtual volumes.
Host cluster impact: The surviving hosts will lose a path, but I/O will continue.
Scenario: Cluster host all paths down (encountered when the cluster host loses access to its storage volumes, that is, VPLEX volumes in this case)
VPLEX behavior: None.
Host cluster impact: Ideally the I/Os on the host should resume automatically once the paths are restored. If not, the host cluster will initiate its action for host loss of I/O paths. (This is cluster-specific.)
Scenario: VPLEX inter-site link failure, host cluster heartbeat network intact
VPLEX behavior: VPLEX will transition distributed virtual volumes on the non-preferred site to the I/O suspension state. On the preferred site, the distributed virtual volumes will continue to provide access.
Note that in this case, I/O at the non-preferred site should not be manually unsuspended. Because both VPLEX Clusters survive, the preferred site will continue to allow I/O. Unsuspending I/O on the non-preferred site would make the same distributed virtual volume read-writeable on both legs, creating a potential split brain condition. Once the inter-site links are restored, the distributed virtual volume will become unsuspended on the non-preferred site.
Host cluster impact:
• Hosts running in the preferred site: No impact.
• Hosts running in the non-preferred site: These hosts will see all I/O as suspended. After the I/O times out, these hosts will initiate the failover events configured for loss of disk access.
Scenario: Complete dual site failure
VPLEX behavior: Upon power-on of a single VPLEX Cluster, VPLEX will intentionally keep all distributed virtual volumes in the suspended state, even if that cluster is the preferred site, until it is able to reconnect to the other site or the administrator manually resumes I/O on these volumes using the device resume-link-down command. This behavior accounts for the possibility that I/O continued on the other site (either automatically, if the other site was preferred, or manually, if the other site was non-preferred) and thereby protects against data corruption.
Host cluster impact: If hosts are powered back on after all the distributed virtual volumes are manually resumed, the host cluster can be restarted/recovered according to its standard procedure. If the cluster hosts are restarted prior to the resumption of the distributed virtual volumes, the failed hosts need to be restarted manually.
Scenario: Director failure at one site (preferred site for a given distributed virtual volume) and BE array failure at the other site (non-preferred site for a given distributed virtual volume)
VPLEX behavior: The surviving VPLEX directors within the VPLEX Cluster with the failed director will continue to provide access to the Metro distributed virtual volumes. VPLEX will continue to provide access to the Metro distributed virtual volumes using the preferred site BE array.
Host cluster impact: None.
Scenario: Host cluster network partition but VPLEX WAN links remain intact
VPLEX behavior: None.
Host cluster impact: If the host cluster is using VPLEX distributed virtual volumes for disk heartbeating, the host cluster will log the network down event but continue to function. If the host cluster network is the only heartbeat link, host cluster split brain will result.
Scenario: Host cluster inter-site network as well as VPLEX inter-site network partition
VPLEX behavior: VPLEX will suspend I/O on the non-preferred site for a given distributed virtual volume. Each distributed virtual volume will remain accessible on its preferred site.
Note that in this case, I/O at the non-preferred site should not be manually unsuspended. Because both VPLEX Clusters survive, the preferred site will continue to allow I/O. Unsuspending I/O on the non-preferred site would make the same Metro distributed virtual volume read-writeable on both legs, creating a potential split brain condition. Once the inter-site networks are restored, the distributed virtual volume will become unsuspended on the non-preferred site.
Host cluster impact:
• Hosts running in the preferred site: The hosts will continue to run normally. The hosts at the preferred site will assume the hosts at the non-preferred site are down and initiate the takeover events configured in the host cluster. These resource takeover events should succeed as the preferred site will have exclusive access to the disks. As this is a host-cluster split brain situation, the host cluster's documented recovery procedures will need to be followed when the inter-site communications links are restored.
• Hosts running in the non-preferred site: These hosts will see the I/O as being suspended, and will also consider the hosts at the preferred site down. The failover events configured to occur for each of these scenarios will fail; the hosts cannot take over the other site's (presumed) failed resources, and cannot transfer their own failed resources. The actual series of events will depend on the host cluster configuration, but the net result will be that the hosts at the non-preferred site will be in an error state. As this is a host-cluster split brain situation, the host cluster's documented recovery procedures will need to be followed when the inter-site communications links are restored.
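The guidance in Table 1 on when a suspended volume may be manually resumed can be condensed into a short decision helper. This is a sketch only, with hypothetical names (`resume_decision` and its parameters). It covers the scenarios in which the two VPLEX Clusters lose communication, not the complete dual site failure case, and it does not invoke any VPLEX command.

```python
def resume_decision(local_is_preferred: bool, peer_confirmed_failed: bool) -> str:
    """Suggest an operator action for one distributed virtual volume at the
    local site after the two VPLEX Clusters lose communication."""
    if local_is_preferred:
        return "no action: the preferred site continues to serve I/O"
    if peer_confirmed_failed:
        # Per Table 1, resuming is a deliberate manual step performed by the
        # administrator with the device resume-link-down command.
        return "manual resume permitted at the non-preferred site"
    # If the peer may still be serving I/O (partition), resuming here would
    # make both legs read-writeable and risk a split brain condition.
    return "do not resume: restore the inter-site links instead"


if __name__ == "__main__":
    print(resume_decision(local_is_preferred=False, peer_confirmed_failed=False))
```

The `peer_confirmed_failed` flag stands for a human judgment: the table leaves it to the administrator to establish that the other site has really failed.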
In the failure modes described involving VPLEX Cluster failures, after the cluster is recovered, it joins back into the VPLEX Metro instance. If a distributed virtual volume was running I/O on the peer site (either because this was the preferred site or because the administrator had manually chosen to resume I/Os), the joining VPLEX Cluster will recognize this and immediately provide the latest data back to the hosts accessing the same distributed virtual volume through the joining VPLEX Cluster. Any stale data in the joining VPLEX Cluster is discarded and/or overwritten.
Conclusion
By using VPLEX distributed virtual volumes, data can be transparently made available to all nodes in a host cluster divided across two physical locations. The host cluster can be defined either as a local cluster or a geographically extended cluster by the host cluster software; VPLEX supports either scenario.
Like a host clustering product, VPLEX is architected to react to failures in a way that will minimize the risk of data corruption. With VPLEX, the concept of a "preferred site" ensures that one site will have exclusive access to the data in the event of an inter-site failure. To minimize downtime and manual intervention, it is a best practice to configure the host cluster and VPLEX in parallel; that is, for a given resource the VPLEX preferred site and the host cluster preferred node(s) should be set to the same physical location.
References
VPLEX white papers can be found on EMC.com and EMC Online Support (access required). Other resources include:
• Using VPLEX Metro with VMware HA (VMware article)
• EMC Host Connectivity Guide for Windows (access required for E-Lab™ Navigator)
• EMC VPLEX with IBM AIX Virtualization and Clustering (EMC white paper)
• Using VPLEX Metro with VMware High Availability and Fault Tolerance for Ultimate Availability