TechBook: EMC VPLEX Metro Witness Technology and High Availability

Version 2.1

• EMC VPLEX Witness
• VPLEX Metro High Availability
• Metro HA Deployment Scenarios

Jennifer Aspesi
Oliver Shorey

Copyright © 2010 - 2012 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS
PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.

For the most up-to-date regulatory document for your product line, go to the Technical Documentation and
Advisories section on EMC Powerlink.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their respective owners.

Part number H7113.2

Contents

Preface

Chapter 1 VPLEX Family and Use Case Overview
Introduction
VPLEX value overview
VPLEX product offerings
    VPLEX Local, VPLEX Metro, and VPLEX Geo
    Architecture highlights
Metro high availability design considerations
    Planned application mobility compared with disaster restart

Chapter 2 Hardware and Software
Introduction
    VPLEX I/O
    High-level VPLEX I/O flow
    Distributed coherent cache
    VPLEX family clustering architecture
    VPLEX single, dual, and quad engines
    VPLEX sizing tool
    Upgrade paths
    Hardware upgrades
    Software upgrades
VPLEX management interfaces
    Web-based GUI
    VPLEX CLI
    SNMP support for performance statistics
    LDAP/AD support
    VPLEX Element Manager API
Simplified storage management
Management server user accounts
Management server software
    Management console
    Command line interface
    System reporting
Director software
Configuration overview
    Single engine configurations
    Dual configurations
    Quad configurations
I/O implementation
    Cache coherence
    Meta-directory
    How a read is handled
    How a write is handled

Chapter 3 System and Component Integrity
Overview
Cluster
Path redundancy through different ports
Path redundancy through different directors
Path redundancy through different engines
Path redundancy through site distribution
Serviceability

Chapter 4 Foundations of VPLEX High Availability
Foundations of VPLEX High Availability
Failure handling without VPLEX Witness (static preference)

Chapter 5 Introduction to VPLEX Witness
VPLEX Witness overview and architecture
VPLEX Witness target solution, rules, and best practices
VPLEX Witness failure semantics
CLI example outputs
VPLEX Witness – The importance of the third failure domain

Chapter 6 VPLEX Metro HA
VPLEX Metro HA overview
VPLEX Metro HA Campus (with cross-connect)
VPLEX Metro HA (without cross-cluster connection)

Chapter 7 Conclusion
Conclusion
    Better protection from storage-related failures
    Protection from a larger array of possible failures
    Greater overall resource utilization

Glossary



Figures

Title
1 Application and data mobility example
2 HA infrastructure example
3 Distributed data collaboration example
4 VPLEX offerings
5 Architecture highlights
6 VPLEX cluster example
7 VPLEX Management Console
8 Management Console welcome screen
9 VPLEX single engine configuration
10 VPLEX dual engine configuration
11 VPLEX quad engine configuration
12 Port redundancy
13 Director redundancy
14 Engine redundancy
15 Site redundancy
16 High level functional sites in communication
17 High level Site A failure
18 High level Inter-site link failure
19 VPLEX active and functional between two sites
20 VPLEX concept diagram with failure at Site A
21 Correct resolution after volume failure at Site A
22 VPLEX active and functional between two sites
23 Inter-site link failure and cluster partition
24 Correct handling of cluster partition
25 VPLEX static detach rule
26 Typical detach rule setup
27 Non-preferred site failure
28 Volume remains active at Cluster 1
29 Typical detach rule setup before link failure
30 Inter-site link failure and cluster partition
31 Suspension after inter-site link failure and cluster partition
32 Cluster 2 is preferred
33 Preferred site failure causes full Data Unavailability
34 High Level VPLEX Witness architecture
35 High Level VPLEX Witness deployment
36 Supported VPLEX versions for VPLEX Witness
37 VPLEX Witness volume types and rule support
38 Typical VPLEX Witness configuration
39 VPLEX Witness and an inter-cluster link failure
40 VPLEX Witness and static preference after cluster partition
41 VPLEX Witness typical configuration for cluster 2 detaches
42 VPLEX Witness diagram showing cluster 2 failure
43 VPLEX Witness with static preference override
44 Possible dual failure cluster isolation scenarios
45 Highly unlikely dual failure scenarios that require manual intervention
46 Two further dual failure scenarios that would require manual intervention
47 High-level diagram of a Metro HA campus solution for VMware
48 Metro HA campus diagram with failure domains
49 Metro HA campus diagram with disaster in zone A1
50 Metro HA campus diagram with failure in zone A2
51 Metro HA campus diagram with failure in zone A3 or B3
52 Metro HA campus diagram with failure in zone C1
53 Metro HA campus diagram with intersite link failure
54 Metro HA Standard High-level diagram
55 Metro HA high-level diagram with fault domains
56 Metro HA high-level diagram with failure in domain A2
57 Metro HA high-level diagram with intersite failure


Tables

Title
1 Overview of VPLEX features and benefits
2 Configurations at a glance
3 Management server user accounts



Preface

This EMC Engineering TechBook describes how implementing VPLEX
leads to a higher level of availability.
As part of an effort to improve and enhance the performance and capabilities
of its product lines, EMC periodically releases revisions of its hardware and
software. Therefore, some functions described in this document may not be
supported by all versions of the software or hardware currently in use. For
the most up-to-date information on product features, refer to your product
release notes. If a product does not function properly or does not function as
described in this document, please contact your EMC representative.

Audience
This document is part of the EMC VPLEX family documentation set,
and is intended for use by storage and system administrators.
Readers of this document are expected to be familiar with the
following topics:
◆ Storage area networks
◆ Storage virtualization technologies
◆ EMC Symmetrix, VNX series, and CLARiiON products

Related documentation
Refer to the EMC Powerlink website at http://powerlink.emc.com,
where the majority of the following documentation can be found
under Support > Technical Documentation and Advisories >
Hardware Platforms > VPLEX Family.
◆ EMC VPLEX Architecture Guide
◆ EMC VPLEX Installation and Setup Guide
◆ EMC VPLEX Site Preparation Guide


◆ Implementation and Planning Best Practices for EMC VPLEX
Technical Notes
◆ Using VMware Virtualization Platforms with EMC VPLEX - Best
Practices Planning
◆ VMware KB: Using VPLEX Metro with VMware HA
◆ Implementing EMC VPLEX Metro with Microsoft Hyper-V, Exchange
Server 2010 with Enhanced Failover Clustering Support
◆ White Paper: Using VMware vSphere with EMC VPLEX — Best
Practices Planning
◆ Oracle Extended RAC with EMC VPLEX Metro—Best Practices
Planning
◆ White Paper: EMC VPLEX with IBM AIX Virtualization and
Clustering
◆ White Paper: Conditions for Stretched Hosts Cluster Support on EMC
VPLEX Metro
◆ White Paper: Implementing EMC VPLEX and Microsoft Hyper-V and
SQL Server with Enhanced Failover Clustering Support — Applied
Technology

Organization of this TechBook
This document is divided into the following chapters:
◆ Chapter 1, “VPLEX Family and Use Case Overview,”
summarizes the VPLEX family. It also covers some of the key
features of the VPLEX family system, architecture and use cases.
◆ Chapter 2, “Hardware and Software,” summarizes hardware,
software, and network components of the VPLEX system. It also
highlights the software interfaces that can be used by an
administrator to manage all aspects of a VPLEX system.
◆ Chapter 3, “System and Component Integrity,” summarizes how
VPLEX clusters are able to handle hardware failures in any
subsystem within the storage cluster.
◆ Chapter 4, “Foundations of VPLEX High Availability,”
summarizes the concepts of the industry-wide dilemma of
building absolute HA environments and how VPLEX Metro
functionality manually accepts the historical challenge.
◆ Chapter 5, “Introduction to VPLEX Witness,” explains VPLEX
architecture and operation.


◆ Chapter 6, “VPLEX Metro HA,” explains how VPLEX
functionality can provide the absolute HA capability, by
introducing a “Witness” to the inter-cluster environment.
◆ Chapter 7, “Conclusion,” provides a summary of benefits using
VPLEX technology as related to VPLEX Witness and High
Availability.
◆ Appendix A, “vSphere 5.0 Update 1 Additional Settings,”
provides additional settings needed when using vSphere 5.0
update 1.

Authors
This TechBook was authored by the following individuals from the
Enterprise Storage Division, VPLEX Business Unit based at EMC
headquarters, Hopkinton, Massachusetts.
Jennifer Aspesi has over 10 years of work experience with EMC in
Storage Area Networks (SAN), Wide Area Networks (WAN), and
Network and Storage Security technologies. Jen currently manages
the Corporate Systems Engineer team for the VPLEX Business Unit.
She earned her M.S. in Marketing and Technological Innovation from
Worcester Polytechnic Institute, Massachusetts.
Oliver Shorey has over 11 years of experience working within the
Business Continuity arena, seven of which have been with EMC
engineering, designing and documenting high-end replication and
geographically-dispersed clustering technologies. He is currently a
Principal Corporate Systems Engineer in the VPLEX Business Unit.

Additional contributors
Additional contributors to this book include:
Colin Durocher has 8 years of experience in developing software for
the EMC VPLEX product, in both its predecessor and current forms,
testing it, and helping customers implement it. He is currently working on
the product management team for the VPLEX business unit. He has a
B.S. in Computer Engineering from the University of Alberta and is
currently pursuing an MBA from the John Molson School of Business.
Gene Ortenberg has more than 15 years of experience in building
fault-tolerant distributed systems and applications. For the past 8
years he has been designing and developing highly-available storage
virtualization solutions at EMC. He currently holds the position of
Software Architect for the VPLEX Business Unit under the EMC
Enterprise Storage Division.


Fernanda Torres has over 10 years of Marketing experience in the
Consumer Products industry, most recently in consumer electronics.
Fernanda is the Product Marketing Manager for VPLEX under the
EMC Enterprise Storage Division. She has an undergraduate degree
from the University of Notre Dame and a bilingual degree
(English/Spanish) from IESE in Barcelona, Spain.

Typographical conventions
EMC uses the following type style conventions in this document:
Normal Used in running (nonprocedural) text for:
• Names of interface elements (such as names of windows, dialog
boxes, buttons, fields, and menus)
• Names of resources, attributes, pools, Boolean expressions,
buttons, DQL statements, keywords, clauses, environment
variables, functions, utilities
• URLs, pathnames, filenames, directory names, computer
names, links, groups, service keys, file systems,
notifications
Bold Used in running (nonprocedural) text for:
• Names of commands, daemons, options, programs, processes,
services, applications, utilities, kernels, notifications, system
calls, man pages
Used in procedures for:
• Names of interface elements (such as names of windows, dialog
boxes, buttons, fields, and menus)
• What user specifically selects, clicks, presses, or types
Italic Used in all text (including procedures) for:
• Full titles of publications referenced in text
• Emphasis (for example a new term)
• Variables
Courier Used for:
• System output, such as an error message or script
• URLs, complete paths, filenames, prompts, and syntax when
shown outside of running text
Courier bold Used for:
• Specific user input (such as commands)
Courier italic Used in procedures for:
• Variables on command line
• User input variables
<> Angle brackets enclose parameter or variable values supplied by
the user
[] Square brackets enclose optional values


| Vertical bar indicates alternate selections - the bar means “or”

{} Braces indicate content that you must specify (that is, x or y or z)

... Ellipses indicate nonessential information omitted from the example

We'd like to hear from you!
Your feedback on our TechBooks is important to us! We want our
books to be as helpful and relevant as possible, so please feel free to
send us your comments, opinions and thoughts on this or any other
TechBook:
TechBooks@emc.com



Chapter 1 VPLEX Family and Use Case Overview

This chapter provides a brief summary of the main use cases for the
EMC VPLEX family and design considerations for high availability. It
also covers some of the key features of the VPLEX family system.
Topics include:
◆ Introduction
◆ VPLEX value overview
◆ VPLEX product offerings
◆ Metro high availability design considerations


Introduction
The purpose of this TechBook is to introduce EMC® VPLEX™ high
availability and VPLEX Witness as they are typically architected by
customer storage administrators and EMC Solutions Architects. When
properly designed into a VPLEX Metro environment, VPLEX Witness
provides customers with full physical and logical redundancy of the
fabric and the coherent cache.
This TechBook is designed to provide an overview of the features and
functionality associated with the VPLEX Metro configuration and the
importance of active/active data resiliency for today’s advanced host
applications.



VPLEX value overview
At the highest level, VPLEX has unique capabilities that storage
administrators value and seek in order to enhance their existing data
centers. It delivers distributed, dynamic, and smart functionality into
existing or new data centers to provide storage virtualization across
geographical boundaries.
◆ VPLEX is distributed, because it is a single interface for
multi-vendor storage and it delivers dynamic data mobility:
applications and data can be moved in real time, with no outage
required.
◆ VPLEX is dynamic, because it provides data availability and
flexibility, and maintains business operations through failures that
traditionally require outages or manual restore procedures.
◆ VPLEX is smart, because its unique AccessAnywhere technology
can present the same data, and keep it consistent, within and
between sites, enabling distributed data collaboration.
Because of these capabilities, VPLEX delivers unique and
differentiated value to address three distinct requirements within our
target customers’ IT environments:
◆ The ability to dynamically move applications and data across
different compute and storage installations, be they within the
same data center, across a campus, within a geographical region –
and now, with VPLEX Geo, across even greater distances.
◆ The ability to create high-availability storage and a compute
infrastructure across these same varied geographies with
unmatched resiliency.
◆ The ability to provide efficient real-time data collaboration over
distance for such “big data” applications as video,
geographic/oceanographic research, and more.
EMC VPLEX technology is a scalable, distributed-storage federation
solution that provides non-disruptive, heterogeneous
data-movement and volume-management functionality.
Insert VPLEX technology between hosts and storage in a storage area
network (SAN) and data can be extended over distance within,
between, and across data centers.

The VPLEX architecture provides a highly available solution suitable
for many deployment strategies including:
◆ Application and Data Mobility: The movement of virtual
machines (VMs) without downtime. An example is shown in
Figure 1.

Figure 1 Application and data mobility example

Storage administrators can automatically balance
loads through VPLEX, using storage and compute resources from
either cluster’s location. When combined with server
virtualization, VPLEX allows users to transparently move and
relocate virtual machines and their corresponding applications
and data over distance. This provides a unique capability, allowing
users to relocate, share, and balance infrastructure resources
between sites, which can be within a campus or between data
centers up to 5 ms RTT apart with VPLEX Metro, or further apart
(up to 50 ms RTT) across asynchronous distances with VPLEX Geo.

Note: Submit an RPQ if VPLEX Metro is required at latencies up to 10 ms, or
check the support matrix for the latest supported latencies.



◆ HA Infrastructure: Reduces recovery time objective (RTO).
An example is shown in Figure 2.

Figure 2 HA infrastructure example

High availability is something that many products claim to
deliver. Ultimately, a high availability solution is supposed to
protect against a failure and keep an application online. Storage
administrators plan around HA to provide near-continuous
uptime for their critical applications, and to automate the restart of
an application once a failure has occurred, with as little human
intervention as possible.
With conventional solutions, customers typically have to choose a
Recovery Point Objective (RPO) and a Recovery Time Objective (RTO).
But even though some solutions offer small RTOs and RPOs, there can
still be downtime and, for most customers, any downtime can be
costly.

◆ Distributed Data Collaboration: Increases utilization of
passive disaster recovery (DR) assets and provides simultaneous
access to data. An example is shown in Figure 3.

Figure 3 Distributed data collaboration example

• Distributed data collaboration applies when a workforce has
multiple users at different sites who need to work on the same
data and keep the dataset consistent when changes are made.
Use cases include co-development of software, where
development happens across different teams in separate
locations, and collaborative workflows such as engineering,
graphic arts, videos, educational programs, designs, research
reports, and so forth.
• When customers have tried to build collaboration across
distance with traditional solutions, they have normally had to
save the entire file at one location and then send it to another
site using FTP. This is slow, can incur heavy bandwidth costs
for large files (or even for small files that move regularly), and
hurts productivity because the other sites can sit
idle while they wait to receive the latest data from another site.
If teams decide to do their own work independently of each
other, the dataset quickly becomes inconsistent, because
multiple people are working on it at the same time and are
unaware of each other’s most recent changes. Bringing all of
the changes together in the end is time-consuming and costly,
and grows more complicated as the dataset gets larger.



VPLEX product offerings
VPLEX first meets high-availability and data mobility requirements
and then scales up to the I/O throughput required for the front-end
applications and back-end storage.
High-availability and data mobility features are characteristics of
VPLEX Local, VPLEX Metro, and VPLEX Geo.
A VPLEX cluster consists of one, two, or four engines (each
containing two directors), and a management server. A dual-engine
or quad-engine cluster also contains a pair of Fibre Channel switches
for communication between directors.
Each engine is protected by a standby power supply (SPS), and each
Fibre Channel switch gets its power through an uninterruptible
power supply (UPS). (In a dual-engine or quad-engine cluster, the
management server also gets power from a UPS.)
The management server has a public Ethernet port, which provides
cluster management services when connected to the customer
network.
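The cluster composition described above can be summarized in a short Python sketch. It is an illustration only and not part of any VPLEX tooling: the names ClusterLayout and describe_cluster are hypothetical, the count of one SPS per engine is an assumption based on the statement that each engine is protected by an SPS, and the single-engine UPS value is inferred from the parenthetical remark about dual-engine and quad-engine clusters.

```python
from dataclasses import dataclass

# Figures taken from the cluster description in this section.
VALID_ENGINE_COUNTS = (1, 2, 4)   # single-, dual-, or quad-engine cluster
DIRECTORS_PER_ENGINE = 2


@dataclass(frozen=True)
class ClusterLayout:
    """Hypothetical summary of the hardware in one VPLEX cluster."""
    engines: int
    directors: int
    fc_switches: int           # switches for communication between directors
    sps_units: int             # assumption: one standby power supply per engine
    mgmt_server_on_ups: bool   # inferred: only stated for dual/quad clusters


def describe_cluster(engines: int) -> ClusterLayout:
    """Derive the component counts for a cluster with the given engine count."""
    if engines not in VALID_ENGINE_COUNTS:
        raise ValueError(f"a cluster has 1, 2, or 4 engines, not {engines}")
    multi_engine = engines > 1
    return ClusterLayout(
        engines=engines,
        directors=engines * DIRECTORS_PER_ENGINE,
        fc_switches=2 if multi_engine else 0,   # a pair, dual/quad only
        sps_units=engines,
        mgmt_server_on_ups=multi_engine,
    )


if __name__ == "__main__":
    for count in VALID_ENGINE_COUNTS:
        print(describe_cluster(count))
```

Keeping the valid engine counts in one tuple makes the one, two, or four rule explicit, so an unsupported request such as three engines fails early instead of producing a misleading layout.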
This section provides information on the following:
◆ “VPLEX Local, VPLEX Metro, and VPLEX Geo”
◆ “Architecture highlights”

VPLEX Local, VPLEX Metro, and VPLEX Geo
EMC offers VPLEX in three configurations to address customer needs
for high-availability and data mobility:
◆ VPLEX Local
◆ VPLEX Metro
◆ VPLEX Geo

Figure 4 provides an example of each.

Figure 4 VPLEX offerings

VPLEX Local
VPLEX Local provides seamless, non-disruptive data mobility and
the ability to manage multiple heterogeneous arrays from a single
interface within a data center.
VPLEX Local allows increased availability, simplified management,
and improved utilization across multiple arrays.

VPLEX Metro with AccessAnywhere
VPLEX Metro with AccessAnywhere enables active-active, block-level
access to data between two sites within synchronous distances.
The distance is limited by what synchronous behavior can
withstand, and by considerations of host application stability and
MAN traffic. Depending on the application, the recommended
round-trip time (RTT) for Metro is less than or equal to 5 ms.
The combination of virtual storage with VPLEX Metro and virtual
servers enables the transparent movement of virtual machines and
storage across a distance. This technology provides improved
utilization across heterogeneous arrays and multiple sites.

Note: Refer to VPLEX and vendor-specific White Papers for confirmation
of latency limitations.



VPLEX Geo with AccessAnywhere
VPLEX Geo with AccessAnywhere enables active-active, block-level
access to data between two sites within asynchronous distances.
VPLEX Geo enables more cost-effective use of resources and power.
Geo provides the same distributed device flexibility as Metro but
extends the distance to within 50 ms RTT. As with any
asynchronous transport medium, bandwidth and application sharing
on the link are also important considerations for optimal
behavior.

Note: This TechBook focuses on the Metro configuration only.
VPLEX Witness is supported with VPLEX Geo; however, that
configuration is beyond the scope of this TechBook.

Architecture highlights
VPLEX support is open and heterogeneous, supporting both EMC
storage and common arrays from other storage vendors, such as
HDS, HP, and IBM. VPLEX conforms to established World Wide
Name (WWN) guidelines that can be used for zoning.
VPLEX supports operating systems in both physical and
virtual server environments, including VMware ESX and Microsoft
Hyper-V. VPLEX supports network fabrics from Brocade and Cisco,
including legacy McData SANs.

Note: For the latest information, refer to the EMC Simple Support
Matrix (ESSM) for supported host types and to the connectivity
ESM for fabric and extended fabric support.



An example of the architecture is shown in Figure 5.

Figure 5 Architecture highlights

Table 1 lists an overview of VPLEX features along with the benefits.

Table 1 Overview of VPLEX features and benefits

Features                        Benefits

Mobility                        Move data and applications without impact on
                                users.

Resiliency                      Mirror across arrays without host impact, and
                                increase high availability for critical
                                applications.

Distributed cache coherency     Automate sharing, balancing, and failover of
                                I/O across the cluster and between clusters.

Advanced data caching           Improve I/O performance and reduce storage
                                array contention.

Virtual Storage federation      Achieve transparent mobility and access in a
                                data center and between data centers.

Scale-out cluster architecture  Start small and grow larger with predictable
                                service levels.

For all VPLEX products, the appliance-based VPLEX technology:
◆ Presents storage area network (SAN) volumes from back-end
arrays to VPLEX engines
◆ Packages the SAN volumes into sets of VPLEX virtual volumes
with user-defined configuration and protection levels
◆ Presents virtual volumes to production hosts in the SAN via the
VPLEX front-end
◆ For VPLEX Metro and VPLEX Geo products, presents a global,
block-level directory for distributed cache and I/O between
VPLEX clusters.
Location and distance determine high-availability and data mobility
requirements. For example, if all storage arrays are in a single data
center, a VPLEX Local product federates back-end storage arrays
within the data center.
When back-end storage arrays span two data centers, the
AccessAnywhere feature in a VPLEX Metro or a VPLEX Geo product
federates storage in an active-active configuration between VPLEX
clusters. Choosing between VPLEX Metro or VPLEX Geo depends on
distance and data synchronicity requirements.
Application and back-end storage I/O throughput determine the
number of engines in each VPLEX cluster. High-availability features
within the VPLEX cluster allow for non-disruptive software upgrades
and expansion as I/O throughput increases.



Metro high availability design considerations
VPLEX Metro 5.0 (and above) introduces high availability concepts
beyond what is traditionally known as physical high availability.
The introduction of VPLEX Witness to a high availability
environment allows the VPLEX solution to increase the overall
availability of the environment by arbitrating between a pure
communication failure between two primary sites and a true site
failure in a multi-site architecture. EMC VPLEX is the first product to
bring to market the features and functionality provided by VPLEX
Witness, which prevents failures and asserts the activity between
clusters in a multi-site architecture.
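The value of a third observer can be illustrated with a minimal Python
sketch. This is not the VPLEX Witness implementation: the function name,
its three boolean inputs, and the returned labels are all assumptions made
for illustration. It only shows why a cluster that loses contact with its
peer cannot tell a failed link from a failed site on its own, and how a
witness that can still see (or not see) the peer resolves the ambiguity.

```python
from enum import Enum


class Diagnosis(Enum):
    HEALTHY = "healthy"
    LINK_FAILURE = "inter-cluster link failure"
    SITE_FAILURE = "site failure"
    UNDETERMINED = "undetermined"


def diagnose(peer_reachable: bool, witness_reachable: bool,
             witness_sees_peer: bool) -> Diagnosis:
    """Classify a failure as seen from one cluster (hypothetical model)."""
    if peer_reachable:
        return Diagnosis.HEALTHY
    if not witness_reachable:
        # Without a third observer, both failure types look identical.
        return Diagnosis.UNDETERMINED
    if witness_sees_peer:
        # The peer is alive, so only the inter-cluster link has failed.
        return Diagnosis.LINK_FAILURE
    # Neither this cluster nor the witness can reach the peer.
    return Diagnosis.SITE_FAILURE
```

In this simplified model, the two cases the text distinguishes map to
`LINK_FAILURE` and `SITE_FAILURE`; how VPLEX acts on each case is covered
later in the TechBook.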
Through this TechBook, administrators and customers gain an
understanding of the high availability solution that VPLEX provides
them:
◆ Enabling of load balancing between their data centers
◆ Active/active use of both of their data centers
◆ Increased availability for their applications (no single points of
storage failure, auto-restart)
◆ Fully automatic failure handling
◆ Better resource utilization
◆ Lower CapEx and lower OpEx as a result
Broadly speaking, when one considers legacy environments one
typically sees “highly” available designs implemented within a data
center, and disaster recovery type functionality deployed between
data centers.
One of the main reasons for this is that within data centers
components generally operate in active/active mode (or active/passive
with automatic failover), whereas between data centers legacy
replication technologies use active/passive techniques, which require
manual failover to use the passive component.
When using VPLEX Metro active/active replication technology in
conjunction with new features, such as the VPLEX Witness server (as
described in “Introduction to VPLEX Witness” on page 81), the lines
between local high availability and long-distance disaster recovery
are somewhat blurred, since HA can be stretched beyond the data
center walls. Since replication is a by-product of federated and
distributed storage, disaster avoidance is also achievable within
these geographically dispersed HA environments.

Planned application mobility compared with disaster restart
This section compares planned application mobility and disaster
restart.

Planned application mobility
An online planned application mobility event is one in which an
application or virtual machine is moved, fully online and without
disruption, from one location to another, in either the same or a
remote data center. This type of movement can only be performed when
all components that participate in the movement are available (e.g.,
the running state of the application or VM exists in volatile memory,
which would not be the case if an active site had failed) and when all
participating hosts have read/write access at both locations to the
same block storage. Additionally, a mechanism is required to transition
volatile memory data from one system/host to another. When
performing planned online mobility jobs over distance, a prerequisite
is the use of an active/active underlying storage replication
solution (VPLEX Metro only at the time of this publication).
An example of this online application mobility is VMware
vMotion, where a virtual machine must be fully operational
before it can be moved. It may sound obvious, but if the VM were
offline, the movement could not be performed online. This is
important to understand and is the key difference from application
restart.
When vMotion is executed all live components that are required to
make the VM function are copied elsewhere in the background before
cutting the VM over.
Since these types of mobility tasks are totally seamless to the user,
one associated use case is disaster avoidance, where an
application or VM can be moved ahead of a disaster (such as a
hurricane or tsunami) while the running state is still available to be
copied. In other cases, mobility can be used to load
balance across multiple systems or even data centers.
Because the running state must be available for these types of
relocations, these movements are always deemed planned activities.



Disaster restart
Disaster restart is where an application or service is restarted in
another location after a failure (be it on a different server or in a
different data center). It will typically interrupt the service or
application during the failover.
A good example of this technology is a VMware HA cluster
configured over two geographically dispersed sites using VPLEX
Metro. The cluster is formed over a number of ESX servers,
and single or multiple virtual machines can run on any of the
ESX servers within the cluster.
If an active ESX server were to fail (perhaps due to
site failure), the VM can be restarted on a remaining ESX server
within the cluster at the remote site, because the datastore where it
was running spans the two locations: it is configured on a VPLEX
Metro distributed volume. This would be deemed an unplanned
failover, which incurs a small outage of the application. The
running state of the VM was lost when the ESX server failed, so
the service is unavailable until the VM has restarted elsewhere.
Although a planned application mobility event and an
unplanned disaster restart result in the same outcome (i.e., a
service relocating elsewhere), there is a big
difference: the planned mobility job keeps the application online
during the relocation, whereas the disaster restart leaves the
application offline during the relocation while a restart is
conducted.
Compared to active/active technologies, the use of legacy
active/passive solutions in these restart scenarios would
typically require an extra step over and above standard application
failover, since a storage failover would also be required (i.e.,
changing the status of the write-disabled remote copy to read/write and
reversing the replication direction). This is where VPLEX can assist
greatly: because it is active/active, in most cases no manual
intervention at the storage layer is required, which greatly reduces
the complexity of a DR failover solution. If best practices for physical
high availability and redundant hardware connectivity are followed,
VPLEX Witness will truly provide customers with
“Absolute” availability!


Chapter 2 Hardware and Software

This chapter provides insight into the hardware and software
interfaces that can be used by an administrator to manage all aspects
of a VPLEX system. In addition, a brief overview of the internal
system software is included. Topics include:
◆ Introduction
◆ VPLEX management interfaces
◆ Simplified storage management
◆ Management server user accounts
◆ Management server software
◆ Director software
◆ Configuration overview
◆ I/O implementation



Introduction
This section provides basic information on the following:
◆ “VPLEX I/O”
◆ “High-level VPLEX I/O flow”
◆ “Distributed coherent cache”
◆ “VPLEX family clustering architecture”

VPLEX I/O
VPLEX is built on a lightweight protocol that maintains cache
coherency for storage I/O. The VPLEX cluster provides highly
available cache, processing power, and front-end and back-end Fibre
Channel interfaces.
EMC hardware powers the VPLEX cluster design so that all devices
are always available and I/O that enters the cluster from anywhere
can be serviced by any node within the cluster.
The AccessAnywhere feature in the VPLEX Metro and VPLEX Geo
products extends the cache coherency between data centers at a
distance.

High-level VPLEX I/O flow
VPLEX abstracts a block-level ownership model into a highly
organized hierarchical directory structure that is updated for every
I/O and shared across all engines. The directory uses a small amount of
metadata and tells all other engines in the cluster, in 4k block
transmissions, which block of data is owned by which engine and at
what time.
After a write completes and ownership is reflected in the directory,
VPLEX dynamically manages read requests for the completed write
in the most efficient way possible.
When a read request arrives, VPLEX checks the directory for an
owner. After VPLEX locates the owner, the read request goes directly
to that engine.



On reads from other engines, VPLEX checks the directory and tries to
pull the read I/O directly from the engine cache to avoid going to the
physical arrays to satisfy the read.
This model enables VPLEX to stretch the cluster as VPLEX distributes
the directory between clusters and sites. Due to the hierarchical
nature of the VPLEX directory, VPLEX is efficient, has minimal
overhead, and enables I/O communication over distance.
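The ownership model in this section can be sketched in a few lines of
Python. This is a toy model under stated assumptions, not VPLEX code: the
class `OwnershipDirectory`, the function `route_read`, and the engine names
are hypothetical, and the real directory is hierarchical and distributed
rather than a single dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class OwnershipDirectory:
    """Maps a block address to the engine that owns it (illustrative only)."""
    owners: Dict[int, str] = field(default_factory=dict)

    def record_write(self, block: int, engine: str) -> None:
        # After a write completes, ownership is reflected in the directory.
        self.owners[block] = engine

    def owner_of(self, block: int) -> Optional[str]:
        return self.owners.get(block)


def route_read(directory: OwnershipDirectory, block: int,
               backend: str = "back-end array") -> str:
    """Send the read to the owning engine if one exists, else to storage."""
    owner = directory.owner_of(block)
    return owner if owner is not None else backend
```

For example, after `record_write(7, "engine-1")`, a read of block 7 is
routed to `"engine-1"`, while a read of a block with no recorded owner
falls through to the back-end array.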

Distributed coherent cache
The VPLEX engine includes two directors that each have a total of 36
GB (version 5 hardware, also known as VS2) of local cache. Cache
pages are keyed by volume and go through a lifecycle from staging,
to visible, to draining.
The global cache is a combination of all director caches that spans all
clusters. The cache page holder information is maintained in a
memory data structure called a directory.
The directory is divided into chunks and distributed among the
VPLEX directors; locality controls where ownership is
maintained.
A meta-directory identifies which director owns which directory
chunks within the global directory.

VPLEX family clustering architecture
The VPLEX family uses a unique clustering architecture to help
customers break the boundaries of the data center and allow servers
at multiple data centers to have read/write access to shared block
storage devices. A VPLEX cluster, as shown in Figure 6 on page 34,
can scale up through the addition of more engines, and scale out by
connecting clusters into an EMC VPLEX Metro (two VPLEX Metro
clusters connected within Metro distances).



Figure 6 VPLEX cluster example

VPLEX Metro transparently moves and shares workloads for a
variety of applications, VMs, databases and cluster file systems.
VPLEX Metro consolidates data centers, and optimizes resource
utilization across data centers. In addition, it provides non-disruptive
data mobility, heterogeneous storage management, and improved
application availability. VPLEX Metro supports up to two clusters,
which can be in the same data center or at two different sites within
synchronous environments. VPLEX Geo, which clusters across distances,
is the asynchronous partner to Metro. It is out of the scope of this
document to analyze VPLEX Geo capabilities.



VPLEX single, dual, and quad engines
The VPLEX engine provides cache and processing power with
redundant directors. Each director includes two I/O modules
and one optional WAN COM I/O module for use in VPLEX Metro
and VPLEX Geo configurations.
The rackable hardware components are shipped in NEMA standard
racks or provided, as an option, as a field rackable product. Table 2
provides a list of configurations.

Table 2 Configurations at a glance

Components                                    Single engine  Dual engine  Quad engine
Directors                                     2              4            8
Redundant Engine SPSs                         Yes            Yes          Yes
FE Fibre Channel ports (VS1)                  16             32           64
FE Fibre Channel ports (VS2)                  8              16           32
BE Fibre Channel ports (VS1)                  16             32           64
BE Fibre Channel ports (VS2)                  8              16           32
Cache size (VS1 Hardware)                     64 GB          128 GB       256 GB
Cache size (VS2 Hardware)                     72 GB          144 GB       288 GB
Management Servers                            1              1            1
Internal Fibre Channel switches (Local Comm)  None           2            2
Uninterruptable Power Supplies (UPSs)         None           2            2
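The values in Table 2 scale linearly with the number of engines, which a
short Python sketch makes explicit. The function name `cluster_config` and
the dictionary keys are hypothetical; the per-engine figures are taken
directly from the table.

```python
def cluster_config(engines: int, hardware: str = "VS2") -> dict:
    """Reproduce one column of Table 2 from the engine count."""
    if engines not in (1, 2, 4):
        raise ValueError("Table 2 lists single, dual, and quad engine only")
    if hardware not in ("VS1", "VS2"):
        raise ValueError("hardware must be 'VS1' or 'VS2'")
    ports_per_engine = {"VS1": 16, "VS2": 8}[hardware]
    cache_gb_per_engine = {"VS1": 64, "VS2": 72}[hardware]
    multi_engine = engines > 1
    return {
        "directors": 2 * engines,
        "fe_ports": ports_per_engine * engines,
        "be_ports": ports_per_engine * engines,
        "cache_gb": cache_gb_per_engine * engines,
        "management_servers": 1,
        # Local COM switches and their UPSs appear only with 2+ engines.
        "internal_fc_switches": 2 if multi_engine else 0,
        "ups": 2 if multi_engine else 0,
    }
```

For instance, `cluster_config(4)` yields 8 directors, 32 front-end ports,
and 288 GB of cache, matching the quad engine VS2 column.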

VPLEX sizing tool
Use the EMC VPLEX sizing tool provided by EMC Global Services
Software Development to configure the right VPLEX cluster
configuration.
The sizing tool concentrates on I/O throughput requirement for
installed applications (mail exchange, OLTP, data warehouse, video
streaming, etc.) and back-end configuration such as virtual volumes,
size and quantity of storage volumes, and initiators.



Upgrade paths
VPLEX facilitates application and storage upgrades without a service
window through its flexibility to shift production workloads
throughout the VPLEX technology.
In addition, high-availability features of the VPLEX cluster allow for
non-disruptive VPLEX hardware and software upgrades.
This flexibility means that VPLEX is always servicing I/O and never
has to be completely shut down.

Hardware upgrades
Upgrades are supported for single-engine VPLEX systems to dual- or
quad-engine systems.
A single VPLEX Local system can be reconfigured to work as a
VPLEX Metro or VPLEX Geo by adding a new remote VPLEX cluster.
Additionally, an entire VPLEX VS1 cluster (hardware) can be fully
upgraded to VS2 hardware non-disruptively.
Information for VPLEX hardware upgrades is in the Procedure
Generator that is available through EMC Powerlink.

Software upgrades
VPLEX features a robust non-disruptive upgrade (NDU) technology
to upgrade the software on VPLEX engines and VPLEX Witness
servers. Management server software must be upgraded before
running the NDU.
Due to the VPLEX distributed coherent cache, directors elsewhere in
the VPLEX installation service I/Os while the upgrade is taking
place. This alleviates the need for service windows and reduces RTO.
The NDU includes the following steps:
◆ Preparing the VPLEX system for the NDU
◆ Starting the NDU
◆ Transferring the I/O to an upgraded director
◆ Completing the NDU
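The idea behind these steps can be sketched as a rolling upgrade in
Python. This is a simplified illustration, not the NDU procedure itself: it
assumes directors are upgraded one at a time, which this section does not
state, and the function name, director names, and version strings are
hypothetical. The point it shows is that some director is always left to
service I/O.

```python
def rolling_upgrade(directors, new_version):
    """Upgrade directors one at a time and log who served I/O meanwhile."""
    if len(directors) < 2:
        raise ValueError("need at least two directors to keep servicing I/O")
    versions = {name: "old" for name in directors}
    log = []
    for target in directors:
        # I/O is transferred to the directors that are not being upgraded.
        serving = [d for d in directors if d != target]
        versions[target] = new_version
        log.append((target, serving))
    return versions, log
```

Each log entry pairs the director being upgraded with the directors that
remained available for I/O during that step.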



VPLEX management interfaces
Within the VPLEX cluster, TCP/IP-based management traffic travels
through a private network subnet to the components in one or more
clusters. In VPLEX Metro and VPLEX Geo, VPLEX establishes a VPN
tunnel between the management servers of both clusters. When
VPLEX Witness is deployed, the VPN tunnel is extended to a 3-way
tunnel including both Management Servers and VPLEX Witness.

Web-based GUI
VPLEX includes a Web-based graphical user interface (GUI) for
management. The EMC VPLEX Management Console Help provides
more information on using this interface.
To perform other VPLEX operations that are not available in the GUI,
refer to the CLI, which supports full functionality. The EMC VPLEX
CLI Guide provides a comprehensive list of VPLEX commands and
detailed instructions on using those commands.
The EMC VPLEX Management Console includes, but is not limited to,
the following functions:
◆ Storage array discovery and provisioning
◆ Local provisioning
◆ Distributed provisioning
◆ Mobility Central
◆ Online help

VPLEX CLI
VPlexcli is a command line interface (CLI) to configure and operate
VPLEX systems. It also generates the EZ Wizard Setup process to
make installation of VPLEX easier and quicker.
The CLI is divided into command contexts. Some commands are
accessible from all contexts, and are referred to as ‘global commands’.
The remaining commands are arranged in a hierarchical context tree
and can only be executed from the appropriate location in the context
tree.



The VPlexcli encompasses all capabilities in order to function if the
management station is unavailable. It is fully functional and
comprehensive, supporting full configuration, provisioning, and
advanced systems management capabilities.

SNMP support for performance statistics
The VPLEX SNMPv2c agent:
◆ Supports retrieval of performance-related statistics as published
in the VPLEX-MIB.mib.
◆ Runs on the management server and fetches performance-related
data from individual directors using a firmware-specific
interface.
◆ Provides SNMP MIB data for directors for the local cluster only.

LDAP/AD support
VPLEX offers Lightweight Directory Access Protocol (LDAP) or
Active Directory for an authentication directory service.

VPLEX Element Manager API
VPLEX Element Manager API uses the Representational State
Transfer (REST) software architecture for distributed systems such as
the World Wide Web. It allows software developers and other users to
use the API to create scripts to run VPLEX CLI commands.
The VPLEX Element Manager API supports all VPLEX CLI
commands that can be executed from the root context on a director.



Simplified storage management
VPLEX supports a variety of arrays from various vendors covering
both active/active and active/passive type arrays. VPLEX simplifies
storage management by allowing simple LUNs, provisioned from the
various arrays, to be managed through a centralized management
interface that is simple to use and very intuitive. In addition, a
VPLEX Metro or VPLEX Geo environment that spans data centers
allows the storage administrator to manage both locations through
the one interface from either location by logging in at the local site.



Management server user accounts
The management server requires the setup of user accounts for access
to certain tasks. Table 3 describes the types of user accounts on the
management server.

Table 3 Management server user accounts

Account type           Purpose

admin (customer)       • Performs administrative actions, such as user
                         management
                       • Creates and deletes Linux CLI accounts
                       • Resets passwords for all Linux CLI users
                       • Modifies the public Ethernet settings

service (EMC service)  • Starts and stops necessary OS and VPLEX services
                       • Cannot modify user accounts
                       • (Customers do have access to this account)

Linux CLI accounts     • Uses VPlexcli to manage federated storage

All account types      • Uses VPlexcli
                       • Modifies their own password
                       • Can SSH or VNC into the management server
                       • Can SCP files off the management server from
                         directories to which they have access

Some service and administrator tasks require OS commands that
require root privileges. The management server has been configured
to use the sudo program to provide these root privileges just for the
duration of the command. Sudo is a secure and well-established
UNIX program for allowing users to run commands with root
privileges.
VPLEX documentation will indicate which commands must be
prefixed with "sudo" in order to acquire the necessary privileges. The
sudo command will ask for the user's password when it runs for the
first time, to ensure that the user knows the password for his account.
This prevents unauthorized users from executing these privileged
commands when they find an authenticated SSH login that was left
open.



Management server software
The management server software is installed during manufacturing
and is fully field upgradeable. The software includes:
◆ VPLEX Management Console
◆ VPlexcli
◆ Server Base Image Updates (when necessary)
◆ Call-home software
Each is briefly discussed in this section.

Management console
The VPLEX Management Console provides a graphical user interface
(GUI) to manage the VPLEX cluster. The GUI can be used to
provision storage, as well as manage and monitor system
performance.
Figure 7 on page 42 shows the VPLEX Management Console window
with the cluster tree expanded to show the objects that are
manageable from the front-end, back-end, and the federated storage.



Figure 7 VPLEX Management Console

The VPLEX Management Console provides online help for all of its
available functions. Online help can be accessed in the following
ways:
◆ Click the Help icon in the upper right corner on the main screen
to open the online help system, or in a specific screen to open a
topic specific to the current task.
◆ Click the Help button on the task bar to display a list of links to
additional VPLEX documentation and other sources of
information.



Figure 8 is the welcome screen of the VPLEX Management Console
GUI, which uses a secure HTTP connection via a browser. The
interface uses Flash technology for rapid response and a unique look
and feel.

Figure 8 Management Console welcome screen

Command line interface
The VPlexcli is a command line interface (CLI) for configuring and
running the VPLEX system, for setting up and monitoring the
system’s hardware and inter-site links (including com/tcp), and for
configuring global inter-site I/O cost and link-failure recovery. The
CLI runs as a service on the VPLEX management server and is
accessible using Secure Shell (SSH).



For information about the VPlexcli, refer to the EMC VPLEX CLI
Guide.

System reporting
VPLEX system reporting software collects configuration information
from each cluster and each engine. The resulting configuration file
(XML) is zipped and stored locally on the management server or
presented to the SYR system at EMC via call home.
You can schedule a weekly job to automatically collect SYR data
(VPlexcli command scheduleSYR), or manually collect it whenever
needed (VPlexcli command syrcollect).



Director software
The director software provides:
◆ Basic Input/Output System (BIOS) — Provides low-level
hardware support to the operating system, and maintains boot
configuration.
◆ Power-On Self Test (POST) — Provides automated testing of
system hardware during power on.
◆ Linux — Provides basic operating system services to the VPlexcli
software stack running on the directors.
◆ VPLEX Power and Environmental Monitoring (ZPEM) —
Provides monitoring and reporting of system hardware status.
◆ EMC Common Object Model (ECOM) — Provides management
logic and interfaces to the internal components of the system.
◆ Log server — Collates log messages from director processes and
sends them to the SMS.
◆ EMC GeoSynchrony™ (I/O Stack) — Processes I/O from hosts,
performs all cache processing, replication, and virtualization
logic, and interfaces with arrays for claiming and I/O.



Configuration overview
The VPLEX configurations are based on how many engines are in the
cabinet. The basic configurations are single, dual, and quad
(previously known as small, medium, and large).
The remainder of this section describes each configuration
size.

Single engine configurations
The VPLEX single engine configuration includes the following:
◆ Two directors
◆ One engine
◆ Redundant engine SPSs
◆ 8 front-end Fibre Channel ports (16 for VS1 hardware)
◆ 8 back-end Fibre Channel ports (16 for VS1 hardware)
◆ One management server
The unused space between engine 1 and the management server as
shown in Figure 9 on page 47 is intentional.



Figure 9 VPLEX single engine configuration

Dual configurations
The VPLEX dual engine configuration includes the following:
◆ Four directors
◆ Two engines



◆ Redundant Fibre Channel COM switches for local COM; UPS for
each Fibre Channel switch
Figure 10 shows an example of a dual engine configuration.

[Figure: cabinet layout. Labeled components: Fibre Channel switch B,
UPS B, Fibre Channel switch A, UPS A, Management server, Engine 2,
SPS 2, Engine 1, SPS 1]

Figure 10 VPLEX dual engine configuration

Quad configurations
The VPLEX quad engine configuration includes the following:
◆ Eight directors
◆ Four engines



◆ Redundant Fibre Channel COM switches for local COM; UPS for
each Fibre Channel switch
Figure 11 shows an example of a quad configuration.

[Figure: cabinet layout. Labeled components: Engine 4, SPS 4, Engine 3,
SPS 3, Fibre Channel switch B, UPS B, Fibre Channel switch A, UPS A,
Management server, Engine 2, SPS 2, Engine 1, SPS 1]

Figure 11 VPLEX quad engine configuration



I/O implementation
The VPLEX cluster utilizes a write-through mode when configured
for either VPLEX Local or Metro whereby all writes are written
through the cache to the back-end storage. To maintain data integrity,
a host write is acknowledged only after the back-end arrays (in one
cluster in case of VPLEX Local and in two clusters in case of VPLEX
Metro) acknowledge the write.
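The acknowledgment rule above can be expressed as a minimal Python sketch.
It is a simplification under stated assumptions, not the GeoSynchrony I/O
stack: `BackendArray`, `write_through`, and the dictionary used as a cache
are hypothetical, and the sketch updates the cache only after every
back-end array has acknowledged, which is one simple way to model
write-through.

```python
class BackendArray:
    """Stand-in for a back-end storage array (hypothetical)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.blocks = {}

    def write(self, block, data):
        """Store the data and return True, or return False if unavailable."""
        if not self.available:
            return False
        self.blocks[block] = data
        return True


def write_through(block, data, cache, arrays):
    """Return True (acknowledge the host) only if every array acknowledged.

    VPLEX Local would pass the arrays of one cluster; VPLEX Metro would
    pass the arrays of both clusters.
    """
    if not arrays:
        raise ValueError("at least one back-end array is required")
    acks = [array.write(block, data) for array in arrays]
    if all(acks):
        cache[block] = data
        return True
    return False
```

With two arrays standing in for the two clusters of a VPLEX Metro, the host
write is acknowledged only when both report success.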
This section describes the VPLEX cluster caching layers, roles, and
interactions. It gives an overview of how reads and writes are
handled within the VPLEX cluster and how distributed cache
coherency works. This is important to the introduction of high
availability concepts.

Cache coherence
Cache coherence creates a consistent global view of a volume.
Distributed cache coherence is maintained using a directory. There is
one directory per virtual volume and each directory is split into
chunks (4096 directory entries within each). These chunks exist only
if they are populated. There is one directory entry per global cache
page, with responsibility for:
◆ Tracking page owner(s) and remembering the last writer
◆ Locking and queuing

Meta-directory
Directory chunks are managed by the meta-directory, which assigns
and remembers chunk ownership. These chunks can migrate using
Locality-Conscious Directory Migration (LCDM). This
meta-directory knowledge is cached across the share group (i.e., a
group of multiple directors within the cluster that are exporting a
given virtual volume) for efficiency.
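The relationship between directory entries, chunks, and the meta-directory
can be sketched as follows. The chunk size of 4096 entries and the
"chunks exist only if populated" behavior come from the text; everything
else is an assumption for illustration, including the class name
`VolumeDirectory`, the entry fields, and the rule that the first director
to populate a chunk becomes its owner. Chunk migration (LCDM) is not
modeled.

```python
CHUNK_ENTRIES = 4096  # directory entries per chunk, as stated in the text


class VolumeDirectory:
    """Per-virtual-volume directory, split into sparse chunks (toy model)."""

    def __init__(self):
        self.chunks = {}       # chunk index -> {page number -> entry}
        self.chunk_owner = {}  # meta-directory: chunk index -> director

    def record_write(self, page, director):
        index = page // CHUNK_ENTRIES
        if index not in self.chunks:
            # A chunk is created only when its first entry is populated.
            self.chunks[index] = {}
            # Assumption: the first writer is assigned chunk ownership.
            self.chunk_owner[index] = director
        # One entry per cache page: current owner(s) and the last writer.
        self.chunks[index][page] = {
            "owners": {director},
            "last_writer": director,
        }

    def lookup(self, page):
        chunk = self.chunks.get(page // CHUNK_ENTRIES)
        return None if chunk is None else chunk.get(page)
```

Writing page 5000 populates only chunk 1 (5000 // 4096), so chunk 0 never
comes into existence until one of its pages is touched.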

How a read is handled
When a host makes a read request, VPLEX first searches its local
cache. If the data is found there, it is returned to the host.



If the data is not found in local cache, VPLEX searches global cache.
Global cache includes all directors that are connected to one another
within the single VPLEX cluster for VPLEX Local, and all of the
VPLEX clusters for both VPLEX Metro and VPLEX Geo. If there is a
global read hit in the local cluster (i.e. same cluster, but different
director) then the read will be serviced from global cache in the same
cluster. The read could also be serviced by the remote global cache if
the consistency group setting “local read override” is set to false (the
default is true). Whenever the read is serviced from global cache
(same cluster or remote), a copy is also stored in the local cache of the
director from where the request originated.
If a read cannot be serviced from either local cache or global cache, it
is read directly from the back-end storage. In these cases both the
global and local cache are updated.

I/O flow of a local read hit
1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On hit, data returned from local cache to host.

I/O flow of a global read hit
1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On miss, look up in global cache.
4. On hit, data is copied from owner director into local cache.
5. Data returned from local cache to host.

I/O flow of a read miss
1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On miss, look up in global cache.
4. On miss, data read from storage volume into local cache.
5. Data returned from local cache to host.
6. The director that returned the data becomes the chunk owner.
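The three read flows share one lookup order, which the following Python
sketch illustrates. It is a simplified model, not VPLEX code: the function
name `read` is hypothetical, and the local cache, global cache, and storage
volume are each represented by a plain dictionary, whereas the real global
cache is the combination of all director caches tracked through the
directory.

```python
def read(block, local_cache, global_cache, storage):
    """Return (data, source) using the local -> global -> back-end order."""
    if block in local_cache:
        # Local read hit: served straight from the ingress director.
        return local_cache[block], "local cache"
    if block in global_cache:
        # Global read hit: copy into local cache, then return from there.
        local_cache[block] = global_cache[block]
        return local_cache[block], "global cache"
    # Read miss: fetch from the storage volume and update both caches.
    data = storage[block]
    local_cache[block] = data
    global_cache[block] = data
    return data, "back-end storage"
```

Reading the same block twice from one director shows the effect: the first
read is served from back-end storage and the second from local cache.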
