Hacmp | IBM AIX PowerHA Introduction | basics | Demo PPT

Welcome to HACMP
Introduction Demo Class
Email: sales@kerneltraining.com
Call us: +91 8099776681

www.kerneltraining.com
Unit objectives
After completing this unit, you should be able to:
Define High Availability and explain why it is needed
List the key considerations when designing and
implementing a high availability cluster
Outline the features and benefits of HACMP for AIX
Describe the components of an HACMP for AIX cluster
Explain how HACMP for AIX operates in typical cases
HACMP

High Availability and HACMP concepts
After completing this topic, you should be able to:
Define High Availability
Recognize that eliminating single points of failure (SPOFs) is part of
the HACMP implementation process
Outline the features and benefits of HACMP for AIX
Describe the HACMP concepts of topology and resources
Give examples of topology components and resources
Provide a brief description of the software and hardware
components of a typical HACMP cluster

So, what is High Availability?
High Availability characteristics:
The masking or elimination of both planned and unplanned
downtime
The elimination of single points of failure (SPOFs)
Fault resilience and system hardening
No specialized hardware requirement
[Figure: a client reaches a production node/LPAR over the WAN; the workload falls over to a standby node/LPAR.]

Eliminating single points of failure
Cluster object      Eliminated as a single point of failure by
Node                Using multiple nodes
Power source        Using multiple circuits or uninterruptible power supplies
Network adapter     Using redundant network adapters
Network             Using multiple networks to connect nodes
TCP/IP subsystem    Using non-IP networks to connect adjoining nodes and clients
Disk adapter        Using redundant disk adapter or multipath hardware
Disk                Using multiple disks with mirroring or RAID
Application         Adding node for takeover; configuring application monitor
VIO Server          Implementing dual VIO Servers
Site                Adding an additional site
The fundamental goal of (successful) cluster design is
the elimination of single points of failure (SPOFs).
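The rule of thumb behind the table (every cluster object should exist at least twice) can be sketched as a small check. This is only an illustration, assuming a hypothetical inventory dictionary that maps each cluster object to its number of independent instances; the counts are made up and nothing here is part of HACMP itself.

```python
# Hypothetical inventory: cluster object -> number of independent instances.
# The object names mirror the SPOF table; the counts are invented for the example.
inventory = {
    "node": 2,
    "power source": 2,
    "network adapter": 2,
    "network": 1,
    "disk adapter": 2,
    "disk": 2,
    "VIO server": 1,
    "site": 1,
}


def single_points_of_failure(inventory):
    """Return the cluster objects that have fewer than two instances."""
    return sorted(name for name, count in inventory.items() if count < 2)


spofs = single_points_of_failure(inventory)
print("Single points of failure:", ", ".join(spofs) or "none")
```

In this made-up inventory the network, the VIO server, and the site are each flagged, because only one instance of each is listed.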

High availability clusters (HACMP base)
System p and AIX RAS features include:
Application and Partition Mobility
First Failure Data Capture (FFDC)
Dynamic CPU Deallocation
Flexible Service Processor
Redundant Power and Cooling
Error Checking and Correcting (ECC) Memory
Hot Swap Adapters
Dynamic Kernel
Journaled Filesystem
Redundant Data Paths
Dual Disk Adapters (MPIO)
Data Mirroring and/or Striping
Hot Swap / Hot Spare Storage
Redundant Power/Cooling for Storage Arrays
With High Availability Clustering (HACMP)
Protection against node and OS failure with Redundant nodes
Protection against NIC failure with Redundant Network Adapters
Protection against Network failure with Redundant Networks
Self-healing clusters with Application Monitoring
Protection against Site Failure (typically limited by SAN infrastructure)
or no distance limitations with HACMP/XD

What about site failure?
Limited distance (LVM mirroring and SAN): HACMP for AIX
Extended distance: Geographic Clustering Solution
(that is, HACMP/XD)
Distance unlimited
Application, disk, and network independent
Automated site failover and reintegration
A single cluster across two sites
Get more details in HACMP System Administration III – AU620
[Figure: a single cluster spanning two sites, Toronto and Brussels, with data replication between them (Metro Mirror/PPRC, GLVM, GeoRM).]

IBM's HA solution for AIX
HACMP for AIX characteristics:
Stands for High Availability Cluster Multi-processing
Is based on cluster technology (RSCT)
Provides two environments (which can co-exist simultaneously):
Serial (High Availability): the process of ensuring that an application is
available for use through the use of serially accessible shared data and
duplicated resources
Parallel (Cluster Multiprocessing): concurrent access to shared data

Fundamental HACMP concepts
Topology: Physical “networking centric” components
Resources: Entities that are being made highly available
Resource group: A collection of resources, which HACMP controls as a single
unit
A given resource can appear in at most one resource group
Resource group policies:
startup policy: which node the resource group is activated on
fallover policy: determines target when there is a failure
fallback policy: determines fallback behavior
Customization
The process of augmenting HACMP, typically via implementing scripts
Minimum: application start and stop scripts
Optional:
Application monitoring scripts (highly recommended!)
Event customization
Notification, pre- and post-event scripts, recovery scripts, user-defined
events, time until warning (config_too_long timeout)
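The resource group concept above can be sketched as a small data model. This is a minimal illustration rather than HACMP's actual configuration format: the class, the policy labels, and the resource names are all hypothetical. It also checks the rule that a given resource belongs to at most one resource group.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """A collection of resources that the cluster controls as a single unit."""
    name: str
    nodes: list      # participating nodes, highest priority first
    startup: str     # illustrative policy label, not an official HACMP name
    fallover: str    # illustrative policy label
    fallback: str    # illustrative policy label
    resources: set = field(default_factory=set)


def shared_resources(groups):
    """Return {resource: [group names]} for resources used by more than one group.

    The result should be empty for a valid configuration.
    """
    owner = {}
    conflicts = {}
    for group in groups:
        for resource in group.resources:
            if resource in owner:
                conflicts.setdefault(resource, [owner[resource]]).append(group.name)
            else:
                owner[resource] = group.name
    return conflicts


dbrg = ResourceGroup("dbrg", ["nodea", "nodeb"], "home node only",
                     "next priority node", "never",
                     {"dbvg", "/db", "database"})
httprg = ResourceGroup("httprg", ["nodeb", "nodea"], "home node only",
                       "next priority node", "higher priority node",
                       {"httpvg", "/http", "dbvg"})  # dbvg reused on purpose

print(shared_resources([dbrg, httprg]))
```

Here the volume group dbvg is deliberately listed in both groups, so the check reports it as a conflict.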

A highly available cluster
[Figure: two nodes, each running clstrmgr, attached to shared storage, with fallover between the nodes.]
Fundamental Concepts
A cluster comprises physical components (topology) and logical
components (resource groups and resources).

HACMP's topology components (1 of 2)
[Figure labels: node, communication interface.]
The Topology components consist of a cluster, nodes and the
technology that connects them together.

HACMP’s topology components (2 of 2)
[Figure: servers and a PC on an Ethernet/EtherChannel network; a non-IP network between servers (heartbeat on disk, RS232/422); RS/6000 nodes attached over a Fibre Channel SAN to DS8000 and DS4000 storage.]
• Node
  • Any-to-any, including LPARs
  • Minimum number of physical adapters for redundancy must be considered
• Networking
  • Ethernet: physical and virtual, EtherChannel
  • Non-IP: heartbeat on disk, RS-232, target-mode SCSI
• Shared storage
  • Physical: SCSI or Fibre Channel
  • Virtual SCSI

What is HACMP?
An application which:
Controls where resource groups run
Monitors and reacts to events
Provides tools for cluster-wide configuration and synchronization
Relies on other AIX Subsystems (ODM, LVM, RSCT, TCP/IP, SRC, and
so on)
Cluster Manager Subsystem (clstrmgrES): topology manager, resource manager, event manager, SNMP manager
RSCT (topsvcs, grpsvcs, RMC subsystems)
snmpd, clinfoES, clcomdES, clstat

Additional features of HACMP
HACMP is shipped with utilities to simplify configuration, monitoring,
customization, and cluster administration.
OLPW
smit via web
Configuration Assistant
CSPOC
DARE
clstrmgrES
SNMP
Verification
Auto tests
Tivoli Integration
Application Monitoring

Some assembly required
HACMP can be used out of the box; however, some assembly is
required.
Minimum:
Application Start/Stop/Monitor scripts
Optional:
Customized pre/post event scripts
Reaction to events
Error notification methods
User-defined events (UDEs)
Cluster State Change
HACMP's flexibility allows for complex customization in order to
meet availability goals
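As an illustration of the kind of monitor script mentioned above, here is a minimal sketch in Python. It assumes a POSIX system, an application that records its process ID in a PID file, and the convention that a monitor reports health through its exit status (0 for healthy, non-zero for failed). The function names are hypothetical, and a real monitor would need to match the conventions documented for the product.

```python
import os
import tempfile


def application_is_running(pid_file):
    """Return True if the PID recorded in pid_file belongs to a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False              # missing, unreadable, or malformed PID file
    try:
        os.kill(pid, 0)           # signal 0 only tests whether the process exists
    except ProcessLookupError:
        return False              # no such process
    except PermissionError:
        return True               # process exists but belongs to another user
    return True


def monitor(pid_file):
    """Map the health check to an exit status: 0 healthy, 1 failed (assumed)."""
    return 0 if application_is_running(pid_file) else 1


# Demonstration using this process's own PID in a temporary PID file.
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as tmp:
    tmp.write(str(os.getpid()))
print(monitor(tmp.name))          # the recorded process is alive
os.remove(tmp.name)
print(monitor(tmp.name))          # the PID file is gone
```

A start script would write the PID file and a stop script would remove it; a script wrapper could pass the result of `monitor()` to `sys.exit()`.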

Let’s review
1. Which of the following items are examples of topology components in
HACMP? (Select all that apply.)
a. Node
b. Network
c. Service IP label
d. Hard disk drive
2. True or False?
All nodes in an HACMP cluster must have roughly equivalent performance
characteristics.
3. Which of the following is a characteristic of high availability?
a. High availability always requires specially designed hardware
components.
b. High availability solutions always require manual intervention to ensure
recovery following fallover.
c. High availability solutions never require customization.
d. High availability solutions use redundant standard equipment (no
specialized hardware).
4. True or False?
A thorough design and detailed planning is required for all high availability
solutions.

Let’s review solutions
1. Which of the following items are examples of topology components in
HACMP? (Select all that apply.)
a. Node
b. Network
c. Service IP label
d. Hard disk drive
Answer: a and b.
2. True or False?
All nodes in an HACMP cluster must have roughly equivalent performance
characteristics.
Answer: False.
3. Which of the following is a characteristic of high availability?
a. High availability always requires specially designed hardware
components.
b. High availability solutions always require manual intervention to ensure
recovery following fallover.
c. High availability solutions never require customization.
d. High availability solutions use redundant standard equipment (no
specialized hardware).
Answer: d.
4. True or False?
A thorough design and detailed planning is required for all high availability
solutions.
Answer: True.

What does HACMP do?
After completing this topic, you should be able to:
Describe the failures that HACMP detects directly
Provide an overview of the standby and takeover cluster
configuration options in HACMP
Describe some of the considerations and limits of an
HACMP cluster

Just what does HACMP do?
HACMP functions:
Monitors the states of nodes, networks, network adapters and devices
Strives to keep resource groups highly available
Optionally, monitors the state of the applications, and can be customized to
react to every possible failure

What happens when something fails?
How the cluster responds to a failure depends on what has failed, what the
resource group's fallover policy is, and if there are any resource group
dependencies:
Typically, another equivalent component takes over duties of failed
component (for example, another node takes over from a failed node).

What happens when a problem is fixed?
How the cluster responds to the recovery of a failed component depends on what
has recovered, what the resource group's fallback policy is, and the resource
group dependencies:
Typically, administrators need to indicate or confirm that the fixed component
is approved for use. Some components are integrated automatically; for
instance, when a communication interface recovers.
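The interplay of failure, recovery, and the fallback policy can be sketched as a small simulation. This is a simplified model under stated assumptions, not how the cluster manager is implemented: one resource group, a priority-ordered node list, and a single yes/no fallback setting. The node names usa and uk and both function names are hypothetical.

```python
def place_resource_group(priority, up_nodes, current, fallback):
    """Choose the node that should host a resource group.

    priority : node names, highest priority first
    up_nodes : set of nodes currently available
    current  : node hosting the group now, or None if it is offline
    fallback : True to return to a higher-priority node when it recovers
    """
    candidates = [node for node in priority if node in up_nodes]
    if not candidates:
        return None                  # no surviving node to run on
    if current in up_nodes and not fallback:
        return current               # stay put, avoiding another outage
    return candidates[0]             # highest-priority available node


def replay(events, priority, fallback):
    """Apply (node, is_up) events in order and record where the group runs."""
    up = set(priority)
    current = priority[0]
    history = []
    for node, is_up in events:
        if is_up:
            up.add(node)
        else:
            up.discard(node)
        current = place_resource_group(priority, up, current, fallback)
        history.append(current)
    return history


events = [("usa", False), ("usa", True), ("uk", False), ("uk", True)]
print(replay(events, ["usa", "uk"], fallback=True))   # ['uk', 'usa', 'usa', 'usa']
print(replay(events, ["usa", "uk"], fallback=False))  # ['uk', 'uk', 'usa', 'usa']
```

With fallback enabled, the group returns to usa as soon as usa recovers, which costs a second brief outage. Without fallback, it stays on uk until uk itself fails.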

Standby (active/passive) with fallback
One node is primary
RG can be configured to come online on the primary or any node
[Figure: location of resource group A on nodes USA and UK as node USA fails, USA returns, node UK fails, and UK returns (no change).]

Standby (active/passive) without fallback
Eliminates another outage
Reduces downtime
[Figure: location of resource group A as USA fails, USA returns, UK fails, and UK returns.]

Mutual takeover: Active/Active
Very common
No one node/LPAR is left idle
[Figure: placement of resource groups A and B before and after UK fails, shown with fallback.]

Concurrent: Multiple active nodes
USA, Germany, and UK are all running Application A, each
using a separate IP address.
If nodes fail, the application remains
continuously available as long as there are
surviving nodes to run on.
Fixed nodes resume running their copy of the
application.
Application must be designed to run simultaneously on
multiple nodes.
This has the potential for essentially zero downtime.

Points to ponder
Resource groups:
Must be serviced by at least two nodes
Can have different policies
Can be migrated (manually or automatically) to rebalance loads
Clusters:
Must have at least one IP network and one non-IP network
Need not have any shared storage
Can have any combination of supported nodes *
Can be split across two sites
Might or might not require replicating data (HACMP/XD).
Applications:
Can be restarted via monitoring
Must be manageable via scripts (start/restart and stop)
* Application performance requirements and other operational issues
almost certainly impose practical constraints on the size and
complexity of a given cluster.
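Two of the rules above lend themselves to a mechanical check of a cluster plan: every resource group must be serviced by at least two nodes, and the cluster needs at least one IP network and one non-IP network. The sketch below assumes a hypothetical plan format (plain dictionaries and tuples); it does not read any real HACMP configuration.

```python
def check_cluster(resource_groups, networks):
    """Return a list of rule violations for a planned cluster.

    resource_groups : dict mapping group name -> list of participating nodes
    networks        : list of (name, kind) tuples; kind is "ip" or "non-ip"
    """
    problems = []
    for name, nodes in resource_groups.items():
        if len(set(nodes)) < 2:
            problems.append(
                f"resource group {name} is serviced by fewer than two nodes")
    kinds = {kind for _, kind in networks}
    for required in ("ip", "non-ip"):
        if required not in kinds:
            problems.append(f"cluster has no {required} network")
    return problems


# A deliberately incomplete plan: httprg has one node and there is no non-IP network.
plan_groups = {"dbrg": ["nodea", "nodeb"], "httprg": ["nodeb"]}
plan_networks = [("public", "ip")]
for problem in check_cluster(plan_groups, plan_networks):
    print(problem)
```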

Other considerations for HACMP
Design, planning, testing
Focus on service and availability
Apply appropriate risk analysis
Disciplined system administration practices
Documented operational procedures
[Figure labels: high availability, continuous operation, continuous availability; systems management, people, data, hardware, software, environment, networking.]

Things HACMP does not do
Back-up and restoration
Time synchronization
Application specific configuration
System administration tasks unique to each node

When is HACMP not the correct solution?
Zero downtime required
Maybe a fault tolerant system is the correct choice.
Availability 7x24x365; HACMP occasionally needs to be shut
down for maintenance.
Life-critical environments.
Security issues
Too little security
Many people can change the environment.
Too much security
C2 and B1 environments might not allow HACMP to
function as designed.
Unstable environments
HACMP cannot make an unstable and poorly managed
environment stable.
HACMP tends to reduce the availability of poorly managed
systems.

What do we plan to achieve this week?
Your mission this week is to build a two-node mutual takeover
highly available cluster using two previously separate AIX systems,
each of which has an application which needs to be made highly
available.
[Figure: applications A and B on the two nodes.]

Overview of the implementation process
Plan and configure AIX
Elimination of single points of failure
Storage (adapters, LVM volume group, filesystem)
Networks (IP interfaces, /etc/hosts, non-IP networks, and devices)
Application start and stop scripts
Install the HACMP filesets (Note: HACMP 5.3 and earlier require a reboot)
Configure the HACMP environment
Topology
Cluster, node names, HACMP IP and non-IP networks
Resources and Resource groups:
Identify name, nodes, policies
Resources: Application Server, service label, VG, filesystem
Synchronize, then start HACMP
Note: If using two nodes and one application, “Configure the HACMP
environment” can be done in one step.

Hints to get started
• Draw a diagram.
• Use (online) planning sheets.
• Focus on eliminating SPOFs.
• Always factor in a non-IP network.
• Ensure that you have multipath
access to shared storage devices.
• Document a test plan.
• Test the cluster carefully.
• Be methodical.
Sample planning diagram: HACMP Cluster for the ABC company

Networks: public network (user community), tmssa network, serial network

Node Name = nodea
  Resource group = dbrg
  Applications = database
  Resources = cascading A-B
  Priority = 1,2
  CWOF = yes
  Label = a_tmssa, Device = /dev/tmssa1
  Label = a_tty, Device = /dev/tty1
  rootvg: raid1, 9.1GB

Node Name = nodeb
  Resource group = httprg
  Applications = http
  Resources = cascading B-A
  Priority = 2,1
  CWOF = yes
  Label = b_tmssa, Device = /dev/tmssa2
  Label = a_tty, Device = /dev/tty1
  rootvg: raid1, 9.1GB

Shared storage
  VG = dbvg: Raid5, 100GB
  VG = httpvg: Raid1, 9GB

Resource Group databaserg contains
  Volume Group = dbvg
  hdisk3, hdisk4, hdisk5, hdisk6, hdisk7
  Major # = 51
  JFS Log = dblvlog
  Logical Volume = dblv1, dblv2
  FS Mount Point = /db, /dbdata

Resource Group httprg contains
  Volume Group = httpvg
  hdisk2, hdisk8
  Major # = 50
  JFS Log = httplvlog
  Logical Volume = httplv
  FS Mount Point = /http

Node B    IP Label    IP Address     Netmask
Service   webserv     192.168.9.5    255.255.255.0
Boot      nodebboot   192.168.9.6    255.255.255.0
Standby   nodebstand  192.168.254.3  255.255.255.0

Node A    IP Label    IP Address     Netmask
Service   database    192.168.9.3    255.255.255.0
Boot      nodeaboot   192.168.9.4    255.255.255.0
Standby   nodeastand  192.168.254.3  255.255.255.0
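A planning sheet like the one above is easy to check mechanically before any configuration starts. The sketch below copies the label, address, and netmask rows from the sample sheet and uses Python's standard `ipaddress` module to look for addresses assigned to more than one label and to group labels by subnet. The function names are hypothetical, and the check says nothing about how HACMP itself validates addresses.

```python
import ipaddress
from collections import defaultdict

# (label, address, netmask) rows copied from the sample planning sheet.
rows = [
    ("webserv",    "192.168.9.5",   "255.255.255.0"),
    ("nodebboot",  "192.168.9.6",   "255.255.255.0"),
    ("nodebstand", "192.168.254.3", "255.255.255.0"),
    ("database",   "192.168.9.3",   "255.255.255.0"),
    ("nodeaboot",  "192.168.9.4",   "255.255.255.0"),
    ("nodeastand", "192.168.254.3", "255.255.255.0"),
]


def duplicate_addresses(rows):
    """Return {address: [labels]} for addresses assigned to more than one label."""
    labels_by_address = defaultdict(list)
    for label, address, _netmask in rows:
        labels_by_address[address].append(label)
    return {addr: labels for addr, labels in labels_by_address.items()
            if len(labels) > 1}


def subnets(rows):
    """Group labels by the subnet that their address and netmask imply."""
    grouped = defaultdict(list)
    for label, address, netmask in rows:
        network = ipaddress.ip_interface(f"{address}/{netmask}").network
        grouped[str(network)].append(label)
    return dict(grouped)


print(duplicate_addresses(rows))
print(subnets(rows))
```

Run against the sheet as printed, the check reports that the two standby labels, nodebstand and nodeastand, are both listed with 192.168.254.3. That is the sort of entry worth confirming before implementation.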

Sources of HACMP information
HACMP manuals come with the product
  • cluster.doc.en_US.es.html
  • cluster.doc.en_US.es.pdf
HACMP documentation also available online
  • http://www.ibm.com/servers/eserver/pseries/library/hacmp_docs.html
Release Notes contain important information about the version release
  • /usr/es/sbin/cluster/release_notes
Sales manual: http://www.ibm.com/common/ssi
IBM courses:
  • HACMP Admin. I: Planning and Implementation (AU540/AU54)
  • HACMP Admin II: Admin. and Problem Determination (AU610/AU61)
  • HACMP Administration III: Virtualization and Disaster Recovery (AU620/AU62)
  • HACMP V5 Internals (AU60)
IBM Web site:
  • http://www-03.ibm.com/systems/p/ha/
Non-IBM sources (not endorsed by IBM but probably worth a look):
  • http://lpar.co.uk
  • http://portal.explico.de/
  • http://www.matilda.com/hacmp/
  • http://groups.yahoo.com/group/hacmp/

Checkpoint
1. True or False?
Resource Groups can be moved from node to node.
2. True or False?
HACMP/XD is a complete solution for building geographically
distributed clusters.
3. Which of the following capabilities does HACMP not provide?
(Select all that apply.)
a. Time synchronization
b. Automatic recovery from node and network adapter failure
c. System Administration tasks unique to each node; back-up and
restoration
d. Fallover of just a single resource group
4. True or False?
All nodes in a resource group must have equivalent performance
characteristics.

Checkpoint solutions
1. True or False?
Resource Groups can be moved from node to node.
2. True or False?
HACMP/XD is a complete solution for building geographically
distributed clusters.
3. Which of the following capabilities does HACMP not provide?
(Select all that apply.)
a. Time synchronization
b. Automatic recovery from node and network adapter failure
c. System Administration tasks unique to each node; back-up and
restoration
d. Fallover of just a single resource group
4. True or False?
All nodes in a resource group must have equivalent performance
characteristics.

Unit summary
Having completed this unit, you should be able to:
Define high availability and explain why it is needed
Outline the various options for implementing high availability
List the key considerations when designing and implementing a high
availability cluster
Outline the features and benefits of HACMP for AIX
Describe the components of an HACMP for AIX cluster
Explain how HACMP for AIX operates in typical cases

Questions?

THANK YOU for attending the Demo of HACMP
