NetApp cluster failover giveback
1. What is NetApp Cluster-Failover Giveback?
Concept:
If one of the NetApp filers in an HA pair (see section 3) fails for any reason, the other head takes
over the failed head's disks and network connections. Once the failed head has recovered and is
booting, it pauses and waits for the other head to give back its resources. The operational head must
be told to give back the resources, after which the peers sync up and resume their normal operations.
Determining if the heads are in a failover state:
Administrators should receive an e-mail from the surviving head when it takes over from the failed head.
Depending on the configuration, the subject or body will say:
CLUSTER TAKEOVER COMPLETE AUTOMATIC on netapp_head2
You may also see this message on the surviving node's console.
This means that netapp_head2 is up and has taken over operations for its peer.
If you SSH in to netapp_head2, the prompt should read:
netapp_head2(takeover)>
The (takeover) suffix indicates that it has taken over its partner.
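If you script this check, the takeover marker can be read straight from the prompt. The following is
a minimal sketch in Python, assuming the prompt has already been captured as a string. The function
name and the regular expression are illustrative only and cover just the prompt forms shown in this
document.

```python
import re

# Matches prompts such as "netapp_head2>" or "netapp_head2(takeover)>".
# The pattern is an assumption based only on the prompts shown in this document.
PROMPT = re.compile(r"^(?P<host>[\w.-]+?)\s*(?:\((?P<mode>\w+)\))?\s*>\s*$")


def prompt_mode(prompt):
    """Return (hostname, mode); mode is None for a normal prompt."""
    match = PROMPT.match(prompt.strip())
    if match is None:
        raise ValueError(f"unrecognised prompt: {prompt!r}")
    return match.group("host"), match.group("mode")


host, mode = prompt_mode("netapp_head2(takeover)>")
print(host, "is in takeover" if mode == "takeover" else "is in normal mode")
```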
You can gather more information about the cluster failover status
with the following command:
netapp_head2(takeover)> cf monitor
current time: 03Nov2012 10:28:48
TAKEOVER 00:20:01, partner 'netapp_head1', cluster monitor enabled
As you can see, it is in a TAKEOVER state for partner netapp_head1, and has been for just over 20
minutes.
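The same reading of the output can be automated. The sketch below parses the state line of the
cf monitor output shown above. It assumes the layout "STATE HH:MM:SS, partner 'name', ..." exactly as
it appears in this example; the helper name parse_cf_monitor is hypothetical, and other Data ONTAP
releases may format the line differently.

```python
import re
from datetime import timedelta

# Output copied from the example above.
SAMPLE = """current time: 03Nov2012 10:28:48
TAKEOVER 00:20:01, partner 'netapp_head1', cluster monitor enabled"""

# Assumed layout of the state line: STATE HH:MM:SS, partner 'name', ...
STATE_LINE = re.compile(
    r"^(?P<state>[A-Z_]+) (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}), "
    r"partner '(?P<partner>[^']+)'",
    re.MULTILINE,
)


def parse_cf_monitor(text):
    """Return state, partner and time in state, or None if no state line is found."""
    match = STATE_LINE.search(text)
    if match is None:
        return None
    elapsed = timedelta(
        hours=int(match.group("h")),
        minutes=int(match.group("m")),
        seconds=int(match.group("s")),
    )
    return {
        "state": match.group("state"),
        "partner": match.group("partner"),
        "elapsed": elapsed,
    }


status = parse_cf_monitor(SAMPLE)
print(status["state"], status["partner"], status["elapsed"])
# prints: TAKEOVER netapp_head1 0:20:01
```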
2. Steps to perform before requesting a giveback:
Before requesting a giveback from the operational head, you should ensure that the failed head is
ready to come back up. You can do this by accessing the downed node's console via serial
connection or RLM/SP. The console should be blank. Pressing 'Enter' should yield the following text:
Waiting for giveback... (Press Ctrl-C to abort wait)
Do NOT press Ctrl-C: this aborts the wait and can leave the head in an inconsistent state.
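When the console text is captured by a script (for example from an RLM/SP session log), the readiness
check above can be expressed as a small predicate. This is a sketch only: how the console text is
obtained is left open, the function name is hypothetical, and the code only reads text. It never sends
any keystrokes to the console.

```python
# Text that, according to the procedure above, appears on the downed head's console.
WAITING_BANNER = "Waiting for giveback"


def ready_for_giveback(console_text):
    """Return True if the last non-empty console line is the giveback wait banner."""
    lines = [line.strip() for line in console_text.splitlines() if line.strip()]
    return bool(lines) and lines[-1].startswith(WAITING_BANNER)


captured = "\nWaiting for giveback... (Press Ctrl-C to abort wait)\n"
if ready_for_giveback(captured):
    print("Downed head is waiting; safe to request a giveback.")
else:
    print("Downed head is not ready; do not run cf giveback yet.")
```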
Performing a cluster-failover giveback:
On the operational head's console, run the following command:
netapp_head2(takeover)> cf giveback
After a short delay, both heads should start printing status messages and sending alerts as the
downed head boots, connects to its disks, and resumes network services.
To confirm that the giveback has been completed, run the following
command:
netapp_head2> cf monitor
current time: 03Nov2012 10:32:42
UP 00:24:22, partner 'netapp_head1', cluster monitor enabled
VIA Interconnect is up (link 0 up, link 1 up), takeover capability on-line
partner update TAKEOVER_ENABLED (03Nov2012 11:38:41)
This indicates that the partner is up, and has been for just over 24 minutes. This head is ready to
take over the partner again, if need be.
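The confirmation step can also be scripted. The sketch below treats the giveback as complete when the
cf monitor output contains a state line starting with "UP" and the text "takeover capability on-line",
both taken from the example above. The function name is hypothetical, and the check assumes the output
format matches this example.

```python
# Output copied from the example above.
SAMPLE = """current time: 03Nov2012 10:32:42
UP 00:24:22, partner 'netapp_head1', cluster monitor enabled
VIA Interconnect is up (link 0 up, link 1 up), takeover capability on-line
partner update TAKEOVER_ENABLED (03Nov2012 11:38:41)"""


def giveback_complete(cf_monitor_output):
    """Return True if the output shows state UP and takeover capability on-line."""
    lines = [line.strip() for line in cf_monitor_output.splitlines()]
    is_up = any(line.startswith("UP ") for line in lines)
    takeover_ready = any("takeover capability on-line" in line for line in lines)
    return is_up and takeover_ready


print("giveback complete:", giveback_complete(SAMPLE))
```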
3. What is an HA Pair?
An HA pair consists of two matching FAS or V-Series storage controllers (the local node and the
partner node). Each node is connected to its partner’s disk shelves.
The Data ONTAP and firmware versions on the two nodes must be identical. Similarly, the
interconnect adapters on the nodes must be identical and must be configured with the same
firmware version. Also, the interconnect adapters must be connected properly by the appropriate
interconnect cables. HA pairs provide fault tolerance and enable the performance of nondisruptive
upgrades and maintenance.
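The matching requirements above amount to a field-by-field comparison of the two nodes. The sketch
below illustrates that comparison. The Node class, its field names, and the values are all
hypothetical placeholders, not output from any NetApp tool; in practice the values would come from
whatever inventory you keep for each controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical inventory record for one controller in the pair.
    name: str
    ontap_version: str
    firmware_version: str
    interconnect_adapter: str
    interconnect_firmware: str


# Properties that must be identical on both nodes, per the text above.
CHECKED_FIELDS = (
    "ontap_version",
    "firmware_version",
    "interconnect_adapter",
    "interconnect_firmware",
)


def ha_mismatches(local, partner):
    """Return the names of the checked fields that differ between the nodes."""
    return [f for f in CHECKED_FIELDS if getattr(local, f) != getattr(partner, f)]


# Placeholder values for illustration only.
head1 = Node("netapp_head1", "ontap-A", "fw-A", "adapter-A", "icfw-A")
head2 = Node("netapp_head2", "ontap-A", "fw-B", "adapter-A", "icfw-A")
print(ha_mismatches(head1, head2))
```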
On a side note, NetApp's cluster-mode configuration is also based on HA pairs. In other words, the
basic building blocks are still the standard FAS or V-Series HA pairs that we are all so familiar with. A
cluster includes multiple HA pairs. The HA pairs are joined by a namespace that is shared over an
internal network. The network is referred to as "the cluster network."
-Prepared by
ashwinwriter@gmail.com