Short presentation made for OpenStack London "Tokyo Aftermath" meetup, on current upstream activity in the OpenStack HA developers community around high availability for compute nodes.
1. Compute node HA
a.k.a. “can pets survive in OpenStack?”
Adam Spiers
Senior Software Engineer, Cloud & High Availability
aspiers@suse.com
OpenStack London Meetup, Wednesday 18th November
– short update on upstream development
3. Typical HA control plane in OpenStack
[Diagram]
Pacemaker Cluster (control nodes): DRBD, PostgreSQL, RabbitMQ, Keystone, Glance, Nova, Dashboard, Cinder, Neutron
Database Cluster (Node 1, Node 2): DRBD or shared storage, Database, Message Queue
Services Cluster (Node 1, Node 2, Node 3): Orchestration, Keystone, Glance, Nova, Dashboard, Cinder, Telemetry, Neutron
• Maximises cloud uptime
• Automatic restart of OpenStack controller services
• Active/Active API services with load balancing
• DB + MQ either Active/Active or Active/Passive
4. Under the covers
[Diagram: Services Cluster (Node 1, Node 2, Node 3), showing Corosync, Pacemaker and HAProxy]
• Recommended by official HA guide
• HAProxy distributes service requests
• Pacemaker
‒ monitoring and control of nodes and services
• Corosync
‒ cluster membership / messaging / quorum / leadership election
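As a rough illustration of the quorum role that Corosync plays, the sketch below computes the majority threshold for a cluster in which every node holds one vote. The function names are hypothetical, and this models only the simple majority rule, not Corosync's actual implementation or any of its optional quorum features.

```python
def quorum_threshold(total_nodes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if the reachable partition holds a strict majority of votes."""
    return reachable_nodes >= quorum_threshold(total_nodes)


if __name__ == "__main__":
    for total in (2, 3, 5):
        print(total, "nodes: quorum needs", quorum_threshold(total), "votes")
```

Under this rule a three-node cluster keeps quorum after losing one node, whereas a two-node cluster loses it as soon as either node fails.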
But what I really want to do is keep my workloads up!
5. HA only on control plane
[Diagram: an HA Cluster containing the control node (OS, Message queue, Database, Identity, Images, Block storage, Networking, Dashboard, Compute), plus three compute nodes (OS, nova-compute, libvirt) outside the cluster]
6. Can we simply extend the cluster?
[Diagram: an HA Cluster containing the control node (OS, Message queue, Database, Identity, Images, Block storage, Networking, Dashboard, Compute), and three compute nodes (OS, nova-compute, libvirt)]
8. Scaling up
• Corosync requires <= 32 nodes
• But we want lots of compute nodes!
• The obvious workarounds are ugly
‒ Multiple compute clusters
  ‒ introduces unwanted artificial boundaries
‒ Clusters inside / between guest VM instances
  ‒ requires cloud users to modify guest images (installing & configuring cluster software)
  ‒ cluster stacks are not OS-agnostic
  ‒ cloud is supposed to make things easier, not harder!
9. pacemaker_remote to the rescue!
• New(-ish) Pacemaker feature
• Allows arbitrary scalability of an existing Pacemaker cluster
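To make the scaling argument concrete, here is a toy model, assuming that only full cluster nodes count against the 32-node Corosync limit quoted on the "Scaling up" slide, while nodes running just pacemaker_remote do not join the Corosync membership. The `Cluster` class and its method names are hypothetical and do not correspond to any Pacemaker API.

```python
from dataclasses import dataclass, field

COROSYNC_NODE_LIMIT = 32  # limit quoted on the "Scaling up" slide


@dataclass
class Cluster:
    core_nodes: list = field(default_factory=list)    # full Corosync + Pacemaker stack
    remote_nodes: list = field(default_factory=list)  # pacemaker_remote only

    def add_core(self, name: str) -> None:
        """Add a full cluster member, which counts against the Corosync limit."""
        if len(self.core_nodes) >= COROSYNC_NODE_LIMIT:
            raise RuntimeError("Corosync membership limit reached")
        self.core_nodes.append(name)

    def add_remote(self, name: str) -> None:
        """Add a remote node, managed by the core cluster but not a Corosync member."""
        if not self.core_nodes:
            raise RuntimeError("remote nodes need an existing core cluster")
        self.remote_nodes.append(name)


if __name__ == "__main__":
    cluster = Cluster()
    for i in range(3):
        cluster.add_core(f"controller-{i}")
    for i in range(500):
        cluster.add_remote(f"compute-{i}")
    print(len(cluster.core_nodes), "core nodes,", len(cluster.remote_nodes), "remote nodes")
```

In this model the three controllers form the Corosync cluster, and the compute nodes attach to it as remote nodes without growing the membership.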
11. Capabilities
• Increases availability of compute nodes
‒ Detects failed compute services
‒ Automatic recovery of compute services where possible
• “Quarantines” failing compute nodes
‒ STONITH (fencing) extends to remote nodes
• Coordinates with control plane
‒ VMs on dead compute nodes are resurrected elsewhere
‒ In nova, this is described as “evacuation”
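The capabilities above can be sketched as a small simulation: a failed compute node is fenced first, and only then are its VMs resurrected on surviving nodes, so that no VM can end up running in two places. All names here (`ComputeNode`, `fence`, `recover`) are hypothetical stand-ins, and the round-robin placement is an assumption for illustration; a real deployment would leave placement to the nova scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    name: str
    alive: bool = True
    fenced: bool = False
    vms: list = field(default_factory=list)


def fence(node: ComputeNode) -> None:
    """Stand-in for STONITH: guarantee the node is really down."""
    node.alive = False
    node.fenced = True


def recover(failed: ComputeNode, nodes: list) -> dict:
    """Fence the failed node, then resurrect its VMs on healthy nodes."""
    fence(failed)  # must happen before any VM is restarted elsewhere
    targets = [n for n in nodes if n.alive and not n.fenced]
    if not targets:
        raise RuntimeError("no healthy compute node available")
    placement = {}
    for i, vm in enumerate(list(failed.vms)):
        target = targets[i % len(targets)]  # naive round-robin placement
        target.vms.append(vm)
        placement[vm] = target.name
    failed.vms.clear()
    return placement


if __name__ == "__main__":
    nodes = [ComputeNode("c1", vms=["vm1", "vm2"]), ComputeNode("c2"), ComputeNode("c3")]
    print(recover(nodes[0], nodes))
```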
15. Public Health Warning
• nova evacuate does not do evacuation
• nova evacuate does resurrection
• In Vancouver, nova developers considered a rename
‒ Hasn't happened yet
‒ Due to impact, seems unlikely to happen any time soon
‒ Whenever you see “evacuate” in a nova-related context, pretend you saw “resurrect”
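The distinction can be shown with a toy model, assuming that a live migration needs a running source host while nova's "evacuate" is only meant for a host that is already down, with the instance rebuilt elsewhere. The functions below are hypothetical illustrations and are not nova or novaclient calls.

```python
class SourceHostStateError(Exception):
    """Raised when an operation is attempted against a host in the wrong state."""


def live_migrate(vm: str, source_host_up: bool) -> str:
    """What "evacuation" suggests: move a running VM off a live host."""
    if not source_host_up:
        raise SourceHostStateError("live migration needs a running source host")
    return f"{vm}: moved while running, memory state preserved"


def evacuate(vm: str, source_host_up: bool) -> str:
    """What nova "evacuate" does in this model: resurrect a VM from a dead host."""
    if source_host_up:
        raise SourceHostStateError("evacuate expects the source host to be down")
    return f"{vm}: rebuilt on another host, memory state lost"


if __name__ == "__main__":
    print(live_migrate("vm1", source_host_up=True))
    print(evacuate("vm1", source_host_up=False))
```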
16. Existing solutions
• NovaCompute / NovaEvacuate custom OCF RAs
‒ used by Red Hat / SUSE / Intel
‒ works with known limitations
• EvacuationD
‒ PoC to address above limitations
‒ decouples resurrection workflow from Pacemaker
• Masakari (NTT)
‒ similar architecture, different code
‒ monitoring at 3 layers (node, process, hypervisor)
• Approach of AWcloud / ChinaMobile
‒ very different; uses consul / raft / gossip
17. Proposed solutions
• Use Mistral to orchestrate resurrection workflow
• Intel currently working on prototype
• Possibly the most promising approach
‒ Mistral considered pretty solid
‒ This is exactly the kind of thing it was designed for
• However, Mistral currently a SPoF … oops
‒ Don't worry, should be fixed in the Mitaka cycle
• Feasibility of convergence with Masakari will probably be analysed within next week or two
18. Community developments
• openstack-resource-agents project now on StackForge
‒ maintained by me
• New #openstack-ha IRC channel on FreeNode
‒ automatic notifications for activity on HA repositories
• New topic category on openstack-dev@ mailing list
‒ e.g. Subject: [HA] i can haz pets in my cloud?
• Weekly IRC meetings on Mondays at 9am UTC
• HA guide currently undergoing a revamp
• Everyone welcome to get involved!