Apache CloudStack 4.5 Hypervisor Selection
1. CloudStack Day Tokyo 2015
Covering Apache CloudStack 4.5
Selecting the correct hypervisor for your cloud
2. #whoami
Name: Tim Mackey
Current roles: XenServer Community Manager and Evangelist; occasional coder
Cool things I’ve done
• Designed laser communication systems
• Early designer of retail self-checkout machines
• Embedded special relativity algorithms into industrial control system
Find me
• Twitter: @XenServerArmy
• SlideShare: slideshare.net/TimMackey
• LinkedIn: www.linkedin.com/in/mackeytim
3. What are we trying to accomplish?
Building a successful cloud
4. Service Offerings
Clearly define what you want to offer
• What types of applications
• Who has access, and who owns them
• What type of access
Define how templates need to be managed
• Operating system support
• Patching requirements
Define expectations around compliance and availability
• Who owns backup and monitoring
5. Define Tenancy Requirements
Department data local to department
• Where is the application data stored
Data and service isolation
• VM migration and host HA
• Network services
Encryption of PII/PCI
• Where do keys live when data location unknown
• Need encryption designed for the cloud
Showback to stakeholders
• More than just usage, compliance and audits
6. Virtualization Infrastructure
Hypervisor defined by service offerings
• Don’t select hypervisor based on “standards”
• Multiple hypervisors are “OK”
• Bare metal can be a hypervisor
To “Pool” resources or not
• Is there a real requirement for pooled resources
• Can the cloud management solution do better?
• Real cost of shared storage
Primary storage defined by hypervisor
Template storage defined by solution
• Typically low cost options like NFS
8. XenServer 6.5
Source code model: Open Source (GPLv2)
Maximum VM Density: 1000
CloudStack VM Density: 500
CloudStack integration: Direct XAPI calls
Maximum native cluster size: 16
Maximum pRAM: 1 TB
Largest supported VM: 32 vCPU / 256 GB
Windows Operating Systems: All Windows supported by Microsoft
Linux Operating Systems: RHEL, CentOS, Debian, Ubuntu, SLES, OEL
Advanced features supported: OVS, Storage XenMotion, DMC, Pool HA
9. vSphere 5.5 (no vSphere 6 yet)
Source code model: Proprietary
Maximum VM Density: 512
CloudStack VM Density: 128
CloudStack integration: vCenter
Maximum native cluster size: 32
Maximum pRAM: 4 TB
Largest VM: 64 vCPU / 1 TB
Windows Operating Systems: DOS, all Windows Server/Client
Linux Operating Systems: Most
Advanced features supported: HA, DRS, vDS, Storage vMotion
10. KVM
Source code model: Open Source (GPLv2)
Maximum VM Density: 10x the number of pCores
CloudStack VM Density: 50
CloudStack integration: CloudStack Agent (libvirt)
Maximum native cluster size: No native cluster support
Maximum pRAM: 2 TB
Largest VM: 160 vCPU / 2 TB
Windows Operating Systems: Windows XP and higher
Linux Operating Systems: Varies
Advanced features supported: None
11. Microsoft Hyper-V
Source code model: Proprietary
Maximum VM Density: 1024
CloudStack VM Density: 1024
CloudStack integration: CloudStack Agent (C# calling WMI)
Maximum native cluster size: 64
Maximum pRAM: 4 TB
Largest VM: 64 vCPU / 1 TB
Windows Operating Systems: All Windows supported by Microsoft
Linux Operating Systems: RHEL, CentOS, Debian, Ubuntu, SLES, OEL
Advanced features supported: VHDX, Storage Motion (shared storage only)
13. Flat Network – Basic Layer 3 Network
Option | XenServer | vSphere | KVM | Hyper-V
Security Groups | Yes (bridge) | No | Yes | Yes
IPv6 | Yes | No | Yes | No
Multiple IPs per NIC | Yes | Yes | Yes | Yes
Nicira NVP | Yes | No | Yes | No
BigSwitch VNS | Yes | No | Yes | No
[Diagram: Basic zone — public network 65.11.0.0/16; Guest VMs 1–4 (65.11.1.2–65.11.1.5) split across Security Group 1 and Security Group 2, with the CloudStack virtual router providing DHCP and DNS]
14. VLANs for Private Cloud
Option | XenServer | vSphere | KVM | Hyper-V
Max VLANs | 800 | 254 | 1024 | 4094
IPv6 | Yes | No | Yes | No
Multiple IPs per NIC | Yes | Yes | Yes | Yes
Nicira NVP | Yes | No | Yes | No
BigSwitch VNS | Yes | No | Yes | No
MidoKura | No | No | Yes | No
VPC | Yes | Yes | Yes | Yes
NetScaler | Yes | Yes | Yes | Yes
F5 BigIP | Yes | Yes | Yes | Yes
Juniper SRX | No | Yes | Yes | Yes
Juniper EX/QFX | No | Yes | Yes | No
Cisco VNMC | No | Yes | No | No
GloboDNS | Yes | No | No | No
Brocade VDX | Yes | Yes | Yes | No
[Diagram: Advanced zone — guest virtual network 10.0.0.0/8 on VLAN 100; Guest VMs 1–4 (10.1.1.3–10.1.1.5) sit behind a CloudStack virtual router (gateway 10.1.1.1) providing DHCP, DNS, NAT, load balancing, and VPN, exposing public IP 65.37.14.1 on the public network/Internet]
15. Beyond the VLAN – Software Defined Networking
Option | XenServer | vSphere | KVM | Hyper-V
OVS GRE tunnels | Yes (ovs) | No | No | No
Nicira STT tunnel | Yes | Yes | Yes | No
MidoNet | No | No | Yes | No
VXLAN | No | Yes | Yes | No
NVGRE | No | No | No | No
Nexus 1000v | No | Yes | No | No
Juniper Contrail | Yes | No | No | No
Palo Alto | Yes | Yes | Yes | No
Nuage VSP | Yes | Yes | No | No
16. Virtual Private Cloud and nTier Applications
Feature | XenServer | vSphere | KVM | Hyper-V
PVLAN | Yes (ovs) | Yes | ovs | Yes (Hyper-V VR required)
IPv6 | Yes | No | Yes | No
Distributed routing | Yes (ovs) | No | ovs | No
[Diagram: nTier application — Web, App, and DB tiers on VLANs 1–3 behind a router, with a site-to-site VPN and a private gateway connecting data centers DC1–DC6]
18. Primary Storage Options
Feature | XenServer | vSphere | KVM | Hyper-V
Local storage | Yes | Yes | Yes | Yes
NFS | Yes | Yes | Yes | No
SMB | No | No | No | SMB3
Single path iSCSI | Yes | Yes | Yes | No
Multipath iSCSI | PreSetup | No | No | No
Direct array | No | VAAI | No | No
Shared mount | No | No | Yes | No
SolidFire plugin | Yes | Yes | Yes | No
NetApp plugin | Yes | Yes | Yes | No
CloudBytes ElastiStor | Yes | No | No | No
Zone wide | No | Yes | Yes | No
Ceph RBD | No | No | Yes | No
Clustered LVM | No | No | Yes | No
[Diagram: a cluster of hosts attached to shared primary storage]
19. Secondary Storage Options
Option | XenServer | vSphere | KVM | Hyper-V
NFS | Yes | Yes | Yes | No
Swift (1) | Yes | Yes | Yes | No
S3 compatible (2) | Yes | Yes | Yes | No
SMB | No | No | No | Yes
Template format | VHD | OVA | QCOW2, VHD, VMDK, RAW, IMG | VHD, VHDX
Primary storage golden cache | Yes | No | No | No
(1) Requires NFS staging area
(2) Can be region wide, but must not have NFS secondary storage in zone
[Diagram: a zone containing secondary storage and a pod; within the pod, a cluster of hosts attached to primary storage]
20. The limits and features which matter
Core virtualization capabilities
21. CloudStack Features
Feature | XenServer | vSphere | KVM | Hyper-V
Disk IO statistics | Yes | No | Yes | Yes
Memory overcommit | Yes (4x) | Yes | No | No
Dedicated resources | Yes | Not with HA/DRS | Yes | Yes
Disk IO throttling | No | No | Yes | No
Disk snapshot (running) | Yes | Yes | No | No
Disk snapshot (pluggable) | Partial | Partial | No | No
Disk snapshot (stopped) | Yes | Yes | Yes | Yes
Memory snapshot | Yes | Yes | Yes | No
Zone wide primary storage | No | Yes | Yes | SMB 3.0 only
Resize disk | Offline | Online | Grow online | No
High availability | Host + CloudStack | Native | CloudStack | CloudStack
CPU sockets | 6.2 and higher | Yes | Yes | Yes
Affinity groups | Yes | Yes | Yes | Yes
GPU passthrough/vGPU | 6.2 SP1 and higher | No | No | No
AutoScaling VM instances | Native, NetScaler | NetScaler | NetScaler | NetScaler
22. Multiple Hypervisor Support
Networking
• Ensure network labels match
• Topology is intersect of chosen hypervisors
• Hyper-V requires Hyper-V system VMs
Storage
• Force system VMs to specific hypervisor type
• Zone wide primary storage limited
Operations
• vSphere Datacenter can not span zones
• Hyper-V may not be mixed with other hypervisors in a zone
• HA won’t migrate between hypervisors
• Capacity planning at the cluster/pod level more difficult
24. KVM
Primary value proposition:
• Low cost with available vendor support and familiar administration model
• Broad feature set with active development
Cloud use cases:
• Linux centric workloads
• Dev/test clouds
• Web hosting
• Tenant density which dictates SDN options
Weaknesses:
• Requires use of an installed libvirt agent
• Limited native storage options
• No use of advanced native features
25. vSphere
Primary value proposition:
• Broad application and operating system support with large eco-system of vendor partners
• Readily available pool of vSphere administration talent
• Many features are native implementations
• Direct feature integration via vCenter
Cloud use cases:
• Private enterprise clouds
• Dev/test clouds
Weaknesses:
• vSphere up-front license and ongoing support costs, many features require Enterprise Plus
• vCenter integration requires redundant designs
• Single data center per zone model
26. XenServer
Primary value proposition:
• Low cost with available vendor support
• Broad feature set with active development
• Large install base
• Direct integration via XAPI toolstack
Cloud use cases:
• Linux centric workloads
• Dev/test clouds and web hosting providers
• Desktop as a Service clouds
• Large VM density and secure tenant isolation
Weaknesses:
• Minimal use of advanced native features
27. Tying it all Together
1. Define success criteria
2. Select a topology which works
3. Decide on storage options
4. Define supported configurations
5. Select preferred hypervisor(s)
6. Validate matrix
7. Build your Cloud
Good afternoon. Today we’re going to cover some of the decision points you will go through when building your first cloud. Since hypervisors are core to a cloud, we’ll be focusing on which features are available with a given hypervisor, and by extension what CloudStack features can be used with a given hypervisor. This deck was created with CloudStack 4.5 as its base.
My name is Tim Mackey and I currently hold the role of XenServer Community Manager and Evangelist within the Citrix Open Source Business Office. I do still code from time to time and am currently writing in GoLang for the Packer project. You can see some of the cool things I've done, and I’m quite happy to discuss them at any point. If you’re looking for me, Twitter is a good start, and this deck like most of my presentations will be up on my SlideShare shortly.
So let’s start out with what we’re trying to accomplish. Our objective is to build a cloud, but not just any cloud, but a successful one. The definition will vary by organization, but in all circumstances our objective should always be to create a cloud which meets business needs and can grow.
In order to accomplish this goal, we need to focus our attention on the service offerings. These are the applications we want to offer, who will be able to access them, and who is actually responsible for them. These decisions will impact the template definition, including operating system support and how patches are to be managed and by whom. Lastly, and arguably the most important item to address, is compliance: who owns overall compliance auditing, and what does it mean to ensure availability, without ever forgetting backup duties and performance monitoring.
Once we decide the service offerings, the next thing to define are any tenancy requirements. In this context a tenant is a user or group of users, say for example from a given department, who need access to the cloud. They may have somewhat unique template requirements, but they will certainly have unique data requirements. For example, how application data is stored will be defined by a compliance organization, and a business may be subject to different data management policies per department. These policies can extend to how networks are defined (for example, keeping PCI DSS data off the same network segment as VMs which don't require PCI compliance), and can also impact certain services such as VM migration and virtualization host HA. This all becomes critical when encryption of personally identifiable data is involved, since the location of private keys used to encrypt such data could be accessible by other tenants. This is why cloud "ready" applications really need encryption designed for use within a cloud environment. Lastly, all clouds at some level are going to come under economic scrutiny, and being able to show usage will be critical. That being said, while usage is important, it's important to remember that compliance and audits are equally if not more critical.
So with that as the backdrop, many of you will note that I’ve made no mention of vendors, and that’s intentional. Far too often those implementing a cloud attempt to retain current best practices and decide on the vendors first and then look for a compatible cloud solution. When that model is used, the capabilities of the cloud solution are often loosely related to the success criteria presented earlier. It is far better to let the success criteria dictate vendor selection. It’s also important to realize that by focusing a design on users, those users are abstracted from the implementation meaning that you are now free to use the correct hypervisor for the application, and that multiple hypervisors are OK, including bare metal.
The next item to consider is the concept of "pooling". Most hypervisor vendors support the ability to group hosts into a logical cluster which is centrally managed. Using such a configuration then requires shared storage, which itself can be quite expensive, but more important to our needs is the concept of cluster size. All vendors define a maximum cluster size; some even market that theirs is bigger than others, as if that were a reason to make a purchase. The reality is that all successful clouds will require more than one cluster, so it's far better to think of cluster size as part of a modular design. If a cluster size of five hosts makes sense, then all clusters should have five hosts. Then, as capacity is required, you scale out in clusters containing five hosts. This model supports the use of single host clusters and allows a cloud designer the freedom to define not only scalability but also how storage will be used.
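The modular scaling math is simple enough to sketch. A minimal illustration, where the five-host cluster size and the 500 VMs/host default density are assumptions taken from this deck, not universal values:

```shell
#!/bin/sh
# Sketch: how many fixed-size clusters are needed for a target VM count.
# Defaults assume five-host clusters and CloudStack's XenServer density of 500.
clusters_needed() {
  target=$1; hosts_per_cluster=${2:-5}; vms_per_host=${3:-500}
  per_cluster=$((hosts_per_cluster * vms_per_host))
  echo $(( (target + per_cluster - 1) / per_cluster ))   # ceiling division
}

clusters_needed 12000   # 12,000 VMs at 2,500 per cluster -> 5 clusters
```

Scaling then means adding whole five-host clusters, never resizing existing ones.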
So with that as the backdrop, we're done with the "marketing" slides and can jump into the details of this presentation. The remaining slides are presented as tables. I'm not going to read each option, but will focus my attention on specific features. Since this deck will be published, you are free to reference the tables as required.
XenServer is the hypervisor used by the majority of CloudStack users. It offers great scalability, and with the recent release of SP1 for XenServer 6.5, you can now officially scale to 1000 VMs per host. It’s important to note that while XenServer can scale to such densities, CloudStack limits the VM density to 500 by default. This type of limitation will be seen in most hypervisor options. CloudStack directly integrates with the XAPI toolstack used by XenServer which means that no host agent is required.
One important thing to note is that starting with CloudStack 4.4, XenServer clusters must be created outside of CloudStack, and pool level HA must be enabled. If pool level HA isn't enabled and the pool master fails or restarts, CloudStack will lose its connection to the entire cluster.
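On the XenServer side, enabling pool-level HA is done with the `xe` CLI. A sketch, where the UUIDs are placeholders for your own pool and heartbeat storage repository:

```shell
# Enable HA at the pool level so a new pool master is elected on failure.
# <heartbeat-sr-uuid> is a placeholder for a shared SR used for heartbeating.
xe pool-ha-enable heartbeat-sr-uuids=<heartbeat-sr-uuid>

# Verify the setting took effect:
xe pool-param-get uuid=<pool-uuid> param-name=ha-enabled
```

Remember the CloudStack-specific caveat from the HA notes later in this deck: HA is enabled for the pool, but not for individual VM protection.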
For KVM, integration with CloudStack requires that a libvirt based CloudStack agent be installed on each host. Since the integration is via libvirt, the oVirt KVM clustering solution isn't an option, and any native cluster features of oVirt aren't supportable.
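As a sketch, on a systemd-based CentOS/RHEL KVM host the agent install typically looks like the following (repository configuration is omitted; the package name follows the standard CloudStack packaging):

```shell
# Install the CloudStack agent; it pulls in libvirt and qemu-kvm as dependencies.
yum install -y cloudstack-agent

# libvirt must be running before the agent can manage the host.
systemctl enable --now libvirtd
systemctl enable --now cloudstack-agent
```

The agent then registers the host with the management server when the host is added to a cluster.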
As with KVM, a management agent is required for all Hyper-V hosts. This agent is written in C# and interfaces with Hyper-V using WMI. We’re not going to cover Hyper-V in depth, but it’s important to note that Hyper-V is a supported hypervisor with CloudStack and that it does have unique requirements which are captured in the functional tables which follow.
So with all that as backdrop, let's start by defining our networks…
Within CloudStack there are two basic network topologies. The first is a flat layer 3 network, known as a Basic network, which creates a Basic zone. Historically, this is also the network used to provide security groups, and security groups are how tenant isolation works in a Basic network. It's important to note that when using security groups with XenServer, the default XenServer network stack needs to be converted to "bridge" mode. Doing so is a simple command, but one which requires the XenServer host to restart.
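The conversion mentioned above is a single command run on the XenServer host itself (shown as a sketch; run it on each host in the cluster before adding the hosts to CloudStack):

```shell
# Switch the XenServer network stack from openvswitch to bridge mode.
xe-switch-network-backend bridge

# The change only takes effect after a host restart.
reboot
```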
The second network topology is an "Advanced" network using VLANs for network isolation. In an Advanced network, up to 4094 VLANs are used, and the network services are bound both to a virtual router and to any required physical network component such as a NetScaler. When a new network is required, the network is created on the hypervisor and then a virtual router is started for that network. Once the virtual router is operational, CloudStack automatically configures the external service to use the designated VLAN. This entire process is defined when the network service is created within CloudStack. Since external devices are part of this process, if any are used it's important to ensure they are compatible with your chosen hypervisor. For example, if you wish to use MidoNet, then you are limited to using KVM as your hypervisor.
Of course for some large scale implementations, 4094 VLANs might result in limited tenancy, so SDN solutions are available. It’s important to note that while some options are generally available for all hypervisors, the vendor may have chosen to limit testing to a specific hypervisor. If the hypervisor you prefer doesn’t have the SDN provider you prefer, it’s worth having a conversation with the respective vendors. For example, while XenServer is listed as not supporting VXLAN, the underlying virtual switch within XenServer 6.5 is capable of supporting VXLAN. VXLAN support within KVM is also limited to operating systems with a Linux kernel version of 3.7 or later.
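The KVM kernel requirement is easy to check on a candidate host. A small illustrative helper (`vxlan_capable` is a made-up name for this sketch; the 3.7 threshold comes from the note above):

```shell
#!/bin/sh
# Illustrative check: VXLAN on a KVM host needs Linux kernel >= 3.7.
vxlan_capable() {
  major=${1%%.*}        # text before the first dot
  rest=${1#*.}
  minor=${rest%%.*}     # text between the first and second dots
  if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }; then
    echo yes
  else
    echo no
  fi
}

# Check the running kernel of this host.
vxlan_capable "$(uname -r)"
```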
Once you’ve defined the network, the next major topic will be storage.
Within a CloudStack environment, all VMs run on what’s known as “primary storage”. This then means that primary storage is something which must be supported by the hypervisor vendor. By default, CloudStack automates the creation of primary storage on the hypervisor. The two exceptions to this are XenServer which supports the concept of “pre-setup” storage and vSphere when VAAI is used for the datastore. Both vSphere and KVM are able to support “zone wide” storage due to their storage providers being able to create datastores which aren’t tied to a cluster.
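As an illustration of the automation mentioned above, adding NFS primary storage through CloudMonkey maps onto the `createStoragePool` API; all IDs and the NFS path below are placeholders:

```shell
# Add cluster-scoped NFS primary storage; CloudStack then creates the
# storage repository/datastore on the hypervisor automatically.
cloudmonkey create storagepool \
    name=primary-nfs-1 \
    zoneid=<zone-id> podid=<pod-id> clusterid=<cluster-id> \
    url=nfs://<nfs-server>/export/primary
```

For the XenServer "pre-setup" and vSphere VAAI exceptions, the storage is created outside CloudStack first, and the URL instead points CloudStack at the existing repository.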
Secondary storage is the storage on which all templates and snapshots reside. Most installations use NFS for secondary storage, since secondary storage performance isn't as important as primary storage performance. For those installations seeking to use either OpenStack Swift or an S3 compatible storage solution for secondary storage, it's important to pay attention to the notes related to NFS requirements. Each hypervisor has its own preferred template format, and for XenServer the concept of a "primary storage golden cache" was introduced. This allows a host with multiple primary storage repositories to designate one of them as a repository containing golden images. Doing so reduces the IO on the secondary storage system when a host will only be using known templates.
CloudStack has a rich feature set, and this table shows only a few of the major capabilities. What we're focusing on here are those items which have a direct bearing on the hypervisor choices. To highlight a few of the scenarios, let's consider:
HA: If used with XenServer and you wish to have a resource pool, HA must be enabled at the pool level but disabled for VM protection. If you don't enable HA at the pool level and a pool master failure occurs, XenServer won't elect a new pool master and CloudStack won't be able to manage the pool. If native HA is used with vSphere, dedicated resources won't function correctly, as vSphere could move a VM from a dedicated host to a general purpose one.
Zone wide primary storage: Zone wide primary storage is primary storage which is available to all hosts within an availability zone, which also means the storage is available across clusters. vSphere datacenter level storage supports such a concept, as does KVM with shared mount points. XenServer has no equivalent concept, which prevents zone wide primary storage from being used with XenServer.
Auto-scaling of VM instances is the ability to define a load evaluator which understands the inbound traffic load and is able to increase or decrease VM capacity based on that load. With all hypervisor types this can be done using a NetScaler, while XenServer is able to accomplish this with any load balancer.
New with 4.3:
- Quiesced snapshots on vSphere can be performed with the "quiesce" option for VM-only snapshots; volume-only snapshots work only if the hardware storage plugin supports the feature.
- Quiesced snapshots on XenServer don't call the XenServer quiesce API, so quiesced snapshots work only if the hardware storage plugin supports the feature.
CloudStack fully supports the use of multiple hypervisor types within a single cloud, but the feature set available may limit multiple hypervisors within a single zone. I personally have used vSphere, XenServer and KVM within a single zone so I know it can be done. One of the key requirements is to ensure the chosen network topology matches and is configured correctly. Correct configuration requires that all network labels are set correctly on the hosts, and within the physical network setup for the zone. When creating a zone, you’ll only be given the option to define a network label for the initial hypervisor type, so do remember to set labels for any additional hypervisors.
The network label will impact the virtual routers which are system VMs. By default system VMs can exist on any hypervisor in a zone, and with multiple hypervisors that means not only can a system VM exist on different hypervisors, but that if restarted the system VM could be deployed on a different hypervisor than it originally was. This can become problematic when the capacity of a given hypervisor is limited, or when placing hypervisors into maintenance mode. For this reason, it’s recommended that you force all system VMs to be on the same hypervisor type.
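One way to pin system VMs to a hypervisor type is the `system.vm.default.hypervisor` global setting, shown here as a CloudMonkey sketch (changing a global setting typically requires a management server restart to take effect):

```shell
# Force newly created system VMs (virtual routers, SSVM, CPVM) onto one type.
cloudmonkey update configuration \
    name=system.vm.default.hypervisor value=XenServer
```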
One of the more interesting expectations that multi-hypervisor support seems to create is one of migration between hypervisors. CloudStack contains no conversion tools, so whatever the original hypervisor type is, that is what it will always be.
Lastly, when performing capacity planning in a zone with multiple hypervisors, it's important to understand not only the physical limitations of the hosts and associated storage, but also the limitations imposed by the hypervisor.
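That last point can be made concrete: the effective per-host density is the lower of the hypervisor maximum and the CloudStack limit, using the figures from the tables earlier in this deck:

```shell
#!/bin/sh
# Effective per-host VM density = min(hypervisor maximum, CloudStack limit).
effective_density() {
  hv_max=$1; cs_limit=$2
  [ "$hv_max" -lt "$cs_limit" ] && echo "$hv_max" || echo "$cs_limit"
}

effective_density 1000 500          # XenServer: 1000 vs 500 -> 500
effective_density 512 128           # vSphere:    512 vs 128 -> 128
effective_density $((10 * 4)) 50    # KVM, 4 pCores at 10 VMs/core -> 40
```

Plan pod and cluster capacity from these effective numbers, not the vendor's headline maximum.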
Image: http://cce.clark.edu/blog-tags/team-styles
So which one is best?
Image: Crossroads: Success or Failure. Please give attribution to 'StockMonkeys.com' (and point the link to www.stockmonkeys.com). Thanks!