In this talk, Wei looks at the new VM autoscaling functionality in CloudStack (due for the 4.18 release) that gives VM autoscaling without relying on any external devices.
Wei Zhou is a committer and PMC member of the Apache CloudStack project, and works for ShapeBlue as a Software Architect.
-----------------------------------------
CloudStack Collaboration Conference 2022 took place on 14th-16th November in Sofia, Bulgaria and virtually. The hybrid event brought together 370 attendees from the global CloudStack community and hosted 43 sessions from leading CloudStack experts, users and skilful engineers from the open-source world, including technical talks, user stories, presentations of new features and integrations, and more.
VM Autoscaling With CloudStack VR As Network Provider
1. Feature first look: VM AutoScaling using
CloudStack Virtual Router
CloudStack Collaboration Conference, 14 - 16 November 2022
2. Who am I?
Wei Zhou
2010.07 Started on Cloud Computing
2012.12 Started on Apache CloudStack
2013.05 Became an Apache CloudStack committer
2017.03 Became an Apache CloudStack PMC member
2021.05 Joined ShapeBlue as Software Architect
Email: weizhou@apache.org
weizhouapache@gmail.com
5. What is VM AutoScaling
×CloudStack documentation says:
“AutoScaling allows you to scale your back-end services or application VMs up or
down seamlessly and automatically according to the conditions you define.”
• When: the conditions you define are matched
• What: scale up or down
• How: seamlessly and automatically, no manual intervention.
×“Scale Up” means creating a new VM (it is called “Scale Out” in some articles)
×“Scale Down” means destroying an underused VM (it is called “Scale In” in some articles)
6. History in CloudStack
×A feature introduced in Apache CloudStack 4.1, contributed by Citrix
×Use Case 1: Citrix Netscaler as Load balancer provider
• Linux User CPU - percentage, Linux System CPU - percentage, Linux CPU Idle -
percentage (source: snmp)
• Response Time - microseconds (source: netscaler)
×Use Case 2: VM AutoScaling without Netscaler on XenServer
• Linux User CPU - percentage - native (source: cpu, retrieved from XenCenter)
• Linux User RAM - percentage - native (source: memory, retrieved from XenCenter)
7. VM AutoScaling using CloudStack Virtual Routers
×New functionality in CloudStack 4.18.0
• Get metrics from hosts and CloudStack virtual routers as well
• Hypervisor support: KVM, VMware, XenServer/XCP-ng
• New CloudStack UI (vue3)
• Support more VM parameters
• PR to CloudStack 4.18.0:
https://github.com/apache/cloudstack/pull/6571
9. List of glossaries (resources)
Name | Description
Counter | Performance counters to be used for monitoring the health of the guest VMs.
Condition | Conditions express criteria for triggering an autoscale action. A Condition uses the Counters above.
AutoScale Policy | An AutoScale Policy defines a policy for taking an autoscale action by combining conditions.
AutoScale VM Profile | An AutoScale VM Profile is a set of various settings to be used for VMs while taking a scale up or scale down action, for example service offering, template, ssh keys.
AutoScale VM Group | An AutoScale VM Group associates scaleup and scaledown policies with a load balancing rule.
10. Autoscale Policy
I have a group of VMs. I wish CloudStack to automatically
×Create new VM (a.k.a Scale Up), if in last 300 seconds
• “VM CPU - average percentage” > 50, AND
• “VM Memory - average percentage” > 80
×The new VM has 4CPU, 8GB RAM, created from CentOS 8 template.
×The new VM is added to a load balancer (10.10.10.10/443/80) as a web server.
[Diagram: Example AutoScale VM group, with labels Counter-1, Counter-2, Condition-1, Condition-2 and Autoscale VM profile]
14. Autoscaling policy
×Properties
• Name (*)
• Conditions (1 or more)
• Duration (in seconds)
• Action: ScaleUp, ScaleDown
• Quiettime (300 seconds by default)
×A counter can be used only once in a policy. It can be used in other policies.
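The policy rules above can be sketched in a few lines of Python. This is only an illustration, not CloudStack's implementation: the class names, the operator keys (GT, LT, ...) and the shape of the `averages` dictionary are assumptions made for the example. It shows the two rules stated on the slides: conditions are combined with AND, and a counter may appear only once per policy.

```python
from dataclasses import dataclass
import operator

# Relational operators; these short keys are illustrative assumptions.
OPERATORS = {"GT": operator.gt, "GE": operator.ge, "LT": operator.lt,
             "LE": operator.le, "EQ": operator.eq}


@dataclass(frozen=True)
class Condition:
    counter: str      # counter name, e.g. "VM CPU - average percentage"
    relation: str     # a key of OPERATORS
    threshold: float


@dataclass
class Policy:
    name: str
    action: str           # "ScaleUp" or "ScaleDown"
    duration: int         # seconds over which the counters are averaged
    conditions: list
    quiettime: int = 300  # default taken from the slide

    def __post_init__(self):
        counters = [c.counter for c in self.conditions]
        if not counters:
            raise ValueError("a policy needs at least one condition")
        if len(counters) != len(set(counters)):
            raise ValueError("a counter can be used only once in a policy")

    def matches(self, averages):
        """Return True only if every condition holds (AND semantics)."""
        return all(OPERATORS[c.relation](averages[c.counter], c.threshold)
                   for c in self.conditions)


# The scale-up example from slide 10: CPU > 50 AND memory > 80 over 300 seconds.
scale_up = Policy(
    name="scale-up-web",
    action="ScaleUp",
    duration=300,
    conditions=[
        Condition("VM CPU - average percentage", "GT", 50),
        Condition("VM Memory - average percentage", "GT", 80),
    ],
)
```

Calling `scale_up.matches(...)` with the averaged counter values for the last 300 seconds then tells whether the policy would fire.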
15. Autoscaling VM profile
×Properties:
• zoneid (required)
• serviceofferingid (required)
• templateid (required)
• autoscaleuserid (Netscaler only)
• counterParam (Netscaler only)
• snmpcommunity
• snmpport
• userdata (*)
• expungevmgraceperiod
• otherDeployParams
• rootdisksize (*)
• diskofferingid (*)
• disksize (data) (*)
• ssh keypairs (*)
• affinitygroupids (*)
• networkids (*)
×Dynamic service offering is not supported.
×userdata is a base64-encoded string.
×LB network is always the default network of VMs.
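Since userdata has to be passed as a base64-encoded string, a small helper like the following can prepare it. This is a minimal sketch using only the Python standard library; the function name and the sample script are made up for the example.

```python
import base64


def encode_userdata(script: str) -> str:
    """Encode a plain-text userdata script as a base64 string."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


# Hypothetical userdata script, for illustration only.
cloud_init = "#!/bin/bash\necho hello > /tmp/hello.txt\n"
encoded = encode_userdata(cloud_init)
```

The resulting `encoded` string is what would be supplied as the `userdata` property of the VM profile.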
16. Autoscaling VM group
×Properties:
• lbRuleId
• minMembers
• maxMembers
• interval
• scaleUpPolicyIds
• scaleDownPolicyIds
• profileId
×Each VM group must have at least 1 scaleup policy and at least 1 scaledown policy.
×An autoscaling VM group can have multiple scaleup and scaledown policies.
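The group properties and the policy requirement can be modelled as a small data class. This is a sketch only: the snake_case field names mirror the properties listed above, and the checks on min/max members and on the interval are assumptions added for the example, not rules stated on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class AutoScaleVmGroup:
    lb_rule_id: str
    profile_id: str
    min_members: int
    max_members: int
    interval: int  # seconds between two monitoring runs
    scale_up_policy_ids: list = field(default_factory=list)
    scale_down_policy_ids: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the group looks valid."""
        errors = []
        if not self.scale_up_policy_ids:
            errors.append("at least 1 scaleup policy is required")
        if not self.scale_down_policy_ids:
            errors.append("at least 1 scaledown policy is required")
        # Assumption for this sketch: min must not exceed max.
        if self.min_members > self.max_members:
            errors.append("minMembers must not exceed maxMembers")
        # Assumption for this sketch: the interval must be positive.
        if self.interval <= 0:
            errors.append("interval must be a positive number of seconds")
        return errors


group = AutoScaleVmGroup("lb-1", "profile-1", min_members=2, max_members=5,
                         interval=30, scale_up_policy_ids=["up-1"])
problems = group.validate()
```

Here `problems` reports the missing scaledown policy, since the example group only has a scaleup policy.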
17. Supported APIs
×Counter: Create, List, Delete (ROOT admin only)
×Condition: Create, List, Update (*), Delete
×Autoscaling Policy: Create, List, Update, Delete
×Autoscaling VM profile: Create, List, Update, Delete
×Autoscaling VM group: Create, List, Update, Delete, Enable, Disable
• New parameter `cleanup` for the deleteAutoScaleVmGroup API
• Creating new counters is not recommended, as they are not supported in the backend
19. Prerequisites
×Create a load balancer rule
×Upload the VM template
×Create other resources, if needed
• service offering, disk offering
• ssh keypairs
• affinity groups
• other networks, etc
×Acquire an IP if isolated network does not have a Public IP.
×VmAutoScaling capability is enabled by default for network offerings with load
balancer. To disable it, please create a network offering without VmAutoScaling
support.
20. Prerequisites
If memory usage is used in scaleup or scaledown policies,
×KVM:
• VM template has virtio driver installed,
• add the following lines to /etc/cloudstack/agent/agent.properties
• vm.memballoon.disable = false (false by default)
• vm.memballoon.stats.period = positive number (disabled by default; 5/10/60 are tested OK)
• Note: Windows VMs might be stuck after live migration
×XenServer:
• VM template has PV driver installed
×Bug fix (4.18.0+): Fix memory stats for KVM (#6358)
×Bug fix (4.17.2+): XenServer/XCP-ng: fix vm memory usage is always 99.9x% (#6852)
×Bug fix (4.17.1.0+): Fix VMware memory retrieval (#6414)
21. (1) Create AutoScale VM group
×Steps
• create Autoscale vm profile
• create conditions
• create scaleup policies
• create scaledown policies
• create Autoscale vm group
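The order of the steps above matters because each later resource refers to IDs returned by earlier ones. The sketch below only builds an ordered plan and sends nothing to a server. The API command and parameter names are written as I recall them from the CloudStack API reference and should be verified there; every value in angle brackets is a placeholder, and the scale-down threshold of 20 is an arbitrary example.

```python
def creation_plan():
    """Return (API command, parameters) pairs in the order they would be issued."""
    return [
        ("createAutoScaleVmProfile", {
            "zoneid": "<zone id>",
            "serviceofferingid": "<service offering id>",
            "templateid": "<template id>",
        }),
        ("createCondition", {            # used by the scaleup policy
            "counterid": "<counter id>",
            "relationaloperator": "GT",
            "threshold": 50,
        }),
        ("createCondition", {            # used by the scaledown policy
            "counterid": "<counter id>",
            "relationaloperator": "LT",
            "threshold": 20,
        }),
        ("createAutoScalePolicy", {
            "action": "ScaleUp",
            "duration": 300,
            "quiettime": 300,
            "conditionids": "<scaleup condition id>",
        }),
        ("createAutoScalePolicy", {
            "action": "ScaleDown",
            "duration": 300,
            "quiettime": 300,
            "conditionids": "<scaledown condition id>",
        }),
        ("createAutoScaleVmGroup", {
            "lbruleid": "<load balancer rule id>",
            "vmprofileid": "<vm profile id>",
            "minmembers": 2,
            "maxmembers": 5,
            "interval": 30,
            "scaleuppolicyids": "<scaleup policy id>",
            "scaledownpolicyids": "<scaledown policy id>",
        }),
    ]


commands = [name for name, _ in creation_plan()]
```

Each entry would be sent through whatever API client is in use, with the placeholders replaced by the IDs returned from the previous calls.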
22. (2) Monitor AutoScale VM group
×MonitorTask runs every `interval` (in seconds)
×MonitorTask is created when a VM group is created or enabled.
×MonitorTask is shut down and removed when a VM group is disabled or removed.
×Stopped VMs are not considered in max/min VM check
23. Monitor AutoScale VM group: About Metrics
×VM memory usage percentage
• KVM/XenServer/XCP-ng : (total memory - free memory) * 100 / total memory
• VMware: Active Guest Memory (on vCenter) * 100 / total memory
×Public Network Received/Transmit
• Added iptables rules for each public interface
• Get counter of iptables rules
×Load balancer connections
• Enabled haproxy socket in haproxy.cfg
• Get Load balancer statistics via haproxy socket
×Known issue when CloudStack gets average load balancer connections from
CloudStack Virtual Routers, see
https://github.com/apache/cloudstack/issues/6849
×Other VMs or services in the isolated network might have some impact on
the counters “Public Network Received/Transmit”.
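The two memory usage formulas at the top of this slide translate directly into code. A minimal sketch follows; the function names are made up for the example, and the inputs are assumed to be in the same unit (for instance KiB).

```python
def memory_usage_percent(total_memory, free_memory):
    """KVM/XenServer/XCP-ng: (total memory - free memory) * 100 / total memory."""
    if total_memory <= 0:
        raise ValueError("total memory must be positive")
    return (total_memory - free_memory) * 100 / total_memory


def vmware_memory_usage_percent(active_guest_memory, total_memory):
    """VMware: Active Guest Memory * 100 / total memory."""
    if total_memory <= 0:
        raise ValueError("total memory must be positive")
    return active_guest_memory * 100 / total_memory


kvm_usage = memory_usage_percent(total_memory=8192, free_memory=2048)
vmware_usage = vmware_memory_usage_percent(active_guest_memory=1024, total_memory=4096)
```

Note that the two hypervisor families measure different things (used versus active memory), so the same threshold may behave differently on VMware than on KVM or XenServer.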
24. (3) Check AutoScale VM group
×AutoScaleMonitor checks all VM groups in parallel
×ScaleUp policies are checked before ScaleDown policies.
×Each network has a network rate setting.
×A policy will be skipped if there are Inactive records for the group or the policy
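The checking order described above can be sketched as follows. This is an assumption-laden illustration rather than the actual AutoScaleMonitor logic: policies are represented by plain names, and `matches` and `has_inactive_record` are hypothetical callbacks standing in for the condition check and the lookup of Inactive records for the group or the policy.

```python
def first_applicable_policy(scale_up, scale_down, matches, has_inactive_record):
    """Return the first policy that should fire, or None.

    ScaleUp policies are checked before ScaleDown policies, and a policy
    is skipped when an Inactive record exists for it or for its group.
    """
    for policy in [*scale_up, *scale_down]:
        if has_inactive_record(policy):
            continue  # skipped, as described on the slide
        if matches(policy):
            return policy
    return None


# Both policies match here, but the scaleup policy has an Inactive record.
chosen = first_applicable_policy(
    scale_up=["up-cpu"],
    scale_down=["down-cpu"],
    matches=lambda policy: True,
    has_inactive_record=lambda policy: policy == "up-cpu",
)
```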
25. Check VM Group: Data types and processing
×These VMs are considered as available: Running, Starting, Stopping, Migrating
×Stopped VMs are not available, therefore they are not considered in the calculation
×“Aggregated data for individual VM” is not used currently.
Data type | Data calculation
Instant data for individual VM (e.g. VM CPU, Memory utilization) | sum(value) / count(value)
Aggregated data for individual VM (e.g. VM network mbps / disk iops) | (last value - first value) / (last timestamp - first timestamp)
Instant data for VM group (e.g. Average LB connections per VM) | sum(value) / count(value) / count of available VMs
Aggregated data for VM group (e.g. Public network received/transmit mbps per VM) | (last value - first value) / (last timestamp - first timestamp) / count of available VMs
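The four calculations in the table can be written out as small functions. This is a sketch under stated assumptions: instant data is a list of sampled values, aggregated data is a time-ordered list of (timestamp, cumulative value) pairs, and the function names are invented for the example.

```python
AVAILABLE_STATES = {"Running", "Starting", "Stopping", "Migrating"}


def count_available(vm_states):
    """Count VMs that take part in the calculation (Stopped VMs do not)."""
    return sum(1 for state in vm_states if state in AVAILABLE_STATES)


def instant_average(values):
    """Instant data for an individual VM: sum(value) / count(value)."""
    return sum(values) / len(values)


def aggregated_rate(samples):
    """Aggregated data: (last value - first value) / (last timestamp - first timestamp)."""
    (first_ts, first_value), (last_ts, last_value) = samples[0], samples[-1]
    return (last_value - first_value) / (last_ts - first_ts)


def instant_group_average(values, available_vms):
    """Instant data for a VM group: the average divided by the available VM count."""
    return instant_average(values) / available_vms


def aggregated_group_rate(samples, available_vms):
    """Aggregated data for a VM group: the rate divided by the available VM count."""
    return aggregated_rate(samples) / available_vms


vms = count_available(["Running", "Stopped", "Migrating"])
cpu_average = instant_average([40, 60, 80])
lb_connections_per_vm = instant_group_average([30, 50], available_vms=4)
network_rate_per_vm = aggregated_group_rate([(0, 0), (10, 400)], available_vms=4)
```

The aggregated variants assume at least two samples with different timestamps; a real implementation would need to guard against a zero time span and a zero count of available VMs.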
26. (4) Scale Up
×VM name format: autoScaleVm-<Group name>-<seq number>-<6 random letters>
×An Inactive record is inserted when VM group is scaled up.
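The VM name format above can be illustrated with a short generator. This is a sketch, not the code CloudStack uses: the helper name is made up, and the choice of lowercase ASCII letters for the random suffix is an assumption, since the slide only says "6 random letters".

```python
import random
import re
import string


def autoscale_vm_name(group_name, seq_number):
    """Build a name like autoScaleVm-<Group name>-<seq number>-<6 random letters>."""
    # Assumption: the suffix is drawn from lowercase ASCII letters.
    suffix = "".join(random.choice(string.ascii_lowercase) for _ in range(6))
    return f"autoScaleVm-{group_name}-{seq_number}-{suffix}"


name = autoscale_vm_name("web", 3)
```

Because the suffix is random, the exact name differs on every call; only the prefix, group name and sequence number are predictable.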
27. (5) Scale Down
×An Inactive record is inserted when VM group is scaled down.
28. (6) Manage AutoScale VM group
×Supported actions:
• Disable VM group
• Update VM group, VM profile, policies, conditions
• Add new policy, Remove policy
• Add condition, Remove condition
• Enable VM group
×Adding VMs to VM group is NOT allowed.
×A VM can be removed from the AutoScale VM group and the LB rule only if the VM group is
Disabled.
×An Inactive record is inserted when VM group is enabled/disabled.
32. Summary
×Enhancement in CloudStack 4.18.0
• CloudStack Virtual Router is used as LB provider, instead of Netscaler
• Supports more performance counters from CloudStack virtual router
• Supports hypervisor: KVM, VMware, XenServer/XCP-ng
• Supported on CloudStack UI (vue3)
33. Future work
×Support more counters
• VM disk write/read bps/iops
• VM nic interfaces transmit/received
• Specific Public IP/port Transmit/received
×Support UserData ids
• Userdata is a first-class resource since 4.18.0.0
×Support Shared networks
• LB is optional
• Metrics from virtual routers are no longer available
37. Appendix: Quick Tips
×rootdisksize is exclusive with overriderootdiskoffering
×Not tested
• Deploy VM with OVF images on VMware
• AutoScaling with netscaler as LB provider
• AutoScaling without netscaler
×Not implemented:
• UI: load an existing VM profile when creating a VM group
• UI: Manage conditions/policies and VM profiles
• Server: Check conditions by formula in scale policies (not only AND)