Despite all the buzz, building a horizontally scalable application for cloud deployment isn't all that different from building one for a physical deployment, except for its ability to change size on the fly. Larger applications have long used commodity hardware and fault-tolerant design to achieve high availability and scalability, but provisioning capacity on physical hardware remains troublesome. What the cloud really adds architecturally is the ability to add new resources instantly, and even to change your provisioning profile algorithmically.
3. About me
● Sebastian Stadil
● Founder of meetup.com/cloudcomputing
● Founder of Scalr
● sebastian@stadil.com
● Slashdotted at 14
4. About
● Simple, powerful cloud management suite
● Helps you design & manage resilient, scalable infrastructure
● For apps deployed in public & private clouds
● Over 2,000,000 instances launched
● Applications vary from 1 to 10,000 instances
● Started out as simple auto-scaling system
6. A brief history of autoscaling
Lessons learned from 5 years of it
7. Load Average
● Reflects CPU demand, disk I/O wait, and the number of running processes
● Represents overall system utilization
● Good for most applications.
● Most widely used.
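A minimal sketch of how load average could drive a scaling hint. The per-CPU normalization and the 0.8 / 0.3 thresholds are assumptions chosen for illustration, not values from the talk; `normalized_load` and `scaling_hint` are hypothetical names.

```python
import os


def normalized_load(cpu_count=None):
    """Return the 1-minute load average divided by the CPU count (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    cpus = cpu_count or os.cpu_count() or 1
    return one_minute / cpus


def scaling_hint(load_per_cpu, upper=0.8, lower=0.3):
    """Map a per-CPU load figure to 'up', 'down', or 'hold'.

    The thresholds are illustrative assumptions; tune them per application.
    """
    if load_per_cpu > upper:
        return "up"
    if load_per_cpu < lower:
        return "down"
    return "hold"
```

On a Unix host, `scaling_hint(normalized_load())` gives the hint for the current machine; a real system would average this across the whole server group.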
8. CPU
● Good for services with dominant CPU consumption (duh)
● Data processing, video processing, etc.
9. Response times
● Rarely used metric
● Many factors screw it up (network throughput, system resources, different application queues)
● Can work when response time depends only on hardware
● Downscaling is problematic
10. RAM
● Good for RAM based databases and caches
● Beware of invalidating keys
● Memcached, Redis, etc.
11. Schedule
● Good for services with predictable traffic
● Advertising campaigns, product launches
● When you know that you will get extra traffic at a specific time or day
● When traffic changes throughout the day
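Schedule-based scaling can be sketched as a lookup from time of day to instance count. The time windows, the counts, and the names `SCHEDULE` and `desired_instances` below are all hypothetical.

```python
from datetime import datetime, time

# Hypothetical schedule: (window start, window end, instance count).
# The first matching window wins; the end of a window is exclusive.
SCHEDULE = [
    (time(9, 0), time(18, 0), 10),   # business hours
    (time(18, 0), time(23, 0), 6),   # evening traffic
]
DEFAULT_INSTANCES = 2  # assumed baseline outside every window


def desired_instances(now, schedule=SCHEDULE, default=DEFAULT_INSTANCES):
    """Return the instance count the schedule asks for at the moment `now`."""
    current = now.time()
    for start, end, count in schedule:
        if start <= current < end:
            return count
    return default
```

For a one-off event such as a product launch, the same idea would work with full dates in place of times of day.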
12. Queue size
● Maintain processing rate, esp. to meet an SLA*
*Tasks per server = queue size / servers; given that each server can process X tasks per hour, the queue drains in queue size / (servers × X) hours.
● Good for processing services such as video
encoding or sending messages
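The queue-size rule can be turned into a sizing calculation: given the backlog, each server's throughput, and the time allowed by the SLA, work out how many servers are needed. This is a sketch under the assumption that throughput is constant and identical across servers; `servers_needed` and its parameters are hypothetical names.

```python
import math


def servers_needed(queue_size, tasks_per_server_per_hour, sla_hours, min_servers=1):
    """Return how many servers drain `queue_size` tasks within `sla_hours`.

    Assumes every server processes a constant `tasks_per_server_per_hour`.
    """
    if tasks_per_server_per_hour <= 0 or sla_hours <= 0:
        raise ValueError("throughput and SLA window must be positive")
    capacity_per_server = tasks_per_server_per_hour * sla_hours
    return max(min_servers, math.ceil(queue_size / capacity_per_server))
```

For example, 1,000 queued tasks at 50 tasks per server per hour with a 2-hour SLA gives `servers_needed(1000, 50, 2)`, which evaluates to 10.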
13. Bandwidth
● Limited channel per server (1 Gbit/s, anyone?)
● Need higher download capacity
● Known traffic per user
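When traffic per user is known, the number of servers follows from the link capacity. The sketch below assumes a 1 Gbit/s link per server and an 80% utilization ceiling to leave headroom; both figures, and the name `servers_for_bandwidth`, are illustrative assumptions.

```python
import math


def servers_for_bandwidth(users, mbit_per_user, link_mbit=1000.0, max_utilization=0.8):
    """Return the servers needed so that no link exceeds `max_utilization`.

    `link_mbit` is the per-server link capacity in Mbit/s (assumed 1 Gbit/s).
    """
    if link_mbit <= 0 or not 0 < max_utilization <= 1:
        raise ValueError("link capacity must be positive and utilization in (0, 1]")
    usable_mbit = link_mbit * max_utilization
    return max(1, math.ceil(users * mbit_per_user / usable_mbit))
```

With 5,000 concurrent users at 2 Mbit/s each, `servers_for_bandwidth(5000, 2)` evaluates to 13.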
16. Custom metrics
● Read a file
● Execute a script
● Example: # of threads / connections
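Both custom-metric sources can be sketched with the Python standard library: one helper reads a number from a file, the other runs a script and parses its standard output. The helper names and the convention that the metric is a single number are assumptions, not a description of any particular product's interface.

```python
import subprocess
from pathlib import Path


def metric_from_file(path):
    """Read a metric from a file that is assumed to contain a single number."""
    return float(Path(path).read_text().strip())


def metric_from_script(argv, timeout=10):
    """Run a command and parse its stdout as a single number.

    Raises subprocess.CalledProcessError if the command exits non-zero,
    and subprocess.TimeoutExpired if it runs longer than `timeout` seconds.
    """
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return float(result.stdout.strip())
```

A thread or connection count could be exposed either way, for instance by having the application write the current count to a file that `metric_from_file` polls.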
17. Custom algorithms
● OR for upscaling
● AND for downscaling
● Configurable cooldowns
● Configurable steps
● Example: scale up early, scale down slowly
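The combination rules above can be sketched as one decision function: scale up if any metric is over its threshold (OR), scale down only if all metrics are under theirs (AND), with separate cooldowns and step sizes per direction. Every name and default below (`Rule`, `decide`, the 60 s and 600 s cooldowns, the step sizes) is a hypothetical choice made to illustrate "scale up early, scale down slowly".

```python
from dataclasses import dataclass


@dataclass
class Rule:
    value: float             # current metric reading
    scale_up_above: float    # upscale threshold
    scale_down_below: float  # downscale threshold


def decide(rules, current, now, last_change, *,
           up_step=2, down_step=1,
           up_cooldown=60, down_cooldown=600,
           min_size=1, max_size=20):
    """Return the new instance count given metric rules and timestamps in seconds."""
    elapsed = now - last_change
    # OR for upscaling: a single hot metric is enough, after a short cooldown.
    if any(r.value > r.scale_up_above for r in rules) and elapsed >= up_cooldown:
        return min(max_size, current + up_step)
    # AND for downscaling: every metric must be idle, after a long cooldown.
    if (rules and all(r.value < r.scale_down_below for r in rules)
            and elapsed >= down_cooldown):
        return max(min_size, current - down_step)
    return current
```

The larger up step and shorter up cooldown make the group grow quickly under load, while the smaller down step and longer down cooldown make it shrink cautiously.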