SlideShare una empresa de Scribd logo
1 de 20
Descargar para leer sin conexión
A brief history of autoscaling
Lessons learned from 5 years of it
First,
some context
About me

● Sebastian Stadil
● Founder of meetup.com/cloudcomputing
● Founder of Scalr
● sebastian@stadil.com
● Slashdotted at 14
About
● Simple, powerful cloud management suite
● Helps you design & manage resilient,
scalable infrastructure
● For apps deployed in public & private clouds
● Over 2,000,000 instances launched
● Applications vary from 1 to 10,000 instances
● Started out as simple auto-scaling system
And now
our feature presentation
A brief history of autoscaling
Lessons learned from 5 years of it
Load Average

● Combination of CPU, disk IO, number of
processes running
● Represents system utilization.
● Good for most applications.
● Most widely used.
CPU

● Good for services with dominant CPU
consumption (duh)
● Data processing, video processing, etc..
Response times
● Rarely used metric
● Many factors screw it up (network
throughput, system resources, different
application queues)
● When response only depends on hardware,
can work
● Downscaling is problematic
RAM

● Good for RAM based databases and caches
● Beware of invalidating keys
● Memcached, Redis, etc.
Schedule

● Good for services with predictable traffic
● Advertising campaigns, product launches
● When you know that you will get extra traffic
at specific time or day
● When traffic changes throughout the day
Queue size
● Maintain processing rate, esp. SLA*
*Processing rate = queue size / servers (given
that each server can process X tasks per hour).
● Good for processing services such as video
encoding or sending messages
Bandwidth

● Limited channel per server (1Gbit anyone?)
● Need higher download capacity
● Known traffic per user
Disk io

● Cassandra
● Certain Hadoop jobs
● Stuff that hits the disk
Fuck it
Build your own algorithms
Custom metrics

● Read a file
● Execute a script
● Example: # of threads / connections
Custom algorithms

● OR for upscaling
● AND for downscaling
● Configurable cooldowns
● Configurable steps
● Example: scale up early, scale down slowly
Examples

● Social gaming
● Enterprise services
Another example (mysql)

● Take master out
● Take backing up slave out
Summary
Started with general, went specific,
then went custom

Más contenido relacionado

La actualidad más candente

Managing Global Distributed Network
Managing Global Distributed NetworkManaging Global Distributed Network
Managing Global Distributed Network
Jimmy Lim
 

La actualidad más candente (19)

Application Caching: The Hidden Microservice
Application Caching: The Hidden MicroserviceApplication Caching: The Hidden Microservice
Application Caching: The Hidden Microservice
 
WSO2Con USA 2015: WSO2 DevOps: How to Deploy, Manage, Administer and Monitor ...
WSO2Con USA 2015: WSO2 DevOps: How to Deploy, Manage, Administer and Monitor ...WSO2Con USA 2015: WSO2 DevOps: How to Deploy, Manage, Administer and Monitor ...
WSO2Con USA 2015: WSO2 DevOps: How to Deploy, Manage, Administer and Monitor ...
 
M|18 Where and How to Optimize for Performance
M|18 Where and How to Optimize for PerformanceM|18 Where and How to Optimize for Performance
M|18 Where and How to Optimize for Performance
 
Google Cloud & Your Data
Google Cloud & Your DataGoogle Cloud & Your Data
Google Cloud & Your Data
 
Mobile 3: Launch Like a Boss!
Mobile 3: Launch Like a Boss!Mobile 3: Launch Like a Boss!
Mobile 3: Launch Like a Boss!
 
Managing Global Distributed Network
Managing Global Distributed NetworkManaging Global Distributed Network
Managing Global Distributed Network
 
Rust, Wright's Law, and the Future of Low-Latency Systems
Rust, Wright's Law, and the Future of Low-Latency SystemsRust, Wright's Law, and the Future of Low-Latency Systems
Rust, Wright's Law, and the Future of Low-Latency Systems
 
Automating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas SiravaraAutomating Gluster @ Facebook - Shreyas Siravara
Automating Gluster @ Facebook - Shreyas Siravara
 
Running MySQL in AWS
Running MySQL in AWSRunning MySQL in AWS
Running MySQL in AWS
 
Streaming datasets for personalization
Streaming datasets for personalizationStreaming datasets for personalization
Streaming datasets for personalization
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storag...
Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storag...Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storag...
Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storag...
 
Using galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wanUsing galera replication to create geo distributed clusters on the wan
Using galera replication to create geo distributed clusters on the wan
 
ClustrixDB at Samsung Cloud
ClustrixDB at Samsung CloudClustrixDB at Samsung Cloud
ClustrixDB at Samsung Cloud
 
How We Made Scylla Maintenance Easier, Safer and Faster
How We Made Scylla Maintenance Easier, Safer and FasterHow We Made Scylla Maintenance Easier, Safer and Faster
How We Made Scylla Maintenance Easier, Safer and Faster
 
Lightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning SpeedLightweight Transactions at Lightning Speed
Lightweight Transactions at Lightning Speed
 
Neoito — Scaling node.js
Neoito — Scaling node.jsNeoito — Scaling node.js
Neoito — Scaling node.js
 
MySQL on AWS RDS
MySQL on AWS RDSMySQL on AWS RDS
MySQL on AWS RDS
 
Cassandra On EPAM Cloud - VDAY 2017
Cassandra On EPAM Cloud - VDAY 2017Cassandra On EPAM Cloud - VDAY 2017
Cassandra On EPAM Cloud - VDAY 2017
 

Similar a Scalr: Setting Up Automated Scaling

kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
Krivoy Rog IT Community
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Omid Vahdaty
 

Similar a Scalr: Setting Up Automated Scaling (20)

kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
 
Amazon WorkSpaces-Virtual Desktops in Cloud
Amazon WorkSpaces-Virtual Desktops in CloudAmazon WorkSpaces-Virtual Desktops in Cloud
Amazon WorkSpaces-Virtual Desktops in Cloud
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Redis Developers Day 2014 - Redis Labs Talks
Redis Developers Day 2014 - Redis Labs TalksRedis Developers Day 2014 - Redis Labs Talks
Redis Developers Day 2014 - Redis Labs Talks
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
 
Writing and deploying serverless python applications
Writing and deploying serverless python applicationsWriting and deploying serverless python applications
Writing and deploying serverless python applications
 
PyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applicationsPyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applications
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
Design Like a Pro: How to Pick the Right System Architecture
Design Like a Pro: How to Pick the Right System ArchitectureDesign Like a Pro: How to Pick the Right System Architecture
Design Like a Pro: How to Pick the Right System Architecture
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Ansible and CloudStack
Ansible and CloudStackAnsible and CloudStack
Ansible and CloudStack
 
Introduction to OpenStack Storage
Introduction to OpenStack StorageIntroduction to OpenStack Storage
Introduction to OpenStack Storage
 
Getting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud StorageGetting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud Storage
 
Designing for operability and managability
Designing for operability and managabilityDesigning for operability and managability
Designing for operability and managability
 
The Professional Programmer
The Professional ProgrammerThe Professional Programmer
The Professional Programmer
 
Linux Memory Basics for SysAdmins - ChinaNetCloud Training
Linux Memory Basics for SysAdmins - ChinaNetCloud TrainingLinux Memory Basics for SysAdmins - ChinaNetCloud Training
Linux Memory Basics for SysAdmins - ChinaNetCloud Training
 
Node.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scaleNode.js Web Apps @ ebay scale
Node.js Web Apps @ ebay scale
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 

Más de Hakka Labs

Más de Hakka Labs (20)

Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series search
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data Science
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
DataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartDataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at Instacart
 
DataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleDataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scale
 
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataDataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
 
DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...
 
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
 
DataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringDataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineering
 
DataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresDataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data Structures
 
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkDataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
 
DataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesDataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with Ourselves
 
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityDataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
 
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL Workshop
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

Scalr: Setting Up Automated Scaling

  • 1. A brief history of autoscaling Lessons learned from 5 years of it
  • 3. About me ● Sebastian Stadil ● Founder of meetup.com/cloudcomputing ● Founder of Scalr ● sebastian@stadil.com ● Slashdotted at 14
  • 4. About ● Simple, powerful cloud management suite ● Helps you design & manage resilient, scalable infrastructure ● For apps deployed in public & private clouds ● Over 2,000,000 instances launched ● Applications vary from 1 to 10,000 instances ● Started out as simple auto-scaling system
  • 5. And now our feature presentation
  • 6. A brief history of autoscaling Lessons learned from 5 years of it
  • 7. Load Average ● Combination of CPU, disk IO, number of processes running ● Represents system utilization. ● Good for most applications. ● Most widely used.
  • 8. CPU ● Good for services with dominant CPU consumption (duh) ● Data processing, video processing, etc..
  • 9. Response times ● Rarely used metric ● Many factors screw it up (network throughput, system resources, different application queues) ● When response only depends on hardware, can work ● Downscaling is problematic
  • 10. RAM ● Good for RAM based databases and caches ● Beware of invalidating keys ● Memcached, Redis, etc.
  • 11. Schedule ● Good for services with predictable traffic ● Advertising campaigns, product launches ● When you know that you will get extra traffic at specific time or day ● When traffic changes throughout the day
  • 12. Queue size ● Maintain processing rate, esp. SLA* *Processing rate = queue size / servers (given that each server can process X tasks per hour). ● Good for processing services such as video encoding or sending messages
  • 13. Bandwidth ● Limited channel per server (1Gbit anyone?) ● Need higher download capacity ● Known traffic per user
  • 14. Disk io ● Cassandra ● Certain Hadoop jobs ● Stuff that hits the disk
  • 15. Fuck it Build your own algorithms
  • 16. Custom metrics ● Read a file ● Execute a script ● Example: # of threads / connections
  • 17. Custom algorithms ● OR for upscaling ● AND for downscaling ● Configurable cooldowns ● Configurable steps ● Example: scale up early, scale down slowly
  • 18. Examples ● Social gaming ● Enterprise services
  • 19. Another example (mysql) ● Take master out ● Take backing up slave out
  • 20. Summary Started with general, went specific, then went custom