Auto-Scaling to Minimize Cost and
  Meet Application Deadlines in
        Cloud Workflows

                        SC 11
                   (Nov 16, TCC 305)




        Ming Mao, Marty Humphrey
        CS Department, University of Virginia
Introduction


       Resource provisioning questions are not trivial
           Under-provisioning → hurt performance
           Over-provisioning → pay more than necessary

         How many resources?
         What types of resources?

         When to acquire or release?

         How to use them?

       A performance-resource mapping problem
Auto-Scaling


       Schedule-based and rule-based auto-scaling
           E.g. “run 10 instances between 8 AM and 6 PM every day and
            2 instances at all other times.”
           E.g. “add (remove) 2 instances when the average CPU
            utilization is above 70% (below 20%) for 5 minutes.”
           Simple and convenient, works well for simple applications
           What if the relationship between performance and
            resource utilization indicators is complex?
           The resource utilization indicators are low-level and may
            not be expressive enough
           They do not take user budgets into account well
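
The two example rules above can be written down directly. The following is a minimal Python sketch, not any provider's API: the function names, the one-sample-per-minute window, and the reading of "for 5 minutes" as an average over the last five samples are assumptions made for illustration.

```python
def scheduled_instances(hour):
    """Schedule-based rule: 10 instances from 8 AM up to 6 PM, 2 otherwise.

    `hour` is the local hour of day (0-23).
    """
    return 10 if 8 <= hour < 18 else 2


def rule_based_delta(cpu_samples, high=70.0, low=20.0, step=2):
    """Rule-based rule: change in instance count for one 5-minute window.

    `cpu_samples` holds CPU utilization percentages, assumed to be one
    sample per minute over the last 5 minutes (an assumption of this sketch).
    """
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg > high:
        return step       # add instances
    if avg < low:
        return -step      # remove instances
    return 0              # leave the pool unchanged


print(scheduled_instances(9), scheduled_instances(22))
print(rule_based_delta([80, 75, 90, 72, 85]))
```

Neither function looks at deadlines, VM prices, or a budget, which is the limitation the last three bullets point to.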
Auto-Scaling


       Goals of auto-scaling mechanisms
             Balance performance and cost
                   E.g. meet performance goals with minimum cost or maximize
                    utility within a limited budget
             Reflect different options for computing resources
                   E.g. VMs differ in processing power and price
             Be aware of practical considerations
                   E.g. a VM may take several minutes to be ready to use
             Be aware of the cloud billing model
                   E.g. billed by instance-hours
             Support specific application performance requirements
                   E.g. deadlines, the number of concurrent users, communication
                    latency
Cloud application model

       [Figure: example cloud application. Requests from Gold Members, Silver Members, and Non-Members arrive at Entry Point (1) and flow through the service units Authentication (2), Data Validation (3), Loading Profile (4), Credit History (5), Health Record (6), Base Model (7), Third Party Evaluation (8), Advanced Model (9), Complete Model (10), and Response (11). An auto-scaling layer maps Non-Member, Silver Member, and Gold Member jobs onto cloud VMs.]

       The app consists of service units
       A job consists of tasks
       Jobs are categorized into classes (by deadline and processing flow)
       The cloud offers multiple VM types (differing in price and processing power)
       The app has no advance knowledge of the workload
       VMs take time to start up (VM acquisition delay) and are billed by the hour
Problem definition

        Cloud application
            $app = \{S_i\}$
        Job class
            $J = \{DAG(S_i),\ deadline \mid S_i \in app\}$
        Cloud VM
            $VM_v = \{[j_J^{S_i}]_v,\ c_v,\ lag_v\}$
        Workload
            $W_t = \sum_{S_i} \sum_{J} j_J^{S_i}$
        Scaling plan
            $Scaling_t = \{VM_v,\ N_v\}$
        Scheduling plan
            $Schedule_t = \{j_J^{S_i} \to VM_v\}$
        Goal
            $\min(C) = \min\left(\sum_v c_v N_v\right)$
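
The goal, cost as the sum over VM types of unit price times instance count, can be evaluated for a concrete scaling plan. This is a small Python sketch under stated assumptions: the dictionary representation and the function names `plan_cost` and `cheapest_plan` are hypothetical, the prices are the Standard and High-CPU hourly prices quoted in the evaluation setup, and every candidate plan is assumed to already meet the deadlines.

```python
def plan_cost(scaling_plan, unit_cost):
    """Cost C of a scaling plan: sum over VM types v of c_v * N_v."""
    return sum(unit_cost[vm_type] * count
               for vm_type, count in scaling_plan.items())


def cheapest_plan(candidate_plans, unit_cost):
    """Pick the plan with minimum cost among plans assumed to be feasible."""
    return min(candidate_plans, key=lambda plan: plan_cost(plan, unit_cost))


# Hourly prices c_v in dollars (from the evaluation setup).
unit_cost = {"Standard": 0.085, "High-CPU": 0.68}

# Two hypothetical scaling plans {VM type: N_v}.
plans = [{"Standard": 2, "High-CPU": 1}, {"Standard": 4}]

best = cheapest_plan(plans, unit_cost)
print(best, round(plan_cost(best, unit_cost), 3))
```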
Solution


       SCS (Scaling-Consolidation-Scheduling)
         Task bundling
         Deadline assignment
         Scaling
         Instance consolidation
         Scheduling
Solution – Step 1


       Task bundling
         Idea – force tasks to run on the same instance to improve
          performance and save data transfer cost
         Example – before bundling, T6 runs on Server 1 and T8 on
          Server 2; after bundling, T6 and T8 are treated as one task
          T6' and both run on Server 1
Solution – Step 2


       Deadline assignment
         Idea – to break task dependencies, assign deadlines
          proportionally based on task running time (on their cost-
          efficient machines)
         Example
           [Figure: workflow DAG with tasks T1 to T13. Before: the job has a single window from 3:00 PM to 4:30 PM. After: each task has its own deadline, with boundaries at 3:00, 3:10, 3:20, 3:50, 4:00, 4:20, and 4:30.]
           Task upgrading
             $rank = \dfrac{makespan_{before} - makespan_{after}}{cost_{after} - cost_{before}}$
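
Proportional deadline assignment and the upgrade rank can both be sketched in a few lines of Python. This is an illustration for a simple chain of tasks only, not the authors' DAG algorithm. The function names are hypothetical, and the running times below are made up so that the resulting boundaries match the ones in the example (10, 20, 50, 60, 80, and 90 minutes after 3:00).

```python
def assign_deadlines(running_times, window):
    """Split a job window across a chain of tasks in proportion to running time.

    Returns each task's deadline as an offset (same unit as `window`)
    from the start of the job.
    """
    total = sum(running_times)
    deadlines, elapsed = [], 0.0
    for t in running_times:
        elapsed += t
        deadlines.append(window * elapsed / total)
    return deadlines


def upgrade_rank(makespan_before, makespan_after, cost_before, cost_after):
    """Makespan saved per extra dollar when a task moves to a faster VM type."""
    if cost_after <= cost_before:
        raise ValueError("this sketch assumes the upgrade costs more")
    return (makespan_before - makespan_after) / (cost_after - cost_before)


# Hypothetical running times (minutes) along one path; window is 3:00 to 4:30.
print(assign_deadlines([5, 5, 15, 5, 10, 5], window=90))

# Hypothetical upgrade: 20 minutes saved for 2 extra dollars.
print(upgrade_rank(60, 40, 1.0, 3.0))
```

A higher rank means more makespan reduction per unit of added cost, so upgrades would be tried in decreasing rank order.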
Solution – Step 3


        Determine the number of instances
          From deadline assignment, we have
            Task running time – $t_m$
            Task execution interval – $[T_0, T_1]$

          Load vector
            $LV_m = [\,t_m / (T_1 - T_0)\,]$
            # of instances = $\lceil LV_m \rceil$

          Example

                    Task   3:00-3:15   3:15-3:45   3:45-4:00
                    T1       0.25        0.25        0.25
                    T2       0           0.5         0
                    All      0.25        0.75        0.25      → VM1
                    (load is 0 before 3:00 and after 4:00)
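
The load-vector arithmetic in this example can be reproduced with a short Python sketch. It rests on one reading of the example, which is an assumption: T1 needs 15 minutes within [3:00, 4:00] and T2 needs 15 minutes within [3:15, 3:45]. The function name, the tuple layout, and the requirement that time slots line up with task interval boundaries are also assumptions of this sketch.

```python
import math


def load_vector(tasks, slots):
    """Sum each task's load t / (T1 - T0) over the slots inside its interval.

    tasks: list of (running_time, T0, T1) in minutes from a common origin.
    slots: list of (slot_start, slot_end), aligned with task boundaries.
    """
    loads = []
    for s0, s1 in slots:
        load = 0.0
        for run, t0, t1 in tasks:
            if t0 <= s0 and s1 <= t1:   # slot lies inside the execution interval
                load += run / (t1 - t0)
        loads.append(load)
    return loads


# Minutes after 3:00. T1: 15 min in [0, 60]; T2: 15 min in [15, 45].
tasks = [(15, 0, 60), (15, 15, 45)]
slots = [(0, 15), (15, 45), (45, 60)]

lv = load_vector(tasks, slots)
instances = [math.ceil(x) for x in lv]   # instances needed in each slot
print(lv, instances)
```

With these numbers the combined load never exceeds 1, so a single instance covers the whole hour.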
Solution – Step 5


        Instance consolidation
          Idea – put tasks on the same instance even if some
           tasks do not run most cost-efficiently on that
           machine

          Example (3:00 PM to 4:00 PM)
            Before – T11 runs on a High-CPU instance and T12 on a
             Standard instance; each instance is idle for the rest of the hour
            After – T11 and T12 both run on the Standard instance,
             which is idle for the rest of the hour
Solution – Step 6


        Scheduling – Earliest Deadline First
          Dynamic scaling ensures that tasks at risk of missing
           their deadlines are detected in time

          $\sum_i \dfrac{t_i}{T_{end\_i} - T_{start\_i}} < 1$
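
Earliest Deadline First ordering and the load condition on this slide can be sketched in Python as follows. This is a minimal illustration, assuming the condition is applied to the tasks assigned to a single instance and that a task's deadline is the end of its execution interval; the `Task` type, the function names, and the task data are hypothetical.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    run: float     # running time t_i
    start: float   # T_start_i
    end: float     # T_end_i, used here as the task's deadline


def edf_order(tasks):
    """Earliest Deadline First: run the task with the nearest deadline first."""
    return sorted(tasks, key=lambda task: task.end)


def within_capacity(tasks):
    """Check that the sum of t_i / (T_end_i - T_start_i) stays below 1."""
    return sum(task.run / (task.end - task.start) for task in tasks) < 1


queue = [Task("T12", 20, 0, 60), Task("T11", 15, 0, 30)]
print([task.name for task in edf_order(queue)])
print(within_capacity(queue))
print(within_capacity(queue + [Task("T9", 10, 0, 40)]))
```

When the check fails, some task on that instance is at risk of missing its deadline, which is the signal for acquiring another instance.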
Solution – Overview


                            Parallelism reduction
Evaluation

        Workload patterns
            Stable, Growing, Cycle, OnOff

        Application models

        VM Type         Price
        Micro           $0.02/hour
        Standard        $0.085/hour
        High-CPU        $0.68/hour
        High-Memory     $0.50/hour

        Baselines       Time        Task execution        VM lag
        Greedy, GAIN    72 hours    Randomly generated    8 min
Evaluation




      SCS cost saving ranges from 6.8% to 40.4%
      The performance difference is larger with longer deadlines
Evaluation – High volume vs. low volume

        High workload (10X) vs. low workload (X)
          Pipeline, 1-hour deadline
          [Figure: bar chart "High Volume vs. Low Volume". Y-axis: Cost ($), 0 to 120. X-axis: workload pattern (Stable, Growing, Cycle, OnOff). Series: Greedy-High, GAIN-High, SCS-High, Greedy-Low, GAIN-Low, SCS-Low.]
Evaluation – Imprecise parameters

        Imprecise task execution estimation
          Pipeline application, 20% variance in estimated execution time,
           0.5-hour deadline
          SCS finishes more than 90% of jobs before their deadlines,
           much better than Greedy (40%) and GAIN (50%)
          [Figure: "Deadline (0.5 hour) Non-Miss Rate for Imprecise Task Execution Estimation". Y-axis: non-miss rate (%), 0 to 100. X-axis: workload pattern (Stable, Growing, Cycle, OnOff). Series: Greedy, GAIN, SCS.]

        Imprecise instance acquisition lag
          Pipeline application, 20% variance in the estimated VM acquisition
           time, 1-hour deadline
          SCS beats Greedy and GAIN
          The performance is more affected by the VM acquisition time
          [Figure: "Deadline (1 hour) Non-Miss Rate for Imprecise Instance Acquisition Lag". Y-axis: non-miss rate (%), 0 to 100. X-axis: workload pattern (Stable, Growing, Cycle, OnOff). Series: Greedy, GAIN, SCS.]
Related work


        Dynamic resource provisioning in virtualized
         environments
              Multi-tier web applications, queuing theory, control theory
        Workflow scheduling in Grid environments with
         deadline and budget constraints
            Single workflow instance
            Resource pool is limited
        Cloud economics
              Cloud provider side vs. cloud user side
        Current cloud auto-scaling mechanisms
              E.g. AWS auto-scaling, RightScale, enStratus, Scalr, AzureScale
               project, etc.
Conclusion and future work

        Conclusions
            SCS cost saving ranges from 6.8% to 40.4%
            SCS handles different workload volumes and imprecise
             parameters better than Greedy and GAIN
            Choosing proper VM types based on the workload saves cost
            Instance consolidation can help save partial instance hours
            VM acquisition time plays a very important role

        Future work
            Different scheduling approaches
            Real scientific applications
            Insufficient budget cases - maximize cloud user benefits/utilities
             under budget constraints
            Data-intensive applications
     Thank you!

 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Auto-Scaling to Minimize Cost in Cloud Workflows

  • 1. Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows. SC 11 (Nov 16, TCC 305). Ming Mao, Marty Humphrey, CS Department, University of Virginia
  • 2. Introduction
      • Resource provisioning questions are not trivial
          • Under-provisioning → hurts performance
          • Over-provisioning → pay more than necessary
      • How many resources?
      • What types of resources?
      • When to acquire or release?
      • How to use them?
      • A performance-resource mapping problem
  • 3. Auto-Scaling
      • Schedule-based and rule-based auto-scaling
          • E.g. “run 10 instances between 8AM and 6PM every day and 2 instances at all other times.”
          • E.g. “add (remove) 2 instances when the average CPU utilization is above 70% (below 20%) for 5 minutes.”
          • Simple and convenient; works well for simple applications
          • What if the relationship between performance and the resource utilization indicators is complex?
          • The resource utilization indicators are low-level and may not be expressive enough
          • They do not consider user budgets well
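The rule-based example on this slide can be sketched in a few lines of Python. This is only an illustration of the quoted rule, not any provider's API: the function name `rule_based_delta`, the one-sample-per-minute assumption, and the parameter names are all hypothetical.

```python
def rule_based_delta(cpu_samples, high=70.0, low=20.0, window=5, step=2):
    """Return the change in instance count suggested by a simple CPU rule.

    cpu_samples: average CPU utilization (%), one sample per minute (assumed).
    The rule fires only when a full `window` of samples is available.
    """
    if len(cpu_samples) < window:
        return 0  # not enough history to evaluate the rule
    avg = sum(cpu_samples[-window:]) / window
    if avg > high:
        return step   # add instances
    if avg < low:
        return -step  # remove instances
    return 0


print(rule_based_delta([80, 85, 90, 75, 95]))  # sustained high load
print(rule_based_delta([10, 15, 5, 10, 12]))   # sustained low load
```

The rule sees only a utilization average; it has no notion of deadlines, VM prices, or budget, which is the limitation the slide points out.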
  • 4. Auto-Scaling
      • Goals of auto-scaling mechanisms
          • Balance performance and cost
              • E.g. meet performance goals with minimum cost, or maximize utility within a limited budget
          • Reflect different options for computing resources
              • E.g. VMs have different processing power and price
          • Be aware of practical considerations
              • E.g. a VM may take several minutes to be ready to use
          • Be aware of the cloud billing model
              • E.g. billed by instance-hours
          • Support specific application performance requirements
              • E.g. deadlines, the number of concurrent users, communication latency
  • 5. Cloud application model
      • [Figure: example cloud application built from service units numbered (1) to (11), including Entry Point, Authentication, Data Base Validation, Loading Profile, Credit History, Health Record, Third Party Evaluation, several Model units, and Response. Non-Member, Silver Member, and Gold Member jobs flow through these units and run on auto-scaled cloud VMs.]
      • App consists of service units
      • Job consists of tasks
      • Jobs are categorized into classes (deadline and processing flow)
      • Cloud offers multiple VM types (price and processing power)
      • App has no knowledge of the workload in advance
      • VMs take time to start up (VM acquisition delay) and are billed by the hour
  • 6. Problem definition
      • Cloud application: $app = \{S_i\}$
      • Job class: $J = \{DAG(S_i),\ deadline \mid S_i \in app\}$
      • Cloud VM: $VM_v = \{[J_j^{S_i}]_v,\ c_v,\ lag_v\}$
      • Workload: $W_t = \sum_{S_i \in J} J_j^{S_i}$
      • Scaling plan: $Scaling_t = \{VM_v,\ N_v\}$
      • Scheduling plan: $Schedule_t = \{J_j^{S_i} \rightarrow VM_v\}$
      • Goal: $\min(C) = \min\left(\sum_v c_v N_v\right)$
  • 7. Solution
      • SCS (Scaling-Consolidation-Scheduling)
          • Task bundling
          • Deadline assignment
          • Scaling
          • Instance consolidation
          • Scheduling
  • 8. Solution – Step 1
      • Task bundling
          • Idea – force tasks to run on the same instance to improve performance and save data transfer cost
          • Example: [Figure: before, T6 runs on Server 1 and T8 on Server 2; after, the two tasks are bundled as T6' and both run on Server 1.]
  • 9. Solution – Step 2
      • Deadline assignment
          • Idea – to break task dependencies, assign deadlines proportionally based on task running time (on their cost-efficient machines)
          • Example: [Figure: workflow of tasks T1 to T13 starting at 3:00 PM with a 4:30 job deadline. Before: only the job start (3:00) and deadline (4:30) are known. After: tasks carry intermediate deadlines at 3:10, 3:20, 3:50, 4:00, 4:20, and 4:30.]
      • Task upgrading
          • $rank = \frac{makespan_{before} - makespan_{after}}{cost_{after} - cost_{before}}$
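A minimal sketch of proportional deadline assignment and of the task-upgrading rank, under a simplifying assumption: the workflow is a single sequential chain of tasks, whereas the slides handle a general DAG. The function names and the numbers in the example are hypothetical.

```python
def assign_deadlines(run_times, start, deadline):
    """Split the window [start, deadline] across a chain of tasks
    in proportion to each task's estimated running time."""
    total = sum(run_times)
    span = deadline - start
    deadlines, elapsed = [], 0
    for t in run_times:
        elapsed += t
        deadlines.append(start + span * elapsed / total)
    return deadlines


def upgrade_rank(makespan_before, makespan_after, cost_before, cost_after):
    """Makespan reduction gained per unit of extra cost when a task is
    moved to a faster (more expensive) VM type."""
    return (makespan_before - makespan_after) / (cost_after - cost_before)


# Three chained tasks estimated at 10, 20 and 30 minutes, with a
# 90-minute window measured in minutes from the job start.
print(assign_deadlines([10, 20, 30], start=0, deadline=90))  # [15.0, 45.0, 90.0]
```

Once every task has its own deadline, tasks can be planned independently of their predecessors. A higher rank means an upgrade buys more makespan reduction per dollar.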
  • 10. Solution – Step 3
      • Determine the number of instances
          • From deadline assignment, we have
              • Task running time: $t_m$
              • Task execution interval: $[T_0, T_1]$
          • Load vector: $LV_m = [t_m / (T_1 - T_0)]$
          • # of instances $= \lceil \sum_m LV_m \rceil$
          • Example: [Figure: load vectors of tasks T1 (load 0.25) and T2 (load 0.5) on VM1 over the 3:00, 3:15, 3:45, 4:00 timeline; the combined vector is All = (0, 0, 0.25, 0.75, 0.25, 0, 0).]
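The load-vector calculation can be sketched as follows. The example assumes fixed 15-minute slots, times expressed as minutes after 3:00 PM, and two hypothetical tasks; it is not meant to reproduce the figure on the slide.

```python
import math


def load_vector(run_time, start, end, slots):
    """Load a task places on one VM type in each slot:
    run_time / (end - start) inside its execution interval, 0 outside."""
    load = run_time / (end - start)
    return [load if start <= s0 and s1 <= end else 0.0 for s0, s1 in slots]


def instances_needed(tasks, slots):
    """Sum the per-task load vectors slot by slot and round the peak up."""
    vectors = [load_vector(*task, slots) for task in tasks]
    total = [sum(column) for column in zip(*vectors)]
    return total, math.ceil(max(total, default=0))


slots = [(0, 15), (15, 30), (30, 45), (45, 60)]
tasks = [
    (15, 0, 60),   # 15 min of work allowed anywhere in 3:00-4:00 -> load 0.25
    (15, 15, 45),  # 15 min of work allowed in 3:15-3:45 -> load 0.5
]
total, n = instances_needed(tasks, slots)
print(total, n)  # [0.25, 0.75, 0.75, 0.25] 1
```

Because the peak combined load stays below 1, a single instance of this VM type is enough for both tasks in this example.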
  • 11. Solution – Step 5
      • Instance consolidation
          • Idea – put tasks on the same instance even if some tasks may not run most cost-efficiently on that machine
          • Example: [Figure: before, T11 runs on a High-CPU instance and T12 on a Standard instance, each idle for part of the 3:00 PM to 4:00 PM hour; after, T11 and T12 share one Standard instance within the same hour.]
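A small sketch of why consolidation saves money under hourly billing. The prices are the Standard and High-CPU rates listed on the evaluation slide; the task durations, and the assumption that T11 still fits in the hour when it runs more slowly on a Standard instance, are hypothetical.

```python
import math

# $/hour, taken from the VM price table on the evaluation slide
PRICES = {"Standard": 0.085, "High-CPU": 0.68}


def billed_cost(busy_minutes, vm_type):
    """Cost of one instance when every started hour is billed in full."""
    hours = math.ceil(busy_minutes / 60)
    return hours * PRICES[vm_type]


# Before: T11 (20 min) on High-CPU and T12 (20 min) on Standard,
# each paying for a full hour that is mostly idle.
before = billed_cost(20, "High-CPU") + billed_cost(20, "Standard")

# After: T11 is assumed to take 35 min on Standard, so both tasks
# fit into a single billed hour of one Standard instance.
after = billed_cost(35 + 20, "Standard")

print(f"before=${before:.3f} after=${after:.3f}")
```

The saving comes from filling an instance-hour that is already paid for, even though T11 runs less efficiently on the cheaper machine.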
  • 12. Solution – Step 6
      • Scheduling – Earliest Deadline First
          • The dynamic scaling feature can make sure that the tasks facing missed deadlines can be found in time
          • $\sum_i \frac{t_i}{T_{end\_i} - T_{start\_i}} < 1$
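Earliest Deadline First on a single instance can be sketched with a priority queue. This is a generic EDF illustration rather than the authors' implementation; the task tuples and the `edf_schedule` name are hypothetical, and preemption and task arrivals over time are ignored.

```python
import heapq


def edf_schedule(tasks, now=0):
    """Run tasks back to back in order of earliest deadline.

    tasks: iterable of (name, run_time, deadline), in the same time unit.
    Returns the execution order and the names of tasks that finish late.
    """
    heap = [(deadline, name, run_time) for name, run_time, deadline in tasks]
    heapq.heapify(heap)
    order, missed, clock = [], [], now
    while heap:
        deadline, name, run_time = heapq.heappop(heap)
        clock += run_time
        order.append(name)
        if clock > deadline:
            missed.append(name)  # a signal that more instances are needed
    return order, missed


order, missed = edf_schedule([("T1", 20, 60), ("T2", 15, 30), ("T3", 30, 50)])
print(order, missed)  # ['T2', 'T3', 'T1'] ['T1']
```

In this example T1 finishes at minute 65 against a deadline of 60. Detecting that early is what gives a dynamic scaler time to acquire another instance.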
  • 13. Solution – Overview
      • Parallelism reduction
  • 14. Evaluation
      • Workload patterns
      • Application models
      • VM types and prices
          VM Type        Price
          Micro          $0.02/hour
          Standard       $0.085/hour
          High-CPU       $0.68/hour
          High-Memory    $0.50/hour
      • Baseline: Greedy, GAIN
      • Time: 72 hours
      • Task execution: randomly generated
      • VM lag: 8 min
  • 15. Evaluation
      • SCS cost saving ranges from 6.8% to 40.4%
      • The performance difference is larger with longer deadlines
  • 16. Evaluation – High volume vs. low volume
      • High workload (10X) vs. low workload (X)
      • Pipeline, 1-hour deadline
      • [Figure: "High Volume vs. Low Volume". Cost ($, 0 to 120) of Greedy, GAIN, and SCS at high and low volume under the Stable, Growing, Cycle, and OnOff workload patterns.]
  • 17. Evaluation – Imprecise parameters
      • Pipeline application, 20% variance in estimated execution time, 0.5-hour deadline
          • SCS finishes more than 90% of jobs before their deadlines, much better than Greedy (40%) and GAIN (50%)
          • [Figure: "Deadline (0.5 hour) Non-Miss Rate for Imprecise Task Execution Estimation". Non-miss rate (%) of Greedy, GAIN, and SCS under the Stable, Growing, Cycle, and OnOff workload patterns.]
      • Pipeline application, 20% variance in the estimated VM acquisition time, 1-hour deadline
          • SCS beats Greedy and GAIN
          • The performance is more affected by the VM acquisition time
          • [Figure: "Deadline (1 hour) Non-Miss Rate for Imprecise Instance Acquisition Lag". Non-miss rate (%) of Greedy, GAIN, and SCS under the Stable, Growing, Cycle, and OnOff workload patterns.]
  • 18. Related work
      • Dynamic resource provisioning in virtualized environments
          • Multi-tier web applications, queuing theory, control theory
      • Workflow scheduling in Grid environments with deadline and budget constraints
          • Single workflow instance
          • Resource pool is limited
      • Cloud economics
          • Cloud provider side vs. cloud user side
      • Current cloud auto-scaling mechanisms
          • E.g. AWS auto-scaling, RightScale, enStratus, Scalr, AzureScale project, etc.
  • 19. Conclusion and future work
      • Conclusions
          • SCS cost saving ranges from 6.8% to 40.4%
          • SCS can better handle different workload volumes and imprecise parameters
          • Choosing proper VM types based on the workload saves cost
          • Instance consolidation can help save partial instance hours
          • VM acquisition time plays a very important role
      • Future work
          • Different scheduling approaches
          • Real scientific applications
          • Insufficient budget cases: maximize cloud user benefits/utilities under budget constraints
          • Data-intensive applications
  • 20. Thank you!