SlideShare una empresa de Scribd logo
1 de 37
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Madhuri Peri, Sr. Specialist Solution Architect, AWS
June 2018
Accelerating Containerized Workloads
With
EC2 Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Agenda
Recap of EC2 Spot
• What is EC2 Spot? & Why use Spot?
• Best Practices
Provision EC2 Spot
Handling Interruptions
Container workloads with EC2 Spot
• ECS
• EKS
• Batch
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Recap of EC2 Spot Concepts
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What is EC2 Spot & Why Spot
Turbo Boost your ResultsSpare EC2 Capacity
that AWS can reclaim
with 2-minutes notice
Savings up to 90% off
the On Demand Price
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Understanding EC2 Capacity & Spot Pools
Instance Flexibility and Diversification Made
Easy with Spot Fleet
0
10
20
30
40
50
60
70
2007 2012 2017
Each instance family, each instance size, in each Availability Zone, in every
Region is a separate Spot pool
100s of Instance Options
R4.8xlarge
M4.4xlarge
C5.9xlarge
C4.xlar
ge
C4.4xlarge
R4.2xlarge
M4.xlar
ge
M4.xl
arge
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Best Practices for Running Apps on Spot Instances
Stateless Fault-Tolerant
Can you Restart ?
Flexible
Multi-AZ and Instance
Flexibility
Loosely Coupled
Distributed? Do you
Checkpoint?
Save more on distributed and flexible applications
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Spot Provisioning - EC2/Spot Fleet
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Spot Provisioning - RunInstances
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Spot Interruptions
EC2 Spot is interrupted when
• On-Demand requires capacity back
• EC2 Spot price exceeds specified Max Price in the request
Notified of interruptions in 2 ways
• CloudWatch events
• Instance metadata
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS concept (recap) Worker nodes on
EC2 Spot or On-
Demand
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot - Provisioning
• ECS is integrated via Console with Spot Fleet
• Spot Fleet launches instances
• Each EC2 Instance registers with ECS Cluster
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot - Scaling
• 2 dimensions to scaling
• EC2 Instance and Container
• EC2 Instance via Automatic Scaling
• 2 scaling policies
• Target tracking scaling
• Step scaling
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot – Scaling (cont…)
• Target tracking scaling
• Select a metric, set target value
• Spot Fleet creates & manages CW alarms that trigger scaling
policy
• Adds/removes capacity to keep metric at/close to target value
specified.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot - Scaling (cont…)
• Step scaling
• You specify the CloudWatch alarms or trigger the scaling
policies
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ECS with EC2 Spot – Termination/Interruption
• ECS Service availability is maintained
• Need interruption to be managed
• Stop sending new traffic to terminating instance
• Active connections are drained
• Place a new copy on another running EC2 instance
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Termination/Interruption (cont…)
• Spot Fleet will make new requests to fulfill target
capacity
• ECS
• Move EC2 instance to DRAINING state
• This prevents any new tasks to be scheduled on the host.
• Service scheduler will place tasks on other hosts in
cluster
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Availability Zone 1 Availability Zone 2
Internet
Amazon ECS
Agent
OD Container instance
Docker
Task
Container
Amazon ECS
Agent
EC2 Spot Container
instance
Docker
Task
Container
Amazon ECS
Agent
EC2 Spot Container
instance
Docker
Task Task
ContainerContainer
Application, Networking, or Classic Load
Balancers
Amazon ECS Agents
Amazon ECS
Agent communication service API
Cluster management engine
Key/Value store
User/Scheduler
Task
Container
Task
Container
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS with EC2 Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Availability
Zone 1
Etcd
Master
Etcd
Master
Availability
Zone 2
Availability
Zone 3
Etcd
Master
EKS – Master Nodes (Recap)
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
mycluster.eks.amazonaws.com
Availability
Zone 1
Availability
Zone 2
Availability
Zone 3
Kubectl
EKS – Logical layout of Master Endpoint and Worker Nodes
Worker nodes on
EC2 Spot or On-
Demand
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS with EC2 Spot - Provisioning
• Worker node user data needs to connect to the Master
node endpoint
• If you use Cloudformation EC2::Instance, then EC2 Spot
instance is requested via RunInstances API
• If you use LaunchConfig, then EC2 Spot instance is
requested via ASG.
• ASG makes the Spot Instance Request (sir-*) on your behalf
• K8’s optionally uses label to manage pod placement
• --node-labels lifecycle=Ec2Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS Spot provisioning (cont…)
• BUT Wait……..
• Remember EC2 Spot best practice?...
• Use Instance flexibility!!
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Spot provisioning (cont…)
• Put each of the instance type in a separate ASG with
EC2 Spot.
• Gives you multiple instance types – best practice.
• ASG makes additional requests to replace interrupted
instances
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS – Scaling
• Horizontal pod scaling
• Automatically scales number of pods in replication controller
• Cluster scaling (Scaling EC2 Instances)
• Adjusts the number of nodes in the cluster when
• Pods failed to run in cluster due to insufficient resources
• Nodes are underutilized, for an extended period of time
• And pods could be placed elsewhere on other nodes
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
mycluster.eks.amazonaws.com
Availability
Zone 1
Availability
Zone 2
Availability
Zone 3
Kubectl
m4.large Spot ASG –
Min: 1 Max: 10
t2.medium spot ASG –
Min: 1 Max: 10
On-Demand ASG -
Min: 1 Max: 10
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS – EC2 Spot scaling
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EKS with EC2 Spot – Termination/Interruptions
• Catch interruption on EC2 Instance
• Have a pod running in daemon mode on each Spot
Node
• Pod will intercept the interruption notice
• Set K8 Node to drain state
• Marks Node unschedulable
• Graceful termination
• Vacates pods via eviction API
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Batch with EC2 Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch with EC2 Spot
• With Batch you have 3 main aspects:
• Jobs/Job definitions
• Job queues
• Compute environments
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch with EC2 Spot
• Managed compute
• Unmanaged compute
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch with EC2 Spot - Managed Compute
• Managed Compute
• AWS Batch manages configuring & scaling of instances
• Under the hood, uses ECS cluster with EC2 instances
• You define the compute provisioning model
• On-Demand or EC2 Spot
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch Managed Compute with EC2 Spot
• Uses Spot Fleet
• You define the instance types or leave it as optimal
• If Optimal,
• Batch chooses best fit based on vCPU, mem requirements
• Match job with most-appropriate instance
• For example –
• 2 vCPU, and 16GB for each job, and 300 jobs
• Picks biggest instance type that can fit most # of jobs
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch Managed Compute with EC2 Spot
• If single instance type, then doesn’t align with Spot Fleet
best practices
• Override optimal setting*
• Assign multiple Managed CE to same job queue with order
• Up to 5 can be assigned
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch Un-Managed Compute with EC2 Spot
• You configure & manage scaling of the ECS Cluster
• ECS integrated with Spot Fleet directly
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Batch with EC2 Spot – Manage interruptions
• Managed compute – optimal setting
• If interruption was because OD needed capacity, no progress
• Managed compute – multiple CE
• Diversified instance types likely to have capacity, jobs
progress
• Unmanaged compute – Spot Fleet
• Diversified instance types likely to have capacity, jobs
progress
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thank you!

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

[NEW LAUNCH!] Lambda Layers (SRV375) - AWS re:Invent 2018
[NEW LAUNCH!] Lambda Layers (SRV375) - AWS re:Invent 2018[NEW LAUNCH!] Lambda Layers (SRV375) - AWS re:Invent 2018
[NEW LAUNCH!] Lambda Layers (SRV375) - AWS re:Invent 2018
 
Ionic and React Hybrid Web/Native Mobile Applications with Mobile Hub - AWS O...
Ionic and React Hybrid Web/Native Mobile Applications with Mobile Hub - AWS O...Ionic and React Hybrid Web/Native Mobile Applications with Mobile Hub - AWS O...
Ionic and React Hybrid Web/Native Mobile Applications with Mobile Hub - AWS O...
 
Learn how Maxwell Health Protects its MongoDB Workloads on AWS
 Learn how Maxwell Health Protects its MongoDB Workloads on AWS Learn how Maxwell Health Protects its MongoDB Workloads on AWS
Learn how Maxwell Health Protects its MongoDB Workloads on AWS
 
Thomson Reuters Shows How It Hosted a .NET App on Amazon ECS Using Windows Co...
Thomson Reuters Shows How It Hosted a .NET App on Amazon ECS Using Windows Co...Thomson Reuters Shows How It Hosted a .NET App on Amazon ECS Using Windows Co...
Thomson Reuters Shows How It Hosted a .NET App on Amazon ECS Using Windows Co...
 
Scaling Your Production Application with Amazon Lightsail - AWS Online Tech T...
Scaling Your Production Application with Amazon Lightsail - AWS Online Tech T...Scaling Your Production Application with Amazon Lightsail - AWS Online Tech T...
Scaling Your Production Application with Amazon Lightsail - AWS Online Tech T...
 
SID305 AWS Certificate Manager Private CA
SID305 AWS Certificate Manager Private CASID305 AWS Certificate Manager Private CA
SID305 AWS Certificate Manager Private CA
 
SRV314 Containerized App Development with AWS Fargate
SRV314 Containerized App Development with AWS FargateSRV314 Containerized App Development with AWS Fargate
SRV314 Containerized App Development with AWS Fargate
 
BDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWSBDA309 Build Your First Big Data Application on AWS
BDA309 Build Your First Big Data Application on AWS
 
Containerize Legacy .NET Framework Web Apps for Cloud Migration
Containerize Legacy .NET Framework Web Apps for Cloud Migration Containerize Legacy .NET Framework Web Apps for Cloud Migration
Containerize Legacy .NET Framework Web Apps for Cloud Migration
 
Amazon RDS_Deep Dive - SRV310
Amazon RDS_Deep Dive - SRV310 Amazon RDS_Deep Dive - SRV310
Amazon RDS_Deep Dive - SRV310
 
AWS Certificate Management and Private Certificate Authority Deep Dive (SEC41...
AWS Certificate Management and Private Certificate Authority Deep Dive (SEC41...AWS Certificate Management and Private Certificate Authority Deep Dive (SEC41...
AWS Certificate Management and Private Certificate Authority Deep Dive (SEC41...
 
Improving Backup & DR – AWS Storage Gateway - AWS Online Tech Talks
Improving Backup & DR – AWS Storage Gateway - AWS Online Tech TalksImproving Backup & DR – AWS Storage Gateway - AWS Online Tech Talks
Improving Backup & DR – AWS Storage Gateway - AWS Online Tech Talks
 
Using AWS Management Tools to Enable Governance, Compliance, Operational, and...
Using AWS Management Tools to Enable Governance, Compliance, Operational, and...Using AWS Management Tools to Enable Governance, Compliance, Operational, and...
Using AWS Management Tools to Enable Governance, Compliance, Operational, and...
 
Deep Dive on New Features in Amazon S3 & Glacier - AWS Online Tech Talks
Deep Dive on New Features in Amazon S3 & Glacier - AWS Online Tech TalksDeep Dive on New Features in Amazon S3 & Glacier - AWS Online Tech Talks
Deep Dive on New Features in Amazon S3 & Glacier - AWS Online Tech Talks
 
STG301_Deep Dive on Amazon S3 and Glacier Architecture
STG301_Deep Dive on Amazon S3 and Glacier ArchitectureSTG301_Deep Dive on Amazon S3 and Glacier Architecture
STG301_Deep Dive on Amazon S3 and Glacier Architecture
 
Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway - AWS Online Tech Talks
Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway - AWS Online Tech TalksDeep Dive: Hybrid Cloud Storage with AWS Storage Gateway - AWS Online Tech Talks
Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway - AWS Online Tech Talks
 
Building a DevOps Pipeline on AWS (DEV326) - AWS re:Invent 2018
Building a DevOps Pipeline on AWS (DEV326) - AWS re:Invent 2018Building a DevOps Pipeline on AWS (DEV326) - AWS re:Invent 2018
Building a DevOps Pipeline on AWS (DEV326) - AWS re:Invent 2018
 
Deep Dive on AWS PrivateLink - AWS Online Tech Talks
Deep Dive on AWS PrivateLink - AWS Online Tech TalksDeep Dive on AWS PrivateLink - AWS Online Tech Talks
Deep Dive on AWS PrivateLink - AWS Online Tech Talks
 
Control for Your Cloud Environment Using AWS Management Tools (ENT226-R1) - A...
Control for Your Cloud Environment Using AWS Management Tools (ENT226-R1) - A...Control for Your Cloud Environment Using AWS Management Tools (ENT226-R1) - A...
Control for Your Cloud Environment Using AWS Management Tools (ENT226-R1) - A...
 
ENT208 Transform your Business with VMware Cloud on AWS
ENT208 Transform your Business with VMware Cloud on AWSENT208 Transform your Business with VMware Cloud on AWS
ENT208 Transform your Business with VMware Cloud on AWS
 

Similar a Accelerating Containerized Workloads with Amazon EC2 Spot Instances - AWS Online Tech Talks

Similar a Accelerating Containerized Workloads with Amazon EC2 Spot Instances - AWS Online Tech Talks (20)

Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
 
Amazon EC2 Spot with Amazon EKS (CON406-R1) - AWS re:Invent 2018
Amazon EC2 Spot with Amazon EKS (CON406-R1) - AWS re:Invent 2018Amazon EC2 Spot with Amazon EKS (CON406-R1) - AWS re:Invent 2018
Amazon EC2 Spot with Amazon EKS (CON406-R1) - AWS re:Invent 2018
 
Amazon EC2 Spot Instances
Amazon EC2 Spot InstancesAmazon EC2 Spot Instances
Amazon EC2 Spot Instances
 
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
 
Running Enterprise Test/Dev on Amazon EC2 Spot Instances (CMP407-R1) - AWS re...
Running Enterprise Test/Dev on Amazon EC2 Spot Instances (CMP407-R1) - AWS re...Running Enterprise Test/Dev on Amazon EC2 Spot Instances (CMP407-R1) - AWS re...
Running Enterprise Test/Dev on Amazon EC2 Spot Instances (CMP407-R1) - AWS re...
 
Deep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep DiveDeep dive ECS & Fargate Deep Dive
Deep dive ECS & Fargate Deep Dive
 
Cost Optimize EC2 with Amazon EC2 Spot Instances
Cost Optimize EC2 with Amazon EC2 Spot InstancesCost Optimize EC2 with Amazon EC2 Spot Instances
Cost Optimize EC2 with Amazon EC2 Spot Instances
 
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
 
AWS 微服務中的 Container 選項比較 (Level 400)
AWS 微服務中的 Container 選項比較   (Level 400)AWS 微服務中的 Container 選項比較   (Level 400)
AWS 微服務中的 Container 選項比較 (Level 400)
 
Easy and Efficient Batch Computing on AWS
Easy and Efficient Batch Computing on AWSEasy and Efficient Batch Computing on AWS
Easy and Efficient Batch Computing on AWS
 
Compute@Scale
Compute@ScaleCompute@Scale
Compute@Scale
 
AWS Compute Evolved Week: Cost Optimize EC2 with Amazon EC2 Spot Instances
AWS Compute Evolved Week: Cost Optimize EC2 with Amazon EC2 Spot InstancesAWS Compute Evolved Week: Cost Optimize EC2 with Amazon EC2 Spot Instances
AWS Compute Evolved Week: Cost Optimize EC2 with Amazon EC2 Spot Instances
 
Executando Kubernetes com Amazon EKS - DEV303 - Sao Paulo Summit
Executando Kubernetes com Amazon EKS -  DEV303 - Sao Paulo SummitExecutando Kubernetes com Amazon EKS -  DEV303 - Sao Paulo Summit
Executando Kubernetes com Amazon EKS - DEV303 - Sao Paulo Summit
 
Module 2 - AWSome Day Online Conference 2018
Module 2 - AWSome Day Online Conference 2018Module 2 - AWSome Day Online Conference 2018
Module 2 - AWSome Day Online Conference 2018
 
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
 
Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018Comparing Compute Options for Microservices - AWS Summti Sydney 2018
Comparing Compute Options for Microservices - AWS Summti Sydney 2018
 
Getting started with AWS Foundational Services
Getting started with AWS Foundational ServicesGetting started with AWS Foundational Services
Getting started with AWS Foundational Services
 
Optimize Amazon EC2 for Fun and Profit
Optimize Amazon EC2 for Fun and Profit Optimize Amazon EC2 for Fun and Profit
Optimize Amazon EC2 for Fun and Profit
 
Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
 
AWSome Day Online 2020_โมดูล 2: เริ่มต้นใช้งานบน AWS Cloud
AWSome Day Online 2020_โมดูล 2: เริ่มต้นใช้งานบน AWS CloudAWSome Day Online 2020_โมดูล 2: เริ่มต้นใช้งานบน AWS Cloud
AWSome Day Online 2020_โมดูล 2: เริ่มต้นใช้งานบน AWS Cloud
 

Más de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Accelerating Containerized Workloads with Amazon EC2 Spot Instances - AWS Online Tech Talks

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Madhuri Peri, Sr. Specialist Solution Architect, AWS June 2018 Accelerating Containerized Workloads With EC2 Spot
  • 2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Recap of EC2 Spot • What is EC2 Spot? & Why use Spot? • Best Practices Provision EC2 Spot Handling Interruptions Container workloads with EC2 Spot • ECS • EKS • Batch
  • 3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Recap of EC2 Spot Concepts
  • 4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. What is EC2 Spot & Why Spot Turbo Boost your ResultsSpare EC2 Capacity that AWS can reclaim with 2-minutes notice Savings up to 90% off the On Demand Price
  • 5. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Understanding EC2 Capacity & Spot Pools Instance Flexibility and Diversification Made Easy with Spot Fleet 0 10 20 30 40 50 60 70 2007 2012 2017 Each instance family, each instance size, in each Availability Zone, in every Region is a separate Spot pool 100s of Instance Options R4.8xlarge M4.4xlarge C5.9xlarge C4.xlar ge C4.4xlarge R4.2xlarge M4.xlar ge M4.xl arge
  • 6. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Best Practices for Running Apps on Spot Instances Stateless Fault-Tolerant Can you Restart ? Flexible Multi-AZ and Instance Flexibility Loosely Coupled Distributed? Do you Checkpoint? Save more on distributed and flexible applications
  • 7. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Provisioning - EC2/Spot Fleet
  • 8. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Provisioning - RunInstances
  • 9. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot Interruptions EC2 Spot is interrupted when • On-Demand requires capacity back • EC2 Spot price exceeds specified Max Price in the request Notified of interruptions in 2 ways • CloudWatch events • Instance metadata
  • 10. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot
  • 11. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS concept (recap) Worker nodes on EC2 Spot or On- Demand
  • 12. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot - Provisioning • ECS is integrated via Console with Spot Fleet • Spot Fleet launches instances • Each EC2 Instance registers with ECS Cluster
  • 13. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot - Scaling • 2 dimensions to scaling • EC2 Instance and Container • EC2 Instance via Automatic Scaling • 2 scaling policies • Target tracking scaling • Step scaling
  • 14. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot – Scaling (cont…) • Target tracking scaling • Select a metric, set target value • Spot Fleet creates & manages CW alarms that trigger scaling policy • Adds/removes capacity to keep metric at/close to target value specified.
  • 15. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot - Scaling (cont…) • Step scaling • You specify the CloudWatch alarms or trigger the scaling policies
  • 16. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ECS with EC2 Spot – Termination/Interruption • ECS Service availability is maintained • Need interruption to be managed • Stop sending new traffic to terminating instance • Active connections are drained • Place a new copy on another running EC2 instance
  • 17. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Termination/Interruption (cont…) • Spot Fleet will make new requests to fulfill target capacity • ECS • Move EC2 instance to DRAINING state • This prevents any new tasks to be scheduled on the host. • Service scheduler will place tasks on other hosts in cluster
  • 18. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Availability Zone 1 Availability Zone 2 Internet Amazon ECS Agent OD Container instance Docker Task Container Amazon ECS Agent EC2 Spot Container instance Docker Task Container Amazon ECS Agent EC2 Spot Container instance Docker Task Task ContainerContainer Application, Networking, or Classic Load Balancers Amazon ECS Agents Amazon ECS Agent communication service API Cluster management engine Key/Value store User/Scheduler Task Container Task Container
  • 19. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS with EC2 Spot
  • 20. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Availability Zone 1 Etcd Master Etcd Master Availability Zone 2 Availability Zone 3 Etcd Master EKS – Master Nodes (Recap)
  • 21. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. mycluster.eks.amazonaws.com Availability Zone 1 Availability Zone 2 Availability Zone 3 Kubectl EKS – Logical layout of Master Endpoint and Worker Nodes Worker nodes on EC2 Spot or On- Demand
  • 22. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS with EC2 Spot - Provisioning • Worker node user data needs to connect to the Master node endpoint • If you use Cloudformation EC2::Instance, then EC2 Spot instance is requested via RunInstances API • If you use LaunchConfig, then EC2 Spot instance is requested via ASG. • ASG makes the Spot Instance Request (sir-*) on your behalf • K8’s optionally uses label to manage pod placement • --node-labels lifecycle=Ec2Spot
  • 23. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS Spot provisioning (cont…) • BUT Wait…….. • Remember EC2 Spot best practice?... • Use Instance flexibility!!
  • 24. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Spot provisioning (cont…) • Put each of the instance type in a separate ASG with EC2 Spot. • Gives you multiple instance types – best practice. • ASG makes additional requests to replace interrupted instances
  • 25. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS – Scaling • Horizontal pod scaling • Automatically scales number of pods in replication controller • Cluster scaling (Scaling EC2 Instances) • Adjusts the number of nodes in the cluster when • Pods failed to run in cluster due to insufficient resources • Nodes are underutilized, for an extended period of time • And pods could be placed elsewhere on other nodes
  • 26. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. mycluster.eks.amazonaws.com Availability Zone 1 Availability Zone 2 Availability Zone 3 Kubectl m4.large Spot ASG – Min: 1 Max: 10 t2.medium spot ASG – Min: 1 Max: 10 On-Demand ASG - Min: 1 Max: 10
  • 27. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS – EC2 Spot scaling
  • 28. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EKS with EC2 Spot – Termination/Interruptions • Catch interruption on EC2 Instance • Have a pod running in daemon mode on each Spot Node • Pod will intercept the interruption notice • Set K8 Node to drain state • Marks Node unschedulable • Graceful termination • Vacates pods via eviction API
  • 29. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS Batch with EC2 Spot
  • 30. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch with EC2 Spot • With Batch you have 3 main aspects: • Jobs/Job definitions • Job queues • Compute environments
  • 31. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch with EC2 Spot • Managed compute • Unmanaged compute
  • 32. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch with EC2 Spot - Managed Compute • Managed Compute • AWS Batch manages configuring & scaling of instances • Under the hood, uses ECS cluster with EC2 instances • You define the compute provisioning model • On-Demand or EC2 Spot
  • 33. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch Managed Compute with EC2 Spot • Uses Spot Fleet • You define the instance types or leave it as optimal • If Optimal, • Batch chooses best fit based on vCPU, mem requirements • Match job with most-appropriate instance • For example – • 2 vCPU, and 16GB for each job, and 300 jobs • Picks biggest instance type that can fit most # of jobs
  • 34. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch Managed Compute with EC2 Spot • If single instance type, then doesn’t align with Spot Fleet best practices • Override optimal setting* • Assign multiple Managed CE to same job queue with order • Up to 5 can be assigned
  • 35. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch Un-Managed Compute with EC2 Spot • You configure & manage scaling of the ECS Cluster • ECS integrated with Spot Fleet directly
  • 36. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Batch with EC2 Spot – Manage interruptions • Managed compute – optimal setting • If interruption was because OD needed capacity, no progress • Managed compute – multiple CE • Diversified instance types likely to have capacity, jobs progress • Unmanaged compute – Spot Fleet • Diversified instance types likely to have capacity, jobs progress
  • 37. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank you!