© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
EC2 Spot
Save Up to 90% on your Amazon EC2 Bill
with Spot Instances
Tipu Qureshi
Jafar Shameem
19th August 2015
Name your own price for EC2
Compute
• A market where price of compute
changes based upon Supply and
Demand
• When Bid Price exceeds Spot
Market Price, instance is launched
• Instance is terminated (with 2
minute warning) if market price
exceeds bid price
• Unused On-Demand Instances
What is Spot?
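The launch and termination rule above can be sketched as a small state function. This is an illustrative model only: `spot_instance_state` and its arguments are hypothetical names, not part of any AWS API, and treating a bid equal to the market price as fulfilled is an assumption.

```python
def spot_instance_state(bid_price, market_price, running):
    """Return the state of a Spot request under the rule on this slide.

    Hypothetical helper for illustration; prices are in USD per hour.
    """
    if market_price > bid_price:
        # Market price exceeds the bid: a running instance gets the
        # two-minute warning and is terminated; a pending request waits.
        return "terminating" if running else "waiting"
    # The bid covers the market price: the instance launches or keeps running.
    # (Treating bid == market price as fulfilled is an assumption.)
    return "running"


print(spot_instance_state(0.10, 0.05, running=False))
print(spot_instance_state(0.10, 0.20, running=True))
```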
• Spot prices are determined via supply and demand
• There are hundreds of uncorrelated Spot markets
• Prices can, but often don’t, fluctuate wildly
About Spot…
Type:
General-purpose: M1, M3, M4, T2
Compute-optimized: C1, CC2, C3, C4
Memory-optimized: M2, CR1, R3
Dense-storage: HS1, D2
I/O-optimized: HI1, I2
GPU: CG1, G2
Micro: T1, T2
Size: .micro, .medium, .large, .xlarge, .2xlarge, .4xlarge, .8xlarge
OS: Windows, Linux
AZ: -1a, -1b, -1c, …
Spot is not one market
Each instance family (r3) and size (4xlarge),
in each Availability Zone (us-east-1b), is a separate market
Uncorrelated pools of Spot Capacity
[Chart: Spot market price over time against a 50% bid and a 70% bid. You pay the market price.]
Bid Price and Market Price
cc2.8xlarge
32 cores, 60.5 GB memory
On-Demand Price: $2.00/hr
$0.00936/core/hr
On average, AWS adds enough new
server capacity every day to support
Amazon’s global infrastructure when
it was a $7B business.
EC2 Spot - best practices
Check the Price History
Describe Spot Price History API:
• Provides historical prices on a per-pool basis
• Goes back 90 days (3 months)
• Popular instance types tend to have Spot prices that are
somewhat more volatile
• Older generations (including c1.8xlarge, m1.small,
cr1.8xlarge, and cc2.8xlarge) tend to be much more
stable and have lower cost in general
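As a rough sketch of how price history can be compared per pool, the snippet below groups records by capacity pool and reports the minimum, maximum and mean price. It assumes records shaped like the entries returned by Describe Spot Price History (`InstanceType`, `ProductDescription`, `AvailabilityZone`, and `SpotPrice` as a string); the function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_price_history(records):
    """Group price records by capacity pool and report min, max and mean."""
    by_pool = defaultdict(list)
    for rec in records:
        # A pool is one instance type, product and Availability Zone.
        pool = (rec["InstanceType"], rec["ProductDescription"], rec["AvailabilityZone"])
        by_pool[pool].append(float(rec["SpotPrice"]))
    return {
        pool: {"min": min(prices), "max": max(prices), "mean": mean(prices)}
        for pool, prices in by_pool.items()
    }


# Made-up sample records, for illustration only.
sample = [
    {"InstanceType": "c3.xlarge", "ProductDescription": "Linux/UNIX",
     "AvailabilityZone": "us-east-1a", "SpotPrice": "0.03"},
    {"InstanceType": "c3.xlarge", "ProductDescription": "Linux/UNIX",
     "AvailabilityZone": "us-east-1a", "SpotPrice": "0.05"},
    {"InstanceType": "c3.xlarge", "ProductDescription": "Linux/UNIX",
     "AvailabilityZone": "us-east-1b", "SpotPrice": "0.25"},
]
summary = summarize_price_history(sample)
for pool, stats in sorted(summary.items()):
    print(pool, stats)
```

A wide gap between `min` and `max` is a simple hint that a pool has been volatile over the window.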
Capacity pools
A set of EC2 instances with the same properties:
• Availability zone
• Product/Operating system (Linux/Unix or Windows)
• EC2 instance type
Each EC2 capacity pool has its own:
• Availability – number of Spot instances
• Price – based on supply and demand
Use Multiple Capacity Pools
• Run applications across multiple capacity pools to
reduce your application’s sensitivity to price spikes that
affect a pool
• In general, there is very little correlation between prices
in different capacity pools.
• For example, if you run in five different pools your price
swings and interruptions can be cut by 80%.
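One way to read the 80% figure, offered here as an interpretation rather than the presenters' derivation: if a fleet is spread evenly over five pools and a spike hits one pool at a time, only one fifth of the capacity is exposed. The function names below are hypothetical.

```python
def capacity_hit_by_one_spike(num_pools):
    """Fraction of a fleet affected when a single pool spikes, assuming
    the fleet is spread evenly and the pools behave independently."""
    return 1.0 / num_pools


def reduction_vs_single_pool(num_pools):
    """Reduction in exposure compared with running in one pool only."""
    return 1.0 - capacity_hit_by_one_spike(num_pools)


for n in (1, 2, 5, 10):
    print(n, "pools:", format(reduction_vs_single_pool(n), ".0%"), "less exposure")
```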
Use Multiple Capacity Pools
Run across multiple Availability Zones in conjunction with:
• Auto Scaling
• Spot Fleet API
Run application across different sizes of instances within
the same family
• Amazon EMR takes this approach
Your application could figure out how many vCPUs it is
running on, and then launch enough worker threads to
keep all of them occupied.
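A minimal sketch of that idea, assuming a Python worker: ask the operating system how many vCPUs are visible and size the worker pool to match, so the same code runs on any instance size in a family. `run_on_all_vcpus` is a hypothetical helper name.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_on_all_vcpus(task, items):
    """Run task over items with one worker thread per visible vCPU."""
    workers = os.cpu_count() or 1  # cpu_count() can return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the order of the inputs in its results.
        return list(pool.map(task, items))


results = run_on_all_vcpus(lambda n: n * n, range(8))
print(results)
```

For CPU-bound work in Python, a process pool is usually a better fit than threads; the sizing logic stays the same.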
CPU and cores
• What kind of performance does your application require?
How many cores does your application need?
Memory/core
• How much memory per core does your application need?
Networking
• Does your application need high, moderate, low network
bandwidth?
Disk
• How much local disk does your application need?
Use Normalized pools of Compute
You only pay what the Market price is
But, bid what you are willing to pay
The price is set as you enter each instance-hour
and is charged at the end of the hour
If AWS interrupts your instance, you don’t pay for that partial hour
Bid only what you are willing to pay.
(by default, bid limited to 10 * On Demand Price)
What about Bidding Strategy?
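The billing rules on this slide can be sketched as follows. This is a simplified model under the rules as stated here (price fixed at the start of each hour, final hour free when AWS interrupts); `spot_charge` is a hypothetical helper, and the prices are made up.

```python
def spot_charge(hour_start_prices, interrupted_by_aws):
    """Total charge for one run.

    hour_start_prices: market price at the start of each instance-hour.
    interrupted_by_aws: True if AWS reclaimed the instance during the
    last hour, in which case that hour is not charged.
    """
    if interrupted_by_aws:
        billable = hour_start_prices[:-1]
    else:
        billable = hour_start_prices
    return round(sum(billable), 4)


prices = [0.10, 0.12, 0.30]
print("ran to completion:", spot_charge(prices, interrupted_by_aws=False))
print("interrupted in hour 3:", spot_charge(prices, interrupted_by_aws=True))
```

Note that the bid never appears in the calculation: it only decides whether the instance keeps running, not what you pay.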
AWS Spot Labs
• https://github.com/awslabs/aws-spot-labs
Helps to find capacity pools (defined as instance type and AZ) with
lower price volatility by ordering these pools based on duration of time
since the Spot price last exceeded the bid price. It uses AWS CLI to
programmatically obtain Spot price history data.
Finding the best pools of Compute Capacity
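The ordering idea described above can be sketched in a few lines. This is not the code from the aws-spot-labs repository; it is an independent illustration that assumes each pool's history is a list of `(timestamp, price)` change points, and all names and sample prices are hypothetical.

```python
from datetime import datetime, timedelta


def hours_since_price_exceeded_bid(history, bid, now):
    """Hours the price has stayed at or below the bid, up to now.

    history: non-empty list of (timestamp, price) change points.
    Returns 0.0 if the most recent price is still above the bid.
    """
    ordered = sorted(history)
    below_since = ordered[0][0]  # assume within bid at the start of the window
    above = False
    for ts, price in ordered:
        if price > bid:
            above = True
        elif above:
            # Price dropped back under the bid at this change point.
            below_since = ts
            above = False
    if above:
        return 0.0
    return (now - below_since).total_seconds() / 3600.0


def rank_pools(histories, bids, now):
    """histories: {(instance_type, az): history}; bids: {instance_type: bid}.
    Returns pools ordered from most to least stable."""
    scored = [
        (hours_since_price_exceeded_bid(h, bids[pool[0]], now), pool)
        for pool, h in histories.items()
    ]
    return [pool for _, pool in sorted(scored, reverse=True)]


now = datetime(2015, 8, 15, 12, 0)
histories = {
    ("c3.xlarge", "us-east-1a"): [
        (now - timedelta(hours=48), 0.03),
        (now - timedelta(hours=10), 0.50),
        (now - timedelta(hours=9), 0.04),
    ],
    ("c3.xlarge", "us-east-1b"): [(now - timedelta(hours=48), 0.03)],
    ("c3.xlarge", "us-east-1c"): [
        (now - timedelta(hours=48), 0.03),
        (now - timedelta(hours=1), 0.90),
    ],
}
ranking = rank_pools(histories, {"c3.xlarge": 0.105}, now)
print(ranking)
```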
python get_spot_duration.py \
  --region us-east-1 \
  --product-description 'Linux/UNIX' \
  --bids c3.xlarge:0.105,c3.2xlarge:0.21,c3.4xlarge:0.42,c3.8xlarge:0.84,c4.xlarge:0.110,c4.2xlarge:0.220,c4.4xlarge:0.440,c4.8xlarge:0.880,cc2.8xlarge:1.000,c1.xlarge:0.26 \
  --hours 168
Note:
• Price as of 8/15/2015
• AZ mappings may differ
• 168 hours = 1 week
• In this example, bidding
the on-demand price
Using the Spot Tools Lab
Build stateless, distributed, scalable applications
Choose which instance types fit your workload the best
Ingest price feed data for AZs and regions
Make run time decisions on which Spot pools to launch in based on
price and volatility
Manage interruptions
Monitor and manage market prices across AZs and instance types
Manage the capacity footprint in the fleet
And all of this while you don’t know where the capacity is
Serve your customers
Helping with the undifferentiated heavy lifting
Instead of writing all that code to manage Spot Instances,
simply specify:
Target Capacity - The number of EC2 instances that you want in
your fleet.
Maximum Bid Price - The maximum bid price that you are willing
to pay.
Launch Specifications - The number and types of instances, AMI ID, VPC,
subnets or AZs, etc.
IAM Fleet Role - The name of an IAM role. It must allow EC2 to
launch and terminate instances on your behalf.
Introducing Spot Fleet
EC2 Spot - Use Cases
Stateless Web/App Server Fleets
Hadoop Workloads
Continuous Integration (CI)
High Performance Computing (HPC)
Grid Computing
Media Rendering / Transcoding
Spot Use Cases
EC2 Spot - Web Architecture
Considerations
High availability
Cost
Elasticity
Stateless Web tier
Parallelism
Stateless Web/App/API Architecture with Spot
[Diagram: Elastic Load Balancing in front of an On-Demand Auto Scaling group of stateless web servers and two Spot Auto Scaling groups of stateless web servers (Spot), spread across Availability Zone A and Availability Zone B, with a separate store for session state data.]
Web Application - Auto Scaling
Multiple Auto Scaling groups
• On-demand instances for fallback.
• Multiple EC2 Spot instance Auto Scaling groups
• Each Spot Auto Scaling group using a different capacity pool
(e.g. AZ, bid, Instance size, Instance type)
Auto Scaling groups behind the same Elastic Load
Balancer.
Pick the right instance type for the job based on the price
history.
Auto Scaling Policies
Aggressive scaling policies for Spot Auto Scaling groups
(e.g. scale up at 75% CPU utilization and scale down at
25% CPU utilization, with a large capacity range)
More conservative scaling policies for On-Demand Auto
Scaling groups.
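The threshold policy above can be sketched as a pure function. It is a simplified stand-in for an Auto Scaling policy, not the service's actual evaluation logic; `scaling_decision`, the step sizes and the capacity ranges are assumptions chosen for illustration.

```python
def scaling_decision(cpu_percent, current, minimum, maximum,
                     scale_up_at=75.0, scale_down_at=25.0, step=1):
    """Return the new desired capacity for one Auto Scaling group."""
    if cpu_percent >= scale_up_at:
        return min(current + step, maximum)
    if cpu_percent <= scale_down_at:
        return max(current - step, minimum)
    return current


# Spot group: large steps and a wide capacity range (aggressive).
print(scaling_decision(80, current=4, minimum=0, maximum=40, step=4))
# On-Demand group: small steps and a narrow range (conservative).
print(scaling_decision(80, current=4, minimum=2, maximum=8, step=1))
```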
Session state for the web application can be stored in
DynamoDB.
• Data replicated across availability zones.
You can also choose other databases to maintain state in
your architecture.
• Amazon RDS using Multi-AZ deployments
• Amazon ElastiCache
Where to store the state?
Spot termination considerations
Availability of Spot instances can vary based on supply and
demand
Architect application to be resilient to instance termination
When the Spot price exceeds the price you named (i.e. the
bid price), the instance will receive a two-minute warning
that the instance will be terminated
Spot termination considerations
Check for the 2 minute spot instance termination
notification every 5 seconds leveraging a script invoked at
instance launch. Upon notification:
• Place any session information into DynamoDB
• Use IAM roles so that the spot instances can de-register
themselves from the ELB upon termination notification
Since the Auto Scaling groups span across multiple
availability zones, we highly recommend enabling cross-
zone load balancing for the load balancer.
To allow in-flight requests to complete when de-registering
Spot instances that are about to be terminated, connection
draining can be enabled on the load balancer with a
timeout of 90 seconds.
Elastic Load Balancing
Sample script
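#!/bin/bash
# Poll the instance metadata service every 5 seconds for a Spot termination notice.
while true
do
  if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q '.*T.*Z'; then
    # Termination notice received: look up this instance's ID and leave the load balancer.
    instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    aws elb deregister-instances-from-load-balancer --load-balancer-name my-load-balancer --instances "$instance_id"
    # Flush session state to the database before the instance is reclaimed.
    /env/bin/flushsessiontoDBonterminationscript.sh
    # Stop polling so the cleanup runs only once.
    break
  else
    # Spot instance not yet marked for termination.
    sleep 5
  fi
done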
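If the handler needs to know how much time is left, the body returned by the termination-time path can be parsed. The sketch below assumes the notice is a UTC timestamp of the form `2015-01-05T18:02:00Z` and that any other body (such as a 404 page) means no notice has been issued; `seconds_until_termination` is a hypothetical helper and makes no network call itself.

```python
from datetime import datetime, timezone


def seconds_until_termination(body, now):
    """Parse the termination-time response body.

    Returns the seconds remaining until termination, or None if the
    body is not a timestamp (no notice has been issued yet).
    """
    try:
        when = datetime.strptime(body.strip(), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    when = when.replace(tzinfo=timezone.utc)
    return (when - now).total_seconds()


now = datetime(2015, 8, 19, 12, 0, 0, tzinfo=timezone.utc)
print(seconds_until_termination("2015-08-19T12:02:00Z", now))
print(seconds_until_termination("<html>404 - Not Found</html>", now))
```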
Web Application Architecture with Spot
[Diagram: Elastic Load Balancing in front of an On-Demand Auto Scaling group of stateless web servers and two Spot Auto Scaling groups of stateless web servers (Spot), spread across Availability Zone A and Availability Zone B, with a separate store for session state data.]
Studyplus Case Study
Batch Processing with
Amazon EC2 Spot
Batch oriented applications can leverage on-demand
processing using EC2 Spot to save up to 90% cost:
• Claims processing
• Large scale transformation
• Media processing
• Multi-part data processing work
You can also leverage EMR with spot instances.
Batch Processing with Amazon EC2 Spot
• Multi-part job processing architecture
• Auto Scaling groups to set up a heterogeneous, scalable
“grid” of EC2 Spot instances with multiple capacity pools
as worker nodes
• Use S3 to invoke AWS Lambda upon object upload
• Use SQS for decoupling
• DynamoDB for tracking job status
• Complete large batch processing tasks in parallel
Batch Processing with Amazon EC2 Spot
About Lambda and SQS
AWS Lambda is a compute service that runs your code in
response to events and automatically manages the
compute resources for you, making it easy to build
applications that respond quickly to new information.
Amazon Simple Queue Service (SQS) is a fast, reliable,
scalable, fully managed message queuing service to
decouple components.
Depending on the application’s needs, multiple SQS queues
might be required for functions and priorities.
Batch Processing with Amazon EC2 Spot
[Diagram: objects are uploaded into an input S3 bucket. Uploads to S3 trigger a Lambda function that puts jobs in SQS and DynamoDB. An EC2 instance worker fleet (an On-Demand Auto Scaling group plus Spot Auto Scaling groups 1 and 2, across Availability Zone A and Availability Zone B) checks for jobs in the job SQS queue, updates job status (start time, SLA end time, etc.) in DynamoDB, uses EFS for file storage, and writes results to an output S3 bucket. The Auto Scaling groups scale up based on queue depth and scale down based on CPU utilization CloudWatch metrics.]
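Scaling on queue depth, as described in the diagram, can be approximated by the arithmetic below. It is an illustrative sketch rather than an Auto Scaling feature: `desired_workers` and the jobs-per-worker figure are assumptions you would tune for your own workload.

```python
import math


def desired_workers(queue_depth, jobs_per_worker, minimum, maximum):
    """Workers needed to drain the queue, clamped to the group's range."""
    needed = math.ceil(queue_depth / jobs_per_worker)
    return max(minimum, min(needed, maximum))


for depth in (0, 95, 10000):
    print(depth, "queued jobs ->", desired_workers(depth, 10, 1, 50), "workers")
```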
IAM Role for Lambda Policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1438283855455",
      "Action": [
        "dynamodb:PutItem"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:dynamodb:us-east-1::table/demojobtable"
    },
    {
      "Sid": "Stmt1438283929844",
      "Action": [
        "sqs:SendMessage"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:sqs:us-east-1::demojobqueue"
    }
  ]
}
AWS Lambda function for SQS and DynamoDB updates
// dependencies
var AWS = require('aws-sdk');

// get references to the service clients
var s3 = new AWS.S3();
var sqs = new AWS.SQS();
var dynamodb = new AWS.DynamoDB();

console.log('Loading function');

exports.handler = function(event, context) {
    // Read the bucket name and object key from the S3 event.
    var srcBucket = event.Records[0].s3.bucket.name;
    // Object key may have spaces or unicode non-ASCII characters.
    // S3 encodes spaces as "+", so restore them before URL-decoding.
    var srcKey = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, " "));

    // prepare SQS message
    // The account ID path segment is missing from this URL; a real queue URL includes it.
    var params = {
        MessageBody: 'object ' + srcKey + ' ',
        QueueUrl: 'https://sqs.us-east-1.amazonaws.com//demojobqueue',
        DelaySeconds: 0
    };

    // send SQS message
    sqs.sendMessage(params, function(err, data) {
        if (err) {
            // an error occurred
            console.error('Unable to put object ' + srcKey + ' into SQS queue due to an error: ' + err);
            context.fail('Unable to send message to SQS for ' + srcKey);
        } else {
            // define DynamoDB table variables
            var tableName = "demojobtable";
            var datetime = new Date().getTime().toString();

            // Put item into DynamoDB table where srcKey is the hash key and datetime is the range key
            dynamodb.putItem({
                "TableName": tableName,
                "Item": {
                    "srcKey": {"S": srcKey},
                    "datetime": {"S": datetime}
                }
            }, function(err, data) {
                if (err) {
                    console.error('Unable to put object ' + srcKey + ' into DynamoDB table due to an error: ' + err);
                    context.fail('Unable to put data to DynamoDB table for ' + srcKey);
                } else {
                    console.log('Successfully put object ' + srcKey + ' into SQS queue and DynamoDB');
                    context.succeed('Data put into SQS and DynamoDB for ' + srcKey);
                }
            });
        }
    });
};
Batch Processing with Amazon EC2 Spot
• Worker nodes get job parts from the SQS queue and perform
single tasks based on the job task state in DynamoDB
• Store the input objects in a file system such as Amazon
Elastic File System (Amazon EFS), local instance store
or Amazon Elastic Block Store (EBS)
• Each job can be further split into multiple sub-parts if
there is a mechanism to stitch the outputs together
• Once completed, the objects will be uploaded back to S3
using multi-part upload.
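The worker behaviour described above can be sketched with in-memory stand-ins: a standard-library queue plays the role of SQS and a dictionary plays the role of the DynamoDB job table. The names `run_worker`, `status_table` and the state labels are hypothetical, and no AWS call is made.

```python
import queue


def run_worker(job_queue, status_table, process):
    """Drain job_queue, recording each job's state in status_table."""
    while True:
        try:
            key = job_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to do
        status_table[key] = "IN_PROGRESS"
        try:
            process(key)
            status_table[key] = "DONE"
        except Exception:
            # Leave a marker so another worker can retry the job.
            status_table[key] = "FAILED"


jobs = queue.Queue()
for name in ["video-1.mp4", "video-2.mp4"]:
    jobs.put(name)

status = {}
run_worker(jobs, status, process=lambda key: None)
print(status)
```

With a real SQS queue, a worker would typically delete a message only after the job succeeds, so a job held by an interrupted Spot instance becomes visible again once its visibility timeout expires.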
More automation?
Use a Lambda function to dynamically manage Auto
Scaling groups based on the Spot market
• The Lambda function could periodically invoke the EC2 Spot
APIs to assess market prices and availability and respond by
creating new Auto Scaling launch configurations and groups
automatically.
• This function could also delete any Spot Auto Scaling groups
and launch configurations that have no instances.
AWS Data Pipeline can be used to invoke the Lambda
function using the AWS CLI at regular intervals by
scheduling pipelines
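The decision such a scheduled function would make can be sketched without any AWS calls. The snippet assumes the current price per pool and the instance count per existing Spot Auto Scaling group have already been fetched; `plan_group_changes` and the sample prices are hypothetical.

```python
def plan_group_changes(current_prices, max_price, existing_groups):
    """Decide which Spot Auto Scaling groups to create and delete.

    current_prices: {(instance_type, az): price}
    existing_groups: {(instance_type, az): instance_count}
    Returns (pools_to_create, pools_to_delete).
    """
    affordable = {
        pool for pool, price in current_prices.items() if price <= max_price
    }
    # Create groups for affordable pools that have no group yet.
    to_create = sorted(affordable - set(existing_groups))
    # Delete groups that currently hold no instances.
    to_delete = sorted(
        pool for pool, count in existing_groups.items() if count == 0
    )
    return to_create, to_delete


prices = {
    ("c3.xlarge", "us-east-1a"): 0.04,
    ("c3.xlarge", "us-east-1b"): 0.30,
    ("c4.xlarge", "us-east-1a"): 0.05,
}
groups = {("c3.xlarge", "us-east-1a"): 3, ("c3.xlarge", "us-east-1b"): 0}
create, delete = plan_group_changes(prices, 0.105, groups)
print("create:", create)
print("delete:", delete)
```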
Automated Batch Architecture with Spot
[Diagram: objects are uploaded into an input S3 bucket. Uploads to S3 trigger a Lambda function that puts jobs in DynamoDB and SQS. Workers in an On-Demand Auto Scaling group and Spot workers in Spot Auto Scaling groups 1 and 2, across Availability Zone A and Availability Zone B, check for jobs in the job SQS queue, update job status (start time, SLA end time, etc.) in DynamoDB, use EFS for file storage, and write results to an output S3 bucket. The Auto Scaling groups scale up based on queue depth and scale down based on CPU utilization CloudWatch metrics. Data Pipeline invokes a Lambda function on a schedule to manage the Auto Scaling groups based on the Spot market.]
Further cost optimization with Trusted Advisor
Save money on AWS by eliminating unused and idle resources
Cost Optimization TA Checks:
• Amazon EC2 Reserved Instances Optimization
• Low Utilization Amazon EC2 Instances
• Idle Load Balancers
• Underutilized Amazon EBS Volumes
• Unassociated Elastic IP Addresses
• Amazon RDS Idle DB Instances
AWS re:Invent 2015 – October 6-9
AWS re:Invent is the largest annual gathering of the global cloud community. Whether you are an existing customer or new
to the cloud, AWS re:Invent will provide you with the knowledge and skills to refine your cloud strategy, improve developer
productivity, increase application performance and security, and reduce infrastructure costs.
Though AWS re:Invent tickets are sold out, you can still register to view the Live Stream Broadcasts of the keynote
addresses and select technical sessions on October 7 and October 8. Register now.
Details:
Wednesday, October 7
9:00am - 10:30am PT: Andrew Jassy, Sr. Vice President, AWS
11:00am - 5:15pm PT: 5 of the most popular breakout sessions (to be announced)
Thursday, October 8
9:00am - 10:30am PT: Dr. Werner Vogels, CTO, Amazon
11:00am - 6:15pm PT: 6 of the most popular breakout sessions (to be announced)
Register now for the Live Stream Broadcast by submitting your email where prompted on the AWS re:Invent home page.
Stay Connected: Follow event activities on Twitter @awsreinvent (#reinvent), or like us on Facebook.
Thank you!
Questions?
What have customers done
with Spot?
Some case studies..
World’s Largest F500 Cloud Run
Transforming drive design to store the world’s data
• New drive head design workloads
• Submit jobs, orchestrate HPC clusters over VPC
• Encrypt, route data to AWS, return results
• Cluster: 70,908 cores with Spot Instances, EBS
• Run 1 million drive head designs = 70.75 core-years
• 90x throughput: ran in 8 hours, not 30 days
• 3 days from idea to running
• 70,908 cores, 729 TFLOPS
• c3, r3 with Intel E5-2670 v2
• Cost: $5,594
AWS Delivered Unheard-of Processing
39 years of science
10,600 AWS Instances
Saved equivalent of $40M infrastructure
10 Million compounds screened
39 drug design years in 11 hours for a cost of… $4,232
3 promising compounds identified
Scaling Hadoop Jobs with Spot
http://engineering.bloomreach.com/strategies-for-reducing-your-amazon-emr-costs/
Bloomreach launches 1,500 to 2,000 Amazon EMR clusters and runs
6,000 Hadoop jobs every day.
Continuous Integration & Testing with Spot
• Tapjoy - Premier Mobile Ad Network Across iOS & Android
• Global Network (435 Million Monthly Reach)
• Jenkins + Spot Instances
• https://github.com/bwall/ec2-plugin (thanks to an RIT senior project)
• Go wide during business hours, scale back in the evenings.
Automatically comes online at 06:00 ET
• Workers scale horizontally to support dozens of simultaneous regression
tests spread out over dozens of workers
• Jenkins automatically guards against spot termination
Ooyala
• Video technology platform
that serves ESPN,
Bloomberg, ...
• Uses combo of OD/RI/Spot to
ensure it can cover predicted
volumes while keeping costs
low
• http://aws.amazon.com/solutions/case-studies/ooyala/
Vevo
• Library of over 75,000 HD
videos
• Must be able to rapidly
transcode library to a new
screen format
• Can spin up 100s of Spot
instances to transcode entire
library in a matter of days
(instead of weeks)
Queue-based media transcoding
Using Spot Fleet
An example..
Using Spot Fleet
Create EC2 Spot Fleet IAM Role
Requesting a fleet:
• aws ec2 request-spot-fleet --spot-fleet-request-config
file://mySmallFleet.json
Describe fleet:
• aws ec2 describe-spot-fleet-requests
• aws ec2 describe-spot-fleet-requests --spot-fleet-request-ids <sfr-………..>
Describe instances within the fleet
• aws ec2 describe-spot-fleet-instances --spot-fleet-request-id <sfr-…………>
Cancel Spot Fleet (with termination):
• aws ec2 cancel-spot-fleet-requests --spot-fleet-request-ids <sfr-…………..> --terminate-instances
mySpotFleet.json
{
  "TargetCapacity": 5,
  "SpotPrice": "1.00",
  "IamFleetRole": "arn:aws:iam::962872214910:role/fleetRole",
  "LaunchSpecifications": [
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.small"
    },
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.medium"
    },
    {
      "ImageId": "ami-ff527ecf",
      "InstanceType": "m1.large"
    }
  ]
}
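When a fleet spans many instance types, a request config like the one above can be generated rather than written by hand. The sketch below only builds and serializes the JSON; `build_fleet_config` is a hypothetical helper, and the role ARN uses a placeholder account ID rather than a real one.

```python
import json


def build_fleet_config(target_capacity, spot_price, iam_fleet_role,
                       image_id, instance_types):
    """Build a Spot Fleet request config with one launch specification
    per instance type, all using the same AMI."""
    return {
        "TargetCapacity": target_capacity,
        "SpotPrice": spot_price,
        "IamFleetRole": iam_fleet_role,
        "LaunchSpecifications": [
            {"ImageId": image_id, "InstanceType": itype}
            for itype in instance_types
        ],
    }


config = build_fleet_config(
    target_capacity=5,
    spot_price="1.00",
    # Placeholder account ID, for illustration only.
    iam_fleet_role="arn:aws:iam::123456789012:role/fleetRole",
    image_id="ami-ff527ecf",
    instance_types=["m1.small", "m1.medium", "m1.large"],
)
text = json.dumps(config, indent=2)
print(text)
```

The printed JSON can be saved to a file and passed to `aws ec2 request-spot-fleet` through `--spot-fleet-request-config file://...`.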

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

AWS August Webinar Series - EC2 Spot Instances - 08192015

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. EC2 Spot Save Up to 90% on your Amazon EC2 Bill with Spot Instances Tipu Qureshi Jafar Shameem 19th August 2015
  • 2. Name your own price for EC2 Compute • A market where price of compute changes based upon Supply and Demand • When Bid Price exceeds Spot Market Price, instance is launched • Instance is terminated (with 2 minute warning) if market price exceeds bid price • Unused On-Demand Instances What is Spot?
  • 3. • Spot prices are determined via supply and demand • There are hundreds of uncorrelated Spot markets • Prices can, but often don’t fluctuate wildly About Spot…
  • 4. General-purpose: M1, M3 , T2 Compute-optimized: C1, CC2, C3, C4 Memory-optimized: M2, CR1, R3, M4 Dense-storage: HS1, D2 I/O-optimized: HI1, I2 GPU: CG1, G2 Micro: T1, T2 .micro .medium .large .xlarge .2xlarge .4xlarge .8xlarge Windows Linux -1a -1b -1c …. Type Size OS AZ Spot is not one market
  • 5. Each instance family (r3) and size (4xlarge), in each Availability Zone (US-East-1b) Uncorrelated pools of Spot Capacity
  • 6. 50% Bid 70% Bid You pay the market price Bid Price and Market Price
  • 7. cc2.8xlarge 32 cores, 60.5 GB memory On-Demand Price: $2.00/hr $0.00936/core/hr
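The per-core figure on this slide can be checked with a little arithmetic. The sketch below assumes the $0.00936/core/hr figure is a Spot price (the slide does not label it), and derives the implied hourly Spot price and the saving against On-Demand; the function and variable names are illustrative only.

```python
# Figures taken from the slide. Treating the per-core figure as a Spot price
# is an assumption; the slide does not label it.
ON_DEMAND_PER_HOUR = 2.00      # cc2.8xlarge On-Demand price, $/hr
CORES = 32
SPOT_PER_CORE_HOUR = 0.00936   # $/core/hr as quoted on the slide


def per_core(price_per_hour: float, cores: int) -> float:
    """Convert an instance-hour price into a core-hour price."""
    return price_per_hour / cores


on_demand_per_core = per_core(ON_DEMAND_PER_HOUR, CORES)
implied_spot_per_hour = SPOT_PER_CORE_HOUR * CORES
saving = 1 - SPOT_PER_CORE_HOUR / on_demand_per_core

print(f"On-Demand: ${on_demand_per_core:.4f}/core/hr")
print(f"Implied Spot instance price: ${implied_spot_per_hour:.3f}/hr")
print(f"Saving vs On-Demand: {saving:.0%}")
```

Under that assumption the instance would be running at roughly $0.30/hr, about 85% below the On-Demand price.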
  • 8. On average, AWS adds enough new server capacity every day to support Amazon’s global infrastructure when it was a $7B business.
  • 9.
  • 10. EC2 Spot - best practices
  • 11. Check the Price History Describe Spot Price History API: • Provides historical prices on a per-pool basis • Goes back 90 days (3 months) • Popular instance types tend to have Spot prices that are somewhat more volatile • Older generations (including c1.8xlarge, m1.small, cr1.8xlarge, and cc2.8xlarge) tend to be much more stable and have lower cost in general
  • 12. Capacity pools Set of EC2 instances with the same properties: • Availability zone • Product/Operating system (Linux/Unix or Windows) • EC2 instance type Each EC2 capacity pool has its own: • Availability – number of Spot instances • Price – based on supply and demand
  • 13. Use Multiple Capacity Pools • Run applications across multiple capacity pools to reduce your application’s sensitivity to price spikes that affect a pool • In general, there is very little correlation between prices in different capacity pools. • For example, if you run in five different pools your price swings and interruptions can be cut by 80%.
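One way to read the 80% figure: if capacity is split evenly across uncorrelated pools and only one pool spikes at a time, a spike touches 1/n of the fleet instead of all of it. The sketch below works through that reading; the even split and the single-spike assumption are mine, not stated on the slide.

```python
def fleet_share_affected(pools: int, pools_spiking: int = 1) -> float:
    """Share of the fleet hit when `pools_spiking` of `pools` equally sized pools spike."""
    if pools < 1 or not 0 <= pools_spiking <= pools:
        raise ValueError("need pools >= 1 and 0 <= pools_spiking <= pools")
    return pools_spiking / pools


def reduction_vs_single_pool(pools: int) -> float:
    """Reduction in exposure compared with running everything in one pool."""
    return 1 - fleet_share_affected(pools)


for n in (1, 2, 5, 10):
    print(n, f"{reduction_vs_single_pool(n):.0%}")
```

With five pools the exposure to any single spike drops from 100% of the fleet to 20%, which matches the 80% reduction quoted above.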
  • 14. Use Multiple Capacity Pools Run across multiple availability zones in conjunction • Auto Scaling • Spot Fleet API Run application across different sizes of instances within the same family • Amazon EMR takes this approach Your application could figure out how many vCPUs it is running on, and then launch enough worker threads to keep all of them occupied.
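The last point, sizing the worker pool to whatever instance the application lands on, might look like the following minimal sketch. `run_on_all_vcpus` and its arguments are hypothetical names; only `os.cpu_count()` and `ThreadPoolExecutor` come from the Python standard library.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count() -> int:
    """Number of vCPUs visible to this process (falls back to 1 if unknown)."""
    return os.cpu_count() or 1


def run_on_all_vcpus(task, items):
    """Run `task` over `items` with one worker thread per vCPU, preserving order."""
    with ThreadPoolExecutor(max_workers=worker_count()) as pool:
        return list(pool.map(task, items))


if __name__ == "__main__":
    print(worker_count(), run_on_all_vcpus(lambda x: x * x, range(5)))
```

Because the worker count is discovered at start-up, the same code runs unchanged on a .large or an .8xlarge. For CPU-bound pure-Python work, a process pool would likely be the better fit than threads.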
  • 15. CPU and cores • What kind of performance does your application require? How many cores does your application need? Memory/core • How much memory per core does your application need? Networking • Does your application need high, moderate, low network bandwidth? Disk • How much local disk does your application need? Use Normalized pools of Compute
  • 16. You pay only the market price, but bid what you are willing to pay. You are charged the price in effect as you enter each hour, and you pay for it at the end of that hour. If you get interrupted, you don’t pay for that hour. (By default, bids are limited to 10 × the On-Demand price.) What about Bidding Strategy?
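The billing rules on this slide can be expressed as a small function. This is a sketch of the rules exactly as the slide states them (price fixed at the start of each hour, an interrupted hour is free), not a model of any AWS billing API; the function name and inputs are hypothetical.

```python
def spot_charge(hour_start_prices, interrupted_by_aws: bool) -> float:
    """Total charge for a run, given the Spot price at the start of each hour.

    If the instance was interrupted, the final (partial) hour is not charged.
    """
    prices = list(hour_start_prices)
    billable = prices[:-1] if interrupted_by_aws else prices
    return sum(billable)


# Three hours entered at $0.10, $0.12 and $0.11.
print(round(spot_charge([0.10, 0.12, 0.11], interrupted_by_aws=False), 2))
print(round(spot_charge([0.10, 0.12, 0.11], interrupted_by_aws=True), 2))
```

Note that the bid never appears in the calculation: it only determines whether the instance keeps running.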
  • 17. AWS Spot Labs • https://github.com/awslabs/aws-spot-labs Helps to find capacity pools (defined as instance type and AZ) with lower price volatility by ordering these pools based on duration of time since the Spot price last exceeded the bid price. It uses AWS CLI to programmatically obtain Spot price history data. Finding the best pools of Compute Capacity
  • 18. python get_spot_duration.py --region us-east-1 --product-description 'Linux/UNIX' --bids c3.xlarge:0.105,c3.2xlarge:0.21,c3.4xlarge:0.42,c3.8xlarge:0.84,c4.xlarge:0.110,c4.2xlarge:0.220,c4.4xlarge:0.440,c4.8xlarge:0.880,cc2.8xlarge:1.000,c1.xlarge:0.26 --hours 168 Note: • Price as of 8/15/2015 • AZ mappings may differ • 168 hours = 1 week • In this example, bidding the on-demand price Using the Spot Tools Lab
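The ordering idea described above, ranking pools by how long it has been since the Spot price last exceeded the bid, can be sketched as follows. This is an independent illustration, not the code of `get_spot_duration.py`; the data layout (a list of `(timestamp, price)` pairs per pool) and all names are assumptions.

```python
from datetime import datetime, timedelta


def hours_since_last_exceeded(history, bid, now):
    """Hours since the price last exceeded `bid`, or None if it never did in `history`."""
    exceeded_at = [ts for ts, price in history if price > bid]
    if not exceeded_at:
        return None
    return (now - max(exceeded_at)).total_seconds() / 3600


def rank_pools(pools, bids, now):
    """Order pools from least to most recently over the bid.

    pools: {(instance_type, az): [(timestamp, price), ...]}
    bids:  {instance_type: bid_price}
    Pools whose price never exceeded the bid in the window come first.
    """
    ranked = [
        (pool, hours_since_last_exceeded(history, bids[pool[0]], now))
        for pool, history in pools.items()
    ]
    ranked.sort(key=lambda item: (item[1] is not None, -(item[1] or 0)))
    return ranked


now = datetime(2015, 8, 15, 12, 0)
pools = {
    ("c3.xlarge", "us-east-1a"): [(now - timedelta(hours=100), 0.05),
                                  (now - timedelta(hours=10), 0.20)],
    ("c3.xlarge", "us-east-1b"): [(now - timedelta(hours=50), 0.30),
                                  (now - timedelta(hours=5), 0.04)],
    ("c4.xlarge", "us-east-1a"): [(now - timedelta(hours=20), 0.03)],
}
bids = {"c3.xlarge": 0.105, "c4.xlarge": 0.110}
ranking = rank_pools(pools, bids, now)
for pool, hours in ranking:
    print(pool, hours)
```

The sample prices are made up for illustration; in practice the history would come from the Spot price history data for each pool.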
  • 19. Build stateless, distributed, scalable applications Choose which instance types fit your workload the best Ingest price feed data for AZs and regions Make run time decisions on which Spot pools to launch in based on price and volatility Manage interruptions Monitor and manage market prices across AZs and instance types Manage the capacity footprint in the fleet And all of this while you don’t know where the capacity is Serve your customers Helping with the undifferentiated heavy lifting UNDIFFERENTIATED HEAVY LIFTING
  • 20. Instead of writing all that code to manage Spot Instances, simply specify: Target Capacity - The number of EC2 instances that you want in your fleet. Maximum Bid Price - The maximum bid price that you are willing to pay. Launch Specifications - # of and types of instances, AMI id, VPC, subnets or AZs, etc. IAM Fleet Role - The name of an IAM role. It must allow EC2 to launch and terminate instances on your behalf. Introducing Spot Fleet
  • 21. EC2 Spot - Use Cases
  • 22. Stateless Web/App Server Fleets Hadoop Workloads Continuous Integration (CI) High Performance Computing (HPC) Grid Computing Media Rendering / Transcoding Spot Use Cases
  • 23. EC2 Spot - Web Architecture
  • 25. Stateless Web/App/API Architecture with Spot Elastic Load Balancing Stateless Web Servers Stateless Web Servers On Demand Auto Scaling group Session State Data Stateless Web Servers (Spot) Stateless Web Servers (Spot) Spot Auto Scaling group Availability Zone A Availability Zone B Stateless Web Servers (Spot) Stateless Web Servers (Spot) Spot Auto Scaling group
  • 26. Web Application - Auto Scaling Multiple Auto Scaling groups • On-Demand instances for fallback. • Multiple EC2 Spot instance Auto Scaling groups • Each Spot Auto Scaling group using a different capacity pool (e.g. AZ, bid, instance size, instance type) Auto Scaling groups behind the same Elastic Load Balancer. Pick the right instance type for the job based on the price history.
  • 27. Auto Scaling Policies Aggressive scaling policies for Spot Auto Scaling groups (e.g. scale up at 75% CPU utilization and scale down at 25% CPU utilization, with a large capacity range). More conservative scaling policies for On-Demand Auto Scaling groups.
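The threshold logic behind these policies can be sketched as a plain decision function. The 75%/25% defaults come from the slide; the On-Demand thresholds in the usage lines are hypothetical values chosen only to illustrate "more conservative", and none of this calls the Auto Scaling API.

```python
def scaling_action(cpu_percent: float,
                   scale_up_at: float = 75.0,
                   scale_down_at: float = 25.0) -> str:
    """Return 'scale_up', 'scale_down' or 'hold' for a CPU utilization reading."""
    if cpu_percent >= scale_up_at:
        return "scale_up"
    if cpu_percent <= scale_down_at:
        return "scale_down"
    return "hold"


# Spot group: the slide's aggressive thresholds.
print(scaling_action(80))
# On-Demand group: hypothetical, more conservative thresholds.
print(scaling_action(80, scale_up_at=90, scale_down_at=10))
```

With the same 80% reading, the Spot group grows while the On-Demand group holds, so extra load is absorbed by the cheaper capacity first.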
  • 28. Session state for the web application can be stored in DynamoDB. • Data replicated across availability zones. You can also choose other databases to maintain state in your architecture. • Amazon RDS using Multi-AZ deployments • Amazon ElastiCache Where to store the state?
  • 29. Spot termination considerations Availability of Spot instances can vary based on supply and demand Architect application to be resilient to instance termination When the Spot price exceeds the price you named (i.e. the bid price), the instance will receive a two-minute warning that the instance will be terminated
  • 30. Spot termination considerations Check for the 2 minute spot instance termination notification every 5 seconds leveraging a script invoked at instance launch. Upon notification: • Place any session information into DynamoDB • Use IAM roles so that the spot instances can de-register themselves from the ELB upon termination notification
  • 31. Since the Auto Scaling groups span multiple availability zones, we highly recommend enabling cross-zone load balancing for the load balancer. To allow in-flight requests to complete when de-registering Spot instances that are about to be terminated, connection draining can be enabled on the load balancer with a timeout of 90 seconds. Elastic Load Balancing
  • 32. Sample script
      #!/bin/bash
      # Poll the instance metadata service every 5 seconds for a Spot termination notice.
      while true
      do
        if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q '.*T.*Z'; then
          instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
          aws elb deregister-instances-from-load-balancer --load-balancer-name my-load-balancer --instances "$instance_id"
          /env/bin/flushsessiontoDBonterminationscript.sh
          break  # notice handled; stop polling
        else
          # Spot instance not yet marked for termination.
          sleep 5
        fi
      done
  • 33. Web Application Architecture with Spot Elastic Load Balancing Stateless Web Servers Stateless Web Servers On Demand Auto Scaling group Session State Data Stateless Web Servers (Spot) Stateless Web Servers (Spot) Spot Auto Scaling group Availability Zone A Availability Zone B Stateless Web Servers (Spot) Stateless Web Servers (Spot) Spot Auto Scaling group
  • 36. Batch oriented applications can leverage on-demand processing using EC2 Spot to save up to 90% cost: • Claims processing • Large scale transformation • Media processing • Multi-part data processing work You can also leverage EMR with spot instances. Batch Processing with Amazon EC2 Spot
  • 37. • Multi-part job processing architecture • Auto Scaling groups to setup a heterogeneous, scalable “grid” of EC2 spot instances with multiple capacity pools as worker nodes • Use S3 to invoke AWS Lambda upon object upload • Use SQS for decoupling • DynamoDB for tracking job status • Complete large batch processing tasks in parallel Batch Processing with Amazon EC2 Spot
  • 38. About Lambda and SQS AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information. Amazon Simple Queue Service (SQS) is a fast, reliable, scalable, fully managed message queuing service to decouple components. Depending on the application’s needs, multiple SQS queues might be required for functions and priorities.
  • 39. Batch Processing with Amazon EC2 Spot On Demand Auto Scaling group Output S3 bucket Spot Auto Scaling group 2 Availability Zone A Availability Zone B Spot Auto Scaling group 1 Upload object into input S3 bucket Job SQS Queue Auto Scaling groups will scale up based on queue depth and scale down based on CPU utilization CW metrics Workers will check for jobs in the queue Workers will update Job status (start time, SLA end time, etc) in DynamoDB Uploads to S3 will trigger a Lambda function to put jobs in SQS and DynamoDB EFS EC2 instance worker fleet
  • 40. IAM Role for Lambda Policy
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "Stmt1438283855455",
            "Action": [ "dynamodb:PutItem" ],
            "Effect": "Allow",
            "Resource": "arn:aws:dynamodb:us-east-1::table/demojobtable"
          },
          {
            "Sid": "Stmt1438283929844",
            "Action": [ "sqs:SendMessage" ],
            "Effect": "Allow",
            "Resource": "arn:aws:sqs:us-east-1::demojobqueue"
          }
        ]
      }
  • 41. AWS Lambda function for SQS and DynamoDB updates
      // dependencies
      var AWS = require('aws-sdk');
      // get reference to clients
      var s3 = new AWS.S3();
      var sqs = new AWS.SQS();
      var dynamodb = new AWS.DynamoDB();
      console.log('Loading function');
      exports.handler = function(event, context) {
        // Read options from the event.
        var srcBucket = event.Records[0].s3.bucket.name;
        // Object key may have spaces or unicode non-ASCII characters.
        var srcKey = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, " "));
  • 42. AWS Lambda function for SQS and DynamoDB updates
        // prepare SQS message
        var params = {
          MessageBody: 'object ' + srcKey + ' ',
          QueueUrl: 'https://sqs.us-east-1.amazonaws.com//demojobqueue',
          DelaySeconds: 0
        };
        // send SQS message
        sqs.sendMessage(params, function (err, data) {
          if (err) { // an error occurred
            console.error('Unable to put object' + srcKey + ' into SQS queue due to an error: ' + err);
            context.fail(srcKey, 'Unable to send message to SQS');
          }
          else {
            // define DynamoDB table variables
            var tableName = "demojobtable";
            var datetime = new Date().getTime().toString();
  • 43. AWS Lambda function for SQS and DynamoDB updates
            // Put item into DynamoDB table where srcKey is the hash key and datetime is the range key
            dynamodb.putItem({
              "TableName": tableName,
              "Item": {
                "srcKey": {"S": srcKey },
                "datetime": {"S": datetime }
              }
            }, function(err, data) {
              if (err) {
                console.error('Unable to put object' + srcKey + ' into DynamoDB table due to an error: ' + err);
                context.fail(srcKey, 'Unable to put data to DynamoDB Table');
              }
              else {
                console.log('Successfully put object' + srcKey + ' into SQS queue and DynamoDB');
                context.succeed(srcKey, 'Data put into SQS and DynamoDB');
              }
            });
          }
        });
      };
  • 44. Batch Processing with Amazon EC2 Spot • Worker nodes get job parts from the SQS queue and perform single tasks based on the job task state in DynamoDB • Store the input objects in a file system such as Amazon Elastic File System (Amazon EFS), local instance store or Amazon Elastic Block Store (EBS) • Each job can be further split into multiple sub-parts if there is a mechanism to stitch the outputs together • Once completed, the objects will be uploaded back to S3 using multi-part upload.
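The worker behaviour described above can be sketched with in-memory stand-ins, so the control flow is visible without any AWS calls. Here a `deque` stands in for the SQS queue and a `dict` for the DynamoDB job table; `run_worker`, the status strings and the `process` callback are all hypothetical.

```python
from collections import deque


def run_worker(queue: deque, job_table: dict, process) -> dict:
    """Drain `queue`, recording each job's state in `job_table`.

    queue:     object keys waiting to be processed (stand-in for SQS)
    job_table: key -> status (stand-in for the DynamoDB job table)
    process:   callable that turns an input key into an output
    """
    outputs = {}
    while queue:
        key = queue.popleft()
        job_table[key] = "IN_PROGRESS"
        outputs[key] = process(key)
        job_table[key] = "DONE"
    return outputs


jobs = deque(["a.mp4", "b.mp4"])
table = {}
results = run_worker(jobs, table, str.upper)
print(results, table)
```

Because job state lives outside the worker, an interrupted Spot instance loses only the job it was working on; a real implementation would rely on the queue to make that job visible to another worker again.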
  • 45. Batch Processing with Amazon EC2 Spot On Demand Auto Scaling group Output S3 bucket Spot Auto Scaling group 2 Availability Zone A Availability Zone B Spot Auto Scaling group 1 Upload object into input S3 bucket Job SQS Queue Auto Scaling groups will scale up based on queue depth and scale down based on CPU utilization CW metrics Workers will check for jobs in the queue Workers will update Job status (start time, SLA end time, etc) in DynamoDB Uploads to S3 will trigger a Lambda function to put jobs in SQS and DynamoDB EFS EC2 instance worker fleet
  • 46. More automation? Use a Lambda function to dynamically manage Auto Scaling groups based on the Spot market • The Lambda function could periodically invoke the EC2 Spot APIs to assess market prices and availability and respond by creating new Auto Scaling launch configurations and groups automatically. • This function could also delete any Spot Auto Scaling groups and launch configurations that have no instances. AWS Data Pipeline can be used to invoke the Lambda function using the AWS CLI at regular intervals by scheduling pipelines
  • 47. Automated Batch Architecture with Spot Worker Worker On Demand Autoscaling group Output S3 bucket Worker (spot) Worker (spot) Spot Autoscaling group 2 Availability Zone A Availability Zone B Worker (spot) Worker (spot) Spot Autoscaling group 1 Upload object into input S3 bucket Job SQS Queue AutoScaling groups will scale up based on queue depth and scale down based on CPU utilization CW metrics Workers will check for jobs in the queue Workers will update Job status (start time, SLA end time, etc) in DynamoDB DataPipeline can invoke a Lambda function in a scheduled manner which can manage AutoScaling groups based on the spot market Uploads to S3 will trigger a Lambda function to put jobs in DynamoDB and SQS EFS
  • 48. Further cost optimization with Trusted Advisor Save money on AWS by eliminating unused and idle resources Cost Optimization TA Checks: • Amazon EC2 Reserved Instances Optimization • Low Utilization Amazon EC2 Instances • Idle Load Balancers • Underutilized Amazon EBS Volumes • Unassociated Elastic IP Addresses • Amazon RDS Idle DB Instances
  • 49. AWS re:Invent 2015 – October 6-9 AWS re:Invent is the largest annual gathering of the global cloud community. Whether you are an existing customer or new to the cloud, AWS re:Invent will provide you with the knowledge and skills to refine your cloud strategy, improve developer productivity, increase application performance and security, and reduce infrastructure costs. Though AWS re:Invent tickets are sold out, you can still register to view the Live Stream Broadcasts of the keynote addresses and select technical sessions on October 7 and October 8. Register now. Details: Wednesday, October 7 9:00am - 10:30am PT: Andrew Jassy, Sr. Vice President, AWS 11:00am - 5:15pm PT: 5 of the most popular breakout sessions (to be announced) Thursday, October 8 9:00am - 10:30am PT: Dr. Werner Vogels, CTO, Amazon 11:00am - 6:15pm PT: 6 of the most popular breakout sessions (to be announced) Register now for the Live Stream Broadcast by submitting your email where prompted on the AWS re:Invent home page. Stay Connected: Follow event activities on Twitter @awsreinvent (#reinvent), or like us on Facebook.
  • 51. What have customers done with Spot? Some case studies..
  • 52. EBS Submit jobs, orchestrate HPC clusters over VPC Run 1 Million drive head designs = 70.75 core-years 90x throughput: Ran in 8 hours, not 30 days 3 days from idea to running 70,908 cores, 729 TFLOPS c3, r3 with Intel E5-2670 v2 Cost: $5,594 Spot Instances New Drive Head Design Workloads World’s Largest F500 Cloud Run Transforming drive design to store the world’s data Encrypt, route data to AWS, return results Cluster 70,908 Cores with Spot Instances
  • 53. AWS Delivered Unheard-of Processing 39 years of science 10,600 AWS Instances Saved equivalent of $40M infrastructure 10 Million compounds screened 39 drug design years in 11 hours for a cost of… $4,232 3 promising compounds identified
  • 54. Scaling Hadoop Jobs with Spot http://engineering.bloomreach.com/strategies-for-reducing-your-amazon-emr-costs/ Bloomreach launches 1,500 to 2,000 Amazon EMR clusters and runs 6,000 Hadoop jobs every day.
  • 55. Continuous Integration & Testing with Spot • Tapjoy - Premier Mobile Ad Network Across iOS & Android • Global Network (435 Million Monthly Reach) • Jenkins + Spot Instances • https://github.com/bwall/ec2-plugin (thanks to an RIT senior project) • Go wide during business hours, scale back in the evenings. Automatically kicks online at 06:00ET • Workers scale horizontally to support dozens of simultaneous regression tests spread out over dozens of workers • Jenkins automatically guards against spot termination
  • 56. Ooyala • Video technology platform that serves ESPN, Bloomberg, ... • Uses combo of OD/RI/Spot to ensure it can cover predicted volumes while keeping costs low • http://aws.amazon.com/solutions/case-studies/ooyala/ Vevo • Library of over 75,000 HD videos • Must be able to rapidly transcode library to a new screen format • Can spin up 100s of Spot instances to transcode entire library in a matter of days (instead of weeks) Queue-based media transcoding
  • 57. Using Spot Fleet An example..
  • 58. Using Spot Fleet Create EC2 Spot Fleet IAM Role Requesting a fleet: • aws ec2 request-spot-fleet --spot-fleet-request-config file://mySmallFleet.json Describe fleet: • aws ec2 describe-spot-fleet-requests • aws ec2 describe-spot-fleet-requests --spot-fleet-request-ids <sfr-………..> Describe instances within the fleet: • aws ec2 describe-spot-fleet-instances --spot-fleet-request-id <sfr-…………> Cancel Spot Fleet (with termination): • aws ec2 cancel-spot-fleet-requests --spot-fleet-request-ids <sfr-…………..> --terminate-instances
  • 59. mySpotFleet.json
      {
        "TargetCapacity": 5,
        "SpotPrice": "1.00",
        "IamFleetRole": "arn:aws:iam::962872214910:role/fleetRole",
        "LaunchSpecifications": [
          { "ImageId": "ami-ff527ecf", "InstanceType": "m1.small" },
          { "ImageId": "ami-ff527ecf", "InstanceType": "m1.medium" },
          { "ImageId": "ami-ff527ecf", "InstanceType": "m1.large" }
        ]
      }