AWS Webcast - An Introduction to High Performance Computing on AWS

KD Singh
AWS Solutions Architect

Agenda
• High performance and high throughput computing on AWS
• Integrating on-premises HPC environments with AWS
• HPC ecosystem – partners and tools
• Demo

HPC and HTC on AWS
Concepts, Patterns & Practices

Take a typical big computation task…

…that an average cluster is too small for (or simply takes too long to complete)…

…optimization of algorithms can give some
leverage…

…and complete the task in hand…

AWS instance clusters can be balanced to
the job in hand…

…with multiple clusters running at the
same time

“…HPC clusters are too small when you need them most, and too large the rest of the time”
Jason Stowe, Cycle Computing

Why AWS for HPC?
Low cost with flexible pricing
Efficient clusters
Unlimited infrastructure
Faster time to results
Concurrent Clusters on-demand
Increased collaboration

Benefits of Agility
[Figure: Rigid On-Premises Resources (predicted demand vs. actual demand, with waste and customer dissatisfaction) compared with Elastic Cloud-Based Resources (resources scaled to actual demand)]

Cost Benefits of HPC in the Cloud
Pay As You Go Model
• Use only what you need
• Multiple pricing models
On-Premises Capital Expense Model
• High upfront capital cost
• High cost of ongoing support

Many Pricing Models to Support Different Workloads
• Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.
• Free Tier: Get started on AWS with free usage & no commitment. For POCs and getting started.
• On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.
• Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.
• Dedicated: Launch instances within Amazon VPC that run on hardware dedicated to a single customer. For highly sensitive or compliance related workloads.
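The "For ..." line under each pricing model amounts to a lookup from workload type to model. A minimal sketch of that mapping follows; the workload labels and the helper name `suggest_pricing_model` are hypothetical, chosen only for illustration, and are not part of any AWS API.

```python
# Illustrative only: the keys paraphrase the "For ..." lines on the slide.
PRICING_MODEL_FOR = {
    "committed utilization": "Reserved",
    "proof of concept": "Free Tier",
    "spiky": "On-Demand",
    "time-insensitive": "Spot",
    "compliance": "Dedicated",
}


def suggest_pricing_model(workload: str) -> str:
    """Return the slide's pricing model for a workload label (hypothetical helper)."""
    key = workload.strip().lower()
    if key not in PRICING_MODEL_FOR:
        raise ValueError(f"unknown workload label: {workload!r}")
    return PRICING_MODEL_FOR[key]


print(suggest_pricing_model("Spiky"))
```

In practice a single HPC job may mix models, for example a small On-Demand master with Spot workers, so a real selector would likely work per role rather than per job.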

Customers running HPC Workloads on AWS

On-Demand Supercomputer!
484.14 TFLOPS
76th fastest supercomputer in the world (June 2014 Top500 list)
26,496-core cluster of C3 instances

• 8 Regions; 156,314 cores; 16,788 instances
• 1.21 petaFLOPS RPeak
• 264 Compute years in 18 hours
• Supercomputing environment worth $68M cost $33K
“Supercomputing simulation employs 156,000 Amazon processor cores
To simulate 205,000 molecules as quickly as possible for a USC simulation, Cycle Computing fired up a mammoth amount of Amazon servers around the globe.” 1
1 c|net news, http://news.cnet.com/8301-1001_3-57611919-92/supercomputing-simulation-employs-156000-amazon-processor-cores/
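The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that "compute years" means core-years and that a year has 365.25 days; neither is stated on the slide, so treat the outputs as rough estimates.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average calendar year

# Figures quoted on the slide
compute_years = 264
wall_clock_hours = 18
peak_cores = 156_314
cloud_cost_usd = 33_000
equivalent_hw_cost_usd = 68_000_000

# Assumption: one "compute year" is one core busy for one year
core_hours = compute_years * HOURS_PER_YEAR
avg_busy_cores = core_hours / wall_clock_hours
utilization_of_peak = avg_busy_cores / peak_cores
cost_ratio = equivalent_hw_cost_usd / cloud_cost_usd

print(f"{core_hours:,.0f} core-hours of work")
print(f"{avg_busy_cores:,.0f} cores busy on average")
print(f"{utilization_of_peak:.0%} of the peak core count")
print(f"hardware cost is about {cost_ratio:,.0f}x the cloud bill")
```

Under these assumptions the run averaged roughly 128,000 busy cores, a bit over 80% of the quoted peak, which is consistent with a cluster that ramps up and down over the 18 hours.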

Characterizing HPC
Loosely Coupled: Embarrassingly parallel; Elastic; Batch workloads
Tightly Coupled: Interconnected jobs; Network sensitivity; Job specific algorithms
Supporting Services: Data management; Task distribution; Workflow management

Compute
Elastic Compute Cloud (EC2)
Basic unit of compute capacity
Range of CPU, memory & local disk options
35+ instance types available, from micro to cluster compute
Vertical Scaling: c3.large, c3.2xlarge, c3.8xlarge

Feature: Details
Flexible: Run Windows or Linux distributions
Scalable: Wide range of instance types from micro to cluster compute
Machine Images: Configurations can be saved as machine images (AMIs) from which new instances can be created
Full control: Full root or administrator rights
Secure: Full firewall control via Security Groups
Monitoring: Publishes metrics to CloudWatch
Inexpensive: On-Demand, Reserved and Spot instance types
VM Import/Export: Import and export VM images to transfer configurations in and out of EC2

Automation & Control
ec2-run-instances ami-xxxxxxxx \
  --instance-count 3 \
  --availability-zone eu-west-1a \
  --instance-type m3.medium
http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/
CLI, API and Console
Scripted configurations

Auto Scaling
as-create-auto-scaling-group MyGroup \
  --launch-configuration MyConfig \
  --availability-zones eu-west-1a \
  --min-size 2 \
  --max-size 200
Automatic re-sizing of compute clusters
based upon demand

Monitoring & Alerting
CloudWatch alerts based upon CPU load, memory, I/O & user-defined triggers
Trigger scaling policy
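How an alarm-triggered scaling policy resizes a cluster can be illustrated with a small simulation. This is a sketch of the general idea, not the Auto Scaling service's actual algorithm: the CPU thresholds and step size are hypothetical, while the 2 and 200 bounds mirror the `--min-size` and `--max-size` values in the Auto Scaling example.

```python
def desired_capacity(current: int, cpu_percent: float,
                     min_size: int = 2, max_size: int = 200,
                     scale_out_above: float = 70.0,
                     scale_in_below: float = 20.0,
                     step: int = 10) -> int:
    """Return the new group size after one alarm evaluation.

    Thresholds and step are illustrative assumptions.
    """
    if cpu_percent > scale_out_above:
        current += step          # high load: add instances
    elif cpu_percent < scale_in_below:
        current -= step          # low load: remove instances
    # Never leave the configured bounds of the group
    return max(min_size, min(max_size, current))


size = 2
for cpu in [90, 95, 85, 50, 10, 5]:
    size = desired_capacity(size, cpu)
    print(f"cpu={cpu:>3}%  instances={size}")
```

The clamp on the last line of the function is what keeps a runaway alarm from growing the cluster past its budgeted maximum or shrinking it below the minimum.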

Elastic Capacity
Time: +00h, <10 cores
Time: +24h, >1500 cores
Time: +72h, <10 cores
Time: +120h, >600 cores

Computational Chemistry project for
Cancer treatment
Estimated computation time: 39 years
Estimated project cost: $40 million
87,000 Core AWS Cluster
Spot Instances
Completed in 9 hours
Total Cost $4,232

AWS Big Data Portfolio
When data sets and data analytics need to scale to the point that you have to start innovating around how to collect, store, organize, analyze and share it
COLLECT | STORE | ANALYZE | SHARE
Services: Import/Export, Direct Connect, Kinesis, S3, Glacier, DynamoDB, EC2, Redshift, EMR, Data Pipeline

Analyzed more than 3 billion data points in 2.8 seconds instead of weeks or months
SEC used Tradeworx and the AWS Cloud to create an analytics platform at 10% of the cost of a traditional environment in less than 4 months
AWS gives Tradeworx the ability to collect and analyze billions of data points over years, allowing the SEC to reconstruct any market event, down to the individual record

What if you need to:
Implement MPI?
Code for GPUs?

Tightly coupled
Enhanced Networking EC2 Instances
Single Root I/O Virtualization (SR-IOV)
Higher packets per second, lower latencies, low network jitter
Implement HVM process execution
10 Gigabit Ethernet
R3 instances: Intel Xeon E5-2670 v2 2.5 GHz; 32 vCPUs; 640 GB SSD Local Disk; 244 GB RAM
C3 instances: Intel Xeon E5-2680 v2 2.8 GHz; 32 vCPUs; 640 GB SSD Local Disk; 60 GB RAM
I2 instances: Intel Xeon E5-2670 v2 2.5 GHz; 32 vCPUs; 1.6 TB SSD Local Disk; 244 GB RAM

Tightly coupled
Network Placement Groups
Cluster instances can be launched within a Placement Group. All instances launched in a Placement Group have low latency, full bisection, 10 Gbps bandwidth between instances.

Compute-intensive clinical trial simulations that previously took 60 hours are finished in only 1.2 hours on the AWS Cloud
http://aws.amazon.com/solutions/case-studies/bristol-myers-squibb/
BMS used AWS to build a secure, self-provisioning portal for hosting research so scientists can run clinical trial simulations on-demand while BMS is able to establish rules that keep compute costs low.
Running simulations 98% faster has led to more efficient and less costly clinical trials and better conditions for patients.
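The 98% figure follows from the two run times quoted on this slide. As a quick check, reading "98% faster" as the reduction in run time:

```latex
\[
\frac{60\,\mathrm{h} - 1.2\,\mathrm{h}}{60\,\mathrm{h}} = \frac{58.8}{60} = 0.98 = 98\%,
\qquad
\frac{60\,\mathrm{h}}{1.2\,\mathrm{h}} = 50
\]
```

So the same result can be stated as a 50x speedup.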

GPU Computing
GPU compute instances
Intel® Xeon processors
NVIDIA GPUs
CUDA, OpenCL frameworks
Cluster GPU CG1: Intel Xeon X5570; 16 vCPUs; 10 Gigabit Ethernet; 2x NVIDIA Tesla Fermi M2050, 448 cores each
G2 instances: Intel Xeon E5-2670 2.5 GHz; 8 vCPUs; on-board hardware encoder; 1,536 CUDA cores; 15 GB RAM; 4 GB video memory

CUDA & OpenCL
Massively parallel clusters running on GPUs
NVIDIA GRID and Tesla cards in specialized instance types

National Taiwan University
50 x cg1.4xlarge instances
100 NVIDIA Tesla M2050 GPUs
“Our purpose is to break the record of solving the shortest vector problem (SVP) in Euclidean lattices…the vectors we found are considered the hardest SVP anyone has solved so far.”
Prof. Chen-Mou Cheng, the Principal Investigator of Fast Crypto Lab
$2,300 for using 100 Tesla M2050 GPUs for ten hours
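The quoted total translates into an hourly rate per instance and per GPU. The sketch below assumes all 50 instances ran for the full ten hours and that the $2,300 covers instance charges only; neither detail is stated on the slide.

```python
# Figures quoted on the slide
instances = 50
gpus_per_instance = 2      # 100 GPUs across 50 cg1.4xlarge instances
hours = 10
total_cost_usd = 2_300

instance_hours = instances * hours
gpu_hours = instances * gpus_per_instance * hours
cost_per_instance_hour = total_cost_usd / instance_hours
cost_per_gpu_hour = total_cost_usd / gpu_hours

print(f"${cost_per_instance_hour:.2f} per instance-hour")
print(f"${cost_per_gpu_hour:.2f} per GPU-hour")
```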

Coming Soon…
New Compute-Optimized EC2 Instances: C4 family
C4 instances: Intel Xeon E5-2666 v3 (Haswell, custom); 36 vCPUs; 60 GB RAM; 2.9 GHz, up to 3.5 GHz with Turbo Boost
Larger and Faster Elastic Block Store (EBS) Volumes
• Up to 16 TB per volume
• Up to 10,000 baseline IOPS per volume
• Up to 20,000 provisioned IOPS per volume

Middleware Services
Data management
Fully managed SQL, NoSQL and object storage
Relational Database Service: Fully managed database (MySQL, Oracle, MSSQL)
DynamoDB: NoSQL, schemaless, provisioned throughput database
S3: Object datastore, up to 5 TB per object; 99.999999999% durability

Moving computation closer to the data
“Big Data” changes the dynamic of computation and data sharing
Collection: Direct Connect, Import/Export, S3, DynamoDB
Computation: EC2, GPUs, Elastic MapReduce
Collaboration: CloudFormation, Simple Workflow, S3, Zocalo

Middleware Services
Feeding workloads
Using highly available Simple Queue
Service to feed EC2 nodes
Amazon SQS
Processing
task/processing trigger
Processing results
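The queue-fed pattern on this slide can be sketched locally before wiring it to AWS. In the example below, Python's standard `queue.Queue` stands in for an SQS queue and threads stand in for EC2 worker nodes; the function name `run_workers` is hypothetical, and none of this is the SQS API.

```python
import queue
import threading


def run_workers(tasks, process, num_workers=4):
    """Feed tasks through a queue to worker threads and collect results."""
    task_queue = queue.Queue()   # local stand-in for an SQS task queue
    results = queue.Queue()      # local stand-in for a results queue

    def worker():
        while True:
            item = task_queue.get()
            if item is None:     # sentinel: no more work for this worker
                task_queue.task_done()
                break
            results.put((item, process(item)))
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task in tasks:
        task_queue.put(task)
    for _ in threads:
        task_queue.put(None)     # one sentinel per worker
    task_queue.join()
    for t in threads:
        t.join()

    collected = {}
    while not results.empty():
        task, value = results.get()
        collected[task] = value
    return collected


squares = run_workers(range(10), lambda n: n * n)
print(squares[7])
```

Because workers pull from the queue rather than being assigned work, adding or removing workers does not require changing the producer, which is what makes the pattern a good fit for elastic clusters.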

Middleware Services
Coordinating workloads & task clusters
Handle long running processes across many nodes and task steps
with Simple Workflow
Task A, Task B (Auto-scaling), Task C
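A multi-step workflow of this shape can be sketched as three functions run in sequence, with the middle step fanned out across workers. The A, then B, then C ordering is an assumption based on the task labels; the thread pool stands in for an auto-scaled fleet, the task bodies are placeholders, and this is not the Simple Workflow API.

```python
from concurrent.futures import ThreadPoolExecutor


def task_a(n):
    """Step 1: split the job into independent work items."""
    return list(range(n))


def task_b(item):
    """Step 2: process one work item (runs on many workers at once)."""
    return item * item


def task_c(partials):
    """Step 3: combine the partial results."""
    return sum(partials)


def run_workflow(n, max_workers=4):
    """Run A, fan B out over a worker pool, then reduce with C."""
    items = task_a(n)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(task_b, items))
    return task_c(partials)


total = run_workflow(10)
print(total)
```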
Grid Engine
cfncluster
LSF
OpenLava
Bright Cluster Manager

Cloud isn’t an ‘all or nothing’ choice
On-Premises Resources (Legacy Data Centers) | Integration | Cloud Resources

Integrating AWS with your existing on-premises infrastructure
On-premises (Legacy Data Centers): Active Directory, Shibboleth / SAML; Network Configuration; Encryption; Backup Appliances; Your On-Premises Apps
AWS: Users & Access Rules (IAM); Your Private Network (VPC); Encryption (S3, RDS, HSM); Backups (Storage Gateway); Your Cloud Apps
Connectivity: AWS Direct Connect, VPN

Example HPC Design Pattern
[Diagram labels: AZ-1, AZ-2; Public, Private; Customer Gateway, VPN Connection, VPN Gateway; Internet Gateway, Internet; Amazon S3; Master; Spot; Clustered Storage Server (x2)]

HPC Software on AWS Marketplace

HPC Applications and Tools
Use your current development tools
• NVIDIA CUDA drivers pre-loaded
• Intel MPI and Intel MKL® libraries
• OpenMPI and MPICH2
Applications/Services
• MathWorks MATLAB, Intel Lustre, OrangeFS, Ansys Fluent, COMSOL, OpenFOAM etc.
Use your favorite batch scheduler and configuration management tools
• cfncluster, Univa, Sun Grid Engine, HTCondor, MIT StarCluster, Torque, Slurm, Rocks+ (StackIQ), AWS CloudFormation, OpenLava, Chef, Puppet, Elasticluster

Popular HPC Workloads on AWS
Oil and Gas: Seismic Data Processing; Reservoir Simulations, Modeling
Manufacturing & Engineering: Computational Fluid Dynamics (CFD); Finite Element Analysis (FEA)
Life Sciences: Genome Analysis; Molecular Modeling; Protein Docking
Media & Entertainment: Transcoding and Encoding; DRM, Encryption; Rendering
Scientific Computing: Computational Chemistry; High Energy Physics; Stochastic Modeling; Quantum Analysis; Climate Models
EDA: Simulation; Verification

kdsingh@amazon.com
CloudFormation cluster (cfncluster) demo
https://github.com/awslabs/cfncluster
