The document discusses high performance computing (HPC) on AWS. It begins with an agenda that includes HPC concepts, patterns and practices as well as a demo of CloudFormation Cluster (cfncluster). It then discusses using elastic AWS instance clusters that are neither too large nor too small for jobs. The document covers various pricing models on AWS including on-demand, reserved, spot and dedicated instances. It also discusses AWS services that can be used for HPC workloads like EC2, auto scaling, monitoring with CloudWatch, GPU computing, middleware services and integrated solutions. Finally, popular HPC workloads on AWS are listed in various industries.
2. Agenda
• High performance and high throughput computing on AWS
• Integrating on-premises HPC environments with AWS
• HPC ecosystem – partners and tools
• Demo
3. HPC and HTC on AWS
Concepts, Patterns & Practices
14. “…HPC clusters are too small when you need them most, …and too large the rest of the time”
Jason Stowe, Cycle Computing
15. Why AWS for HPC?
• Low cost with flexible pricing
• Efficient clusters
• Unlimited infrastructure
• Faster time to results
• Concurrent clusters on-demand
• Increased collaboration
16. Benefits of Agility
Rigid On-Premises Resources: capacity follows predicted demand, so the gap to actual demand becomes waste or customer dissatisfaction
Elastic Cloud-Based Resources: resources scaled to actual demand
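The contrast between rigid and elastic capacity can be made concrete with a small calculation. The sketch below is illustrative only: the demand figures, the fixed capacity of 60 cores, and the function name `capacity_gap` are assumptions, not values from the slides, and the elastic case is idealized as tracking demand exactly.

```python
def capacity_gap(demand, capacity):
    """Return (wasted, unmet) core-hours given per-hour demand and capacity."""
    wasted = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    unmet = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return wasted, unmet

# Hypothetical hourly demand in cores (not taken from the slides).
demand = [10, 40, 120, 80, 20, 10]

# Rigid cluster: a fixed size chosen from a demand prediction (assumed 60 cores).
rigid = [60] * len(demand)

# Idealized elastic cluster: capacity follows actual demand hour by hour.
elastic = list(demand)

rigid_gap = capacity_gap(demand, rigid)
elastic_gap = capacity_gap(demand, elastic)

print("rigid   (wasted, unmet):", rigid_gap)
print("elastic (wasted, unmet):", elastic_gap)
```

In this toy model, "wasted" corresponds to the waste region of the slide and "unmet" to the customer dissatisfaction region. A real elastic cluster would lag demand somewhat, so its gaps would be small rather than zero.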
17. Cost Benefits of HPC in the Cloud
Pay As You Go Model
• Use only what you need
• Multiple pricing models
On-Premises Capital Expense Model
• High upfront capital cost
• High cost of ongoing support
18. Many Pricing Models to Support Different Workloads
• Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.
• Free Tier: Get started on AWS with free usage & no commitment. For POCs and getting started.
• On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.
• Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.
• Dedicated: Launch instances within Amazon VPC that run on hardware dedicated to a single customer. For highly sensitive or compliance-related workloads.
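To see how the On-Demand, Reserved and Spot models trade off, a rough cost model helps. The sketch below follows the descriptions on the slide (hourly charge, one-time payment plus discounted hourly charge, fluctuating Spot Price). All rates and function names are hypothetical placeholders, not AWS prices.

```python
HOURS_PER_YEAR = 8760

def on_demand_cost(hours, hourly_rate):
    """Pay by the hour with no commitment."""
    return hours * hourly_rate

def reserved_cost(hours, upfront, discounted_hourly_rate):
    """One-time payment plus a discounted hourly charge."""
    return upfront + hours * discounted_hourly_rate

def spot_cost(hourly_spot_prices):
    """Sum of the Spot Price in effect during each hour the instance ran."""
    return sum(hourly_spot_prices)

def break_even_hours(upfront, on_demand_rate, reserved_rate):
    """Hours of use beyond which the reserved model becomes cheaper."""
    return upfront / (on_demand_rate - reserved_rate)

# Hypothetical rates for one instance (assumptions for illustration only).
ON_DEMAND_RATE = 1.0      # USD per hour
RESERVED_UPFRONT = 2000   # USD, one-time
RESERVED_RATE = 0.5       # USD per hour

full_year_on_demand = on_demand_cost(HOURS_PER_YEAR, ON_DEMAND_RATE)
full_year_reserved = reserved_cost(HOURS_PER_YEAR, RESERVED_UPFRONT, RESERVED_RATE)
threshold = break_even_hours(RESERVED_UPFRONT, ON_DEMAND_RATE, RESERVED_RATE)

print(full_year_on_demand, full_year_reserved, threshold)
```

With these placeholder numbers, reserved capacity pays off only for an instance used more than 4,000 hours a year, which is why the slide pairs Reserved with committed utilization and On-Demand with spiky workloads.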
20. On-Demand Supercomputer!
• 484.14 TFLOPS
• 76th fastest supercomputer in the world (June 2014 Top500 list)
• 26,496-core cluster of C3 instances
21. • 8 Regions; 156,314 cores; 16,788 instances
• 1.21 petaFLOPS RPeak
• 264 compute years in 18 hours
• Supercomputing environment worth $68M cost $33K
“Supercomputing simulation employs 156,000 Amazon processor cores. To simulate 205,000 molecules as quickly as possible for a USC simulation, Cycle Computing fired up a mammoth amount of Amazon servers around the globe.” 1
1 c|net news: http://news.cnet.com/8301-1001_3-57611919-92/supercomputing-simulation-employs-156000-amazon-processor-cores/
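The figures on this slide can be cross-checked with some back-of-the-envelope arithmetic. The sketch below assumes that a "compute year" means one core running for 365 days and that all 156,314 cores were available for the full 18 hours; neither assumption is stated in the slides, so the results are rough bounds rather than reported numbers.

```python
# Figures taken from the slide.
cores = 156_314
wall_clock_hours = 18
cost_usd = 33_000        # "$33K"
compute_years = 264

# Assumption: one compute year = one core for 365 days.
HOURS_PER_YEAR = 365 * 24

# Upper bound on core-hours if every core ran for the whole 18 hours.
core_hours_available = cores * wall_clock_hours

# Core-hours implied by "264 compute years".
core_hours_used = compute_years * HOURS_PER_YEAR

# Share of the peak capacity that the stated compute years account for.
utilization = core_hours_used / core_hours_available

# Approximate cost per core-hour of useful work.
cost_per_core_hour = cost_usd / core_hours_used

print(core_hours_available, core_hours_used)
print(round(utilization, 2), round(cost_per_core_hour, 4))
```

Under these assumptions the stated compute years account for roughly four fifths of the peak core-hours, which is plausible for a cluster that ramps up and down across 8 Regions.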
24. Compute: Elastic Compute Cloud (EC2)
Basic unit of compute capacity
Range of CPU, memory & local disk options
35+ instance types available, from micro to cluster compute
Vertical scaling: c3.8xlarge, c3.2xlarge, c3.large
Feature | Details
Flexible | Run Windows or Linux distributions
Scalable | Wide range of instance types from micro to cluster compute
Machine Images | Configurations can be saved as machine images (AMIs) from which new instances can be created
Full control | Full root or administrator rights
Secure | Full firewall control via Security Groups
Monitoring | Publishes metrics to CloudWatch
Inexpensive | On-demand, Reserved and Spot instance types
VM Import/Export | Import and export VM images to transfer configurations in and out of EC2
25. Automation & Control
CLI, API and Console
Scripted configurations
ec2-run-instances ami-xxxxxxxx \
    --instance-count 3 \
    --availability-zone eu-west-1a \
    --instance-type m3.medium
http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/
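The same launch can also be scripted through the API. The sketch below uses the boto3 SDK, which is not mentioned in the slides and is shown here only as one possible way to script the call; the helper names `build_run_instances_params` and `launch` are hypothetical, and `ami-xxxxxxxx` is the placeholder from the slide. The actual launch requires boto3 and configured AWS credentials, so the example only builds the request.

```python
def build_run_instances_params(image_id, count, availability_zone, instance_type):
    """Build keyword arguments for the boto3 EC2 client's run_instances call."""
    return {
        "ImageId": image_id,
        "MinCount": count,   # launch exactly `count` instances
        "MaxCount": count,
        "InstanceType": instance_type,
        "Placement": {"AvailabilityZone": availability_zone},
    }

def launch(params, region_name="eu-west-1"):
    """Send the request to EC2. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3
    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.run_instances(**params)

# Mirrors the command-line example: 3 x m3.medium in eu-west-1a.
params = build_run_instances_params("ami-xxxxxxxx", 3, "eu-west-1a", "m3.medium")
print(params)
```

Keeping the parameters in a plain dictionary makes the configuration easy to store in version control and review before anything is launched.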
32. Computational chemistry project for cancer treatment
Estimated computation time: 39 years
Estimated project cost: $40 million
87,000-core AWS cluster
Spot Instances
Completed in 9 hours
Total cost: $4,232
33. AWS Big Data Portfolio
When data sets and data analytics need to scale to the point that you have to start innovating around how to collect, store, organize, analyze and share it
COLLECT | STORE | ANALYZE | SHARE
Services: Import/Export, Glacier, S3, EC2, Redshift, DynamoDB, EMR, Data Pipeline, Direct Connect, Kinesis
34. Analyzed more than 3 billion data points in 2.8 seconds instead of weeks or months
SEC used Tradeworx and the AWS Cloud to create an analytics platform at 10% of the cost of a traditional environment in less than 4 months
AWS gives Tradeworx the ability to collect and analyze billions of data points over years, allowing the SEC to reconstruct any market event, down to the individual record
36. What if you need to:
Implement MPI?
Code for GPUs?
37. Tightly coupled
Enhanced Networking EC2 Instances
• Single Root I/O Virtualization (SR-IOV)
• Higher packets per second, lower latencies, low network jitter
• Implement HVM process execution
• 10 Gigabit Ethernet
Instance family | Processor | vCPUs | Local disk | RAM
R3 instances | Intel Xeon E5-2670 v2 2.5GHz | 32 | 640GB SSD | 244GB
C3 instances | Intel Xeon E5-2680 v2 2.8GHz | 32 | 640GB SSD | 60GB
I2 instances | Intel Xeon E5-2670 v2 2.5GHz | 32 | 1.6TB SSD | 244GB
38. Tightly coupled
Network Placement Groups
Cluster instances can be launched within a Placement Group. All instances launched in a Placement Group have low-latency, full-bisection 10 Gbps bandwidth between instances.
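Launching a tightly coupled cluster into a Placement Group could be scripted along these lines. This is a sketch using the boto3 SDK, which the slides do not mention; the helper names, the group name `hpc-demo` and the AMI placeholder are hypothetical. The code only builds the request parameters, and the final function, which talks to EC2, needs boto3 and AWS credentials.

```python
def build_placement_group_params(group_name):
    """Keyword arguments for the boto3 EC2 client's create_placement_group call."""
    return {"GroupName": group_name, "Strategy": "cluster"}

def build_cluster_launch_params(image_id, count, instance_type, group_name):
    """Keyword arguments for run_instances that put every node in one group."""
    return {
        "ImageId": image_id,
        # MinCount == MaxCount: either all nodes launch or none do.
        "MinCount": count,
        "MaxCount": count,
        "InstanceType": instance_type,
        "Placement": {"GroupName": group_name},
    }

def launch_cluster(group_params, launch_params, region_name):
    """Create the group, then launch into it. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders work without boto3
    ec2 = boto3.client("ec2", region_name=region_name)
    ec2.create_placement_group(**group_params)
    return ec2.run_instances(**launch_params)

# Hypothetical 8-node cluster of c3.8xlarge instances.
group_params = build_placement_group_params("hpc-demo")
launch_params = build_cluster_launch_params("ami-xxxxxxxx", 8, "c3.8xlarge", "hpc-demo")
print(group_params, launch_params)
```

Requesting all nodes in a single call is a deliberate choice here: a tightly coupled MPI job usually cannot start with a partial cluster.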
39. Compute-intensive clinical trial simulations that previously took 60 hours are finished in only 1.2 hours on the AWS Cloud
BMS used AWS to build a secure, self-provisioning portal for hosting research so scientists can run clinical trial simulations on-demand while BMS is able to establish rules that keep compute costs low.
Running simulations 98% faster has led to more efficient and less costly clinical trials and better conditions for patients.
http://aws.amazon.com/solutions/case-studies/bristol-myers-squibb/
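The "98% faster" figure follows directly from the two run times quoted above. A short check, with function names chosen here for illustration:

```python
def speedup(before_hours, after_hours):
    """How many times faster the new run is."""
    return before_hours / after_hours

def time_reduction_percent(before_hours, after_hours):
    """Percentage of the original run time that was eliminated."""
    return (before_hours - after_hours) / before_hours * 100

# Run times from the slide: 60 hours before, 1.2 hours on AWS.
s = speedup(60, 1.2)
r = time_reduction_percent(60, 1.2)

print(round(s, 1), round(r, 1))
```

So a 50-fold speedup and a 98% reduction in run time are two ways of stating the same result.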
41. CUDA & OpenCL
Massively parallel clusters running on GPUs
NVIDIA GRID and Tesla cards in specialized instance types
42. National Taiwan University
50 x cg1.4xlarge instances
100 NVIDIA Tesla M2050 GPUs
“Our purpose is to break the record of solving the shortest vector problem (SVP) in Euclidean lattices…the vectors we found are considered the hardest SVP anyone has solved so far.”
Prof. Chen-Mou Cheng, the Principal Investigator of Fast Crypto Lab
$2,300 for using 100 Tesla M2050 GPUs for ten hours
43. Coming Soon…
New Compute-Optimized EC2 Instances: C4 family
C4 instances
• Intel Xeon E5-2666 v3 Haswell, custom
• 36 vCPUs
• 60GB RAM
• 2.9GHz, up to 3.5GHz with Turbo boost
Larger and Faster Elastic Block Store (EBS) Volumes
• Up to 16TB per volume
• Up to 10,000 baseline IOPS per volume
• Up to 20,000 provisioned IOPS per volume
45. Middleware Services
Data management
Fully managed SQL, NoSQL and object storage
• Relational Database Service: fully managed database (MySQL, Oracle, MSSQL)
• DynamoDB: NoSQL, schemaless, provisioned throughput database
• S3: object datastore up to 5TB per object, 99.999999999% durability
46. Moving computation closer to the data
“Big Data” changes the dynamic of computation and data sharing
Collection | Computation | Collaboration
Services: Direct Connect, Import/Export, S3, DynamoDB, EC2, GPUs, Elastic MapReduce, CloudFormation, Simple Workflow, Zocalo
48. Middleware Services
Coordinating workloads & task clusters
Handle long running processes across many nodes and task steps with Simple Workflow
[Figure: workflow of Task A, Task B (auto-scaling) and Task C, with steps numbered 1 to 3]
Grid Engine, cfncluster, LSF, OpenLava, Bright Cluster Manager
56. HPC Applications and Tools
Use your current development tools
• NVIDIA CUDA drivers pre-loaded
• Intel MPI and Intel MKL® libraries
• OpenMPI and MPICH2
Applications/Services
• MathWorks MATLAB, Intel Lustre, OrangeFS, Ansys Fluent, COMSOL, OpenFOAM etc.
Use your favorite batch scheduler and configuration management tools
• cfncluster, Univa, Sun Grid Engine, HTCondor, MIT StarCluster, Torque, Slurm, Rocks+ (StackIQ), AWS CloudFormation, Openlava, Chef, Puppet, Elasticluster
57. Popular HPC Workloads on AWS
• Oil and Gas: Seismic Data Processing; Reservoir Simulations, Modeling
• Manufacturing & Engineering: Computational Fluid Dynamics (CFD); Finite Element Analysis (FEA)
• Life Sciences: Genome Analysis; Molecular Modeling; Protein Docking
• Media & Entertainment: Transcoding and Encoding; DRM, Encryption; Rendering
• Scientific Computing: Computational Chemistry; High Energy Physics; Stochastic Modeling; Quantum Analysis; Climate Models
• EDA: Simulation; Verification