Since 2011, ONS.org.br (responsible for planning and operating the Brazilian Electric Sector) has used AWS to run daily simulations based on complex mathematical models. The MIT StarCluster toolkit makes running HPC on AWS much less complex and lets ONS provision a high-performance cluster in less than 5 minutes. Because how long a big cluster stays up depends on the user, ONS decided to develop an HPC portal where its engineers can work with AWS and MIT StarCluster without writing a line of code or using the command terminal. It is a simple turn-on/turn-off portal. The cluster is now personal: every engineer runs the models using HPC on AWS as if they were using a PC.
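A minimal sketch of how a turn-on/turn-off portal could map its two buttons to StarCluster commands. This is an illustration, not the ONS implementation: the function names and the "on"/"off" action labels are hypothetical, and it assumes the `starcluster` CLI is installed and already configured with AWS credentials.

```python
import subprocess
from typing import List


def build_command(action: str, cluster_name: str) -> List[str]:
    """Translate a portal button ("on" or "off") into a StarCluster command line."""
    # Hypothetical mapping: "on" starts a cluster, "off" terminates it.
    actions = {"on": "start", "off": "terminate"}
    if action not in actions:
        raise ValueError(f"unknown action: {action!r}")
    return ["starcluster", actions[action], cluster_name]


def run_action(action: str, cluster_name: str) -> subprocess.CompletedProcess:
    """Run the command; raises CalledProcessError if StarCluster exits non-zero."""
    # Note: StarCluster may ask for confirmation before terminating,
    # so a real portal would need to handle that prompt.
    return subprocess.run(build_command(action, cluster_name), check=True)
```

An engineer pressing "turn on" for a cluster named `ons-sim` (a made-up name) would correspond to `run_action("on", "ons-sim")`.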
6. Pay As You Go Model
Use only what you need
Multiple pricing models
On-Premises
Capital Expense Model
High upfront capital cost
High cost of ongoing support
HPC as utility
7. Elastic Cloud-Based Resources
Actual demand
Resources scaled to demand
Waste
Customer Dissatisfaction
Actual Demand
Predicted Demand
Rigid On-Premises Resources
10. Scale using Elastic Capacity
>600 cores
Scalability on AWS
<10 cores
>1500 cores
11. Making Production Cloud HPC easy from 64 cores to …
Pharma
Johnson & Johnson
Manufacturing
HGST, a Western Digital Company
Financial Services
Pacific Life Insurance
Genomics
Life Technologies
Research
The Aerospace Corporation
… 156,314 cores for better solar panel materials for $33k, not $68M
Amazon EC2
16,788 Spot Instances
Amazon S3
4TB Processed
Spot Instances on all 8 Regions
1.21 PetaFLOPS
Intel SandyBridge on CC2
13. Flexibility
How HPC can be used as utility
Cost-optimization
It’s about new cost models and new ways to enable your business to do more.
Shifting the Paradigm
14. On-Demand
Pay for compute capacity by the hour with no long-term commitments
For spiky workloads, or to define needs
Reserved
Make a low, one-time payment and receive a significant discount on the hourly charge
For committed utilization
Spot
Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand
For time-insensitive or transient workloads
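The three pricing models above can be compared with simple arithmetic. The sketch below is illustrative only: all prices are hypothetical placeholders, not actual AWS rates, and the function names are invented for this example.

```python
def on_demand_cost(hours, hourly_rate):
    """On-Demand: pay by the hour, no commitment."""
    return hours * hourly_rate


def reserved_cost(hours, upfront, discounted_rate):
    """Reserved: one-time payment plus a discounted hourly charge."""
    return upfront + hours * discounted_rate


def spot_cost(hourly_spot_prices):
    """Spot: each hour is charged at that hour's fluctuating Spot Price."""
    return sum(hourly_spot_prices)


def break_even_hours(hourly_rate, upfront, discounted_rate):
    """Hours of use above which Reserved becomes cheaper than On-Demand."""
    return upfront / (hourly_rate - discounted_rate)


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
ON_DEMAND_RATE = 1.0       # currency units per hour
RESERVED_UPFRONT = 2000.0  # one-time payment
RESERVED_RATE = 0.5        # discounted hourly charge
```

With these placeholder numbers, Reserved pays off only beyond 4000 hours of use, which is why it suits committed utilization while On-Demand suits spiky workloads.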
15. [Chart: Percentage of Peak Requirements Over Time (0% to 100%). Capacity layers: Heavy Utilization Reserved Instances, Light RI, On-Demand, Spot and On-Demand]
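The layering idea in this chart, a reserved baseline with On-Demand and Spot capacity covering the peaks, can be sketched as a simple split of demand. The function name, the demand figures, and the baseline size are hypothetical and only illustrate the arithmetic.

```python
def split_demand(demand, reserved_capacity):
    """Split each period's demand into a reserved part and an overflow part.

    demand: demand per period, as a percentage of peak requirements.
    reserved_capacity: baseline covered by Reserved Instances (same unit).
    The overflow is what On-Demand or Spot capacity would have to cover.
    """
    reserved = [min(d, reserved_capacity) for d in demand]
    overflow = [max(d - reserved_capacity, 0) for d in demand]
    return reserved, overflow


# Hypothetical demand profile over four periods, in percent of peak.
demand = [20, 40, 100, 60]
reserved, overflow = split_demand(demand, reserved_capacity=40)
```

Sizing the reserved baseline near the steady part of the demand keeps the reserved capacity busy, while the spiky remainder is paid for only when it occurs.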
16. [Chart: EC2 instance families plotted by EC2 Compute Units (HP) and Memory (GB): Micro, Standard, High CPU, High Memory, Cluster Compute & High I/O, Cluster High Memory & High Storage]
17. Cost-optimization
It’s about new cost models and new ways to enable your business to do more.
Flexibility
How HPC can be used as utility
Performance and power
From embarrassingly parallel to tightly coupled
Shifting the Paradigm
18. CLI, API, and console
Scripted configurations
Automation & control
Automatic re-sizing of compute clusters based upon demand and policies
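Automatic re-sizing based on demand and policies can be sketched as a small function that turns a measure of demand into a target cluster size. This is an assumption-laden illustration: the queue-length metric, the parameter names, and the limits are hypothetical, not a description of any specific AWS feature.

```python
import math


def desired_nodes(queued_jobs, jobs_per_node, min_nodes, max_nodes):
    """Return the cluster size a simple demand-based policy would ask for.

    queued_jobs: jobs currently waiting (the demand signal).
    jobs_per_node: how many jobs one node can run at a time.
    min_nodes / max_nodes: policy limits on the cluster size.
    """
    needed = math.ceil(queued_jobs / jobs_per_node)
    # Clamp the demand-driven size to the policy limits.
    return max(min_nodes, min(needed, max_nodes))
```

A script could call this periodically and use the CLI or API to add or remove nodes until the cluster matches the returned size.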
21. Network placement groups
Cluster instances deployed in a Placement Group enjoy low latency, full bisection 10 Gbps bandwidth
Performance for tightly-coupled workloads
22. GPU compute instances
cg1.8xlarge
33.5 EC2 Compute Units
20GB RAM
2x NVIDIA GPU
448 Cores
3GB Mem
g2.2xlarge
26 EC2 Compute Units
16GB RAM
1x NVIDIA GPU
1536 Cores
4GB Mem
G2 instances
Intel® Xeon® E5-2670 processors
1 x NVIDIA Kepler GK104 GPU
I/O Performance: Very High
CG1 instances
Intel® Xeon® X5570 processors
2 x NVIDIA Tesla “Fermi” M2050 GPUs
I/O Performance: Very High
Performance for tightly-coupled workloads
23. Flexibility
How HPC can be used as utility
Achieve more
Perform bigger, more complex jobs in a much reduced time
Performance and power
Flexibility to choose platforms
Shifting the Paradigm
Cost-optimization
It’s about new cost models and new ways to enable your business to do more
24. Oil and Gas
Seismic Data Processing
Reservoir Simulations, Modeling
Geospatial applications
Predictive Maintenance
Manufacturing & Engineering
Computational Fluid Dynamics (CFD)
Finite Element Analysis (FEA)
Wind Simulation
Life Sciences
Genome Analysis
Molecular Modeling
Protein Docking
Media & Entertainment
Transcoding and Encoding
DRM, Encryption
Rendering
Energy & Scientific Computing
Computational Chemistry
High Energy Physics
Stochastic Modeling
Quantum Analysis
Energy Models
Climate Models
Financial
Monte Carlo Simulations
Wealth Management Simulations
Portfolio, Credit Risk Analytics
High Frequency Trading Analytics
Customers are using AWS for more and more HPC workloads
32. Medium Term: NEWAVE
Horizon: 5 years
Stage: month
More uncertainty and fewer details
Short Term: DECOMP
Horizon: 1 to 6 months
Stage: week
Less uncertainty and more details
Updating of operating conditions
33. Decision: Use Hydro, or Use Thermal to supplement Hydro
Use Hydro: OK, or Energy Deficit (load shedding)
Use Thermal to supplement Hydro: Spillage (waste), or OK
Cost curves: Total Cost, Immediate Cost, Future Cost
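The trade-off on this slide can be sketched numerically: using more hydro now lowers the immediate (thermal fuel) cost but raises the future cost of having less water stored, and the decision minimizes their sum. The cost functions and all numbers below are hypothetical, chosen only to show the shape of the problem; they are not taken from NEWAVE or DECOMP.

```python
def immediate_cost(hydro_mwh, demand_mwh, fuel_cost_per_mwh):
    """Cost of the thermal generation needed to cover what hydro does not."""
    thermal_mwh = demand_mwh - hydro_mwh
    return thermal_mwh * fuel_cost_per_mwh


def future_cost(hydro_mwh):
    """Hypothetical convex penalty: the more water used now, the costlier the future."""
    return hydro_mwh * hydro_mwh / 10


def total_cost(hydro_mwh, demand_mwh, fuel_cost_per_mwh):
    """Total cost is the immediate cost plus the future cost."""
    return immediate_cost(hydro_mwh, demand_mwh, fuel_cost_per_mwh) + future_cost(hydro_mwh)


# Hypothetical case: 100 MWh of demand, thermal fuel at 10 per MWh.
DEMAND = 100
FUEL_COST = 10

# Try every whole-MWh hydro decision and keep the cheapest one.
best_hydro = min(range(DEMAND + 1), key=lambda h: total_cost(h, DEMAND, FUEL_COST))
```

In this toy case the minimum falls between the two extremes, covering part of the load with hydro and the rest with thermal, which mirrors the idea of using thermal generation to supplement hydro.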
34. Decomp
“We need more power”
Newave
Weather Forecast
Parallel Processing
36. 1. Elastic Environment
• Unlimited processing power
• Ideal for unexpected load
2. Low data transfer
• Input and output of small files
• Ideal for internet connection
3. Variable and right cost
• Pay per use
• No need to buy huge servers