Jafar Shameem and David Pellerin
High Performance Computing with AWS
Business Development, HPC
Migrate entire HPC applications
and datacenters to the cloud
Use cloud capabilities to create
entirely new HPC applications
Augment on-premise HPC
resources with cloud capacity
How are Organizations Using Cloud for HPC?
• Security: Deploy applications and store data in a secure,
highly configurable VPC environment
• Agility: Deploy the right infrastructure for each technical
computing job, at the right time
• Scalability: Add and subtract servers in minutes to
optimize time-to-results
• Cost Savings: Pay only for what you use, don’t pay for
idle or outdated servers
Why AWS for High-Performance Computing?
[Figure: capacity vs. demand over time]
Rigid On-Premise Resources: capacity follows predicted demand; where it diverges from actual demand the result is waste or user/customer dissatisfaction
Elastic Resources: resources scaled to actual demand
AWS for Agility
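The contrast between rigid and elastic capacity can be put in numbers. The sketch below is illustrative only: the demand series and the fixed capacity of 50 servers are made-up values, and the function names are hypothetical. It counts idle server-hours (waste) and unserved server-hours (dissatisfaction) for a fixed cluster, and shows that both are zero when capacity tracks demand.

```python
# Hypothetical hourly demand, in servers needed (illustrative numbers only).
demand = [10, 10, 40, 120, 60, 10]


def fixed_capacity_outcome(demand, capacity):
    """Return (wasted, unmet) server-hours for a cluster of fixed size."""
    wasted = sum(max(capacity - d, 0) for d in demand)  # idle servers
    unmet = sum(max(d - capacity, 0) for d in demand)   # work that had to wait
    return wasted, unmet


def elastic_outcome(demand):
    """Capacity is scaled to demand in every period, so nothing is idle or unmet."""
    provisioned = list(demand)
    wasted = sum(p - d for p, d in zip(provisioned, demand))
    unmet = 0
    return wasted, unmet


predicted_capacity = 50  # assumed size of the on-premise cluster
print(fixed_capacity_outcome(demand, predicted_capacity))  # (130, 80)
print(elastic_outcome(demand))                             # (0, 0)
```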
Many purchase models to support different needs
• On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs
• Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization
• Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads
• Dedicated: Launch instances within Amazon VPC that run on hardware dedicated to a single customer. For highly sensitive or compliance-related workloads
• Free Tier: Get started on AWS with free usage and no commitment. For POCs and getting started
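A rough way to choose between On-Demand and Reserved is to find the number of hours at which the one-time payment pays for itself. The sketch below assumes hypothetical prices (they are not AWS list prices) and hypothetical function names; it only illustrates the arithmetic.

```python
def on_demand_cost(hours, hourly_rate):
    """Total cost when paying by the hour with no commitment."""
    return hours * hourly_rate


def reserved_cost(hours, upfront, discounted_hourly_rate):
    """Total cost with a one-time payment plus a discounted hourly charge."""
    return upfront + hours * discounted_hourly_rate


def break_even_hours(hourly_rate, upfront, discounted_hourly_rate):
    """Hours of use above which Reserved becomes cheaper than On-Demand."""
    saving_per_hour = hourly_rate - discounted_hourly_rate
    if saving_per_hour <= 0:
        raise ValueError("the reserved hourly rate must be below the on-demand rate")
    return upfront / saving_per_hour


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
ON_DEMAND_RATE = 1.00       # $ per instance-hour
RESERVED_UPFRONT = 2000.00  # $ one-time payment
RESERVED_RATE = 0.50        # $ per instance-hour after the discount

print(break_even_hours(ON_DEMAND_RATE, RESERVED_UPFRONT, RESERVED_RATE))  # 4000.0
```

Below the break-even point the workload is better served by On-Demand (or Spot, if it can tolerate interruption); above it, the commitment pays off.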
Massive scale allows AWS to constantly reduce
costs, while improving quality and reliability
TCO of cloud is much lower than on-premise IT
when all costs are considered
Result? Large scale datacenter-to-cloud
migrations are in progress every day
AWS for Scale
Scalable Computing: Go From Just One Instance…
To Thousands… in Just Minutes!
[Figure: instance types plotted by memory (GiB) against EC2 Compute Units]
• Micro: 613 MB, up to 2 ECUs
• Small: 1.7 GB, 1 EC2 Compute Unit, 1 virtual core
• Medium: 3.7 GB, 2 EC2 Compute Units, 1 virtual core
• Large: 7.5 GB, 4 EC2 Compute Units, 2 virtual cores
• Extra Large: 15 GB, 8 EC2 Compute Units, 4 virtual cores
• Hi-Mem XL: 17.1 GB, 6.5 EC2 Compute Units, 2 virtual cores
• Hi-Mem 2XL: 34.2 GB, 13 EC2 Compute Units, 4 virtual cores
• Hi-Mem 4XL: 68.4 GB, 26 EC2 Compute Units, 8 virtual cores
• High-CPU Med: 1.7 GB, 5 EC2 Compute Units, 2 virtual cores
• High-CPU XL: 7 GB, 20 EC2 Compute Units, 8 virtual cores
• Cluster GPU 4XL: 22 GB, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs
• Cluster Compute 4XL: 23 GB, 33.5 EC2 Compute Units
• Cluster Compute 8XL: 60.5 GB, 88 EC2 Compute Units
• High I/O 4XL: 60.5 GB, 35 EC2 Compute Units, 2 x 1024 GB SSD-based local instance storage
• High Storage 8XL: 117 GB, 35 EC2 Compute Units, 24 x 2 TB instance store
• Cluster High Mem 8XL: 89 EC2 Compute Units, 244 GB SSD instance storage
Choose the Right Instance Type for the Job
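Picking an instance type amounts to filtering on the job's memory and compute requirements. The sketch below is a minimal illustration, not an AWS tool: it uses a subset of the figures from this slide (entries with ambiguous figures are left out), and the ranking by EC2 Compute Units and then memory is an assumed stand-in for cost, since the slide gives no prices.

```python
# (name, memory in GB, EC2 Compute Units), taken from the slide above.
INSTANCE_TYPES = [
    ("Small", 1.7, 1),
    ("Medium", 3.7, 2),
    ("Large", 7.5, 4),
    ("Extra Large", 15, 8),
    ("Hi-Mem XL", 17.1, 6.5),
    ("Hi-Mem 2XL", 34.2, 13),
    ("Hi-Mem 4XL", 68.4, 26),
    ("High-CPU Med", 1.7, 5),
    ("High-CPU XL", 7, 20),
    ("Cluster Compute 4XL", 23, 33.5),
    ("Cluster Compute 8XL", 60.5, 88),
]


def candidates(min_memory_gb, min_ecu):
    """Return instance types meeting both requirements, smallest first.

    "Smallest" is approximated by ECUs, then memory (an assumption).
    """
    fits = [t for t in INSTANCE_TYPES
            if t[1] >= min_memory_gb and t[2] >= min_ecu]
    return sorted(fits, key=lambda t: (t[2], t[1]))


# A memory-bound job and a CPU-bound job land on different families.
print(candidates(16, 6)[0][0])   # Hi-Mem XL
print(candidates(1, 20)[0][0])   # High-CPU XL
```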
• On-Premise: Experiment infrequently; failure is expensive ($ millions); less innovation
• Cloud: Experiment often; fail quickly at a low cost (nearly $0); more innovation
AWS for Innovation
Focus on innovation
Leave the muck of infrastructure management to AWS
http://eddie.niese.net/20090313/dont-pity-incompetence/
• Engineering: CAD and CAE for aerospace, defense, structures,
consumer products
• Life Sciences: For basic research, drug discovery, genomics, and
translational medicine
• Energy and Geophysics: Including seismic processing, reservoir
estimation, high-energy simulation, wind energy modeling, GIS
• Financial Services and Insurance: Including valuation and risk
analytics
And Many More!
HPC Applications Running on AWS Today
HPC for Engineering
Scalable Computing for CAD/CAE/EDA
AWS for Engineering
• Computer-Aided Design, Simulation, Analysis, Visualization
– For development of commercial and military products
– Aerospace, automotive, civil, construction, energy, others
– Across industries, the trend is Simulation-Driven Design
• Examples
– Computer-Aided Design (CAD) including 3D models
– Electronic Design Automation (EDA)
– Computational Fluid Dynamics (CFD)
– Finite Element Analysis (FEA) and Thermal Analysis
– Crash Analysis
– Failure and Hazard Analysis
CFD for Turbine Engine Design
• Time accurate fluid dynamics
• SBIR-funded project for the US Air Force Research Laboratory (AFRL)
• SAS 70 Type II certification and VPN-level access required
• Additional security measures:
– Uploaded and downloaded data was encrypted
– Dedicated EC2 cluster instances were provisioned
– Data was purged upon completion of the run
“The results of this case were impressive. Using Amazon EC2 the large-scale,
time accurate simulation was turned around in just 72 hours with computing
infrastructure costs well below $1,000.”
http://aws.amazon.com/solutions/case-studies/aerodynamic-solutions/
• Commercial provider of mixed-signal ASICs for X-ray and gamma ray
detection and imaging
• Needs to perform very large Monte Carlo simulations using as many as 4000
server nodes
• Computing workloads are highly variable, project-driven
• Building an on-premise cluster to handle peak loads would be cost prohibitive
• Solution: EC2 3rd-generation High-Memory instances
• Up to 80% savings by using Spot instances on EC2
Radiation Simulation for ASIC Design
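Monte Carlo workloads like this one scale out because every sample is independent: the total is split into chunks, each chunk runs on its own node with its own random seed, and the partial results are summed. The sketch below shows only that pattern. It is not the customer's simulation; a quarter-circle hit test (which estimates pi) stands in for the physics, the chunks run sequentially here rather than on separate nodes, and all names are hypothetical.

```python
import random


def run_chunk(seed, n_samples):
    """One worker's share: count random points inside the unit quarter-circle."""
    rng = random.Random(seed)  # independent generator per worker
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def split_samples(total, n_workers):
    """Divide the sample count as evenly as possible across workers."""
    base, extra = divmod(total, n_workers)
    return [base + (1 if i < extra else 0) for i in range(n_workers)]


def estimate(total_samples, n_workers):
    """Combine per-worker hit counts into a single estimate."""
    sizes = split_samples(total_samples, n_workers)
    # On a cluster, each (seed, size) pair would be dispatched to a different node.
    hits = sum(run_chunk(seed, n) for seed, n in enumerate(sizes))
    return 4.0 * hits / total_samples


print(split_samples(10, 4))  # [3, 3, 2, 2]
print(estimate(200_000, 8))
```

Because no worker depends on another, losing a Spot instance costs only that worker's chunk, which can be rerun with the same seed.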
1) Customer Managed Application Hosting
• Customer has account with AWS and manages infrastructure
• Customer maintains traditional software vendor relationships
• Software vendor offers license flexibility (BYOL)
2) Vendor Managed Hosting to Augment On-Premise Application
• Client-Server model for acceleration of batch tasks
• Customer pays software vendor for AWS-hosted services
• Customer does not need to manage low-level infrastructure
3) Vendor Managed Software-as-a-Service
• Pay-per-use, fully web-based including GUI
Scenarios for Technical Software
Trusted by Enterprises Worldwide
HPC for Life Sciences
Customer Case-Studies
And a rich history in Life Sciences
AWS Public Data Sets
• A centralized repository of public datasets
• Seamless integration with cloud based applications
• No charge to the community
• Some of the datasets available today:
– 1000 Genomes Project
– Ensembl
– GenBank
– Illumina – Jay Flatley Human Genome Dataset
– YRI Trio Dataset
– The Cannabis Sativa Genome
– UniGene
– Influenza Virus
– PubChem
• Tell us what else you’d like for us to host …
Open Source ecosystem
• NCBI BLAST
• Crossbow
• CloudBurst
• Myrna
• Clovr
• BioPerl Max
• VIPDAC
• Superfamily
• Cloud-Coffee
• BioNimbus
• GMOD
• CloudAligner
• CRdata
• SeqWare
• Blend
• StormSeq
• BioConductor
Get links to AMIs at:
https://github.com/mndoci/mndoci.github.com/wiki/Life-Science-Apps-on-AWS
Cluster and configuration management tools: MIT StarCluster, Sun Grid Engine, Condor, Torque, Slurm, Rocks, Chef, Puppet
The number of cluster nodes that can be added depends on the computational needs
Why AWS?
• Remove constraints: Capex, operational skills, processing limitations
• Focus on the problem: Not the technical challenges of large compute clusters
• Achieve more: Perform bigger, more complex jobs in much less time
• Iterate around the problem: Do more and afford to take more risks as the cost of experimentation is reduced
Data Transfer
• AWS Import/Export
– Move large amounts of data into and outside AWS
– Data Migration, Content Distribution, DR, etc.
• AWS Direct Connect
– Secure private link to AWS
– 1Gbps, 10Gbps connectivity
– You can also co-locate hardware in AWS DX locations
• Bandwidth Optimization Solutions
– Commercial providers – Aspera, Riverbed, Attunity, etc.
– Open Source – Tsunami UDP, Globus Online
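Whether to send data over the network or ship it with AWS Import/Export is mostly a question of transfer time. The sketch below estimates that time for the 1 Gbps and 10 Gbps Direct Connect speeds listed above. The 100 TB dataset and the 80% sustained link utilization are assumptions, and 1 TB is taken as 10^12 bytes.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_terabytes, link_gbps, utilization=0.8):
    """Days needed to move a dataset over a link at a given sustained utilization."""
    bits = data_terabytes * 1e12 * 8
    effective_bits_per_second = link_gbps * 1e9 * utilization
    return bits / effective_bits_per_second / SECONDS_PER_DAY


# Hypothetical 100 TB dataset over the two link speeds from the slide.
for gbps in (1, 10):
    print(f"{gbps} Gbps: {transfer_days(100, gbps):.1f} days")
```

If the estimate runs to weeks, shipping the data or using a bandwidth optimization tool is likely the better fit.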
Storage Options
• Relational Database Service: Fully managed database (MySQL, Oracle, MSSQL)
• DynamoDB: NoSQL, schemaless, provisioned-throughput database
• S3: Object datastore, up to 5 TB per object, 99.999999999% durability
• SimpleDB: NoSQL, schemaless, smaller datasets
• Redshift: Fully managed, petabyte-scale data warehousing service
S3 at scale: 1.3 trillion objects in S3, 835k peak transactions per second
Glacier
Long term cold storage
From $0.01 per GB/Month
99.999999999% durability
Archival
“Every day our genome sequencers produce terabytes of data. As our company
moves into the clinical space, we face a legal requirement to archive patient data
for years that would drastically raise the cost of storage. Thanks to Amazon
Glacier’s secure and scalable solution, we will be able to provide cost-effective,
long-term storage and thereby eliminate a barrier to providing whole genome
sequencing for medical treatment of cancer and other genetic diseases.”
- Keith Raffel, Senior Vice President and Chief Commercial Officer, Complete
Genomics
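The archival cost of a steadily growing dataset can be estimated from the per-GB rate quoted on the Glacier slide. The sketch below assumes a hypothetical 10 TB of new data per month, treats 1 TB as 1000 GB, and ignores retrieval and request charges; the function name is illustrative.

```python
GLACIER_RATE_PER_GB_MONTH = 0.01  # "From $0.01 per GB/Month" on the slide


def cumulative_archive_cost(tb_added_per_month, months,
                            rate_per_gb_month=GLACIER_RATE_PER_GB_MONTH):
    """Total storage charge when a fixed amount of data is added every month."""
    total = 0.0
    stored_gb = 0.0
    for _ in range(months):
        stored_gb += tb_added_per_month * 1000  # archive keeps growing
        total += stored_gb * rate_per_gb_month  # pay for everything stored so far
    return total


# Hypothetical: 10 TB of sequencer output archived each month for a year.
print(f"${cumulative_archive_cost(10, 12):,.0f}")
```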
Elastic MapReduce
Managed, elastic Hadoop cluster
Integrates with S3 & DynamoDB
Leverage Hive & Pig analytics scripts
Works with purchase options such as Spot instances
Application Services
Feature: Details
• Scalable: Use as many or as few compute instances running Hadoop as you want. Modify the number of instances while your job flow is running
• Integrated with other services: Works seamlessly with S3 as origin and output. Integrates with DynamoDB
• Comprehensive: Supports languages such as Hive and Pig for defining analytics, and allows complex definitions in Cascading, Java, Ruby, Perl, Python, PHP, R, or C++
• Cost effective: Works with Spot instance types
• Monitoring: Monitor job flows from within the management console
[Diagram: AWS Global Infrastructure supporting Compute, Storage, Database, Networking, App Services, and Deployment & Administration]
EMR Jobs: 3.7 M clusters launched since May 2010
[Chart: EMR clusters launched, axis from 0 to 4,000,000]
Crossbow
• Align billions of reads and find SNPs
– Reuse software components: Hadoop Streaming
http://bowtie-bio.sourceforge.net/crossbow
• Map: Bowtie (Langmead et al., 2009)
– Find best alignment for each read
– Emit (chromosome region, alignment)
• Reduce: SOAPsnp (Li et al., 2009)
– Scan alignments for divergent columns
– Accounts for sequencing error, known SNPs
• Shuffle: Hadoop
– Group and sort alignments by region
Searching for SNPs with Cloud Computing.
Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL (2009) Genome Biology. 10:R134 
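The map, shuffle, and reduce roles described for Crossbow can be illustrated with a toy pipeline. This is not Crossbow's code: a brute-force lowest-mismatch search stands in for Bowtie, a simple majority vote stands in for SOAPsnp, everything runs in one process instead of on Hadoop, and the reference sequence, region size, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

REFERENCE = {"chr1": "ACGTTGCAAGCTTAGC"}  # toy reference (hypothetical)
REGION_SIZE = 1000                         # toy partition width


def map_read(read):
    """Map: find the lowest-mismatch placement, emit (region key, alignment)."""
    best = None
    for chrom, seq in REFERENCE.items():
        for offset in range(len(seq) - len(read) + 1):
            window = seq[offset:offset + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if best is None or mismatches < best[0]:
                best = (mismatches, chrom, offset)
    _, chrom, offset = best
    return (chrom, offset // REGION_SIZE), (offset, read)


def shuffle(pairs):
    """Shuffle: group alignments by region key and sort them by offset."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sorted(values) for key, values in sorted(groups.items())}


def reduce_region(key, alignments, min_support=2):
    """Reduce: report columns whose consensus base differs from the reference."""
    chrom, _ = key
    columns = defaultdict(Counter)
    for offset, read in alignments:
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    snps = []
    for pos in sorted(columns):
        base, count = columns[pos].most_common(1)[0]
        if base != REFERENCE[chrom][pos] and count >= min_support:
            snps.append((chrom, pos, REFERENCE[chrom][pos], base))
    return snps


def call_snps(reads):
    grouped = shuffle(map_read(read) for read in reads)
    return [snp for key, alns in grouped.items()
            for snp in reduce_region(key, alns)]


# Two reads carry a T at position 6 where the reference has C; one read agrees
# with the reference.
print(call_snps(["TTGTAAGC", "GTAAGCTT", "ACGTTGCA"]))  # [('chr1', 6, 'C', 'T')]
```

Keying the map output by region is what lets the reduce step run independently per region, so adding nodes shortens the run without changing the logic.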
Worldwide research and
development
The Amazon Virtual Private Cloud was a unique
option that offered an additional level of security and
an ability to integrate with other aspects of our
infrastructure.
“AWS enables Pfizer’s WRD to explore specific difficult or deep
scientific questions in a timely, scalable manner and helps
Pfizer make better decisions more quickly”
Dr. Michael Miller, Head of HPC for R&D, Pfizer
Spiral Genetics
• Alignment, Variant Calling, Annotation
• Turnaround time
– Targeted : less than 40 minutes
– Exome : less than 2 hours
– Whole Genome : less than 5 hours
• Workflows can be easily defined
and automated with integrated
Galaxy Platform capabilities
• Data movement is streamlined
with integrated Globus file-
transfer functionality
• Resources can be provisioned
on-demand with Amazon Web
Services cloud based
infrastructure
Globus Genomics
Proprietary and Confidential. ©2013 Syapse
Syapse: Bringing Omics into Routine Medical Use
Laboratory
Testing
Test Results Clinical Use
Syapse Semantic Data
Platform
Syapse Omics Medical
Record Application
Syapse Physician Portal
Application
Syapse Discovery
Application
Syapse
Leverage Spot instances in workflows
1 day's worth of effort resulted in 50% savings in cost
Harvard Medical School
The Laboratory for Personalized Medicine
Run EC2 clusters to analyze entire genomes
“The AWS solution is stable, robust, flexible, and low cost. It has everything to recommend it.”
Dr. Peter Tonellato, LPM, Center for Biomedical Informatics, Harvard Medical School
Illumina BaseSpace
• Data Analysis
– Alignment, Assembly, QC, Analysis
• Share data with colleagues
• Access high quality and diverse datasets
We are here to help
• Enterprise Support
• Trusted Advisor
• Professional Services
• Sales and Solutions Architects
Thank You
Jafar Shameem
(shameemj@amazon.com)
David Pellerin
(dpelleri@amazon.com)
http://aws.amazon.com/

 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

High Performance Computing with AWS

  • 1. Jafar Shameem and David Pellerin High Performance Computing with AWS Business Development, HPC
  • 2. Migrate entire HPC applications and datacenters to the cloud Use cloud capabilities to create entirely new HPC applications Augment on-premise HPC resources with cloud capacity How are Organizations Using Cloud for HPC?
  • 3. • Security: Deploy applications and store data in a secure, highly configurable VPC environment • Agility: Deploy the right infrastructure for each technical computing job, at the right time • Scalability: Add and subtract servers in minutes to optimize time-to-results • Cost Savings: Pay only for what you use, don’t pay for idle or outdated servers Why AWS for High-Performance Computing?
  • 4. Waste User/Customer Dissatisfaction Actual demand Predicted Demand Rigid On-Premise Resources Elastic Resources Actual demand Resources scaled to demand AWS for Agility
  • 5. On-Demand Pay for compute capacity by the hour with no long-term commitments For spiky workloads, or to define needs Many purchase models to support different needs Reserved Make a low, one-time payment and receive a significant discount on the hourly charge For committed utilization Spot Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand For time-insensitive or transient workloads Dedicated Launch instances within Amazon VPC that run on hardware dedicated to a single customer For highly sensitive or compliance related workloads Free Tier Get Started on AWS with free usage & no commitment For POCs and getting started
  • 6. Massive scale allows AWS to constantly reduce costs, while improving quality and reliability TCO of cloud is much lower than on-premise IT when all costs are considered Result? Large-scale datacenter-to-cloud migrations are in progress every day AWS for Scale
  • 7. Scalable Computing: Go From Just One Instance…
  • 8. To Thousands… in Just Minutes!
  • 9. [Chart axes: Memory (GiB), EC2 Compute Units] Small: 1.7 GB, 1 EC2 Compute Unit, 1 virtual core; Micro: 613 MB, up to 2 ECUs; Large: 7.5 GB, 4 EC2 Compute Units, 2 virtual cores; Extra Large: 15 GB, 8 EC2 Compute Units, 4 virtual cores; Hi-Mem XL: 17.1 GB, 6.5 EC2 Compute Units, 2 virtual cores; Hi-Mem 2XL: 34.2 GB, 13 EC2 Compute Units, 4 virtual cores; Hi-Mem 4XL: 68.4 GB, 26 EC2 Compute Units, 8 virtual cores; High-CPU Med: 1.7 GB, 5 EC2 Compute Units, 2 virtual cores; High-CPU XL: 7 GB, 20 EC2 Compute Units, 8 virtual cores; Cluster GPU 4XL: 22 GB, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs; Cluster Compute 4XL: 23 GB, 33.5 EC2 Compute Units; Medium: 3.7 GB, 2 EC2 Compute Units, 1 virtual core; High I/O 4XL: 60.5 GB, 35 EC2 Compute Units, 2 x 1024 GB SSD-based local instance storage; High Storage 8XL: 117 GB, 35 EC2 Compute Units, 24 x 2 TB instance store; Cluster High Mem 8XL: 89 EC2 Compute Units, 244 GB SSD instance storage; Cluster Compute 8XL: 60.5 GB, 88 EC2 Compute Units. Choose the Right Instance Type for the Job
  • 10. On-Premise Experiment infrequently Failure is expensive Less Innovation Cloud Experiment often Fail quickly at a low cost More Innovation $ Millions Nearly $0 AWS for Innovation
  • 11. Focus on innovation Leave the muck of infrastructure management to AWS http://eddie.niese.net/20090313/dont-pity-incompetence/
  • 12. • Engineering: CAD and CAE for aerospace, defense, structures, consumer products • Life Sciences: For basic research, drug discovery, genomics, and translational medicine • Energy and Geophysics: Including seismic processing, reservoir estimation, high-energy simulation, wind energy modeling, GIS • Financial Services and Insurance: Including valuation and risk analytics And Many More! HPC Applications Running on AWS Today
  • 13. HPC for Engineering Scalable Computing for CAD/CAE/EDA
  • 14. AWS for Engineering • Computer-Aided Design, Simulation, Analysis, Visualization – For development of commercial and military products – Aerospace, automotive, civil, construction, energy, others – Across industries, the trend is Simulation-Driven Design • Examples – Computer-Aided Design (CAD) including 3D models – Electronic Design Automation (EDA) – Computational Fluid Dynamics (CFD) – Finite Element Analysis (FEA) and Thermal Analysis – Crash Analysis – Failure and Hazard Analysis
  • 15. CFD for Turbine Engine Design • Time accurate fluid dynamics • SBIR-funded project for the US Air Force Research Laboratory (AFRL) • SAS 70 Type II certification and VPN-level access required • Additional security measures: – Uploaded and downloaded data was encrypted – Dedicated EC2 cluster instances were provisioned – Data was purged upon completion of the run “The results of this case were impressive. Using Amazon EC2 the large-scale, time accurate simulation was turned around in just 72 hours with computing infrastructure costs well below $1,000.” http://aws.amazon.com/solutions/case-studies/aerodynamic-solutions/
  • 16. • Commercial provider of mixed-signal ASICs for X-ray and gamma ray detection and imaging • Needs to perform very large Monte Carlo simulations using as many as 4000 server nodes • Computing workloads are highly variable, project-driven • Building an on-premise cluster to handle peak loads would be cost prohibitive • Solution: EC2 3rd-generation High-Memory instances • Up to 80% savings by using Spot instances on EC2 Radiation Simulation for ASIC Design
  • 17. 1) Customer Managed Application Hosting • Customer has account with AWS and manages infrastructure • Customer maintains traditional software vendor relationships • Software vendor offers license flexibility (BYOL) 2) Vendor Managed Hosting to Augment On-Premise Application • Client-Server model for acceleration of batch tasks • Customer pays software vendor for AWS-hosted services • Customer does not need to manage low-level infrastructure 3) Vendor Managed Software-as-a-Service • Pay-per-use, fully web-based including GUI Scenarios for Technical Software
  • 19. HPC for Life Sciences Customer Case-Studies
  • 20. And a rich history in Life Sciences
  • 21. AWS Public Data Sets • A centralized repository of public datasets • Seamless integration with cloud based applications • No charge to the community • Some of the datasets available today: – 1000 Genomes Project – Ensembl – GenBank – Illumina – Jay Flatley Human Genome Dataset – YRI Trio Dataset – The Cannabis Sativa Genome – UniGene – Influenza Virus – PubChem • Tell us what else you’d like for us to host …
  • 22. Open Source ecosystem • NCBI BLAST • Crossbow • CloudBurst • Myrna • Clovr • BioPerl Max • VIPDAC • Superfamily • Cloud-Coffee • BioNimbus • GMOD • CloudAligner • CRdata • SeqWare • Blend • StormSeq • BioConductor Get links to AMIs at: https://github.com/mndoci/mndoci.github.com/wiki/Life-Science-Apps-on-AWS MIT StarCluster Sun Grid Engine Condor Torque Slurm Rocks Chef Puppet
  • 23. The number of cluster nodes that can be added depends on the computational needs
  • 24. Remove constraints Capex, operational skills, processing limitations Focus on the problem Not the technical challenges of large compute clusters Achieve more Perform bigger, more complex jobs in a much reduced time Iterate around the problem Do more and afford to take more risks as cost of experimentation reduced Why AWS?
  • 25. Data Transfer • AWS Import/Export – Move large amounts of data into and out of AWS – Data Migration, Content Distribution, DR, etc. • AWS Direct Connect – Secure private link to AWS – 1 Gbps, 10 Gbps connectivity – You can also co-locate hardware in AWS DX locations • Bandwidth Optimization Solutions – Commercial providers – Aspera, Riverbed, Attunity, etc. – Open Source – Tsunami UDP, Globus Online AWS Direct Connect AWS Import/Export
  • 26. Relational Database Service Fully managed database (MySQL, Oracle, MSSQL) DynamoDB NoSQL, Schemaless, Provisioned throughput database S3 Object datastore up to 5TB per object 99.999999999% durability SimpleDB NoSQL, Schemaless Smaller datasets Redshift Petabyte scale data warehousing service Fully managed Storage Options
  • 27. 1.3 Trillion 835k peak transactions per second Objects in S3
  • 28. Glacier Long term cold storage From $0.01 per GB/Month 99.999999999% durability Archival “Every day our genome sequencers produce terabytes of data. As our company moves into the clinical space, we face a legal requirement to archive patient data for years that would drastically raise the cost of storage. Thanks to Amazon Glacier’s secure and scalable solution, we will be able to provide cost-effective, long-term storage and thereby eliminate a barrier to providing whole genome sequencing for medical treatment of cancer and other genetic diseases.” - Keith Raffel, Senior Vice President and Chief Commercial Officer, Complete Genomics
  • 29. Elastic MapReduce Managed, elastic Hadoop cluster Integrates with S3 & DynamoDB Leverage Hive & Pig analytics scripts Integrates with instance types such as Spot Application Services Feature Details Scalable Use as many or as few compute instances running Hadoop as you want. Modify the number of instances while your job flow is running Integrated with other services Works seamlessly with S3 as origin and output. Integrates with DynamoDB Comprehensive Supports languages such as Hive and Pig for defining analytics, and allows complex definitions in Cascading, Java, Ruby, Perl, Python, PHP, R, or C++ Cost effective Works with Spot instance types Monitoring Monitor job flows from within the management console Compute Storage AWS Global Infrastructure Database App Services Deployment & Administration Networking
  • 31. Crossbow • Align billions of reads and find SNPs – Reuse software components: Hadoop Streaming http://bowtie-bio.sourceforge.net/crossbow • Map: Bowtie (Langmead et al., 2009) – Find best alignment for each read – Emit (chromosome region, alignment) • Reduce: SOAPsnp (Li et al., 2009) – Scan alignments for divergent columns – Accounts for sequencing error, known SNPs • Shuffle: Hadoop – Group and sort alignments by region Searching for SNPs with Cloud Computing. Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL (2009) Genome Biology. 10:R134
  • 32. Worldwide research and development The Amazon Virtual Private Cloud was a unique option that offered an additional level of security and an ability to integrate with other aspects of our infrastructure. “AWS enables Pfizer’s WRD to explore specific difficult or deep scientific questions in a timely, scalable manner and helps Pfizer make better decisions more quickly” Dr. Michael Miller, Head of HPC for R&D, Pfizer
  • 33. Spiral Genetics • Alignment, Variant Calling, Annotation • Turnaround time – Targeted : less than 40 minutes – Exome : less than 2 hours – Whole Genome : less than 5 hours
  • 34. • Workflows can be easily defined and automated with integrated Galaxy Platform capabilities • Data movement is streamlined with integrated Globus file-transfer functionality • Resources can be provisioned on-demand with Amazon Web Services cloud based infrastructure Globus Genomics
  • 35. Proprietary and Confidential. ©2013 Syapse Syapse: Bringing Omics in Routine Medical Use Laboratory Testing Test Results Clinical Use Syapse Semantic Data Platform Syapse Omics Medical Record Application Syapse Physician Portal Application Syapse Discovery Application Syapse
  • 36. Leverage Spot instances in workflows 1 day's worth of effort resulted in 50% savings in cost Harvard Medical School The Laboratory of Personal Medicine Run EC2 clusters to analyze entire genomes “The AWS solution is stable, robust, flexible, and low cost. It has everything to recommend it.” Dr. Peter Tonellato, LPM, Center for Biomedical Informatics, Harvard Medical School
  • 37. Illumina BaseSpace • Data Analysis – Alignment, Assembly, QC, Analysis • Share data with colleagues • Access high quality and diverse datasets
  • 38. We are here to help Enterprise support Trusted Advisor Professional Services Sales and Solutions Architects
  • 39. Thank You Jafar Shameem (shameemj@amazon.com) David Pellerin (dpelleri@amazon.com) http://aws.amazon.com/
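
Slide 31 describes Crossbow as three stages: a map step that aligns each read and emits (chromosome region, alignment), a shuffle step that groups and sorts alignments by region, and a reduce step that scans alignments for divergent columns. The sketch below is only a toy, in-memory illustration of that data flow. It does not use Bowtie, SOAPsnp, or Hadoop; the function names, the `REGION_SIZE` bin width, and the `min_support` threshold are all hypothetical, and the ungapped mismatch count stands in for a real aligner. As a further simplification, reads are binned by start position, so a column covered by reads from two different regions would have its votes split.

```python
from collections import defaultdict

REGION_SIZE = 4  # hypothetical bin width in bases; real pipelines use far larger regions


def map_read(read, reference):
    """Map: find the best ungapped placement of a read, emit (region, alignment)."""
    best_pos, best_mismatches = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches < best_mismatches:  # ties keep the leftmost placement
            best_pos, best_mismatches = pos, mismatches
    return best_pos // REGION_SIZE, (best_pos, read)


def shuffle(pairs):
    """Shuffle: group alignments by region and sort them within each region."""
    groups = defaultdict(list)
    for region, alignment in pairs:
        groups[region].append(alignment)
    return {region: sorted(groups[region]) for region in sorted(groups)}


def reduce_region(alignments, reference, min_support=2):
    """Reduce: report columns where at least min_support reads agree on a
    base that differs from the reference (a crude stand-in for SNP calling)."""
    votes = defaultdict(lambda: defaultdict(int))
    for pos, read in alignments:
        for offset, base in enumerate(read):
            if base != reference[pos + offset]:
                votes[pos + offset][base] += 1
    return [
        (column, reference[column], base)
        for column in sorted(votes)
        for base, count in sorted(votes[column].items())
        if count >= min_support
    ]


def call_snps(reads, reference):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    grouped = shuffle(map_read(read, reference) for read in reads)
    snps = []
    for region in grouped:
        snps.extend(reduce_region(grouped[region], reference))
    return snps
```

Calling `call_snps(reads, reference)` with a list of read strings and a reference string returns tuples of (position, reference base, observed base). Requiring `min_support` agreeing reads is a very rough nod to the slide's point that the reduce step accounts for sequencing error; a single mismatching read is treated as noise.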