© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
DAT204
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes with MongoDB & AWS
World leader in serving science
Revenues of $17 billion
50,000 employees
50 countries
A Mass Spectrometer tells you…
What’s in there and how much
Making the world healthier, cleaner and safer
Mars Organic Molecule Analyzer (MOMA) will take a modified Thermo Linear Ion Trap Mass Spectrometer to Mars in 2020
What beer looks like in a mass spec
Demo: Instrument Connect
[Diagram: MS instrument, Instrument Connect, MongoDB]
Demo: remote monitoring a mass spectrometer
Why does Thermo use MongoDB?
ThermoFisher apps using MongoDB
• XML → MongoDB
• Oracle → MongoDB
• SQLite → MongoDB
• Postgres → MongoDB
• Amazon DynamoDB → MongoDB Atlas
• Starting on MongoDB
Scientific apps = humongous data
Big molecules = big data
instrument {
    UserId : "dr.ennis@poldark.net",
    MachineName : "TRACEFINDER8",
    Location : "Austin",
    AcquisitionStationName : "TSQ 8000",
    LastErrorEventDate : "2016-09-05",
    LastErrorEventValue : null,
    RuntimeEstimate : {
        MeasuredElaspedDuration : 0.21966,
        Confidence : "HighConfidence"
    },
    RunManagerStatus : {
        Status : "Acquire",
        Sequence : "Testosterone",
        SampleName : "Drugx",
        VialPosition : "1",
        Rawfile : "2pg_161029205505",
        Instmethod : "1x.meth",
        Instrument : "TSQ 8000",
        IsPaused : false,
        Operator : "Fred"
    }
}
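As a rough illustration of how a nested status document like the one above can be addressed, the sketch below models a trimmed copy of it as a plain Python dictionary and resolves dotted field paths, in the spirit of MongoDB's dot notation for embedded fields. The helper `get_path` is hypothetical and written only for this example; it is not part of any MongoDB driver.

```python
# A trimmed copy of the instrument status document as a Python dict.
instrument = {
    "UserId": "dr.ennis@poldark.net",
    "MachineName": "TRACEFINDER8",
    "Location": "Austin",
    "AcquisitionStationName": "TSQ 8000",
    "LastErrorEventValue": None,
    "RunManagerStatus": {
        "Status": "Acquire",
        "SampleName": "Drugx",
        "IsPaused": False,
    },
}


def get_path(doc, path, default=None):
    """Resolve a dotted path such as 'RunManagerStatus.Status' (hypothetical helper)."""
    current = doc
    for key in path.split("."):
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


status = get_path(instrument, "RunManagerStatus.Status")
paused = get_path(instrument, "RunManagerStatus.IsPaused")
missing = get_path(instrument, "RunManagerStatus.Nope", default="n/a")
print(status, paused, missing)
```

With a real driver, the same dotted path would typically appear in a query filter, for example `{"RunManagerStatus.Status": "Acquire"}`.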
Why MongoDB was chosen
• Performance
• Developer productivity
• Cost effective
• Runs anywhere
• Rich feature set
• Achieved legal and regulatory approval
MongoDB is a Swiss army knife
• Hierarchical data
• Relational data
• Queues
• File storage
• Device state
Join example
• Version 3.2 introduced the $lookup operator
• SQL query
• MongoDB C# driver query
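The SQL and C# queries shown on this slide are not reproduced in the transcript. As a stand-in, the sketch below writes a `$lookup` stage as a Python dictionary and emulates its equality-match behaviour on in-memory lists. The collection names, field names, and sample data are hypothetical, and the `lookup` function is an illustration of the join semantics, not the server's implementation.

```python
# The $lookup stage as it would be passed to an aggregation pipeline.
# Collection and field names here are hypothetical.
pipeline = [
    {
        "$lookup": {
            "from": "compounds",
            "localField": "compound_id",
            "foreignField": "_id",
            "as": "compound",
        }
    }
]


def lookup(left_docs, right_docs, local_field, foreign_field, as_field):
    """Emulate the equality-match form of $lookup on in-memory lists."""
    joined = []
    for doc in left_docs:
        matches = [
            other for other in right_docs
            if other.get(foreign_field) == doc.get(local_field)
        ]
        # Like a left outer join: keep every left document and attach
        # the matches as an array, which may be empty.
        joined.append({**doc, as_field: matches})
    return joined


# Made-up sample data for the illustration.
samples = [
    {"_id": 1, "name": "Drugx", "compound_id": 10},
    {"_id": 2, "name": "Blank", "compound_id": 99},
]
compounds = [{"_id": 10, "formula": "C19H28O2"}]

stage = pipeline[0]["$lookup"]
result = lookup(samples, compounds,
                stage["localField"], stage["foreignField"], stage["as"])
```

Each input document survives the join; a document with no match simply carries an empty array in the `as` field.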
MongoDB has caught up to relational DBs
"Notably, we show that the MUPG (match, unwind, project, group) fragment is already at least as expressive as full relational algebra over (the relational view of) a single collection, and in particular able to express arbitrary joins."
– Bolzano University in Italy
The evolution of MongoDB: headline features by release
1.0 (2009)
2.4 (GA 2013): Hash-Based Sharding; Roles; Kerberos; On-Prem Monitoring
2.6 (GA 2014): $out; Index Intersection; Text Search; Field-Level Redaction; LDAP & x509; Auditing
3.0 (GA 2015): Doc-Level Concurrency; Compression; WiredTiger Storage; ≤50 replicas; Auditing ++; Ops Manager
3.2 (GA 2015): Document Validation; $lookup; Fast Failover; Simpler Scalability; Aggregation ++; Encryption At Rest; In-Memory Storage Engine; BI Connector; MongoDB Compass; APM Integration; Profiler Visualization; Auto Index Builds; Backups to File System
3.4 (GA 2016): Linearizable reads; Intra-cluster compression; Views; Log Redaction; Graph Processing; Decimal; Collations; Faceted Navigation; Spark Connector ++; Zones ++; Aggregation ++; Auto-balancing ++; ARM, Power, zSeries; BI Connector ++; Compass ++; Hardware Monitoring; Server Pool; LDAP Authorization; Encrypted Backups; Cloud Foundry Integration
Atlas
MySQL vs. MongoDB: database schema
[Figures: MySQL schema; MongoDB schema]
Inserting data: MongoDB vs. MySQL
• Inserting 1,615 chemical compound records into two parent-child tables.
• To optimize the MySQL insert, we turned off foreign keys during the insert and used a string builder to create a bulk insert SQL statement. This improved insert performance by a factor of 360.
• Compared against MongoDB.
Database | Milliseconds | Lines of code
MySQL not optimized | 147,600 (2.5 minutes) | 21
MySQL optimized | 410 | 40
MongoDB | 68 | 1
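The "string builder" optimisation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's code: Python's built-in `sqlite3` stands in for MySQL, the table and column names are hypothetical, and only 100 made-up rows are used so the statement stays within SQLite's bound-parameter limits on older builds.

```python
import sqlite3


def build_bulk_insert(table, columns, row_count):
    """Return one multi-row INSERT statement with a placeholder per value."""
    row = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(row for _ in range(row_count))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


# Hypothetical sample data standing in for the compound records.
compounds = [(i, f"compound-{i}") for i in range(1, 101)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT)")

sql = build_bulk_insert("compound", ["id", "name"], len(compounds))
# Flatten the rows so the parameters line up with the placeholders.
params = [value for row in compounds for value in row]

# A single statement in a single transaction replaces one round trip per row.
with conn:
    conn.execute(sql, params)

inserted = conn.execute("SELECT COUNT(*) FROM compound").fetchone()[0]
```

Binding values through placeholders rather than concatenating them into the SQL text keeps the bulk statement safe from injection while still sending all rows at once.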
Selecting data: MongoDB vs. MySQL
• Query 600,000 rows of SampleCompound result data.
• To optimize the MySQL select query, we created a dictionary to look up child records for each parent. This improved performance by a factor of 300. Optimization effort: 2 engineers and 2 weeks.
Database | Seconds | Lines of code
MySQL not optimized | 2,400 (about 40 minutes) | 20
MySQL optimized | 8.2 | 29
MongoDB | 17.5 | 7
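The dictionary optimisation mentioned above, grouping child records by parent once instead of searching for them per parent, might look something like the sketch below. The function name, the field names, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def attach_children(parents, children, parent_key="id", child_fk="parent_id"):
    """Group child rows by parent id once, then attach them to each parent."""
    by_parent = defaultdict(list)
    for child in children:
        by_parent[child[child_fk]].append(child)
    # One dictionary lookup per parent replaces a scan or query per parent.
    return [
        {**parent, "children": by_parent.get(parent[parent_key], [])}
        for parent in parents
    ]


# Made-up rows standing in for the parent and child tables.
samples = [{"id": 1, "name": "S1"}, {"id": 2, "name": "S2"}]
sample_compounds = [
    {"parent_id": 1, "compound": "A"},
    {"parent_id": 1, "compound": "B"},
    {"parent_id": 2, "compound": "C"},
]

nested = attach_children(samples, sample_compounds)
```

The result has the same parent-with-embedded-children shape that a document database stores directly, which is one way to read the difference in lines of code in the table.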
Update: MongoDB vs. MySQL
Migrating to MongoDB reduced code by about 3.4x
 | SQLite | MongoDB
Data layer lines of code | 4,271 | 1,260
MongoDB compared to DynamoDB
MongoDB | DynamoDB
Anywhere | AWS
Rich Ad-hoc Query Language + IDE | No Ad-hoc query language
Many operators (Joins, Aggregation, etc.) | Fewer operators
Excellent Performance | Excellent Performance
Easy to deploy (with Atlas) | Easy to deploy each table
Adding tables requires no configuration changes | Adding tables requires additional configuration and cost
Easy to use from AWS services but not natively integrated | Native integration with AWS Services: IAM, VPC, Lambda, Kinesis
Released in 2009 | Released in 2012
MongoDB vs. S3 performance
Downloading a 220 KB object from MongoDB was 7x faster than from S3 when cold, and 3x faster when warm
 | MongoDB | Amazon S3
Retrieve document first time | 68 ms | 468 ms
Retrieve document second time | 13 ms | 38 ms
MongoDB vs. S3 performance
MongoDB was 11x faster than S3 for partial document loading
 | MongoDB | S3
Data size | 400 bytes | 2.1 MB
Performance | 19 ms | 214 ms
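The headline multipliers on the two S3 slides follow directly from the timings in the tables. A small sketch of that arithmetic, using only the numbers quoted above (the dictionary labels are ours):

```python
# Timings in milliseconds, copied from the two S3 comparison slides
# as (MongoDB, S3) pairs.
timings = {
    "220 KB object, first retrieval": (68, 468),
    "220 KB object, second retrieval": (13, 38),
    "partial document load": (19, 214),
}


def speedup(mongodb_ms, s3_ms):
    """How many times faster the MongoDB read was than the S3 read."""
    return s3_ms / mongodb_ms


ratios = {name: speedup(*pair) for name, pair in timings.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

Rounded to whole numbers, the three ratios match the 7x, 3x, and 11x figures in the slide titles.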
Reducing processing from days to minutes
Frameworks used to parallelize algorithms
• AWS Lambda
• Docker and Amazon ECS
• Spark and Amazon EMR (Elastic MapReduce)
Parallel data processing
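The frameworks listed above share a common pattern: split the data into independent chunks, process the chunks concurrently, and combine the partial results. The sketch below shows that pattern with a local thread pool from Python's standard library as a stand-in for Lambda functions, ECS tasks, or Spark executors. The function names and the sum-of-squares workload are placeholders, not the algorithms used in the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Placeholder workload: each chunk is processed independently."""
    return sum(x * x for x in chunk)


def run_parallel(items, chunk_size=4, workers=4):
    """Fan chunks out to workers, then combine the partial results."""
    chunks = chunked(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


total = run_parallel(list(range(10)))
```

Because each chunk is independent, the same split, map, and combine structure carries over when the workers are remote functions or containers rather than local threads.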
Why Atlas?
• Easy
• Performant
• Seamless Migration
• Robust
• No downtime, even when scaling up
Building MongoDB Atlas on Amazon Web Services
Operations burden
PATCHES
UPGRADES
SECURITY
BACKUPS
RECOVERY
99.999% UPTIME
UPSCALE
DOWNSCALE
PERFORMANCE
UAT
STAGING
MONITORING
ALERTS
PROVISION
CONFIGURE
INSTALL
Database as a service for MongoDB
• Automated
• Available On-Demand
• Secure
• Highly Available
• Automated Backups
• Elastically Scalable
Fully managed MongoDB clusters
Customer only needs to choose the shape and size of the cluster
● Instance size (CPU and RAM)
● Replication factor
● Number of shards
● Disk space
● Disk speed
Screenshot of create dialog
Cluster features
Security features
● VPC peering
● IP address whitelist
● SCRAM-SHA-1 authentication
    readWriteAnyDatabase
    enableSharding
    clusterMonitor
● SSL
    Using well-known CA
    Trust system CAs by default
Key components
● Backup
● Automation
● Monitoring
Customer container with replica set
[Diagram: AWS Account X, Region Y; VPC (Customer N) spanning Availability Zones A, B and C with Subnets A, B and C; one mongod on port 27017 in each subnet]
Customer container with sharded cluster
[Diagram: AWS Account X, Region Y; VPC (Customer N) spanning Availability Zones A, B and C with Subnets A, B and C; each zone hosts members of shard0, shard1 and shard2 plus a config server, some members marked S; mongod on port 27017 in each subnet]
IP firewall using security groups
One security group per VPC, applied to all Amazon EC2 instances.
Three classes of security rules:
● MongoDB traffic between cluster members
● MongoDB traffic between application and clusters
● SSH traffic between production support jump box and EC2 instance
[Diagram: App Server and Jump Box; CIDR labels 172.31.248.0/21 and 10.0.0.0/16]
VPC peering
● Your VPC: Elastic LB, CIDR Block: 10.0.0.0/16
● Atlas VPC: AZ 1, AZ 2, AZ 3, CIDR Block: 172.31.248.0/21
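VPC peering requires that the two VPCs use non-overlapping CIDR blocks. A quick way to check the two blocks quoted on this slide is Python's standard `ipaddress` module; the helper name `can_peer` is ours and only illustrates the overlap test, not any AWS or Atlas API.

```python
import ipaddress

# CIDR blocks as quoted on the VPC peering slide.
your_vpc = ipaddress.ip_network("10.0.0.0/16")
atlas_vpc = ipaddress.ip_network("172.31.248.0/21")


def can_peer(a, b):
    """Two VPCs are peering candidates only if their CIDR blocks do not overlap."""
    return not a.overlaps(b)


peerable = can_peer(your_vpc, atlas_vpc)
atlas_addresses = atlas_vpc.num_addresses
print(peerable, atlas_addresses)
```

The same check is worth running before requesting a peering connection, since an application VPC whose range overlaps the Atlas block cannot be peered with it.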
"We want Prime to be such a good value, you’d be irresponsible not to be a member."
– Jeff Bezos
Migrate to MongoDB Atlas today!
Use promo code
getAtlas
*$100 Value
Questions?
Thank you!
Remember to complete your evaluations!


AWS re:Invent 2016: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes with MongoDB & AWS (DAT204)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. DAT204 How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes with MongoDB & AWS
  • 2. World leader in serving science Revenues of $17 billion 50,000 employees 50 countries
  • 3. A Mass Spectrometer tells you… What’s in there and how much
  • 4.
  • 5. Making the world healthier, cleaner and safer
  • 6. Mars Organic Molecule Analyzer (MOMA) will take a modified Thermo Linear Ion Trap Mass Spectrometer to Mars in 2020
  • 7.
  • 8. What beer looks like in a mass spec
  • 9.
  • 10.
  • 11. Demo
  • 13. Demo: remote monitoring a mass spectrometer
  • 14. Why does Thermo use MongoDB?
  • 15. ThermoFisher apps using MongoDB: XML → MongoDB; Starting on MongoDB; Oracle → MongoDB; SQLite → MongoDB; Postgres → MongoDB; Amazon DynamoDB → MongoDB Atlas
  • 16. Scientific apps = humongous data
  • 17. Big molecules = big data
  • 18. instrument {
          UserId : "dr.ennis@poldark.net",
          MachineName : "TRACEFINDER8",
          Location : "Austin",
          AcquisitionStationName : "TSQ 8000",
          LastErrorEventDate : "2016-09-05",
          LastErrorEventValue : null,
          RuntimeEstimate : {
            MeasuredElaspedDuration : 0.21966,
            Confidence : HighConfidence
          },
          RunManagerStatus : {
            Status : "Acquire",
            Sequence : "Testosterone",
            SampleName : "Drugx",
            VialPosition : "1",
            Rawfile : "2pg_161029205505",
            Instmethod : "1x.meth",
            Instrument : "TSQ 8000",
            IsPaused : false,
            Operator : "Fred"
          }
        }
        Why MongoDB was chosen
        • Performance
        • Developer productivity
        • Cost effective
        • Runs anywhere
        • Rich feature set
        • Achieved legal and regulatory approval
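The instrument document above nests run state under RunManagerStatus and RuntimeEstimate. As a minimal sketch, the Python below holds a subset of that document as a plain dict and resolves dotted paths, the notation MongoDB queries use for embedded fields. The helpers `get_path` and `matches` are hypothetical local stand-ins that handle scalar equality only; they are not part of any driver. Storing `Confidence` as a string is also an assumption, since the slide shows the value unquoted.

```python
# Subset of the slide's instrument document as a plain Python dict.
instrument = {
    "UserId": "dr.ennis@poldark.net",
    "MachineName": "TRACEFINDER8",
    "Location": "Austin",
    "AcquisitionStationName": "TSQ 8000",
    "RuntimeEstimate": {
        "MeasuredElaspedDuration": 0.21966,  # field name spelled as on the slide
        "Confidence": "HighConfidence",      # assumption: stored as a string
    },
    "RunManagerStatus": {
        "Status": "Acquire",
        "Sequence": "Testosterone",
        "IsPaused": False,
        "Operator": "Fred",
    },
}


def get_path(doc, dotted_path, default=None):
    """Hypothetical helper: resolve 'A.B.C' against nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def matches(doc, query_filter):
    """Local approximation of an equality filter on scalar fields only."""
    return all(get_path(doc, path) == value for path, value in query_filter.items())


# A filter of this shape is what a driver's find() call would receive.
status_filter = {"RunManagerStatus.Status": "Acquire", "Location": "Austin"}
is_acquiring_in_austin = matches(instrument, status_filter)
```

Keeping the whole instrument state in one document means a remote-monitoring view can read it with a single lookup rather than a join across status tables.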
  • 19. MongoDB is a Swiss army knife • Hierarchical data • Relational data • Queues • File storage • Device state
  • 20. Join example • Version 3.2 introduced the $lookup operator • SQL query • MongoDB C# driver query
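The slide's SQL and C# queries are not in the transcript, so this is only a sketch of what a `$lookup` stage looks like when built as a Python data structure. The collection and field names (`compounds`, `sample_results`, `compound_id`, `results`) are hypothetical. `$lookup` performs a left outer join, so the SQL in the comment is a rough equivalent rather than the query shown in the talk.

```python
def build_lookup_pipeline(from_collection, local_field, foreign_field, as_field):
    """Return an aggregation pipeline containing a single $lookup stage."""
    return [
        {
            "$lookup": {
                "from": from_collection,        # collection to join against
                "localField": local_field,      # field on the input documents
                "foreignField": foreign_field,  # field on the joined collection
                "as": as_field,                 # output array of matched documents
            }
        }
    ]


# Rough SQL equivalent (hypothetical tables):
#   SELECT * FROM compounds c
#   LEFT OUTER JOIN sample_results r ON r.compound_id = c._id
pipeline = build_lookup_pipeline("sample_results", "_id", "compound_id", "results")

# With PyMongo installed and a database handle `db`, this would run as:
#   db.compounds.aggregate(pipeline)
```

Each output document keeps its own fields and gains a `results` array holding the matching child documents. The array is empty when nothing matches.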
  • 21. MongoDB has caught up to relational DBs Notably, we show that the MUPG (match, unwind, project, group) fragment is already at least as expressive as full relational algebra over (the relational view of) a single collection, and in particular able to express arbitrary joins. – Bolzano University in Italy
  • 22. The evolution of MongoDB: headline features by release
        1.0 (2009)
        2.4 GA (2013): Hash-Based Sharding; Roles; Kerberos; On-Prem Monitoring
        2.6 GA (2014): $out; Index Intersection; Text Search; Field-Level Redaction; LDAP & x509; Auditing
        3.0 GA (2015): Doc-Level Concurrency; Compression; WiredTiger Storage; ≤50 replicas; Auditing ++; Ops Manager
        3.2 GA (2015): Document Validation; $lookup; Fast Failover; Simpler Scalability; Aggregation ++; Encryption At Rest; In-Memory Storage Engine; BI Connector; MongoDB Compass; APM Integration; Profiler Visualization; Auto Index Builds; Backups to File System
        3.4 GA (2016): Linearizable reads; Intra-cluster compression; Views; Log Redaction; Graph Processing; Decimal; Collations; Faceted Navigation; Spark Connector ++; Zones ++; Aggregation ++; Auto-balancing ++; ARM, Power, zSeries; BI Connector ++; Compass ++; Hardware Monitoring; Server Pool; LDAP Authorization; Encrypted Backups; Cloud Foundry Integration
        Atlas
  • 25. Inserting data: MongoDB vs. MySQL
        • Inserting 1,615 chemical compound records into two parent-child tables.
        • To optimize the MySQL query, we turned off foreign keys during insert and used a string builder to create a bulk insert SQL statement. This improved insert performance by a factor of 360.
        • Compare to MongoDB.
        Database | Milliseconds | Lines of code
        MySQL not optimized | 147,600 (2.5 minutes) | 21
        MySQL optimized | 410 | 40
        MongoDB | 68 | 1
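The factor of 360 can be checked directly from the insert timings on this slide. The short Python sketch below does only that arithmetic. The `speedup` helper is a name introduced here, and the MongoDB comparison is a ratio derived from the slide's numbers, not a figure the slide states.

```python
# Insert timings from the slide, in milliseconds.
timings_ms = {
    "MySQL not optimized": 147_600,
    "MySQL optimized": 410,
    "MongoDB": 68,
}


def speedup(slow_ms, fast_ms):
    """How many times faster the second timing is than the first."""
    return slow_ms / fast_ms


mysql_gain = speedup(timings_ms["MySQL not optimized"], timings_ms["MySQL optimized"])
mongo_vs_optimized = speedup(timings_ms["MySQL optimized"], timings_ms["MongoDB"])
unoptimized_minutes = timings_ms["MySQL not optimized"] / 1000 / 60

print(f"MySQL optimization gain: {mysql_gain:.0f}x")           # 360x
print(f"MongoDB vs. optimized MySQL: {mongo_vs_optimized:.1f}x")  # 6.0x
print(f"Unoptimized MySQL: {unoptimized_minutes:.1f} minutes")    # 2.5 minutes
```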
  • 27. Selecting data: MongoDB vs. MySQL
        • Query 600,000 rows of SampleCompound result data.
        • To optimize the MySQL select query, we created a dictionary to look up child records for each parent. This improved performance by a factor of 300. Optimization effort: 2 engineers and 2 weeks.
        Database | Seconds | Lines of code
        MySQL not optimized | 2,400 (40 minutes) | 20
        MySQL optimized | 8.2 | 29
        MongoDB | 17.5 | 7
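The dictionary optimization described on this slide can be sketched as follows. This illustrates the general technique, not the team's code, which the transcript does not show. The function name, key names, and sample rows are all hypothetical. Grouping child rows by their foreign key in one pass costs one pass over each list, instead of rescanning every child row for every parent.

```python
from collections import defaultdict


def attach_children(parents, children, parent_key="id", child_fk="parent_id"):
    """Group child rows by foreign key once, then attach them to each parent."""
    children_by_parent = defaultdict(list)
    for child in children:
        children_by_parent[child[child_fk]].append(child)
    # .get() avoids creating empty entries for parents without children.
    return [
        {**parent, "children": children_by_parent.get(parent[parent_key], [])}
        for parent in parents
    ]


# Hypothetical sample rows standing in for the parent and child tables.
samples = [{"id": 1, "name": "S1"}, {"id": 2, "name": "S2"}]
compounds = [
    {"parent_id": 1, "compound": "Testosterone"},
    {"parent_id": 1, "compound": "Drugx"},
]

joined = attach_children(samples, compounds)
```

The result has the same parent-with-embedded-children shape that a document store returns directly, which is one plausible reason the MongoDB version needed fewer lines of code.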
  • 29. Migrating to MongoDB reduced code by 3.5x
        Data Layer | Lines of Code
        SQLite | 4271
        MongoDB | 1260
  • 30. MongoDB compared to DynamoDB
        MongoDB | DynamoDB
        Anywhere | AWS
        Rich ad-hoc query language + IDE | No ad-hoc query language
        Many operators (joins, aggregation, etc.) | Fewer operators
        Excellent performance | Excellent performance
        Easy to deploy (with Atlas) | Easy to deploy each table
        Adding tables requires no configuration changes | Adding tables requires additional configuration and cost
        Easy to use from AWS services but not natively integrated | Native integration with AWS services: IAM, VPC, Lambda, Kinesis
        Released in 2009 | Released in 2012
  • 31. MongoDB vs. S3 performance: downloading a 220 KB object from MongoDB was 7x faster cold and 3x faster warm
        Operation | MongoDB | Amazon S3
        Retrieve document first time | 68 ms | 468 ms
        Retrieve document second time | 13 ms | 38 ms
  • 32. MongoDB vs. S3 performance: MongoDB 11x faster than S3 in the use case of partial document loading
        Measure | MongoDB | S3
        Data size | 400 bytes | 2.1 MB
        Performance | 19 ms | 214 ms
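The slide does not say how the partial load was implemented. One common way to fetch part of a document from MongoDB is a projection, so the sketch below assumes that approach. The document contents, the `runs` collection name, and the local `project` helper are hypothetical. The helper only imitates an inclusion projection on top-level fields so the example runs without a database.

```python
def project(doc, fields):
    """Local stand-in for an inclusion projection on top-level fields."""
    return {name: doc[name] for name in fields if name in doc}


# Hypothetical large run document: a small status field plus a big payload.
big_doc = {
    "_id": "2pg_161029205505",
    "Status": "Acquire",
    "Spectra": [0.0] * 100_000,
}

query_filter = {"_id": big_doc["_id"]}
projection = {"Status": 1, "_id": 0}  # include Status, suppress _id

# With PyMongo installed and a database handle `db`, the server-side call would be:
#   db.runs.find_one(query_filter, projection)
included = [name for name, flag in projection.items() if flag]
small_doc = project(big_doc, included)
```

Only the requested field crosses the network. That matches the slide's comparison of a 400-byte transfer against a 2.1 MB transfer.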
  • 34. Frameworks used to parallelize algorithms • AWS Lambda • Docker and Amazon ECS • Spark and Amazon EMR (Elastic MapReduce)
  • 36. Why Atlas? • Easy • Performant • Seamless Migration • Robust • No downtime, even when scaling up
  • 37. Building MongoDB Atlas on Amazon Web Services
  • 39. Database as a service for MongoDB: Automated; Available On-Demand; Secure; Highly Available; Automated Backups; Elastically Scalable
  • 40. Cluster features: Fully managed MongoDB clusters. Customer only needs to choose the shape and size of the cluster ● Instance size (CPU and RAM) ● Replication factor ● Number of shards ● Disk space ● Disk speed (Screenshot of create dialog)
  • 41. Security features: VPC peering; IP address whitelist; SCRAM-SHA-1 authentication (readWriteAnyDatabase, enableSharding, clusterMonitor); SSL (using well-known CA, trust system CAs by default)
  • 43. Customer container with replica set: AWS Account X, Region Y; VPC (Customer N); Availability Zones A, B, and C, each with its own subnet (A, B, C) running one mongod on port 27017
  • 44. Customer container with sharded cluster: AWS Account X, Region Y; VPC (Customer N); Availability Zones A, B, and C, each with its own subnet (A, B, C) holding a member of shard0, shard1, and shard2 plus a config server
  • 45. IP firewall using security groups: One security group per VPC, applied to all Amazon EC2 instances. Three classes of security rules: ● MongoDB traffic between cluster members ● MongoDB traffic between application and clusters ● SSH traffic between production support jump box and EC2 instance (Diagram: app server and jump box connecting to three mongod instances on port 27017)
  • 46. VPC peering: Your VPC (Elastic LB), CIDR block 10.0.0.0/16, peered with the Atlas VPC (AZ 1, AZ 2, AZ 3), CIDR block 172.31.248.0/21
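VPC peering only works when the two VPCs' CIDR blocks do not overlap. The two blocks on this slide can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the `can_peer` helper is a name introduced here and covers only the overlap condition, not the other peering requirements.

```python
import ipaddress

# CIDR blocks taken from the slide.
your_vpc = ipaddress.ip_network("10.0.0.0/16")
atlas_vpc = ipaddress.ip_network("172.31.248.0/21")


def can_peer(network_a, network_b):
    """Peering requires that the two CIDR blocks do not overlap."""
    return not network_a.overlaps(network_b)


peering_possible = can_peer(your_vpc, atlas_vpc)
atlas_address_count = atlas_vpc.num_addresses

print(peering_possible)      # True
print(atlas_address_count)   # 2048
```

A VPC that already uses 172.31.248.0/21, or a range containing it, would fail this check and need a different block before peering.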
  • 47.
  • 48. We want Prime to be such a good value, you’d be irresponsible not to be a member. – Jeff Bezos
  • 49. Migrate to MongoDB Atlas today! Use promo code getAtlas *$100 Value
  • 50.