SlideShare una empresa de Scribd logo
1 de 32
Descargar para leer sin conexión
Optimizing MongoDB:
Lessons Learned at Localytics

          Andrew Rollins
            June 2011
           MongoNYC
Me

•   Email: my first name @ localytics.com
•   twitter.com/andrew311
•   andrewrollins.com
•   Founder, Chief Software Architect at Localytics
Localytics

• Real time analytics for mobile applications
• Built on:
  –   Scala
  –   MongoDB
  –   Amazon Web Services
  –   Ruby on Rails
  –   and more…
Why I‟m here: brain dump!

• To share tips, tricks, and gotchas about:
  –   Documents
  –   Indexes
  –   Fragmentation
  –   Migrations
  –   Hardware
  –   MongoDB on AWS
• Basic to more advanced, a compliment to
  MongoDB Perf Tuning at MongoSF 2011
MongoDB at Localytics

• Use cases:
  – Anonymous loyalty information
  – De-duplication of incoming data
• Requirements:
  – High throughput
  – Add capacity without long down-time
• Scale today:
  – Over 1 billion events tracked in May
  – Thousands of MongoDB operations a second
Why MongoDB?

•   Stability
•   Community
•   Support
•   Drivers
•   Ease of use
•   Feature rich
•   Scale out
OPTIMIZE YOUR DATA
Documents and indexes
Shorten names

Bad:
{
    super_happy_fun_awesome_name: “yay!”
}

Good:
{
    s: “yay!”
}
Use BinData for UUIDs/hashes

Bad:
{
    u: “21EC2020-3AEA-1069-A2DD-08002B30309D”,
    // 36 bytes plus field overhead
}

 Good:
{
    u: BinData(0, “…”),
    // 16 bytes plus field overhead
}
Override _id

Turn this
{
    _id : ObjectId("47cc67093475061e3d95369d"),
    u: BinData(0, “…”) // <- this is uniquely indexed
 }
 into
{
    _id : BinData(0, “…”) // was the u field
}

Eliminated an extra index, but be careful about
locality... (more later, see Further Reading at end)
Pack „em in

• Look for cases where you can squish multiple
  “records” into a single document.
• Why?
  – Decreases number of index entries
  – Brings documents closer to the size of a page,
    alleviating potential fragmentation
• Example: comments for a blog post.
Prefix Indexes
Suppose you have an index on a large field, but that field doesn‟t have
many possible values. You can use a “prefix index” to greatly decrease
index size.

find({k: <kval>})
{
    k: BinData(0, “…”),   // 32 byte SHA256, indexed
 }
into find({p: <prefix>, k: <kval>})
{
    k: BinData(0, “…”),   // 28 byte SHA256 suffix, not indexed
    p: <32-bit integer>   // first 4 bytes of k packed in integer, indexed
}

Example: git commits
FRAGMENTATION AND MIGRATION
Hidden evils
Fragmentation

• Data on disk is memory mapped into RAM.
• Mapped in pages (4KB usually).
• Deletes/updates will cause memory
  fragmentation.


    Disk                        RAM
    doc1                        doc1
                 find(doc1)                 Page
   deleted                     deleted
     …                           …
New writes mingle with old data

                     Data
                     doc1
                                  Page
Write docX           docX
                     doc3
                     doc4         Page
                     doc5

find(docX) also pulls in old doc1, wasting RAM
Dealing with fragmentation

• “mongod --repair” on a secondary, swap with
  primary.
• 1.9 has in-place compaction, but this still holds a
  write-lock.
• MongoDB will auto-pad records.
• Pad records yourself by including and then
  removing extra bytes on first insert.
   – Alternative offered in SERVER-1810.
The Dark Side of Migrations

• Chunks are a logical construct, not physical.
• Shard keys have serious implications.
• What could go wrong?
  – Let‟s run through an example.
Suppose the following

     Chunk 1     • K is the shard key
     k: 1 to 5
                 • K is random
     Chunk 2
     k: 6 to 9


     Shard 1
    {k: 3, …}     1st write
    {k: 9, …}     2nd write
    {k: 1, …}     and so on
    {k: 7, …}
    {k: 2, …}
    {k: 8, …}
Migrate

     Chunk 1                 Chunk 1
     k: 1 to 5               k: 1 to 5

     Chunk 2
     k: 6 to 9


    Shard 1
                             Shard 2
    {k: 3, …}
                             {k: 3, …}
    {k: 9, …}    Random IO
                             {k: 1, …}
    {k: 1, …}
                             {k: 2, …}
    {k: 7, …}
    {k: 2, …}
    {k: 8, …}
Shard 1 is now heavily fragmented

     Chunk 1                  Chunk 1
     k: 1 to 5                k: 1 to 5

     Chunk 2
     k: 6 to 9


     Shard 1
                              Shard 2
     {k: 3, …}
                              {k: 3, …}
     {k: 9, …}
                              {k: 1, …}
     {k: 1, …}   Fragmented
                              {k: 2, …}
     {k: 7, …}
     {k: 2, …}
     {k: 8, …}
Why is this scenario bad?

• Random reads
• Massive fragmentation
• New writes mingle with old data
How can we avoid bad migrations?

• Pre-split, pre-chunk
• Better shard keys for better locality
   – Ideally where data in the same chunk tends to be in
     the same region of disk
Pre-split and move

• If you know your key distribution, then pre-create
  your chunks and assign them.
• See this:
  – http://blog.zawodny.com/2011/03/06/mongodb-pre-
    splitting-for-faster-data-loading-and-importing/
Better shard keys

• Usually means including a time prefix in your
  shard key (e.g., {day: 100, id: X})
• Beware of write hotspots
• How to Choose a Shard Key
  – http://www.snailinaturtleneck.com/blog/2011/01/04/ho
    w-to-choose-a-shard-key-the-card-game/
OPTIMIZING HARDWARE/CLOUD
Working Set in RAM
• EC2 m2.2xlarge, RAID0 setup with 16 EBS volumes.
• Workers hammering MongoDB with this loop, growing data:
   – Loop { insert 500 byte record; find random record }
• Thousands of ops per second when in RAM
• Much less throughput when working set (in this case, all data
  and index) grows beyond RAM.
                     Ops per second over time
                                                           In RAM



                                                           Not In RAM
Pre-fetch

• Updates hold a lock while they fetch the original
  from disk.
• Instead do a read to warm the doc in RAM under
  a shared read lock, then update.
Shard per core

• Instead of a shard per server, try a shard per
  core.
• Use this strategy to overcome write locks when
  writes per second matter.
• Why? Because MongoDB has one big write lock.
Amazon EC2

• High throughput / small working set
  – RAM matters, go with high memory instances.
• Low throughput / large working set
  –   Ephemeral storage might be OK.
  –   Remember that EBS IO goes over Ethernet.
  –   Pay attention to IO wait time (iostat).
  –   Your only shot at consistent perf: use the biggest
      instances in a family.
• Read this:
  – http://perfcap.blogspot.com/2011/03/understanding-
    and-using-amazon-ebs.html
Amazon EBS

• ~200 seeks per second per EBS on a good day
• EBS has *much* better random IO perf than
  ephemeral, but adds a dependency
• Use RAID0
• Check out this benchmark:
  – http://orion.heroku.com/past/2009/7/29/io_performanc
    e_on_ebs/
• To understand how to monitor EBS:
  – https://forums.aws.amazon.com/thread.jspa?messag
    eID=124044
Further Reading
•   MongoDB Performance Tuning
     – http://www.scribd.com/doc/56271132/MongoDB-Performance-Tuning
•   Monitoring Tips
     – http://blog.boxedice.com/mongodb-monitoring/
•   Markus‟ manual
     – http://www.markus-gattol.name/ws/mongodb.html
•   Helpful/interesting blog posts
     – http://nosql.mypopescu.com/tagged/mongodb/
•   MongoDB on EC2
     – http://www.slideshare.net/jrosoff/mongodb-on-ec2-and-ebs
•   EC2 and Ephemeral Storage
     – http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws-
       ec2.html
•   MongoDB Strategies for the Disk Averse
     – http://engineering.foursquare.com/2011/02/09/mongodb-strategies-for-the-disk-averse/
•   MongoDB Perf Tuning at MongoSF 2011
     – http://www.scribd.com/doc/56271132/MongoDB-Performance-Tuning
Thank you.

• Check out Localytics for mobile analytics!
• Reach me at:
  – Email: my first name @ localytics.com
  – twitter.com/andrew311
  – andrewrollins.com

Más contenido relacionado

La actualidad más candente

Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018Amazon Web Services
 
Centralised logging with ELK stack
Centralised logging with ELK stackCentralised logging with ELK stack
Centralised logging with ELK stackSimon Hanmer
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkFlink Forward
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkDataWorks Summit
 
Introduction to Vacuum Freezing and XID
Introduction to Vacuum Freezing and XIDIntroduction to Vacuum Freezing and XID
Introduction to Vacuum Freezing and XIDPGConf APAC
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixRajeshbabu Chintaguntla
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer SimonDocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simonlucenerevolution
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
The roadmap for sql server 2019
The roadmap for sql server 2019The roadmap for sql server 2019
The roadmap for sql server 2019Javier Villegas
 
PostgreSQL for Oracle Developers and DBA's
PostgreSQL for Oracle Developers and DBA'sPostgreSQL for Oracle Developers and DBA's
PostgreSQL for Oracle Developers and DBA'sGerger
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
High Performance Haskell
High Performance HaskellHigh Performance Haskell
High Performance HaskellHarendra Kumar
 
Understanding and Improving Code Generation
Understanding and Improving Code GenerationUnderstanding and Improving Code Generation
Understanding and Improving Code GenerationDatabricks
 
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Amazon Web Services
 
Hive partitioning best practices
Hive partitioning  best practicesHive partitioning  best practices
Hive partitioning best practicesNabeel Moidu
 
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...SANG WON PARK
 
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012Treasure Data, Inc.
 

La actualidad más candente (20)

Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018
Deep Dive on PostgreSQL Databases on Amazon RDS (DAT324) - AWS re:Invent 2018
 
Centralised logging with ELK stack
Centralised logging with ELK stackCentralised logging with ELK stack
Centralised logging with ELK stack
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of Flink
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache Spark
 
Introduction to Vacuum Freezing and XID
Introduction to Vacuum Freezing and XIDIntroduction to Vacuum Freezing and XID
Introduction to Vacuum Freezing and XID
 
Local Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache PhoenixLocal Secondary Indexes in Apache Phoenix
Local Secondary Indexes in Apache Phoenix
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer SimonDocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
DocValues aka. Column Stride Fields in Lucene 4.0 - By Willnauer Simon
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valley
 
The roadmap for sql server 2019
The roadmap for sql server 2019The roadmap for sql server 2019
The roadmap for sql server 2019
 
PostgreSQL for Oracle Developers and DBA's
PostgreSQL for Oracle Developers and DBA'sPostgreSQL for Oracle Developers and DBA's
PostgreSQL for Oracle Developers and DBA's
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
High Performance Haskell
High Performance HaskellHigh Performance Haskell
High Performance Haskell
 
Amazon EMR Masterclass
Amazon EMR MasterclassAmazon EMR Masterclass
Amazon EMR Masterclass
 
Understanding and Improving Code Generation
Understanding and Improving Code GenerationUnderstanding and Improving Code Generation
Understanding and Improving Code Generation
 
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
Migrate from Oracle to Aurora PostgreSQL: Best Practices, Design Patterns, & ...
 
Hive partitioning best practices
Hive partitioning  best practicesHive partitioning  best practices
Hive partitioning best practices
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
 
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012
Fluentd loves MongoDB, at MongoDB SV User Group, July 17, 2012
 

Destacado

PEO 101: What You Need To Know!
PEO 101: What You Need To Know!PEO 101: What You Need To Know!
PEO 101: What You Need To Know!ADP, LLC
 
The Top Six Early Detection and Action Must-Haves for Improving Outcomes
The Top Six Early Detection and Action Must-Haves for Improving OutcomesThe Top Six Early Detection and Action Must-Haves for Improving Outcomes
The Top Six Early Detection and Action Must-Haves for Improving OutcomesHealth Catalyst
 
Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Legacy Typesafe (now Lightbend)
 
A Celebration Of Women In Marketing
A Celebration Of Women In MarketingA Celebration Of Women In Marketing
A Celebration Of Women In MarketingAdobe
 
Leading Adaptive Change to Create Value in Healthcare
Leading Adaptive Change to Create Value in HealthcareLeading Adaptive Change to Create Value in Healthcare
Leading Adaptive Change to Create Value in HealthcareHealth Catalyst
 
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...Health Catalyst
 
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...Health Catalyst
 
6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail
6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail
6 Proven Strategies for Engaging Physicians—and 4 Ways to FailHealth Catalyst
 
Splunk Forum Frankfurt - 15th Nov 2017 - Threat Hunting
Splunk Forum Frankfurt - 15th Nov 2017 - Threat HuntingSplunk Forum Frankfurt - 15th Nov 2017 - Threat Hunting
Splunk Forum Frankfurt - 15th Nov 2017 - Threat HuntingSplunk
 
The 3 Must-Have Qualities of a Care Management System
The 3 Must-Have Qualities of a Care Management SystemThe 3 Must-Have Qualities of a Care Management System
The 3 Must-Have Qualities of a Care Management SystemHealth Catalyst
 
How to Sustain Healthcare Quality Improvement in 3 Critical Steps
How to Sustain Healthcare Quality Improvement in 3 Critical StepsHow to Sustain Healthcare Quality Improvement in 3 Critical Steps
How to Sustain Healthcare Quality Improvement in 3 Critical StepsHealth Catalyst
 
Patient Flight Path Analytics: From Airline Operations to Healthcare Outcomes
Patient Flight Path Analytics: From Airline Operations to Healthcare OutcomesPatient Flight Path Analytics: From Airline Operations to Healthcare Outcomes
Patient Flight Path Analytics: From Airline Operations to Healthcare OutcomesHealth Catalyst
 
Database vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative ReviewDatabase vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative ReviewHealth Catalyst
 
Quality Improvement In Healthcare: Where Is The Best Place To Start?
Quality Improvement In Healthcare: Where Is The Best Place To Start?Quality Improvement In Healthcare: Where Is The Best Place To Start?
Quality Improvement In Healthcare: Where Is The Best Place To Start?Health Catalyst
 

Destacado (14)

PEO 101: What You Need To Know!
PEO 101: What You Need To Know!PEO 101: What You Need To Know!
PEO 101: What You Need To Know!
 
The Top Six Early Detection and Action Must-Haves for Improving Outcomes
The Top Six Early Detection and Action Must-Haves for Improving OutcomesThe Top Six Early Detection and Action Must-Haves for Improving Outcomes
The Top Six Early Detection and Action Must-Haves for Improving Outcomes
 
Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)
 
A Celebration Of Women In Marketing
A Celebration Of Women In MarketingA Celebration Of Women In Marketing
A Celebration Of Women In Marketing
 
Leading Adaptive Change to Create Value in Healthcare
Leading Adaptive Change to Create Value in HealthcareLeading Adaptive Change to Create Value in Healthcare
Leading Adaptive Change to Create Value in Healthcare
 
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...
How To Avoid The 3 Most Common Healthcare Analytics Pitfalls And Related Inef...
 
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...
From Installed to Stalled: Why Sustaining Outcomes Improvement Requires More ...
 
6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail
6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail
6 Proven Strategies for Engaging Physicians—and 4 Ways to Fail
 
Splunk Forum Frankfurt - 15th Nov 2017 - Threat Hunting
Splunk Forum Frankfurt - 15th Nov 2017 - Threat HuntingSplunk Forum Frankfurt - 15th Nov 2017 - Threat Hunting
Splunk Forum Frankfurt - 15th Nov 2017 - Threat Hunting
 
The 3 Must-Have Qualities of a Care Management System
The 3 Must-Have Qualities of a Care Management SystemThe 3 Must-Have Qualities of a Care Management System
The 3 Must-Have Qualities of a Care Management System
 
How to Sustain Healthcare Quality Improvement in 3 Critical Steps
How to Sustain Healthcare Quality Improvement in 3 Critical StepsHow to Sustain Healthcare Quality Improvement in 3 Critical Steps
How to Sustain Healthcare Quality Improvement in 3 Critical Steps
 
Patient Flight Path Analytics: From Airline Operations to Healthcare Outcomes
Patient Flight Path Analytics: From Airline Operations to Healthcare OutcomesPatient Flight Path Analytics: From Airline Operations to Healthcare Outcomes
Patient Flight Path Analytics: From Airline Operations to Healthcare Outcomes
 
Database vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative ReviewDatabase vs Data Warehouse: A Comparative Review
Database vs Data Warehouse: A Comparative Review
 
Quality Improvement In Healthcare: Where Is The Best Place To Start?
Quality Improvement In Healthcare: Where Is The Best Place To Start?Quality Improvement In Healthcare: Where Is The Best Place To Start?
Quality Improvement In Healthcare: Where Is The Best Place To Start?
 

Similar a Optimizing MongoDB: Lessons Learned at Localytics

Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...npinto
 
What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010jbellis
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDBRick Copeland
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsServer Density
 
The Right Data for the Right Job
The Right Data for the Right JobThe Right Data for the Right Job
The Right Data for the Right JobEmily Curtin
 
Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLHyderabad Scalability Meetup
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityJen Aman
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityJen Aman
 
Elasticsearch Arcihtecture & What's New in Version 5
Elasticsearch Arcihtecture & What's New in Version 5Elasticsearch Arcihtecture & What's New in Version 5
Elasticsearch Arcihtecture & What's New in Version 5Burak TUNGUT
 
What Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database ScalabilityWhat Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database Scalabilityjbellis
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...MongoDB
 
MongoDB Best Practices in AWS
MongoDB Best Practices in AWS MongoDB Best Practices in AWS
MongoDB Best Practices in AWS Chris Harris
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesTanel Poder
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsBenjamin Darfler
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsmarkgrover
 
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalSizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalVigyan Jain
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesRose Toomey
 

Similar a Optimizing MongoDB: Lessons Learned at Localytics (20)

Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan Pu
 
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
IAP09 CUDA@MIT 6.963 - Guest Lecture: Out-of-Core Programming with NVIDIA's C...
 
What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010What every developer should know about database scalability, PyCon 2010
What every developer should know about database scalability, PyCon 2010
 
Scaling with MongoDB
Scaling with MongoDBScaling with MongoDB
Scaling with MongoDB
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
The Right Data for the Right Job
The Right Data for the Right JobThe Right Data for the Right Job
The Right Data for the Right Job
 
Understanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQLUnderstanding and building big data Architectures - NoSQL
Understanding and building big data Architectures - NoSQL
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance Understandability
 
Re-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance UnderstandabilityRe-Architecting Spark For Performance Understandability
Re-Architecting Spark For Performance Understandability
 
Elasticsearch Arcihtecture & What's New in Version 5
Elasticsearch Arcihtecture & What's New in Version 5Elasticsearch Arcihtecture & What's New in Version 5
Elasticsearch Arcihtecture & What's New in Version 5
 
What Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database ScalabilityWhat Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database Scalability
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
MongoDB Best Practices in AWS
MongoDB Best Practices in AWS MongoDB Best Practices in AWS
MongoDB Best Practices in AWS
 
Low Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling ExamplesLow Level CPU Performance Profiling Examples
Low Level CPU Performance Profiling Examples
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-FinalSizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
Sizing MongoDB on AWS with Wired Tiger-Patrick and Vigyan-Final
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark Pipelines
 

Último

OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 

Último (20)

OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 

Optimizing MongoDB: Lessons Learned at Localytics

  • 1. Optimizing MongoDB: Lessons Learned at Localytics Andrew Rollins June 2011 MongoNYC
  • 2. Me • Email: my first name @ localytics.com • twitter.com/andrew311 • andrewrollins.com • Founder, Chief Software Architect at Localytics
  • 3. Localytics • Real time analytics for mobile applications • Built on: – Scala – MongoDB – Amazon Web Services – Ruby on Rails – and more…
  • 4. Why I‟m here: brain dump! • To share tips, tricks, and gotchas about: – Documents – Indexes – Fragmentation – Migrations – Hardware – MongoDB on AWS • Basic to more advanced, a compliment to MongoDB Perf Tuning at MongoSF 2011
  • 5. MongoDB at Localytics • Use cases: – Anonymous loyalty information – De-duplication of incoming data • Requirements: – High throughput – Add capacity without long down-time • Scale today: – Over 1 billion events tracked in May – Thousands of MongoDB operations a second
  • 6. Why MongoDB? • Stability • Community • Support • Drivers • Ease of use • Feature rich • Scale out
  • 8. Shorten names Bad: { super_happy_fun_awesome_name: “yay!” } Good: { s: “yay!” }
  • 9. Use BinData for UUIDs/hashes Bad: { u: “21EC2020-3AEA-1069-A2DD-08002B30309D”, // 36 bytes plus field overhead } Good: { u: BinData(0, “…”), // 16 bytes plus field overhead }
  • 10. Override _id Turn this { _id : ObjectId("47cc67093475061e3d95369d"), u: BinData(0, “…”) // <- this is uniquely indexed } into { _id : BinData(0, “…”) // was the u field } Eliminated an extra index, but be careful about locality... (more later, see Further Reading at end)
  • 11. Pack „em in • Look for cases where you can squish multiple “records” into a single document. • Why? – Decreases number of index entries – Brings documents closer to the size of a page, alleviating potential fragmentation • Example: comments for a blog post.
  • 12. Prefix Indexes Suppose you have an index on a large field, but that field doesn‟t have many possible values. You can use a “prefix index” to greatly decrease index size. find({k: <kval>}) { k: BinData(0, “…”), // 32 byte SHA256, indexed } into find({p: <prefix>, k: <kval>}) { k: BinData(0, “…”), // 28 byte SHA256 suffix, not indexed p: <32-bit integer> // first 4 bytes of k packed in integer, indexed } Example: git commits
  • 14. Fragmentation • Data on disk is memory mapped into RAM. • Mapped in pages (4KB usually). • Deletes/updates will cause memory fragmentation. Disk RAM doc1 doc1 find(doc1) Page deleted deleted … …
  • 15. New writes mingle with old data Data doc1 Page Write docX docX doc3 doc4 Page doc5 find(docX) also pulls in old doc1, wasting RAM
  • 16. Dealing with fragmentation • “mongod --repair” on a secondary, swap with primary. • 1.9 has in-place compaction, but this still holds a write-lock. • MongoDB will auto-pad records. • Pad records yourself by including and then removing extra bytes on first insert. – Alternative offered in SERVER-1810.
  • 17. The Dark Side of Migrations • Chunks are a logical construct, not physical. • Shard keys have serious implications. • What could go wrong? – Let‟s run through an example.
  • 18. Suppose the following Chunk 1 • K is the shard key k: 1 to 5 • K is random Chunk 2 k: 6 to 9 Shard 1 {k: 3, …} 1st write {k: 9, …} 2nd write {k: 1, …} and so on {k: 7, …} {k: 2, …} {k: 8, …}
  • 19. Migrate Chunk 1 Chunk 1 k: 1 to 5 k: 1 to 5 Chunk 2 k: 6 to 9 Shard 1 Shard 2 {k: 3, …} {k: 3, …} {k: 9, …} Random IO {k: 1, …} {k: 1, …} {k: 2, …} {k: 7, …} {k: 2, …} {k: 8, …}
  • 20. Shard 1 is now heavily fragmented Chunk 1 Chunk 1 k: 1 to 5 k: 1 to 5 Chunk 2 k: 6 to 9 Shard 1 Shard 2 {k: 3, …} {k: 3, …} {k: 9, …} {k: 1, …} {k: 1, …} Fragmented {k: 2, …} {k: 7, …} {k: 2, …} {k: 8, …}
  • 21. Why is this scenario bad? • Random reads • Massive fragmentation • New writes mingle with old data
  • 22. How can we avoid bad migrations? • Pre-split, pre-chunk • Better shard keys for better locality – Ideally where data in the same chunk tends to be in the same region of disk
  • 23. Pre-split and move • If you know your key distribution, then pre-create your chunks and assign them. • See this: – http://blog.zawodny.com/2011/03/06/mongodb-pre- splitting-for-faster-data-loading-and-importing/
  • 24. Better shard keys • Usually means including a time prefix in your shard key (e.g., {day: 100, id: X}) • Beware of write hotspots • How to Choose a Shard Key – http://www.snailinaturtleneck.com/blog/2011/01/04/ho w-to-choose-a-shard-key-the-card-game/
  • 26. Working Set in RAM • EC2 m2.2xlarge, RAID0 setup with 16 EBS volumes. • Workers hammering MongoDB with this loop, growing data: – Loop { insert 500 byte record; find random record } • Thousands of ops per second when in RAM • Much less throughput when working set (in this case, all data and index) grows beyond RAM. Ops per second over time In RAM Not In RAM
  • 27. Pre-fetch • Updates hold a lock while they fetch the original from disk. • Instead do a read to warm the doc in RAM under a shared read lock, then update.
  • 28. Shard per core • Instead of a shard per server, try a shard per core. • Use this strategy to overcome write locks when writes per second matter. • Why? Because MongoDB has one big write lock.
  • 29. Amazon EC2 • High throughput / small working set – RAM matters, go with high memory instances. • Low throughput / large working set – Ephemeral storage might be OK. – Remember that EBS IO goes over Ethernet. – Pay attention to IO wait time (iostat). – Your only shot at consistent perf: use the biggest instances in a family. • Read this: – http://perfcap.blogspot.com/2011/03/understanding- and-using-amazon-ebs.html
  • 30. Amazon EBS • ~200 seeks per second per EBS on a good day • EBS has *much* better random IO perf than ephemeral, but adds a dependency • Use RAID0 • Check out this benchmark: – http://orion.heroku.com/past/2009/7/29/io_performanc e_on_ebs/ • To understand how to monitor EBS: – https://forums.aws.amazon.com/thread.jspa?messag eID=124044
  • 31. Further Reading • MongoDB Performance Tuning – http://www.scribd.com/doc/56271132/MongoDB-Performance-Tuning • Monitoring Tips – http://blog.boxedice.com/mongodb-monitoring/ • Markus‟ manual – http://www.markus-gattol.name/ws/mongodb.html • Helpful/interesting blog posts – http://nosql.mypopescu.com/tagged/mongodb/ • MongoDB on EC2 – http://www.slideshare.net/jrosoff/mongodb-on-ec2-and-ebs • EC2 and Ephemeral Storage – http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws- ec2.html • MongoDB Strategies for the Disk Averse – http://engineering.foursquare.com/2011/02/09/mongodb-strategies-for-the-disk-averse/ • MongoDB Perf Tuning at MongoSF 2011 – http://www.scribd.com/doc/56271132/MongoDB-Performance-Tuning
  • 32. Thank you. • Check out Localytics for mobile analytics! • Reach me at: – Email: my first name @ localytics.com – twitter.com/andrew311 – andrewrollins.com