SlideShare una empresa de Scribd logo
1 de 15
Descargar para leer sin conexión
Migrating from MySQL
to MongoDB
Based on a true story
Starting Point
Existing System
Starting Point
Requirements for replacement:
• Store data
• High performance
o worldwide access
o multiple physical locations
o read heavy application
• Simple to use
o No DBA...
• MI
o real-time reporting without affecting production
• Availability and Partition Tolerance
o from CAP
o by default Consistency not Availability...
• Document-Orientated
o called from OOPHP
• Simple querying using JSON
o benefit over CouchDB which needs MapReduce
• Simple multi-node setup
• MongoLab.com
o web interface for prototyping
So why MongoDB?
Read Preference
Defaults to primary (strongly consistent)
Other options allow eventual consistency
primaryPreferred
secondary
secondaryPreferred
New Architecture
Single primary, multiple read members
What are the other members?
One Delayed member provides a rolling
backup by maintaining a dataset, delay is
limited by the size of the oplog
One Hidden member provides access for MI
(and also used for dumping in this scenario)
Automatic Failover
Primary is FUBAR, new primary is elected
Automatic Failover
Had experience of this in production. It's magic.
Remaining replica set members will vote and
elect a new primary, which will start taking
writes. (PHP extension issue)
Replica set members that have been out of the
cluster will recover automatically on rejoining
o If you've gone past the oplog window then it's a manual
process
Sharding, and why we didn't
• Sharding divides the data set and
distributes the data over multiple servers, or
shards
• Necessary when
o data exceeds max for single instance (based on disk
space)
o the Working Set exceeds available RAM
o single instance can’t deal with writes (would have to
be a lot)
• We had none of these and Sharding adds
complexity
Sharding Architecture
http://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/
Sharding Availability
Each Shard would need a Replica Set
containing at least 3 members to achieve the
same level of availability as our non-sharded
architecture.
In addition if one or two of the Config Servers
are unavailable chunk migration and chunk
splitting are suspended
DataBase Design
• Schemaless means that the structure and
the content of the “record” are held for each
“row”
o long key names add up
o we used single chars and a lookup table
• Collections
o Can think of them as “buckets” rather than tables
o Contain similar records by design but it’s not
enforced
Management Information
• Different conceptually in NoSQL databases
due to the way they are queried, returning
JSON
• Traditional working through a query tool is
less intuitive (although MongoVUE is quite
good)
• This lead us to replace our SQL script ->
Excel -> Graph process (yuck)
• MapReduce, powerful but a steep learning
curve
Conclusions
MongoDB is ace
Questions?

Más contenido relacionado

La actualidad más candente

MongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql DatabaseMongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql Database
Sudhir Patil
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and Sharding
Tharun Srinivasa
 
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
Zarafa
 
Hardware Provisioning for MongoDB
Hardware Provisioning for MongoDBHardware Provisioning for MongoDB
Hardware Provisioning for MongoDB
MongoDB
 

La actualidad más candente (20)

MongoDb - Details on the POC
MongoDb - Details on the POCMongoDb - Details on the POC
MongoDb - Details on the POC
 
Starting with MongoDB
Starting with MongoDBStarting with MongoDB
Starting with MongoDB
 
No sql - { If and Else }
No sql - { If and Else }No sql - { If and Else }
No sql - { If and Else }
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
MongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql DatabaseMongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql Database
 
Cassandra database design best practises
Cassandra database design best practisesCassandra database design best practises
Cassandra database design best practises
 
MongoDB
MongoDBMongoDB
MongoDB
 
MongoDB DOC v1.5
MongoDB DOC v1.5MongoDB DOC v1.5
MongoDB DOC v1.5
 
MongoDB Replication and Sharding
MongoDB Replication and ShardingMongoDB Replication and Sharding
MongoDB Replication and Sharding
 
MongoDB for the SQL Server
MongoDB for the SQL ServerMongoDB for the SQL Server
MongoDB for the SQL Server
 
Open source Technology
Open source TechnologyOpen source Technology
Open source Technology
 
Zarafa Scaling & Performance
Zarafa Scaling & PerformanceZarafa Scaling & Performance
Zarafa Scaling & Performance
 
Optimizing InnoDB bufferpool usage
Optimizing InnoDB bufferpool usageOptimizing InnoDB bufferpool usage
Optimizing InnoDB bufferpool usage
 
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enter...
 
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
Zarafa SummerCamp 2012 - Tips & tricks for running Zarafa is larger scale env...
 
Mongo db
Mongo dbMongo db
Mongo db
 
Hardware Provisioning for MongoDB
Hardware Provisioning for MongoDBHardware Provisioning for MongoDB
Hardware Provisioning for MongoDB
 
What Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database ScalabilityWhat Every Developer Should Know About Database Scalability
What Every Developer Should Know About Database Scalability
 
Ndb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_memNdb cluster 80_ycsb_mem
Ndb cluster 80_ycsb_mem
 
Hdfs internals
Hdfs internalsHdfs internals
Hdfs internals
 

Similar a Migrating from MySQL to MongoDB

JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max Schireson
MongoDB
 
2010 mongo berlin-scaling
2010 mongo berlin-scaling2010 mongo berlin-scaling
2010 mongo berlin-scaling
MongoDB
 
MongoDB Knowledge Shareing
MongoDB Knowledge ShareingMongoDB Knowledge Shareing
MongoDB Knowledge Shareing
Philip Zhong
 
The Care + Feeding of a Mongodb Cluster
The Care + Feeding of a Mongodb ClusterThe Care + Feeding of a Mongodb Cluster
The Care + Feeding of a Mongodb Cluster
Chris Henry
 
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
Teresa Giacomini
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
MongoDB
 

Similar a Migrating from MySQL to MongoDB (20)

MongoDB 2.4 and spring data
MongoDB 2.4 and spring dataMongoDB 2.4 and spring data
MongoDB 2.4 and spring data
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
 
Scaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTPScaling MongoDB - Presentation at MTP
Scaling MongoDB - Presentation at MTP
 
JasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max SchiresonJasperWorld 2012: Reinventing Data Management by Max Schireson
JasperWorld 2012: Reinventing Data Management by Max Schireson
 
MongoDB
MongoDBMongoDB
MongoDB
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014
 
MongoDB Internals
MongoDB InternalsMongoDB Internals
MongoDB Internals
 
NoSQL and MongoDB
NoSQL and MongoDBNoSQL and MongoDB
NoSQL and MongoDB
 
Best Practices for NoSQL Workloads on Amazon EC2 and Amazon EBS - February 20...
Best Practices for NoSQL Workloads on Amazon EC2 and Amazon EBS - February 20...Best Practices for NoSQL Workloads on Amazon EC2 and Amazon EBS - February 20...
Best Practices for NoSQL Workloads on Amazon EC2 and Amazon EBS - February 20...
 
mongodb tutorial
mongodb tutorialmongodb tutorial
mongodb tutorial
 
MongoDB
MongoDBMongoDB
MongoDB
 
2010 mongo berlin-scaling
2010 mongo berlin-scaling2010 mongo berlin-scaling
2010 mongo berlin-scaling
 
MongoDB Knowledge Shareing
MongoDB Knowledge ShareingMongoDB Knowledge Shareing
MongoDB Knowledge Shareing
 
The Care + Feeding of a Mongodb Cluster
The Care + Feeding of a Mongodb ClusterThe Care + Feeding of a Mongodb Cluster
The Care + Feeding of a Mongodb Cluster
 
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
PostgreSQL Extension APIs are Changing the Face of Relational Databases | PGC...
 
Scaling with mongo db (with notes)
Scaling with mongo db (with notes)Scaling with mongo db (with notes)
Scaling with mongo db (with notes)
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
MongoDB: An Introduction - July 2011
MongoDB:  An Introduction - July 2011MongoDB:  An Introduction - July 2011
MongoDB: An Introduction - July 2011
 
2011 mongo sf-scaling
2011 mongo sf-scaling2011 mongo sf-scaling
2011 mongo sf-scaling
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

Migrating from MySQL to MongoDB

  • 1. Migrating from MySQL to MongoDB Based on a true story
  • 3. Starting Point Requirements for replacement: • Store data • High performance o worldwide access o multiple physical locations o read heavy application • Simple to use o No DBA... • MI o real-time reporting without affecting production
  • 4. • Availability and Partition Tolerance o from CAP o by default Consistency not Availability... • Document-Orientated o called from OOPHP • Simple querying using JSON o benefit over CouchDB which needs MapReduce • Simple multi-node setup • MongoLab.com o web interface for prototyping So why MongoDB?
  • 5. Read Preference Defaults to primary (strongly consistent) Other options allow eventual consistency primaryPreferred secondary secondaryPreferred
  • 6. New Architecture Single primary, multiple read members
  • 7. What are the other members? One Delayed member provides a rolling backup by maintaining a dataset, delay is limited by the size of the oplog One Hidden member provides access for MI (and also used for dumping in this scenario)
  • 8. Automatic Failover Primary is FUBAR, new primary is elected
  • 9. Automatic Failover Had experience of this in production. It's magic. Remaining replica set members will vote and elect a new primary, which will start taking writes. (PHP extension issue) Replica set members that have been out of the cluster will recover automatically on rejoining o If you've gone past the oplog window then it's a manual process
  • 10. Sharding, and why we didn't • Sharding divides the data set and distributes the data over multiple servers, or shards • Necessary when o data exceeds max for single instance (based on disk space) o the Working Set exceeds available RAM o single instance can’t deal with writes (would have to be a lot) • We had none of these and Sharding adds complexity
  • 12. Sharding Availability Each Shard would need a Replica Set containing at least 3 members to achieve the same level of availability as our non-sharded architecture. In addition if one or two of the Config Servers are unavailable chunk migration and chunk splitting are suspended
  • 13. DataBase Design • Schemaless means that the structure and the content of the “record” are held for each “row” o long key names add up o we used single chars and a lookup table • Collections o Can think of them as “buckets” rather than tables o Contain similar records by design but it’s not enforced
  • 14. Management Information • Different conceptually in NoSQL databases due to the way they are queried, returning JSON • Traditional working through a query tool is less intuitive (although MongoVUE is quite good) • This lead us to replace our SQL script -> Excel -> Graph process (yuck) • MapReduce, powerful but a steep learning curve