MongoDB Best Practices for Developers

Best practices for using MongoDB, migrating from an RDBMS to MongoDB, and tuning.
Presented at the IL BigData Conference 2015.

MongoDB Best Practices for Developers

  1. © All rights reserved: Moshe Kaplan. Big Data – Leading Platforms: MongoDB for Developers
  2. MongoDB for Developers. Moshe Kaplan, Scale Hacker. http://top-performance.blogspot.com http://blogs.microsoft.co.il/vprnd
  3. It’s All About Scale
  4. Introduction: Hello, My Name Is MongoDB
  5. Who Is Using MongoDB?
  6. Who Is Behind MongoDB?
  7. Key Value Store (with Benefits): <Key, Value>
     • insert
     • get
     • multiget
     • remove
     • truncate
     ://wiki.apache.org/cassandra/API
  8. When Should I Choose NoSQL?
     • Eventually consistent
     • Document store
     • Key value
     http://guyharrison.squarespace.com/blog/tag/nosq
  9. What Is MongoDB Made Of? http://www.10gen.com/products/mongodb
  10. Why MongoDB? (What? / Why?)
     • JSON: end to end
     • No Schema: “No DBA”, just serialize
     • Write: 10K inserts/sec on a virtual machine
     • Read: similar to MySQL
     • HA: 10 min to set up a cluster
     • Sharding: out of the box
     • GeoData: great for that
     • No Schema: None (no downtime to create new columns)
     • Buzz: the trend is with NoSQL
  11. NoSQL and Data Modeling: What Is the Difference?
  12. Database for Software Engineers
     • Class and Subclass map to Document and Subdocument
  13. Same Terminology
     • Database → Database
     • Table → Collection
     • Row → Document
  14. A Blog Case Study in MySQL. http://www.slideshare.net/nateabele/building-apps-with-mongodb
  15. …as a SW Engineer Would Like It to Be. http://www.slideshare.net/nateabele/building-apps-with-mongodb
  16. Migration from RDBMS to NoSQL: How to Do That?
  17. Data Migration
     • Map the table structure
     • Export the data and import it
     • Add indexes
     http://igcse-geography-lancaster.wikispaces.com/1.2+MIGRATION
  18. Selected Migration Tool
  19. Usage Details
     • Install Ruby
     • > gem install mongify
     • … Modify the code to your needs
     • … Create configuration files
     • > mongify translation db.config > translation.rb
     • > mongify process db.config translation.rb
  20. Date Functions
     • Year(), Month()… functions are included
     • …but only in the JavaScript engine
     • Solution: new fields!
        • [original field]
        • [original field]_[year part]
        • [original field]_[month part]
        • [original field]_[day part]
        • [original field]_[hour part]
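
The "new fields" idea above could be sketched as follows. This is a minimal illustration, not the author's code: the helper name `addDateParts`, the `_year`/`_month`/`_day`/`_hour` suffixes, and the choice of UTC are all assumptions.

```javascript
// Hypothetical helper: returns a copy of a document with precomputed date
// parts stored next to an existing Date field, so that queries can filter
// on plain numbers instead of calling date functions.
function addDateParts(doc, field) {
  const date = doc[field];
  if (!(date instanceof Date)) {
    throw new TypeError(field + " must hold a Date");
  }
  // Copy first so the caller's document is not mutated.
  const result = Object.assign({}, doc);
  result[field + "_year"] = date.getUTCFullYear();
  result[field + "_month"] = date.getUTCMonth() + 1; // getUTCMonth() is zero-based
  result[field + "_day"] = date.getUTCDate();
  result[field + "_hour"] = date.getUTCHours();
  return result;
}

const datedPost = addDateParts(
  { title: "hello", created: new Date("2015-03-09T14:30:00Z") },
  "created"
);
```

A document prepared this way can then be matched with an ordinary filter such as `{ created_year: 2015, created_month: 3 }`, and the new fields can be indexed like any others.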
  21. Schemaless: No Schema Is a Good Thing, But…
  22. Default Values
     • No schema
     • No default values
     • App challenge
     • Timestamps… no single source of truth
  23. Casting and Type Safety
     • No schema
     • No …
     • App challenge
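
Since defaults and type checks become an application concern, one possible shape for that code is a single normalizing function that every writer calls before inserting. The sketch below is only an illustration; the `users` fields, the default values, and the `normalizeUser` name are hypothetical.

```javascript
// Hypothetical defaults for a "users" document; names are illustrative.
const USER_DEFAULTS = { status: "A", age: 0 };

function normalizeUser(input) {
  // Defaults first, then whatever the caller supplied.
  const doc = Object.assign({}, USER_DEFAULTS, input);
  // Cast explicitly: the string "42" and the number 42 are stored as
  // different types, and a query on one will not match the other.
  doc.age = Number(doc.age);
  if (!Number.isInteger(doc.age) || doc.age < 0) {
    throw new TypeError("age must be a non-negative integer");
  }
  doc.status = String(doc.status);
  // Fill the timestamp in one place rather than in each caller.
  if (!(doc.createdAt instanceof Date)) {
    doc.createdAt = new Date();
  }
  return doc;
}

const normalizedUser = normalizeUser({ name: "dana", age: "42" });
```

Keeping this in one function gives the application a single place to change when a default or a type rule changes, which is the role a schema would otherwise play.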
  24. Auto Numbers
     • Start using _id: { "_id" : 0, "health" : 1, "stateStr" : "PRIMARY", "uptime" : 59917 }
     • Counter tables
     • Dedicated database
     • 1:1 mapping
     • Counter++ using findAndModify
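
The "Counter++ using findAndModify" bullet might look like the sketch below. It assumes a `counters` collection with one document per sequence and a `seq` field; those names, and the `nextSequence` helper itself, are illustrative and not taken from the slides.

```javascript
// Returns the next value of a named counter. `db` is a database handle as
// in the mongo shell; the "counters" collection and "seq" field are
// assumptions made for this sketch.
function nextSequence(db, name) {
  const counter = db.counters.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true,    // return the document as it is after the increment
    upsert: true, // create the counter document on first use
  });
  return counter.seq;
}
```

In the shell this could be used as `db.users.insert({ _id: nextSequence(db, "users"), name: "dana" })`. Because the increment and the read happen in one findAndModify call, two clients should not receive the same number.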
  25. The ORM Solution
  26. Data Analysts. http://www.designersplayground.com/pr/internet-meme-list/data-analyst-2/
  27. Data Analysts
     • This is not SQL
     • There are no joins
     • No perfect tools: Pentaho, RockMongo, MongoVUE, RoboMongo
  28. No Joins
     • Do it in the application
     • Leverage the power of NoSQL
     http://www.slideshare.net/nateabele/building-apps-with-mongodb
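
"Do it in the application" can be as simple as fetching two result sets and combining them in memory. The following is a sketch under assumed field names (`authorId`, `_id`, `name`); it works on plain arrays, so it makes no claim about how the data was fetched.

```javascript
// Application-side join: attach each post's author from a second result set.
// Field names are illustrative assumptions.
function joinAuthors(posts, authors) {
  // Index the authors once so that each lookup is constant time.
  const byId = new Map(authors.map((a) => [String(a._id), a]));
  return posts.map((post) =>
    Object.assign({}, post, { author: byId.get(String(post.authorId)) || null })
  );
}

const samplePosts = [
  { title: "Intro", authorId: 1 },
  { title: "Tuning", authorId: 2 },
  { title: "Orphan", authorId: 9 },
];
const sampleAuthors = [
  { _id: 1, name: "dana" },
  { _id: 2, name: "omer" },
];
const joinedPosts = joinAuthors(samplePosts, sampleAuthors);
```

In practice the second array would typically come from one `$in` query on the collected ids rather than one query per post. Embedding the author data in the post document avoids the join altogether, which is the "leverage the power of NoSQL" option.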
  29. Limited Resultset
     • 16MB document size
     • Limit and Skip
     • Adjusted WHERE
     • GridFS
  30. Bottom Line
     • Powerful tool
     • Embrace the challenge
     • Schema-less limitations: counters, data types
     • Tools for data scientists
     • Data design
  31. Mastering a New Query Language
  32. Connect to the Database
     • Connect: > mongo
     • Show current database: > db
     • Show databases: > show databases;
     • Show collections: > show collections; or > show tables;
  33. Databases Manipulation: Create & Drop
     • Change database: > use <database>
     • Create database: just switch and create an object…
     • Delete database: > use mydb; then > db.dropDatabase();
  34. Collections Manipulation
     • Create collection: > db.createCollection(collectionName), or just insert into it
     • Delete collection: > db.collectionName.drop()
  35. SELECT: No SQL, Just ORM…
     • Select all: db.things.find()
     • WHERE: db.posts.find({"comments.email" : "b@c.com"})
     • Pattern matching: db.posts.find({"title" : /mongo/i})
     • Sort: db.posts.find().sort({email : 1, date : -1});
     • Limit: db.posts.find().limit(3)
  36. Specific Fields
     • Select: db.users.find( { }, { user_id: 1, status: 1, _id: 0 } )
     • 1: show; 0: don’t show
  37. WHERE
     • != "A": { $ne: "A" }
     • > 25: { $gt: 25 }
     • > 25 AND <= 50: { $gt: 25, $lte: 50 }
     • LIKE 'bc%': /^bc/
     • < 25 OR >= 50: { $or : [ { field: { $lt: 25 } }, { field: { $gte : 50 } } ] } ($or takes complete filter documents, so each branch names its field)
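
The operator fragments in the WHERE slide only become usable filters once they are attached to a field. The sketch below writes each one out in full for hypothetical `status`, `age`, and `name` fields; the field names are assumptions made for illustration.

```javascript
// Complete filter documents for the comparisons listed above.
// The status, age and name fields are hypothetical.
const whereFilters = {
  notA: { status: { $ne: "A" } },                // status != 'A'
  olderThan25: { age: { $gt: 25 } },             // age > 25
  between: { age: { $gt: 25, $lte: 50 } },       // age > 25 AND age <= 50
  startsWithBc: { name: /^bc/ },                 // name LIKE 'bc%'
  outside: { $or: [{ age: { $lt: 25 } }, { age: { $gte: 50 } }] }, // age < 25 OR age >= 50
};
```

Any of these can be passed as the first argument to `find()`, for example `db.users.find(whereFilters.between)`.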
  38. Join
     • Wrong place…
     • Or Map Reduce
  39. GROUP BY
     db.article.aggregate(
       { $group : {
         _id : { author : "$author", name : "$name" },   // GROUP BY author, name
         docsPerAuthor : { $sum : 1 },                   // SUM(1) = N
         viewsPerAuthor : { $sum : "$pageViews" }        // SUM(pageViews)
       }}
     );
  40. UPDATE
     db.posts.update(
       {"comments.email": "b@c.com"},
       {$set : {"comments.email": "d@c.com"}}
     )
     (if comments is an array of subdocuments, address the matched element as "comments.$.email")
     SET age = age + 3:
     db.users.update(
       { status: "A" },
       { $inc: { age: 3 } },
       { multi: true }
     )
  41. INSERT
     j = { name : "mongo" }
     k = { x : 3 }
     db.things.insert( j )
     db.things.insert( k )
  42. DELETE
     db.users.remove( { status: "D" } )
  43. Performance Tuning: Make a Change
  44. MongoDB Tuning
  45. The Journal
     • journalCommitInterval = 300
     • Write to disk: 2ms <= t <= 300ms
     • Default is 100ms; increase to 300ms to save resources
     • Diagram labels: Memory, Disk, Journal, Data (steps 1 and 2)
  46. RAM Optimization: dataSize + indexSize < RAM
     • Diagram labels: OS, Data, Index, Journal
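
The rule of thumb `dataSize + indexSize < RAM` can be checked against the output of `db.stats()`, which reports both values (in bytes by default). The helper below is a sketch; its name and the sample numbers are assumptions, not measurements.

```javascript
// Applies the rule of thumb to a db.stats() result.
// `ramBytes` is the memory available to mongod, supplied by the caller.
function workingSetFits(stats, ramBytes) {
  const needed = stats.dataSize + stats.indexSize;
  return { needed: needed, fits: needed < ramBytes };
}

// Sample numbers only.
const GiB = 1024 * 1024 * 1024;
const ramCheck = workingSetFits({ dataSize: 6 * GiB, indexSize: 1 * GiB }, 8 * GiB);
```

In the shell the first argument would be `db.stats()`. The operating system and the journal also need memory, as the slide's diagram indicates, so a result close to the limit should probably be read as "does not fit".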
  47. Profiling and Slow Log
  48. Profiling Configuration
     • Enable:
        • mongod --profile=1 --slowms=15
        • db.setProfilingLevel([level], [time])
     • How much:
        • 0 (none) → 1 (slow queries only) → 2 (all)
        • 100ms: default
     • Where:
        • system.profile collection in the database being profiled
  49. Profiling Results Analysis
     • Last 5 > 1ms: show profile
     • Without commands: db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
     • Specific collection (namespace): db.system.profile.find( { ns : 'mydb.test' } ).pretty()
     • Slower than: db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
     • Between dates: db.system.profile.find({ts : { $gt : new ISODate("2012-12-09T03:00:00Z") , $lt : new ISODate("2012-12-09T03:40:00Z") }}).pretty()
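
The profiler queries above each use one condition, and in practice they are often combined. A small builder like the one below could assemble the filter; the function and its option names are hypothetical, while the `millis`, `ns`, `op`, and `ts` fields are the ones used in the slide.

```javascript
// Builds a filter for db.system.profile.find(). Every option is optional.
function profileFilter(options) {
  const filter = {};
  if (options.minMillis !== undefined) {
    filter.millis = { $gt: options.minMillis };
  }
  if (options.ns) {
    filter.ns = options.ns; // "<database>.<collection>"
  }
  if (options.skipCommands) {
    filter.op = { $ne: "command" };
  }
  if (options.from || options.to) {
    filter.ts = {};
    if (options.from) filter.ts.$gt = options.from;
    if (options.to) filter.ts.$lt = options.to;
  }
  return filter;
}

const slowOpsFilter = profileFilter({
  minMillis: 5,
  ns: "mydb.test",
  skipCommands: true,
  from: new Date("2012-12-09T03:00:00Z"),
  to: new Date("2012-12-09T03:40:00Z"),
});
```

The result can be passed straight to the shell, for example `db.system.profile.find(slowOpsFilter).sort({ ts: -1 }).pretty()`.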
  50. Explain
     > db.courses.find().explain();
     {
       "cursor" : "BasicCursor",
       "isMultiKey" : false,
       "n" : 11,
       "nscannedObjects" : 11,
       "nscanned" : 11,
       "nscannedObjectsAllPlans" : 11,
       "nscannedAllPlans" : 11,
       "scanAndOrder" : false,
       "indexOnly" : false,
       "nYields" : 0,
       "nChunkSkips" : 0,
       "millis" : 0,
       "indexBounds" : {},
       "server" : "primary.domain.com:27017"
     }
  51. Indexes
  52. Index Management
     • Regular index: db.users.ensureIndex( { user_id: 1 } )
     • Multiple + DESC index: db.users.ensureIndex( { user_id: 1, age: -1 } )
     • Sub document index: db.users.ensureIndex( { "address.zipcode": 1 } )
     • List indexes: db.users.getIndexes()
     • Drop index: db.users.dropIndex("indexName")
  53. Known Index Issues
     • Bound filter should be the last (in the index as well)
     • BitMap indexes not really working
     • You should design your indexes carefully
  54. Stats & Schema Design
  55. Sparse Matrix? I Don’t Think So
     • mongostat
     • > db.stats();
     • > db.collectionname.stats();
     • Fragmentation if storageSize/size > 2
     • db.collectionname.runCommand("compact")
     • Padding (wrong design) if paddingFactor > 2
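
The two thresholds on the stats slide can be applied mechanically to a collection's `stats()` output. The sketch below assumes the `size`, `storageSize`, and `paddingFactor` fields named in the slide; `paddingFactor` may be absent depending on the storage engine and version, so the helper treats it as optional. The helper name and sample numbers are illustrative.

```javascript
// Applies the slide's thresholds to a db.<collection>.stats() result.
function collectionHealth(stats) {
  // Guard against empty collections, where size is 0.
  const ratio = stats.size > 0 ? stats.storageSize / stats.size : 0;
  return {
    ratio: ratio,
    fragmented: ratio > 2,
    overPadded: typeof stats.paddingFactor === "number" && stats.paddingFactor > 2,
  };
}

// Sample numbers only.
const healthReport = collectionHealth({ size: 1000, storageSize: 2500, paddingFactor: 1.1 });
```

A `fragmented` result is the case in which the slide suggests running the `compact` command on the collection.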
  56. High Availability: Going Real Time
  57. (Do Not) Master/Slave
  58. Replication Set
     • In mongo.conf:
        • # Replication Options
        • replSet=myReplSet
     • > rs.initiate()
     • > rs.conf()
     • > rs.add("host:port")
     • > rs.reconfig()
  59. Arbiter
     • rs.addArb("host:port")
     • Also:
        • Low priority
        • Hidden
        • (Weighted) voting
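
The member options listed above (arbiter, low priority, hidden) are all expressed as fields of the replica set configuration document. The builder below is a sketch of how such a document could be assembled; the function name, its input shape, and the host names are placeholders, while `arbiterOnly`, `hidden`, and `priority` are the configuration fields themselves.

```javascript
// Builds a replica set configuration document of the kind that
// rs.initiate() and rs.reconfig() accept. Host names are placeholders.
function buildReplSetConfig(name, members) {
  return {
    _id: name,
    members: members.map(function (m, i) {
      const member = { _id: i, host: m.host };
      if (m.arbiter) {
        member.arbiterOnly = true; // votes in elections, holds no data
      }
      if (m.hidden) {
        member.hidden = true;
        member.priority = 0; // a hidden member must have priority 0
      } else if (m.priority !== undefined) {
        member.priority = m.priority;
      }
      return member;
    }),
  };
}

const replConfig = buildReplSetConfig("myReplSet", [
  { host: "db1.example.com:27017" },
  { host: "db2.example.com:27017", priority: 0.5 },
  { host: "arb.example.com:27017", arbiter: true },
]);
```

On a fresh set the result could be passed to `rs.initiate(replConfig)`. The sample uses three voting members so that elections can reach a majority.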
  60. Show Status: rs.status();
     {
       "set" : "myReplSet",
       "date" : ISODate("2013-02-05T10:23:28Z"),
       "myState" : 1,
       "members" : [
         {
           "_id" : 0, "name" : "primary.example.com:27017",
           "health" : 1, "state" : 1,
           "stateStr" : "PRIMARY", "uptime" : 164545,
           "optime" : Timestamp(1359901753000, 1),
           "optimeDate" : ISODate("2013-02-03T14:29:13Z"), "self" : true
         },
         {
           "_id" : 1, "name"
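
A status document like the one above is usually read for two things: which member is primary, and whether any member is unhealthy. The helper below is a sketch of that reading; its name is hypothetical, and the second member in the sample is invented because the slide's output is cut off.

```javascript
// Summarises an rs.status() document: the primary's name, and the names
// of members whose health is not 1.
function summarizeReplStatus(status) {
  const primary = status.members.find(function (m) {
    return m.stateStr === "PRIMARY";
  });
  const unhealthy = status.members.filter(function (m) {
    return m.health !== 1;
  });
  return {
    set: status.set,
    primary: primary ? primary.name : null,
    unhealthy: unhealthy.map(function (m) { return m.name; }),
  };
}

// Hand-written sample shaped like the output above.
const replSummary = summarizeReplStatus({
  set: "myReplSet",
  members: [
    { _id: 0, name: "primary.example.com:27017", health: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "secondary.example.com:27017", health: 0, stateStr: "(not reachable/healthy)" },
  ],
});
```

In the shell the argument would be `rs.status()` itself.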
  61. Replica Set Recovery
     • Create a new mongod
        • Either install a plain vanilla one
        • Or duplicate an existing mongod (better)
     • Connect to the system
        • Use the previous machine's IP
        • Or change the configuration to remove the old member and add the new one
  62. Sharding and Scale Out: Make a Big Change. Map Reduce and Aggregation
  63. Secondary Read Enabling
  64. The Strategy: Sharding
  65. MongoDB Implementation
  66. Summary
     • NoSQL
     • Schemaless
     • HA
     • Sharding
  67. Thank You! Moshe Kaplan, moshe.kaplan@brightaqua.com, 054-2291978
