#MongoDB 
Transitioning from SQL to 
MongoDB 
Buzz Moschetti 
buzz.moschetti@mongodb.com 
Enterprise Architect, MongoDB

Transitioning from SQL to MongoDB


Learn how to transition from SQL to MongoDB with this presentation.

Published in: Technology
  • The code examples are contrived. Who uses Java's statement.execute() and iterates over results in this day and age? Nobody. People use ORMs to map database fields to classes. The fact that you can store a hash more easily in Mongo means just that. For complex applications the relational model is fantastic at avoiding data duplication and supporting ever-changing requirements (normal form is a very efficient way of storing data). MongoDB may be a great product, and I might use it when I need to store some unstructured data in the future, but RDBMSs have just too many conveniences for me to abandon. So while I appreciate the slides and the message that somehow our job would be easier if we all dropped RDBMS and switched to Mongo, after building software for 20 years and trying Mongo, I don't buy it.

Transitioning from SQL to MongoDB

  1. 1. #MongoDB Transitioning from SQL to MongoDB Buzz Moschetti buzz.moschetti@mongodb.com Enterprise Architect, MongoDB
  2. 2. Before We Begin • This webinar is being recorded • Use The Chat Window for • Technical assistance • Q&A • MongoDB Team will answer quick questions in realtime • “Common” questions will be reviewed at the end of the webinar
  3. 3. Who is your Presenter? • Yes, I use “Buzz” on my business cards • Former Investment Bank Chief Architect at JPMorganChase and Bear Stearns before that • Over 27 years of designing and building systems • Big and small • Super-specialized to broadly useful in any vertical • “Traditional” to completely disruptive • Advocate of language leverage and strong factoring • Still programming – using emacs, of course
  4. 4. What Are Your Developers Doing All Day? Adding and testing business features OR “Integrating with other components, tools, and systems” • Database(s) • ETL and other data transfer operations • Messaging • Services (web & other) • Other open source frameworks incl. ORMs
  5. 5. Why Can’t We Just Save and Fetch Data? Because the way we think about data at the business use case level… …is different than the way it is implemented at the application/code level… …which traditionally is VERY different than the way it is implemented at the database level
  6. 6. This Problem Isn’t New… …but for the past 40 years, innovation at the business & application layers has outpaced innovation at the database layer.
     Business Data Goals. 1974: Capture my company’s transactions daily at 5:30PM EST, add them up on a nightly basis, and print a big stack of paper. 2014: Capture my company’s global transactions in realtime plus everything that is happening in the world (customers, competitors, business/regulatory/weather), producing any number of computed results, and passing this all in realtime to predictive analytics with model feedback; results in realtime to 10000s of mobile devices, multiple GUIs, and b2b and b2c channels.
     Release Schedule. 1974: Semi-Annually. 2014: Yesterday.
     Application/Code. 1974: COBOL, Fortran, Algol, PL/1, assembler, proprietary tools. 2014: C, C++, VB, C#, Java, javascript, groovy, ruby, perl, python, Obj-C, SmallTalk, Clojure, ActionScript, Flex, DSLs, spring, AOP, CORBA, ORM, third party software ecosystem, the whole open source movement, … and COBOL and Fortran.
     Database. 1974: I/VSAM, early RDBMS. 2014: Mature RDBMS, legacy I/VSAM, column & key/value stores, and… mongoDB.
  7. 7. Exactly How Does MongoDB Change Things? • MongoDB is designed from the ground up to address rich structure (maps of maps of lists of…), not rectangles • Standard RDBMS interfaces (e.g. JDBC) do not exploit features of contemporary languages • Rapid Application Development (RAD) and scripting in JavaScript, Python, Perl, Ruby, and Scala is impedance-matched to mongoDB • In MongoDB, the data is the schema • Shapes of data go in the same way they come out
  8. 8. Rectangles are 1974. Maps and Lists are 2014 { customer_id : 1, first_name : "Mark", last_name : "Smith", city : "San Francisco", phones: [ { type : “work”, number: “1-800-555-1212” }, { type : “home”, number: “1-800-555-1313”, DNC: true }, { type : “home”, number: “1-800-555-1414”, DNC: true } ] }
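
As a small illustration of why this shape is convenient, the document on this slide maps directly onto nested dictionaries and lists in a scripting language. The sketch below is plain Python with no database involved; the helper name `numbers_of_type` is hypothetical and not part of the presentation.

```python
# The customer document from the slide, expressed as plain Python data.
customer = {
    "customer_id": 1,
    "first_name": "Mark",
    "last_name": "Smith",
    "city": "San Francisco",
    "phones": [
        {"type": "work", "number": "1-800-555-1212"},
        {"type": "home", "number": "1-800-555-1313", "DNC": True},
        {"type": "home", "number": "1-800-555-1414", "DNC": True},
    ],
}


def numbers_of_type(doc, phone_type):
    """Return the numbers of one type, in document order (hypothetical helper)."""
    return [p["number"] for p in doc.get("phones", []) if p["type"] == phone_type]


# Optional fields such as DNC are simply absent where they do not apply.
callable_numbers = [p["number"] for p in customer["phones"] if not p.get("DNC", False)]
```

The list of phones travels with its parent record, so no second structure is needed to hold it.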
  9. 9. An Actual Code Example (Finally!) Let’s compare and contrast RDBMS/SQL to MongoDB development using Java over the course of a few weeks. Some ground rules: 1. Observe rules of Software Engineering 101: Assume separation of application, Data Access Layer, and persistor implementation 2. Data Access Layer must be able to a. Expose simple, functional, data-only interfaces to the application • No ORM, frameworks, compile-time bindings, special tools b. Exploit high performance features of persistor 3. Focus on core data handling code and avoid distractions that require the same amount of work in both technologies a. No exception or error handling b. Leave out DB connection and other setup resources 4. Day counts are a proxy for progress, not actual time to complete indicated task
  10. 10. The Task: Saving and Fetching Contact data Start with this simple, flat shape in the Data Access Layer: Map m = new HashMap(); m.put("name", "buzz"); m.put("id", "K1"); And assume we save it in this way: save(Map m) And assume we fetch one by primary key in this way: Map m = fetch(String id) Brace yourself…..
  11. 11. Day 1: Initial efforts for both technologies SQL DDL: create table contact ( … ) init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name ) values ( ?,? )"); fetchStmt = connection.prepareStatement("select id, name from contact where id = ?"); } save(Map m) { contactInsertStmt.setString(1, m.get("id")); contactInsertStmt.setString(2, m.get("name")); contactInsertStmt.execute(); } Map fetch(String id) { Map m = null; fetchStmt.setString(1, id); rs = fetchStmt.executeQuery(); if(rs.next()) { m = new HashMap(); m.put("id", rs.getString(1)); m.put("name", rs.getString(2)); } return m; } MongoDB DDL: none save(Map m) { collection.insert(m); } Map fetch(String id) { Map m = null; DBObject dbo = new BasicDBObject(); dbo.put("id", id); c = collection.find(dbo); if(c.hasNext()) { m = (Map) c.next(); } return m; }
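
For readers who want to run the SQL half of Day 1, here is a minimal sketch in Python. It assumes SQLite (through the standard library's `sqlite3` module) as a stand-in RDBMS, which the slides do not specify, and an in-memory database so that nothing needs to be installed. It keeps the slide's `save`/`fetch` interface.

```python
import sqlite3

# In-memory database so the sketch is self-contained (an assumption, not the slide's setup).
conn = sqlite3.connect(":memory:")
conn.execute("create table contact (id text primary key, name text)")


def save(m):
    # Each key must be bound to its column by hand.
    conn.execute("insert into contact (id, name) values (?, ?)", (m["id"], m["name"]))
    conn.commit()


def fetch(contact_id):
    row = conn.execute(
        "select id, name from contact where id = ?", (contact_id,)
    ).fetchone()
    if row is None:
        return None
    # Columns come back by position and are mapped to keys by hand.
    return {"id": row[0], "name": row[1]}


save({"name": "buzz", "id": "K1"})
```

Calling `fetch("K1")` returns the saved map, and an unknown id returns `None`. Every new field would need a change to the DDL, the insert, the select, and the positional mapping, which is the cost the following slides track.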
  12. 12. Day 2: Add simple fields m.put(“name”, “buzz”); m.put(“id”, “K1”); m.put(“title”, “Mr.”); m.put(“hireDate”, new Date(2011, 11, 1)); • Capturing title and hireDate is part of adding a new business feature • It was pretty easy to add two fields to the structure • …but now we have to change our persistence code Brace yourself (again) …..
  13. 13. SQL Day 2 (changes in bold) DDL: alter table contact add title varchar(8); alter table contact add hireDate date; init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )"); fetchStmt = connection.prepareStatement("select id, name, title, hiredate from contact where id = ?"); } save(Map m) { contactInsertStmt.setString(1, m.get("id")); contactInsertStmt.setString(2, m.get("name")); contactInsertStmt.setString(3, m.get("title")); contactInsertStmt.setDate(4, m.get("hireDate")); contactInsertStmt.execute(); } Map fetch(String id) { Map m = null; fetchStmt.setString(1, id); rs = fetchStmt.executeQuery(); if(rs.next()) { m = new HashMap(); m.put("id", rs.getString(1)); m.put("name", rs.getString(2)); m.put("title", rs.getString(3)); m.put("hireDate", rs.getDate(4)); } return m; } Consequences: 1. Code release schedule linked to database upgrade (new code cannot run on old schema) 2. Issues with case sensitivity starting to creep in (many RDBMS are case insensitive for column names, but code is case sensitive) 3. Changes require careful mods in 4 places 4. Beginning of technical debt
  14. 14. MongoDB Day 2 save(Map m) { collection.insert(m); } Map fetch(String id) { Map m = null; DBObject dbo = new BasicDBObject(); dbo.put("id", id); c = collection.find(dbo); if(c.hasNext()) { m = (Map) c.next(); } return m; } Advantages: 1. Zero time and money spent on overhead code 2. Code and database not physically linked 3. New material with more fields can be added into existing collections; backfill is optional 4. Names of fields in database precisely match key names in code layer and directly match on name, not indirectly via positional offset 5. No technical debt is created ✔ NO CHANGE
  15. 15. Day 3: Add list of phone numbers m.put("name", "buzz"); m.put("id", "K1"); m.put("title", "Mr."); m.put("hireDate", new Date(2011, 11, 1)); n1.put("type", "work"); n1.put("number", "1-800-555-1212"); list.add(n1); n2.put("type", "home"); n2.put("number", "1-866-444-3131"); list.add(n2); m.put("phones", list); • It was still pretty easy to add this data to the structure • …but meanwhile, in the persistence code … REALLY brace yourself…
  16. 16. SQL Day 3 changes: Option 1: Assume just 1 work and 1 home phone number DDL: alter table contact add work_phone varchar(16); alter table contact add home_phone varchar(16); init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name, title, hiredate, work_phone, home_phone ) values ( ?,?,?,?,?,? )"); fetchStmt = connection.prepareStatement("select id, name, title, hiredate, work_phone, home_phone from contact where id = ?"); } save(Map m) { contactInsertStmt.setString(1, m.get("id")); contactInsertStmt.setString(2, m.get("name")); contactInsertStmt.setString(3, m.get("title")); contactInsertStmt.setDate(4, m.get("hireDate")); for(Map onePhone : m.get("phones")) { String t = onePhone.get("type"); String n = onePhone.get("number"); if(t.equals("work")) { contactInsertStmt.setString(5, n); } else if(t.equals("home")) { contactInsertStmt.setString(6, n); } } contactInsertStmt.execute(); } Map fetch(String id) { Map m = null; fetchStmt.setString(1, id); rs = fetchStmt.executeQuery(); if(rs.next()) { m = new HashMap(); m.put("id", rs.getString(1)); m.put("name", rs.getString(2)); m.put("title", rs.getString(3)); m.put("hireDate", rs.getDate(4)); List list = new ArrayList(); Map onePhone; onePhone = new HashMap(); onePhone.put("type", "work"); onePhone.put("number", rs.getString(5)); list.add(onePhone); onePhone = new HashMap(); onePhone.put("type", "home"); onePhone.put("number", rs.getString(6)); list.add(onePhone); m.put("phones", list); } return m; } This is just plain bad….
  17. 17. SQL Day 3 changes: Option 2: Proper approach with multiple phone numbers DDL: create table phones ( … ) init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )"); c2stmt = connection.prepareStatement("insert into phones (id, type, number) values (?, ?, ?)"); fetchStmt = connection.prepareStatement("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?"); } save(Map m) { startTrans(); contactInsertStmt.setString(1, m.get("id")); contactInsertStmt.setString(2, m.get("name")); contactInsertStmt.setString(3, m.get("title")); contactInsertStmt.setDate(4, m.get("hireDate")); for(Map onePhone : m.get("phones")) { c2stmt.setString(1, m.get("id")); c2stmt.setString(2, onePhone.get("type")); c2stmt.setString(3, onePhone.get("number")); c2stmt.execute(); } contactInsertStmt.execute(); endTrans(); } Map fetch(String id) { Map m = null; fetchStmt.setString(1, id); rs = fetchStmt.executeQuery(); int i = 0; List list = new ArrayList(); while (rs.next()) { if(i == 0) { m = new HashMap(); m.put("id", rs.getString(1)); m.put("name", rs.getString(2)); m.put("title", rs.getString(3)); m.put("hireDate", rs.getDate(4)); m.put("phones", list); } Map onePhone = new HashMap(); onePhone.put("type", rs.getString(5)); onePhone.put("number", rs.getString(6)); list.add(onePhone); i++; } return m; } This took time and money
  18. 18. SQL Day 5: Zombies! (zero or more between entities) init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )"); c2stmt = connection.prepareStatement("insert into phones (id, type, number) values (?, ?, ?)"); fetchStmt = connection.prepareStatement("select A.id, A.name, A.title, A.hiredate, B.type, B.number from contact A left outer join phones B on (A.id = B.id) where A.id = ?"); } Whoops! And it’s also wrong! We did not design the query to account for contacts that have no phone number. Thus, we have to change the join to an outer join. But this ALSO means we have to change the unwind logic. This took more time and money! while (rs.next()) { if(i == 0) { // … } String s = rs.getString(5); if(s != null) { Map onePhone = new HashMap(); onePhone.put("type", s); onePhone.put("number", rs.getString(6)); list.add(onePhone); } } …but at least we have a DAL… right?
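
The "zombie" problem on this slide can be reproduced in a few lines. The sketch below again assumes SQLite via Python's `sqlite3` module as a stand-in RDBMS; the sample contact `jane` with no phones is illustrative data, not from the slides. It runs the same unwind logic over an inner join and a left outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table contact (id text primary key, name text);
    create table phones (id text, type text, number text);
    insert into contact values ('K1', 'buzz');
    insert into contact values ('K2', 'jane');  -- jane has no phones
    insert into phones values ('K1', 'work', '1-800-555-1212');
""")

INNER = """select A.id, A.name, B.type, B.number
           from contact A join phones B on A.id = B.id where A.id = ?"""
OUTER = """select A.id, A.name, B.type, B.number
           from contact A left outer join phones B on A.id = B.id where A.id = ?"""


def fetch(sql, contact_id):
    m = None
    for cid, name, ptype, number in conn.execute(sql, (contact_id,)):
        if m is None:
            m = {"id": cid, "name": name, "phones": []}
        # With the outer join, a contact without phones yields one row of NULLs.
        if ptype is not None:
            m["phones"].append({"type": ptype, "number": number})
    return m
```

With the inner join, `fetch(INNER, "K2")` finds nothing even though the contact exists; with the outer join the contact comes back with an empty phone list, at the price of the extra NULL check in the loop.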
  19. 19. MongoDB Day 3 Advantages: 1. Zero time and money spent on overhead code 2. No need to fear fields that are “naturally occurring” lists containing data specific to the parent structure and thus do not benefit from normalization and referential integrity 3. Safe from zombies and other undead distractions from productivity save(Map m) { collection.insert(m); } Map fetch(String id) { Map m = null; DBObject dbo = new BasicDBObject(); dbo.put("id", id); c = collection.find(dbo); if(c.hasNext()) { m = (Map) c.next(); } return m; } ✔ NO CHANGE
  20. 20. By Day 14, our structure looks like this: n4.put("geo", "US-EAST"); n4.put("startupApps", new String[] { "app1", "app2", "app3" } ); list2.add(n4); n5.put("geo", "EMEA"); n5.put("startupApps", new String[] { "app6" } ); n5.put("useLocalNumberFormats", false); list2.add(n5); m.put("preferences", list2); n6.put("optOut", true); n6.put("assertDate", someDate); seclist.add(n6); m.put("attestations", seclist); m.put("security", mapOfDataCreatedByExternalSource); • It was still pretty easy to add this data to the structure • Want to guess what the SQL persistence code looks like? • How about the MongoDB persistence code?
  21. 21. SQL Day 14 Error: Could not fit all the code into this space. …actually, I didn’t want to spend 2 hours putting the code together.. But very likely, among other things: • n4.put(“startupApps”,new String[]{“app1”,“app2”,“app3”}); was implemented as a single semi-colon delimited string • m.put(“security”, anotherMapOfData); was implemented by flattening it out and storing a subset of fields
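
To make the first workaround concrete, here is a sketch of what storing a list as a single semicolon-delimited string involves. The function names `pack` and `unpack` are hypothetical; the slide only says that such flattening is likely, not how it would be coded.

```python
def pack(values):
    # The workaround the slide describes: one delimited string in one column.
    return ";".join(values)


def unpack(column):
    # An empty column is read back as an empty list.
    return column.split(";") if column else []


apps = ["app1", "app2", "app3"]
stored = pack(apps)          # what would sit in the varchar column
restored = unpack(stored)    # what the application gets back

# A value that happens to contain the delimiter does not survive the round trip.
tricky = ["report;v2", "app6"]
damaged = unpack(pack(tricky))
```

The simple case round-trips, but the second list comes back with three elements instead of two, so the encoding needs escaping rules that every reader and writer of the column must share.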
  22. 22. MongoDB Day 14 – and every other day Advantages: 1. Zero time and money spent on overhead code 2. Persistence is so easy and flexible and backward compatible that the persistor does not upward-influence the shapes we want to persist, i.e. the tail does not wag the dog save(Map m) { collection.insert(m); } Map fetch(String id) { Map m = null; DBObject dbo = new BasicDBObject(); dbo.put("id", id); c = collection.find(dbo); if(c.hasNext()) { m = (Map) c.next(); } return m; } ✔ NO CHANGE
  23. 23. But what if we must do a join? Both RDBMS and MongoDB will have a PhoneTransactions table/collection { customer_id : 1, first_name : "Mark", last_name : "Smith", city : "San Francisco", phones: [ { type : “work”, number: “1-800-555-1212” }, { type : “home”, number: “1-800-555-1313”, DNC: true }, { type : “home”, number: “1-800-555-1414”, DNC: true } ] } { number: “1-800-555-1212”, target: “1-999-238-3423”, duration: 20 } { number: “1-800-555-1212”, target: “1-444-785-6611”, duration: 243 } { number: “1-800-555-1414”, target: “1-645-331-4345”, duration: 132 } { number: “1-800-555-1414”, target: “1-990-875-2134”, duration: 71 } PhoneTransactions
  24. 24. SQL Join Attempt #1
      select A.id, A.lname, B.type, B.number, C.target, C.duration from contact A, phones B, phonestx C where A.id = B.id and B.number = C.number
       id  | lname     | type | number         | target         | duration
      -----+-----------+------+----------------+----------------+----------
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7071 | 7
       g9  | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7070 | 23
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7072 | 9
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
      How to turn this into a list of names – each with a list of numbers, each of those with a list of target numbers?
  25. 25. SQL Unwind Attempt #1 Map idmap = new HashMap(); ResultSet rs = fetchStmt.executeQuery(); while (rs.next()) { String id = rs.getString("id"); String nmbr = rs.getString("number"); List tnum; Map snum; if((snum = (Map) idmap.get(id)) == null) { snum = new HashMap(); idmap.put(id, snum); } if((tnum = (List) snum.get(nmbr)) == null) { tnum = new ArrayList(); snum.put(nmbr, tnum); } Map info = new HashMap(); info.put("target", rs.getString("target")); info.put("duration", rs.getInt("duration")); tnum.add(info); } // idmap["g9"]["1-900-555-1212"] = ({target:1-222-707-7070,duration:23…)
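
The same unwind can be sketched in Python over a handful of rows taken from the Attempt #1 result set. The function name `unwind` is hypothetical, and the rows are hard-coded tuples rather than a live cursor so the example runs on its own.

```python
# A few rows from the join result: (id, lname, type, number, target, duration)
rows = [
    ("g9", "Moschetti", "home", "1-900-555-1212", "1-222-707-7070", 23),
    ("g10", "Kalan", "work", "1-999-444-9999", "1-222-907-7071", 7),
    ("g9", "Moschetti", "work", "1-800-989-2231", "1-987-707-7072", 9),
    ("g9", "Moschetti", "home", "1-900-555-1212", "1-222-707-7071", 7),
]


def unwind(rows):
    idmap = {}
    for cid, _lname, _type, number, target, duration in rows:
        # setdefault creates the nested container the first time a key is seen.
        calls = idmap.setdefault(cid, {}).setdefault(number, [])
        calls.append({"target": target, "duration": duration})
    return idmap


idmap = unwind(rows)
```

Because the rows arrive in no particular order, the whole result set has to be read before any one contact's entry is known to be complete.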
  26. 26. SQL Join Attempt #2
      select A.id, A.lname, B.type, B.number, C.target, C.duration from contact A, phones B, phonestx C where A.id = B.id and B.number = C.number order by A.id, B.number
       id  | lname     | type | number         | target         | duration
      -----+-----------+------+----------------+----------------+----------
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7072 | 9
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7070 | 23
       g10 | Kalan     | work | 1-999-444-9999 | 1-222-907-7071 | 7
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7072 | 9
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7070 | 23
       g9  | Moschetti | home | 1-777-999-1212 | 1-222-807-7071 | 7
       g9  | Moschetti | work | 1-800-989-2231 | 1-987-707-7072 | 9
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7071 | 7
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7072 | 9
       g9  | Moschetti | home | 1-900-555-1212 | 1-222-707-7070 | 23
      “Early bail out” from cursor is now possible – but logic to construct list of source and target numbers is similar
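
One way to picture the early bail-out is with grouping over the sorted rows. The sketch below is an assumption about how such logic might look in Python, using `itertools.groupby`; the generator name `contacts` is hypothetical and the rows are a hard-coded subset of the sorted result with the `lname` and `type` columns left out for brevity.

```python
from itertools import groupby
from operator import itemgetter

# Rows already sorted by (id, number), as "order by A.id, B.number" returns them.
rows = [
    ("g10", "1-999-444-9999", "1-222-907-7072", 9),
    ("g10", "1-999-444-9999", "1-222-907-7070", 23),
    ("g9", "1-777-999-1212", "1-222-807-7072", 9),
    ("g9", "1-800-989-2231", "1-987-707-7072", 9),
]


def contacts(sorted_rows):
    """Yield (id, {number: [calls]}) one contact at a time."""
    for cid, by_id in groupby(sorted_rows, key=itemgetter(0)):
        numbers = {}
        for number, by_number in groupby(by_id, key=itemgetter(1)):
            numbers[number] = [{"target": t, "duration": d} for _, _, t, d in by_number]
        yield cid, numbers


# Early bail-out: take the first contact and stop reading the cursor.
first_id, first_numbers = next(contacts(iter(rows)))
```

A change in the id column marks the end of a contact, so a consumer can stop after the first complete one. Grouping this way only works because the rows are sorted; on the unsorted result of Attempt #1 it would split a contact into several fragments.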
  27. 27. SQL is about Disassembly String s = "select A, B, C, D, E, F from T1,T2,T3 where T1.col = T2.col and T2.col2 = T3.col2 and X = Y and X2 != Y2 and G > 10 and G < 100 and TO_DATE(' …"; ResultSet rs = execute(s); while(rs.next()) { if(new column1 value from T1) { set up new Object; } if(new column2 value from T2) { set up new Object2; } if(new column3 value from T3) { set up new Object3; } populate maps, lists and scalars; } Design a Big Query including business logic to grab all the data up front. Throw it at the engine. Disassemble the Big Rectangle into usable objects with logic implicit in the change in column values.
  28. 28. MongoDB is about Assembly Cursor c = coll1.find({"X":"Y"}); while(c.hasNext()) { populate maps, lists and scalars; Cursor c2 = coll2.find(logic+key from c); while(c2.hasNext()) { populate maps, lists and scalars; Cursor c3 = coll3.find(logic+key from c2); while(c3.hasNext()) { populate maps, lists and scalars; } } } Assemble usable objects incrementally with explicit logic
  29. 29. MongoDB ”Join” Map idmap = new HashMap(); DBCursor c = contacts.find(); while(c.hasNext()) { DBObject item = c.next(); String id = item.get(“id”); Map nummap = new HashMap(); for(Map phone : (List)item.get(”phones”)) { String pnum = phone.get(“number”); DBObject q = new BasicDBObject(“number”, pnum); DBCursor c2 = phonestx.find(q); List txs = new ArrayList(); while(c2.hasNext()) { txs.add((Map)c2.next()); } nummap.put(pnum, txs); } idmap.put(id, nummap); } // idmap[“g9”][“1-900-555-1212”] = ({target:1-222-707-7070,duration:23…)
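
The assembly pattern on this slide can be tried without a server. In the sketch below, plain lists of dictionaries stand in for the `contacts` and `phonestx` collections, and `find` is a hypothetical stand-in for an equality query; none of this is the driver's API. The contact `g11` is illustrative data added to show the no-phones case.

```python
# Lists of dicts stand in for the two collections (no server involved).
contacts = [
    {"id": "g9", "phones": [{"type": "home", "number": "1-900-555-1212"},
                            {"type": "work", "number": "1-800-989-2231"}]},
    {"id": "g11", "phones": []},  # a contact with no phones needs no special case
]
phonestx = [
    {"number": "1-900-555-1212", "target": "1-222-707-7070", "duration": 23},
    {"number": "1-900-555-1212", "target": "1-222-707-7071", "duration": 7},
    {"number": "1-800-989-2231", "target": "1-987-707-7072", "duration": 9},
]


def find(collection, key, value):
    """Hypothetical stand-in for an equality query against a collection."""
    return [doc for doc in collection if doc.get(key) == value]


def assemble(contacts, phonestx):
    idmap = {}
    for item in contacts:
        nummap = {}
        for phone in item.get("phones", []):
            # One lookup per phone number, as in the slide's inner cursor.
            nummap[phone["number"]] = find(phonestx, "number", phone["number"])
        idmap[item["id"]] = nummap
    return idmap


idmap = assemble(contacts, phonestx)
```

The nesting of the result follows the nesting of the loops, so there is no rectangle to take apart. The trade-off is one lookup per phone number, which against a real server would mean one round trip each.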
  30. 30. But what about “real” queries? • MongoDB query language is a physical map-of-map based structure, not a String • Operators (e.g. AND, OR, GT, EQ, etc.) and arguments are keys and values in a cascade of Maps • No grammar to parse, no templates to fill in, no whitespace, no escaping quotes, no parentheses, no punctuation • Same paradigm to manipulate data is used to manipulate query expressions • …which is also, by the way, the same paradigm for working with MongoDB metadata and explain()
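
A short sketch of what "the query is a map" means in practice: query expressions can be built, combined, and edited with ordinary dictionary and list operations. The helper names `any_of` and `greater_than` are hypothetical; `$or` and `$gt` are the MongoDB operators used elsewhere in this presentation.

```python
def any_of(*clauses):
    """Combine clauses with $or (hypothetical helper)."""
    return {"$or": list(clauses)}


def greater_than(field, value):
    """Build a $gt comparison on one field (hypothetical helper)."""
    return {field: {"$gt": value}}


expr = any_of({"phones.type": "work"}, greater_than("SeqNum", 10000))

# Because the query is data, it can be inspected and modified like any other map.
expr["$or"].append({"city": "San Francisco"})
```

No string concatenation or quote escaping is involved at any point, which is the contrast with SQL that the slide draws.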
  31. 31. MongoDB Query Examples Find all contacts with at least one work phone SQL CLI select * from contact A, phones B where A.did = B.did and B.type = 'work'; MongoDB CLI db.contact.find({"phones.type":"work"}); SQL in Java String s = "select * from contact A, phones B where A.did = B.did and B.type = 'work'"; ResultSet rs = execute(s); MongoDB via Java driver DBObject expr = new BasicDBObject(); expr.put("phones.type", "work"); Cursor c = contact.find(expr);
  32. 32. MongoDB Query Examples Find all contacts with at least one work phone or hired after 2014-02-02 SQL select A.did, A.lname, A.hiredate, B.type, B.number from contact A left outer join phones B on (B.did = A.did) where B.type = 'work' or A.hiredate > '2014-02-02'::date MongoDB CLI db.contacts.find({"$or": [ {"phones.type":"work"}, {"hiredate": {"$gt": new ISODate("2014-02-02")}} ]});
  33. 33. MongoDB Query Examples Find all contacts with at least one work phone or hired after 2014-02-02 MongoDB via Java driver List arr = new ArrayList(); Map phones = new HashMap(); phones.put("phones.type", "work"); arr.add(phones); Map hdate = new HashMap(); java.util.Date d = dateFromStr("2014-02-02"); hdate.put("hiredate", new BasicDBObject("$gt",d)); arr.add(hdate); Map m1 = new HashMap(); m1.put("$or", arr); contact.find(new BasicDBObject(m1));
  34. 34. …and before you ask… Yes, MongoDB query expressions support 1. Sorting 2. Cursor size limit 3. Projection (asking for only parts of the rich shape to be returned) 4. Aggregation (“GROUP BY”) functions
  35. 35. Day 30: RAD on MongoDB with Python import datetime import pymongo def save(data): coll.insert_one(data) def fetch(id): return coll.find_one({"id": id}) myData = { "name": "jane", "id": "K2", # no title? No problem "hireDate": datetime.datetime(2011, 11, 1), "phones": [ { "type": "work", "number": "1-800-555-1212" }, { "type": "home", "number": "1-866-444-3131" } ] } save(myData) print(fetch("K2")) expr = {"$or": [ {"phones.type": "work"}, {"hireDate": {"$gt": datetime.datetime(2014,2,2)}} ]} for c in coll.find(expr): print([ k.upper() for k in sorted(c.keys()) ]) Advantages: 1. Far easier and faster to create scripts due to “fidelity-parity” of mongoDB map data and python (and perl, ruby, and javascript) structures 2. Data types and structure in scripts are exactly the same as that read and written in Java and C++
  36. 36. Day 30: Polymorphic RAD on MongoDB with Python import pymongo item = fetch("K8") # item is: { “name”: “bob”, “id”: “K8”, "personalData": { "preferedAirports": [ "LGA", "JFK" ], "travelTimeThreshold": { "value": 3, "units": “HRS”} } } item = fetch("K9") # item is: { “name”: “steve”, “id”: “K9”, "personalData": { "lastAccountVisited": { "name": "mongoDB", "when": datetime.date(2013,11,4) }, "favoriteNumber": 3.14159 } } Advantages: 1. Scripting languages easily digest shapes with common fields and dissimilar fields 2. Easy to create an information architecture where placeholder fields like personalData are “known” in the software logic to be dynamic
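
As a sketch of how a script might digest such polymorphic shapes, the function below walks any nesting of maps and lists and reports the dotted paths it finds. The name `leaf_paths` is hypothetical, and the two records are copied from this slide as plain Python data rather than fetched from a database.

```python
import datetime

bob = {"name": "bob", "id": "K8",
       "personalData": {"preferedAirports": ["LGA", "JFK"],
                        "travelTimeThreshold": {"value": 3, "units": "HRS"}}}
steve = {"name": "steve", "id": "K9",
         "personalData": {"lastAccountVisited": {"name": "mongoDB",
                                                 "when": datetime.date(2013, 11, 4)},
                          "favoriteNumber": 3.14159}}


def leaf_paths(value, prefix=""):
    """Return the set of dotted paths to every scalar, whatever the shape."""
    if isinstance(value, dict):
        found = set()
        for key, child in value.items():
            found |= leaf_paths(child, f"{prefix}.{key}" if prefix else key)
        return found
    if isinstance(value, list):
        found = set()
        for child in value:
            found |= leaf_paths(child, prefix)
        return found
    return {prefix}


# Fields the two records share, found without knowing either shape in advance.
common = leaf_paths(bob) & leaf_paths(steve)
```

Here `common` contains only `name` and `id`; everything under `personalData` differs between the two records, and the same function handles both.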
  37. 37. Day 30: (Not) RAD on top of SQL with Python init() { contactInsertStmt = connection.prepareStatement("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )"); c2stmt = connection.prepareStatement("insert into phones (id, type, number) values (?, ?, ?)"); fetchStmt = connection.prepareStatement("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?"); } save(Map m) { startTrans(); contactInsertStmt.setString(1, m.get("id")); contactInsertStmt.setString(2, m.get("name")); contactInsertStmt.setString(3, m.get("title")); contactInsertStmt.setDate(4, m.get("hireDate")); for(Map onePhone : m.get("phones")) { c2stmt.setString(1, m.get("id")); c2stmt.setString(2, onePhone.get("type")); c2stmt.setString(3, onePhone.get("number")); c2stmt.execute(); } contactInsertStmt.execute(); endTrans(); } Consequences: 1. All logic coded in the Java interface layer (unwinding contact, phones, preferences, etc.) needs to be rewritten in python (unless Jython is used) … AND/or perl, C++, Scala, etc. 2. No robust way to handle polymorphic data other than BLOBing it 3. …and that will take real time and money!
  38. 38. The Fundamental Change with mongoDB RDBMS designed in an era when: • CPU and disk were slow & expensive • Memory was VERY expensive • Network? What network? • Languages had limited means to dynamically reflect on their types • Languages had poor support for richly structured types Thus, the database had to • Act as combiner-coordinator of simpler types • Define a rigid schema • (Together with the code) optimize at compile-time, not run-time In mongoDB, the data is the schema!
  39. 39. mongoDB and the Rich Map Ecosystem Generic comparison of two records Map expr = new HashMap(); expr.put("myKey", "K1"); DBObject a = collection.findOne(expr); expr.put("myKey", "K2"); DBObject b = collection.findOne(expr); List<MapDiff.Difference> d = MapDiff.diff((Map)a, (Map)b); Getting default values for a thing on a certain date and then overlaying user preferences (like for a calculation run) Map expr = new HashMap(); expr.put("myKey", "DEFAULT"); expr.put("createDate", new Date(2013, 11, 1)); DBObject a = collection.findOne(expr); expr.clear(); expr.put("myKey", "user1"); DBObject b = otherCollectionPerhaps.findOne(expr); MapStack s = new MapStack(); s.push((Map)a); s.push((Map)b); Map merged = s.project(); Runtime reflection of Maps and Lists enables generic powerful utilities (MapDiff, MapStack) to be created once and used for all kinds of shapes, saving time and money
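
The slide does not show how MapDiff and MapStack are implemented, so the following is only a sketch of the idea in Python under that caveat. The functions `diff_maps` and `overlay` are hypothetical analogues written for this illustration, and the sample `defaults` and `user1` maps are invented data.

```python
def diff_maps(a, b, path=""):
    """Return {dotted_path: (a_value, b_value)} for every difference."""
    out = {}
    for key in sorted(set(a) | set(b)):
        here = f"{path}.{key}" if path else key
        left, right = a.get(key), b.get(key)
        if isinstance(left, dict) and isinstance(right, dict):
            out.update(diff_maps(left, right, here))
        elif left != right:
            out[here] = (left, right)
    return out


def overlay(*layers):
    """Merge maps so later layers win, recursing into nested maps."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = overlay(merged[key], value)
            else:
                merged[key] = value
    return merged


defaults = {"currency": "USD", "limits": {"daily": 100, "monthly": 1000}}
user1 = {"limits": {"daily": 250}}

merged = overlay(defaults, user1)
changes = diff_maps(defaults, merged)
```

Neither function knows anything about contacts, preferences, or limits; both work on any shape made of maps, which is the reuse the slide is pointing at. In this sketch a missing key and a key holding `None` compare as equal, a simplification a real utility would probably avoid.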
  40. 40. Lastly: A CLI with teeth > db.contact.find({"SeqNum": {"$gt":10000}}).explain(); { "cursor" : "BasicCursor", "n" : 200000, //... "millis" : 223 } Try a query and show the diagnostics > for(v=[],i=0;i<3;i++) { … n = i*50000; … expr = {"SeqNum": {"$gt": n}}; … v.push( [n, db.contact.find(expr).explain().millis] ); } Run it 3 times with smaller and smaller chunks and create a vector of timing result pairs (size,time) > v [ [ 0, 225 ], [ 50000, 222 ], [ 100000, 220 ] ] Let’s see that vector > load("jStat.js") > jStat.stdev(v.map(function(p){return p[1];})) 2.0548046676563256 Use any other javascript you want inside the shell > for(i=0;i<3;i++) { … expr = {"SeqNum": {"$gt":i*1000}}; … db.foo.insert(db.contact.find(expr).explain()); } Party trick: save the explain() output back into a collection!
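
The spread reported on this slide can be checked independently of the shell. The sketch below recomputes it in Python from the three timing pairs shown; working it through by hand shows the slide's figure is the population standard deviation (dividing by n, not n - 1).

```python
import statistics

# (lower bound on SeqNum, elapsed milliseconds) pairs from the slide.
v = [(0, 225), (50000, 222), (100000, 220)]
millis = [t for _, t in v]

# Mean is 667/3; squared deviations sum to 38/3; dividing by n = 3 and taking
# the square root gives sqrt(38/9), roughly 2.0548.
spread = statistics.pstdev(millis)
```

Using `statistics.stdev` instead would apply the sample formula and give a larger value, so the choice of function matters when comparing against the shell output.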
  41. 41. What Does All This Add Up To? • MongoDB is easier than RDBMS/SQL for real problems • Quicker to change • Much better harmonized with modern languages + • Comprehensive indexing (arbitrary non/unique secondaries, compound keys, geospatial, text search, TTL, etc.) • Horizontally scalable to petabytes • Isomorphic HA and DR = Modern Database for Modern Solutions
  42. 42. Webinar Q&A buzz.moschetti@mongodb.com
  43. 43. #MongoDB Thank You Buzz Moschetti buzz.moschetti@mongodb.com Enterprise Architect, MongoDB
