Modern-day application development demands persistence of complex and dynamic shapes of data to match the highly flexible and powerful languages used in today's software landscape. Traditional approaches to solutions development with an RDBMS increasingly expose the gap between the ease of use of modern development languages and the relational data model. Development time is wasted as the bulk of the work shifts from adding business features to struggling with the RDBMS. MongoDB, the leading NoSQL database, offers a flexible and scalable solution.
In this webinar, we will provide a medium-to-deep exploration of the MongoDB programming model and APIs, and how they transform the way developers interact with a database, leading to:
• Faster time to market for both initial deployment and subsequent change
• Lower development costs
• More choices in coupling features of a language to the database
We will also review the advantages of MongoDB technology in the rapid application development (RAD) space for popular scripting languages such as JavaScript, Python, Perl, and Ruby.
2. Who is your Presenter?
• Yes, I use "Buzz" on my business cards
• Former Investment Bank Chief Architect at JPMorganChase, and at Bear Stearns before that
• Over 25 years of designing and building systems
  • Big and small
  • Super-specialized to broadly useful in any vertical
  • "Traditional" to completely disruptive
• Advocate of language leverage and strong factoring
• Still programming – using emacs, of course
3. What Are Your Developers Doing All Day?
Adding and testing business features
OR
"Integrating with other components, tools, and systems":
• Database(s)
• ETL and other data transfer operations
• Messaging
• Services (web & other)
• Other open source frameworks
4. Why Can't We Just Save and Fetch Data?
Because the way we think about data at the business use case level…
…is different than the way it is implemented at the application/code level…
…which traditionally is VERY different than the way it is implemented at the database level
5. This Problem Isn't New…
…but for the past 40 years, innovation at the business & application layers has outpaced innovation at the database layer.

Business Data Goals
  1974: Capture my company's transactions daily at 5:30PM EST, add them up on a nightly basis, and print a big stack of paper
  2014: Capture my company's global transactions in realtime plus everything that is happening in the world (customers, competitors, business/regulatory, weather), producing any number of computed results, and passing this all in realtime to predictive analytics with model feedback; results in realtime to 10000s of mobile devices, multiple GUIs, and b2b and b2c channels

Release Schedule
  1974: Quarterly
  2014: Yesterday

Application/Code
  1974: COBOL, Fortran, Algol, PL/1, assembler, proprietary tools
  2014: COBOL, Fortran, C, C++, VB, C#, Java, JavaScript, Groovy, Ruby, Perl, Python, Obj-C, SmallTalk, Clojure, ActionScript, Flex, DSLs, Spring, AOP, CORBA, ORM, third-party software ecosystem, open source movement

Database
  1974: I/VSAM, early RDBMS
  2014: Mature RDBMS, legacy I/VSAM, column & key/value stores, and… mongoDB
6. Exactly How Does mongoDB Change Things?
• mongoDB is designed from the ground up to address rich structure (maps of maps of lists of…), not rectangles
• Standard RDBMS interfaces (e.g. JDBC) do not exploit features of contemporary languages
• Rapid Application Development (RAD) and scripting in JavaScript, Python, Perl, Ruby, and Scala is impedance-matched to mongoDB
• In mongoDB, the data is the schema
• Shapes of data go in the same way they come out
7. Rectangles are 1974. Maps and Lists are 2014

{
    customer_id : 1,
    first_name : "Mark",
    last_name : "Smith",
    city : "San Francisco",
    phones : [
        {
            type : "work",
            number : "1-800-555-1212"
        },
        {
            type : "home",
            number : "1-800-555-1313",
            DNC : true
        },
        {
            type : "home",
            number : "1-800-555-1414",
            DNC : true
        }
    ]
}
8. An Actual Code Example (Finally!)
Let's compare and contrast RDBMS/SQL to mongoDB development using Java over the course of a few weeks. Some ground rules:
1. Observe rules of Software Engineering 101: assume separation of application, Data Access Layer (DAL), and persistor implementation
2. The Data Access Layer must be able to:
   a. Expose simple, functional, data-only interfaces to the application (no ORM, frameworks, compile-time bindings, or special tools)
   b. Exploit high-performance features of the persistor
3. Focus on core data-handling code and avoid distractions that require the same amount of work in both technologies:
   a. No exception or error handling
   b. Leave out DB connection and other setup resources
4. Day counts are a proxy for progress, not actual time to complete the indicated task
9. The Task: Saving and Fetching Contact Data
Start with this simple, flat shape in the Data Access Layer:

Map m = new HashMap();
m.put("name", "buzz");
m.put("id", "K1");

And assume we save it in this way:

save(Map m)

And assume we fetch one by primary key in this way:

Map m = fetch(String id)

Brace yourself…
10. Day 1: Initial efforts for both technologies

SQL
DDL: create table contact ( … )

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name ) values ( ?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name from contact where id = ?");
}

save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.execute();
}

Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    rs = fetchStmt.execute();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
    }
    return m;
}

mongoDB
DDL: none

save(Map m)
{
    collection.insert(m);
}

Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}

Let's assume for argument's sake that both approaches take the same amount of time.
11. Day 2: Add simple fields

m.put("name", "buzz");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));

• Capturing title and hireDate is part of adding a new business feature
• It was pretty easy to add two fields to the structure
• …but now we have to change our persistence code

Brace yourself (again)…
12. SQL Day 2 (changes in bold)

DDL:
alter table contact add title varchar(8);
alter table contact add hireDate date;

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate from contact where id = ?");
}

save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    contactInsertStmt.execute();
}

Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    rs = fetchStmt.execute();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
        m.put("title", rs.getString(3));
        m.put("hireDate", rs.getDate(4));
    }
    return m;
}

Consequences:
1. Code release schedule linked to database upgrade (new code cannot run on old schema)
2. Issues with case sensitivity starting to creep in (many RDBMS are case-insensitive for column names, but code is case-sensitive)
3. Changes require careful mods in 4 places
4. Beginning of technical debt
13. mongoDB Day 2

save(Map m)
{
    collection.insert(m);
}

Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. Code and database not physically linked
3. New material with more fields can be added into existing collections; backfill is optional
4. Names of fields in the database precisely match key names in the code layer, matching directly on name rather than indirectly via positional offset
5. No technical debt is created
14. Day 3: Add list of phone numbers

m.put("name", "buzz");
m.put("id", "K1");
m.put("title", "Mr.");
m.put("hireDate", new Date(2011, 11, 1));

List list = new ArrayList();
Map n1 = new HashMap();
n1.put("type", "work");
n1.put("number", "1-800-555-1212");
list.add(n1);
Map n2 = new HashMap();
n2.put("type", "home");
n2.put("number", "1-866-444-3131");
list.add(n2);
m.put("phones", list);

• It was still pretty easy to add this data to the structure
• …but meanwhile, in the persistence code…

REALLY brace yourself…
15. SQL Day 3 changes: Option 1: Assume just 1 work and 1 home phone number

DDL:
alter table contact add work_phone varchar(16);
alter table contact add home_phone varchar(16);

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate, work_phone, home_phone ) values ( ?,?,?,?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate, work_phone, home_phone from contact where id = ?");
}

save(Map m)
{
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    for(Map onePhone : m.get("phones")) {
        String t = onePhone.get("type");
        String n = onePhone.get("number");
        if(t.equals("work")) {
            contactInsertStmt.setString(5, n);
        } else if(t.equals("home")) {
            contactInsertStmt.setString(6, n);
        }
    }
    contactInsertStmt.execute();
}

Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    rs = fetchStmt.execute();
    if(rs.next()) {
        m = new HashMap();
        m.put("id", rs.getString(1));
        m.put("name", rs.getString(2));
        m.put("title", rs.getString(3));
        m.put("hireDate", rs.getDate(4));
        List list = new ArrayList();
        Map onePhone;
        onePhone = new HashMap();
        onePhone.put("type", "work");
        onePhone.put("number", rs.getString(5));
        list.add(onePhone);
        onePhone = new HashMap();
        onePhone.put("type", "home");
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
        m.put("phones", list);
    }
    return m;
}

This is just plain bad…
16. SQL Day 3 changes: Option 2: Proper approach with multiple phone numbers

DDL:
create table phones ( … )

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones ( id, type, number ) values ( ?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}

save(Map m)
{
    startTrans();
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    for(Map onePhone : m.get("phones")) {
        c2stmt.setString(1, m.get("id"));
        c2stmt.setString(2, onePhone.get("type"));
        c2stmt.setString(3, onePhone.get("number"));
        c2stmt.execute();
    }
    contactInsertStmt.execute();
    endTrans();
}

Map fetch(String id)
{
    Map m = null;
    fetchStmt.setString(1, id);
    rs = fetchStmt.execute();
    int i = 0;
    List list = new ArrayList();
    while (rs.next()) {
        if(i == 0) {
            m = new HashMap();
            m.put("id", rs.getString(1));
            m.put("name", rs.getString(2));
            m.put("title", rs.getString(3));
            m.put("hireDate", rs.getDate(4));
            m.put("phones", list);
        }
        Map onePhone = new HashMap();
        onePhone.put("type", rs.getString(5));
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
        i++;
    }
    return m;
}

This took time and money.
17. SQL Day 5: Zombies! (zero or more between entities)

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones ( id, type, number ) values ( ?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select A.id, A.name, A.title, A.hiredate, B.type, B.number from contact A left outer join phones B on (A.id = B.id) where A.id = ?");
}

while (rs.next()) {
    if(i == 0) {
        // …
    }
    String s = rs.getString(5);
    if(s != null) {
        Map onePhone = new HashMap();
        onePhone.put("type", s);
        onePhone.put("number", rs.getString(6));
        list.add(onePhone);
    }
}

Whoops! And it's also wrong! We did not design the query accounting for contacts that have no phone number. Thus, we have to change the join to an outer join. But this ALSO means we have to change the unwind logic.

This took more time and money! …but at least we have a DAL… right?
18. mongoDB Day 3

save(Map m)
{
    collection.insert(m);
}

Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. No need to fear fields that are "naturally occurring" lists containing data specific to the parent structure, which thus do not benefit from normalization and referential integrity
19. By Day 14, our structure looks like this:

m.put("name", "buzz");
m.put("id", "K1");
//…
n4.put("startupApps", new String[] { "app1", "app2", "app3" } );
n4.put("geo", "US-EAST");
list2.add(n4);
n5.put("startupApps", new String[] { "app6" } );
n5.put("geo", "EMEA");
n5.put("useLocalNumberFormats", false);
list2.add(n5);
m.put("preferences", list2);
n6.put("optOut", true);
n6.put("assertDate", someDate);
seclist.add(n6);
m.put("attestations", seclist);
m.put("security", anotherMapOfData);

• It was still pretty easy to add this data to the structure
• Want to guess what the SQL persistence code looks like?
• How about the mongoDB persistence code?
20. SQL Day 14

Error: Could not fit all the code into this space.
…actually, I didn't want to spend 2 hours putting the code together.

But very likely, among other things:
• n4.put("startupApps", new String[]{"app1","app2","app3"});
  was implemented as a single semicolon-delimited string
• m.put("security", anotherMapOfData);
  was implemented by flattening it out and storing a subset of fields
21. mongoDB Day 14 – and every other day

save(Map m)
{
    collection.insert(m);
}

Map fetch(String id)
{
    Map m = null;
    DBObject dbo = new BasicDBObject();
    dbo.put("id", id);
    c = collection.find(dbo);
    if(c.hasNext()) {
        m = (Map) c.next();
    }
    return m;
}

✔ NO CHANGE

Advantages:
1. Zero time and money spent on overhead code
2. Persistence is so easy, flexible, and backward compatible that the persistor does not upward-influence the shapes we want to persist, i.e. the tail does not wag the dog
22. But what about "real" queries?
• The mongoDB query language is a physical map-of-map based structure, not a String
• Operators (e.g. AND, OR, GT, EQ, etc.) and arguments are keys and values in a cascade of Maps
• No grammar to parse, no templates to fill in, no whitespace, no escaping quotes, no parentheses, no punctuation
• The same paradigm used to manipulate data is used to manipulate query expressions
• …which is also, by the way, the same paradigm for working with mongoDB metadata and explain()
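To make the "queries are just data" point concrete, here is a minimal Python sketch (field names such as phones and hireDate follow the contact examples in this deck; the helper function and the cutoff value are illustrative, not the webinar's actual code):

```python
# A compound filter is built the same way a document is built: as plain
# maps and lists. No string grammar, no quoting, no parsing.
def contacts_missing_phone_or_hired_after(cutoff):
    # Equivalent CLI form:
    #   db.contact.find({"$or": [ {"phones": {"$exists": false}},
    #                             {"hireDate": {"$gt": cutoff}} ]})
    return {
        "$or": [
            {"phones": {"$exists": False}},
            {"hireDate": {"$gt": cutoff}},
        ]
    }

expr = contacts_missing_phone_or_hired_after("2011-11-01")

# Because the expression is ordinary data, we can extend it with the same
# tools we use on documents -- here, appending another OR branch:
expr["$or"].append({"title": "Mr."})
```

The same walk/inspect/modify techniques apply to query expressions, documents, and explain() output alike, which is the symmetry the slide describes.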
23. mongoDB Query Examples

Objective: Find all contacts with at least one mobile phone
Code:
    Map expr = new HashMap();
    expr.put("phones.type", "mobile");
CLI:
    db.contact.find({"phones.type": "mobile"});

Objective: Find contacts with NO phones
Code:
    Map expr = new HashMap();
    Map q1 = new HashMap();
    q1.put("$exists", false);
    expr.put("phones", q1);
CLI:
    db.contact.find({"phones": {"$exists": false}});

List fetchGeneral(Map expr)
{
    List l = new ArrayList();
    DBObject dbo = new BasicDBObject(expr);
    Cursor c = collection.find(dbo);
    while (c.hasNext()) {
        l.add((Map)c.next());
    }
    return l;
}

Advantages:
1. Far less time required to set up complex parameterized filters
2. No need for SQL rewrite logic or creating new PreparedStatements
3. The map-of-maps query structure is easily walked and processed without parsing
24. …and before you ask…
Yes, mongoDB query expressions support:
1. Sorting
2. Cursor size limit
3. Aggregation functions
4. Projection (asking for only parts of the rich shape to be returned)
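These options are also expressed as plain data rather than SQL text. A hedged Python sketch (assuming the standard pymongo find/sort/limit calls; no live collection is shown, so the snippet only builds and inspects the argument structures):

```python
# Filter and projection are ordinary dicts; sort and limit are ordinary values.
query      = {"phones.type": "mobile"}          # match at least one mobile phone
projection = {"name": 1, "phones": 1, "_id": 0} # return only parts of the shape

# Against a live pymongo collection this would read:
#   coll.find(query, projection).sort("name", 1).limit(10)
# Everything passed to the driver is inspectable data, so a DAL can log,
# modify, or compose the request before execution:
request = {
    "filter": query,
    "projection": projection,
    "sort": [("name", 1)],
    "limit": 10,
}
```

Contrast with SQL, where adding a sort or projection means rewriting the statement string and re-preparing it.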
25. Day 30: RAD on mongoDB with Python

import pymongo

def save(data):
    coll.insert(data)

def fetch(id):
    return coll.find_one({"id": id})

myData = {
    "name": "jane",
    "id": "K2",
    # no title? No problem
    "hireDate": datetime.date(2011, 11, 1),
    "phones": [
        { "type": "work",
          "number": "1-800-555-1212"
        },
        { "type": "home",
          "number": "1-866-444-3131"
        }
    ]
}
save(myData)
print fetch("K2")

expr = { "$or": [ {"phones": { "$exists": False }}, {"name": "jane"} ] }
for c in coll.find(expr):
    print [ k.upper() for k in sorted(c.keys()) ]

Advantages:
1. Far easier and faster to create scripts due to "fidelity-parity" of mongoDB map data and Python (and Perl, Ruby, and JavaScript) structures
2. Data types and structure in scripts are exactly the same as those read and written in Java and C++
26. Day 30: Polymorphic RAD on mongoDB with Python

import pymongo

item = fetch("K8")
# item is:
{
    "name": "bob",
    "id": "K8",
    "personalData": {
        "preferedAirports": [ "LGA", "JFK" ],
        "travelTimeThreshold": { "value": 3, "units": "HRS" }
    }
}

item = fetch("K9")
# item is:
{
    "name": "steve",
    "id": "K9",
    "personalData": {
        "lastAccountVisited": {
            "name": "mongoDB",
            "when": datetime.date(2013, 11, 4)
        },
        "favoriteNumber": 3.14159
    }
}

Advantages:
1. Scripting languages easily digest shapes with common fields and dissimilar fields
2. Easy to create an information architecture where placeholder fields like personalData are "known" in the software logic to be dynamic
27. Day 30: (Not) RAD on top of SQL with Python

init()
{
    contactInsertStmt = connection.prepareStatement
        ("insert into contact ( id, name, title, hiredate ) values ( ?,?,?,? )");
    c2stmt = connection.prepareStatement
        ("insert into phones ( id, type, number ) values ( ?,?,? )");
    fetchStmt = connection.prepareStatement
        ("select id, name, title, hiredate, type, number from contact, phones where phones.id = contact.id and contact.id = ?");
}

save(Map m)
{
    startTrans();
    contactInsertStmt.setString(1, m.get("id"));
    contactInsertStmt.setString(2, m.get("name"));
    contactInsertStmt.setString(3, m.get("title"));
    contactInsertStmt.setDate(4, m.get("hireDate"));
    for(Map onePhone : m.get("phones")) {
        c2stmt.setString(1, m.get("id"));
        c2stmt.setString(2, onePhone.get("type"));
        c2stmt.setString(3, onePhone.get("number"));
        c2stmt.execute();
    }
    contactInsertStmt.execute();
    endTrans();
}

Consequences:
1. All logic coded in the Java interface layer (splitting up contact, phones, preferences, etc.) needs to be rewritten in Python (unless Jython is used) and/or Perl, C++, Scala, etc.
2. No robust way to handle polymorphic data other than BLOBing it
3. …and that will take real time and money!
28. The Fundamental Change with mongoDB

RDBMS were designed in an era when:
• CPU and disk were slow & expensive
• Memory was VERY expensive
• Network? What network?
• Languages had limited means to dynamically reflect on their types
• Languages had poor support for richly structured types

Thus, the database had to:
• Act as combiner-coordinator of simpler types
• Define a rigid schema
• (Together with the code) optimize at compile-time, not run-time

In mongoDB, the data is the schema!
29. mongoDB and the Rich Map Ecosystem

Generic comparison of two records:

Map expr = new HashMap();
expr.put("myKey", "K1");
DBObject a = collection.findOne(expr);
expr.put("myKey", "K2");
DBObject b = collection.findOne(expr);
List<MapDiff.Difference> d = MapDiff.diff((Map)a, (Map)b);

Getting default values for a thing on a certain date and then overlaying user preferences (like for a calculation run):

Map expr = new HashMap();
expr.put("myKey", "DEFAULT");
expr.put("createDate", new Date(2013, 11, 1));
DBObject a = collection.findOne(expr);
expr.clear();
expr.put("myKey", "user1");
DBObject b = otherCollectionPerhaps.findOne(expr);
MapStack s = new MapStack();
s.push((Map)a);
s.push((Map)b);
Map merged = s.project();

Runtime reflection of Maps and Lists enables powerful generic utilities (MapDiff, MapStack) to be created once and used for all kinds of shapes, saving time and money.
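MapDiff and MapStack are the presenter's own utilities, not a public library. As a minimal sketch of the two ideas (a recursive diff and a last-push-wins overlay), in Python and with hypothetical field names (calc, horizon, geo):

```python
def map_diff(a, b, path=""):
    """Recursively compare two map shapes; return a list of (path, a_val, b_val)."""
    diffs = []
    for k in sorted(set(a) | set(b)):
        pa, va, vb = path + "/" + k, a.get(k), b.get(k)
        if isinstance(va, dict) and isinstance(vb, dict):
            diffs += map_diff(va, vb, pa)      # descend into sub-maps
        elif va != vb:
            diffs.append((pa, va, vb))
    return diffs

def map_project(*layers):
    """Overlay maps in push order: later layers win, recursing into sub-maps."""
    merged = {}
    for layer in layers:
        for k, v in layer.items():
            if isinstance(v, dict) and isinstance(merged.get(k), dict):
                merged[k] = map_project(merged[k], v)
            else:
                merged[k] = v
    return merged

defaults  = {"calc": {"horizon": 30, "curve": "LIBOR"}, "geo": "US-EAST"}
userprefs = {"calc": {"horizon": 10}}
merged = map_project(defaults, userprefs)
# merged -> {"calc": {"horizon": 10, "curve": "LIBOR"}, "geo": "US-EAST"}

diffs = map_diff(defaults, merged)
# diffs -> [("/calc/horizon", 30, 10)]
```

Because documents come back from mongoDB as ordinary maps and lists, utilities like these work unchanged on any shape in any collection; that is the "write once, use for all shapes" point of the slide.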
30. Lastly: A CLI with teeth

Try a query and show the diagnostics:

> db.contact.find({"SeqNum": {"$gt": 10000}}).explain();
{
    "cursor" : "BasicCursor",
    "n" : 200000,
    //...
    "millis" : 223
}

Run it 3 times with smaller and smaller chunks and create a vector of timing result pairs (size, time):

> for(v=[],i=0;i<3;i++) {
…   n = i*50000;
…   expr = {"SeqNum": {"$gt": n}};
…   v.push( [n, db.contact.find(expr).explain().millis] ); }

Let's see that vector:

> v
[ [ 0, 225 ], [ 50000, 222 ], [ 100000, 220 ] ]

Use any other javascript you want inside the shell:

> load("jStat.js")
> jStat.stdev(v.map(function(p){return p[1];}))
2.0548046676563256

Party trick: save the explain() output back into a collection!

> for(i=0;i<3;i++) {
…   expr = {"SeqNum": {"$gt": i*1000}};
…   db.foo.insert(db.contact.find(expr).explain()); }
Hello, this is Buzz Moschetti; welcome to the webinar entitled "drama…". Today I'm going to highlight the progressive and powerful programming model in mongoDB and how it not only reduces time-to-market but also increases flexibility and capability. The content applies to any industry, but those with questions about specific use cases in financial services, please feel free to reach out to me at the email address buzz.moschetti@mongodb.com. Some quick logistics: The presentation audio & slides will be recorded and made available to you in about 24 hours. We have an hour set up, but I'll use about 40 minutes of that for the presentation, with some time for questions. You can use the webex Q&A box to ask those questions at any time. If a significant number of similar questions show up in the middle, I will answer them; otherwise, I'll try to answer as many as possible at the end. If you have technical issues, please send a webex message to the participant ID'd as mongoDB webinar team; otherwise keep your Qs focused on the content.
The way we describe concepts like trades, products, scenarios, workflows, locations, and all the ways these things can work together is difficult to translate to software. It would be nice if we could record a use case, turn that into an MP3, and pass it to a runtime engine that would literally do what it is told, but we're not quite there yet.
For several decades, database innovation has lagged behind other areas. Data create/consume goals and release schedules are VERY different. App/code environment choices are significantly broader, moving from compile-time/vendor-oriented to dynamic runtime open standards. Why do databases lag? It's hard to build a good, robust database. Until recently, we did not have the power/flexibility of platform (incl. cloud) plus new channels and interactive scale like smart mobile devices to push us over the hurdle into a new database model to satisfy these needs.
How does it do this? A number of ways. BUT today I won't get into the infrastructure side of things: low-cost horizontal scalability, no-downtime version upgrades, multisite DR, and indeed a coherent distributed scaling/HA/DR strategy. That's all great. Today I want to talk about the topside experience, around these 3 points. Third: there is symmetry between read and write operations on collections; this will become important as the complexity of shapes increases.
To summarize (and we can't resist putting in a huge ER diagram): rectangles and the technology to support them are circa 1974. Management of rich shape is now.
2: The role of the data access layer is to hide implementation-specific details of raw data access from the consumer. The implementation inside the DAL can be as DB-specific as possible to maximize performance or other I/O goals. 2: The DAL contains as many functions as appropriate to vend data to applications; there could be dozens. 2: The topside of the DAL exposes only data to a consumer, not necessarily bespoke objects. This is because some data operations are insufficient or inappropriate to populate a true object. Logic on top of the DAL is required to take the raw data and construct the appropriate object. Even in the mongoDB rich data world, data does not necessarily equal object! 2: ORM (notably Hibernate) and annotation-based frameworks (like Morphia) have a different set of dependency & design considerations. 2: From a practical basis, it may be necessary to perform high-performance data-only operations independent of objects, and the DAL permits this to occur without exposing the entire implementation of persistence. 3: "Nearly-compilable" code.
Why a Map? Why not? No compile-time dependencies; we are restricting the types at the data access layer to pure data, not complex objects (e.g. no m.put("contact", Contact object)). The response can carry additional information. Maps are very easy to work with and have a lot of tooling around them, especially if you constrain your types. Brace yourself.
Remember: this is happening in the logic of the DAL, not the application!
Consequences: code/schema coupling. This took only a bit of time, yes, but it is pure overhead: no business feature value. Beginning of "technical debt": here, manifest as a disconnect between the schema in the DB, the logic in the DAL, and the Map. SQL nuances: positional params are here for convenience but don't really change things, because: A. In a ResultSet, a "select column AS foo" changes the resultset column name to foo instead of the real column name; thus what you see in the DB vs. the code could differ. Interesting fact: "select fld1 as foo, fld2 as foo" is LEGAL! ResultSet.findColumn() and getString(String name) return the first one they find! B. The column names are still in a different semantic domain (e.g. case insensitivity) than our Map names, so you have to provide a "column" -> "mapKey" mapping anyway (no direct storage like mongoDB). C. PreparedStatements can't use column names for substitution anyway; they can ONLY use positional parameters; more evidence of input/output mismatch. D. This is not going to be the dominant sweat/stability issue; it's the relational composition and decomposition, which we'll see later.
No change! Values we set into the map are preserved; period.
Add phone numbers. There could be several; we don't know how many. There could be other attributes like do-not-call and smartphone type, but we'll leave that for later.
The attempt to duck the "listiness" of multiple phones has yielded a bad information architecture practically from the start of development. This is just plain bad, setting the stage for pain when more phones are added in the future. More technical debt!
This is actually a "friendly" case: phones uses the same ID as contact; in other cases, a new foreign key would have to be managed. We are sidestepping cross-table integrity; other functions besides save() would come into play. The JOIN to fetch all information produces a ResultSet unwind issue: the id, name, title, and hiredate are repeated in the cartesian product and have to be managed via logic. The problem is magnified when more than one id is requested, because the ordering is not guaranteed unless ORDER BY is specified in the query, which impacts performance. This took real time and money! All previous consequences (esp. code/schema release coupling) are still present, and save vs. fetch logic is clearly starting to become asymmetric. The extra work is only starting to begin, because in a few days…
The Zombies emerge. They're prevalent in pop culture… and in traditional RDBMS programming. Between day 3 and day 5 we loaded up some test data into the DB. This exposes a common challenge in SQL/RDBMS: as the data model grows, and particularly as "always set / sometimes set" dynamics come into play, one must increasingly be very careful about query construction. Good thing we have a DAL, though: imagine if a bunch of those older queries escaped into the application space!
No drama. We simply saved the list of phone numbers (or none). No zombies.
anotherMapOfData might even come from a different place altogether…
But likely: because we have lived through at least 1 schema upgrade, we are gun-shy about list-y data, or we are under pressure, so we create a semicolon-delimited string of app names and store it as a single string. The otherMapOfData may or may not be all the fields. We store what we deem to be appropriate, even if that structure changes over time.
The tail does not wag the dog. Often, the time, effort, and coordination involved in proper modeling in the RDBMS world incentivizes developers to take shortcuts.
So we have saved and fetched a single item. What about real queries? This is where the mongoDB programming model starts to really shine. Number 1: operators; no grammar. For simple queries, it is slightly more "involved" than SQL, but how many users type raw SQL into a screen for execution? Do you really want to do that? For complex queries, it ends up being no more difficult. The same way you build and manipulate data can be applied to manipulating queries. And while we're at it, it is the same paradigm for consuming responses from the server, both data (in a cursor) and diagnostic and other operations results. Results can be processed, logged, visualized, formatted, etc. the same way for all operations, without parsing or losing fidelity.
We often start with the CLI to show how things are done. But here, we show the actual map-of-map setup in code. Below, we've generalized fetch() into fetchGeneral(), and instead of taking a String id, we now take a Map of expressions. This is the REALLY general form of fetch; more specialized versions might take Map fragments or scalar values which are inserted into pre-defined map-of-map structures. You don't have to worry about "parsable syntax". It is operators and operands that cooperate in a very strongly but easily defined way.
Recall on slide 22 we said the data and query paradigm is the same; note that myData and expr are the same! The same tools, tricks, and techniques can be applied to both. Very powerful but compact and clear scripts can easily be written that leverage investments made in modules for that particular language.
Polymorphism: a field can have different types (object shapes) from doc to doc within the same collection, e.g. K8 and K9. It is very easy to craft a system where software relies on a few "well-known" fields like name and id in this example to manage information in the large, but still saves and extracts custom data with high fidelity for the parts of the software stack that understand it, WITHOUT the persistor getting in the way.
You'll have to recode the split-up in Python unless you use Jython. But even then, there's no easy solution for polymorphic data (you'll have to develop your own rich data store/query filter/fetch subsystem).
In short, there was no malloc() in the old days. mongoDB takes advantage of the higher type fidelity of today's popular and powerful languages.
Traditional RDBMS CLIs (psql, isql) are interpreters for that particular flavor of SQL plus some extra commands. SQL is not a general-purpose procedural language. The mongoDB shell, however, can be viewed this way: it is a javascript interpreter that happens to load up some mongoDB interface libraries. Party trick: no DDL or special setup needed! Results can be stored back immediately into the DB! All of this (rich shapes of data, dynamic use of types, and symmetry of operation semantics) leads to faster, easier development.