asya999
AGGREGATION
How MongoDB 4.2 Pipeline is Powering Queries, Updates and Views
PIPELINE POWER+++
PREVIOUSLY ...
... 2017
#MDBW17
Analytics with MongoDB Aggregation Framework
@asya999 by Asya Kamsky,
Lead MongoDB Maven
PIPELINE POWER
STORE
RETRIEVE
ps ax | grep mongod | head -1
*nix command line pipe
PIPELINE
$match | $group | $sort
Input stream {} {} {} {} Result {} {} ...
PIPELINE
MongoDB document pipeline
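As a sketch of the analogy, here is how the shell pipe above might look as an aggregate() call, with each stage handing its documents to the next (the 'processes' collection and its fields are hypothetical):
// roughly: ps ax | grep mongod | head -1
db.processes.aggregate([
  { $match: { command: /mongod/ } },   // grep mongod
  { $limit: 1 }                        // head -1
])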
DATA PIPELINE
STAGES
Stage 1 Stage 2 Stage 3 Stage 4
{} {} {} {}
{} {} {} {}
DATA PIPELINE
{} {} {} {}
{"$stage":{ ... }}
START
Collection
View
Special stage
STAGES
{title: "The Great Gatsby",
language: "English",
subjects: "Long Island"}
{title: "The Great Gatsby",
language: "English",
subjects: "New York"}
{title: "The Great Gatsby",
language: "English",
subjects: "1920s"}
{title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{"$match":{"language":"English"}}
$match
{ _id:"Long Island",
count: 1 },
$group
{ _id: "New York",
count: 2 },
$unwind
{ _id: "1920s",
count: 1 },
$sort $skip $limit $project
{"$unwind":"$subjects"}
{"$group":{"_id":"$subjects", "count":{"$sum:1}}
{ _id: "Harlem",
count: 1 },
{ _id: "Long Island",
count: 1 },
{ _id: "New York",
count: 2 },
{ _id: "1920s",
count: 1 },
{title: "Open City",
language: "English",
subjects: [
"New York"
"Harlem" ] }
{ title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{ title: "War and Peace",
language: "Russian",
subjects: [
"Russia",
"War of 1812",
"Napoleon"] },
{ title: "Open City",
language: "English",
subjects: [
"New York",
"Harlem" ] },
{title: "Open City",
language: "English",
subjects: "New York"}
{title: "Open City",
language: "English",
subjects: "Harlem"}
{ _id: "Harlem",
count: 1 },
{"$sort:{"count":-1} {"$limit":3}
{"$project":...}
$cursor
{title: "The Great Gatsby",
language: "English".
subjects: "Long Island"}
{title: "The Great Gatsby",
language: "English",
subjects: "New York"}
{title: "The Great Gatsby",
language: "English",
subjects: "1920s"}
{title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{"$match":{"language":"English"}}
$match
{ _id:"Long Island",
count: 1 },
$group
{ _id: "New York",
count: 2 },
$unwind
{ _id: "1920s",
count: 1 },
$sort $skip $limit $project
{"$unwind":"$subjects"}
{"$group":{"_id":"$subjects", "count":{"$sum:1}}
{ _id: "Harlem",
count: 1 },
{ _id: "Long Island",
count: 1 },
{ _id: "New York",
count: 2 },
{ _id: "1920s",
count: 1 },
{title: "Open City",
language: "English",
subjects: [
"New York"
"Harlem" ] }
{ title: "The Great Gatsby",
language: "English",
subjects: [
"Long Island",
"New York",
"1920s"] },
{ title: "War and Peace",
language: "Russian",
subjects: [
"Russia",
"War of 1812",
"Napoleon"] },
{ title: "Open City",
language: "English",
subjects: [
"New York",
"Harlem" ] },
{title: "Open City",
language: "English",
subjects: "New York"}
{title: "Open City",
language: "English",
subjects: "Harlem"}
{ _id: "Harlem",
count: 1 },
{"$sort:{"count":-1} {"$limit":3}
{"$project":...}
$group $sort
1
INPUT → STAGE → STAGE → RESULTS
STREAMING RESOURCE USE
Each document is streamed through in RAM
INPUT → STAGE → STAGE → RESULTS
BLOCKING RESOURCE USE
Everything has to be kept in RAM (or spill)
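Blocking stages such as $sort and $group are subject to the pipeline memory limit; a minimal sketch (collection name hypothetical) of opting into spilling to disk:
db.books.aggregate(
  [ { $sort: { title: 1 } },
    { $group: { _id: "$language", n: { $sum: 1 } } } ],
  { allowDiskUse: true }   // allow blocking stages to spill to temporary files instead of failing
)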
$sort $match $group
start=ISODate("...")
end=ISODate("...")
{
user: "303900",
ipaddr: "71.56.112.56",
ts:ISODate("2017-05-08T...")
}
{$match:{ts:{$gte:start,$lt:end}}},
{$sort:{ts:1}},
{$group:{_id:"$user",ips:{$push:{ip:"$ipaddr", ts:"$ts"}},
diffIps:{$addToSet:"$ipaddr"}}},
{$match:{"diffIps.1":{$exists:true}}},
{$addFields:{diffs: {$filter:{
input:{$map:{
input: {$range:[0,{$subtract:[{$size:"$ips"},1]}]}, as:"i",
in:{$let:{vars:{ip1:{$arrayElemAt:["$ips","$$i"]},
ip2:{$arrayElemAt:["$ips",{$add:["$$i",1]}]}},
in:{
diff:{$cond:{
if:{$ne:["$$ip1.ip","$$ip2.ip"]},
then:{$divide:[{$subtract:["$$ip2.ts","$$ip1.ts"]},60000]},
else: 9999 }},
ip1:"$$ip1.ip", t1:"$$ip1.ts",
ip2:"$$ip2.ip", t2:"$$ip2.ts"
}}}}},
cond:{$lt:["$$this.diff",10]}
}}}},
{$match:{"diffs":{$ne:[]}}},
{$project:{_id:0, user:"$_id", suspectLogins:"$diffs"}}
{ "user" : "35237073",
"suspectLogins" : [
{"diff": 4.8333333333,
"ip1": "106.220.151.16",
"t1":"2017-05-08T06:58",
"ip2": "223.182.113.15"
"t2":"2017-05-08T07:03"
},
{"diff": 8.3,
"ip1": "223.182.113.15",
"t1":"2017-05-08T07:03",
"ip2": "49.206.217.26",
"t2":"2017-05-08T07:11"
}
]
} $match $addFields $match $project
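The stages above are shown on their own; a hedged sketch of how they might be invoked (the 'logins' collection name is hypothetical):
var start = ISODate("2017-05-08T00:00:00Z");
var end   = ISODate("2017-05-09T00:00:00Z");
db.logins.aggregate([
  { $match: { ts: { $gte: start, $lt: end } } },
  { $sort: { ts: 1 } },
  // ... the $group, $match, $addFields, $match, $project stages shown above ...
], { allowDiskUse: true })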
https://github.com/asya999/mdbw17
5 minute review
https://github.com/asya999/mdbw17
#MDBW17
Analytics with MongoDB Aggregation Framework
@asya999 by Asya Kamsky,
Lead MongoDB Maven
PIPELINE POWER
https://github.com/asya999/mdbw17
PREVIOUSLY ...
... 2017
PREVIOUSLY ...
... 2017 ...
PREVIOUSLY ...
... 2017 ... 2018
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
THE FUTURE OF AGGREGATION
More options for output
Unify different languages
THE PRESENT OF AGGREGATION
More options for output
Unify different languages
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
Unify Different Languages
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
Unify Different Languages
db.c.aggregate([
{$addFields:{
numChildren:{$size:"$children"},
numDependents:{$size:{
$filter:{
input:"$children.dep",
cond: "$$this"
}
}}
}},
...
])
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
Unify Different Languages
db.c.aggregate([
{$addFields:{
numChildren:{$size:"$children"},
numDependents:{$size:{
$filter:{
input:"$children.dep",
cond: "$$this"
}
}}
}},
...
])
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
Unify Different Languages
db.c.find (
{$expr:{
$lt:[
{$size:{$filter:{
input: "$children.dep",
cond: "$$this"
}}},
2
]
}}
)
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
UPDATE
Unify Different Languages
db.c.find (
{$expr:{
$lt:[
{$size:{$filter:{
input: "$children.dep",
cond: "$$this"
}}},
2
]
}}
)
{children: [
{name:"Max", dob:"1994-12-01", dep:true},
{name:"Sam", dob:"1997-09-28", dep:true},
{name:"Kim", dob:"2000-02-29", dep:true}
]}
AGGREGATION
FIND
UPDATE
Unify Different Languages
db.c.update(
{$expr:{
$anyElementTrue:{$map:{
input:"$children",
in: {$and:[
{$lt:["$$this.dob","1997-01-22"]},
"$$this.dep"
]}
}}
}},
{$set:{ audit:true }}
)
db.coll.update(
<query>,
<update>,
<options>
)
Update
db.coll.update(
<query>,
<update>,
<options>
)
Update
db.coll.update(
<query>,
<update>,
<options>
)
<update>
Update
<update>
{
$set: { },
$inc: { },
$...
}
Update
{
f1: <value>,
f2: <value>,
...
}
<update>
{ } OR [ ]
Update in 4.2
[ ]
<update>
[ <aggregation-pipeline> ]
Update in 4.2
[ ]
Updates Using
Aggregation Pipeline
{ $addFields: { } }
{ $project: { } }
{ $replaceRoot: { } }
{ $set: { } }
{ $unset: [ ] }
{ $replaceWith: { } }
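Only these six stages are allowed in an update pipeline. For example, $unset, which does not appear in the later examples (field names hypothetical):
// drop two fields from one document using a pipeline-style update (MongoDB 4.2+)
db.coll.update(
  { _id: 1 },
  [ { $unset: [ "tempField", "legacyFlag" ] } ]
)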
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
{$inc:{a:1}},
{upsert:true})
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
"errmsg" : "Cannot apply
to a value of non-numeric type."
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$sum:["$a",1]}}} ],
{upsert:true})
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
{ _id: 1, a: 1 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$add:["$a",1]}}} ],
{upsert:true})
{ _id: 1, a: 1 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id: 1, a: 1 }
"errmsg" : "$add only supports
numeric or date types, not string"
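The difference: $sum silently ignores non-numeric values, while $add raises an error. A small sketch (the 'demo' collection is hypothetical) showing the same expressions in an aggregation:
db.demo.insertOne({ _id: 1, a: "10" })
db.demo.aggregate([ { $project: { viaSum: { $sum: ["$a", 1] } } } ])
// returns { _id: 1, viaSum: 1 }   -- the string "10" is ignored
// adding viaAdd: { $add: ["$a", 1] } to the $project would fail with the error above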
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[
],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$cond:{
if:
then: , else: }} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: , else: }} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 101 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$min:[ 100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }}]} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$min:[ 100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }}]} }}],
{upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$min:[ 100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }}]},
prev_a: "$a"}}], {upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11 }
{ _id: 1, a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1 }
{ _id: 1 }
{ _id: 1, a: 10 }
{ _id: 1, a: 100 }
---
{ _id: 1, a: "10" }
db.coll.update({_id:1},
[ {$set:{a:{$min:[ 100, {$cond:{
if: {$eq:[{$type:"$a"},"missing"]},
then: 21, else: {$sum:["$a", 1]} }}]},
prev_a: "$a"}}], {upsert:true})
{ _id:1, a: 21 }
{ _id: 1, a: 11, prev_a: 10 }
{ _id: 1, a: 100, prev_a: 100 }
{ _id:1, a: 21 }
{ _id:1, a: 1, prev_a: "10" }
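Note that "$a" on the right-hand side reads the document's pre-update value, something classic update operators cannot express; in a non-pipeline update the same expression is taken literally (a contrast sketch):
// classic update: the string "$a" is stored verbatim, not evaluated
db.coll.update({ _id: 1 }, { $set: { prev_a: "$a" } })
// result: { _id: 1, a: ..., prev_a: "$a" }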
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{
}}
], {multi:true})
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
]}}
], {multi:true})
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
{ a:0, b:0, c:"unset" },
"$$ROOT"
]}}
], {multi:true})
Set Defaults
{_id: 1, a: 5, b: 12}
{_id: 2, a: 15, c: "abc"}
{_id: 3, b: 99, c: "xyz"}
If a or b are missing, set to 0, if c is missing -> "unset"
db.coll.update({}, [
{$replaceWith:{$mergeObjects:[
{ a:0, b:0, c:"unset" },
"$$ROOT"
]}}
], {multi:true})
Set Defaults
{_id: 1, a: 5, b: 12, c: "unset"}
{_id: 2, a: 15, b: 0, c: "abc"}
{_id: 3, a: 0, b: 99, c: "xyz"}
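Order matters in $mergeObjects: later documents override earlier ones, so listing "$$ROOT" last keeps every existing value and only fills in what is missing. For contrast, a sketch of the reversed (wrong) order:
// reversed order: the defaults win, overwriting any existing a, b, c values
{$replaceWith:{$mergeObjects:[
  "$$ROOT",
  { a:0, b:0, c:"unset" }
]}}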
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y},
[ {$set:{h:{$cond:{
if:
then:
else:
}}}}],
{upsert:true})
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y},
[ {$set:{h:{$cond:{
if:
then:
else:
}}}}],
{upsert:true})
{ id: 1,
d: ISODate("2019-06-04T00:00:00"),
h: [
{ hour:"11", value: 296 },
{ hour:"12", value: 300 }
]}
id: X, d:Y, hour:Z, value: VAL
db.coll.update({id:X, d:Y},
[ {$set:{h:{$cond:{
if: {$in:[Z,{$ifNull:["$h.hour",[]]}]},
then:{$map:{
input:"$h",
in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this",
else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}}
}}
}},
else:{$concatArrays:[{$ifNull:["$h",[]]},[{hour:Z,value:VAL}]]}
}}}}],
{upsert:true})
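A usage sketch, wrapping the upsert above in a helper so the placeholders X, Y, Z, VAL are explicit (the function name and sample values are hypothetical):
function recordHourlyValue(X, Y, Z, VAL) {
  db.coll.update({id:X, d:Y},
    [ {$set:{h:{$cond:{
        if: {$in:[Z,{$ifNull:["$h.hour",[]]}]},
        then:{$map:{
          input:"$h",
          in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this",
               else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}}
          }}
        }},
        else:{$concatArrays:[{$ifNull:["$h",[]]},[{hour:Z,value:VAL}]]}
    }}}}],
    {upsert:true});
}
// adds { hour:"13", value: 57 } to the 2019-06-04 bucket (creating the document if absent)
recordHourlyValue(1, ISODate("2019-06-04T00:00:00"), "13", 57);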
Recap:
Updates can be specified with aggregation pipeline
All fields from existing document can be accessed
Slightly slower, but a lot more powerful
THE FUTURE OF AGGREGATION
Better performance & optimizations
More stages & expressions
More options for output
Compass helper for aggregate
Unify different languages
THE FUTURE OF AGGREGATION
More options for output
More Options for Output
Prior to MongoDB 4.2
$out
coll
new_coll
$out
Prior to MongoDB 4.2
$out
coll
new_coll
$out
db.coll.aggregate( [ {pipeline}, ...
{$out: "new_coll"} ]);
Prior to MongoDB 4.2
$out
coll
new_coll
$out
db.coll.aggregate( [ {pipeline}, ...
{$out: "new_coll"} ]);
new_coll
○ must be unsharded
○ overwrites existing
New $merge stage
in aggregation pipeline
MongoDB 4.2
$merge
coll
coll2
$merge
MongoDB 4.2
$merge
db.coll.aggregate( [
{pipeline}, ...,
{$merge: { ... }}
]);
coll
coll2
$merge
MongoDB 4.2
$merge
db.coll.aggregate( [
{pipeline}, ...,
{$merge: { ... }}
]);
coll2
can exist
same or different 'db'
can be sharded
coll
coll2
$merge
coll
coll2
$merge
{ } { } { } { }
{ } { } { } { }
MongoDB 4.2
$merge syntax
{
$merge: {
into: <target>
}
}
$merge syntax
{
$merge: {
into: <target>
}
}
{$merge: "collection2"}
$merge syntax
{
$merge: {
into: <target>
}
}
{$merge: {into: {db: "db2", coll: "collection2"}}}
$merge syntax
{
$merge: {
into: <target>
}
}
$merge syntax
{
$merge: {
into: <target>,
on: <fields>
}
}
on: "_id"
on: [ "_id", "shardkey(s)" ]
must be unique
$merge syntax
{
$merge: {
into: <target>,
on: <fields>
}
}
Actions
source target
Actions
nothing matched:
source target
Actions
nothing matched: usually insert
source target
Actions
nothing matched: usually insert
document matched:
source target
Actions
nothing matched: usually insert
document matched: overwrite? update? ???
source target
Actions
nothing matched: usually insert
document matched: update
source target
Actions
nothing matched: usually insert
document matched: update (merge)
source target
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:
whenMatched:
}
}
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert",
whenMatched:
}
}
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert",
whenMatched:"merge"
}
}
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert"|"discard"|"fail",
whenMatched:"merge"
}
}
$merge syntax
{
$merge: {
into: <target>,
whenNotMatched:"insert"|"discard"|"fail",
whenMatched:"merge"|"replace"|"keepExisting"|"fail"|[...]
}
}
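For example, insert-only semantics that never touch documents already in the target (the 'cache' collection name is hypothetical):
{$merge: {into: "cache", whenNotMatched: "insert", whenMatched: "keepExisting"}}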
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[...]
}
}
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[<custom pipeline>]
}
}
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$addFields:{
}}
]
}
}
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$addFields:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
}
$merge example
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
_id: "37",
total: 309,
f1: "yyy"
}
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
}
$merge example 2
{
$merge: {
into: <target>,
whenMatched:[
{$replaceWith:{$mergeObjects:[
"$$new",
{total:{$sum:["$$new.total", "$total"]}}
]}}
]
}
}
Incoming Target
{
_id: "37",
total: 64,
f1: "x"
}
{
_id: "37",
total: 245,
f1: "yyy"
}
Result:
{
_id: "37",
total: 309,
f1: "x"
}
$merge syntax
{
$merge: {
into: <target>,
whenMatched:[...]
}
}
$merge syntax
{
$merge: {
into: <target>,
let: { ... },
whenMatched:[ ...]
}
}
$merge syntax
{
$merge: {
into: <target>,
let: {new: "$$ROOT"},
whenMatched:[ ...]
}
}
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
{
$merge: {
into: <target>,
whenMatched:[
{$set:{
total:{$sum:["$total","$$new.total"]}
}}
]
}
}
{
$merge: {
into: <target>,
let: {itotal: "$total"},
whenMatched:[
{$set:{
total:{$sum:["$total","$$itotal"]}
}}
]
}
}
EXAMPLES
APPEND from TEMP collection
temp → data
Using $merge to append loaded and
cleansed records into 'data'
aggregate 'temp' and append valid records to 'data'
db.temp.aggregate( [
{ ... } /* pipeline to massage and cleanse data in temp */,
{$merge:{
into: "data",
whenMatched: "fail"
}}
]);
aggregate 'temp' and append valid records to 'data'
db.temp.aggregate( [
{ ... } /* pipeline to massage and cleanse data in temp */,
{$merge:{
into: "data",
whenMatched: "fail"
}}
]);
Similar to SQL's INSERT INTO T1 SELECT * from T2
EXAMPLES
Maintain Single View
mflix.users + mfriendbook.users → sv.users
Using $merge to populate/update
user fields from other services
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy"
}
$merge updates fields from the mflix.users collection into the sv.users
collection; the "_id" field is the unique username
mflix_pipeline = [
{ "$project" : {
"_id" : "$username",
"mflix" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mflix)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy"
}
$merge updates fields from the mflix.users collection into the sv.users
collection; the "_id" field is the unique username
mflix_pipeline = [
{ "$project" : {
"_id" : "$username",
"mflix" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mflix) db.users.aggregate(mflix_pipeline)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... }
}
$merge updates fields from the mfriendbook.users collection into the sv.users
collection; the "_id" field is the unique username
mfriendbook_pipeline = [
{ "$project" : {
"_id" : "$username",
"mfriendbook" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mfriendbook)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... }
}
$merge updates fields from the mfriendbook.users collection into the sv.users
collection; the "_id" field is the unique username
mfriendbook_pipeline = [
{ "$project" : {
"_id" : "$username",
"mfriendbook" : "$$ROOT"
}},
{ "$merge" : {
"into" : {
"db": "sv",
"collection" : "users"
},
"whenNotMatched" : "discard"
}}
]
(in mfriendbook) db.users.aggregate(mfriendbook_pipeline)
sv.users
{
_id: "user253",
dob: ISODate(...),
f1: "yyy",
mflix: { ... },
mfriendbook: { ... }
}
EXAMPLES
Populate ROLLUPS into summary table
registrations → regsummary
Using $merge to incrementally
update periodic rollups in summary
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
db.registrations.aggregate([
{$match: {event_id: "MDBW19"}},
{$group:{
_id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}},
count: {$sum:1}
}},
{$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
$merge to create/update periodic
rollups in summary collection (for all days)
db.regsummary.createIndex({event:1, date:1}, {unique: true});
db.registrations.aggregate([
{$match: {event_id: "MDBW19"}},
{$group:{
_id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}},
count: {$sum:1}
}},
{$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
{ "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 }
{ "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 }
{ "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
$merge to incrementally update periodic rollups in summary
collection (for single day)
$merge to incrementally update periodic rollups in summary
collection (for single day)
db.registrations.aggregate([
{$match: {
event_id: "MDBW19",
date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")}
}},
{$count: "total"},
{$addFields: {event:"MDBW19", "date":"2019-05-22"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
$merge to incrementally update periodic rollups in summary
collection (for single day)
db.registrations.aggregate([
{$match: {
event_id: "MDBW19",
date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")}
}},
{$count: "total"},
{$addFields: {event:"MDBW19", "date":"2019-05-22"}},
{$merge: {
into: "regsummary",
on: ["event", "date"]
}}
])
{ "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 }
{ "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 }
{ "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
{ "event" : "MDBW19", "date" : "2019-05-22", "total" : 34 }
https://github.com/asya999/mdbw17
Asya Kamsky
MongoDB World 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pipeline Empowers Queries, Updates, and Materialized Views [MongoDB]

Más contenido relacionado

La actualidad más candente

Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDB
Kishor Parkhe
 
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
MongoSF
 
mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecture
zefhemel
 
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
MongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB
 

La actualidad más candente (20)

Aggregation in MongoDB
Aggregation in MongoDBAggregation in MongoDB
Aggregation in MongoDB
 
How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6How to leverage what's new in MongoDB 3.6
How to leverage what's new in MongoDB 3.6
 
Powerful Analysis with the Aggregation Pipeline
Powerful Analysis with the Aggregation PipelinePowerful Analysis with the Aggregation Pipeline
Powerful Analysis with the Aggregation Pipeline
 
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
Map/reduce, geospatial indexing, and other cool features (Kristina Chodorow)
 
Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2Agg framework selectgroup feb2015 v2
Agg framework selectgroup feb2015 v2
 
Embedding a language into string interpolator
Embedding a language into string interpolatorEmbedding a language into string interpolator
Embedding a language into string interpolator
 
Mongodb Aggregation Pipeline
Mongodb Aggregation PipelineMongodb Aggregation Pipeline
Mongodb Aggregation Pipeline
 
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right WayMongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for Newbies
 
mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecture
 
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
[MongoDB.local Bengaluru 2018] Tutorial: Pipeline Power - Doing More with Mon...
 
ETL for Pros: Getting Data Into MongoDB
ETL for Pros: Getting Data Into MongoDBETL for Pros: Getting Data Into MongoDB
ETL for Pros: Getting Data Into MongoDB
 
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
MongoDB Analytics: Learn Aggregation by Example - Exploratory Analytics and V...
 
Mobile Web 5.0
Mobile Web 5.0Mobile Web 5.0
Mobile Web 5.0
 
CouchDB @ red dirt ruby conference
CouchDB @ red dirt ruby conferenceCouchDB @ red dirt ruby conference
CouchDB @ red dirt ruby conference
 
PuppetCamp SEA @ Blk 71 - Nagios in under 10 mins with Puppet
PuppetCamp SEA @ Blk 71 -  Nagios in under 10 mins with PuppetPuppetCamp SEA @ Blk 71 -  Nagios in under 10 mins with Puppet
PuppetCamp SEA @ Blk 71 - Nagios in under 10 mins with Puppet
 
PuppetCamp SEA @ Blk 71 - Nagios in under 10 mins with Puppet
PuppetCamp SEA @ Blk 71 -  Nagios in under 10 mins with PuppetPuppetCamp SEA @ Blk 71 -  Nagios in under 10 mins with Puppet
PuppetCamp SEA @ Blk 71 - Nagios in under 10 mins with Puppet
 
PostgreSQLからMongoDBへ
PostgreSQLからMongoDBへPostgreSQLからMongoDBへ
PostgreSQLからMongoDBへ
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
DB2 Native XML
DB2 Native XMLDB2 Native XML
DB2 Native XML
 

Similar a MongoDB World 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pipeline Empowers Queries, Updates, and Materialized Views [MongoDB]

Aggregation Framework
Aggregation FrameworkAggregation Framework
Aggregation Framework
MongoDB
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
MongoDB
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
Tyler Brock
 

Similar a MongoDB World 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pipeline Empowers Queries, Updates, and Materialized Views [MongoDB] (20)

MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
 
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
 
Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]
Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]
Aggregation Pipeline Power++: MongoDB 4.2 파이프 라인 쿼리, 업데이트 및 구체화된 뷰 소개 [MongoDB]
 
Aggregation Framework
Aggregation FrameworkAggregation Framework
Aggregation Framework
 
The Aggregation Framework
The Aggregation FrameworkThe Aggregation Framework
The Aggregation Framework
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
Modern Application Foundations: Underscore and Twitter Bootstrap
Modern Application Foundations: Underscore and Twitter BootstrapModern Application Foundations: Underscore and Twitter Bootstrap
Modern Application Foundations: Underscore and Twitter Bootstrap
 
Herding types with Scala macros
Herding types with Scala macrosHerding types with Scala macros
Herding types with Scala macros
 
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
MongoDB World 2019: Exploring your MongoDB Data with Pirates (R) and Snakes (...
 
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
 
From java to kotlin beyond alt+shift+cmd+k - Droidcon italy
From java to kotlin beyond alt+shift+cmd+k - Droidcon italyFrom java to kotlin beyond alt+shift+cmd+k - Droidcon italy
From java to kotlin beyond alt+shift+cmd+k - Droidcon italy
 
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
Max Neunhöffer – Joins and aggregations in a distributed NoSQL DB - NoSQL mat...
 
Webinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation OptionsWebinar: Data Processing and Aggregation Options
Webinar: Data Processing and Aggregation Options
 
Couchbase presentation - by Patrick Heneise
Couchbase presentation - by Patrick HeneiseCouchbase presentation - by Patrick Heneise
Couchbase presentation - by Patrick Heneise
 
Query for json databases
Query for json databasesQuery for json databases
Query for json databases
 
Swift - 혼자 공부하면 분명히 안할테니까 같이 공부하기
Swift - 혼자 공부하면 분명히 안할테니까 같이 공부하기Swift - 혼자 공부하면 분명히 안할테니까 같이 공부하기
Swift - 혼자 공부하면 분명히 안할테니까 같이 공부하기
 
MongoDB
MongoDB MongoDB
MongoDB
 
Wx::Perl::Smart
Wx::Perl::SmartWx::Perl::Smart
Wx::Perl::Smart
 
2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade
 

Más de MongoDB

Más de MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDBMongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

MongoDB World 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pipeline Empowers Queries, Updates, and Materialized Views [MongoDB]

  • 1. asya999 AGGREGATION How MongoDB 4.2 Pipeline is Powering Queries, Updates and Views PIPELINE POWER+++
  • 2. asya999 AGGREGATION How MongoDB 4.2 Pipeline is Powering Queries, Updates and Views PIPELINE POWER+++
  • 4. #MDBW17 Analytics with MongoDB Aggregation Framework @asya999 by Asya Kamsky, Lead MongoDB Maven PIPELINE POWER
  • 6. ps ax | grep mongod | head 1 *nix command line pipe PIPELINE
  • 7. $match $group | $sort| Input stream {} {} {} {} Result {} {} ... PIPELINE MongoDB document pipeline
  • 9. Stage 1 Stage 2 Stage 3 Stage 4 {} {} {} {} {} {} {} {} DATA PIPELINE {} {} {} {} {"$stage":{ ... }} START Collection View Special stage STAGES
  • 10. {title: "The Great Gatsby", language: "English", subjects: "Long Island"} {title: "The Great Gatsby", language: "English", subjects: "New York"} {title: "The Great Gatsby", language: "English", subjects: "1920s"} {title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, {"$match":{"language":"English"}} $match { _id:"Long Island", count: 1 }, $group { _id: "New York", count: 2 }, $unwind { _id: "1920s", count: 1 }, $sort $skip$limit $project {"$unwind":"$subjects"} {"$group":{"_id":"$subjects", "count":{"$sum:1}} { _id: "Harlem", count: 1 }, { _id: "Long Island", count: 1 }, { _id: "New York", count: 2 }, { _id: "1920s", count: 1 }, {title: "Open City", language: "English", subjects: [ "New York" "Harlem" ] } { title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, { title: "War and Peace", language: "Russian", subjects: [ "Russia", "War of 1812", "Napoleon"] }, { title: "Open City", language: "English", subjects: [ "New York", "Harlem" ] }, {title: "Open City", language: "English", subjects: "New York"} {title: "Open City", language: "English", subjects: "Harlem"} { _id: "Harlem", count: 1 }, {"$sort:{"count":-1} {"$limit":3} {"$project":...}
  • 11.
  • 13. {title: "The Great Gatsby", language: "English". subjects: "Long Island"} {title: "The Great Gatsby", language: "English", subjects: "New York"} {title: "The Great Gatsby", language: "English", subjects: "1920s"} {title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, {"$match":{"language":"English"}} $match { _id:"Long Island", count: 1 }, $group { _id: "New York", count: 2 }, $unwind { _id: "1920s", count: 1 }, $sort $skip$limit $project {"$unwind":"$subjects"} {"$group":{"_id":"$subjects", "count":{"$sum:1}} { _id: "Harlem", count: 1 }, { _id: "Long Island", count: 1 }, { _id: "New York", count: 2 }, { _id: "1920s", count: 1 }, {title: "Open City", language: "English", subjects: [ "New York" "Harlem" ] } { title: "The Great Gatsby", language: "English", subjects: [ "Long Island", "New York", "1920s"] }, { title: "War and Peace", language: "Russian", subjects: [ "Russia", "War of 1812", "Napoleon"] }, { title: "Open City", language: "English", subjects: [ "New York", "Harlem" ] }, {title: "Open City", language: "English", subjects: "New York"} {title: "Open City", language: "English", subjects: "Harlem"} { _id: "Harlem", count: 1 }, {"$sort:{"count":-1} {"$limit":3} {"$project":...} $group $sort 1
  • 14. INPUT STAGE RESULTSSTAGE STREAMING RESOURCE USE Each document is streamed through in RAM
  • 15. INPUT STAGE RESULTSSTAGE BLOCKING RESOURCE USE Everything has to be kept in RAM (or spill)
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21. $sort$match $group start=ISODate("...") end=ISODate("...") { user: "303900", ipaddr: "71.56.112.56", ts:ISODate("2017-05-08T...") } {$match:{ts:{$gte:start,$lt:end}}}, {$sort:{ts:1}}, {$group:{_id:"$user",ips:{$push:{ip:"$ipaddr", ts:"$ts"}}, diffIps:{$addToSet:"$ipaddr"}}}, {$match:{"diffIps.1":{$exists:true}}}, {$addFields:{diffs: {$filter:{ input:{$map:{ input: {$range:[0,{$subtract:[{$size:"$ips"},1]}]}, as:"i", in:{$let:{vars:{ip1:{$arrayElemAt:["$ips","$$i"]}, ip2:{$arrayElemAt:["$ips",{$add:["$$i",1]}]}}, in:{ diff:{$cond:{ if:{$ne:["$$ip1.ip","$$ip2.ip"]}, then:{$divide:[{$subtract:["$$ip2.ts","$$ip1.ts"]},60000]}, else: 9999 }}, ip1:"$$ip1.ip", t1:"$$ip1.ts", ip2:"$$ip2.ip", t2:"$$ip2.ts" }}}}}, cond:{$lt:["$$this.diff",10]} }}}}, {$match:{"diffs":{$ne:[]}}}, {$project:{_id:0, user:"$_id", suspectLogins:"$diffs"}} { "user" : "35237073", "suspectLogins" : [ {"diff": 4.8333333333, "ip1": "106.220.151.16", "t1":"2017-05-08T06:58", "ip2": "223.182.113.15" "t2":"2017-05-08T07:03" }, {"diff": 8.3, "ip1": "223.182.113.15", "t1":"2017-05-08T07:03", "ip2": "49.206.217.26", "t2":"2017-05-08T07:11" } ] } $match $addFields $match $project
  • 24. #MDBW17 Analytics with MongoDB Aggregation Framework @asya999 by Asya Kamsky, Lead MongoDB Maven PIPELINE POWER https://github.com/asya999/mdbw17
  • 28. THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 29. THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 30. THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 31. THE FUTURE OF AGGREGATION More options for output Unify different languages
  • 32. More options for output Unify different languages THE PRESENT OF AGGREGATION
  • 34. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION Unify Different Languages
  • 35. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION Unify Different Languages db.c.aggregate([ {$addFields:{ numChildren:{$size:"$children"}, numDependents:{$size:{ $filter:{ input:"$children.dep", cond: "$$this" } }} }}, ... ])
  • 36. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND Unify Different Languages db.c.aggregate([ {$addFields:{ numChildren:{$size:"$children"}, numDependents:{$size:{ $filter:{ input:"$children.dep", cond: "$$this" } }} }}, ... ])
  • 37. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND Unify Different Languages db.c.find ( {$expr:{ $lt:[ {$size:{$filter:{ input: "$children.dep", cond: "$$this" }}}, 2 ] }} )
  • 38. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND UPDATE Unify Different Languages db.c.find ( {$expr:{ $lt:[ {$size:{$filter:{ input: "$children.dep", cond: "$$this" }}}, 2 ] }} )
  • 39. {children: [ {name:"Max", dob:"1994-12-01", dep:true}, {name:"Sam", dob:"1997-09-28", dep:true}, {name:"Kim", dob:"2000-02-29", dep:true} ]} AGGREGATION FIND UPDATE Unify Different Languages db.c.update( {$expr:{ $anyElementTrue:{$map:{ input:"$children", in: {$and:[ {$lt:["$$this.dob","1997-01-22"]}, "$$this.dep" ]} }} }}, {$set:{ audit:true }} )
  • 43. <update> { $set: { }, $inc: { }, $... } Update { f1: <value>, f2: <value>, ... }
  • 44. <update> { } OR [ ] Update in 4.2 [ ]
  • 47. { $addFields: { } } { $project: { } } { $replaceRoot: { } } { $set: { } } { $unset: [ ] } { $replaceWith: { } }
  • 48. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, {$inc:{a:1}}, {upsert:true}) { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } "errmsg" : "Cannot apply to a value of non-numeric type."
  • 49. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$sum:["$a",1]}}} ], {upsert:true}) { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } { _id: 1, a: 1 }
  • 50. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$add:["$a",1]}}} ], {upsert:true}) { _id: 1, a: 1 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id: 1, a: 1 } "errmsg" : "$add only supports numeric or date types, not string"
  • 51. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ ], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 }
  • 52. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: then: , else: }} }}] {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 }
  • 53. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: , else: }} }}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 }
  • 54. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }} }}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 101 } { _id:1, a: 21 }
  • 55. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }} }}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 }
  • 56. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$min:[ 100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }}]} }}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 }
  • 57. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$min:[ 100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }}]} }}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } { _id:1, a: 1 }
  • 58. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$min:[ 100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }}]}, prev_a: "$a"}}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11 } { _id: 1, a: 100 } { _id:1, a: 21 } { _id:1, a: 1 }
  • 59. { _id: 1 } { _id: 1, a: 10 } { _id: 1, a: 100 } --- { _id: 1, a: "10" } db.coll.update({_id:1}, [ {$set:{a:{$min:[ 100, {$cond:{ if: {$eq:[{$type:"$a"},"missing"]}, then: 21, else: {$sum:["$a", 1]} }}]}, prev_a: "$a"}}], {upsert:true}) { _id:1, a: 21 } { _id: 1, a: 11, prev_a: 10 } { _id: 1, a: 100, prev_a: 100 } { _id:1, a: 21 } { _id:1, a: 1, prev_a: "10" }
  • 61. {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" Set Defaults
  • 62. {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{ }} ], {multi:true}) Set Defaults
  • 63. {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ ]}} ], {multi:true}) Set Defaults
  • 64. {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ { a:0, b:0, c:"unset" }, "$$ROOT" ]}} ], {multi:true}) Set Defaults
  • 65. {_id: 1, a: 5, b: 12} {_id: 2, a: 15, c: "abc"} {_id: 3, b: 99, c: "xyz"} If a or b are missing, set to 0, if c is missing -> "unset" db.coll.update({}, [ {$replaceWith:{$mergeObjects:[ { a:0, b:0, c:"unset" }, "$$ROOT" ]}} ], {multi:true}) Set Defaults {_id: 1, a: 5, b: 12, c: "unset"} {_id: 2, a: 15, b: 0, c: "abc"} {_id: 3, a: 0, b: 99, c: "xyz"}
  • 66. { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set:{h:{$cond:{ if: then: else: }}}}], {upsert:true})
  • 67. { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set:{h:{$cond:{ if: then: else: }}}}], {upsert:true}) { id: 1, d: ISODate("2019-06-04T00:00:00"), h: [ { hour:"11", value: 296 }, { hour:"12", value: 300 } ]} id: X, d:Y, hour:Z, value: VAL db.coll.update({id:X, d:Y}, [ {$set:{h:{$cond:{ if: {$in:[Z,{$ifNull:["$h.hour",[]]}]}, then:{$map:{ input:"$h", in: {$cond:{ if:{$ne:["$$this.hour",Z]}, then:"$$this", else: {hour: Z, value: {$sum:[ "$$this.value", VAL]}} }} }}, else:{$concatArrays:[{$ifNull:["$h",[]]},[{hour:Z,value:VAL}]]} }}}}], {upsert:true})
  • 68. Recap: Updates can be specified with aggregation pipeline All fields from existing document can be accessed Slightly slower, but a lot more powerful
  • 69. THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 70. THE FUTURE OF AGGREGATION Better performance & optimizations More stages & expressions More options for output Compass helper for aggregate Unify different languages
  • 71. THE FUTURE OF AGGREGATION More options for output
  • 73. Prior to MongoDB 4.2 $out coll new_coll $out
  • 74. Prior to MongoDB 4.2 $out coll new_coll $out db.coll.aggregate( [ {pipeline}, ... {$out: "new_coll"} ]);
  • 75. Prior to MongoDB 4.2 $out coll new_coll $out db.coll.aggregate( [ {pipeline}, ... {$out: "new_coll"} ]); new_coll ○ must be unsharded ○ overwrites existing
  • 76. New $merge stage in aggregation pipeline
  • 78. MongoDB 4.2 $merge db.coll.aggregate( [ {pipeline}, ..., {$merge: { ... }} ]); coll coll2 $merge
  • 79. MongoDB 4.2 $merge db.coll.aggregate( [ {pipeline}, ..., {$merge: { ... }} ]); coll2 ○ can exist ○ same or different 'db' ○ can be sharded coll coll2 $merge
  • 80. coll coll2 $merge { } { } { } { } { } { } { } { } MongoDB 4.2
  • 82. $merge syntax { $merge: { into: <target> } } {$merge: "collection2"}
  • 83. $merge syntax { $merge: { into: <target> } } {$merge: {into: {db: "db2", coll: "collection2"}}}
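Both target spellings in a runnable sketch (collection and database names are illustrative):

    // Shorthand form: a bare string names a collection in the same database.
    db.coll.aggregate([ { $merge: "collection2" } ])

    // Full form: "into" takes a { db, coll } document, allowing a different database.
    db.coll.aggregate([ { $merge: { into: { db: "db2", coll: "collection2" } } } ])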
  • 85. $merge syntax { $merge: { into: <target>, on: <fields> } } on: "_id" on: [ "_id", "shardkey(s)" ] must be unique
  • 86. $merge syntax { $merge: { into: <target>, on: <fields> } }
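Because the on field(s) must identify target documents uniquely, the target collection needs a matching unique index; a minimal sketch with illustrative collection and field names:

    // The target must have a unique index covering the "on" fields.
    db.target.createIndex( { event: 1, date: 1 }, { unique: true } )

    db.source.aggregate([
      { $merge: { into: "target", on: [ "event", "date" ] } }
    ])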
  • 89. Actions nothing matched: usually insert source target
  • 90. Actions nothing matched: usually insert document matched: source target
  • 91. Actions nothing matched: usually insert document matched: overwrite? update? ??? source target
  • 92. Actions nothing matched: usually insert document matched: update source target
  • 93. Actions nothing matched: usually insert document matched: update (merge) source target
  • 94. $merge syntax { $merge: { into: <target>, whenNotMatched: whenMatched: } }
  • 95. $merge syntax { $merge: { into: <target>, whenNotMatched:"insert", whenMatched: } }
  • 96. $merge syntax { $merge: { into: <target>, whenNotMatched:"insert", whenMatched:"merge" } }
  • 97. $merge syntax { $merge: { into: <target>, whenNotMatched:"insert"|"discard"|"fail", whenMatched:"merge" } }
  • 98. $merge syntax { $merge: { into: <target>, whenNotMatched:"insert"|"discard"|"fail", whenMatched:"merge"|"replace"|"keepExisting"|"fail"|[...] } }
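A couple of combinations of those options as runnable sketches (collection names are illustrative):

    // Add documents that are new to the target, never touch existing ones.
    db.source.aggregate([
      { $merge: { into: "target", whenNotMatched: "insert", whenMatched: "keepExisting" } }
    ])

    // Refresh only documents that already exist in the target; discard everything else.
    db.source.aggregate([
      { $merge: { into: "target", whenNotMatched: "discard", whenMatched: "replace" } }
    ])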
  • 99. $merge syntax { $merge: { into: <target>, whenMatched:[...] } }
  • 100. $merge syntax { $merge: { into: <target>, whenMatched:[<custom pipeline>] } }
  • 101. $merge example { $merge: { into: <target>, whenMatched:[ {$addFields:{ }} ] } }
  • 102. $merge example { $merge: { into: <target>, whenMatched:[ {$addFields:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 103. $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 104. $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } }
  • 105. $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { }
  • 106. $merge example { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { _id: "37", total: 309, f1: "yyy" }
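To reproduce the example above in the shell (source and target are illustrative collection names; the documents are the ones shown on the slide):

    db.target.insertOne( { _id: "37", total: 245, f1: "yyy" } )   // existing target doc
    db.source.insertOne( { _id: "37", total: 64,  f1: "x"   } )   // incoming doc

    db.source.aggregate([
      { $merge: {
          into: "target",
          whenMatched: [
            // "$total" is the target's field, "$$new.total" the incoming document's
            { $set: { total: { $sum: [ "$total", "$$new.total" ] } } }
          ]
      } }
    ])

    db.target.findOne()   // { _id: "37", total: 309, f1: "yyy" }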
  • 107. $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } }
  • 108. $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { }
  • 109. $merge example 2 { $merge: { into: <target>, whenMatched:[ {$replaceWith:{$mergeObjects:[ "$$new", {total:{$sum:["$$new.total", "$total"]}} ]}} ] } } Incoming Target { _id: "37", total: 64, f1: "x" } { _id: "37", total: 245, f1: "yyy" } Result: { _id: "37", total: 309, f1: "x" }
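Starting again from the same two documents, example 2 flips which side contributes f1; a sketch with the same illustrative collection names as before:

    db.source.aggregate([
      { $merge: {
          into: "target",
          whenMatched: [
            // the replacement is built from the incoming document ("$$new"), so its
            // f1 wins; the recomputed total then overrides $$new.total, because
            // later values win inside $mergeObjects
            { $replaceWith: { $mergeObjects: [
                "$$new",
                { total: { $sum: [ "$$new.total", "$total" ] } }
            ] } }
          ]
      } }
    ])

    db.target.findOne()   // { _id: "37", total: 309, f1: "x" }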
  • 110. $merge syntax { $merge: { into: <target>, whenMatched:[...] } }
  • 111. $merge syntax { $merge: { into: <target>, let: { ... }, whenMatched:[ ...] } }
  • 112. $merge syntax { $merge: { into: <target>, let: {new: "$$ROOT"}, whenMatched:[ ...] } }
  • 114. { $merge: { into: <target>, whenMatched:[ {$set:{ total:{$sum:["$total","$$new.total"]} }} ] } } { $merge: { into: <target>, let: {itotal: "$total"}, whenMatched:[ {$set:{ total:{$sum:["$total","$$itotal"]} }} ] } }
  • 116. (diagram: staging collection 'temp' → 'data' collection) Using $merge to append loaded and cleansed records into the db
  • 117. aggregate 'temp' and append valid records to 'data' db.temp.aggregate( [ { ... } /* pipeline to massage and cleanse data in temp */, {$merge:{ into: "data", whenMatched: "fail" }} ]);
  • 118. aggregate 'temp' and append valid records to 'data' db.temp.aggregate( [ { ... } /* pipeline to massage and cleanse data in temp */, {$merge:{ into: "data", whenMatched: "fail" }} ]); Similar to SQL's INSERT INTO T1 SELECT * from T2
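A sketch of the whole staging flow; the cleansing stages and the status field are illustrative, not part of the talk:

    // Raw rows were bulk-loaded into 'temp'; shape the valid ones and append them
    // to 'data'. whenMatched:"fail" aborts if a matching _id already exists in 'data'
    // (whenNotMatched defaults to "insert").
    db.temp.aggregate([
      { $match: { status: "valid" } },       // illustrative cleansing step
      { $project: { status: 0 } },           // illustrative reshaping step
      { $merge: { into: "data", whenMatched: "fail" } }
    ])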
  • 120. (diagram: mflix.users and mfriendbook.users feeding sv.users) Using $merge to populate/update user fields from other services
  • 121. (diagram: mflix.users and mfriendbook.users feeding sv.users) Using $merge to populate/update user fields from other services sv.users { _id: "user253", dob: ISODate(...), f1: "yyy" }
  • 122. $merge updates fields from the mflix.users collection into the sv.users collection. Our "_id" field is the unique username mflix_pipeline = [ { "$project" : { "_id" : "$username", "mflix" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "coll" : "users" }, "whenNotMatched" : "discard" }} ] (in mflix) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy" }
  • 123. $merge updates fields from the mflix.users collection into the sv.users collection. Our "_id" field is the unique username mflix_pipeline = [ { "$project" : { "_id" : "$username", "mflix" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "coll" : "users" }, "whenNotMatched" : "discard" }} ] (in mflix) db.users.aggregate(mflix_pipeline) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... } }
  • 124. $merge updates fields from the mfriendbook.users collection into the sv.users collection. Our "_id" field is the unique username mfriendbook_pipeline = [ { "$project" : { "_id" : "$username", "mfriendbook" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "coll" : "users" }, "whenNotMatched" : "discard" }} ] (in mfriendbook) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... } }
  • 125. $merge updates fields from the mfriendbook.users collection into the sv.users collection. Our "_id" field is the unique username mfriendbook_pipeline = [ { "$project" : { "_id" : "$username", "mfriendbook" : "$$ROOT" }}, { "$merge" : { "into" : { "db": "sv", "coll" : "users" }, "whenNotMatched" : "discard" }} ] (in mfriendbook) db.users.aggregate(mfriendbook_pipeline) sv.users { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... }, mfriendbook: { ... } }
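After both pipelines have run, each sv.users document carries one sub-document per service, which can be checked directly from the shell:

    // Inspect the combined document in the sv database.
    db.getSiblingDB("sv").users.findOne({ _id: "user253" })
    // { _id: "user253", dob: ISODate(...), f1: "yyy", mflix: { ... }, mfriendbook: { ... } }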
  • 127. (diagram: registrations collection → regsummary collection) Using $merge to incrementally update periodic rollups in a summary collection
  • 128. $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true});
  • 129. $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true}); db.registrations.aggregate([ {$match: {event_id: "MDBW19"}}, {$group:{ _id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}}, count: {$sum:1} }}, {$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ])
  • 130. $merge to create/update periodic rollups in summary collection (for all days) db.regsummary.createIndex({event:1, date:1}, {unique: true}); db.registrations.aggregate([ {$match: {event_id: "MDBW19"}}, {$group:{ _id:{$dateToString:{date:"$date",format:"%Y-%m-%d"}}, count: {$sum:1} }}, {$project: {_id:0,event:"MDBW19",date:"$_id",total:"$count"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ]) { "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 } { "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 } { "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 }
  • 131. $merge to incrementally update periodic rollups in summary collection (for single day)
  • 132. $merge to incrementally update periodic rollups in summary collection (for single day) db.registrations.aggregate([ {$match: { event_id: "MDBW19", date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")} }}, {$count: "total"}, {$addFields: {event:"MDBW19", "date":"2019-05-22"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ])
  • 133. $merge to incrementally update periodic rollups in summary collection (for single day) db.registrations.aggregate([ {$match: { event_id: "MDBW19", date:{$gte:ISODate("2019-05-22"),$lt:ISODate("2019-05-23")} }}, {$count: "total"}, {$addFields: {event:"MDBW19", "date":"2019-05-22"}}, {$merge: { into: "regsummary", on: ["event", "date"] }} ]) { "event" : "MDBW19", "date" : "2019-05-19", "total" : 33 } { "event" : "MDBW19", "date" : "2019-05-20", "total" : 15 } { "event" : "MDBW19", "date" : "2019-05-21", "total" : 24 } { "event" : "MDBW19", "date" : "2019-05-22", "total" : 34 }
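A sketch of wrapping the single-day rollup so it can run on a schedule; the rollupDay helper and its arguments are illustrative, not part of the talk:

    // Roll up one day of registrations for one event into regsummary.
    function rollupDay(eventId, dayStart) {
      var dayEnd = new Date(dayStart.getTime() + 24 * 60 * 60 * 1000);
      var dayStr = dayStart.toISOString().substring(0, 10);   // "YYYY-MM-DD"
      db.registrations.aggregate([
        { $match: { event_id: eventId, date: { $gte: dayStart, $lt: dayEnd } } },
        { $count: "total" },
        { $addFields: { event: eventId, date: dayStr } },
        { $merge: { into: "regsummary", on: [ "event", "date" ] } }
      ]);
    }

    rollupDay("MDBW19", ISODate("2019-05-22"))   // same call as the slide above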