The aggregation framework is one of the most powerful analytical tools available with MongoDB.
Learn how to create a pipeline of operations that can reshape and transform your data and apply a range of analytics functions and calculations to produce summary results across a data set.
Beyond The Basics : Part 2
Analytics and the Aggregation Framework
Joe Drumgoole
Director of Developer Advocacy, EMEA
@jdrumgoole
V1.1
Beyond The Basics
– Storage Engines
• What storage engines are and how to pick them
– Aggregation Framework
• How to deploy advanced analytics processing right inside the database
– The BI Connector
• How to create visualisations and dashboards from your MongoDB data
– Authentication and Authorisation
• How to secure MongoDB, both on-premise and in the cloud
The Aggregation Framework
• An analytics engine for MongoDB
• What is analytics?
• Think of the two types of database workload: OLTP and OLAP
• OLTP : Online Transaction Processing
– Airline booking
– ATMs
– Taxi booking
• OLAP : Online Analytical Processing
– Which tickets make us most money?
– When do we need to refill our ATMs?
– How many cabs do we need to service the West End of London?
The Aggregation Framework – A Processing Pipeline
Match → Project → Group → Sort → Limit
• Think unix pipeline
• The output of one stage is passed to the input of the next stage
• Each stage performs one job
• Stages can be repeated
• Output is a cursor, a new collection or a view
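The pipeline idea above can be illustrated without a database. The following is a toy, in-memory sketch in Python, not MongoDB's implementation: the helper names (match, project, limit, run_pipeline) and the sample documents are hypothetical and exist only to show that each stage does one job, that output flows from one stage into the next, and that a stage can appear more than once.

```python
from functools import reduce

# Toy stand-ins for pipeline stages (hypothetical helpers, not MongoDB code).
# Each factory returns a function that takes an iterable of documents
# and yields documents.
def match(predicate):
    def stage(docs):
        return (d for d in docs if predicate(d))
    return stage

def project(*fields):
    def stage(docs):
        return ({f: d[f] for f in fields} for d in docs)
    return stage

def limit(n):
    def stage(docs):
        for i, d in enumerate(docs):
            if i >= n:
                break
            yield d
    return stage

def run_pipeline(docs, stages):
    # Feed the output of each stage into the next, like `a | b | c` in a shell.
    return list(reduce(lambda stream, stage: stage(stream), stages, docs))

# Hypothetical sample documents.
members = [
    {"name": "Ann", "city": "Dublin", "visits": 3},
    {"name": "Bob", "city": "Cork", "visits": 1},
    {"name": "Cat", "city": "Dublin", "visits": 7},
]

result = run_pipeline(members, [
    match(lambda d: d["city"] == "Dublin"),   # filter early
    project("name", "visits"),                # keep only two fields
    match(lambda d: d["visits"] > 2),         # a stage can be repeated
    limit(1),
])
print(result)
```

In MongoDB itself the stages are plain documents such as {"$match": ...} handed to aggregate(); the server, not the client, moves documents between stages.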
Typical Goals of Aggregation Framework
• Columnar Analytics
• Reshaping data
• Unwinding arrays into individual documents
• Linking collections together
• Generating new data from old (collections and views)
Pipeline Operators
• $match: Filter documents
• $project: Reshape documents
• $group: Summarize documents
• $out: Create new collections
• $sort: Order documents
• $limit/$skip: Paginate documents
• $lookup: Join two collections together
• $unwind: Expand an array
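As a rough sketch of how several of these operators combine, the Python snippet below builds a pipeline as a list of stage documents. The "groups" collection, the "group_info" output field and the page_stages helper are assumptions made for illustration; only the operator names and the member.chapters / member.join_time field paths come from the slides.

```python
def page_stages(page, page_size):
    """Return $skip/$limit stages for a zero-based page number (hypothetical helper)."""
    return [{"$skip": page * page_size}, {"$limit": page_size}]

pipeline = [
    # One output document per element of the member.chapters array.
    {"$unwind": "$member.chapters"},
    # Join each unwound document against a hypothetical "groups" collection.
    {"$lookup": {
        "from": "groups",                          # assumed collection name
        "localField": "member.chapters.urlname",
        "foreignField": "urlname",                 # assumed field in "groups"
        "as": "group_info",                        # assumed output field
    }},
    # Sort before paginating so that pages are stable.
    {"$sort": {"member.join_time": 1}},
] + page_stages(page=2, page_size=10)

stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

With a driver such as pymongo and a running server, a list like this would be passed to a collection's aggregate() method.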
Example Pipeline
Match → Project → Group → Sort → Out
• Match
– Find content
– Standard query
– Uses indexes
– Reduce doc count
– Use first
• Project
– Select content
– Remove fields
– Add fields
– Reduce doc size
– Looks at every doc
• Group
– Collect content
– Sum, Avg etc.
– Rewrite _id
– Reduce doc count
– Looks at every doc
• Sort
– Sort on fields
– Several sorts allowed
– Ascending or descending
– 100 MB limit
– Allow Disk Use
• Out
– New collection
– $out overwrites
– Only one per aggregate
– Last member
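A minimal sketch of the five-stage shape described above, written as a Python function that returns the stage list. The field member.country and the output collection name members_by_country are hypothetical, chosen only to give the stages something to work on; batchID is taken from the later slides.

```python
def summary_pipeline(batch_id, out_collection):
    """Build a Match -> Project -> Group -> Sort -> Out pipeline (illustrative)."""
    return [
        # Match first: a standard query that can use indexes and cuts the doc count.
        {"$match": {"batchID": batch_id}},
        # Project: drop _id and keep one (hypothetical) field to shrink each doc.
        {"$project": {"_id": 0, "country": "$member.country"}},
        # Group: rewrite _id to the grouping key and count documents per key.
        {"$group": {"_id": "$country", "count": {"$sum": 1}}},
        # Sort: descending by count.
        {"$sort": {"count": -1}},
        # Out: must be the last stage; overwrites the target collection.
        {"$out": out_collection},
    ]

pipeline = summary_pipeline(138, "members_by_country")
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)

# With pymongo and a running server (both assumed here), large sorts can be
# allowed to spill to disk:
#   db.members.aggregate(pipeline, allowDiskUse=True)
```

Putting $match first matters because it is the stage that can use indexes; every later stage then sees fewer documents.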
Search for New Members This Year
JD10Gen:apps jdrumgoole$ ./mug_analytics_main.py --stats newmembers --url DublinMUG --sort join_date --format csv --direction ascending --limit 10 --start 1-Jan-2017
Processing : ['DublinMUG']
Sorting on 'join_date' direction = 'ascending'
db.members.aggregate([
{"$match": {"batchID": 138}},
{"$unwind": "$member.chapters"},
{"$match": {"member.chapters.urlname": {"$in": ["DublinMUG"]}}},
{"$match": {"member.join_time": {"$gte": "2017-01-01T00:00:00"}}},
{"$project": {"join_date": "$member.join_time", "_id": 0, "group": "$member.chapters.urlname", "name": "$member.member_name"}},
{"$limit": 10}])
group,name,join_date
DublinMUG,Gosia,17-Apr-2017 12:51
DublinMUG,Luke Shiels,15-Apr-2017 14:00
DublinMUG,Silvia Sirbu,11-Apr-2017 12:00
DublinMUG,Steeve P.,04-Apr-2017 09:47
DublinMUG,Dafei W,30-Mar-2017 11:36
DublinMUG,Ross Norman,13-Mar-2017 11:30
DublinMUG,Grzegorz F.,08-Mar-2017 10:25
DublinMUG,Lucas Sacramento,07-Mar-2017 11:05
DublinMUG,David Blount,06-Mar-2017 12:33
DublinMUG,Luca Ballerini,06-Mar-2017 10:41
Wrote 10 records
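The pipeline printed above could be assembled by a function along the following lines. This is a sketch, not the code inside mug_analytics_main.py, which the slides do not show: the function name and its parameters are hypothetical. The printed pipeline contains no $sort stage, so the optional sort_direction argument is an assumption added for illustration; left at its default, the function reproduces the printed stages exactly. The start value is passed through unchanged, so if join_time is stored as a date rather than a string, a datetime would have to be supplied instead.

```python
def new_members_pipeline(batch_id, urlnames, start, limit, sort_direction=None):
    """Rebuild the 'new members' pipeline shown on the slide (hypothetical helper)."""
    pipeline = [
        {"$match": {"batchID": batch_id}},
        {"$unwind": "$member.chapters"},
        {"$match": {"member.chapters.urlname": {"$in": list(urlnames)}}},
        {"$match": {"member.join_time": {"$gte": start}}},
        {"$project": {"join_date": "$member.join_time", "_id": 0,
                      "group": "$member.chapters.urlname",
                      "name": "$member.member_name"}},
    ]
    # Assumption: an explicit sort (1 ascending, -1 descending) goes before
    # the limit. It is not part of the pipeline printed on the slide.
    if sort_direction is not None:
        pipeline.append({"$sort": {"join_date": sort_direction}})
    pipeline.append({"$limit": limit})
    return pipeline

printed = new_members_pipeline(138, ["DublinMUG"], "2017-01-01T00:00:00", 10)
with_sort = new_members_pipeline(138, ["DublinMUG"], "2017-01-01T00:00:00", 10,
                                 sort_direction=1)
print(len(printed), len(with_sort))
```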
Turn an Aggregation into a View
• Supported in MongoDB 3.4 and later
• A view is non-materialised: it is computed from its source collection each time it is read
MongoDB Enterprise > db.createView( "batch138",
"members",
[ { "$match" : { "batchID" : 138 }} ] )
{ "ok" : 1 }
MongoDB Enterprise >
• A view persists and will return new results each time a find is run
• A view looks just like a collection
• The feature compatibility version must first be set to 3.4
MongoDB Enterprise > db.adminCommand( { setFeatureCompatibilityVersion: "3.4"} )
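From Python, the same view can be requested through the server's create command, which takes viewOn and pipeline fields. The sketch below only builds the command document, so it runs without a server; the helper name create_view_command is hypothetical, and the commented-out call assumes pymongo and a reachable database.

```python
def create_view_command(view_name, source, pipeline):
    """Build the document for MongoDB's `create` command that defines a view."""
    # The command name must be the first key; Python dicts keep insertion order.
    return {"create": view_name, "viewOn": source, "pipeline": list(pipeline)}

cmd = create_view_command("batch138", "members",
                          [{"$match": {"batchID": 138}}])
print(cmd)

# With pymongo and a running server (both assumed here):
#   db.command(cmd)
#   db.batch138.find_one()   # the view is queried like a collection
```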
Useful Links
• The Aggregation Python class
https://github.com/jdrumgoole/mongodb_utils/blob/master/mongodb_utils/agg.py
• Aggregation docs
https://docs.mongodb.com/manual/aggregation/
• MongoDB Views in 3.4
https://docs.mongodb.com/manual/core/views/