“The Weather of the Century”:
Data Visualization With MongoDB And Python
A. Jesse Jiryu Davis 
Senior Engineer, MongoDB 
@jessejiryudavis
Serious MongoDB Talk 
Database
Serious MongoDB Talk
This Talk
Where’s the data from?
How Much Is There? 
• 2.5 billion documents 
• 4 TB (about 1.6 KB per document)
• “Medium data”
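The per-document figure follows from the two totals on this slide. A quick check, assuming decimal units (1 TB = 10**12 bytes), which the slide does not specify:

```python
# Back-of-the-envelope check of the average document size.
# Assumption: decimal units, 1 TB = 10**12 bytes.
TOTAL_BYTES = 4 * 10**12
DOCUMENT_COUNT = 2_500_000_000  # 2.5 billion

bytes_per_document = TOTAL_BYTES / DOCUMENT_COUNT
print(bytes_per_document)
```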
What Does It Look Like? 
0303725053947282013060322517+40779-073969FM-15+0048KNYC 
V0309999C00005030485MN0080475N5+02115+02005100975 
ADDAA101000095AU100001015AW1105GA1025+016765999GA2045+024385999 
GA3075+030485999GD11991+0167659GD22991+0243859GD33991+0304859... 
{
    "st" : "u725053",
    "ts" : ISODate("2013-06-03T22:51:00Z"),
    "airTemperature" : {
        "value" : 21.1,
        "quality" : "5"
    },
    "atmosphericPressure" : {
        "value" : 1009.7,
        "quality" : "5"
    }
}
Station Identifier 
(»NYC Central Park«)
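A minimal sketch of how the fixed-width record could map onto the document shown above. The character offsets, the scaling by ten, and the function name `parse_record` are assumptions; they were chosen because they reproduce the sample values, and the talk does not show its own parser.

```python
from datetime import datetime, timezone

# First 105 characters of the sample record. The trailing space after
# "KNYC" is part of a fixed-width field and must be kept.
RECORD = (
    "0303725053947282013060322517+40779-073969FM-15+0048KNYC "
    "V0309999C00005030485MN0080475N5+02115+02005100975"
)


def parse_record(line):
    """Turn one fixed-width record into a document like the one above.

    All offsets are assumptions inferred from the sample record.
    """
    timestamp = datetime.strptime(line[15:27], "%Y%m%d%H%M")
    return {
        # The "u" prefix is copied from the slide; its meaning is not stated.
        "st": "u" + line[4:10],
        "ts": timestamp.replace(tzinfo=timezone.utc),
        "airTemperature": {
            "value": int(line[87:92]) / 10,   # stored as tenths
            "quality": line[92],
        },
        "atmosphericPressure": {
            "value": int(line[99:104]) / 10,  # stored as tenths
            "quality": line[104],
        },
    }


doc = parse_record(RECORD)
print(doc)
```

The remaining sections of the record (the lines starting with ADD) are optional data and are ignored here.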
{
    ts: ISODate("1991-01-01T00:00:00Z"),
    position: {
        type: "Point",
        coordinates: [
            -94.6,
            39.117
        ]
    },
    airTemperature: {
        value: 27,
        quality: "1"
    }
}
GeoJSON
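GeoJSON stores a point as [longitude, latitude], in that order, which is easy to get backwards. A small sketch of a helper that builds the `position` field and rejects swapped arguments when the range check can catch them; the name `geojson_point` is hypothetical, not part of PyMongo or the talk.

```python
def geojson_point(longitude, latitude):
    """Build a GeoJSON Point; coordinates are longitude first."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude out of range: %r" % (longitude,))
    if not -90 <= latitude <= 90:
        raise ValueError("latitude out of range: %r" % (latitude,))
    return {"type": "Point", "coordinates": [longitude, latitude]}


# Same coordinates as the sample document above.
position = geojson_point(-94.6, 39.117)
print(position)
```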
Visualization
Visualization Pipeline 
MongoDB → PyMongo → Python dicts → NumPy → Matplotlib
SciPy
{
    ts: ISODate("1991-01-01T00:00:00Z"),
    position: {
        type: "Point",
        coordinates: [
            -94.6,
            39.117
        ]
    },
    airTemperature: {
        value: 45,
        quality: "1"
    }
}
import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

# `query` is a MongoDB filter document defined elsewhere.
for doc in db.collection.find(query):
    # Collect (longitude, latitude, temperature) for each observation.
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)
# NumPy column access syntax.
lons = arrays[:, 0]
lats = arrays[:, 1]
temps = arrays[:, 2]
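To see the column syntax without a database, here is a small sketch using two hypothetical rows in the same (longitude, latitude, temperature) layout that the query loop produces:

```python
import numpy

# Hypothetical rows: (longitude, latitude, temperature).
data = [
    (-94.6, 39.117, 27.0),
    (-73.969, 40.779, 21.1),
]
arrays = numpy.array(data)

# ":" selects every row; the second index selects one column.
lons = arrays[:, 0]
lats = arrays[:, 1]
temps = arrays[:, 2]
print(arrays.shape, lons, lats, temps)
```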
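import numpy
from scipy.interpolate import griddata
from matplotlib import pyplot

xs = numpy.linspace(-180, 180, 361)
ys = numpy.linspace(-90, 90, 181)
# Interpolate the scattered station readings onto a regular 1-degree grid.
grid_x, grid_y = numpy.meshgrid(xs, ys)
zs = griddata((lons, lats), temps,
              (grid_x, grid_y),
              method='linear')  # Magic!

pyplot.contour(xs, ys, zs)  # Also magic!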
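A self-contained sketch of the gridding step, using `scipy.interpolate.griddata` on five hypothetical stations instead of real query results. The temperatures are synthetic and vary linearly with position, so linear interpolation should reproduce them exactly inside the region the stations enclose.

```python
import numpy
from scipy.interpolate import griddata

# Hypothetical stations: four corners of a square plus its center.
lons = numpy.array([-10.0, 10.0, 10.0, -10.0, 0.0])
lats = numpy.array([-10.0, -10.0, 10.0, 10.0, 0.0])
# Synthetic temperatures that vary linearly with position.
temps = 50.0 + 0.5 * lons - 0.25 * lats

# Regular grid that lies strictly inside the square of stations.
xs = numpy.linspace(-5, 5, 11)
ys = numpy.linspace(-5, 5, 6)
grid_x, grid_y = numpy.meshgrid(xs, ys)

# points are (x, y) pairs, so longitude comes first.
zs = griddata((lons, lats), temps, (grid_x, grid_y), method='linear')
print(zs.shape)
```

Grid points outside the convex hull of the stations come back as NaN by default, which is why this grid stays inside the square.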
Triangulation
What temperature?
Barycentric Interpolation
What temperature?
Triangle vertices: 53, 48, 54
Weighted average: 51.1
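A minimal sketch of barycentric interpolation inside one triangle. Only the three vertex temperatures come from the slide; the station coordinates are hypothetical, so the result at the chosen point is not the slide's 51.1.

```python
def barycentric_weights(p, a, b, c):
    """Return the weights (wa, wb, wc) of point p in triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, triangle, values):
    """Weighted average of the vertex values at point p."""
    weights = barycentric_weights(p, *triangle)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical station positions; the temperatures are from the slide.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
temperatures = [53.0, 48.0, 54.0]

print(interpolate((1.0, 1.0), triangle, temperatures))
```

The weights always sum to one, and a point sitting on a vertex gets that vertex's temperature unchanged.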
Interpolation 
51.1
Contours
Not terrifically fast 
import numpy
import pymongo

data = []
db = pymongo.MongoClient().my_database

for doc in db.collection.find(query):
    data.append((
        doc['position']['coordinates'][0],
        doc['position']['coordinates'][1],
        doc['airTemperature']['value']))

arrays = numpy.array(data)
MongoDB-to-NumPy Performance 
• Querying: 109k documents per second 
• (On localhost) 
• Can we go faster? 
• Enter “Monary”
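The talk does not show how the documents-per-second figure was measured. One way such a number could be obtained is to time a full pass over a cursor; the sketch below does that with a generator standing in for a PyMongo cursor, and the helper name `documents_per_second` is hypothetical.

```python
import time


def documents_per_second(documents):
    """Consume an iterable and return (count, documents per second)."""
    start = time.perf_counter()
    count = 0
    for _ in documents:
        count += 1
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


# Stand-in for a cursor such as db.collection.find(query).
fake_cursor = ({"airTemperature": {"value": 21.1}} for _ in range(100_000))
count, rate = documents_per_second(fake_cursor)
print(count, rate)
```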
Monary 
by David Beach 
MongoDB → PyMongo → Python dicts → NumPy → Matplotlib
MongoDB → Monary → NumPy → Matplotlib
import monary

monary_connection = monary.Monary()

# Each field is read straight into its own typed NumPy array.
arrays = monary_connection.query(
    db='my_database',
    coll='collection',
    query=query,
    fields=[
        'position.coordinates.0',
        'position.coordinates.1',
        'airTemperature.value'],
    types=[
        'float32',
        'float32',
        'float32'])
Monary 
• PyMongo: 109k documents per second 
• Monary: 817k documents per second
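To put the two rates in perspective, a short calculation of the speedup and of how long a full pass over 2.5 billion documents would take at each rate. This assumes the localhost rates would hold for the whole data set, which the talk does not claim.

```python
PYMONGO_DOCS_PER_SECOND = 109_000
MONARY_DOCS_PER_SECOND = 817_000
TOTAL_DOCUMENTS = 2_500_000_000  # from the "How Much Is There?" slide


def hours_to_load(document_count, docs_per_second):
    """Hours needed to read document_count documents at a constant rate."""
    return document_count / docs_per_second / 3600


speedup = MONARY_DOCS_PER_SECOND / PYMONGO_DOCS_PER_SECOND
pymongo_hours = hours_to_load(TOTAL_DOCUMENTS, PYMONGO_DOCS_PER_SECOND)
monary_hours = hours_to_load(TOTAL_DOCUMENTS, MONARY_DOCS_PER_SECOND)
print(round(speedup, 1), round(pymongo_hours, 2), round(monary_hours, 2))
```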
Visualization
Monary
• Author: David Beach
• Contributors from MongoDB, Inc.: Kyle Suarez, Matt Cotter, Anna Herlihy
• Mentors: A. Jesse Jiryu Davis, Jason Carey
Monary
Recent features:
• Easy installation
• Nested field access
• Aggregation
• Python 3
Monary
Future:
• Insert, update, remove
• SSL and authentication mechanisms
• Improved API and logging
• parallelCollectionScan
• MongoDB 
• Python 
• Monary 
• NumPy 
• SciPy 
• Matplotlib
Thanks
Thank you 
A. Jesse Jiryu Davis 
Senior Python Engineer, MongoDB 
#MongoDBWorld
Presents
1. http://bit.ly/century-links
2. October MongoDB certification exams
   price *= 0.8
   Code “MongoDBBoston20”
   university.mongodb.com
3. Ask The Experts!
