Real-Time Stream Computation on Graphs using Storm, Neo4j and Python
Sonal Raj
http://www.sonalraj.com
Presented at Pycon India 2013
Bangalore, India
Copyrights © 2013, Sonal Raj, http://www.sonalraj.com
Introduction
• With data multiplying each day, storage and knowledge extraction are major concerns.
• Social Data Analysis, Business Intelligence
• Constraints of Real-Time and Fault-Tolerant Processing
In this Talk
• A look at Storm as a distributed computation framework
• Neo4j as a NoSQL graph database
• Some cool pictures
• What are we trying to achieve?
Disclaimer!
• This talk presents an overview of Storm and Neo4j, with fewer of the dirty details.
• I’m going to go pretty fast. Please hang on.
Part 1
Storm – The Hadoop of Real Time
Don’t we have Hadoop?
Storm v/s Hadoop
Both Storm and Hadoop
• Distributed Processing
• Fault Tolerance
Hadoop
• Large but finite jobs
• Processes a lot of data at once
• High latency
Storm
• Infinite computations called topologies
• Processes infinite streams of data one tuple at a time
• Low latency
So, what Storm gives us . .
• Real-time computations
• Guaranteed data processing
• Horizontal scalability and fault tolerance
• No intermediate message brokers
• A higher abstraction than message passing, so it makes sense!
A little deeper . . Concepts
Streams
Tuple Tuple Tuple Tuple Tuple
An unbounded sequence of tuples
So, what kind of a tuple is this?
Spouts
A source of streams
But what is the source FOR the spouts?
Bolts
Computational units that process input streams and produce new streams
Just 1 stream?
Topologies
A network of spouts and bolts
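The four concepts above can be mimicked in plain Python to see how they fit together. This is only a sketch and not Storm's API: generators stand in for streams, and the names `number_spout`, `square_bolt`, `even_filter_bolt` and `build_topology` are hypothetical, made up for this illustration.

```python
import itertools
from typing import Iterable, Iterator, Tuple


def number_spout() -> Iterator[Tuple[int]]:
    """Spout: the source of an unbounded stream of one-field tuples."""
    for n in itertools.count(1):
        yield (n,)


def square_bolt(stream: Iterable[Tuple[int]]) -> Iterator[Tuple[int, int]]:
    """Bolt: consumes an input stream and produces a new stream."""
    for (n,) in stream:
        yield (n, n * n)


def even_filter_bolt(stream: Iterable[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """Bolt: passes on only the tuples whose square is even."""
    for n, square in stream:
        if square % 2 == 0:
            yield (n, square)


def build_topology() -> Iterator[Tuple[int, int]]:
    """Topology: spouts and bolts wired into a network (here a simple chain)."""
    return even_filter_bolt(square_bolt(number_spout()))


# The stream never ends, so we only look at the first few tuples.
first_three = list(itertools.islice(build_topology(), 3))
print(first_three)
```

In a real cluster each spout and bolt runs as separate tasks on different machines; the chain of generators only shows the data flow.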
Is that it . . . ?
Tasks and Parallelism
A spout or bolt can run as multiple tasks across the cluster
Mr. Tuple: “Oh shoot, where do I go now?”
Groupings . . To the rescue of Mr. Tuple!
• Shuffle Grouping  # picks a random task
• Fields Grouping  # mod hashing on a subset of tuple fields
• All Grouping  # sends to all tasks
• Global Grouping  # picks the task with the lowest task id
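To make the four groupings concrete, here is a rough Python sketch of how each one could choose target tasks for a tuple. It assumes tasks are simply numbered 0 to num_tasks - 1, and the function names and the CRC32 hash are illustrative choices, not what Storm uses internally.

```python
import random
import zlib


def shuffle_grouping(tup, num_tasks, rng=random):
    """Pick one task at random, spreading load evenly over time."""
    return [rng.randrange(num_tasks)]


def fields_grouping(tup, num_tasks, field_indexes):
    """Hash the chosen fields and take the result modulo the task count,
    so tuples with equal values in those fields always reach the same task."""
    key = repr(tuple(tup[i] for i in field_indexes)).encode("utf-8")
    return [zlib.crc32(key) % num_tasks]


def all_grouping(tup, num_tasks):
    """Replicate the tuple to every task."""
    return list(range(num_tasks))


def global_grouping(tup, num_tasks):
    """Send the whole stream to the task with the lowest id."""
    return [0]


# Two tuples that share the first field land on the same task.
same_task = (
    fields_grouping(("ann", "bob", 1), 4, [0])
    == fields_grouping(("ann", "cat", 2), 4, [0])
)
print(same_task)
```

Fields grouping is the one to reach for when a bolt keeps per-key state, for example a counter per user id.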
A Storm Cluster
[Diagram: one NIMBUS node, three ZOOKEEPER nodes, five SUPERVISOR nodes]
If this were Hadoop: Nimbus would be the Job Tracker and the Supervisors the Task Trackers
But it’s NOT Hadoop! ZooKeeper co-ordinates everything between Nimbus and the Supervisors
Salient Features . .
• Storm 0.7.0+ supports Transactional Topologies
  ◦ Processes tuples in small batches
  ◦ If a failure occurs during commit, both the batch and the commit are retried
• Storm guarantees message processing using acknowledgements
• Petrel by AirSage is a Python wrapper for Storm; you can write and submit topologies in Python.
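The acknowledgement idea can be sketched in a few lines of Python. Storm's acker keeps one running XOR per spout tuple: every tuple id is XORed in when the tuple is created and again when it is acked, so the value returns to zero once the whole tuple tree has been processed. The class below is a simplified illustration with hypothetical names (`AckTracker`, `track`, `ack`); it leaves out timeouts and replays, and real ids are random 64-bit numbers rather than the tiny constants used here.

```python
class AckTracker:
    """Tracks, per spout tuple, the XOR of every tuple id still awaiting an ack."""

    def __init__(self):
        self._pending = {}  # root id -> running XOR of outstanding tuple ids

    def track(self, root_id, tuple_id):
        """Called when the spout emits a new tuple."""
        self._pending[root_id] = tuple_id

    def ack(self, root_id, tuple_id, child_ids=()):
        """Ack one tuple and register the tuples a bolt emitted while
        processing it, in a single update. Returns True once the whole
        tree rooted at root_id is done."""
        value = self._pending[root_id] ^ tuple_id
        for child_id in child_ids:
            value ^= child_id
        if value == 0:
            del self._pending[root_id]
            return True
        self._pending[root_id] = value
        return False

    def is_pending(self, root_id):
        return root_id in self._pending


tracker = AckTracker()
ROOT, T0, C1, C2 = "msg-1", 0b0001, 0b0110, 0b1010
tracker.track(ROOT, T0)
steps = [
    tracker.ack(ROOT, T0, child_ids=[C1, C2]),  # first bolt emits two children
    tracker.ack(ROOT, C1),                      # a downstream bolt acks child 1
    tracker.ack(ROOT, C2),                      # and child 2: the tree is complete
]
print(steps)
```

The appeal of this design is that the tracker needs a fixed amount of memory per spout tuple, no matter how large the tuple tree grows.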
Part 2
Neo4j – “Get Graphed”
This is how graph data was represented in an RDBMS.
ENTER, NOSQL DATABASES
Types of NOSQL Databases
[Chart: Key-Value Stores, Column-Family, Document databases and Graph databases plotted by Data Size against Data Complexity]
Why NOSQL Databases
• Easily horizontally scalable
• Dynamic schemas; handle unstructured data really well
• Excel in speed and volume
• Trade off consistency for efficiency (except in graph databases . . we’ll see why)
• A pleasure to code
• Free to use any query language (even SQL!)
• Downtime? What downtime?
The Property Graph Model of Graph Databases
• Core Abstractions
  ◦ Nodes
  ◦ Relationships between nodes
  ◦ Properties on both
• Traversal Framework
  High-performance queries on connected datasets
• Bindings
  REST, Gremlin, etc.
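The core abstractions are small enough to model directly in Python. The sketch below is an in-memory toy, not Neo4j or any of its bindings; `Node`, `Relationship`, `PropertyGraph` and their methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    node_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int      # id of the node the relationship leaves
    end: int        # id of the node it points to
    rel_type: str   # e.g. "KNOWS"
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Nodes and typed relationships, each carrying key-value properties."""

    def __init__(self):
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: int, **properties: Any) -> Node:
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start: int, rel_type: str, end: int, **properties: Any) -> Relationship:
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both nodes must exist before relating them")
        rel = Relationship(start, end, rel_type, dict(properties))
        self.relationships.append(rel)
        return rel

    def neighbours(self, node_id: int, rel_type: str) -> List[int]:
        """One traversal step: follow outgoing relationships of one type."""
        return [r.end for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


graph = PropertyGraph()
graph.add_node(1, name="Alice")
graph.add_node(2, name="Bob")
graph.relate(1, "KNOWS", 2, since=2012)
print(graph.neighbours(1, "KNOWS"))
```

A real graph database stores relationships alongside each node so that a traversal step does not scan every relationship, which this toy version does.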
Neo4j
• Fully ACID, with rollback support (unbelievable!)
• Schema-less, with efficient storage of semi-structured data
• Fast deep traversals instead of slow SQL queries that span many table joins
• Whiteboard friendly
• Very natural to express graph-related problems with traversals (recommendation engines, shortest path, etc.)
Neo4j Pythonized!
• Py2Neo is an excellent binding for Neo4j
• Accesses Neo4j using its RESTful API
• Still under development . . features like labels are yet to be included!
So, will relational databases be extinct?
OOPS!
Categories of Graphical Data
• Social Networks
• Citations
• Product Co-Purchasing
• Internet peer-to-peer
• Road Network and Map Data
• Web Graphs
Excellent source of sample graph data: http://snap.Stanford.edu/data/
Part 3
Get your hands dirty!
A demo . .
• Sample social network data set
• Data includes sign-up info, adding friends, unfriending, etc. for a month’s activity
• Neo4j
  ◦ Store and update the social data
• Storm
  ◦ Calculate the “friendship-index”
• “friendship-index”
  ◦ n = the number of people through whom person “A” is connected to person “B”
  ◦ Gives an idea of how close two people are!
  ◦ Useful while searching for friends on social networks (something like the friends-of-friends concept in Facebook’s Graph Search)
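One way to read the friendship-index is as the number of people on the shortest chain between A and B, which a breadth-first search gives directly. The sketch below works on a plain dictionary of friend sets instead of Neo4j; the function name and the conventions for edge cases (0 for the same person, None when no chain exists) are assumptions made for this example.

```python
from collections import deque


def friendship_index(friends, a, b):
    """Return how many people lie between a and b on the shortest chain
    of friendships: 0 for direct friends, None if they are not connected."""
    if a == b:
        return 0  # assumed convention for "the same person"
    seen = {a}
    queue = deque([(a, 0)])  # (person, number of hops from a)
    while queue:
        person, hops = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == b:
                return hops  # `hops` people sit strictly between a and b
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


friends = {
    "A": {"C"},
    "C": {"A", "D"},
    "D": {"C", "B"},
    "B": {"D"},
    "E": set(),
}
print(friendship_index(friends, "A", "B"))
```

Because breadth-first search visits people in order of distance, the first time B is reached is guaranteed to be along a shortest chain.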
The Topology . .
[Diagram: Source → Update Spout → Update Bolt; Source → Query Spout → Query Bolt]
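The shape of this topology can be tried out without a cluster. In the sketch below, generators play the two spout-and-bolt pairs and a dictionary of sets stands in for Neo4j; it is not Petrel or Storm code, and every name in it, as well as the ("add" / "remove", person, person) event format, is a hypothetical choice for illustration.

```python
from collections import defaultdict


def update_spout(events):
    """Emits one tuple per social event read from the source."""
    for event in events:
        yield tuple(event)


def update_bolt(stream, graph):
    """Applies each event to the graph store (a dict of sets here)."""
    for action, a, b in stream:
        if action == "add":
            graph[a].add(b)
            graph[b].add(a)
        elif action == "remove":
            graph[a].discard(b)
            graph[b].discard(a)
        yield (action, a, b)


def query_spout(requests):
    """Emits (caller, requested) pairs; one tuple carries several entities."""
    for caller, requested in requests:
        yield (caller, requested)


def query_bolt(stream, graph, lookup_fn):
    """Answers each request against the current state of the graph."""
    for caller, requested in stream:
        yield (caller, requested, lookup_fn(graph, caller, requested))


def are_direct_friends(graph, a, b):
    """Simplest possible lookup; a friendship-index function fits here too."""
    return b in graph.get(a, ())


graph = defaultdict(set)
events = [("add", "ann", "bob"), ("add", "bob", "cat"), ("remove", "ann", "bob")]
applied = list(update_bolt(update_spout(events), graph))
answers = list(query_bolt(query_spout([("bob", "cat"), ("ann", "bob")]),
                          graph, are_direct_friends))
print(answers)
```

Here the updates are drained before the queries run; in a live topology both halves run continuously and concurrently against the database.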
Update Spout
• Define what kind of tuples are emitted
• Gets and emits tuple streams
Update Bolt
• Objects for database access and indexing service
Query Spout
• The tuple to be emitted can contain multiple entities.
Query Bolt
• Retrieve caller friend and requested friend ids as per the database
Create Topology
• Import all spout and bolt files
• Unfortunately, there was no option in Petrel to turn off console debug, so the console view is really messy.
Topology.yaml
Configuration for the topology is specified in this file
A little More . .
[Diagram: Sources, Update Spout, Update Bolt, Query Spout, Query Bolt]
Final Thoughts
• A Storm-Neo4j framework is a boon for real-time graph computations
• Quite flexible in Java; Python bindings and implementations still have a long way to go.
• If you are an admin or developer, analyse your data and computing requirements before narrowing down on a framework.
…to play with Storm and Neo4J
• My PyCon Talk Repo – slides, code skeletons, etc.
  http://www.sonalraj.com/neo-storm.html
• Storm documentation (official)
  http://github.com/nathanmarz/storm
• Storm Book
  http://www.amazon.com/Getting-Started-Storm-Jonathan-Leibiusky/dp/1449324010
• Deployment of Storm on AWS
  http://github.com/nathanmarz/storm-deploy
• Neo4j Documentation
  http://www.neo4j.org
Ex-terminated . . .
- That’s it
- Thanks for Listening !
- Questions

Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 

Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013

  • 1. Real-Time stream computation on graphs using Storm, Neo4j and Python. Sonal Raj, http://www.sonalraj.com. Presented at PyCon India 2013, Bangalore, India. Copyright © 2013, Sonal Raj, http://www.sonalraj.com
  • 2. Introduction
      • With data multiplying each day, storage and knowledge extraction are major concerns.
      • Social data analysis, business intelligence
      • Constraints of real-time and fault-tolerant processing
  • 3. In this Talk
      • A look at Storm as a distributed computation framework
      • Neo4J as a NoSQL graph database
      • Some cool pictures
      • What are we trying to achieve?
  • 4. Disclaimer!
      • This talk presents an overview of Storm and Neo4J, with fewer dirty details :)
      • I’m going to go pretty fast . . . Please hang on.
  • 5. Part 1: Storm – The Hadoop of Real Time
  • 6. Don’t we have Hadoop?
  • 7. Storm v/s Hadoop. Storm and Hadoop:
      • Distributed processing
      • Fault tolerance
  • 8. Storm v/s Hadoop. Hadoop:
      • Large but finite jobs
      • Processes a lot of data at once
      • High latency
  • 9. Storm v/s Hadoop. Hadoop:
      • Large but finite jobs
      • Processes a lot of data at once
      • High latency
    Storm:
      • Infinite computations called topologies
      • Processes infinite streams of data one tuple at a time
      • Low latency
  • 10. So, what Storm gives us . .
      • Real-time computations
      • Guaranteed data processing
      • Horizontal scalability and fault tolerance
      • No intermediate message brokers
      • Higher abstraction than message passing, so it makes sense!
  • 11. A little deeper . . Concepts: Streams. An unbounded sequence of tuples.
  • 12. A little deeper . . Concepts: Streams. An unbounded sequence of tuples. So, what kind of a tuple is this?
  • 13. A little deeper . . Concepts: Spouts. A source of streams.
  • 14. A little deeper . . Concepts: Spouts. A source of streams. But, what is the source FOR the spouts?
  • 15. A little deeper . . Concepts: Bolts. Computational units processing input streams and producing new streams.
  • 16. A little deeper . . Concepts: Bolts. Computational units processing input streams and producing new streams. Just 1 stream?
  • 17. A little deeper . . Concepts: Topologies. A network of spouts and bolts.
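
The stream, spout, bolt and topology concepts above can be sketched in plain Python. This is only an in-process illustration using generators, not Storm's or Petrel's API; the names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are hypothetical and chosen for this example.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def sentence_spout(sentences: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout: a source of a stream. Emits one-field tuples."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes a stream of sentences and produces a stream of words."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps a running count and emits (word, count) after every tuple."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences: Iterable[str]) -> List[Tuple[str, int]]:
    """Topology: the spout wired to the two bolts, processed one tuple at a time."""
    return list(count_bolt(split_bolt(sentence_spout(sentences))))
```

In a real cluster the stream is unbounded and each stage runs as many tasks on different machines; here a finite list stands in for the source so the sketch terminates.
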
  • 18. Is that it . . . ? Tasks and Parallelism: a spout or bolt can execute multiple tasks across the cluster.
  • 19. Mr. Tuple: “Oh shoot, where do I go now?”
  • 20. Groupings . . To the rescue of Mr. Tuple!
      • Shuffle Grouping: picks a random task
      • Fields Grouping: mod hashing on a subset of tuple fields
      • All Grouping: sends to all tasks
      • Global Grouping: picks the task with the lowest task id
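
The four groupings listed above can be sketched as functions that decide which task ids receive a tuple. This is a simplified model for illustration, not Storm's implementation: the function names are hypothetical, and the CRC32 hash used for the fields grouping is an assumption standing in for whatever hash Storm uses internally.

```python
import random
import zlib
from typing import List, Sequence


def shuffle_grouping(tup: Sequence, task_ids: Sequence[int], rng=random) -> List[int]:
    """Pick one task at random."""
    return [rng.choice(list(task_ids))]


def fields_grouping(tup: Sequence, task_ids: Sequence[int],
                    field_indexes: Sequence[int]) -> List[int]:
    """Hash a subset of the tuple's fields, then take it modulo the task count.

    Tuples with equal values in those fields always reach the same task.
    """
    key = repr(tuple(tup[i] for i in field_indexes)).encode("utf-8")
    return [task_ids[zlib.crc32(key) % len(task_ids)]]


def all_grouping(tup: Sequence, task_ids: Sequence[int]) -> List[int]:
    """Send the tuple to every task."""
    return list(task_ids)


def global_grouping(tup: Sequence, task_ids: Sequence[int]) -> List[int]:
    """Send the tuple to the task with the lowest id."""
    return [min(task_ids)]
```

The fields grouping is the one that matters for stateful bolts such as counters: because the same key always lands on the same task, each task can keep its own partial state without coordination.
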
  • 21. A Storm Cluster (diagram): Nimbus, three ZooKeeper nodes, five Supervisors.
  • 22. A Storm Cluster (diagram): Nimbus, three ZooKeeper nodes, five Supervisors. If this were Hadoop: Job Tracker, Task Tracker.
  • 23. A Storm Cluster (diagram): Nimbus, three ZooKeeper nodes, five Supervisors. But it’s NOT Hadoop! Co-ordinates everything.
  • 24. Salient Features . .
      • Storm > 0.7 supports transactional topologies
          - Processes tuples in small batches
          - If a failure occurs during commit, both the batch and the commit are retried
      • Storm guarantees message processing using acknowledgements
      • Petrel by AirSage is a Python wrapper for Storm; you can write and submit topologies in Python.
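
The idea behind guaranteed processing through acknowledgements can be sketched as a source that remembers every emitted message until it is acknowledged and re-queues it on failure. This is a simplified, single-process illustration; the class `ReplayingSpout` and its method names are hypothetical and are not the Storm or Petrel API.

```python
from collections import deque
from typing import Any, Dict, Iterable, Optional, Tuple


class ReplayingSpout:
    """Keeps each emitted message pending until it is acked; replays it on fail."""

    def __init__(self, messages: Iterable[Any]) -> None:
        self._queue = deque(enumerate(messages))   # (message id, payload) pairs
        self._pending: Dict[int, Any] = {}         # emitted but not yet acked

    def next_tuple(self) -> Optional[Tuple[int, Any]]:
        """Emit the next message, or None when nothing is waiting."""
        if not self._queue:
            return None
        msg_id, payload = self._queue.popleft()
        self._pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id: int) -> None:
        """Downstream processing succeeded: forget the message."""
        self._pending.pop(msg_id, None)

    def fail(self, msg_id: int) -> None:
        """Downstream processing failed: queue the message for replay."""
        if msg_id in self._pending:
            self._queue.append((msg_id, self._pending.pop(msg_id)))

    def pending_count(self) -> int:
        return len(self._pending)
```

With this scheme a message may be processed more than once, so downstream work should tolerate replays; transactional topologies, as described on the slide, address that by retrying the batch together with its commit.
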
  • 25. Part 2: Neo4J – “Get Graphed”
  • 26. This is how graph data was represented in RDBMS.
  • 27. Enter, NoSQL databases
  • 28. Types of NoSQL Databases (chart of data size against data complexity): graph databases, document databases, column-family, key-value stores.
  • 29. Why NoSQL Databases
      • Easily horizontally scalable
      • Dynamic schemas; handle unstructured data really well
      • Excel in speed and volume
      • Trade off consistency for efficiency (except in graph databases . . . we’ll see why :) )
      • Pleasure to code
      • Free to use any query language (even SQL!)
      • Downtime? What downtime?
  • 30. The Property Graph Model of Graph Databases
      • Core abstractions: nodes, relationships between nodes, properties of both
      • Traversal framework: high-performance queries on connected datasets
      • Bindings: REST, Gremlin, etc.
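
The core abstractions of the property graph model (nodes, relationships, and properties on both) can be sketched with a few plain Python classes. This is an in-memory illustration only, not Neo4J's storage model or the py2neo API; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    node_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: int                      # id of the start node
    end: int                        # id of the end node
    rel_type: str                   # e.g. "FRIEND"
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Nodes and typed, directed relationships; both carry properties."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: int, **properties: Any) -> Node:
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start: int, end: int, rel_type: str,
               **properties: Any) -> Relationship:
        rel = Relationship(start, end, rel_type, dict(properties))
        self.relationships.append(rel)
        return rel

    def neighbours(self, node_id: int, rel_type: str) -> List[int]:
        """One traversal step: follow outgoing relationships of one type."""
        return [r.end for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]
```

A usage example: `g.add_node(1, name="A")`, `g.add_node(2, name="B")`, then `g.relate(1, 2, "FRIEND", since=2013)` stores the relationship with its own property, and `g.neighbours(1, "FRIEND")` walks it.
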
  • 31. Neo4J
      • Fully ACID with rollback support (unbelievable!)
      • Schema-less, with efficient storage of semi-structured data
      • Fast deep traversals instead of slow SQL queries that span many table joins
      • Whiteboard friendly
      • Very natural to express graph-related problems with traversals (recommendation engines, shortest path, etc.)
  • 32. Neo4J Pythonized!
      • Py2Neo is an excellent binding for Neo4J
      • Accesses Neo4J using its RESTful API
      • Still under development . . Features like labels are yet to be included!
  • 33. So, will relational databases be extinct? OOPS!
  • 34. Categories of Graphical Data
      • Social networks
      • Citations
      • Product co-purchasing
      • Internet peer-to-peer
      • Road network and map data
      • Web graphs
    Excellent source of sample graphical data: http://snap.Stanford.edu/data/
  • 35. Part 3: Get your hands dirty!
  • 36. A demo . .
      • Sample social network data set
      • Data includes sign-up info, adding friends, unfriending, etc., for a month’s activity
      • Neo4J → store and update the social data
      • Storm → calculate the “friendship-index”
  • 37. A demo . .
      • “friendship-index”: n = the number of people through whom person “A” is connected to person “B”
      • Gives an idea of how close two people are!
      • Useful while searching for friends on social networks (something like the friends-of-friends concept in Facebook’s Graph Search)
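
One way to read the “friendship-index” defined above is as the number of intermediate people on the shortest friendship path between A and B. The sketch below computes that with a breadth-first search over an in-memory adjacency dict. It is an assumption about how the index could be computed, not the code from the talk: the function name, the dict-based graph, and the choices of 0 for direct friends (and for A equal to B) and `None` for unconnected people are all illustrative.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Optional


def friendship_index(friends: Dict[Hashable, Iterable[Hashable]],
                     a: Hashable, b: Hashable) -> Optional[int]:
    """Number of people on the shortest path strictly between a and b.

    Returns 0 for direct friends (or when a == b) and None when no path exists.
    """
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])   # (person, intermediaries needed to reach a friend of theirs)
    while queue:
        person, between = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == b:
                return between
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, between + 1))
    return None
```

For example, with `{"A": ["C"], "C": ["A", "B"], "B": ["C"]}`, A reaches B through C alone, so the index is 1. In the demo setting the neighbour lookup would be served by the graph database rather than a dict.
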
  • 38. The Topology . . (diagram): Update Spout, Update Bolt, Query Spout, Query Bolt, and two Sources.
  • 39. Update Spout
  • 40. Update Spout: define what kind of tuples are emitted.
  • 41. Update Spout: gets and emits tuple streams.
  • 42. Update Bolt
  • 43. Update Bolt: objects for database access and indexing service.
  • 44. Update Bolt
  • 45. Query Spout
  • 46. Query Spout: the tuple to be emitted can contain multiple entities.
  • 47. Query Bolt
  • 48. Query Bolt
  • 49. Query Bolt: retrieve caller friend and requested friend ids.
  • 50. Query Bolt: retrieve caller friend and requested friend ids as per database.
  • 51. Create Topology
  • 52. Create Topology: import all spout and bolt files.
  • 53. Create Topology: unfortunately, there was no option in Petrel to turn off console debug, so the console view is really messy.
  • 54. Topology.yaml: configurations for the topology are specified in this file.
  • 55. A little More . . (diagram): Update Spout, Update Bolt, Query Spout, Query Bolt, and two Sources.
  • 56. Final Thoughts
      • A Storm-Neo4j framework is a boon for real-time graph computations.
      • Quite flexible in Java; Python bindings and implementations still have a long way to go.
      • If you are an admin or developer, analyse your data and computing requirements before narrowing down on a framework.
  • 57. …to play with Storm and Neo4J
      • My PyCon talk repo (slides, code skeletons, etc.): http://www.sonalraj.com/neo-storm.html
      • Storm documentation (official): http://github.com/nathanmarz/storm
      • Storm book: http://www.amazon.com/Getting-Started-Storm-Jonathan-Leibiusky/dp/1449324010
      • Deployment of Storm on AWS: http://github.com/nathanmarz/storm-deploy
      • Neo4J documentation: http://www.neo4j.org
  • 58. Ex-terminated . . . That’s it. Thanks for listening! Questions?