SlideShare una empresa de Scribd logo
1 de 131
Descargar para leer sin conexión
THE PATH FORWARD 
TITAN 1.0 + TINKERPOP3 
MARKO A. RODRIGUEZ 
http://THINKAURELIUS.COM 
http://TINKERPOP.COM
PART 1: 
INTRODUCTION TO GRAPHS
VERTEX
software
software 
LABEL
name:gremlin 
software
PROPERTY 
KEY VALUE 
name:gremlin 
software
name:gremlin 
use:queries 
software
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
dependsOn
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
EDGE dependsOn
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
dependsOn 
since:0.1
GRAPH
COMPUTING 
PROCESS STRUCTURE
COMPUTING 
PROCESS STRUCTURE 
TRAVERSAL GRAPH
TRAVERSAL GRAPH 
PROCESS STRUCTURE 
COMPUTING GRAPH-BASED 
COMPUTING
WhY GRAPH-BASED COMPUTING?
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING 
EXPRESSIVE QUERYING
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING 
EXPRESSIVE QUERYING 
NUMEROUS ANALYSES 
Mixing Patterns 
Centrality 
Ranking 
Path Expressions 
Geodesics 
Inference 
Motifs 
Scoring
ANALYSES ARE THE 
EPIPHENOMENA OF TRAVERSAL 
f( )→
WHAT IS THE SIGNIFICANCE OF 
GRAPH ANALYSIS?
ANALYSES YIELD 
INSIGHTS ABOUT THE MODEL 
= 
DATA 
PRODUCTS 
DATA-DRIVEN 
DECISION SUPPORT
RECOMMENDATION 
People you may know. 
Products you might like. 
Movies you should watch and 
the friends you should watch them with. 
SOCIAL GRAPH 
RATINGS GRAPH 
SOCIAL+RATINGS 
GRAPH
WHO ELSE MIGHT HERCULES KNOW? 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules 
==>v[0]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows') 
==>v[1] 
==>v[2] 
==>v[3]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows').out('knows') 
==>v[4] 
==>v[5] 
==>v[5] 
==>v[6] 
==>v[5]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows').out('knows').groupCount() 
==>v[4]=1 
==>v[5]=3 
==>v[6]=1
HERCULES PROBABLY KNOWS NEPTUNE 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
knows
HERCULES PROBABLY KNOWS NEPTUNE 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
...PROBABLY MORE SO WHEN OTHER 
TYPES OF EDGES ARE ANALYZED
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
likes
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
likes likes 
dislikes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
likes likes 
dislikes 
SOCIAL GRAPH 
RATINGS GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
composedOf 
tartarus 
8 
likes likes 
dislikes 
NEMEAN MIGHT LIKE TARTARUS 
smellsOf 
SOCIAL GRAPH 
RATINGS GRAPH 
PRODUCT GRAPH 
* Collaborative Filtering + Content-Based Recommendation
PATH FINDING 
How is this person related to this film? 
Which authors of this book also 
wrote a New York Times bestseller? 
Which movies are based on a book by a 
New York Times bestseller? 
MOVIE GRAPH 
BOOK GRAPH 
MOVIE+BOOK 
GRAPH
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
gremlin> hercules.match('hercules', 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
g.of().as('actor').out('actor').as('star')) 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
g.of().as('actor').out('actor').as('star')) 
.select(['movie','star']) 
==>[movie:v[7], star:v[9]] 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie') 
g.of().as('movie').out('hasActor').as('actor') 
g.of().as('actor').out('role').as('hercules') 
g.of().as('actor').out('actor').as('star')) 
.select(['movie','star']){it.name} 
==>[movie:hercules in new york, star:arnold schwarzenegger] 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn 
thinksHeIs
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
BOOK GRAPH 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn 
thinksHeIs 
MOVIE GRAPH 
TRANSPORTATION GRAPH 
PROFILE 
GRAPH
SOCIAL INFLUENCE 
Who are the most influential people in 
java, mathematics, art, surreal art, politics, ...? 
Which region of the social graph will propagate this 
advertisement this furthest? 
Which 3 experts should review this submitted article? 
Which people should I talk to at the upcoming 
conference and what topics should 
I talk to them about? 
SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
PATTERN IDENTIFICATION 
This connectivity pattern is a sign of financial fraud. 
When this motif is found, a red flag will be raised. 
TRANSACTION GRAPH 
Healthy discourse is typified by a discussion board 
with a branch factor in this range and a concept 
clique score in this range. 
DISCUSSION GRAPH
KNOWLEDGE DISCOVERY 
The terms "ice", "fans", "stanley cup," 
are classified as "sports" 
Given that all identified birds fly, 
it can be deduced that all birds fly. 
If contrary evidence is provided, 
then this "fact" can be retracted. 
WIKIPEDIA GRAPH 
EVIDENTIAL LOGIC GRAPH
WORLD MODEL
WORLD PROCESSES 
WORLD MODEL
WORLD PROCESSES 
WORLD MODEL 
A single world model and various types of traversers 
moving through that model to solve problems.
TRAVERSAL GRAPH 
PROCESS STRUCTURE 
COMPUTING GRAPH-BASED 
COMPUTING
PART 2: 
THE PATH TO TITAN 1.0
WhY CREATE TITAN? 
A number of Aurelius' clients... 
...need to represent and process 
graphs at the 100+ billion edge 
scale w/ thousands of concurrent 
transactions. 
...need both local graph traversals 
(OLTP) and batch graph 
processing (OLAP). 
...desire a free, open source 
distributed graph database.
TITAN's KEY FEATURES 
Titan provides... 
..."infinite size" graphs and 
"unlimited" users by means of a 
distributed storage engine. 
...real-time local traversals (OLTP) 
and support for global batch 
processing via Hadoop (OLAP). 
...distribution via the liberal, free, 
open source Apache2 license.
-OR-BACKEND 
AGNOSTIC
TITAN DISTRIBUTED 
VIA CASSANDRA 
titan$ bin/gremlin.sh 
,,,/ 
(o o) 
-----oOOo-(_)-oOOo----- 
gremlin> conf = new BaseConfiguration(); 
==>org.apache.commons.configuration.BaseConfiguration@763861e6 
gremlin> conf.setProperty("storage.backend","cassandra"); 
gremlin> conf.setProperty("storage.hostname","77.77.77.77"); 
gremlin> g = TitanFactory.open(conf); 
==>titangraph[cassandra:77.77.77.77] 
gremlin>
INHERITED FEATURES 
Continuously available with no single point of failure. 
No write bottlenecks to the graph as there is no master/slave architecture. 
Built-in replication ensures data is available during machine failure. 
Caching layer ensures that continuously accessed data is available in memory. 
Elastic scalability allows for the introduction and removal of machines. 
Cassandra available at http://cassandra.apache.org/
TITAN DISTRIBUTED 
titan$ bin/gremlin.sh 
,,,/ 
(o o) 
VIA HBASE 
-----oOOo-(_)-oOOo----- 
gremlin> conf = new BaseConfiguration(); 
==>org.apache.commons.configuration.BaseConfiguration@763861e6 
gremlin> conf.setProperty("storage.backend","hbase"); 
gremlin> conf.setProperty("storage.hostname","77.77.77.77"); 
gremlin> g = TitanFactory.open(conf); 
==>titangraph[hbase:77.77.77.77] 
gremlin>
INHERITED FEATURES 
Strictly consistent reads and writes. 
Linear scalability with the addition of machines. 
Base classes for backing Hadoop MapReduce jobs with HBase tables. 
HDFS-based data replication. 
Generally good integration with the tools in the Hadoop ecosystem. 
HBase available at http://hbase.apache.org/
Titan is all about ...
Titan is all about numerous concurrent users...
Titan is all about numerous concurrent users... 
high availability....
Titan is all about numerous concurrent users... 
high availability.... 
dynamic scalability...
VERTEX-CENTRIC INDICES 
THE SUPER NODE PROBLEM 
Natural, real-world graphs contain 
vertices of high degree. 
Even if rare, their degree ensures that 
they exist on many paths. 
Traversing a high degree vertex 
means touching numerous incident 
edges and potentially touching most 
of the graph in only a few steps.
VERTEX-CENTRIC INDICES 
A SUPER NODE SOLUTION 
A "super node" only exists from the 
vantage point of classic "textbook 
style" graphs. 
In the world of property graphs, 
intelligent disk-level filtering can 
interpret a "super node" as a more 
manageable low-degree vertex. 
Vertex-centric querying utilizes sort 
orders for speedy lookup of incident 
edges with particular qualities.
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
vertex.bothE() 
likes likes 
stars:2 
stars:2 
knows knows 
likes likes 
knows 
likes 
stars:3 stars:3 
8 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes likes 
likes 
knows knows 
stars:3 stars:3 
likes likes 
stars:2 
stars:2 
vertex.outE() 
7 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes likes 
likes 
stars:3 stars:3 
likes likes 
stars:2 
stars:2 
vertex.outE("likes") 
5 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes 
1 edge 
vertex.outE("likes").has("stars",5)
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
stars:1 
stars:2 
stars:5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes 
knows 
stars:1 
stars:2 
stars:5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes w/ time 1-3 
knows 
stars:1 
stars:2 
stars:5 
likes w/ time 4-5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes w/ time 1-3 
knows 
schema = g.openManagementSystem() 
stars = schema 
.makePropertyKey("stars") 
.dataType(Integer.class) 
.make() 
likes = schema 
.makeEdgeLabel('likes') 
.make() 
schema.buildEdgeIndex(likes, 
'timeIdx',BOTH,DESC,stars) 
stars:1 
stars:2 
stars:5 
likes w/ time 4-5
THE TITAN OLAP STORY 
0 
1 
2 
3 
A 
B 
A 
C 
a:b 
c:d 
e:f 
g:h 
i:j 
4 
5 
6 
7 
vertex 
id props 
incoming edges outgoing edges 
edge 
id props label id 
edge 
id props label id 
edge 
id label id 
edge 
id label id 
1 e:f 4 c:d A 2 5 B 0 6 g:h A 3 7 C 3
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11 
AN ADJACENCY LIST
AN ADJACENCY LIST 
+ 
CLUSTER 
127.0.0.2 127.0.0.3 127.0.0.4 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11
A DISTRIBUTED ADJACENCY LIST 
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
127.0.0.2 127.0.0.3 127.0.0.4
HADOOP 
Hadoop is a distributed computing platform composed of two key components: 
HDFS: 
A distributed file system that stores arbitrarily large files within a cluster. 
MapReduce: 
A parallel functional computing model for key/value pair data. 
http://hadoop.apache.org
FAUNUS AND HADOOP 
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
Process 
Structure 
127.0.0.2 127.0.0.3 127.0.0.4 
Faunus provides graph input/output formats (structure) 
and a traversal language for graphs (process).
THE TITAN OLAP STORY 
Serial Key/Value Data Structure Indexed Key/Indexed Value Data Structure 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11
Adam Jacobs. 2009. The Pathologies of Big Data. Communications of the ACM 52, 8 (August 2009), 36-44. 
doi:10.1145/1536616.1536632 http://doi.acm.org/10.1145/1536616.1536632
INTRA-CLUSTER CONFIGURATION 
Titan/Cassandra Faunus/Hadoop 
Data is processed on the machine where it is located. 
Limited network communication.
INTER-CLUSTER CONFIGURATION 
Graph data is offloaded to another cluster. 
Repeated analysis does not interfere with production graph database.
THE TITAN OLAP STORY 
Titan currently provides a Hadoop OLAP engine 
Titan-Hadoop (nicknamed Faunus) 
Titan 1.0 will also provide an in-memory, off head, single-machine OLAP engine. 
Titan-Memory (nicknamed Fulgora)
PART 3: 
THE PATH TO TINKERPOP3
TINKERPOP3 FEATURES 
A standard, vendor-agnostic graph query language. 
Both OLTP and OLAP engines for evaluating queries. 
A way for developers to create custom, domain specific variants. 
A server system for pushing queries to the server for evaluation. 
A vertex-centric computing framework for distributed graph algorithms.
gremlin>
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
name:gremlin 
0 
software
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
name:titan name:gremlin 
0 
software 
2 
software
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') 
==>e[4][2-dependsOn->0] 
name:titan name:gremlin 
0 
software 
2 
software 
dependsOn 
since:0.1
Gremlin is a style of graph traversal that is 
heavily influenced by functional programming. 
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') 
==>e[4][2-dependsOn->0] 
////////////////////////////////////////////////////////////////// 
gremlin> g.V 
==>v[0] 
==>v[2] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').outE 
==>e[4][2-dependsOn->0] 
gremlin> g.V.has('name','titan').outE.since 
==>0.1 
gremlin> g.V.has('name','titan').out.name 
==>gremlin
g.V.has('name','titan').out.name 
What are the names of the vertices outgoing adjacent to Titan?
g.V.has('name','titan').out.name 
name=gremlin 
0 
2 
name=titan 
For the graph 'g'... 
Traversals are defined using fluent, function composition.
g.V.has('name','titan').out.name 
name=gremlin 
0 
2 
name=titan 
…get all the vertices... 
Objects flow through the traversal in a lazy evaluation manner.
g.V.has('name','titan').out.name 
2 
name=titan 
…filter all vertices that don't have the name 'titan'… 
The general functions are map(), flatMap(), filter(), sideEffect(), and branch().
g.V.has('name','titan').out.name 
name=gremlin 
0 
…get all outgoing adjacent vertices… 
Any programming language can provide a Gremlin implementation.
g.V.has('name','titan').out.name 
gremlin 
…get the names of those vertices... 
Any graph system vendor can provide a binding to Gremlin.
THE GENERIC STEPS 
? 
S filter S S map E S 
flatMap 
sideEffect branch 
A 
E 
E E 
E 
E 
E 
S S S 
S 
... 
S 
? 
? 
has() 
retain() 
except() 
interval() 
simplePath() 
dedup() 
timeLimit() 
... 
back() 
fold() 
valueMap() 
match() 
orderBy() 
shuffle() 
... 
out() 
outE() 
inV() 
both() 
values() 
unfold() 
... 
aggregate() 
store() 
groupCount() 
groupBy() 
tree() 
count() 
subgraph() 
... 
jump() 
until() 
choose() 
union() 
...
TINKERPOP'S OLAP STORY 
Execute parallel, distributed Gremlin traversals. 
Execute vertex-centric message passing algorithms.
TINKERPOP'S OLAP STORY 
Execute parallel, distributed Gremlin traversals. 
Execute vertex-centric message passing algorithms. 
traversers are the messages!
OLAP 
1 
1 
1 
1 
2 
2 
2 
3 
1 
2 
3 
OLTP 
✓ Real-time queries. 
✓ Touching a subset of the graph. 
✓ Numerous concurrent queries. 
✓ Good for local graph mutations. 
✓ Long running queries. 
✓ Touching the entire of the graph. 
✓ Single user. 
✓ Good for bulk loading/mutations.
g.compute() 
GraphComputer 
Vertex 
Program 
program 
mapReduce 
distributed 
to all workers 
MapReduce 
MapReduce 
MapReduce 
worker 1 
worker 2 
worker 3 
1. Logically distribute the vertex program to all the vertices in the graph. 
2. The vertices execute the vertex program sending and receiving messages on each iteration. 
3. Any number of MapReduce jobs can follow to aggregate the computed data in the graph.
gremlin> result = g.compute().program(PageRankVertexProgram).submit() 
==>result[tinkergraph[vertices:6 edges:6],memory[size:0]] 
gremlin> result.memory.iteration 
==>30 
gremlin> result.graph 
==>tinkergraph[vertices:6 edges:6] 
gremlin> result.graph.V.map{ [it.id,it.pageRank]} 
==>[1, 0.15000000000000002] 
==>[2, 0.19250000000000003] 
==>[3, 0.4018125] 
==>[4, 0.19250000000000003] 
==>[5, 0.23181250000000003] 
==>[6, 0.15000000000000002] 
* Minor tweaks to example to make it fit nicely on a single line.
gremlin> g.V.both.both.name.groupCount() 
==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] 
gremlin> g.V.both.both.name.groupCount().submit(g.compute()) 
==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] 
OLTP 
OLAP
MULTIPLE GRAPH COMPUTERS 
FOR THE SAME GRAPH 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.compute(FaunusGraphComputer) 
.program(PageRankVertexProgram) 
.submit() 
==>result[faunusgraph,memory[size:0]] 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.compute(FulgoraGraphComputer) 
.program(PageRankVertexProgram) 
.submit() 
==>result[fulgoragraph,memory[size:0]]
GREMLIN SERVER
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
localhost
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
localhost
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').out 
localhost 
1. get the titan vertex 
2. get titan's adjacents
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').out 
localhost 
1. get the titan vertex 
2. get titan's adjacents 
CHATTY 
Each step in the traversal is a round trip
GREMLIN SERVER
GREMLIN SERVER
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
localhost
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
gremlin> :> g.V.has('name','titan') 
==>v[2] 
localhost 
g.V.has('name','titan')
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
gremlin> :> g.V.has('name','titan') 
==>v[2] 
gremlin> :> g.V.has('name','titan').out 
localhost 
g.V.has('name','titan').out 
LESS CHATTY 
1. Gremlin Server client knows the hash function 
to query the machine with the Titan vertex 
2. Push the traversal to the server and 
get a WebSockets iterator back
GREMLIN SERVER 
Allows non-JVM languages to interact with a remote graph. 
Supports stored procedures -- MyAlgorithms.class 
Supports WebSockets, REST, and various serialization mechanisms.
CREDITS 
PRESENTED BY 
MARKO A. RODRIGUEZ 
MANY THANKS TO 
MATTHIAS BRöCHELER 
STEPHEN MALLETTE 
DAN LAROCQUE 
DANIEL KUPPITZ 
BOB BRIODY 
BRYN COOKE 
JOSH SHINAVIER 
PAVEL YASKEVICH 
AURELIUS COMMUNITY 
TINKERPOP COMMUNITY 
KETRINA YIM

Más contenido relacionado

Destacado

Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBMohamed Taher Alrefaie
 
Gremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming LanguageGremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming LanguageMarko Rodriguez
 
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks DataWorks Summit/Hadoop Summit
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with GraphsMarko Rodriguez
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph ComputingMarko Rodriguez
 
Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Matthias Broecheler
 
Humanitarian Logistics
Humanitarian LogisticsHumanitarian Logistics
Humanitarian Logisticsihab tarek
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Marko Rodriguez
 
Standard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch DeckStandard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch DeckZachary Townsend
 
Vertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And DataVertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And DataSpark Summit
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageMarko Rodriguez
 

Destacado (11)

Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DB
 
Gremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming LanguageGremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming Language
 
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with Graphs
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph Computing
 
Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3
 
Humanitarian Logistics
Humanitarian LogisticsHumanitarian Logistics
Humanitarian Logistics
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
 
Standard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch DeckStandard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch Deck
 
Vertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And DataVertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And Data
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and Language
 

Más de Marko Rodriguez

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic MachineMarko Rodriguez
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data TypeMarko Rodriguez
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryMarko Rodriguez
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialMarko Rodriguez
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics EngineMarko Rodriguez
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph DatabasesMarko Rodriguez
 
Traversing Graph Databases with Gremlin
Traversing Graph Databases with GremlinTraversing Graph Databases with Gremlin
Traversing Graph Databases with GremlinMarko Rodriguez
 
The Path-o-Logical Gremlin
The Path-o-Logical GremlinThe Path-o-Logical Gremlin
The Path-o-Logical GremlinMarko Rodriguez
 
The Gremlin in the Graph
The Gremlin in the GraphThe Gremlin in the Graph
The Gremlin in the GraphMarko Rodriguez
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMarko Rodriguez
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataMarko Rodriguez
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceMarko Rodriguez
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming PatternMarko Rodriguez
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in ComputingMarko Rodriguez
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
 
General-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked ProcessGeneral-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked ProcessMarko Rodriguez
 
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human EudaimoniaCollective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human EudaimoniaMarko Rodriguez
 
Distributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of DataDistributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of DataMarko Rodriguez
 
An Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and GraphAn Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and GraphMarko Rodriguez
 
Graph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge ManagementGraph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge ManagementMarko Rodriguez
 

Más de Marko Rodriguez (20)

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machine
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Type
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph Theory
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM Dial
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph Databases
 
Traversing Graph Databases with Gremlin
Traversing Graph Databases with GremlinTraversing Graph Databases with Gremlin
Traversing Graph Databases with Gremlin
 
The Path-o-Logical Gremlin
The Path-o-Logical GremlinThe Path-o-Logical Gremlin
The Path-o-Logical Gremlin
 
The Gremlin in the Graph
The Gremlin in the GraphThe Gremlin in the Graph
The Gremlin in the Graph
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to Redemption
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of Data
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network Science
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming Pattern
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in Computing
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 
General-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked ProcessGeneral-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked Process
 
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human EudaimoniaCollective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
 
Distributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of DataDistributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of Data
 
An Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and GraphAn Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and Graph
 
Graph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge ManagementGraph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge Management
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

The Path Forward

  • 1. THE PATH FORWARD TITAN 1.0 + TINKERPOP3 MARKO A. RODRIGUEZ http://THINKAURELIUS.COM http://TINKERPOP.COM
  • 7. PROPERTY KEY VALUE name:gremlin software
  • 9. name:gremlin use:queries software name:titan use:database software
  • 10. name:gremlin use:queries software name:titan use:database software dependsOn
  • 11. name:gremlin use:queries software name:titan use:database software EDGE dependsOn
  • 12. name:gremlin use:queries software name:titan use:database software dependsOn since:0.1
  • 13. GRAPH
  • 15. COMPUTING PROCESS STRUCTURE TRAVERSAL GRAPH
  • 16. TRAVERSAL GRAPH PROCESS STRUCTURE COMPUTING GRAPH-BASED COMPUTING
  • 18. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING
  • 19. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING
  • 20. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING NUMEROUS ANALYSES Mixing Patterns Centrality Ranking Path Expressions Geodesics Inference Motifs Scoring
  • 21. ANALYSES ARE THE EPIPHENOMENA OF TRAVERSAL f( )→
  • 22. WHAT IS THE SIGNIFICANCE OF GRAPH ANALYSIS?
  • 23. ANALYSES YIELD INSIGHTS ABOUT THE MODEL = DATA PRODUCTS DATA-DRIVEN DECISION SUPPORT
  • 24. RECOMMENDATION People you may know. Products you might like. Movies you should watch and the friends you should watch them with. SOCIAL GRAPH RATINGS GRAPH SOCIAL+RATINGS GRAPH
  • 25. WHO ELSE MIGHT HERCULES KNOW? hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows
  • 26. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules ==>v[0]
  • 27. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows') ==>v[1] ==>v[2] ==>v[3]
  • 28. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows').out('knows') ==>v[4] ==>v[5] ==>v[5] ==>v[6] ==>v[5]
  • 29. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows').out('knows').groupCount() ==>v[4]=1 ==>v[5]=3 ==>v[6]=1
  • 30. HERCULES PROBABLY KNOWS NEPTUNE hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows knows
  • 31. HERCULES PROBABLY KNOWS NEPTUNE hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother ...PROBABLY MORE SO WHEN OTHER TYPES OF EDGES ARE ANALYZED
  • 32. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother
  • 33. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother likes
  • 34. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother likes SOCIAL GRAPH
  • 35. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes SOCIAL GRAPH
  • 36. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes SOCIAL GRAPH
  • 37. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 SOCIAL GRAPH
  • 38. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 likes likes dislikes SOCIAL GRAPH
  • 39. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 likes likes dislikes SOCIAL GRAPH RATINGS GRAPH
  • 40. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes composedOf tartarus 8 likes likes dislikes NEMEAN MIGHT LIKE TARTARUS smellsOf SOCIAL GRAPH RATINGS GRAPH PRODUCT GRAPH * Collaborative Filtering + Content-Based Recommendation
  • 41. PATH FINDING How is this person related to this film? Which authors of this book also wrote a New York Times bestseller? Which movies are based on a book by a New York Times bestseller? MOVIE GRAPH BOOK GRAPH MOVIE+BOOK GRAPH
  • 42. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 43. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york gremlin> hercules.match('hercules', actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 44. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 45. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 46. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 47. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), g.of().as('actor').out('actor').as('star')) 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 48. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), g.of().as('actor').out('actor').as('star')) .select(['movie','star']) ==>[movie:v[7], star:v[9]] 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 49. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie') g.of().as('movie').out('hasActor').as('actor') g.of().as('actor').out('role').as('hercules') g.of().as('actor').out('actor').as('star')) .select(['movie','star']){it.name} ==>[movie:hercules in new york, star:arnold schwarzenegger] 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 50. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 51. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 52. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules depictedIn
  • 53. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy
  • 54. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn
  • 55. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe
  • 56. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn
  • 57. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn thinksHeIs
  • 58. albuquerque hercules 0 arnold schwarzenegger BOOK GRAPH hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn thinksHeIs MOVIE GRAPH TRANSPORTATION GRAPH PROFILE GRAPH
  • 59. SOCIAL INFLUENCE Who are the most influential people in java, mathematics, art, surreal art, politics, ...? Which region of the social graph will propagate this advertisement this furthest? Which 3 experts should review this submitted article? Which people should I talk to at the upcoming conference and what topics should I talk to them about? SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
  • 60. PATTERN IDENTIFICATION This connectivity pattern is a sign of financial fraud. When this motif is found, a red flag will be raised. TRANSACTION GRAPH Healthy discourse is typified by a discussion board with a branch factor in this range and a concept clique score in this range. DISCUSSION GRAPH
  • 61. KNOWLEDGE DISCOVERY The terms "ice", "fans", "stanley cup," are classified as "sports" Given that all identified birds fly, it can be deduced that all birds fly. If contrary evidence is provided, then this "fact" can be retracted. WIKIPEDIA GRAPH EVIDENTIAL LOGIC GRAPH
  • 64. WORLD PROCESSES WORLD MODEL A single world model and various types of traversers moving through that model to solve problems.
  • 65. TRAVERSAL GRAPH PROCESS STRUCTURE COMPUTING GRAPH-BASED COMPUTING
  • 66. PART 2: THE PATH TO TITAN 1.0
  • 67. WhY CREATE TITAN? A number of Aurelius' clients... ...need to represent and process graphs at the 100+ billion edge scale w/ thousands of concurrent transactions. ...need both local graph traversals (OLTP) and batch graph processing (OLAP). ...desire a free, open source distributed graph database.
  • 68. TITAN's KEY FEATURES Titan provides... ..."infinite size" graphs and "unlimited" users by means of a distributed storage engine. ...real-time local traversals (OLTP) and support for global batch processing via Hadoop (OLAP). ...distribution via the liberal, free, open source Apache2 license.
  • 70. TITAN DISTRIBUTED VIA CASSANDRA titan$ bin/gremlin.sh ,,,/ (o o) -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","cassandra"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[cassandra:77.77.77.77] gremlin>
  • 71. INHERITED FEATURES Continuously available with no single point of failure. No write bottlenecks to the graph as there is no master/slave architecture. Built-in replication ensures data is available during machine failure. Caching layer ensures that continuously accessed data is available in memory. Elastic scalability allows for the introduction and removal of machines. Cassandra available at http://cassandra.apache.org/
  • 72. TITAN DISTRIBUTED titan$ bin/gremlin.sh ,,,/ (o o) VIA HBASE -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","hbase"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[hbase:77.77.77.77] gremlin>
  • 73. INHERITED FEATURES Strictly consistent reads and writes. Linear scalability with the addition of machines. Base classes for backing Hadoop MapReduce jobs with HBase tables. HDFS-based data replication. Generally good integration with the tools in the Hadoop ecosystem. HBase available at http://hbase.apache.org/
  • 74. Titan is all about ...
  • 75. Titan is all about numerous concurrent users...
  • 76. Titan is all about numerous concurrent users... high availability....
  • 77. Titan is all about numerous concurrent users... high availability.... dynamic scalability...
  • 78. VERTEX-CENTRIC INDICES THE SUPER NODE PROBLEM Natural, real-world graphs contain vertices of high degree. Even if rare, their degree ensures that they exist on many paths. Traversing a high degree vertex means touching numerous incident edges and potentially touching most of the graph in only a few steps.
  • 79. VERTEX-CENTRIC INDICES A SUPER NODE SOLUTION A "super node" only exists from the vantage point of classic "textbook style" graphs. In the world of property graphs, intelligent disk-level filtering can interpret a "super node" as a more manageable low-degree vertex. Vertex-centric querying utilizes sort orders for speedy lookup of incident edges with particular qualities.
  • 80. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 vertex.bothE() likes likes stars:2 stars:2 knows knows likes likes knows likes stars:3 stars:3 8 edges
  • 81. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes likes likes knows knows stars:3 stars:3 likes likes stars:2 stars:2 vertex.outE() 7 edges
  • 82. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes likes likes stars:3 stars:3 likes likes stars:2 stars:2 vertex.outE("likes") 5 edges
  • 83. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes 1 edge vertex.outE("likes").has("stars",5)
  • 84. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows stars:1 stars:2 stars:5
  • 85. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes knows stars:1 stars:2 stars:5
  • 86. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes w/ time 1-3 knows stars:1 stars:2 stars:5 likes w/ time 4-5
  • 87. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes w/ time 1-3 knows schema = g.openManagementSystem() stars = schema .makePropertyKey("stars") .dataType(Integer.class) .make() likes = schema .makeEdgeLabel('likes') .make() schema.buildEdgeIndex(likes, 'timeIdx',BOTH,DESC,stars) stars:1 stars:2 stars:5 likes w/ time 4-5
  • 88. THE TITAN OLAP STORY 0 1 2 3 A B A C a:b c:d e:f g:h i:j 4 5 6 7 vertex id props incoming edges outgoing edges edge id props label id edge id props label id edge id label id edge id label id 1 e:f 4 c:d A 2 5 B 0 6 g:h A 3 7 C 3
  • 89. 0 1 3 4 5 6 7 8 9 10 11 AN ADJACENCY LIST
  • 90. AN ADJACENCY LIST + CLUSTER 127.0.0.2 127.0.0.3 127.0.0.4 0 1 3 4 5 6 7 8 9 10 11
  • 91. A DISTRIBUTED ADJACENCY LIST 0 1 2 3 4 5 6 7 8 9 10 11 127.0.0.2 127.0.0.3 127.0.0.4
  • 92. HADOOP Hadoop is a distributed computing platform composed of two key components: HDFS: A distributed file system that stores arbitrarily large files within a cluster. MapReduce: A parallel functional computing model for key/value pair data. http://hadoop.apache.org
  • 93. FAUNUS AND HADOOP 0 1 2 3 4 5 6 7 8 9 10 11 Process Structure 127.0.0.2 127.0.0.3 127.0.0.4 Faunus provides graph input/output formats (structure) and a traversal language for graphs (process).
  • 94. THE TITAN OLAP STORY Serial Key/Value Data Structure Indexed Key/Indexed Value Data Structure 0 1 3 4 5 6 7 8 9 10 11 0 1 3 4 5 6 7 8 9 10 11
  • 95. Adam Jacobs. 2009. The Pathologies of Big Data. Communications of the ACM 52, 8 (August 2009), 36-44. doi:10.1145/1536616.1536632 http://doi.acm.org/10.1145/1536616.1536632
  • 96. INTRA-CLUSTER CONFIGURATION Titan/Cassandra Faunus/Hadoop Data is processed on the machine where it is located. Limited network communication.
  • 97. INTER-CLUSTER CONFIGURATION Graph data is offloaded to another cluster. Repeated analysis does not interfere with production graph database.
  • 98. THE TITAN OLAP STORY Titan currently provides a Hadoop OLAP engine Titan-Hadoop (nicknamed Faunus) Titan 1.0 will also provide an in-memory, off head, single-machine OLAP engine. Titan-Memory (nicknamed Fulgora)
  • 99. PART 3: THE PATH TO TINKERPOP3
  • 100. TINKERPOP3 FEATURES A standard, vendor-agnostic graph query language. Both OLTP and OLAP engines for evaluating queries. A way for developers to create custom, domain specific variants. A server system for pushing queries to the server for evaluation. A vertex-centric computing framework for distributed graph algorithms.
  • 102. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] name:gremlin 0 software
  • 103. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] name:titan name:gremlin 0 software 2 software
  • 104. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') ==>e[4][2-dependsOn->0] name:titan name:gremlin 0 software 2 software dependsOn since:0.1
  • 105. Gremlin is a style of graph traversal that is heavily influenced by functional programming. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') ==>e[4][2-dependsOn->0] ////////////////////////////////////////////////////////////////// gremlin> g.V ==>v[0] ==>v[2] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').outE ==>e[4][2-dependsOn->0] gremlin> g.V.has('name','titan').outE.since ==>0.1 gremlin> g.V.has('name','titan').out.name ==>gremlin
  • 106. g.V.has('name','titan').out.name What are the names of the vertices outgoing adjacent to Titan?
  • 107. g.V.has('name','titan').out.name name=gremlin 0 2 name=titan For the graph 'g'... Traversals are defined using fluent, function composition.
  • 108. g.V.has('name','titan').out.name name=gremlin 0 2 name=titan …get all the vertices... Objects flow through the traversal in a lazy evaluation manner.
  • 109. g.V.has('name','titan').out.name 2 name=titan …filter all vertices that don't have the name 'titan'… The general functions are map(), flatMap(), filter(), sideEffect(), and branch().
  • 110. g.V.has('name','titan').out.name name=gremlin 0 …get all outgoing adjacent vertices… Any programming language can provide a Gremlin implementation.
  • 111. g.V.has('name','titan').out.name gremlin …get the names of those vertices... Any graph system vendor can provide a binding to Gremlin.
  • 112. THE GENERIC STEPS ? S filter S S map E S flatMap sideEffect branch A E E E E E E S S S S ... S ? ? has() retain() except() interval() simplePath() dedup() timeLimit() ... back() fold() valueMap() match() orderBy() shuffle() ... out() outE() inV() both() values() unfold() ... aggregate() store() groupCount() groupBy() tree() count() subgraph() ... jump() until() choose() union() ...
  • 113. TINKERPOP'S OLAP STORY Execute parallel, distributed Gremlin traversals. Execute vertex-centric message passing algorithms.
  • 114. TINKERPOP'S OLAP STORY Execute parallel, distributed Gremlin traversals. Execute vertex-centric message passing algorithms. traversers are the messages!
  • 115. OLAP 1 1 1 1 2 2 2 3 1 2 3 OLTP ✓ Real-time queries. ✓ Touching a subset of the graph. ✓ Numerous concurrent queries. ✓ Good for local graph mutations. ✓ Long running queries. ✓ Touching the entire of the graph. ✓ Single user. ✓ Good for bulk loading/mutations.
  • 116. g.compute() GraphComputer Vertex Program program mapReduce distributed to all workers MapReduce MapReduce MapReduce worker 1 worker 2 worker 3 1. Logically distribute the vertex program to all the vertices in the graph. 2. The vertices execute the vertex program sending and receiving messages on each iteration. 3. Any number of MapReduce jobs can follow to aggregate the computed data in the graph.
  • 117. gremlin> result = g.compute().program(PageRankVertexProgram).submit() ==>result[tinkergraph[vertices:6 edges:6],memory[size:0]] gremlin> result.memory.iteration ==>30 gremlin> result.graph ==>tinkergraph[vertices:6 edges:6] gremlin> result.graph.V.map{ [it.id,it.pageRank]} ==>[1, 0.15000000000000002] ==>[2, 0.19250000000000003] ==>[3, 0.4018125] ==>[4, 0.19250000000000003] ==>[5, 0.23181250000000003] ==>[6, 0.15000000000000002] * Minor tweaks to example to make it fit nicely on a single line.
  • 118. gremlin> g.V.both.both.name.groupCount() ==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] gremlin> g.V.both.both.name.groupCount().submit(g.compute()) ==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] OLTP OLAP
  • 119. MULTIPLE GRAPH COMPUTERS FOR THE SAME GRAPH gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.compute(FaunusGraphComputer) .program(PageRankVertexProgram) .submit() ==>result[faunusgraph,memory[size:0]] gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.compute(FulgoraGraphComputer) .program(PageRankVertexProgram) .submit() ==>result[fulgoragraph,memory[size:0]]
  • 121. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] localhost
  • 122. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] localhost
  • 123. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').out localhost 1. get the titan vertex 2. get titan's adjacents
  • 124. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').out localhost 1. get the titan vertex 2. get titan's adjacents CHATTY Each step in the traversal is a round trip
  • 127. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... localhost
  • 128. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... gremlin> :> g.V.has('name','titan') ==>v[2] localhost g.V.has('name','titan')
  • 129. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... gremlin> :> g.V.has('name','titan') ==>v[2] gremlin> :> g.V.has('name','titan').out localhost g.V.has('name','titan').out LESS CHATTY 1. Gremlin Server client knows the hash function to query the machine with the Titan vertex 2. Push the traversal to the server and get a WebSockets iterator back
  • 130. GREMLIN SERVER Allows non-JVM languages to interact with a remote graph. Supports stored procedures -- MyAlgorithms.class Supports WebSockets, REST, and various serialization mechanisms.
  • 131. CREDITS PRESENTED BY MARKO A. RODRIGUEZ MANY THANKS TO MATTHIAS BRöCHELER STEPHEN MALLETTE DAN LAROCQUE DANIEL KUPPITZ BOB BRIODY BRYN COOKE JOSH SHINAVIER PAVEL YASKEVICH AURELIUS COMMUNITY TINKERPOP COMMUNITY KETRINA YIM