SlideShare una empresa de Scribd logo
1 de 49
Descargar para leer sin conexión
The Gremlin in the Graph
                   Marko A. Rodriguez
                 Graph Systems Architect
               http://markorodriguez.com
               http://twitter.com/twarko
             http://slideshare.com/slidarko



First University-Industry Meeting on Graph Databases (UIM-GDB)
      Universitat Polit´cnica de Catalunya – February 8, 2011
                       e

                     February 6, 2011
Abstract
This tutorial/lecture addresses various aspects of the graph traversal
language Gremlin. In particular, the presentation focuses on Gremlin 0.7
and its application to graph analysis and processing.




                              Gremlin       G = (V, E)

                   http://gremlin.tinkerpop.com
TinkerPop Productions




1
    1
    TinkerPop (http://tinkerpop.com), Marko A. Rodriguez (http://markorodriguez.com), Peter
Neubauer (http://www.linkedin.com/in/neubauer), Joshua Shinavier (http://fortytwo.net/), Pavel
Yaskevich (https://github.com/xedin), Darrick Wiebe (http://ofallpossibleworlds.wordpress.
com/), Stephen Mallette (http://stephen.genoprime.com/), Alex Averbuch (http://se.linkedin.
com/in/alexaverbuch)
Gremlin is a Domain Specific Language

• A domain specific language for graph analysis and manipulation.

• Gremlin 0.7+ uses Groovy has its host language.2

• Groovy is a superset of Java and as such, natively exposes the full JDK
  to Gremlin.




 2
     Groovy available at http://groovy.codehaus.org/.
Gremlin is for Property Graphs
                                        name = "lop"
                                        lang = "java"

                       weight = 0.4              3
     name = "marko"
     age = 29            created
                                                                weight = 0.2
                   9
               1                                                created
                   8                     created
                                                                          12
               7       weight = 1.0
weight = 0.5
                                                weight = 0.4                    6
                        knows
          knows                          11                               name = "peter"
                                                                          age = 35
                                                name = "josh"
                                        4       age = 32
               2
                                        10
     name = "vadas"
     age = 27
                                             weight = 1.0

                                      created



                                        5

                                name = "ripple"
                                lang = "java"
Gremlin is for Blueprints-enabled Graph Databases

•   Blueprints can be seen as the JDBC for property graph databases.3
•   Provides a collection of interfaces for graph database providers to implement.
•   Provides tests to ensure the operational semantics of any implementation are correct.
•   Provides support for GraphML and other “helper” utilities.




                                 Blueprints

    3
        Blueprints is available at http://blueprints.tinkerpop.com.
A Blueprints Detour - Basic Example

Graph graph = new Neo4jGraph("/tmp/my_graph");
Vertex a = graph.addVertex(null);
Vertex b = graph.addVertex(null);
a.setProperty("name","marko");
b.setProperty("name","peter");
Edge e = graph.addEdge(null, a, b, "knows");
e.setProperty("since", 2007);
graph.shutdown();


               0            knows           1
                          since=2007
          name=marko                    name=peter
A Blueprints Detour - Sub-Interfaces

• TransactionalGraph extends Graph
       For transaction based graph databases.4

• IndexableGraph extends Graph
       For graph databases that support the indexing of properties.5




 4
     Graph Transactions https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions.
 5
     Graph Indices https://github.com/tinkerpop/blueprints/wiki/Graph-Indices.
A Blueprints Detour - Implementations



TinkerGraph
A Blueprints Detour - Implementations
• TinkerGraph implements IndexableGraph

• Neo4jGraph implements TransactionalGraph, IndexableGraph6

• OrientGraph implements TransactionalGraph, IndexableGraph7

• RexsterGraph implements IndexableGraph: Implementation of a
  Rexster REST API.8

• SailGraph implements TransactionalGraph: Turns any Sail-based
  RDF store into a Blueprints Graph.9
 6
   Neo4j is available at http://neo4j.org.
 7
   OrientDB is available at http://www.orientechnologies.com.
 8
   Rexster is available at http://rexster.tinkerpop.com.
 9
   Sail is available at http://openrdf.org.
A Blueprints Detour - Ouplementations




     JUNG
     Java Universal Network/Graph Framework
A Blueprints Detour - Ouplementations
                                                                          10
• GraphSail: Turns any IndexableGraph into an RDF store.

• GraphJung: Turns any Graph into a JUNG graph.11               12




10
   GraphSail https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation.
11
   JUNG is available at http://jung.sourceforge.net/.
12
   GraphJung https://github.com/tinkerpop/blueprints/wiki/JUNG-Ouplementation.
Gremlin Compiles Down to Pipes
• Pipes is a data flow framework for evaluating lazy graph traversals.13

• A Pipe extends Iterator, Iterable and can be chained together
  to create processing pipelines.

• A Pipe can be embedded into another Pipe to allow for nested
  processing.




                                                  Pipes
 13
      Pipes is available at http://pipes.tinkerpop.com.
A Pipes Detour - Chained Iterators

• This Pipeline takes objects of type A and turns them into objects of
  type D.


                                                                      D
                                                              D
       A
               A   A   Pipe1   B    Pipe2     C   Pipe3   D       D
   A                                                          D
           A
                                   Pipeline



Pipe<A,D> pipeline =
   new Pipeline<A,D>(Pipe1<A,B>, Pipe2<B,C>, Pipe3<C,D>)
A Pipes Detour - Simple Example

“What are the names of the people that marko knows?”

                                     B       name=peter
                        knows



                A       knows        C       name=pavel

          name=marko
                                   created
                        created
                                     D       name=gremlin
A Pipes Detour - Simple Example
Pipe<Vertex,Edge> pipe1 = new VertexEdgePipe(Step.OUT_EDGES);
Pipe<Edge,Edge> pipe2= new LabelFilterPipe("knows",Filter.NOT_EQUAL);
Pipe<Edge,Vertex> pipe3 = new EdgeVertexPipe(Step.IN_VERTEX);
Pipe<Vertex,String> pipe4 = new PropertyPipe<String>("name");
Pipe<Vertex,String> pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4);
pipeline.setStarts(new SingleIterator<Vertex>(graph.getVertex("A"));

                                                      EdgeVertexPipe(IN_VERTEX)


                          VertexEdgePipe(OUT_EDGES)
                                                                             PropertyPipe("name")

                                                                       B     name=peter
                                                    knows



                                     A              knows              C     name=pavel

                               name=marko
                                                                   created
                                                    created
                                                                       D     name=gremlin



                                            LabelFilterPipe("knows")




HINT: The benefit of Gremlin is that this Java verbosity is reduced to g.v(‘A’).outE[[label:‘knows’]].inV.name.
A Pipes Detour - Pipes Library


                       [ GRAPHS ]           [ SIDEEFFECTS ]
[ FILTERS ]            EdgeVertexPipe       AggregatorPipe
AndFilterPipe          IdFilterPipe         GroupCountPipe
CollectionFilterPipe   IdPipe               CountPipe
ComparisonFilterPipe   LabelFilterPipe      SideEffectCapPipe
DuplicateFilterPipe    LabelPipe
FutureFilterPipe       PropertyFilterPipe   [ UTILITIES ]
ObjectFilterPipe       PropertyPipe         GatherPipe
OrFilterPipe           VertexEdgePipe       PathPipe
RandomFilterPipe                            ScatterPipe
RangeFilterPipe                             Pipeline
                                            ...
A Pipes Detour - Creating Pipes


public class NumCharsPipe extends AbstractPipe<String,Integer> {
  public Integer processNextStart() {
    String word = this.starts.next();
    return word.length();
  }
}


When extending the base class AbstractPipe<S,E> all that is required is
an implementation of processNextStart().
Now onto Gremlin proper...
The Gremlin Architecture




   Neo4j   OrientDB   TinkerGraph
The Many Ways of Using Gremlin

• Gremlin has a REPL to be run from the shell.14

• Gremlin can be natively integrated into any Groovy class.

• Gremlin can be interacted with indirectly through Java, via Groovy.

• Gremlin has a JSR 223 ScriptEngine as well.




14
     All examples in this lecture/tutorial are via the command line REPL.
The Seamless Nature of Gremlin/Groovy/Java
Simply Gremlin.load() and add Gremlin to your Groovy class.
// a Groovy class
class GraphAlgorithms {
  static {
    Gremlin.load();
  }

    public static Map<Vertex, Integer> eigenvectorRank(Graph g) {
      Map<Vertex,Integer> m = [:]; int c = 0;
      g.V.outE.inV.groupCount(m).loop(3) {c++ < 1000} >> -1;
      return m;
    }
}


Writing software that mixes Groovy and Java is simple...dead simple.
// a Java class
public class GraphFramework {
  public static void main(String[] args) {
    System.out.println(GraphAlgorithms.eigenvectorRank(new Neo4jGraph("/tmp/graphdata"));
  }
}
Property Graph Model and Gremlin
                                   vertex 1 out edges                  vertex 3 in edges
                    edge 9 out vertex                   edge 9 label                  edge 9 in vertex
                                              edge 9 id


                              1                  9       created                           3

                                            8                                11
                                                 knows             created

                                                           4                      vertex 4 id
                           vertex 4 properties
                                                     name = "josh"
                                                     age = 32




•   outE: outgoing edges from a vertex.
•   inE: incoming edge to a vertex.
•   inV: incoming/head vertex of an edge.
•   outV: outgoing/tail vertex of an edge.15
15
     Documentation on atomic steps: https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps.
Loading a Toy Graph

                                        name = "lop"                                       marko$ ./gremlin.sh
                                        lang = "java"

                       weight = 0.4              3
                                                                                                    ,,,/
     name = "marko"
     age = 29            created
                                                                                                    (o o)
                                                                weight = 0.2
                   9                                                                       -----oOOo-(_)-oOOo-----
               1                                                created                    gremlin> g = TinkerGraphFactory.createTinkerGraph()
                   8                     created
                                                                          12               ==>tinkergraph[vertices:6 edges:6]
               7       weight = 1.0
weight = 0.5
                                                weight = 0.4                    6          gremlin>
                        knows
          knows                          11                               name = "peter"
                                                                          age = 35

                                        4
                                                name = "josh"
                                                age = 32
                                                                                           16
               2
                                        10
     name = "vadas"
     age = 27
                                             weight = 1.0

                                      created



                                        5
                                name = "ripple"
                                lang = "java"



   16
        Load up the toy 6 vertex/6 edge graph that comes hardcoded with Blueprints.
Basic Gremlin Traversals - Part 1
                                             name = "lop"                               gremlin> g.v(1)
                                             lang = "java"                              ==>v[1]
                            weight = 0.4
                                                                                        gremlin> g.v(1).map
       name = "marko"                                 3
       age = 29
                                                                                        ==>name=marko
                                created
                       9
                                                                                        ==>age=29
                1                                                    created            gremlin> g.v(1).outE
                       8                      created
                7
                                                                               12       ==>e[7][1-knows->2]
weight = 0.5                                                                        6   ==>e[9][1-created->3]
                               knows                                                    ==>e[8][1-knows->4]
               knows                          11
                       weight = 1.0                                                     gremlin>
                                                     name = "josh"
                                             4       age = 32
                2
                                             10                                         17
       name = "vadas"
       age = 27

                                           created



                                             5




  17
       Look at the neighborhood around marko.
Basic Gremlin Traversals - Part 2
                               name = "lop"
                                                                  gremlin> g.v(1)
                               lang = "java"                      ==>v[1]
                                                                  gremlin> g.v(1).outE[[label:‘created’]].inV
name = "marko"                         3                          ==>v[3]
age = 29           created
                                                                  gremlin> g.v(1).outE[[label:‘created’]].inV.name
              9
                                                                  ==>lop
                                               created
       1                                                          gremlin>
              8                 created
                                                         12
       7
                                                              6   18
                  knows
      knows                     11


                               4
       2
                               10

                             created



                               5




 18
      What has marko created?
Basic Gremlin Traversals - Part 3
                              name = "lop"
                                                                         gremlin>   g.v(1)
                              lang = "java"                              ==>v[1]
                                                                         gremlin>   g.v(1).outE[[label:‘knows’]].inV
name = "marko"                         3                                 ==>v[2]
age = 29          created                                                ==>v[4]
            9                                                            gremlin>   g.v(1).outE[[label:‘knows’]].inV.name
       1                                              created

            8                  created                                   ==>vadas
                                                                12
       7                                                                 ==>josh
                                                                     6
                                                                         gremlin>   g.v(1).outE[[label:‘knows’]].inV{it.age < 30}.name
                 knows
    knows                      11                                        ==>vadas
                                                                         gremlin>
                                      name = "josh"
                              4       age = 32
       2
                              10                                         19
name = "vadas"
age = 27

                            created



                              5




  19
       Who does marko know? Who does marko know that is under 30 years of age?
Basic Gremlin Traversals - Part 4
                codeveloper                                        gremlin> g.v(1)
                                                                   ==>v[1]
                                                                   gremlin> g.v(1).outE[[label:‘created’]].inV
                                      lop    codeveloper           ==>v[3]
               created                                             gremlin> g.v(1).outE[[label:‘created’]].inV
                     codeveloper               created                       .inE[[label:‘created’]].outV
     marko
                                                                   ==>v[1]
                                                           peter   ==>v[4]
                                   created
                                                                   ==>v[6]
     knows   knows                                                 gremlin> g.v(1).outE[[label:‘created’]].inV
                                                                             .inE[[label:‘created’]].outV.except([g.v(1)])
                          josh                                     ==>v[4]
     vadas
                                                                   ==>v[6]
                                                                   gremlin> g.v(1).outE[[label:‘created’]].inV
                         created                                             .inE[[label:‘created’]].outV.except([g.v(1)]).name
                                                                   ==>josh
                         ripple
                                                                   ==>peter
                                                                   gremlin>



20
  20
    Who are marko’s codevelopers? That is, people who have created the same software as him, but are
not him (marko can’t be a codeveloper of himeself).
Basic Gremlin Traversals - Part 5
                                                                  gremlin> Gremlin.addStep(‘codeveloper’)
                                                                  ==>null
                                      lop                         gremlin> c = {_{x = it}.outE[[label:‘created’]].inV
                                            codeveloper
                                                                            .inE[[label:‘created’]].outV{!x.equals(it)}}
               created
                                                                  ==>groovysh_evaluate$_run_closure1@69e94001
                                              created
     marko                                                        gremlin> [Vertex, Pipe].each{ it.metaClass.codeveloper =
                                  created                                    { Gremlin.compose(delegate, c())}}
                codeveloper                               peter
                                                                  ==>interface com.tinkerpop.blueprints.pgm.Vertex
     knows   knows                                                ==>interface com.tinkerpop.pipes.Pipe
                                                                  gremlin> g.v(1).codeveloper
                                                                  ==>v[4]
                          josh
     vadas                                                        ==>v[6]
                                                                  gremlin> g.v(1).codeveloper.codeveloper
                         created                                  ==>v[1]
                                                                  ==>v[6]
                                                                  ==>v[1]
                         ripple                                   ==>v[4]




21
  21
    I don’t want to talk in terms of outE, inV, etc. Given my domain model, I want to talk in terms of
higher-order, abstract adjacencies — e.g. codeveloper.
Lets explore more complex traversals...
Grateful Dead Concert Graph




The Grateful Dead were an American band that was born out of the San Francisco,
California psychedelic movement of the 1960s. The band played music together
from 1965 to 1995 and is well known for concert performances containing extended
improvisations and long and unique set lists. [http://arxiv.org/abs/0807.2466]
Grateful Dead Concert Graph Schema
• vertices
     song vertices
     ∗ type: always song for song vertices.
     ∗ name: the name of the song.
     ∗ performances: the number of times the song was played in concert.
     ∗ song type: whether the song is a cover song or an original.
     artist vertices
     ∗ type: always artist for artist vertices.
     ∗ name: the name of the artist.
• edges
     followed by (song → song): if the tail song was followed by the head song in
     concert.
     ∗ weight: the number of times these two songs were paired in concert.
     sung by (song → artist): if the tail song was primarily sung by the head artist.
     written by (song → artist): if the tail song was written by the head artist.
A Subset of the Grateful Dead Concert Graph
        type="artist"
                                                                     type="artist"
        name="Hunter"
                                                                     name="Garcia"
                                      type="song"
                                      name="Scarlet.."
               7                                                             5
                         written_by          1           sung_by

                   weight=239

                     followed_by      type="song"
                                      name="Fire on.."             sung_by          sung_by
            written_by
                                             2
                                                                    type="artist"
                     weight=1                                       name="Lesh"
                                      type="song"
                     followed_by      name="Pass.."                          6

        written_by                           3            sung_by



                      followed_by
                                      type="song"
                     weight=2         name="Terrap.."


                                             4
Centrality in a Property Graph
                                                                                    ...
                                              Garcia                           sung_by
                                                                             sung_by
                                                                           sung_by
                                                                         sung_by
                                                                       sung_by
                                                                     sung_by
                                       sung_by     sung_by         sung_by
                             sung_by         sung_by         sung_by



                      Dark      Terrapin      China       Stella         Peggy
                      Star      Station        Doll       Blue             O
                                                                                    ...

                       followed_by                 followed_by
                                     followed_by                 followed_by
                             followed_by                followed_by




• What is the most central vertex in this graph? – Garcia.
• What is the most central song? What do you mean “central?”
• How do you calculate centrality when there are numerous ways in which vertices can
  be related?
Loading the GraphML Representation
gremlin> g = new TinkerGraph()
==>tinkergraph[vertices:0 edges:0]
gremlin> GraphMLReader.inputGraph(g,
          new FileInputStream(‘data/graph-example-2.xml’))
==>null
gremlin> g.V.name
==>OTHER ONE JAM
==>Hunter
==>HIDEAWAY
==>A MIND TO GIVE UP LIVIN
==>TANGLED UP IN BLUE
...

22

 22
    Load the GraphML (http://graphml.graphdrawing.org/) representation of the Grateful Dead
graph, iterate through its vertices and get the name property of each vertex.
Using Graph Indices
gremlin> v = g.idx(T.v)[[name:‘DARK STAR’]] >> 1
==>v[89]
gremlin> v.outE[[label:‘sung_by’]].inV.name
==>Garcia
23




  23
   Use the default vertex index (T.v) and find all vertices index by the key/value pair name/DARK STAR.
Who sung Dark Star?
Emitting Collections
gremlin> v.outE[[label:‘followed_by’]].inV.name
==>EYES OF THE WORLD
==>TRUCKING
==>SING ME BACK HOME
==>MORNING DEW
...
gremlin> v.outE[[label:‘followed_by’]].emit{[it.inV.name >> 1, it.weight]}
==>[EYES OF THE WORLD, 9]
==>[TRUCKING, 1]
==>[SING ME BACK HOME, 1]
==>[MORNING DEW, 11]
...

24

  24
   What followed Dark Star in concert?
How many times did each song follow Dark Start in concert?
Eigenvector Centrality – Indexing Vertices
gremlin> m = [:]; c = 0
==>0
gremlin> g.V.outE[[label:‘followed_by’]].inV.groupCount(m)
           .loop(4){c++ < 1000}.filter{false}
gremlin> m.sort{a,b -> b.value <=> a.value}
==>v[13]=762
==>v[21]=668
==>v[50]=661
==>v[153]=630
==>v[96]=609
==>v[26]=586
...

25
  25
    Emanate from each vertex the followed by path. Index each vertex by how many times its been
traversed over in the map m. Do this for 1000 times. The final filter{false} will ensure nothing is
outputted.
Eigenvector Centrality – Indexing Names
gremlin> m = [:]; c = 0
==>0
gremlin> g.V.outE[[label:‘followed_by’]].inV.name.groupCount(m)
          .back(2).loop(4){c++ < 1000}.filter{false}
gremlin> m.sort{a,b -> b.value <=> a.value}
==>PLAYING IN THE BAND=762
==>TRUCKING=668
==>JACK STRAW=661
==>SUGAR MAGNOLIA=630
==>DRUMS=609
==>PROMISED LAND=586
...

26

  26
    Do the same, index the names of the songs in the map m. However, since you can get the outgoing
edges of a string, jump back 2 steps, then loop.
Paths in a Graph
gremlin> g.v(89).outE.inV.name
==>EYES OF THE WORLD
==>TRUCKING
==>SING ME BACK HOME
==>MORNING DEW
==>HES GONE
...
gremlin> g.v(89).outE.inV.name.paths
==>[v[89], e[7021][89-followed_by->83], v[83], EYES OF THE WORLD]
==>[v[89], e[7022][89-followed_by->21], v[21], TRUCKING]
==>[v[89], e[7023][89-followed_by->206], v[206], SING ME BACK HOME]
==>[v[89], e[7006][89-followed_by->127], v[127], MORNING DEW]
==>[v[89], e[7024][89-followed_by->49], v[49], HES GONE]
...

27
 27
      What are the paths that are out.inV.name emanating from Dark Star (v[89])?
Paths of Length < 4 Between Dark Star and Drums
gremlin> g.v(89).outE.inV.loop(2){it.loops < 4 & !(it.object.equals(g.v(96)))}[[id:‘96’]].paths
==>[v[89], e[7014][89-followed_by->96], v[96]]
==>[v[89], e[7021][89-followed_by->83], v[83], e[1418][83-followed_by->96], v[96]]
==>[v[89], e[7022][89-followed_by->21], v[21], e[6320][21-followed_by->96], v[96]]
==>[v[89], e[7006][89-followed_by->127], v[127], e[6599][127-followed_by->96], v[96]]
==>[v[89], e[7024][89-followed_by->49], v[49], e[6277][49-followed_by->96], v[96]]
==>[v[89], e[7025][89-followed_by->129], v[129], e[5751][129-followed_by->96], v[96]]
==>[v[89], e[7026][89-followed_by->149], v[149], e[2017][149-followed_by->96], v[96]]
==>[v[89], e[7027][89-followed_by->148], v[148], e[1937][148-followed_by->96], v[96]]
==>[v[89], e[7028][89-followed_by->130], v[130], e[1378][130-followed_by->96], v[96]]
==>[v[89], e[7019][89-followed_by->39], v[39], e[6804][39-followed_by->96], v[96]]
==>[v[89], e[7034][89-followed_by->91], v[91], e[925][91-followed_by->96], v[96]]
==>[v[89], e[7035][89-followed_by->70], v[70], e[181][70-followed_by->96], v[96]]
==>[v[89], e[7017][89-followed_by->57], v[57], e[2551][57-followed_by->96], v[96]]
...


28




  28
    Get the adjacent vertices to Dark Star (v[89]). Loop on outE.inV while the amount of loops is less
than 4 and the current path hasn’t reached Drums (v[96]). Return the paths traversed.
Shortest Path Between Dark Star and Drums
gremlin> [g.v(89).outE.inV
           .loop(2){!(it.object.equals(g.v(96)))}[[id:‘96’]].paths >> 1]
==>[v[89], e[7014][89-followed_by->96], v[96]]
gremlin> (g.v(89).outE.inV.loop(2)
          {!(it.object.equals(g.v(96)))}[[id:‘96’]].paths >> 1).size() / 3
==>1

29




  29
     Given the nature of the loop step, the paths are emitted in order of length. Thus, just “pop off” the
first path (>> 1) and that is the shortest path. You can then gets its length if you want to know the length
of the shortest path. Be sure to divide by 3 as the path has the start vertex, edge, and end vertex.
Integrating the JDK (Java API)
• Groovy is the host language for Gremlin. Thus, the JDK is natively
  available in a path expression.

gremlin> v.outE[[label:‘followed_by’]].inV{it.name.matches(‘D.*’)}.name
==>DEAL
==>DRUMS


• Examples below are not with respect to the Grateful Dead schema
  presented previously. They help to further articulate the point at hand.

gremlin> x.outE.inV.emit{JSONParser.get(it.uri).reviews}.mean()
...
gremlin> y.outE[[label:‘rated’]].filter{TextAnalysis.isElated(it.review)}.inV
...
The Concept of Abstract Adjacency – Part 1
• Loop on outE.inV.

  g.v(89).outE.inV.loop(2){it.loops < 4}


• Loop on followed by.

  g.v(89).outE[[label:‘followed_by’]].inV.loop(3){it.loops < 4}


• Loop on cofollowed by.

  g.v(89).outE[[label:‘followed_by’]].inV
      .inE[[label:‘followed_by’]].outV.loop(6){it.loops < 4}
The Concept of Abstract Adjacency – Part 2
                              outE


                                         in                                 outE
                                              V


                                     loop(2){..}                   back(3)
                                               {i
                                                    t.
                                                         sa
                                                              la




                                      }                            ry
                                                                        }


                       friend_of_a_friend_who_earns_less_than_friend_at_work




• outE, inV, etc. is low-level graph speak (the domain is the graph).
• codeveloper is high-level domain speak (the domain is software development).30

  30
     In this way, Gremlin can be seen as a DSL (domain-specific language) for creating DSL’s for your
graph applications. Gremlin’s domain is “the graph.” Build languages for your domain on top of Gremlin
(e.g. “software development domain”).
Developer's FOAF Import
               Graph

                                     Friend-Of-A-Friend
                                           Graph




Developer Imports                                         Friends at Work
     Graph                                                     Graph



                                                                            You need not make
                       Software Imports   Friendship                        derived graphs explicit.
                            Graph           Graph                           You can, at runtime,
                                                                            compute them.
                                                                            Moreover, generate
                                                                            them locally, not
   Developer Created                                                        globally (e.g. ``Marko's
                                                          Employment
        Graph                                                               friends from work
                                                            Graph
                                                                            relations").

                                                                            This concept is related
                                                                            to automated reasoning
                                                                            and whether reasoned
                                                                            relations are inserted
                                                                            into the explicit graph or
                              Explicit Graph                                computed at query time.
The Concept of Abstract Adjacency – Part 3

• Gremlin provides fine grained (Turing-complete) control over how your
  traverser moves through a property graph.

• As such, there are as many shortest paths, eigenvectors/PageRanks,
  clusters, cliques, etc. as there are “abstract adjacencies” in the
  graph.31 32
        This is the power of the property graph over standard unlabeled graphs.
        There are many ways to slice and dice your data.


  31
      Rodriguez M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network
Analysis Algorithms, Journal of Informetrics, 4(1), pp. 29–41, 2009. [http://arxiv.org/abs/0806.2274]
   32
      Other terms for abstract adjacencies include domain-specific relations, inferred relations, implicit edges,
virtual edges, abstract paths, derived adjacencies, higher-order relations, etc.
In the End, Gremlin is Just Pipes
gremlin> (g.v(89).outE[[label:‘followed_by’]].inV.name).toString()
==>[VertexEdgePipe<OUT_EDGES>, LabelFilterPipe<NOT_EQUAL,followed_by>,
    EdgeVertexPipe<IN_VERTEX>, PropertyPipe<name>]




33


 33
    Gremlin is a domain specific language for constructing lazy evaluation, data flow Pipes (http:
//pipes.tinkerpop.com). In the end, what is evaluated is a pipeline process in pure Java over a
Blueprints-enabled property graph (http://blueprints.tinkerpop.com).
Credits
• Marko A. Rodriguez: designed, developed, and documented Gremlin.
• Pavel Yaskevich: heavy development on earlier versions of Gremlin.
• Darrick Wiebe: development on Pipes, Blueprints, and inspired the move to using
  Groovy as the host language of Gremlin.
• Peter Neubauer: promoter and evangelist of Gremlin.
• Ketrina Yim: designed the Gremlin logo.




                                  Gremlin     G = (V, E)


                  http://gremlin.tinkerpop.com
          http://groups.google.com/group/gremlin-users
                      http://tinkerpop.com

Más contenido relacionado

La actualidad más candente

From SQL to SPARQL
From SQL to SPARQLFrom SQL to SPARQL
From SQL to SPARQLGeorge Roth
 
F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformHoward Mansell
 
Groovy On Trading Desk (2010)
Groovy On Trading Desk (2010)Groovy On Trading Desk (2010)
Groovy On Trading Desk (2010)Jonathan Felch
 
Polyglot persistence for Java developers - moving out of the relational comfo...
Polyglot persistence for Java developers - moving out of the relational comfo...Polyglot persistence for Java developers - moving out of the relational comfo...
Polyglot persistence for Java developers - moving out of the relational comfo...Chris Richardson
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
Jena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for JavaJena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for JavaAleksander Pohl
 
Apache Con Us2007 Jcr In Action
Apache Con Us2007 Jcr In ActionApache Con Us2007 Jcr In Action
Apache Con Us2007 Jcr In Actionday
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)andyseaborne
 
An Introduction to the Jena API
An Introduction to the Jena APIAn Introduction to the Jena API
An Introduction to the Jena APICraig Trim
 
Querying the Web of Data
Querying the Web of DataQuerying the Web of Data
Querying the Web of DataRinke Hoekstra
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshellFabien Gandon
 
DevNation'15 - Using Lambda Expressions to Query a Datastore
DevNation'15 - Using Lambda Expressions to Query a DatastoreDevNation'15 - Using Lambda Expressions to Query a Datastore
DevNation'15 - Using Lambda Expressions to Query a DatastoreXavier Coulon
 
Strabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database SystemStrabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database SystemKostis Kyzirakos
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolLeigh Dodds
 
SPARQL - Basic and Federated Queries
SPARQL - Basic and Federated QueriesSPARQL - Basic and Federated Queries
SPARQL - Basic and Federated QueriesKnud Möller
 
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...Guillaume Laforge
 

La actualidad más candente (20)

Jena Programming
Jena ProgrammingJena Programming
Jena Programming
 
From SQL to SPARQL
From SQL to SPARQLFrom SQL to SPARQL
From SQL to SPARQL
 
F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical Platform
 
Groovy On Trading Desk (2010)
Groovy On Trading Desk (2010)Groovy On Trading Desk (2010)
Groovy On Trading Desk (2010)
 
Polyglot persistence for Java developers - moving out of the relational comfo...
Polyglot persistence for Java developers - moving out of the relational comfo...Polyglot persistence for Java developers - moving out of the relational comfo...
Polyglot persistence for Java developers - moving out of the relational comfo...
 
Groovy Finance
Groovy FinanceGroovy Finance
Groovy Finance
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
Jena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for JavaJena – A Semantic Web Framework for Java
Jena – A Semantic Web Framework for Java
 
Apache Con Us2007 Jcr In Action
Apache Con Us2007 Jcr In ActionApache Con Us2007 Jcr In Action
Apache Con Us2007 Jcr In Action
 
SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)SPARQL 1.1 Update (2013-03-05)
SPARQL 1.1 Update (2013-03-05)
 
SPARQL Tutorial
SPARQL TutorialSPARQL Tutorial
SPARQL Tutorial
 
An Introduction to the Jena API
An Introduction to the Jena APIAn Introduction to the Jena API
An Introduction to the Jena API
 
Querying the Web of Data
Querying the Web of DataQuerying the Web of Data
Querying the Web of Data
 
SPARQL in a nutshell
SPARQL in a nutshellSPARQL in a nutshell
SPARQL in a nutshell
 
DevNation'15 - Using Lambda Expressions to Query a Datastore
DevNation'15 - Using Lambda Expressions to Query a DatastoreDevNation'15 - Using Lambda Expressions to Query a Datastore
DevNation'15 - Using Lambda Expressions to Query a Datastore
 
Practical Groovy DSL
Practical Groovy DSLPractical Groovy DSL
Practical Groovy DSL
 
Strabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database SystemStrabon: A Semantic Geospatial Database System
Strabon: A Semantic Geospatial Database System
 
Twinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query ToolTwinkle: A SPARQL Query Tool
Twinkle: A SPARQL Query Tool
 
SPARQL - Basic and Federated Queries
SPARQL - Basic and Federated QueriesSPARQL - Basic and Federated Queries
SPARQL - Basic and Federated Queries
 
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...
Groovy DSLs, from Beginner to Expert - Guillaume Laforge and Paul King - Spri...
 

Similar a The Gremlin in the Graph

Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoAchim Friedland
 
JSPP: Morphing C++ into Javascript
JSPP: Morphing C++ into JavascriptJSPP: Morphing C++ into Javascript
JSPP: Morphing C++ into JavascriptChristopher Chedeau
 
Atlassian Groovy Plugins
Atlassian Groovy PluginsAtlassian Groovy Plugins
Atlassian Groovy PluginsPaul King
 
1st UIM-GDB - Connections to the Real World
1st UIM-GDB - Connections to the Real World1st UIM-GDB - Connections to the Real World
1st UIM-GDB - Connections to the Real WorldAchim Friedland
 
Application Modeling with Graph Databases
Application Modeling with Graph DatabasesApplication Modeling with Graph Databases
Application Modeling with Graph DatabasesJosh Adell
 
Rapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and RailsRapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and Railselliando dias
 
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 Fall
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 FallJavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 Fall
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 FallYuichi Sakuraba
 
Lighting talk neo4j fosdem 2011
Lighting talk neo4j fosdem 2011Lighting talk neo4j fosdem 2011
Lighting talk neo4j fosdem 2011Jordi Valverde
 
Gradle build tool that rocks with DSL JavaOne India 4th May 2012
Gradle build tool that rocks with DSL JavaOne India 4th May 2012Gradle build tool that rocks with DSL JavaOne India 4th May 2012
Gradle build tool that rocks with DSL JavaOne India 4th May 2012Rajmahendra Hegde
 
Javascriptbootcamp
JavascriptbootcampJavascriptbootcamp
Javascriptbootcamposcon2007
 
Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Agora Group
 
Buildr In Action @devoxx france 2012
Buildr In Action @devoxx france 2012Buildr In Action @devoxx france 2012
Buildr In Action @devoxx france 2012alexismidon
 
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning Talks
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning TalksBuilding Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning Talks
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning TalksAtlassian
 
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code style
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code styleRuby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code style
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code styleAnton Shemerey
 
Scala and Hadoop @ eBay
Scala and Hadoop @ eBayScala and Hadoop @ eBay
Scala and Hadoop @ eBayebaynyc
 
In The Land Of Graphs...
In The Land Of Graphs...In The Land Of Graphs...
In The Land Of Graphs...Fernand Galiana
 
Groovy presentation
Groovy presentationGroovy presentation
Groovy presentationManav Prasad
 

Similar a The Gremlin in the Graph (20)

Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and MonoFosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
Fosdem 2011 - A Common Graph Database Access Layer for .Net and Mono
 
JSPP: Morphing C++ into Javascript
JSPP: Morphing C++ into JavascriptJSPP: Morphing C++ into Javascript
JSPP: Morphing C++ into Javascript
 
J
JJ
J
 
Atlassian Groovy Plugins
Atlassian Groovy PluginsAtlassian Groovy Plugins
Atlassian Groovy Plugins
 
1st UIM-GDB - Connections to the Real World
1st UIM-GDB - Connections to the Real World1st UIM-GDB - Connections to the Real World
1st UIM-GDB - Connections to the Real World
 
Application Modeling with Graph Databases
Application Modeling with Graph DatabasesApplication Modeling with Graph Databases
Application Modeling with Graph Databases
 
Rapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and RailsRapid Development with Ruby/JRuby and Rails
Rapid Development with Ruby/JRuby and Rails
 
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 Fall
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 FallJavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 Fall
JavaOne報告会 Java SE/JavaFX 編 - JJUG CCC 2010 Fall
 
Lighting talk neo4j fosdem 2011
Lighting talk neo4j fosdem 2011Lighting talk neo4j fosdem 2011
Lighting talk neo4j fosdem 2011
 
Gradle build tool that rocks with DSL JavaOne India 4th May 2012
Gradle build tool that rocks with DSL JavaOne India 4th May 2012Gradle build tool that rocks with DSL JavaOne India 4th May 2012
Gradle build tool that rocks with DSL JavaOne India 4th May 2012
 
Javascriptbootcamp
JavascriptbootcampJavascriptbootcamp
Javascriptbootcamp
 
Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011Terence Barr - jdk7+8 - 24mai2011
Terence Barr - jdk7+8 - 24mai2011
 
Buildr In Action @devoxx france 2012
Buildr In Action @devoxx france 2012Buildr In Action @devoxx france 2012
Buildr In Action @devoxx france 2012
 
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning Talks
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning TalksBuilding Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning Talks
Building Atlassian Plugins with Groovy - Atlassian Summit 2010 - Lightning Talks
 
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code style
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code styleRuby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code style
Ruby: OOP, metaprogramming, blocks, iterators, mix-ins, duck typing. Code style
 
Scala and Hadoop @ eBay
Scala and Hadoop @ eBayScala and Hadoop @ eBay
Scala and Hadoop @ eBay
 
In The Land Of Graphs...
In The Land Of Graphs...In The Land Of Graphs...
In The Land Of Graphs...
 
Dex Technical Seminar (April 2011)
Dex Technical Seminar (April 2011)Dex Technical Seminar (April 2011)
Dex Technical Seminar (April 2011)
 
Groovy presentation
Groovy presentationGroovy presentation
Groovy presentation
 
Groovy & Grails
Groovy & GrailsGroovy & Grails
Groovy & Grails
 

Más de Marko Rodriguez

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic MachineMarko Rodriguez
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data TypeMarko Rodriguez
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryMarko Rodriguez
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialMarko Rodriguez
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryMarko Rodriguez
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph ComputingMarko Rodriguez
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageMarko Rodriguez
 
The Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageThe Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageMarko Rodriguez
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics EngineMarko Rodriguez
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with GraphsMarko Rodriguez
 
Titan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataTitan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataMarko Rodriguez
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph DatabasesMarko Rodriguez
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMarko Rodriguez
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataMarko Rodriguez
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Marko Rodriguez
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceMarko Rodriguez
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming PatternMarko Rodriguez
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in ComputingMarko Rodriguez
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
 

Más de Marko Rodriguez (20)

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machine
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Type
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph Theory
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM Dial
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal Machinery
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph Computing
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and Language
 
The Gremlin Graph Traversal Language
The Gremlin Graph Traversal LanguageThe Gremlin Graph Traversal Language
The Gremlin Graph Traversal Language
 
The Path Forward
The Path ForwardThe Path Forward
The Path Forward
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with Graphs
 
Titan: The Rise of Big Graph Data
Titan: The Rise of Big Graph DataTitan: The Rise of Big Graph Data
Titan: The Rise of Big Graph Data
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph Databases
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to Redemption
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of Data
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network Science
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming Pattern
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in Computing
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 

Último

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Último (20)

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

The Gremlin in the Graph

  • 1. The Gremlin in the Graph Marko A. Rodriguez Graph Systems Architect http://markorodriguez.com http://twitter.com/twarko http://slideshare.com/slidarko First University-Industry Meeting on Graph Databases (UIM-GDB) Universitat Polit´cnica de Catalunya – February 8, 2011 e February 6, 2011
  • 2. Abstract This tutorial/lecture addresses various aspects of the graph traversal language Gremlin. In particular, the presentation focuses on Gremlin 0.7 and its application to graph analysis and processing. Gremlin G = (V, E) http://gremlin.tinkerpop.com
  • 3. TinkerPop Productions 1 1 TinkerPop (http://tinkerpop.com), Marko A. Rodriguez (http://markorodriguez.com), Peter Neubauer (http://www.linkedin.com/in/neubauer), Joshua Shinavier (http://fortytwo.net/), Pavel Yaskevich (https://github.com/xedin), Darrick Wiebe (http://ofallpossibleworlds.wordpress. com/), Stephen Mallette (http://stephen.genoprime.com/), Alex Averbuch (http://se.linkedin. com/in/alexaverbuch)
  • 4. Gremlin is a Domain Specific Language • A domain specific language for graph analysis and manipulation. • Gremlin 0.7+ uses Groovy has its host language.2 • Groovy is a superset of Java and as such, natively exposes the full JDK to Gremlin. 2 Groovy available at http://groovy.codehaus.org/.
  • 5. Gremlin is for Property Graphs name = "lop" lang = "java" weight = 0.4 3 name = "marko" age = 29 created weight = 0.2 9 1 created 8 created 12 7 weight = 1.0 weight = 0.5 weight = 0.4 6 knows knows 11 name = "peter" age = 35 name = "josh" 4 age = 32 2 10 name = "vadas" age = 27 weight = 1.0 created 5 name = "ripple" lang = "java"
  • 6. Gremlin is for Blueprints-enabled Graph Databases • Blueprints can be seen as the JDBC for property graph databases.3 • Provides a collection of interfaces for graph database providers to implement. • Provides tests to ensure the operational semantics of any implementation are correct. • Provides support for GraphML and other “helper” utilities. Blueprints 3 Blueprints is available at http://blueprints.tinkerpop.com.
  • 7. A Blueprints Detour - Basic Example Graph graph = new Neo4jGraph("/tmp/my_graph"); Vertex a = graph.addVertex(null); Vertex b = graph.addVertex(null); a.setProperty("name","marko"); b.setProperty("name","peter"); Edge e = graph.addEdge(null, a, b, "knows"); e.setProperty("since", 2007); graph.shutdown(); 0 knows 1 since=2007 name=marko name=peter
  • 8. A Blueprints Detour - Sub-Interfaces • TransactionalGraph extends Graph For transaction based graph databases.4 • IndexableGraph extends Graph For graph databases that support the indexing of properties.5 4 Graph Transactions https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions. 5 Graph Indices https://github.com/tinkerpop/blueprints/wiki/Graph-Indices.
  • 9. A Blueprints Detour - Implementations TinkerGraph
  • 10. A Blueprints Detour - Implementations • TinkerGraph implements IndexableGraph • Neo4jGraph implements TransactionalGraph, IndexableGraph6 • OrientGraph implements TransactionalGraph, IndexableGraph7 • RexsterGraph implements IndexableGraph: Implementation of a Rexster REST API.8 • SailGraph implements TransactionalGraph: Turns any Sail-based RDF store into a Blueprints Graph.9 6 Neo4j is available at http://neo4j.org. 7 OrientDB is available at http://www.orientechnologies.com. 8 Rexster is available at http://rexster.tinkerpop.com. 9 Sail is available at http://openrdf.org.
  • 11. A Blueprints Detour - Ouplementations JUNG Java Universal Network/Graph Framework
  • 12. A Blueprints Detour - Ouplementations 10 • GraphSail: Turns any IndexableGraph into an RDF store. • GraphJung: Turns any Graph into a JUNG graph.11 12 10 GraphSail https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation. 11 JUNG is available at http://jung.sourceforge.net/. 12 GraphJung https://github.com/tinkerpop/blueprints/wiki/JUNG-Ouplementation.
  • 13. Gremlin Compiles Down to Pipes • Pipes is a data flow framework for evaluating lazy graph traversals.13 • A Pipe extends Iterator, Iterable and can be chained together to create processing pipelines. • A Pipe can be embedded into another Pipe to allow for nested processing. Pipes 13 Pipes is available at http://pipes.tinkerpop.com.
  • 14. A Pipes Detour - Chained Iterators • This Pipeline takes objects of type A and turns them into objects of type D. D D A A A Pipe1 B Pipe2 C Pipe3 D D A D A Pipeline Pipe<A,D> pipeline = new Pipeline<A,D>(Pipe1<A,B>, Pipe2<B,C>, Pipe3<C,D>)
  • 15. A Pipes Detour - Simple Example “What are the names of the people that marko knows?” B name=peter knows A knows C name=pavel name=marko created created D name=gremlin
  • 16. A Pipes Detour - Simple Example Pipe<Vertex,Edge> pipe1 = new VertexEdgePipe(Step.OUT_EDGES); Pipe<Edge,Edge> pipe2= new LabelFilterPipe("knows",Filter.NOT_EQUAL); Pipe<Edge,Vertex> pipe3 = new EdgeVertexPipe(Step.IN_VERTEX); Pipe<Vertex,String> pipe4 = new PropertyPipe<String>("name"); Pipe<Vertex,String> pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4); pipeline.setStarts(new SingleIterator<Vertex>(graph.getVertex("A")); EdgeVertexPipe(IN_VERTEX) VertexEdgePipe(OUT_EDGES) PropertyPipe("name") B name=peter knows A knows C name=pavel name=marko created created D name=gremlin LabelFilterPipe("knows") HINT: The benefit of Gremlin is that this Java verbosity is reduced to g.v(‘A’).outE[[label:‘knows’]].inV.name.
  • 17. A Pipes Detour - Pipes Library [ GRAPHS ] [ SIDEEFFECTS ] [ FILTERS ] EdgeVertexPipe AggregatorPipe AndFilterPipe IdFilterPipe GroupCountPipe CollectionFilterPipe IdPipe CountPipe ComparisonFilterPipe LabelFilterPipe SideEffectCapPipe DuplicateFilterPipe LabelPipe FutureFilterPipe PropertyFilterPipe [ UTILITIES ] ObjectFilterPipe PropertyPipe GatherPipe OrFilterPipe VertexEdgePipe PathPipe RandomFilterPipe ScatterPipe RangeFilterPipe Pipeline ...
  • 18. A Pipes Detour - Creating Pipes public class NumCharsPipe extends AbstractPipe<String,Integer> { public Integer processNextStart() { String word = this.starts.next(); return word.length(); } } When extending the base class AbstractPipe<S,E> all that is required is an implementation of processNextStart().
  • 19. Now onto Gremlin proper...
  • 20. The Gremlin Architecture Neo4j OrientDB TinkerGraph
  • 21. The Many Ways of Using Gremlin • Gremlin has a REPL to be run from the shell.14 • Gremlin can be natively integrated into any Groovy class. • Gremlin can be interacted with indirectly through Java, via Groovy. • Gremlin has a JSR 223 ScriptEngine as well. 14 All examples in this lecture/tutorial are via the command line REPL.
  • 22. The Seamless Nature of Gremlin/Groovy/Java Simply Gremlin.load() and add Gremlin to your Groovy class. // a Groovy class class GraphAlgorithms { static { Gremlin.load(); } public static Map<Vertex, Integer> eigenvectorRank(Graph g) { Map<Vertex,Integer> m = [:]; int c = 0; g.V.outE.inV.groupCount(m).loop(3) {c++ < 1000} >> -1; return m; } } Writing software that mixes Groovy and Java is simple...dead simple. // a Java class public class GraphFramework { public static void main(String[] args) { System.out.println(GraphAlgorithms.eigenvectorRank(new Neo4jGraph("/tmp/graphdata")); } }
  • 23. Property Graph Model and Gremlin vertex 1 out edges vertex 3 in edges edge 9 out vertex edge 9 label edge 9 in vertex edge 9 id 1 9 created 3 8 11 knows created 4 vertex 4 id vertex 4 properties name = "josh" age = 32 • outE: outgoing edges from a vertex. • inE: incoming edge to a vertex. • inV: incoming/head vertex of an edge. • outV: outgoing/tail vertex of an edge.15 15 Documentation on atomic steps: https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps.
  • 24. Loading a Toy Graph name = "lop" marko$ ./gremlin.sh lang = "java" weight = 0.4 3 ,,,/ name = "marko" age = 29 created (o o) weight = 0.2 9 -----oOOo-(_)-oOOo----- 1 created gremlin> g = TinkerGraphFactory.createTinkerGraph() 8 created 12 ==>tinkergraph[vertices:6 edges:6] 7 weight = 1.0 weight = 0.5 weight = 0.4 6 gremlin> knows knows 11 name = "peter" age = 35 4 name = "josh" age = 32 16 2 10 name = "vadas" age = 27 weight = 1.0 created 5 name = "ripple" lang = "java" 16 Load up the toy 6 vertex/6 edge graph that comes hardcoded with Blueprints.
  • 25. Basic Gremlin Traversals - Part 1 name = "lop" gremlin> g.v(1) lang = "java" ==>v[1] weight = 0.4 gremlin> g.v(1).map name = "marko" 3 age = 29 ==>name=marko created 9 ==>age=29 1 created gremlin> g.v(1).outE 8 created 7 12 ==>e[7][1-knows->2] weight = 0.5 6 ==>e[9][1-created->3] knows ==>e[8][1-knows->4] knows 11 weight = 1.0 gremlin> name = "josh" 4 age = 32 2 10 17 name = "vadas" age = 27 created 5 17 Look at the neighborhood around marko.
  • 26. Basic Gremlin Traversals - Part 2 name = "lop" gremlin> g.v(1) lang = "java" ==>v[1] gremlin> g.v(1).outE[[label:‘created’]].inV name = "marko" 3 ==>v[3] age = 29 created gremlin> g.v(1).outE[[label:‘created’]].inV.name 9 ==>lop created 1 gremlin> 8 created 12 7 6 18 knows knows 11 4 2 10 created 5 18 What has marko created?
  • 27. Basic Gremlin Traversals - Part 3 name = "lop" gremlin> g.v(1) lang = "java" ==>v[1] gremlin> g.v(1).outE[[label:‘knows’]].inV name = "marko" 3 ==>v[2] age = 29 created ==>v[4] 9 gremlin> g.v(1).outE[[label:‘knows’]].inV.name 1 created 8 created ==>vadas 12 7 ==>josh 6 gremlin> g.v(1).outE[[label:‘knows’]].inV{it.age < 30}.name knows knows 11 ==>vadas gremlin> name = "josh" 4 age = 32 2 10 19 name = "vadas" age = 27 created 5 19 Who does marko know? Who does marko know that is under 30 years of age?
  • 28. Basic Gremlin Traversals - Part 4 codeveloper gremlin> g.v(1) ==>v[1] gremlin> g.v(1).outE[[label:‘created’]].inV lop codeveloper ==>v[3] created gremlin> g.v(1).outE[[label:‘created’]].inV codeveloper created .inE[[label:‘created’]].outV marko ==>v[1] peter ==>v[4] created ==>v[6] knows knows gremlin> g.v(1).outE[[label:‘created’]].inV .inE[[label:‘created’]].outV.except([g.v(1)]) josh ==>v[4] vadas ==>v[6] gremlin> g.v(1).outE[[label:‘created’]].inV created .inE[[label:‘created’]].outV.except([g.v(1)]).name ==>josh ripple ==>peter gremlin> 20 20 Who are marko’s codevelopers? That is, people who have created the same software as him, but are not him (marko can’t be a codeveloper of himeself).
  • 29. Basic Gremlin Traversals - Part 5 gremlin> Gremlin.addStep(‘codeveloper’) ==>null lop gremlin> c = {_{x = it}.outE[[label:‘created’]].inV codeveloper .inE[[label:‘created’]].outV{!x.equals(it)}} created ==>groovysh_evaluate$_run_closure1@69e94001 created marko gremlin> [Vertex, Pipe].each{ it.metaClass.codeveloper = created { Gremlin.compose(delegate, c())}} codeveloper peter ==>interface com.tinkerpop.blueprints.pgm.Vertex knows knows ==>interface com.tinkerpop.pipes.Pipe gremlin> g.v(1).codeveloper ==>v[4] josh vadas ==>v[6] gremlin> g.v(1).codeveloper.codeveloper created ==>v[1] ==>v[6] ==>v[1] ripple ==>v[4] 21 21 I don’t want to talk in terms of outE, inV, etc. Given my domain model, I want to talk in terms of higher-order, abstract adjacencies — e.g. codeveloper.
  • 30. Lets explore more complex traversals...
  • 31. Grateful Dead Concert Graph The Grateful Dead were an American band that was born out of the San Francisco, California psychedelic movement of the 1960s. The band played music together from 1965 to 1995 and is well known for concert performances containing extended improvisations and long and unique set lists. [http://arxiv.org/abs/0807.2466]
  • 32. Grateful Dead Concert Graph Schema • vertices song vertices ∗ type: always song for song vertices. ∗ name: the name of the song. ∗ performances: the number of times the song was played in concert. ∗ song type: whether the song is a cover song or an original. artist vertices ∗ type: always artist for artist vertices. ∗ name: the name of the artist. • edges followed by (song → song): if the tail song was followed by the head song in concert. ∗ weight: the number of times these two songs were paired in concert. sung by (song → artist): if the tail song was primarily sung by the head artist. written by (song → artist): if the tail song was written by the head artist.
  • 33. A Subset of the Grateful Dead Concert Graph type="artist" type="artist" name="Hunter" name="Garcia" type="song" name="Scarlet.." 7 5 written_by 1 sung_by weight=239 followed_by type="song" name="Fire on.." sung_by sung_by written_by 2 type="artist" weight=1 name="Lesh" type="song" followed_by name="Pass.." 6 written_by 3 sung_by followed_by type="song" weight=2 name="Terrap.." 4
  • 34. Centrality in a Property Graph ... Garcia sung_by sung_by sung_by sung_by sung_by sung_by sung_by sung_by sung_by sung_by sung_by sung_by Dark Terrapin China Stella Peggy Star Station Doll Blue O ... followed_by followed_by followed_by followed_by followed_by followed_by • What is the most central vertex in this graph? – Garcia. • What is the most central song? What do you mean “central?” • How do you calculate centrality when there are numerous ways in which vertices can be related?
  • 35. Loading the GraphML Representation gremlin> g = new TinkerGraph() ==>tinkergraph[vertices:0 edges:0] gremlin> GraphMLReader.inputGraph(g, new FileInputStream(‘data/graph-example-2.xml’)) ==>null gremlin> g.V.name ==>OTHER ONE JAM ==>Hunter ==>HIDEAWAY ==>A MIND TO GIVE UP LIVIN ==>TANGLED UP IN BLUE ... 22 22 Load the GraphML (http://graphml.graphdrawing.org/) representation of the Grateful Dead graph, iterate through its vertices and get the name property of each vertex.
  • 36. Using Graph Indices gremlin> v = g.idx(T.v)[[name:‘DARK STAR’]] >> 1 ==>v[89] gremlin> v.outE[[label:‘sung_by’]].inV.name ==>Garcia 23 23 Use the default vertex index (T.v) and find all vertices index by the key/value pair name/DARK STAR. Who sung Dark Star?
  • 37. Emitting Collections gremlin> v.outE[[label:‘followed_by’]].inV.name ==>EYES OF THE WORLD ==>TRUCKING ==>SING ME BACK HOME ==>MORNING DEW ... gremlin> v.outE[[label:‘followed_by’]].emit{[it.inV.name >> 1, it.weight]} ==>[EYES OF THE WORLD, 9] ==>[TRUCKING, 1] ==>[SING ME BACK HOME, 1] ==>[MORNING DEW, 11] ... 24 24 What followed Dark Star in concert? How many times did each song follow Dark Start in concert?
  • 38. Eigenvector Centrality – Indexing Vertices gremlin> m = [:]; c = 0 ==>0 gremlin> g.V.outE[[label:‘followed_by’]].inV.groupCount(m) .loop(4){c++ < 1000}.filter{false} gremlin> m.sort{a,b -> b.value <=> a.value} ==>v[13]=762 ==>v[21]=668 ==>v[50]=661 ==>v[153]=630 ==>v[96]=609 ==>v[26]=586 ... 25 25 Emanate from each vertex the followed by path. Index each vertex by how many times its been traversed over in the map m. Do this for 1000 times. The final filter{false} will ensure nothing is outputted.
  • 39. Eigenvector Centrality – Indexing Names gremlin> m = [:]; c = 0 ==>0 gremlin> g.V.outE[[label:‘followed_by’]].inV.name.groupCount(m) .back(2).loop(4){c++ < 1000}.filter{false} gremlin> m.sort{a,b -> b.value <=> a.value} ==>PLAYING IN THE BAND=762 ==>TRUCKING=668 ==>JACK STRAW=661 ==>SUGAR MAGNOLIA=630 ==>DRUMS=609 ==>PROMISED LAND=586 ... 26 26 Do the same, index the names of the songs in the map m. However, since you can get the outgoing edges of a string, jump back 2 steps, then loop.
  • 40. Paths in a Graph gremlin> g.v(89).outE.inV.name ==>EYES OF THE WORLD ==>TRUCKING ==>SING ME BACK HOME ==>MORNING DEW ==>HES GONE ... gremlin> g.v(89).outE.inV.name.paths ==>[v[89], e[7021][89-followed_by->83], v[83], EYES OF THE WORLD] ==>[v[89], e[7022][89-followed_by->21], v[21], TRUCKING] ==>[v[89], e[7023][89-followed_by->206], v[206], SING ME BACK HOME] ==>[v[89], e[7006][89-followed_by->127], v[127], MORNING DEW] ==>[v[89], e[7024][89-followed_by->49], v[49], HES GONE] ... 27 27 What are the paths that are out.inV.name emanating from Dark Star (v[89])?
  • 41. Paths of Length < 4 Between Dark Star and Drums gremlin> g.v(89).outE.inV.loop(2){it.loops < 4 & !(it.object.equals(g.v(96)))}[[id:‘96’]].paths ==>[v[89], e[7014][89-followed_by->96], v[96]] ==>[v[89], e[7021][89-followed_by->83], v[83], e[1418][83-followed_by->96], v[96]] ==>[v[89], e[7022][89-followed_by->21], v[21], e[6320][21-followed_by->96], v[96]] ==>[v[89], e[7006][89-followed_by->127], v[127], e[6599][127-followed_by->96], v[96]] ==>[v[89], e[7024][89-followed_by->49], v[49], e[6277][49-followed_by->96], v[96]] ==>[v[89], e[7025][89-followed_by->129], v[129], e[5751][129-followed_by->96], v[96]] ==>[v[89], e[7026][89-followed_by->149], v[149], e[2017][149-followed_by->96], v[96]] ==>[v[89], e[7027][89-followed_by->148], v[148], e[1937][148-followed_by->96], v[96]] ==>[v[89], e[7028][89-followed_by->130], v[130], e[1378][130-followed_by->96], v[96]] ==>[v[89], e[7019][89-followed_by->39], v[39], e[6804][39-followed_by->96], v[96]] ==>[v[89], e[7034][89-followed_by->91], v[91], e[925][91-followed_by->96], v[96]] ==>[v[89], e[7035][89-followed_by->70], v[70], e[181][70-followed_by->96], v[96]] ==>[v[89], e[7017][89-followed_by->57], v[57], e[2551][57-followed_by->96], v[96]] ... 28 28 Get the adjacent vertices to Dark Star (v[89]). Loop on outE.inV while the amount of loops is less than 4 and the current path hasn’t reached Drums (v[96]). Return the paths traversed.
  • 42. Shortest Path Between Dark Star and Drums gremlin> [g.v(89).outE.inV .loop(2){!(it.object.equals(g.v(96)))}[[id:‘96’]].paths >> 1] ==>[v[89], e[7014][89-followed_by->96], v[96]] gremlin> (g.v(89).outE.inV.loop(2) {!(it.object.equals(g.v(96)))}[[id:‘96’]].paths >> 1).size() / 3 ==>1 29 29 Given the nature of the loop step, the paths are emitted in order of length. Thus, just “pop off” the first path (>> 1) and that is the shortest path. You can then gets its length if you want to know the length of the shortest path. Be sure to divide by 3 as the path has the start vertex, edge, and end vertex.
  • 43. Integrating the JDK (Java API) • Groovy is the host language for Gremlin. Thus, the JDK is natively available in a path expression. gremlin> v.outE[[label:‘followed_by’]].inV{it.name.matches(‘D.*’)}.name ==>DEAL ==>DRUMS • Examples below are not with respect to the Grateful Dead schema presented previously. They help to further articulate the point at hand. gremlin> x.outE.inV.emit{JSONParser.get(it.uri).reviews}.mean() ... gremlin> y.outE[[label:‘rated’]].filter{TextAnalysis.isElated(it.review)}.inV ...
  • 44. The Concept of Abstract Adjacency – Part 1 • Loop on outE.inV. g.v(89).outE.inV.loop(2){it.loops < 4} • Loop on followed by. g.v(89).outE[[label:‘followed_by’]].inV.loop(3){it.loops < 4} • Loop on cofollowed by. g.v(89).outE[[label:‘followed_by’]].inV .inE[[label:‘followed_by’]].outV.loop(6){it.loops < 4}
  • 45. The Concept of Abstract Adjacency – Part 2 outE in outE V loop(2){..} back(3) {i t. sa la } ry } friend_of_a_friend_who_earns_less_than_friend_at_work • outE, inV, etc. is low-level graph speak (the domain is the graph). • codeveloper is high-level domain speak (the domain is software development).30 30 In this way, Gremlin can be seen as a DSL (domain-specific language) for creating DSL’s for your graph applications. Gremlin’s domain is “the graph.” Build languages for your domain on top of Gremlin (e.g. “software development domain”).
  • 46. Developer's FOAF Import Graph Friend-Of-A-Friend Graph Developer Imports Friends at Work Graph Graph You need not make Software Imports Friendship derived graphs explicit. Graph Graph You can, at runtime, compute them. Moreover, generate them locally, not Developer Created globally (e.g. ``Marko's Employment Graph friends from work Graph relations"). This concept is related to automated reasoning and whether reasoned relations are inserted into the explicit graph or Explicit Graph computed at query time.
  • 47. The Concept of Abstract Adjacency – Part 3 • Gremlin provides fine grained (Turing-complete) control over how your traverser moves through a property graph. • As such, there are as many shortest paths, eigenvectors/PageRanks, clusters, cliques, etc. as there are “abstract adjacencies” in the graph.31 32 This is the power of the property graph over standard unlabeled graphs. There are many ways to slice and dice your data. 31 Rodriguez M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms, Journal of Informetrics, 4(1), pp. 29–41, 2009. [http://arxiv.org/abs/0806.2274] 32 Other terms for abstract adjacencies include domain-specific relations, inferred relations, implicit edges, virtual edges, abstract paths, derived adjacencies, higher-order relations, etc.
  • 48. In the End, Gremlin is Just Pipes gremlin> (g.v(89).outE[[label:‘followed_by’]].inV.name).toString() ==>[VertexEdgePipe<OUT_EDGES>, LabelFilterPipe<NOT_EQUAL,followed_by>, EdgeVertexPipe<IN_VERTEX>, PropertyPipe<name>] 33 33 Gremlin is a domain specific language for constructing lazy evaluation, data flow Pipes (http: //pipes.tinkerpop.com). In the end, what is evaluated is a pipeline process in pure Java over a Blueprints-enabled property graph (http://blueprints.tinkerpop.com).
  • 49. Credits • Marko A. Rodriguez: designed, developed, and documented Gremlin. • Pavel Yaskevich: heavy development on earlier versions of Gremlin. • Darrick Wiebe: development on Pipes, Blueprints, and inspired the move to using Groovy as the host language of Gremlin. • Peter Neubauer: promoter and evangelist of Gremlin. • Ketrina Yim: designed the Gremlin logo. Gremlin G = (V, E) http://gremlin.tinkerpop.com http://groups.google.com/group/gremlin-users http://tinkerpop.com