Outrageous Ideas
Data Day Texas - June 13, 2022
For Graph Databases
@maxdemarzi


maxdemarzi.com


GitHub.com/maxdemarzi


Max De Marzi
Ten Years
In the Graph game.
Let’s talk about
Money
1.8%
2018
Ideas are Wrong
• Too Many Back-ends (aka Tinkerpop is wrong)
• No lessons applied from Relational Databases
• API is incomplete (bulk)
• Query Languages are Incompetent
Implementations are Wrong
• Nodes as Objects sucks
• No internal algebras
• Incompetent Query Optimizers
• Incompetent Query Executors
• Incompetent Engineering
• A short clip of the talk
GitHub.com/ldbc/lsqb
https://homepages.cwi.nl/~boncz/edbt2022.pdf
Peter Suggests:
https://homepages.cwi.nl/~boncz/edbt2022.pdf
1. Row Storage for Properties of Nodes/Relationships


2. Less Indexing


3. Less Joins


4. Be more Relational then add Graph Functionality


5. Don’t rely on the query optimizer


6. Don’t allow generic recursive queries


7. Limit the query language
Completely Sensible Ideas
Data Day Texas - June 13, 2022
For Graph Databases
Outrageous Ideas
Data Day Texas - June 13, 2022
For Graph Databases
Why?
How many Paths are there from the top left node to the bottom right node?
2 Paths
6 Paths
14x14 = 11 minutes


15x15 = 10 Hours


20x20 = Nope
How many Paths are there?
20x20 = 10 Minutes
How many Paths are there?
137 Billion
How many Paths are there?
-[*]-
Death Star Queries
Blows up Alderaaning Servers
How many Paths are there?
20 x 20 in 0.41 Seconds
137 Billion
http://relational.ai
Column Storage
Idea One
Graph Normal Form


Narrow Tables
Key-Value or Key-Key
Traditional 3rd Normal Form
Graph Normal Form
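A rough sketch of the difference between the two layouts, assuming a hypothetical `people` table (the table, its columns, and the `to_graph_normal_form` helper are illustrative and not from the talk). Each wide row is split into narrow relations that hold a key and a single value, and a missing value simply produces no tuple:

```python
def to_graph_normal_form(rows, key):
    """Split wide rows into one narrow (key, value) relation per column."""
    relations = {}
    for row in rows:
        for column, value in row.items():
            if column == key or value is None:
                continue  # a missing value is an absent tuple, not a NULL
            relations.setdefault(column, set()).add((row[key], value))
    return relations


# Hypothetical wide table in the traditional layout.
people = [
    {"id": 1, "name": "Ann", "age": 34},
    {"id": 2, "name": "Bob", "age": None},
]

narrow = to_graph_normal_form(people, key="id")
# narrow["name"] holds (id, name) pairs; narrow["age"] holds (id, age) pairs.
```

Because every narrow relation has at most two columns, indexing it in both directions stays cheap, which is the assumption behind "One Index Per Column".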
More Indexes
Idea Two
One Index Per Column
Composite Index Explosion
Dual Indexed Narrow Tables = Dynamic Composite Indexes
More Joins
Idea Three
Problem with Joins
Table 1 IDs: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
Table 2 IDs: 0, 2, 6, 7, 8, 9
Table 3 IDs: 2, 4, 5, 8, 10
Results (Table 1, Table 2, Table 3): 8, 8, 8
Intermediate Results (Table 1 and Table 2): 0, 6, 7, 8, 9
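The problem can be reproduced with a minimal sketch of a pairwise join plan over the three ID lists above. The `pairwise_join` helper is a hypothetical stand-in for a binary join operator, not any particular engine's implementation:

```python
table1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
table2 = [0, 2, 6, 7, 8, 9]
table3 = [2, 4, 5, 8, 10]


def pairwise_join(left, right):
    """Binary join on ID: keep the IDs of `left` that also occur in `right`."""
    right_ids = set(right)
    return [x for x in left if x in right_ids]


# A binary plan must materialise Table 1 joined with Table 2 first...
intermediate = pairwise_join(table1, table2)
# ...before Table 3 throws most of those rows away again.
final = pairwise_join(intermediate, table3)
```

Here five intermediate rows are built to produce a single result row; on real data the gap between intermediate and final sizes can be far larger.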
Worst Case Optimal Joins
● Worst-Case Optimal Join Algorithms: Techniques, Results, and
Open Problems. Ngo. (Gems of PODS 2018)
● Worst-Case Optimal Join Algorithms: Techniques, Results, and
Open Problems. Ngo, Porat, Re, Rudra. (Journal of the ACM
2018)
● What do Shannon-type inequalities, submodular width, and
disjunctive datalog have to do with one another? Abo Khamis,
Ngo, Suciu, (PODS 2017 - Invited to Journal of ACM)
● Computing Join Queries with Functional Dependencies. Abo
Khamis, Ngo, Suciu. (PODS 2017)
● Joins via Geometric Resolutions: Worst-case and Beyond. Abo
Khamis, Ngo, Re, Rudra. (PODS 2015, Invited to TODS 2015)
● Beyond Worst-Case Analysis for Joins with Minesweeper. Abo
Khamis, Ngo, Re, Rudra. (PODS 2014)
● Leapfrog Triejoin: A Simple Worst-Case Optimal Join Algorithm.
Veldhuizen (ICDT 2014 - Best Newcomer)
● Skew Strikes Back: New Developments in the Theory of Join
Algorithms. Ngo, Re, Rudra. (Invited to SIGMOD Record 2013)
● Worst Case Optimal Join Algorithms. Ngo, Porat, Re,
Rudra. (PODS 2012 – Best Paper)
LeapFrog Join
Table 1 IDs: 0, 1, 3, 4, 5, 6, 7, 8, 9, 11
Table 2 IDs: 0, 2, 6, 7, 8, 9
Table 3 IDs: 2, 4, 5, 8, 10

Table 1 | Table 2 | Table 3 | Action
0       | 0       | 2       | Table 1: Seek 2
3       | 0       | 2       | Table 2: Seek 3
3       | 6       | 2       | Table 3: Seek 6
3       | 6       | 8       | Table 1: Seek 8
8       | 6       | 8       | Table 2: Seek 8
8       | 8       | 8       | Emit, Table 3: Next
8       | 8       | 10      | Table 1: Seek 10
11      | 8       | 10      | Table 2: Seek 11 (END)
Results
Table 1 Table 2 Table 3
8 8 8
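The trace above can be approximated with a short sketch of a leapfrog-style intersection over sorted ID lists. This is a simplified single-attribute version written for illustration, not Veldhuizen's full Leapfrog Triejoin; `leapfrog_join` is a hypothetical name, and `bisect_left` stands in for an index seek. After a match it advances whichever table completed the match, so the order of seeks can differ slightly from the slide while the result is the same:

```python
from bisect import bisect_left


def leapfrog_join(*tables):
    """Intersect several ID lists by seeking each one to the current candidate."""
    lists = [sorted(set(t)) for t in tables]
    if not lists or any(not keys for keys in lists):
        return []
    pos = [0] * len(lists)                      # cursor into each sorted list
    candidate = max(keys[0] for keys in lists)  # largest of the first keys
    agree = 0                                   # lists currently sitting on candidate
    i = 0
    results = []
    while True:
        keys = lists[i]
        # Seek: jump to the first key >= candidate, skipping the gap in between.
        pos[i] = bisect_left(keys, candidate, pos[i])
        if pos[i] == len(keys):
            return results                      # one list is exhausted: done
        if keys[pos[i]] == candidate:
            agree += 1
            if agree >= len(lists):             # every list agrees: emit
                results.append(candidate)
                pos[i] += 1                     # "Next" on this list
                if pos[i] == len(keys):
                    return results
                candidate = keys[pos[i]]
                agree = 1
        else:
            candidate = keys[pos[i]]            # overshoot becomes the new candidate
            agree = 1
        i = (i + 1) % len(lists)


table1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
table2 = [0, 2, 6, 7, 8, 9]
table3 = [2, 4, 5, 8, 10]
joined = leapfrog_join(table1, table2, table3)
```

No intermediate result is materialised: each seek either confirms the candidate or replaces it with a larger key.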
[Figure: seek sequence from Start to End: Seek 2, Seek 3, Seek 6, Seek 8, Seek 8, Next, Seek 10, Seek 11]
More than 3 Tables
[Figure: seven numbered seeks (1) seek c, 2) seek c, 3) seek f, 4) seek g, 5) seek m, 6) seek m, 7) seek m) across the sorted keys of the Brand, Category, Retailer and Rating tables]
Worst-Case Optimal Joins take advantage of sorted keys and gaps in the data to
eliminate intermediate results, speed up queries and get rid of the Join problem.
in Legacy GraphDBs:
How do you model Flight Data?
• Don’t we care about Flights only on particular Days?
• Group Destinations together!
• OMG WAT!
Reduce the Search Space
[Figure: seven numbered seeks (1) seek c, 2) seek c, 3) seek f, 4) seek g, 5) seek m, 6) seek m, 7) seek m) across the sorted keys of the Airport, Day, Flight and Destination tables]
What if you wanted to earn miles on your frequent flyer program and filter by Airline? No
problem here, the more joins the merrier.
Real Relational
Idea Four
Vision Reality
Relational Databases
Drop The “Null”
What’s wrong with NULL?
SELECT *
FROM parts
WHERE (price <= 99) OR (price > 99)

SELECT *
FROM parts
WHERE (price <= 99) OR (price > 99) OR isNull(price)

SELECT AVG(height)
FROM parts

SELECT orders.id, parts.id
FROM orders LEFT OUTER JOIN
parts ON parts.id = orders.part_id

SELECT orders.id, parts.id
FROM parts LEFT OUTER JOIN
orders ON parts.id = orders.part_id


●(a OR NOT(a)) != True
●Aggregation requires special cases
●Outer Joins are not commutative: a x b != b x a
Query Optimizers hate Nulls. Three-valued logic causes major headaches.
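The first three queries can be tried out with a small sketch using Python's built-in `sqlite3` module. The `parts` rows are made-up sample data, the `scalar` helper is hypothetical, and the standard `price IS NULL` is used in place of the slide's `isNull(price)`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, price REAL, height REAL)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(1, 50.0, 10.0), (2, 150.0, 20.0), (3, None, None)],  # part 3 has NULLs
)


def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


total = scalar("SELECT COUNT(*) FROM parts")
# "price <= 99 OR price > 99" looks like a tautology, yet it is not TRUE for NULL.
either_side = scalar(
    "SELECT COUNT(*) FROM parts WHERE (price <= 99) OR (price > 99)"
)
with_null_check = scalar(
    "SELECT COUNT(*) FROM parts "
    "WHERE (price <= 99) OR (price > 99) OR price IS NULL"
)
# AVG silently skips NULLs: the average is over two rows, not three.
avg_height = scalar("SELECT AVG(height) FROM parts")
```

With this data the "tautology" returns two of the three rows, and the average height is computed over two rows only.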
Lose the “Bags”
Sets vs Bags
Set: {1,2,3}, {8,3,4}
Bags: {1,2,2,3}, {3, 3, 3, 3}
Sets have Unique Values
Bags allow Duplicate Values
●Queries that use only ANDs (no ORs)
are called “conjunctive queries”
●Conjunctive Queries under Set
Semantics are Much Easier to Optimize
Query Optimizers hate Bags. Duplicates cause
major headaches.
Smarter Optimizer
Idea Five
Traditional Query Optimizers
• Predicate pushdown (push selection through join)


• Projection pushdown (push projection through join)


• Aggregation pushdown


• Their “pull up” counterparts


• Split conjunctive predicates (split AND statements)


• Replace Cartesian products (use inner joins with predicates)


• (Un)Nesting Sub-Queries


• Etc.
[Diagram labels: Query, Semantic Optimizer, Math, Equivalent Optimized Query, Data, Answer]
Semantic Query Optimizer
Math
You learned this in middle school
• 1 + (2 + 3) = (1 + 2) + 3


• 3 + 4 = 4 + 3


• 3 + 0 = 3


• 1 + (-1) = 0
• 2 x (3 x 4) = (2 x 3) x 4


• 2 x 5 = 5 x 2


• 2 x 1 = 2


• 2 x 0.5 = 1
• 2 x (3 + 4) = (2 x 3) + (2 x 4)


• (3 + 4) x 2 = (3 x 2) + (4 x 2)
Math
You learned this in high school
• a + (b + c) = (a + b) + c


• a + b = b + a


• a + 0 = a


• a + (-a) = 0
• a x (b x c) = (a x b) x c


• a x b = b x a


• a x 1 = a


• a x a^-1 = 1, a != 0
• a x (b + c) = (a x b) + (a x c)


• (a + b) x c = (a x c) + (b x c)
Math
You forgot this in high school
• Addition:
  • Associativity: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c
  • Commutativity: a ⊕ b = b ⊕ a
  • Identity: a ⊕ ō = a
  • Inverse: a ⊕ (-a) = ō
• Multiplication:
  • Associativity: a ⊗ (b ⊗ c) = (a ⊗ b) ⊗ c
  • Commutativity: a ⊗ b = b ⊗ a
  • Identity: a ⊗ ī = a
  • Inverse: a ⊗ a^-1 = ī
• Distribution of Multiplication over Addition:
  • a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c)
  • (a ⊕ b) ⊗ c = (a ⊗ c) ⊕ (b ⊗ c)
Example 1
Query: find the count of the combined rows a, b, c in tables R, S and T



	
	
def result = count[a,b,c: R(a) and S(b) and T(c)]
Mathematic Representation:
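One way to write the count, assuming indicator functions $\mathbf{1}_R(a)$ that are 1 when $a$ is a row of $R$ and 0 otherwise (this notation is an assumption and is not taken from the slides). Distributing multiplication over addition separates the three sums:

```latex
\[
\sum_{a}\sum_{b}\sum_{c} \mathbf{1}_R(a)\,\mathbf{1}_S(b)\,\mathbf{1}_T(c)
  = \Big(\sum_{a}\mathbf{1}_R(a)\Big)
    \Big(\sum_{b}\mathbf{1}_S(b)\Big)
    \Big(\sum_{c}\mathbf{1}_T(c)\Big)
  = |R|\cdot|S|\cdot|T|
\]
```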
Optimized Query:
def result = count[R] * count[S] * count[T]
n^3 is much slower than 3n
Example 2
Query: find the minimum sum of rows a, b, c in tables R, S and T:



	
	
	
def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]
Mathematic Representation:
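A possible way to state it, reading $R[a]$, $S[b]$ and $T[c]$ as the values stored under keys $a$, $b$ and $c$ (an assumption about the notation). Here $\min$ plays the role of $\oplus$ and $+$ the role of $\otimes$, and the distributive law $x + \min(y, z) = \min(x + y,\, x + z)$ lets each minimum be taken separately:

```latex
\[
\min_{a,b,c}\big(R[a] + S[b] + T[c]\big)
  = \min_{a} R[a] \;+\; \min_{b} S[b] \;+\; \min_{c} T[c]
\]
```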
Optimized Query:
def result = min[R] + min[S] + min[T]
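The rewrite can be checked on toy data with a short sketch. The dictionaries `R`, `S`, `T` and both function names are made up for illustration:

```python
from itertools import product

# Hypothetical relations mapping a key to a numeric value.
R = {"a1": 4, "a2": 7}
S = {"b1": 3, "b2": 1}
T = {"c1": 9, "c2": 5}


def min_sum_naive(R, S, T):
    """Enumerate every (a, b, c) combination: |R| * |S| * |T| sums."""
    return min(R[a] + S[b] + T[c] for a, b, c in product(R, S, T))


def min_sum_factored(R, S, T):
    """Take each minimum separately: |R| + |S| + |T| comparisons."""
    return min(R.values()) + min(S.values()) + min(T.values())
```

Both functions return the same value; only the amount of work differs.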
Shortest Path from A to F
[Graph: A-B = 1, B-C = 2, C-D = 3, B-D = 6, D-F = 5, A-E = 9, E-F = 4]
AEF = 9 + 4 = 13
ABDF = 1 + 6 + 5 = 12
ABCDF = 1 + 2 + 3 + 5 = 11
min{13,12,11} = 11
Maximum Reliability from A to F
[Graph: A-B = 0.9, B-C = 0.9, C-D = 1.0, B-D = 0.2, D-F = 0.7, A-E = 0.4, E-F = 0.8]
AEF = 0.4 x 0.8 = 0.32
ABDF = 0.9 x 0.2 x 0.7 = 0.126
ABCDF = 0.9 x 0.9 x 1.0 x 0.7 = 0.567
max{0.32,0.126,0.567} = 0.567
Words from A to F
[Graph: A-B = T, B-C = I, C-D = M, B-D = H, D-F = E, A-E = A, E-F = T]
AEF = A · T = AT
ABDF = T · H · E = THE
ABCDF = T · I · M · E = TIME
union{at, the, time} = at the time
Math
You skipped this in college
• min { (9 + 4), (1 + 6 + 5), ( 1 + 2 + 3 + 5 ) }


• max { (0.4 x 0.8), (0.9 x 0.2 x 0.7), (0.9 x 0.9 x 1.0 x 0.7) }


• union { (A · T), (T · H · E), (T · I · M · E) }
Math
You skipped this in college
• ⊕ { (9 ⊗ 4), (1 ⊗ 6 ⊗ 5), ( 1 ⊗ 2 ⊗ 3 ⊗ 5 ) }


• ⊕ { (0.4 ⊗ 0.8), (0.9 ⊗ 0.2 ⊗ 0.7), (0.9 ⊗ 0.9 ⊗ 1.0 ⊗ 0.7) }


• ⊕ { (A ⊗ T), (T ⊗ H ⊗ E), (T ⊗ I ⊗ M ⊗ E) }
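The three calculations share one shape, which a small sketch can make explicit: a single path-aggregation routine parameterised by the two operators. The function name `aggregate_paths` and the edge dictionaries are illustrative assumptions, with weights read off the three example graphs; the routine assumes an acyclic graph with at least one path from source to target:

```python
import math
import operator


def aggregate_paths(weights, source, target, plus, times):
    """Combine edge weights along each path with `times`, then paths with `plus`."""
    def path_values(node):
        for (u, v), w in weights.items():
            if u != node:
                continue
            if v == target:
                yield w
            else:
                for rest in path_values(v):
                    yield times(w, rest)

    values = list(path_values(source))
    total = values[0]
    for value in values[1:]:
        total = plus(total, value)
    return total


length = {("A", "B"): 1, ("B", "C"): 2, ("C", "D"): 3, ("B", "D"): 6,
          ("D", "F"): 5, ("A", "E"): 9, ("E", "F"): 4}
reliability = {("A", "B"): 0.9, ("B", "C"): 0.9, ("C", "D"): 1.0, ("B", "D"): 0.2,
               ("D", "F"): 0.7, ("A", "E"): 0.4, ("E", "F"): 0.8}
letters = {("A", "B"): {"T"}, ("B", "C"): {"I"}, ("C", "D"): {"M"}, ("B", "D"): {"H"},
           ("D", "F"): {"E"}, ("A", "E"): {"A"}, ("E", "F"): {"T"}}


def concat(xs, ys):
    """Concatenate every string in xs with every string in ys."""
    return {x + y for x in xs for y in ys}


shortest = aggregate_paths(length, "A", "F", plus=min, times=operator.add)
most_reliable = aggregate_paths(reliability, "A", "F", plus=max, times=operator.mul)
words = aggregate_paths(letters, "A", "F", plus=operator.or_, times=concat)
```

Only the `plus` and `times` arguments change between the three calls; the traversal itself is identical.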
Example 3
Query: count the number of 3-hop paths per node in a graph


def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)


def result[a] = count[path3[a]]
Mathematic Representation:
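A possible formulation, assuming $E(x, y)$ is 1 when `edge(x, y)` holds and 0 otherwise (notation introduced here for illustration). Pushing each sum inward as far as it will go turns one triple sum into three nested single sums:

```latex
\[
\mathrm{result}[a]
  = \sum_{b}\sum_{c}\sum_{d} E(a,b)\,E(b,c)\,E(c,d)
  = \sum_{b} E(a,b)\Big(\sum_{c} E(b,c)\Big(\sum_{d} E(c,d)\Big)\Big)
\]
```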
Optimized Query:
def path1[c] = count[edge[c]]


def path2[b] = sum[path1[c] for c in edge[b]]


def result[a] = sum[path2[b] for b in edge[a]]
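The same rewrite can be sketched outside of Rel. The version below assumes the graph is a set of directed `(source, target)` pairs; both function names and the sample edges are illustrative:

```python
def three_hop_counts(edges):
    """Count 3-hop paths per start node by reusing 1-hop and 2-hop counts."""
    out = {}
    for a, b in edges:
        out.setdefault(a, set()).add(b)
    # path1[c]: number of edges leaving c.
    path1 = {c: len(targets) for c, targets in out.items()}
    # path2[b]: number of 2-hop paths starting at b.
    path2 = {b: sum(path1.get(c, 0) for c in targets) for b, targets in out.items()}
    # result[a]: number of 3-hop paths starting at a.
    result = {a: sum(path2.get(b, 0) for b in targets) for a, targets in out.items()}
    return {a: n for a, n in result.items() if n > 0}


def three_hop_counts_naive(edges):
    """Enumerate every (a, b, c, d) combination, as the unoptimized query would."""
    counts = {}
    for a, b in edges:
        for b2, c in edges:
            if b2 != b:
                continue
            for c2, d in edges:
                if c2 == c:
                    counts[a] = counts.get(a, 0) + 1
    return counts


edges = {(1, 2), (2, 3), (3, 4), (2, 5), (5, 4), (4, 1)}
fast = three_hop_counts(edges)
slow = three_hop_counts_naive(edges)
```

The factored version touches each edge a constant number of times, while the naive version loops over the edge set three levels deep.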
Semantic Query Optimizer
It knows math!
• Compute Discrete Fourier Transform in Fast Fourier Transform-time


• Junction Tree Algorithm for inference in Probabilistic Graphical Models


• Message passing, belief propagation


• Viterbi Algorithm, forward/backward for Hidden Markov Models most probable
paths


• Counting sub-graph patterns (motifs)


• Yannakakis Algorithm for acyclic conjunctive queries in Polynomial Time


• Fractional hypertree-width time algorithm for Constraint Satisfaction Problems


• Best known results for Conjunctive Queries and Quantified Conjunctive Queries
Semantic Query Optimizer
It knows math!
• This optimizer produces much better code than the average developer
because it knows a ton more math than the average developer.
• Maryam Mirzakhani


• Terence Tao


• Ramanujan


• Katherine Goble


• Good Will Hunting
Add Recursion
Idea Six
def reachable = edge; reachable.edge
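Reading the definition as "an edge, or a reachable pair extended by one more edge", a naive fixpoint evaluation can be sketched as follows. This is an illustration of the idea only, not how any engine evaluates it, and `transitive_closure` is a hypothetical name:

```python
def transitive_closure(edges):
    """Repeat 'reachable = edge ∪ (reachable composed with edge)' until nothing changes."""
    reachable = set(edges)
    while True:
        # Compose: (x, y) reachable and (y, z) an edge gives (x, z).
        derived = {(x, z) for (x, y) in reachable for (y2, z) in edges if y == y2}
        if derived <= reachable:
            return reachable  # fixpoint reached: no new pairs were derived
        reachable |= derived


closure = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

The loop terminates even on cyclic graphs because the set of pairs is finite and only ever grows.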
Recursion
How many Paths are there from the top left node to the bottom right node?
2 Paths
6 Paths
def number_of_paths_of_length(node_number, path_length, path_count) =
    node_number=1, path_length=0, path_count=1

def number_of_paths_of_length[node_number, path_length] =
    sum[other_node, paths_of_length : paths_of_length =
        number_of_paths_of_length[other_node, path_length - 1]
        and edge(other_node, node_number)]

def output = number_of_paths_of_length[number_of_nodes, 2 * lattice_size]
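The same recurrence can be sketched in Python to check the numbers quoted earlier in the deck. The node numbering (row by row, starting at 1, with edges pointing right and down) is an assumption inferred from the evaluation traces, and `lattice_edges` and `count_paths` are hypothetical helpers:

```python
from math import comb


def lattice_edges(lattice_size):
    """Edges of a lattice with (lattice_size + 1)^2 nodes, numbered row by row from 1."""
    side = lattice_size + 1
    edges = set()
    for row in range(side):
        for col in range(side):
            node = row * side + col + 1
            if col + 1 < side:
                edges.add((node, node + 1))     # step right
            if row + 1 < side:
                edges.add((node, node + side))  # step down
    return edges


def count_paths(lattice_size):
    """Paths from node 1 to the last node, counted per (node, path_length)."""
    side = lattice_size + 1
    predecessors = {}
    for other_node, node in lattice_edges(lattice_size):
        predecessors.setdefault(node, []).append(other_node)
    counts = {(1, 0): 1}  # base case: one path of length 0 at node 1
    for path_length in range(1, 2 * lattice_size + 1):
        for node, others in predecessors.items():
            total = sum(counts.get((o, path_length - 1), 0) for o in others)
            if total:
                counts[(node, path_length)] = total
    return counts[(side * side, 2 * lattice_size)]
```

Each `(node, path_length)` pair is computed once from its predecessors, so the work grows with the number of nodes rather than with the number of paths.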
@function @transient
def :_intermediate#0(other_node#1, path_length#0, _t#0) =
reduce[(_x#0, _y#0, _z#0) : :rel_primitive_add(_x#0, _y#0, _z#0),
(x#8, paths_of_length#1) :
:number_of_paths_of_length(other_node#1, x#8, paths_of_length#1) and
:rel_primitive_add(1, x#8, path_length#0),
(_no_init#0) : false](_t#0)


@function @transient
def :_intermediate#1(node_number#0, path_length#0, path_count#0) =
reduce[(_x#1, _y#1, _z#1) : :rel_primitive_add(_x#1, _y#1, _z#1),
(other_node#1, _t#0) :
:edge(other_node#1, node_number#0) and
:_intermediate#0(other_node#1, path_length#0, _t#0),
(_no_init#1) : false](path_count#0)


def :number_of_paths_of_length(node_number#0, path_length#0, path_count#0) =
:_base_case#0(node_number#0, path_length#0, path_count#0) or
:_intermediate#1(node_number#0, path_length#0, path_count#0)
Naive recursion, iteration 1
Evaluating `_intermediate#0`:
(1, 1) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(4, 1) => (1,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(4, 1, 1)
Naive recursion, iteration 2
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(4, 2) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(7, 2) => (1,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(7, 2, 1)
Naive recursion, iteration 3
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(3, 3) => (1,)
(4, 2) => (1,)
(5, 3) => (2,)
(7, 3) => (1,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(6, 3) => (3,)
(7, 2) => (1,)
(8, 3) => (3,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(6, 3, 3)
(7, 2, 1)
(8, 3, 3)
Naive recursion, iteration 4
Evaluating `_intermediate#0`:
(1, 1) => (1,)
(2, 2) => (1,)
(3, 3) => (1,)
(4, 2) => (1,)
(5, 3) => (2,)
(6, 4) => (3,)
(7, 3) => (1,)
(8, 4) => (3,)
Evaluating `_intermediate#1`:
(2, 1) => (1,)
(3, 2) => (1,)
(4, 1) => (1,)
(5, 2) => (2,)
(6, 3) => (3,)
(7, 2) => (1,)
(8, 3) => (3,)
(9, 4) => (6,)
Evaluating `number_of_paths_of_length`:
(1, 0, 1)
(2, 1, 1)
(3, 2, 1)
(4, 1, 1)
(5, 2, 2)
(6, 3, 3)
(7, 2, 1)
(8, 3, 3)
(9, 4, 6)
No Language Limits
Idea Seven
Graph Analytics
module graph_analytics[G]
    with G use node, edge

    def neighbor(x, y) = edge(x, y) or edge(y, x)
    def outdegree[x] = count[edge[x]]
    def degree[x] = count[neighbor[x]]
    def cn[x, y] = count[intersect[neighbor[x], neighbor[y]]] // Count of Common Neighbors

    def reachable = edge; reachable.edge
    def reachable_undirected = neighbor; reachable_undirected.neighbor

    def scc[x] = min[v: reachable(x, v) and reachable(v, x)] // Strongly Connected Component
    def wcc[x] = min[reachable_undirected[x]] // Weakly Connected Component

    def cosine_sim[x, y] = cn[x, y] / sqrt[degree[x] * degree[y]]
    def jaccard_sim[x, y] = cn[x, y] / (count[neighbor[x]] + count[neighbor[y]] - cn[x, y])
    …
end
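For comparison, the neighbor-based measures in the module can be sketched in Python. The graph is assumed to be a list of edge pairs, and the function names simply mirror the Rel definitions; the Jaccard denominator is the size of the union of the two neighbor sets:

```python
from math import isclose, sqrt


def neighbors(edges):
    """Undirected neighbor sets: y is a neighbor of x if either direction exists."""
    nbrs = {}
    for x, y in edges:
        nbrs.setdefault(x, set()).add(y)
        nbrs.setdefault(y, set()).add(x)
    return nbrs


def cn(nbrs, x, y):
    """Count of common neighbors."""
    return len(nbrs[x] & nbrs[y])


def cosine_sim(nbrs, x, y):
    return cn(nbrs, x, y) / sqrt(len(nbrs[x]) * len(nbrs[y]))


def jaccard_sim(nbrs, x, y):
    common = cn(nbrs, x, y)
    # |N(x) ∪ N(y)| = |N(x)| + |N(y)| - |N(x) ∩ N(y)|
    return common / (len(nbrs[x]) + len(nbrs[y]) - common)


nbrs = neighbors([(1, 3), (1, 4), (2, 3), (2, 4), (2, 5)])
```

In this sample graph nodes 1 and 2 share two of three distinct neighbors, so their Jaccard similarity is 2/3.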
Betweenness Centrality


Graph Algorithms
One of many graph centrality measures that are
useful for assessing the importance of a node.

High Level Definition: Number of times a node
appears on shortest paths within a network

Why it’s Useful: Identify which nodes control
information flow between different areas of the
graph; also called “Bridge Nodes”

Business Use-Cases:

Communication Analysis: Identify important
people who communicate across different
groups

Retail Purchase Analysis: Which products
introduce customers to new categories
Betweenness Centrality


Computation
Brandes Algorithm is applied as follows:

1. For each pair of nodes, compute all
shortest paths and capture nodes
(less endpoints) on said path(s)

2. For each pair of nodes, assign each
node along path a value of one if there
is only one shortest path, or the
fractional contribution (1/n) if n
shortest paths

3. Sum the value from step 2 for each
node; this is the Betweenness
Centrality
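The three steps can be followed literally in a short sketch for an unweighted, undirected graph. This is the direct pair-by-pair definition rather than the dependency accumulation usually associated with Brandes' algorithm, and `bfs_counts` and `betweenness` are hypothetical names:

```python
from collections import deque


def bfs_counts(adj, s):
    """Distance from s and number of shortest paths from s, for every reachable node."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in adj:
            if t == s or t not in dist_s:
                continue
            for v in adj:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-t path when the distances add up.
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    # Fraction of the shortest s-t paths that pass through v.
                    score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    # Each unordered pair was visited as (s, t) and (t, s), so halve the sums.
    return {v: total / 2 for v, total in score.items()}


path_scores = betweenness([("a", "b"), ("b", "c")])
square_scores = betweenness([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
```

On the three-node path the middle node scores 1; on the four-node cycle every node scores 0.5 because each opposite pair has two shortest paths.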
Betweenness Centrality Implementation
// Shortest path between s and t when they are the same is 0.
def shortest_path[s, t] = Min[
    v, w:
    (shortest_path(s, t, w) and v = 1) or
    (w = shortest_path[s,v] +1 and E(v, t))
]
// When s and t are the same, there is only one shortest path between
// them, namely the one with length 0.
def nb_shortest(s, t, n) = V(s) and V(t) and s = t and n = 1
// When s and t are *not* the same, it is the sum of the number of shortest
// paths between s and v for all the v's adjacent to t and on the shortest
// path between s and t.
def nb_shortest(s, t, n) =
s != t and
n = sum[v, m:
shortest_path[s, v] + 1 = shortest_path[s, t] and E(v, t) and
nb_shortest(s, v, m)
]
// sum over all t's such that there is an edge between v and t,
// and v is on the shortest path between s and t
def C[s, v] = sum[t, r:
E(v, t) and shortest_path[s, t] = shortest_path[s, v] + 1 and
(
a = C[s, t] or
not C(s, t, _) and a = 0.0
) and
r = (nb_shortest[s, v] / nb_shortest[s, t]) * (1 + a)
] from a
// Note that below we divide by 2 because we are double counting every edge.
def betweenness_centrality_brandes[v] =
sum[s, p : s != v and C[s, v] = p]/2
Betweenness Centrality ReComputation
Incremental updates to data and recomputation
of Betweenness Centrality take only a few seconds,
whereas the entire graph needs to be re-computed
in other systems.
Algorithm Change ReComputation
Changes to the code are also recomputed
incrementally, whereas the entire algorithm
needs to be re-computed in other systems.
Code Dependency Graph
Incremental Maintenance
1. Dependency tracking to figure out which views are affected by a change.

2. Demand-driven execution to only compute what users are actively interested in.

3. Differential computation to incrementally maintain even general recursion.

4. Semantic optimization to recover better maintenance algorithms where possible.
http://relational.ai
Raised Money Too
Neo4j Data Science Presentation
 
Neo4j Stored Procedure Training Part 2
Neo4j Stored Procedure Training Part 2Neo4j Stored Procedure Training Part 2
Neo4j Stored Procedure Training Part 2
 
Neo4j Stored Procedure Training Part 1
Neo4j Stored Procedure Training Part 1Neo4j Stored Procedure Training Part 1
Neo4j Stored Procedure Training Part 1
 
Decision Trees in Neo4j
Decision Trees in Neo4jDecision Trees in Neo4j
Decision Trees in Neo4j
 
Neo4j y Fraude Spanish
Neo4j y Fraude SpanishNeo4j y Fraude Spanish
Neo4j y Fraude Spanish
 
Data modeling with neo4j tutorial
Data modeling with neo4j tutorialData modeling with neo4j tutorial
Data modeling with neo4j tutorial
 
Neo4j Fundamentals
Neo4j FundamentalsNeo4j Fundamentals
Neo4j Fundamentals
 
Neo4j Presentation
Neo4j PresentationNeo4j Presentation
Neo4j Presentation
 
Fraud Detection Class Slides
Fraud Detection Class SlidesFraud Detection Class Slides
Fraud Detection Class Slides
 
Neo4j in Depth
Neo4j in DepthNeo4j in Depth
Neo4j in Depth
 
Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015
 
What Finance can learn from Dating Sites
What Finance can learn from Dating SitesWhat Finance can learn from Dating Sites
What Finance can learn from Dating Sites
 

Último

ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Último (20)

ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 

Outrageous Ideas for Graph Databases

  • 1. Outrageous Ideas for Graph Databases. Data Day Texas - June 13, 2022
  • 4. Ten Years In the Graph game.
  • 13. 1.8%
  • 16. 2018
  • 17. Ideas are Wrong
      • Too Many Back-ends (aka Tinkerpop is wrong)
      • No lessons applied from Relational Databases
      • API is incomplete (bulk)
      • Query Languages are Incompetent
  • 18. Implementations are Wrong
      • Nodes as Objects sucks
      • No internal algebras
      • Incompetent Query Optimizers
      • Incompetent Query Executors
      • Incompetent Engineering
      • A short clip of the talk
  • 31. Peter Suggests: https://homepages.cwi.nl/~boncz/edbt2022.pdf
      1. Row Storage for Properties of Nodes/Relationships
      2. Less Indexing
      3. Less Joins
      4. Be more Relational then add Graph Functionality
      5. Don’t rely on the query optimizer
      6. Don’t allow generic recursive queries
      7. Limit the query language
  • 33. Completely Sensible Ideas for Graph Databases. Data Day Texas - June 13, 2022
  • 34. Outrageous Ideas for Graph Databases. Data Day Texas - June 13, 2022
  • 35. Why?
  • 36. How many Paths are there from the top left node to the bottom right node? 2 Paths. 6 Paths.
  • 37. How many Paths are there? 14x14 = 11 minutes. 15x15 = 10 hours. 20x20 = Nope.
  • 38. How many Paths are there? 137 Billion. 20x20 = 10 minutes.
  • 39. How many Paths are there? -[*]- Death Star Queries: Blows up Alderaaning Servers
  • 40. How many Paths are there? 137 Billion. 20 x 20 in 0.41 seconds.
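A minimal Python sketch of why the timings on these slides can differ so much. It assumes an n x n lattice in which every step goes right or down; the slides do not state the edge directions, so this is inferred from the 2 and 6 path counts. `enumerate_paths` walks every path one at a time, roughly what an unbounded path pattern does, while `count_paths` keeps a single number per node. Both function names are illustrative and are not taken from the talk.

```python
from functools import lru_cache


def enumerate_paths(n: int) -> int:
    """Count paths by walking every single one (exponential work)."""
    def walk(row: int, col: int) -> int:
        if row == n and col == n:
            return 1
        total = 0
        if row < n:
            total += walk(row + 1, col)  # step down
        if col < n:
            total += walk(row, col + 1)  # step right
        return total

    return walk(0, 0)


def count_paths(n: int) -> int:
    """Count paths by storing one number per node (polynomial work)."""
    @lru_cache(maxsize=None)
    def paths_to(row: int, col: int) -> int:
        if row == 0 or col == 0:
            return 1  # only one way along the top row or left column
        return paths_to(row - 1, col) + paths_to(row, col - 1)

    return paths_to(n, n)


print(enumerate_paths(2))  # 6
print(count_paths(20))     # 137846528820
```

The second function touches each of the 441 nodes of the 20 x 20 lattice once, which is why roughly 137 billion paths can be counted without ever materializing them.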
  • 44. Graph Normal Form Narrow Tables Key-Value or Key-Key
  • 48. One Index Per Column
  • 49. Composite Index Explosion Dual Indexed Narrow Tables = Dynamic Composite Indexes
  • 51. Problem with Joins
      Table 1 ID: 0 1 3 4 5 6 7 8 9 11
      Table 2 ID: 0 2 6 7 8 9
      Table 3 ID: 2 4 5 8 10
      Intermediate Results (Table 1 and Table 2): 0 6 7 8 9
      Results (Table 1, Table 2, Table 3): 8 8 8
  • 52. Worst Case Optimal Joins
      ● Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems. Ngo. (Gems of PODS 2018)
      ● Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems. Ngo, Porat, Ré, Rudra. (Journal of the ACM 2018)
      ● What do Shannon-type inequalities, submodular width, and disjunctive datalog have to do with one another? Abo Khamis, Ngo, Suciu. (PODS 2017 - Invited to Journal of the ACM)
      ● Computing Join Queries with Functional Dependencies. Abo Khamis, Ngo, Suciu. (PODS 2017)
      ● Joins via Geometric Resolutions: Worst-case and Beyond. Abo Khamis, Ngo, Ré, Rudra. (PODS 2015, Invited to TODS 2015)
      ● Beyond Worst-Case Analysis for Joins with Minesweeper. Abo Khamis, Ngo, Ré, Rudra. (PODS 2014)
      ● Leapfrog Triejoin: A Simple Worst-Case Optimal Join Algorithm. Veldhuizen. (ICDT 2014 - Best Newcomer)
      ● Skew Strikes Back: New Developments in the Theory of Join Algorithms. Ngo, Ré, Rudra. (Invited to SIGMOD Record 2013)
      ● Worst Case Optimal Join Algorithms. Ngo, Porat, Ré, Rudra. (PODS 2012 - Best Paper)
  • 53. LeapFrog Join
      Table 1 ID: 0 1 3 4 5 6 7 8 9 11
      Table 2 ID: 0 2 6 7 8 9
      Table 3 ID: 2 4 5 8 10

      Table 1  Table 2  Table 3  Action
      0        0        2        Table 1: Seek 2
      3        0        2        Table 2: Seek 3
      3        6        2        Table 3: Seek 6
      3        6        8        Table 1: Seek 8
      8        6        8        Table 2: Seek 8
      8        8        8        Emit, Table 3: Next
      8        8        10       Table 1: Seek 10
      11       8        10       Table 2: Seek 11
      END

      Results (Table 1, Table 2, Table 3): 8 8 8
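The seek/next trace on this slide can be sketched in Python as follows. This is a single-attribute leapfrog intersection over sorted, duplicate-free ID lists, not the full Leapfrog Triejoin from the cited paper; the function name and the use of `bisect_left` as the "seek" operation are assumptions made for illustration, and the order in which tables are visited may differ slightly from the slide.

```python
from bisect import bisect_left
from typing import List


def leapfrog_join(tables: List[List[int]]) -> List[int]:
    """Intersect sorted, duplicate-free ID lists by leapfrogging over gaps."""
    if not tables or any(not table for table in tables):
        return []
    positions = [0] * len(tables)
    results: List[int] = []
    # No table can match anything below the largest first key.
    target = max(table[0] for table in tables)
    matched = 0  # consecutive tables currently positioned on `target`
    i = 0
    while True:
        table = tables[i]
        # Seek: jump to the first key >= target, skipping the gap in one step.
        positions[i] = bisect_left(table, target, positions[i])
        if positions[i] == len(table):
            return results  # this table is exhausted, no more matches
        key = table[positions[i]]
        if key == target:
            matched += 1
            if matched == len(tables):
                results.append(key)  # every table agrees: emit
                positions[i] += 1    # Next: move past the emitted key
                if positions[i] == len(table):
                    return results
                target = table[positions[i]]
                matched = 1
        else:
            target = key  # overshoot: the other tables must catch up
            matched = 1
        i = (i + 1) % len(tables)


table_1 = [0, 1, 3, 4, 5, 6, 7, 8, 9, 11]
table_2 = [0, 2, 6, 7, 8, 9]
table_3 = [2, 4, 5, 8, 10]
print(leapfrog_join([table_1, table_2, table_3]))  # [8]
```

Unlike the pairwise plan on the previous slide, no intermediate result such as 0 6 7 8 9 is ever built; the only value produced is the final match.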
  • 54. More than 3 Tables
      [Figure: leapfrog seeks (steps 1 to 7) across four sorted columns: Brand, Category, Retailer, Rating]
      Worst-Case Optimal Joins take advantage of sorted keys and gaps in the data to eliminate intermediate results, speed up queries and get rid of the Join problem.
  • 55. How do you model Flight Data in Legacy GraphDBs?
  • 56. How do you model Flight Data? Don’t we care about Flights only on particular Days?
  • 57. How do you model Flight Data? Group Destinations together!
  • 58. How do you model Flight Data? OMG WAT!
  • 59. Reduce the Search Space
      [Figure: leapfrog seeks (steps 1 to 7) across four sorted columns: Airport, Day, Flight, Destination]
      What if you wanted to earn miles on your frequent flyer program and filter by Airline? No problem here, the more joins the merrier.
  • 64. What’s wrong with NULL?
      Query Optimizers hate Nulls. The 3-valued logic causes major headaches.
      ● (a OR NOT(a)) != True
          SELECT * FROM parts WHERE (price <= 99) OR (price > 99)
          SELECT * FROM parts WHERE (price <= 99) OR (price > 99) OR isNull(price)
      ● Aggregation requires special cases
          SELECT AVG(height) FROM parts
      ● Outer Joins are not commutative: a x b != b x a
          SELECT orders.id, parts.id FROM orders LEFT OUTER JOIN parts ON parts.id = orders.part_id
          SELECT orders.id, parts.id FROM parts LEFT OUTER JOIN orders ON parts.id = orders.part_id
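The first two NULL problems on this slide can be reproduced with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `parts` schema and the three sample rows are invented for illustration, and SQLite's `price IS NULL` stands in for the slide's `isNull(price)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, price REAL, height REAL)")
conn.executemany(
    "INSERT INTO parts (id, price, height) VALUES (?, ?, ?)",
    [(1, 50.0, 10.0), (2, 150.0, 20.0), (3, None, None)],  # part 3 has NULLs
)


def scalar(sql: str):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


total_rows = scalar("SELECT COUNT(*) FROM parts")
# "price <= 99 OR price > 99" looks like a tautology, but it is UNKNOWN for NULL.
either_side = scalar(
    "SELECT COUNT(*) FROM parts WHERE (price <= 99) OR (price > 99)"
)
with_null_check = scalar(
    "SELECT COUNT(*) FROM parts "
    "WHERE (price <= 99) OR (price > 99) OR price IS NULL"
)
# AVG silently skips NULL heights: (10 + 20) / 2, not (10 + 20) / 3.
avg_height = scalar("SELECT AVG(height) FROM parts")

print(total_rows, either_side, with_null_check, avg_height)  # 3 2 3 15.0
```

An optimizer that wanted to simplify the first predicate to "always true" would change the result from 2 rows to 3, which is the kind of special case the slide is complaining about.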
  • 67. Sets vs Bags
      Sets have Unique Values: {1,2,3}, {8,3,4}
      Bags allow Duplicate Values: {1,2,2,3}, {3,3,3,3}
      ● Queries that use only ANDs (no ORs) are called “conjunctive queries”
      ● Conjunctive Queries under Set Semantics are Much Easier to Optimize
      Query Optimizers hate Bags. Duplicates cause major headaches.
  • 70. Traditional Query Optimizers
      • Predicate pushdown (push selection through join)
      • Projection pushdown (push projection through join)
      • Aggregation pushdown
      • Their “pull up” counterparts
      • Split conjunctive predicates (split AND statements)
      • Replace cartesian products (use inner joins with predicates)
      • (Un)Nesting Sub-Queries
      • Etc.
  • 73. Math You learned this in middle school
      • 1 + (2 + 3) = (1 + 2) + 3
      • 3 + 4 = 4 + 3
      • 3 + 0 = 3
      • 1 + (-1) = 0
      • 2 x (3 x 4) = (2 x 3) x 4
      • 2 x 5 = 5 x 2
      • 2 x 1 = 2
      • 2 x 0.5 = 1
      • 2 x (3 + 4) = (2 x 3) + (2 x 4)
      • (3 + 4) x 2 = (3 x 2) + (4 x 2)
  • 74. Math You learned this in high school
      • a + (b + c) = (a + b) + c
      • a + b = b + a
      • a + 0 = a
      • a + (-a) = 0
      • a x (b x c) = (a x b) x c
      • a x b = b x a
      • a x 1 = a
      • a x a^-1 = 1, a != 0
      • a x (b + c) = (a x b) + (a x c)
      • (a + b) x c = (a x c) + (b x c)
  • 75. Math You forgot this in high school
      • Addition:
          • Associativity: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c
          • Commutativity: a ⊕ b = b ⊕ a
          • Identity: a ⊕ ō = a
          • Inverse: a ⊕ (-a) = ō
      • Multiplication:
          • Associativity: a ⊗ (b ⊗ c) = (a ⊗ b) ⊗ c
          • Commutativity: a ⊗ b = b ⊗ a
          • Identity: a ⊗ ī = a
          • Inverse: a ⊗ a^-1 = ī
      • Distribution of Multiplication over Addition:
          • a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c)
          • (a ⊕ b) ⊗ c = (a ⊗ c) ⊕ (b ⊗ c)
  • 76. Example 1
      Query: find the count of the combined rows a, b, c in tables R, S and T
      def result = count[a,b,c: R(a) and S(b) and T(c)]
      Mathematical Representation: [formula shown as an image on the slide]
  • 80. Example 1
      Query: count the number of combined rows a, b, c in tables R, S and T
      def result = count[a,b,c: R(a) and S(b) and T(c)]
      Optimized Query:
      def result = count[R] * count[S] * count[T]
      n^3 is much slower than 3n
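A small Python check of this rewrite, using made-up contents for R, S and T (the slide does not show any data). The naive form enumerates every (a, b, c) combination; the optimized form only needs the three sizes.

```python
from itertools import product

# Hypothetical unary relations; only their sizes matter here.
R = ["r1", "r2", "r3"]
S = ["s1", "s2"]
T = ["t1", "t2", "t3", "t4"]

# count[a,b,c: R(a) and S(b) and T(c)]: touches |R| * |S| * |T| combinations.
naive_count = sum(1 for _ in product(R, S, T))

# count[R] * count[S] * count[T]: touches |R| + |S| + |T| rows.
optimized_count = len(R) * len(S) * len(T)

print(naive_count, optimized_count)  # 24 24
```

The rewrite is valid because counting is a sum of ones and multiplication distributes over addition, which is the law listed on the earlier "Math" slides.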
  • 81. Example 2
      Query: find the minimum sum of rows a, b, c in tables R, S and T:
      def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]
      Mathematical Representation: [formula shown as an image on the slide]
  • 83. Example 2
      Query: find the minimum sum of rows a, b, c in tables R, S and T:
      def result = min[a,b,c,v: v = R[a] + S[b] + T[c]]
      Optimized Query:
      def result = min[R] + min[S] + min[T]
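The same check for the minimum-sum rewrite, again with invented values for R, S and T. Here `min` plays the role of addition and `+` the role of multiplication, so the distributive law still applies.

```python
from itertools import product

# Hypothetical functional relations mapping a key to a numeric value.
R = {"a1": 7, "a2": 3, "a3": 9}
S = {"b1": 4, "b2": 8}
T = {"c1": 6, "c2": 2, "c3": 5}

# min[a,b,c,v: v = R[a] + S[b] + T[c]]: evaluates every combination.
naive_min = min(R[a] + S[b] + T[c] for a, b, c in product(R, S, T))

# min[R] + min[S] + min[T]: one pass over each relation.
optimized_min = min(R.values()) + min(S.values()) + min(T.values())

print(naive_min, optimized_min)  # 9 9
```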
  • 84. Shortest Path from A to F
      [Figure: weighted graph on nodes A, B, C, D, E, F]
      AEF = 9 + 4 = 13
      ABDF = 1 + 6 + 5 = 12
      ABCDF = 1 + 2 + 3 + 5 = 11
      min{13, 12, 11} = 11
  • 85. Maximum Reliability from A to F
      [Figure: the same graph with edge reliabilities]
      AEF = 0.4 x 0.8 = 0.32
      ABDF = 0.9 x 0.2 x 0.7 = 0.126
      ABCDF = 0.9 x 0.9 x 1.0 x 0.7 = 0.567
      max{0.32, 0.126, 0.567} = 0.567
  • 86. Words from A to F
      [Figure: the same graph with a letter on each edge]
      AEF = A · T = AT
      ABDF = T · H · E = THE
      ABCDF = T · I · M · E = TIME
      union{at, the, time} = at the time
  • 87. Math You skipped this in college
      • min { (9 + 4), (1 + 6 + 5), (1 + 2 + 3 + 5) }
      • max { (0.4 x 0.8), (0.9 x 0.2 x 0.7), (0.9 x 0.9 x 1.0 x 0.7) }
      • union { (A · T), (T · H · E), (T · I · M · E) }
  • 88. Math You skipped this in college
      • ⊕ { (9 ⊗ 4), (1 ⊗ 6 ⊗ 5), (1 ⊗ 2 ⊗ 3 ⊗ 5) }
      • ⊕ { (0.4 ⊗ 0.8), (0.9 ⊗ 0.2 ⊗ 0.7), (0.9 ⊗ 0.9 ⊗ 1.0 ⊗ 0.7) }
      • ⊕ { (A ⊗ T), (T ⊗ H ⊗ E), (T ⊗ I ⊗ M ⊗ E) }
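A sketch of the point these slides make: one evaluation routine, parameterized by ⊗ (combine along a path) and ⊕ (choose across paths), computes both the shortest path and the maximum reliability. The edge weights are read off the path sums on the slides; the `evaluate` helper and its parameter names are illustrative assumptions.

```python
import math
from functools import reduce

# The three A-to-F paths from the slides, as lists of edges.
PATHS = [
    ["AE", "EF"],
    ["AB", "BD", "DF"],
    ["AB", "BC", "CD", "DF"],
]

# Edge weights inferred from the per-path arithmetic on the slides.
DISTANCE = {"AB": 1, "BC": 2, "CD": 3, "BD": 6, "DF": 5, "AE": 9, "EF": 4}
RELIABILITY = {"AB": 0.9, "BC": 0.9, "CD": 1.0, "BD": 0.2,
               "DF": 0.7, "AE": 0.4, "EF": 0.8}


def evaluate(paths, weights, otimes, oplus):
    """Combine weights along each path with otimes, then pick with oplus."""
    per_path = [reduce(otimes, (weights[edge] for edge in path)) for path in paths]
    return reduce(oplus, per_path)


# (min, +) semiring: shortest path.
shortest = evaluate(PATHS, DISTANCE, lambda x, y: x + y, min)
# (max, x) semiring: most reliable path.
most_reliable = evaluate(PATHS, RELIABILITY, lambda x, y: x * y, max)

print(shortest)                            # 11
print(math.isclose(most_reliable, 0.567))  # True
```

Only the two operators change between the problems; the structure of the computation is identical, which is what lets one optimizer rule cover all of them.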
  • 89. Example 3
      Query: count the number of 3-hop paths per node in a graph
      def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)
      def result[a] = count[path3[a]]
      Mathematical Representation: [formula shown as an image on the slide]
      [Figure: nodes A, B, C, D]
  • 91. Example 3
      Query: count the number of 3-hop paths per node in a graph
      def path3(a, b, c, d) = edge(a,b) and edge(b,c) and edge(c,d)
      def result[a] = count[path3[a]]
      Optimized Query:
      def path1[c] = count[edge[c]]
      def path2[b] = sum[path1[c] for c in edge[b]]
      def result[a] = sum[path2[b] for b in edge[a]]
      [Figure: nodes A, B, C, D]
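The two formulations of Example 3 can be compared in Python. The edge list below is invented for illustration and is assumed to contain no duplicate edges; the function names mirror the Rel definitions but are otherwise hypothetical.

```python
from collections import defaultdict

# Hypothetical directed graph (no duplicate edges).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 1)]


def build_adjacency(edges):
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    return adjacency


def count_3hop_naive(edges):
    """Materialize every (a, b, c, d) combination, then count per a."""
    adjacency = build_adjacency(edges)
    counts = defaultdict(int)
    for a in list(adjacency):
        for b in adjacency[a]:
            for c in adjacency.get(b, []):
                for _d in adjacency.get(c, []):
                    counts[a] += 1
    return dict(counts)


def count_3hop_optimized(edges):
    """Push the count inward: one number per node at each hop."""
    adjacency = build_adjacency(edges)
    nodes = {node for edge in edges for node in edge}
    path1 = {c: len(adjacency.get(c, [])) for c in nodes}
    path2 = {b: sum(path1[c] for c in adjacency.get(b, [])) for b in nodes}
    path3 = {a: sum(path2[b] for b in adjacency.get(a, [])) for a in nodes}
    return {a: n for a, n in path3.items() if n > 0}


print(count_3hop_naive(EDGES))      # {1: 3, 2: 3, 3: 2, 4: 3}
print(count_3hop_optimized(EDGES))  # {1: 3, 2: 3, 3: 2, 4: 3}
```

The optimized version never enumerates a path; each hop does work proportional to the number of edges.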
  • 92. Semantic Query Optimizer: It knows math!
      • Compute Discrete Fourier Transform in Fast Fourier Transform-time
      • Junction Tree Algorithm for inference in Probabilistic Graphical Models
      • Message passing, belief propagation
      • Viterbi Algorithm, forward/backward for Hidden Markov Models most probable paths
      • Counting sub-graph patterns (motifs)
      • Yannakakis Algorithm for acyclic conjunctive queries in Polynomial Time
      • Fractional hypertree-width time algorithm for Constraint Satisfaction Problems
      • Best known results for Conjunctive Queries and Quantified Conjunctive Queries
  • 93. Semantic Query Optimizer: It knows math!
      • This optimizer produces much better code than the average developer because it knows a ton more math than the average developer.
      • Maryam Mirzakhani
      • Terence Tao
      • Ramanujan
      • Katherine Goble
      • Good Will Hunting
  • 95. Recursion
      def reachable = edge; reachable.edge
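Read as "reachable is every edge, plus every reachable pair extended by one more edge", the one-line definition above can be sketched in Python as a fixpoint loop. The sample edges and the function name are assumptions for illustration.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(edge: Set[Pair]) -> Set[Pair]:
    """Repeat `reachable = edge; reachable.edge` until nothing new appears."""
    reachable = set(edge)  # first branch: every edge is reachable
    while True:
        # second branch: compose reachable with edge on the shared middle node
        extended = {
            (x, z)
            for (x, y) in reachable
            for (y2, z) in edge
            if y == y2
        }
        new_pairs = extended - reachable
        if not new_pairs:
            return reachable
        reachable |= new_pairs


edge = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edge)
print(sorted(closure))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```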
  • 96. How many Paths are there from the top left node to the bottom right node? 2 Paths. 6 Paths.
  • 97. def number_of_paths_of_length(node_number, path_length, path_count) =
          node_number=1, path_length=0, path_count=1
      def number_of_paths_of_length[node_number, path_length] =
          sum[other_node, paths_of_length :
              paths_of_length = number_of_paths_of_length[other_node, path_length - 1]
              and edge(other_node, node_number)]
      def output = number_of_paths_of_length[number_of_nodes, 2 * lattice_size]
  • 98. Naive recursion, iteration 1
      @function @transient
      def :_intermediate#0(other_node#1, path_length#0, _t#0) =
          reduce[(_x#0, _y#0, _z#0) : :rel_primitive_add(_x#0, _y#0, _z#0),
                 (x#8, paths_of_length#1) : :number_of_paths_of_length(other_node#1, x#8, paths_of_length#1) and :rel_primitive_add(1, x#8, path_length#0),
                 (_no_init#0) : false](_t#0)
      @function @transient
      def :_intermediate#1(node_number#0, path_length#0, path_count#0) =
          reduce[(_x#1, _y#1, _z#1) : :rel_primitive_add(_x#1, _y#1, _z#1),
                 (other_node#1, _t#0) : :edge(other_node#1, node_number#0) and :_intermediate#0(other_node#1, path_length#0, _t#0),
                 (_no_init#1) : false](path_count#0)
      def :number_of_paths_of_length(node_number#0, path_length#0, path_count#0) =
          :_base_case#0(node_number#0, path_length#0, path_count#0) or
          :_intermediate#1(node_number#0, path_length#0, path_count#0)
      Evaluating `_intermediate#0`: (1, 1) => (1,)
      Evaluating `_intermediate#1`: (2, 1) => (1,) (4, 1) => (1,)
      Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (4, 1, 1)
  • 99. Naive recursion, iteration 2 (same compiled definitions as slide 98)
      Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (4, 2) => (1,)
      Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (7, 2) => (1,)
      Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (7, 2, 1)
  • 100. Naive recursion, iteration 3 (same compiled definitions as slide 98)
      Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (3, 3) => (1,) (4, 2) => (1,) (5, 3) => (2,) (7, 3) => (1,)
      Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (6, 3) => (3,) (7, 2) => (1,) (8, 3) => (3,)
      Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (6, 3, 3) (7, 2, 1) (8, 3, 3)
  • 101. Naive recursion, iteration 4 (same compiled definitions as slide 98)
      Evaluating `_intermediate#0`: (1, 1) => (1,) (2, 2) => (1,) (3, 3) => (1,) (4, 2) => (1,) (5, 3) => (2,) (6, 4) => (3,) (7, 3) => (1,) (8, 4) => (3,)
      Evaluating `_intermediate#1`: (2, 1) => (1,) (3, 2) => (1,) (4, 1) => (1,) (5, 2) => (2,) (6, 3) => (3,) (7, 2) => (1,) (8, 3) => (3,) (9, 4) => (6,)
      Evaluating `number_of_paths_of_length`: (1, 0, 1) (2, 1, 1) (3, 2, 1) (4, 1, 1) (5, 2, 2) (6, 3, 3) (7, 2, 1) (8, 3, 3) (9, 4, 6)
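The iteration traces on these slides can be reproduced with a short naive fixpoint loop in Python. This is a sketch, not the engine's evaluation strategy: the row-major node numbering (node 1 top left, edges pointing right and down) is inferred from the facts in the trace, and the function names are hypothetical.

```python
def lattice_edges(lattice_size):
    """Right and down edges of a lattice, nodes numbered row by row from 1."""
    width = lattice_size + 1
    last = width * width
    edges = set()
    for node in range(1, last + 1):
        if node % width != 0:
            edges.add((node, node + 1))      # step right
        if node + width <= last:
            edges.add((node, node + width))  # step down
    return edges


def naive_fixpoint(lattice_size):
    """Re-derive all facts from scratch each round until nothing changes."""
    edges = lattice_edges(lattice_size)
    facts = {(1, 0): 1}  # base case: node 1, path length 0, one path
    history = []
    while True:
        derived = {(1, 0): 1}
        for (other_node, length), count in facts.items():
            for src, dst in edges:
                if src == other_node:
                    key = (dst, length + 1)
                    derived[key] = derived.get(key, 0) + count
        if derived == facts:
            return facts, history
        facts = derived
        history.append(sorted((n, l, c) for (n, l), c in facts.items()))


facts, history = naive_fixpoint(2)
for iteration, snapshot in enumerate(history, start=1):
    print(iteration, snapshot)
print(facts[(9, 4)])  # 6
```

For a 2 x 2 lattice the loop records four rounds of new facts, and the final fact (9, 4, 6) matches the last entry in the slide's iteration 4.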
  • 103. Graph Analytics
 module graph_analytics[G]
 with G use node, edge
 
 def neighbor(x, y) = edge(x, y) or edge(y, x)
 def outdegree[x] = count[edge[x]]
 def degree[x] = count[neighbor[x]]
 def cn[x, y] = count[intersect[neighbor[x], neighbor[y]]] // Count of Common Neighbors
 
 def reachable = edge; reachable.edge
 def reachable_undirected = neighbor; reachable_undirected.neighbor
 
 def scc[x] = min[v: reachable(x, v) and reachable(v, x)] // Strongly Connected Component
 def wcc[x] = min[reachable_undirected[x]] // Weakly Connected Component
 
 def cosine_sim[x, y] = cn[x, y] / sqrt[degree[x] * degree[y]]
 def jaccard_sim[x, y] = cn[x, y] / (count[neighbor[x]] + count[neighbor[y]] - cn[x, y])
 …
 end
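For readers unfamiliar with Rel, a rough Python equivalent of a few of these definitions is sketched below. The helper names and the sample edges are assumptions; the Jaccard denominator is parenthesized as |N(x)| + |N(y)| - cn(x, y), the standard definition of the size of the union.

```python
import math


def neighbors(edges):
    """neighbor(x, y) = edge(x, y) or edge(y, x), stored as adjacency sets."""
    result = {}
    for x, y in edges:
        result.setdefault(x, set()).add(y)
        result.setdefault(y, set()).add(x)
    return result


def common_neighbors(nbrs, x, y):
    """cn[x, y]: number of neighbors shared by x and y."""
    return len(nbrs[x] & nbrs[y])


def cosine_sim(nbrs, x, y):
    return common_neighbors(nbrs, x, y) / math.sqrt(len(nbrs[x]) * len(nbrs[y]))


def jaccard_sim(nbrs, x, y):
    cn = common_neighbors(nbrs, x, y)
    return cn / (len(nbrs[x]) + len(nbrs[y]) - cn)


# Hypothetical graph: a and b share neighbors c and d; b also touches e.
edges = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("b", "e")}
nbrs = neighbors(edges)
print(common_neighbors(nbrs, "a", "b"))  # 2
print(jaccard_sim(nbrs, "a", "b"))       # 2 / 3
print(cosine_sim(nbrs, "a", "b"))        # 2 / sqrt(6)
```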
  • 104. Graph Algorithms: Betweenness Centrality
      One of many graph centrality measures that are useful for assessing the importance of a node.
      High Level Definition: Number of times a node appears on shortest paths within a network.
      Why it’s Useful: Identifies which nodes control information flow between different areas of the graph; also called “Bridge Nodes”.
      Business Use-Cases:
        Communication Analysis: Identify important people who communicate across different groups.
        Retail Purchase Analysis: Identify which products introduce customers to new categories.
  • 105. Betweenness Centrality Computation
      Brandes Algorithm is applied as follows:
      1. For each pair of nodes, compute all shortest paths and capture the nodes (excluding endpoints) on those paths.
      2. For each pair of nodes, assign each node along the path a value of one if there is only one shortest path, or the fractional contribution (1/n) if there are n shortest paths.
      3. Sum the values from step 2 for each node; this is the Betweenness Centrality.
  • 106. Betweenness Centrality Implementation
      // Shortest path between s and t when they are the same is 0.
      def shortest_path[s, t] = Min[
          v, w: (shortest_path(s, t, w) and v = 1) or
                (w = shortest_path[s,v] +1 and E(v, t))
      ]

      // When s and t are the same, there is only one shortest path between
      // them, namely the one with length 0.
      def nb_shortest(s, t, n) = V(s) and V(t) and s = t and n = 1

      // When s and t are *not* the same, it is the sum of the number of shortest
      // paths between s and v for all the v's adjacent to t and on the shortest
      // path between s and t.
      def nb_shortest(s, t, n) =
          s != t and
          n = sum[v, m:
              shortest_path[s, v] + 1 = shortest_path[s, t] and E(v, t) and nb_shortest(s, v, m)
          ]

      // sum over all t's such that there is an edge between v and t,
      // and v is on the shortest path between s and t
      def C[s, v] = sum[t, r:
          E(v, t) and shortest_path[s, t] = shortest_path[s, v] + 1 and
          (
              a = C[s, t] or
              not C(s, t, _) and a = 0.0
          ) and
          r = (nb_shortest[s, v] / nb_shortest[s, t]) * (1 + a)
      ] from a

      // Note that below we divide by 2 because we are double counting every edge.
      def betweenness_centrality_brandes[v] = sum[s, p : s != v and C[s, v] = p]/2
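For comparison, here is a compact imperative sketch of Brandes' algorithm in Python. It assumes an undirected, unweighted graph with no duplicate edges, given as a node list and an edge list; the function and variable names are illustrative. The `dependency` update is the same recurrence as `C[s, v]` above, and the final division by 2 plays the same role as in `betweenness_centrality_brandes`.

```python
from collections import deque


def betweenness_centrality(nodes, edges):
    """Brandes' algorithm for an undirected, unweighted graph."""
    adjacency = {v: [] for v in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    centrality = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s: distances and shortest-path counts.
        distance = {s: 0}
        num_paths = {v: 0 for v in nodes}
        num_paths[s] = 1
        predecessors = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)

        # Accumulate dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in predecessors[w]:
                share = num_paths[v] / num_paths[w]
                dependency[v] += share * (1.0 + dependency[w])
            if w != s:
                centrality[w] += dependency[w]

    # Each unordered pair was counted from both of its endpoints.
    return {v: c / 2.0 for v, c in centrality.items()}


path = betweenness_centrality(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(path)  # {'a': 0.0, 'b': 1.0, 'c': 0.0}
```

In the three-node path, b sits on the only shortest path between a and c, so it scores 1 and the endpoints score 0.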
  • 107. Betweenness Centrality ReComputation: After an incremental update to the data, recomputing Betweenness Centrality takes only a few seconds, whereas other systems need to recompute over the entire graph.
  • 108. Algorithm Change ReComputation: Incremental updates to the code are also recomputed incrementally, whereas other systems need to recompute the entire algorithm.
  • 110. Incremental Maintenance
      1. Dependency tracking to figure out which views are affected by a change.
      2. Demand-driven execution to only compute what users are actively interested in.
      3. Differential computation to incrementally maintain even general recursion.
      4. Semantic optimization to recover better maintenance algorithms where possible.