Graph Database Prototyping @ AMS GraphDB meetup

Agenda for Tonight
• Building a Graph Database Prototype
• 3 parts
  – Graph database & modeling concepts
  – Prototyping tools & import
  – Graph querying with Cypher

Data Modeling With Neo4j

Topics
• Graph model building blocks
• Quick intro to Cypher
• Example modeling process
• Modeling tips
• Recipes for common modeling scenarios
• Refactoring
• Test-driven data modeling

Graph Model Building Blocks

Property Graph Data Model

Four Building Blocks
• Nodes
• Relationships
• Properties
• Labels

Nodes

Nodes
• Used to represent entities and complex value types in your domain
• Can contain properties
  – Used to represent entity attributes and/or metadata (e.g. timestamps, version)
  – Key-value pairs
    • Java primitives
    • Arrays
    • null is not a valid value
  – Every node can have different properties
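
The points on this slide can be sketched in Cypher. This is only an illustration: the label, property names, and values below are hypothetical and do not come from the deck.

```cypher
// Two nodes with the same label but different sets of properties.
CREATE (ian:Person {
  name: 'Ian',                       // entity attribute
  languages: ['English', 'Dutch'],   // array property
  version: 1,                        // metadata
  createdAt: timestamp()             // metadata: milliseconds since epoch
})
CREATE (lucy:Person {name: 'Lucy'})  // fewer properties on this node
// null is never stored: setting a property to null removes it.
SET ian.version = null
RETURN ian, lucy
```

After this runs, `ian` no longer has a `version` property, which is how "null is not a valid value" shows up in practice.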

Entities and Value Types
• Entities
  – Have unique conceptual identity
  – Change attribute values, but identity remains the same
• Value types
  – No conceptual identity
  – Can substitute for each other if they have the same value
    • Simple: single value (e.g. colour, category)
    • Complex: multiple attributes (e.g. address)

Relationships

Relationships
• Every relationship has a name and a direction
  – Add structure to the graph
  – Provide semantic context for nodes
• Can contain properties
  – Used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node
  – No dangling relationships
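
A small sketch of a named, directed relationship that carries a property. The `level` property and its value are assumptions made for illustration. `HAS_SKILL`, `Person`, and `Skill` are the names used later in the deck.

```cypher
// Create (or reuse) the two end nodes, then connect them.
MERGE (ian:Person {name: 'Ian'})
MERGE (neo4j:Skill {name: 'Neo4j'})
MERGE (ian)-[r:HAS_SKILL]->(neo4j)   // name: HAS_SKILL, direction: ian -> neo4j
SET r.level = 'expert'               // property describing the relationship's quality
RETURN ian, r, neo4j;

// The direction is fixed when the relationship is created,
// but a MATCH pattern may leave it out and match either way.
MATCH (p:Person)-[r:HAS_SKILL]-(s:Skill)
RETURN p.name, r.level, s.name;
```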

Relationships (continued)
Nodes can have more than one relationship
Nodes can be connected by more than one relationship
Self relationships are allowed
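
The three statements above might look like this in Cypher. `FRIEND` appears in the deck's backup slides. `COLLEAGUE_OF` and `EMAILED` are hypothetical relationship names.

```cypher
MERGE (peter:Person {name: 'Peter'})
MERGE (lucy:Person {name: 'Lucy'})
// Two different relationships between the same pair of nodes.
MERGE (peter)-[:FRIEND]->(lucy)
MERGE (peter)-[:COLLEAGUE_OF]->(lucy)
// A self relationship: start node and end node are the same node.
MERGE (peter)-[:EMAILED]->(peter)
RETURN peter, lucy
```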

Variable Structure
• Relationships are defined with regard to node instances, not classes of nodes
  – Two nodes representing the same kind of “thing” can be connected in very different ways
• Allows for structural variation in the domain
  – Contrast with relational schemas, where foreign key relationships apply to all rows in a table
• No need to use null to represent the absence of a connection

Labels

Labels
• Every node can have zero or more labels
• Used to represent roles (e.g. user, product, company)
  – Group nodes
  – Allow us to associate indexes and constraints with groups of nodes
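
A hedged sketch of labels in use. The `Customer` label and the constraint name `company_name_unique` are hypothetical. The constraint statement uses the `FOR ... REQUIRE` form of recent Neo4j releases, so adjust it for the version you run.

```cypher
// A node may carry several labels, one per role it plays.
CREATE (acme:Company:Customer {name: 'Acme'});

// A uniqueness constraint is attached to a label.
// Older releases wrote this as:
//   CREATE CONSTRAINT ON (c:Company) ASSERT c.name IS UNIQUE
CREATE CONSTRAINT company_name_unique
FOR (c:Company) REQUIRE c.name IS UNIQUE;

// Labels group nodes, so a query can be restricted to one role.
MATCH (c:Customer)
RETURN c.name;
```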

Four Building Blocks
• Nodes
  – Entities
• Relationships
  – Connect entities and structure domain
• Properties
  – Entity attributes, relationship qualities, and metadata
• Labels
  – Group nodes by role

Designing a Graph Model

Models
Purposeful abstraction of a domain designed to satisfy particular application/end-user goals
Images: en.wikipedia.org

Design for Queryability
(Diagram labels: Model, Query)

Method
1. Identify application/end-user goals
2. Figure out what questions to ask of the domain
3. Identify entities in each question
4. Identify relationships between entities in each question
5. Convert entities and relationships to paths
  – These become the basis of the data model
6. Express questions as graph patterns
  – These become the basis for queries

Application/End-User Goals
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge

Questions To Ask of the Domain
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge
Which people, who work for the same company as me, have similar skills to me?

Identify Entities
Which people, who work for the same company as me, have similar skills to me?
Person
Company
Skill

Identify Relationships Between Entities
Which people, who work for the same company as me, have similar skills to me?
Person WORKS_FOR Company
Person HAS_SKILL Skill

Convert to Cypher Paths
Person WORKS_FOR Company
Person HAS_SKILL Skill
(:Person)-[:WORKS_FOR]->(:Company),
(:Person)-[:HAS_SKILL]->(:Skill)
(Callouts on the slide: Relationship, Label)

Consolidate Paths
(:Person)-[:WORKS_FOR]->(:Company),
(:Person)-[:HAS_SKILL]->(:Skill)
(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

Create Person Subgraph
MERGE (c:Company{name:'Acme'})
MERGE (p:Person{name:'Ian'})
MERGE (s1:Skill{name:'Java'})
MERGE (s2:Skill{name:'C#'})
MERGE (s3:Skill{name:'Neo4j'})
CREATE UNIQUE (c)<-[:WORKS_FOR]-(p),
              (p)-[:HAS_SKILL]->(s1),
              (p)-[:HAS_SKILL]->(s2),
              (p)-[:HAS_SKILL]->(s3)
RETURN c, p, s1, s2, s3
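
`CREATE UNIQUE` is no longer supported in Neo4j 4.0 and later. On such versions the same subgraph can be built with `MERGE` alone. The following is a sketch of that substitution, not the statement shown on the slide.

```cypher
MERGE (c:Company {name: 'Acme'})
MERGE (p:Person {name: 'Ian'})
MERGE (s1:Skill {name: 'Java'})
MERGE (s2:Skill {name: 'C#'})
MERGE (s3:Skill {name: 'Neo4j'})
// MERGE on a relationship between already-bound nodes
// creates the relationship only if it does not exist yet.
MERGE (p)-[:WORKS_FOR]->(c)
MERGE (p)-[:HAS_SKILL]->(s1)
MERGE (p)-[:HAS_SKILL]->(s2)
MERGE (p)-[:HAS_SKILL]->(s3)
RETURN c, p, s1, s2, s3
```

Running it twice should leave a single copy of each node and relationship, which is the behaviour `CREATE UNIQUE` was used for here.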

Candidate Data Model
(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)

Express Question as Graph Pattern
Which people, who work for the same company as me, have similar skills to me?

Cypher Query
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Graph Pattern
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Anchor Pattern in Graph
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC
If an index for Person.name exists, Cypher will use it
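
A sketch of how such an index could be created and checked. The index name `person_name` is hypothetical, and the statement uses the `FOR ... ON` form of Neo4j 4.x and later. The literal `'Ian'` stands in for the `{name}` parameter so the statement can be run as-is.

```cypher
// Older releases wrote this as: CREATE INDEX ON :Person(name)
CREATE INDEX person_name FOR (p:Person) ON (p.name);

// EXPLAIN returns the execution plan without running the query.
// With the index in place, the plan would typically locate `me`
// through an index seek on Person(name) rather than a label scan.
EXPLAIN
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = 'Ian'
RETURN colleague.name AS name, count(skill) AS score;
```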

Create Projection of Results
Which people, who work for the same company as me, have similar skills to me?
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

First Match

Second Match

Third Match

Running the Query
+-----------------------------------+
| name   | score | skills           |
+-----------------------------------+
| "Lucy" | 2     | ["Java","Neo4j"] |
| "Bill" | 1     | ["Neo4j"]        |
+-----------------------------------+
2 rows
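
The deck does not show the data behind this result. One dataset that would produce it, assuming Ian's subgraph from the "Create Person Subgraph" slide and two hypothetical colleagues at Acme, is sketched below. The literal `'Ian'` replaces the `{name}` parameter, which newer Neo4j versions write as `$name`.

```cypher
// Assumed sample data: Lucy shares Java and Neo4j with Ian, Bill shares Neo4j.
MERGE (c:Company {name: 'Acme'})
MERGE (ian:Person {name: 'Ian'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (bill:Person {name: 'Bill'})
MERGE (java:Skill {name: 'Java'})
MERGE (csharp:Skill {name: 'C#'})
MERGE (neo4j:Skill {name: 'Neo4j'})
MERGE (ian)-[:WORKS_FOR]->(c)
MERGE (lucy)-[:WORKS_FOR]->(c)
MERGE (bill)-[:WORKS_FOR]->(c)
MERGE (ian)-[:HAS_SKILL]->(java)
MERGE (ian)-[:HAS_SKILL]->(csharp)
MERGE (ian)-[:HAS_SKILL]->(neo4j)
MERGE (lucy)-[:HAS_SKILL]->(java)
MERGE (lucy)-[:HAS_SKILL]->(neo4j)
MERGE (bill)-[:HAS_SKILL]->(neo4j);

// The query from the slides, with a literal in place of the parameter.
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = 'Ian'
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC;
```

Ian does not appear in his own results. Cypher will not traverse the same relationship twice within one MATCH, so `colleague` cannot be bound to `me`. The order of names inside each `skills` list is not guaranteed.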

From User Story to Model and Query
As an employee
I want to know who in the company has similar skills to me
So that we can exchange knowledge
Which people, who work for the same company as me, have similar skills to me?
Person WORKS_FOR Company
Person HAS_SKILL Skill
(:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
WHERE me.name = {name}
RETURN colleague.name AS name,
       count(skill) AS score,
       collect(skill.name) AS skills
ORDER BY score DESC

Modeling Tips

Properties Versus Relationships

Use Relationships When…
• You need to specify the weight, strength, or some other quality of the relationship
• AND/OR the attribute value comprises a complex value type (e.g. address)
• Examples:
  – Find all my colleagues who are expert (relationship quality) at a skill (attribute value) we have in common
  – Find all recent orders delivered to the same delivery address (complex value type)
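
The two example questions might be expressed as follows. The `level` property, the `Order` and `Address` labels, `DELIVERED_TO`, and the `year`, `number`, `street`, and `city` properties are all hypothetical names chosen for this sketch.

```cypher
// Relationship quality: HAS_SKILL carries a level, so it can be filtered on.
MATCH (company)<-[:WORKS_FOR]-(me:Person {name: 'Ian'})-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[r:HAS_SKILL]->(skill)
WHERE r.level = 'expert'
RETURN colleague.name AS name, collect(skill.name) AS expertSkills;

// Complex value type: the address is a node, so orders can share it.
MATCH (addr:Address)<-[:DELIVERED_TO]-(o:Order)
WHERE o.year = 2014                      // stand-in for "recent"
WITH addr, collect(o.number) AS orders
WHERE size(orders) > 1                   // more than one order at this address
RETURN addr.street, addr.city, orders;
```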

Use Properties When…
• There’s no need to qualify the relationship
• AND the attribute value comprises a simple value type (e.g. colour)
• Examples:
  – Find those projects written by contributors to my projects that use the same language (attribute value) as my projects
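
One possible reading of the example as a query, with the language kept as a plain property on each project. `Project`, `CONTRIBUTED_TO`, `WROTE`, and the `language` property are hypothetical names.

```cypher
MATCH (me:Person {name: 'Ian'})-[:WROTE]->(mine:Project)
      <-[:CONTRIBUTED_TO]-(contributor:Person)-[:WROTE]->(theirs:Project)
WHERE theirs.language = mine.language   // simple value compared directly
  AND theirs <> mine
RETURN DISTINCT theirs.name AS project, theirs.language AS language
```

No `Language` node is needed here, because the query only compares the value and never has to qualify the connection to it.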

If Performance is Critical…
• Small property lookup on a node will be quicker than traversing a relationship
  – But traversing a relationship is still faster than a SQL join…
• However, many small properties on a node, or a lookup on a large string or large array property will impact performance
  – Always performance test against a representative dataset

Relationship Granularity

Align With Use Cases
• Relationships are the “royal road” into the graph
• When querying, well-named relationships help discover only what is absolutely necessary
  – And eliminate unnecessary portions of the graph from consideration

General Relationships
• Qualified by property

Specific Relationships

Best of Both Worlds
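
The diagrams for these three slides are not part of the extracted text. The sketch below is one plausible interpretation of the headings, using hypothetical `ADDRESS` and `DELIVERY_ADDRESS` relationships.

```cypher
MERGE (p:Person {name: 'Ian'})
MERGE (a:Address {street: 'Example Street 1', city: 'Amsterdam'})
// General relationship, qualified by a property.
MERGE (p)-[:ADDRESS {type: 'delivery'}]->(a)
// Specific relationship: the qualifier is moved into the name.
MERGE (p)-[:DELIVERY_ADDRESS]->(a);

// General: follows every ADDRESS relationship, then filters on the property.
MATCH (:Person {name: 'Ian'})-[r:ADDRESS]->(addr)
WHERE r.type = 'delivery'
RETURN addr;

// Specific: only DELIVERY_ADDRESS relationships are followed at all.
MATCH (:Person {name: 'Ian'})-[:DELIVERY_ADDRESS]->(addr)
RETURN addr;
```

Keeping both relationships is one way to read "best of both worlds". `ADDRESS` answers "all addresses" and `DELIVERY_ADDRESS` answers the narrow question, at the cost of extra relationships to maintain.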

Model and Query Recipes

Events and Actions
• Often involve multiple parties
• Can include other circumstantial detail, which may be common to multiple events
• Examples
  – Patrick worked for Acme from 2001 to 2005 as a Software Developer
  – Sarah sent an email to Lucy, copying in David and Claire
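
Both examples involve more than two parties, so one way to model them is to give the event a node of its own. The `Job`, `Role`, and `Email` labels, the relationship names, and the property names below are assumptions for illustration.

```cypher
// "Patrick worked for Acme from 2001 to 2005 as a Software Developer"
MERGE (patrick:Person {name: 'Patrick'})
MERGE (acme:Company {name: 'Acme'})
MERGE (dev:Role {name: 'Software Developer'})
CREATE (job:Job {startYear: 2001, endYear: 2005})   // the event itself
CREATE (patrick)-[:EMPLOYMENT]->(job),
       (job)-[:EMPLOYER]->(acme),
       (job)-[:ROLE]->(dev);

// "Sarah sent an email to Lucy, copying in David and Claire"
MERGE (sarah:Person {name: 'Sarah'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (david:Person {name: 'David'})
MERGE (claire:Person {name: 'Claire'})
CREATE (email:Email {subject: 'Example subject'})   // the action itself
CREATE (sarah)-[:SENT]->(email),
       (email)-[:TO]->(lucy),
       (email)-[:CC]->(david),
       (email)-[:CC]->(claire);
```

The `Role` node illustrates circumstantial detail that several events can share: every other `Job` with the same role can point at the same node.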

Timeline Trees
• Discrete events
  – No natural relationships to other events
• You need to find events at differing levels of granularity
  – Between two days
  – Between two months
  – Between two minutes

Example Timeline Tree
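
The example diagram is not in the extracted text. A common shape for such a tree is year, then month, then day, with events hanging off the finest level. The labels, relationship names, and the event below are hypothetical.

```cypher
// Build (or reuse) the path for one day, then attach an event to it.
MERGE (t:Timeline {name: 'events'})
MERGE (t)-[:YEAR]->(y:Year {value: 2014})
MERGE (y)-[:MONTH]->(m:Month {value: 12})
MERGE (m)-[:DAY]->(d:Day {value: 16})
CREATE (e:Event {name: 'Example event'})
CREATE (d)-[:EVENT]->(e);

// Month-level granularity: all events in December 2014.
MATCH (:Timeline {name: 'events'})-[:YEAR]->(:Year {value: 2014})
      -[:MONTH]->(:Month {value: 12})-[:DAY]->(d:Day)-[:EVENT]->(e:Event)
RETURN d.value AS day, e.name AS event
ORDER BY day;
```

Stopping the pattern at `Year` or extending the tree with hour and minute levels changes the granularity without changing the events themselves.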

Pitfalls and Anti-Patterns

Modeling Entities as Relationships
• Limits data model evolution
  – A relationship connects two things
  – Modeling an entity as a relationship prevents it from being related to more than two things
• Smells:
  – Lots of attribute-like properties
  – Heavy use of relationship indexes
• Entities hidden in verbs:
  – E.g. emailed, reviewed

Example: Movie Reviews
• Initial requirements:
  – People review films
  – Application aggregates reviews from multiple sites

Initial Model

New Requirements
• Allow users to comment on each other’s reviews
  – Can’t connect a review to a third entity

Revised model
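
The two model diagrams are missing from the extracted text. The sketch below shows one way the refactoring could look. In the initial model the review would be a relationship such as `(:Person)-[:REVIEWED]->(:Film)`. In the revised model it becomes a node. All labels, relationship names, and values here are hypothetical.

```cypher
MERGE (lucy:Person {name: 'Lucy'})
MERGE (bill:Person {name: 'Bill'})
MERGE (film:Film {title: 'Example Film'})
// Revised model: the review is a node of its own...
CREATE (review:Review {rating: 4, source: 'example-site'})
CREATE (lucy)-[:WROTE_REVIEW]->(review)-[:REVIEW_OF]->(film)
// ...so a third entity, here another user's comment, can be attached to it.
CREATE (bill)-[:WROTE_COMMENT]->(:Comment {text: 'Agreed'})-[:COMMENT_ON]->(review)
RETURN review
```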

Model Actions in Terms of Products

Now for Some Prototyping!

Draw a Model!
E.g. using Visio, www.apcjones.com/arrows, http://graphjson.io, Omnigraffle

Creating a prototype DB out of our model?

Now for Some Queries!

Next meetup!
• January 22nd: how to create an APPLICATION on top of our newly created database

BACKUP slides: Cypher Query Language

Nodes and Relationships
()-->()

Labels and Relationship Types
(:Person)-[:FRIEND]->(:Person)

Properties
(:Person{name:'Peter'})-[:FRIEND]->(:Person{name:'Lucy'})

Identifiers
(p1:Person{name:'Peter'})-[r:FRIEND]->(p2:Person{name:'Lucy'})

Cypher
MATCH graph_pattern
WHERE binding_and_filter_criteria
RETURN results

Cypher
MATCH (p:Person)-[:FRIEND]->(friends)
WHERE p.name = 'Peter'
RETURN friends

Lookup Using Identifier + Label
MATCH (p:Person)-[:FRIEND]->(friends)
WHERE p.name = 'Peter'
RETURN friends

20141216 graph database prototyping ams meetup

  • 1. Graph Database Prototyping @ AMS GraphDB meetup
  • 2. Agenda for Tonight • Building a Graph Database Prototype • 3 parts – Graph database & modeling concepts – Prototyping tools & import – Graph querying with Cypher
  • 4. Topics • Graph model building blocks • Quick intro to Cypher • Example modeling process • Modeling Eps • Recipes for common modeling scenarios • Refactoring • Test-­‐driven data modeling
  • 7. Four Building Blocks • Nodes • RelaEonships • ProperEes • Labels
  • 9. Nodes • Used to represent en##es and complex value types in your domain • Can contain properEes – Used to represent enEty a1ributes and/or metadata (e.g. Emestamps, version) – Key-­‐value pairs • Java primiEves • Arrays • null is not a valid value – Every node can have different properEes
  • 10. EnEEes and Value Types • EnEEes – Have unique conceptual idenEty – Change aWribute values, but idenEty remains the same • Value types – No conceptual idenEty – Can subsEtute for each other if they have the same value • Simple: single value (e.g. colour, category) • Complex: mulEple aWributes (e.g. address)
  • 12. RelaEonships • Every relaEonship has a name and a direc#on – Add structure to the graph – Provide semanEc context for nodes • Can contain properEes – Used to represent quality or weight of relaEonship, or metadata • Every relaEonship must have a start node and end node – No dangling relaEonships
  • 13. RelaEonships (conEnued) Nodes can have more than one relaEonship Nodes can be connected by more than one relaEonship Self relaEonships are allowed
  • 14. Variable Structure • RelaEonships are defined with regard to node instances, not classes of nodes – Two nodes represenEng the same kind of “thing” can be connected in very different ways • Allows for structural variaEon in the domain – Contrast with relaEonal schemas, where foreign key relaEonships apply to all rows in a table • No need to use null to represent the absence of a connecEon
  • 16. Labels • Every node can have zero or more labels • Used to represent roles (e.g. user, product, company) – Group nodes – Allow us to associate indexes and constraints with groups of nodes
  • 17. Four Building Blocks • Nodes – EnEEes • RelaEonships – Connect enEEes and structure domain • ProperEes – EnEty aWributes, relaEonship qualiEes, and metadata • Labels – Group nodes by role
  • 19. Models Purposeful abstracEon of a domain designed to saEsfy parEcular applicaEon/end-­‐user goals Images: en.wikipedia.org
  • 21. Method 1. IdenEfy applicaEon/end-­‐user goals 2. Figure out what quesEons to ask of the domain 3. IdenEfy enEEes in each quesEon 4. IdenEfy relaEonships between enEEes in each quesEon 5. Convert enEEes and relaEonships to paths – These become the basis of the data model 6. Express quesEons as graph paWerns – These become the basis for queries
  • 22. ApplicaEon/End-­‐User Goals As an employee I want to know who in the company has similar skills to me So that we can exchange knowledge
  • 23. QuesEons To Ask of the Domain As an employee I want to know who in the company has similar skills to me So that we can exchange knowledge Which people, who work for the same company as me, have similar skills to me?
  • 24. IdenEfy EnEEes Which people, who work for the same company as me, have similar skills to me? Person Company Skill
  • 25. IdenEfy RelaEonships Between EnEEes Which people, who work for the same company as me, have similar skills to me? Person WORKS_FOR Company Person HAS_SKILL Skill
  • 26. Convert to Cypher Paths RelaEonship Person WORKS_FOR Company Person HAS_SKILL Skill Label (:Person)-[:WORKS_FOR]->(:Company), (:Person)-[:HAS_SKILL]->(:Skill)
  • 27. Consolidate Paths (:Person)-[:WORKS_FOR]->(:Company), (:Person)-[:HAS_SKILL]->(:Skill) (:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)
  • 28. Create Person Subgraph MERGE (c:Company{name:'Acme'}) MERGE (p:Person{name:'Ian'}) MERGE (s1:Skill{name:'Java'}) MERGE (s2:Skill{name:'C#'}) MERGE (s3:Skill{name:'Neo4j'}) CREATE UNIQUE (c)<-[:WORKS_FOR]-(p), (p)-[:HAS_SKILL]->(s1), (p)-[:HAS_SKILL]->(s2), (p)-[:HAS_SKILL]->(s3) RETURN c, p, s1, s2, s3
  • 29. Candidate Data Model (:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)
  • 30. Express QuesEon as Graph PaWern Which people, who work for the same company as me, have similar skills to me?
  • 31. Cypher Query Which people, who work for the same company as me, have similar skills to me?
      MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
            (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
      WHERE me.name = {name}
      RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills
      ORDER BY score DESC
  • 32. Graph Pattern (the MATCH clause of the query on slide 31)
      MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
            (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
  • 33. Anchor Pattern in Graph (the WHERE clause of the query on slide 31)
      WHERE me.name = {name}
      If an index for Person.name exists, Cypher will use it
  • 34. Create Projection of Results (the RETURN clause of the query on slide 31)
      RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills
      ORDER BY score DESC
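The note on slide 33 depends on an index existing for the name property of Person nodes. As a hedged sketch, the statement below uses the schema-index syntax of the Neo4j 2.x/3.x era this deck appears to date from; the commented alternative is the form used by Neo4j 4.x and later. Which one applies depends on the server version, which the slides do not state.

```cypher
// Neo4j 2.x / 3.x syntax: index Person nodes by their name property.
CREATE INDEX ON :Person(name)

// Neo4j 4.x and later use this form instead:
// CREATE INDEX FOR (p:Person) ON (p.name)
```

With the index in place, the lookup of the starting node (me) no longer needs to scan every Person node before the pattern is expanded.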
  • 38. Running the Query
      +-----------------------------------+
      | name   | score | skills           |
      +-----------------------------------+
      | "Lucy" | 2     | ["Java","Neo4j"] |
      | "Bill" | 1     | ["Neo4j"]        |
      +-----------------------------------+
      2 rows
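The result on slide 38 presupposes colleagues whose data is not shown in this transcript. The sketch below is one hypothetical dataset that would be consistent with it: only Acme, Ian and his three skills come from slide 28, while the skills assigned to Lucy and Bill are assumptions chosen to match the scores shown.

```cypher
// Nodes. Ian, Acme and the three skills are from slide 28;
// Lucy and Bill are assumed colleagues.
MERGE (acme:Company {name: 'Acme'})
MERGE (ian:Person {name: 'Ian'})
MERGE (lucy:Person {name: 'Lucy'})
MERGE (bill:Person {name: 'Bill'})
MERGE (java:Skill {name: 'Java'})
MERGE (csharp:Skill {name: 'C#'})
MERGE (neo4j:Skill {name: 'Neo4j'})
// Everyone works for the same company.
MERGE (ian)-[:WORKS_FOR]->(acme)
MERGE (lucy)-[:WORKS_FOR]->(acme)
MERGE (bill)-[:WORKS_FOR]->(acme)
// Ian's skills (slide 28).
MERGE (ian)-[:HAS_SKILL]->(java)
MERGE (ian)-[:HAS_SKILL]->(csharp)
MERGE (ian)-[:HAS_SKILL]->(neo4j)
// Assumed skills: Lucy shares two with Ian, Bill shares one.
MERGE (lucy)-[:HAS_SKILL]->(java)
MERGE (lucy)-[:HAS_SKILL]->(neo4j)
MERGE (bill)-[:HAS_SKILL]->(neo4j)
```

Running the query from slide 31 with the name parameter set to 'Ian' against this data should rank Lucy above Bill. Ian does not appear as his own colleague, because Cypher will not match the same WORKS_FOR relationship twice within a single MATCH clause. The order of names inside the collected skills list is not guaranteed.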
  • 39. From User Story to Model and Query
      User story: As an employee, I want to know who in the company has similar skills to me, so that we can exchange knowledge
      Question: Which people, who work for the same company as me, have similar skills to me?
      Relationships: Person WORKS_FOR Company; Person HAS_SKILL Skill
      Model: (:Company)<-[:WORKS_FOR]-(:Person)-[:HAS_SKILL]->(:Skill)
      Query:
      MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
            (company)<-[:WORKS_FOR]-(colleague)-[:HAS_SKILL]->(skill)
      WHERE me.name = {name}
      RETURN colleague.name AS name, count(skill) AS score, collect(skill.name) AS skills
      ORDER BY score DESC
  • 42. Use Relationships When… • You need to specify the weight, strength, or some other quality of the relationship • AND/OR the attribute value comprises a complex value type (e.g. address) • Examples: – Find all my colleagues who are expert (relationship quality) at a skill (attribute value) we have in common – Find all recent orders delivered to the same delivery address (complex value type)
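The first example on this slide can be sketched by extending the skills query from earlier in the deck. The `level` property on HAS_SKILL and the literal name 'Ian' are assumptions made for illustration; the slides do not show how expertise is stored.

```cypher
// Assumption: each HAS_SKILL relationship carries a hypothetical
// `level` property describing the quality of the relationship.
MATCH (company)<-[:WORKS_FOR]-(me:Person)-[:HAS_SKILL]->(skill),
      (company)<-[:WORKS_FOR]-(colleague)-[r:HAS_SKILL]->(skill)
WHERE me.name = 'Ian' AND r.level = 'expert'
RETURN colleague.name AS name, collect(skill.name) AS expertIn
```

Because the skill is a node, it can be shared between people, and because the level sits on the relationship, each person can hold the same skill at a different level.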
  • 43. Use Properties When… • There’s no need to qualify the relationship • AND the attribute value comprises a simple value type (e.g. colour) • Examples: – Find those projects written by contributors to my projects that use the same language (attribute value) as my projects
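A rough sketch of the example on this slide, with the language held as a plain property on each project node. The Project label, the WROTE and CONTRIBUTED_TO relationship types, and the `language` and `name` properties are all hypothetical names, not taken from the deck.

```cypher
// Assumption: projects store their language as a simple property.
MATCH (me:Person)-[:WROTE]->(mine:Project)<-[:CONTRIBUTED_TO]-(contributor:Person),
      (contributor)-[:WROTE]->(theirs:Project)
WHERE me.name = 'Ian' AND theirs.language = mine.language
RETURN DISTINCT theirs.name AS project, theirs.language AS language
```

The comparison is a simple value check on two nodes already reached by the pattern, so there is no relationship to qualify and no extra node to traverse.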
  • 44. If Performance is Critical… • Small property lookup on a node will be quicker than traversing a relationship – But traversing a relationship is still faster than a SQL join… • However, many small properties on a node, or a lookup on a large string or large array property will impact performance – Always performance test against a representative dataset
  • 46. Align With Use Cases • Relationships are the “royal road” into the graph • When querying, well-named relationships help discover only what is absolutely necessary – And eliminate unnecessary portions of the graph from consideration
  • 47. General Relationships • Qualified by property
  • 49. Best of Both Worlds
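The transcript keeps only the titles of these slides, so the following is one plausible reading rather than the presenter's actual example. It contrasts a general relationship qualified by a property with a specific relationship type, and then stores both. The Address label, the ADDRESS and DELIVERY_ADDRESS types, and the `type` and `line1` properties are assumptions. The three statements are meant to be run separately.

```cypher
// General relationship: the qualifier is a property, so every
// ADDRESS relationship must be inspected to find delivery addresses.
MATCH (p:Person {name: 'Ian'})-[r:ADDRESS]->(a:Address)
WHERE r.type = 'delivery'
RETURN a;

// Specific relationship type: the qualifier is in the type name,
// so only the relevant relationships are followed.
MATCH (p:Person {name: 'Ian'})-[:DELIVERY_ADDRESS]->(a:Address)
RETURN a;

// Both at once: store a general and a specific relationship, so
// "all addresses" and "delivery addresses" are each a single hop.
MERGE (p:Person {name: 'Ian'})
MERGE (a:Address {line1: '1 Example Street'})
MERGE (p)-[:ADDRESS]->(a)
MERGE (p)-[:DELIVERY_ADDRESS]->(a);
```

Storing both costs an extra relationship per address, in exchange for queries that never need to filter on a relationship property.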
  • 50. Model and Query Recipes
  • 51. Events and Actions • Often involve multiple parties • Can include other circumstantial detail, which may be common to multiple events • Examples – Patrick worked for Acme from 2001 to 2005 as a Software Developer – Sarah sent an email to Lucy, copying in David and Claire
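One common way to capture an event with several parties is to give the event its own node and connect each party to it. The sketch below applies that idea to the first example on this slide; the Employment and Role labels, the relationship types and the `startYear`/`endYear` properties are illustrative assumptions, not names shown in the deck.

```cypher
// The parties and circumstantial detail, each as its own node.
MERGE (patrick:Person {name: 'Patrick'})
MERGE (acme:Company {name: 'Acme'})
MERGE (role:Role {name: 'Software Developer'})
// The event itself is a node, so it can connect to all three.
CREATE (job:Employment {startYear: 2001, endYear: 2005})
CREATE (patrick)-[:EMPLOYMENT]->(job),
       (job)-[:COMPANY]->(acme),
       (job)-[:ROLE]->(role)
```

Because the Role and Company nodes are merged rather than created, other employment events can share them, which matches the slide's point that circumstantial detail may be common to multiple events.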
  • 52. Timeline Trees • Discrete events – No natural relationships to other events • You need to find events at differing levels of granularity – Between two days – Between two months – Between two minutes
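A timeline tree can be sketched as a hierarchy of year, month and day nodes with events hanging off the leaves. Everything named below (the Timeline, Year, Month, Day and Event labels, the relationship types, and the sample event) is a hypothetical choice for illustration. The two statements are meant to be run separately.

```cypher
// Build (or reuse) the path for one day and attach an event to it.
MERGE (t:Timeline {name: 'events'})
MERGE (t)-[:YEAR]->(y:Year {value: 2014})
MERGE (y)-[:MONTH]->(m:Month {value: 12})
MERGE (m)-[:DAY]->(d:Day {value: 4})
CREATE (e:Event {description: 'Example event'})
CREATE (d)-[:EVENT]->(e);

// Find all events in one month by walking down the tree.
MATCH (:Timeline {name: 'events'})-[:YEAR]->(:Year {value: 2014})
      -[:MONTH]->(:Month {value: 12})-[:DAY]->(d)-[:EVENT]->(e)
RETURN d.value AS day, e.description AS event
ORDER BY day;
```

Querying at a coarser granularity means stopping the fixed part of the pattern higher in the tree; a finer one (hours, minutes) would add further levels below Day.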
  • 55. Modeling Entities as Relationships • Limits data model evolution – A relationship connects two things – Modeling an entity as a relationship prevents it from being related to more than two things • Smells: – Lots of attribute-like properties – Heavy use of relationship indexes • Entities hidden in verbs: – E.g. emailed, reviewed
  • 56. Example: Movie Reviews • Initial requirements: – People review films – Application aggregates reviews from multiple sites
  • 58. New Requirements • Allow users to comment on each other’s reviews – Can’t connect a review to a third entity
  • 60. Model Actions in Terms of Products
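Read against the movie review example, modeling the action in terms of its product suggests making the review a node instead of a REVIEWED relationship, so that a comment can point at it. The sketch below assumes hypothetical people, a placeholder film, and invented label, relationship and property names; none of them appear in the transcript.

```cypher
// People and film (placeholder data).
MERGE (alice:Person {name: 'Alice'})
MERGE (bob:Person {name: 'Bob'})
MERGE (film:Film {title: 'Example Film'})
// The review is the product of the "reviewed" action, so it is a node.
CREATE (review:Review {text: 'Worth watching', source: 'example-site'})
CREATE (alice)-[:WROTE_REVIEW]->(review)-[:REVIEW_OF]->(film)
// The new requirement now fits: a comment connects to the review.
CREATE (bob)-[:WROTE_COMMENT]->(:Comment {text: 'Agreed'})-[:COMMENT_ON]->(review)
```

With the review as a node, further requirements such as ratings of reviews or links to the originating site can be added as new relationships without reshaping the existing data.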
  • 61. Now for Some Prototyping!
  • 62. Draw a Model! E.g. using Visio, www.apcjones.com/arrows, http://graphjson.io, OmniGraffle
  • 63. Creating a prototype DB out of our model?
  • 64. Now for Some Queries!
  • 65. Next meetup! • January 22nd: how to create an APPLICATION on top of our newly created database
  • 66. BACKUP slides: Cypher Query Language
  • 68. Labels and Relationship Types (:Person)-[:FRIEND]->(:Person)
  • 71. Cypher MATCH graph_pattern WHERE binding_and_filter_criteria RETURN results
  • 72. Cypher MATCH (p:Person)-[:FRIEND]->(friends) WHERE p.name = 'Peter' RETURN friends
  • 73. Lookup Using Identifier + Label MATCH (p:Person)-[:FRIEND]->(friends) WHERE p.name = 'Peter' RETURN friends