SAN FRANCISCO | 10.22.2014
Scaling Neo4j Applications
@iansrobinson
The Burden of Success
• More users
• Larger datasets
• More concurrent requests
• More complex queries
Scaling is a Feature
• It doesn’t come for free
• Conditions of success:
  – Understand current needs
    • Design for an order of magnitude growth
  – Iterative and incremental development
  – Unit tests
    • Bedrock of asserted behaviour
  – Performance tests
Overview
• Scaling Reads
  – Latency
  – Throughput
• Scaling Writes
• Hardware
Scaling Reads - Latency
Query Latency
latency = f(search_area)
Search Area
search_area = f(domain_invariants)
• Absolute: Every user has 50 friends
• Relative: Every user is friends with 10% of the user base
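The difference between the two kinds of invariant can be made concrete with a little arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the search area is simply the number of friends reached at each hop, multiplied over the traversal depth and capped at the size of the user base.

```java
// Hypothetical illustration, not from the talk: estimates how many nodes a
// friends-of-friends style traversal may touch under each kind of invariant.
final class SearchAreaEstimate {

    // Absolute invariant: every user has a fixed number of friends, so the
    // search area does not depend on the size of the user base.
    static long absolute(long friendsPerUser, int depth) {
        long area = 1;
        for (int i = 0; i < depth; i++) {
            area *= friendsPerUser;
        }
        return area;
    }

    // Relative invariant: every user is friends with a percentage of the user
    // base, so the search area grows as the dataset grows. The result is
    // capped at totalUsers because a traversal cannot reach more users than exist.
    static long relative(long totalUsers, int percent, int depth) {
        long friendsPerUser = totalUsers * percent / 100;
        long area = 1;
        for (int i = 0; i < depth; i++) {
            area = Math.min(totalUsers, area * friendsPerUser);
        }
        return area;
    }

    public static void main(String[] args) {
        for (long users : new long[] {1_000, 100_000, 10_000_000}) {
            System.out.printf("users=%d absolute=%d relative=%d%n",
                    users, absolute(50, 1), relative(users, 10, 1));
        }
    }
}
```

With the absolute invariant the estimate stays at 50 however many users there are. With the relative invariant it grows in step with the user base, so latency would be expected to grow with it.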
Reducing Read Latency
• The Blackadder solution
• Improve the Cypher query
• Change the model
• Use an Unmanaged Extension
Improve Cypher Query
• Small queries, separated by WITH
• Start from low-cardinality nodes
http://thought-bytes.blogspot.co.uk/2013/01/optimizing-neo4j-cypher-queries.html
http://wes.skeweredrook.com/pragmatic-cypher-optimization-2-0-m06/
Change the Model
Goal: Do less work (in the query)
  – By exploring less of the graph
How? Identify inferred relationships
  – Replace with use-case specific shortcuts
Change the Model - From
MATCH (:Person{username:'ben'})-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(colleague:Person)
Change the Model - To
MATCH (:Person{username:'ben'})-[:WORKED_WITH]-(colleague:Person)
Tradeoff
• Cost: more expensive writes, more data
• Benefit: cheaper reads
When to add the new relationship?
• With the transaction
• Queue for a subsequent transaction
• Periodic/batch
Refactor Existing Data
MATCH (p1:Person)-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(p2:Person)
WHERE NOT ((p1)-[:WORKED_WITH]-(p2))
WITH DISTINCT p1, p2 LIMIT 10
MERGE (p1)-[r:WORKED_WITH]-(p2)
RETURN count(r)
Select Batch
WITH DISTINCT p1, p2 LIMIT 10
  – LIMIT sets the batch size
Add New Relationship
MERGE (p1)-[r:WORKED_WITH]-(p2)
Continue While count(r) > 0
RETURN count(r)
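The "continue while count(r) > 0" step implies a small driver loop around the batched statement. A minimal sketch of that loop follows, assuming a hypothetical `runBatch` callback that executes the Cypher statement once and returns its `count(r)`. The database call itself is stubbed out here, so none of these names are Neo4j API.

```java
import java.util.function.LongSupplier;

// Hypothetical driver for the batched refactoring; the names are illustrative.
final class BatchRefactoring {

    // Runs one batch at a time until a batch reports that it created nothing.
    // runBatch stands in for executing the statement and reading count(r).
    static long runUntilDone(LongSupplier runBatch) {
        long total = 0;
        long created;
        do {
            created = runBatch.getAsLong();
            total += created;
        } while (created > 0);
        return total;
    }

    public static void main(String[] args) {
        // Fake database: 25 relationships are missing and the batch size is 10.
        long[] remaining = {25};
        long total = runUntilDone(() -> {
            long batch = Math.min(10, remaining[0]);
            remaining[0] -= batch;
            return batch;
        });
        System.out.println("created " + total + " relationships");
    }
}
```

Each call corresponds to one small transaction, which keeps the amount of transactional state per commit bounded by the batch size.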
Use Unmanaged Extensions
REST API: /db/data/cypher
Extensions: /my-extension/service
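RESTful Resource
import java.io.IOException;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
// Imports for ObjectMapper (JSON mapper), CypherExecutor (Neo4j server) and
// ColleagueFinder (application class) are omitted, as on the slide.

@Path("/similar-skills")
public class ColleagueFinderExtension {
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private final ColleagueFinder colleagueFinder;

    public ColleagueFinderExtension( @Context CypherExecutor cypherExecutor ) {
        this.colleagueFinder = new ColleagueFinder( cypherExecutor.getExecutionEngine() );
    }

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @Path("/{name}")
    public Response getColleagues( @PathParam("name") String name )
            throws IOException {
        String json = MAPPER
                .writeValueAsString( colleagueFinder.findColleaguesFor( name ) );
        return Response.ok().entity( json ).build();
    }
}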
JAX-RS Annotations
@Path, @GET, @Produces and @PathParam on the ColleagueFinderExtension class and its getColleagues method
Inject Database/Cypher Execution Engine
public ColleagueFinderExtension( @Context CypherExecutor cypherExecutor )
1. Get Close to the Data
[Diagram: Application and database; operations MATCH, MATCH, CREATE, DELETE, MERGE, MATCH]
Single request, many operations
  – Reduce network latencies
2. Multiple Implementation Options
[Diagram: REST API and Extensions]
• Cypher
• Traversal Framework
• Graph Algo Package
• Core API
3. Control Request/Response Format
{
  users: [
    { id: 1234 },
    { id: 9876 }
  ]
}
JSON, CSV, protobuf, etc.
1a 03 08 96 01
Domain-specific representations
  – Compact
  – Conserve bandwidth
4. Control HTTP Headers
[Diagram: Application and Reverse Proxy]
GET /my-extension/service/top-10
HTTP/1.1 200 OK
Cache-Control: max-age=60
5. Integrate with Backend Systems
[Diagram: Application; REST API and Extensions; RDBMS; LDAP]
Migrating to Extensions
• Re-implement original query inside extension
• Modify request/response formats and headers
• Refactor implementation to use lower parts of the stack where necessary
• Measure, measure, measure
Scaling Reads - Throughput
Scale Horizontally For High Read Throughput
[Diagram: Application; Read Load Balancer; Write Load Balancer; cluster of Master, Slave, Slave]
Configure HAProxy as Read Load Balancer
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    option httpchk GET /db/manage/server/ha/slave
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check

listen admin
    bind *:8080
    stats enable
Configure HAProxy as Read Load Balancer: health check responses
option httpchk GET /db/manage/server/ha/slave
Instance   HTTP status     Body
Master     404 Not Found   false
Slave      200 OK          true
Unknown    404 Not Found   UNKNOWN
This Isn’t The Throughput You Were Looking For
[Diagram: Application; Load Balancer; servers 1, 2, 3]
MATCH (c:Country{name:'Norway'}) ...
MATCH (c:Country{name:'Zambia'}) ...
MATCH (c:Country{name:'Australia'}) ...
Cache Sharding Using Consistent Routing
[Diagram: Application; Load Balancer; servers 1, 2, 3]
A-I → 1
J-R → 2
S-Z → 3
MATCH (c:Country{name:'Brazil'}) ...
MATCH (c:Country{name:'Japan'}) ...
MATCH (c:Country{name:'Zimbabwe'}) ...
Configure HAProxy for Cache Sharding
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    balance url_param country_code
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable
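The routing idea behind cache sharding is that the same key always reaches the same server, so each server only needs to keep its own slice of the graph warm. The sketch below mirrors the alphabetic ranges from the diagram (A-I, J-R, S-Z). The class and method names are hypothetical, and HAProxy's `balance url_param` picks a server by hashing the parameter value rather than by alphabetic range, so this illustrates the principle, not the proxy's actual algorithm.

```java
// Hypothetical router that mirrors the ranges shown on the slide.
final class CountryRouter {

    // Returns the server number (1, 2 or 3) responsible for a country name.
    static int serverFor(String countryName) {
        if (countryName == null || countryName.isEmpty()) {
            throw new IllegalArgumentException("country name is required");
        }
        char first = Character.toUpperCase(countryName.charAt(0));
        if (first >= 'A' && first <= 'I') {
            return 1;
        }
        if (first >= 'J' && first <= 'R') {
            return 2;
        }
        if (first >= 'S' && first <= 'Z') {
            return 3;
        }
        throw new IllegalArgumentException("no shard for: " + countryName);
    }

    public static void main(String[] args) {
        for (String country : new String[] {"Brazil", "Japan", "Zimbabwe"}) {
            System.out.println(country + " -> server " + serverFor(country));
        }
    }
}
```

Because the mapping is deterministic, repeated queries for one country keep hitting the same cache instead of being spread across all three servers.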
Scaling Writes - Throughput
Factors Impacting Write Performance
• Managing transactional state
  – Creating and committing are expensive operations
• Contending for locks
  – Nodes and relationships
Improving Write Throughput
• Delay taking expensive locks
• Batch/queue writes
Delay Expensive Locks
• Identify contended nodes
• Involve them as late as possible in a transaction
[Diagram sequence: Add Linked List Item + Update Pointers (Locked)]
[Diagram sequence: Add Linked List Item, then Add Pointers (Locked)]
Batch Writes
• Multiple CREATE/MERGE statements per request
  – Good for integration with backend systems
• Queue
  – Good for small, online transactions
Single-Threaded Queue
[Diagram: Write, Write, Write → Queue → Single Thread → Batch]
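One way to picture the single-threaded queue is a thread-safe queue that request threads append to, plus one writer thread that drains it in batches. The sketch below is a minimal in-memory version. `BatchingWriter`, `submit`, `writeNextBatch` and the `commitBatch` callback are hypothetical names, and the callback stands in for whatever opens one transaction and applies the whole batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: names and structure are illustrative, not from the talk.
final class BatchingWriter<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final int maxBatchSize;
    private final Consumer<List<T>> commitBatch;

    BatchingWriter(int maxBatchSize, Consumer<List<T>> commitBatch) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.commitBatch = commitBatch;
    }

    // Called from any number of request threads; never touches the database.
    void submit(T write) {
        queue.add(write);
    }

    // Called only from the single writer thread. Drains up to maxBatchSize
    // queued writes and hands them to commitBatch as one unit of work.
    // Returns the number of writes in the batch (0 if the queue was empty).
    int writeNextBatch() {
        List<T> batch = new ArrayList<>(maxBatchSize);
        queue.drainTo(batch, maxBatchSize);
        if (!batch.isEmpty()) {
            commitBatch.accept(batch);
        }
        return batch.size();
    }

    public static void main(String[] args) throws InterruptedException {
        BatchingWriter<String> writer = new BatchingWriter<>(3,
                batch -> System.out.println("one transaction for " + batch));
        for (int i = 1; i <= 5; i++) {
            writer.submit("write-" + i);
        }
        // The only thread that ever writes.
        Thread writerThread = new Thread(() -> {
            while (writer.writeNextBatch() > 0) {
                // keep draining until the queue is empty
            }
        });
        writerThread.start();
        writerThread.join();
    }
}
```

Since only one thread commits, writes never contend with each other for locks. The queue in this sketch lives in memory, so queued writes would be lost if the process crashed.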
Queue Location Options
[Diagram: two alternatives, each showing the Application]
Benefits of Batched Writes
• Less transactional state management
  – Create/commit per batch rather than per write
• No contention for locks
  – No deadlocks
• Query consolidation
  – Reduce the amount of work inside the database
Query Consolidation
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sam
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
MATCH sam
CREATE sam-[:LIVES_AT]-address2
Eliminate Duplicate Lookups
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
CREATE sam-[:LIVES_AT]-address2
Eliminate Unnecessary Writes
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address2
CREATE sam-[:LIVES_AT]-address2
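The two consolidation steps can be expressed as a small pass over the queued statements before they reach the database. The sketch below works on the schematic statements from the slides, treated as plain strings. The class, its helper methods and the parsing rules are assumptions made for illustration. It drops repeated `MATCH` lookups, and drops a node that is both created and deleted within the batch as long as no relationship statement refers to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical consolidator for the schematic statements used on the slides.
final class QueryConsolidator {

    static List<String> consolidate(List<String> statements) {
        // Pass 1: find nodes that are created and later deleted in this batch.
        Set<String> created = new HashSet<>();
        Set<String> cancelled = new HashSet<>();
        for (String s : statements) {
            if (isNodeStatement(s, "CREATE ")) {
                created.add(operand(s));
            } else if (isNodeStatement(s, "DELETE ") && created.contains(operand(s))) {
                cancelled.add(operand(s));
            }
        }
        // A node that a relationship statement refers to is kept, to stay safe.
        for (String s : statements) {
            if (s.contains("-[")) {
                String pattern = operand(s);
                String from = pattern.substring(0, pattern.indexOf("-["));
                String to = pattern.substring(pattern.indexOf("]-") + 2);
                cancelled.remove(from);
                cancelled.remove(to);
            }
        }
        // Pass 2: emit statements, skipping duplicate lookups and cancelled writes.
        Set<String> matched = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String s : statements) {
            if (isNodeStatement(s, "MATCH ")) {
                if (!matched.add(operand(s))) {
                    continue; // this node was already looked up
                }
            } else if ((isNodeStatement(s, "CREATE ") || isNodeStatement(s, "DELETE "))
                    && cancelled.contains(operand(s))) {
                continue; // created and deleted within the same batch
            }
            result.add(s);
        }
        return result;
    }

    // True for statements about a single node, such as "MATCH sam".
    private static boolean isNodeStatement(String s, String keyword) {
        return s.startsWith(keyword) && !s.contains("-[");
    }

    // Everything after the leading keyword.
    private static String operand(String s) {
        return s.substring(s.indexOf(' ') + 1).trim();
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList(
                "MATCH sam", "MATCH jenny", "CREATE sam-[:KNOWS]-jenny",
                "MATCH sam", "MATCH sarah", "CREATE sam-[:KNOWS]-sarah",
                "CREATE address1", "CREATE address2", "DELETE address1",
                "MATCH sam", "CREATE sam-[:LIVES_AT]-address2");
        for (String statement : consolidate(batch)) {
            System.out.println(statement);
        }
    }
}
```

Run against the eleven statements from the Query Consolidation slide, this leaves the seven statements shown on the final Eliminate Unnecessary Writes slide.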
Tradeoff
• Cost: latency
• Benefit: higher throughput
In-memory or durable queues?
• Lost writes in event of crash
• Transactional dequeue?
Further Reading
http://maxdemarzi.com/2013/09/05/scaling-writes/
http://maxdemarzi.com/2014/07/01/scaling-concurrent-writes-in-neo4j/
Hardware
Memory
• SLC (single-level cell) SSD w/SATA
• Lots of RAM
  – 8-12G heap
  – Explicitly memory-map store files
Object Cache
• 2G for 12G heap
• No object cache
  – consistent throughput at expense of latency
AWS
• HVM (hardware virtual machine) over PV (paravirtual)
• EBS-optimized instances
• Provisioned IOPS

GraphConnect 2014 SF: From Zero to Graph in 120: Scale

  • 1. SAN FRANCISCO | 10.22.2014 Scaling Neo4j Applications @iansrobinson
  • 2. The Burden of Success • More users • Larger datasets • More concurrent requests • More complex queries
  • 3. Scaling is a Feature • It doesn’t come for free • Conditions of success: – Understand current needs • Design for an order of magnitude growth – Iterative and incremental development – Unit tests • Bedrock of asserted behaviour – Performance tests
  • 4. Overview • Scaling Reads – Latency – Throughput • Scaling Writes • Hardware
  • 6. Query Latency latency = f(search_area)
  • 12. Search Area search_area = f(domain_invariants)
  • 13. Search Area search_area = f(domain_invariants) Absolute: Every user has 50 friends
  • 15. Search Area search_area = f(domain_invariants) Absolute: Every user has 50 friends. Relative: Every user is friends with 10% of the user base
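
The difference between an absolute and a relative invariant can be made concrete with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not in the slides: the query is taken to be a friends-of-friends lookup (depth 2), and the search area is approximated by the upper bound degree^depth, ignoring overlapping friendships. The class and method names are hypothetical.

```java
final class SearchArea {
    /** Upper bound on nodes visited: every hop fans out to `degree` neighbours. */
    static long searchArea(long degree, int depth) {
        long area = 1;
        for (int i = 0; i < depth; i++) {
            area = Math.multiplyExact(area, degree);
        }
        return area;
    }

    /** Absolute invariant: 50 friends, regardless of the size of the user base. */
    static long absoluteDegree(long userCount) {
        return 50;
    }

    /** Relative invariant: friends with 10% of the user base. */
    static long relativeDegree(long userCount) {
        return userCount / 10;
    }

    public static void main(String[] args) {
        for (long users : new long[] {1_000, 10_000, 100_000}) {
            System.out.printf("users=%d absolute=%d relative=%d%n",
                    users,
                    searchArea(absoluteDegree(users), 2),
                    searchArea(relativeDegree(users), 2));
        }
    }
}
```

Under these assumptions the absolute invariant keeps the search area constant as the user base grows, while the relative invariant makes it grow with the square of the user count.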
  • 17. Reducing Read Latency • The Blackadder solution
  • 18. Reducing Read Latency • The Blackadder solution • Improve the Cypher query • Change the model • Use an Unmanaged Extension
  • 19. Improve Cypher Query • Small queries, separated by WITH • Start from low-cardinality nodes http://thought-bytes.blogspot.co.uk/2013/01/optimizing-neo4j-cypher-queries.html http://wes.skeweredrook.com/pragmatic-cypher-optimization-2-0-m06/
  • 20. Change the Model Goal: Do less work (in the query) – By exploring less of the graph How? Identify inferred relationships – Replace with use-case specific shortcuts
  • 21. Change the Model - From
      MATCH (:Person{username:'ben'})
            -[:WORKED_ON]->(:Project)<-[:WORKED_ON]-
            (colleague:Person)
  • 23. Change the Model - To
      MATCH (:Person{username:'ben'})
            -[:WORKED_WITH]-
            (colleague:Person)
  • 24. Tradeoff: more expensive writes and more data in exchange for cheaper reads. When to add the new relationship? • With tx • Queue for subsequent tx • Periodic/batch
  • 25. Refactor Existing Data
      MATCH (p1:Person)
            -[:WORKED_ON]->(:Project)<-[:WORKED_ON]-
            (p2:Person)
      WHERE NOT ((p1)-[:WORKED_WITH]-(p2))
      WITH DISTINCT p1, p2 LIMIT 10
      MERGE (p1)-[r:WORKED_WITH]-(p2)
      RETURN count(r)
  • 26. Select Batch (same query as slide 25; batch size: LIMIT 10)
  • 27. Add New Relationship (same query as slide 25)
  • 28. Continue While count(r) > 0 (same query as slide 25)
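
The "continue while count(r) > 0" step amounts to a small driver loop around the batch query. The following is a minimal sketch, not the speaker's code: `runBatch` is a hypothetical callback standing in for "execute the refactoring query once and return its count(r)", so no particular Neo4j client API is assumed.

```java
import java.util.function.IntSupplier;

final class BatchRefactor {
    /**
     * Calls runBatch repeatedly until a batch reports that it created or
     * matched no relationships, and returns the total across all batches.
     */
    static long runUntilDone(IntSupplier runBatch) {
        long total = 0;
        int count;
        while ((count = runBatch.getAsInt()) > 0) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the database: 25 pairs still lack a shortcut
        // relationship, and each batch handles at most 10 of them.
        int[] remaining = {25};
        IntSupplier fakeBatch = () -> {
            int done = Math.min(10, remaining[0]);
            remaining[0] -= done;
            return done;
        };
        System.out.println(runUntilDone(fakeBatch));
    }
}
```

Running each batch in its own transaction keeps the transactional state small, at the cost of one extra, empty batch to detect completion.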
  • 29. Use Unmanaged Extensions [diagram: REST API at /db/data/cypher; Extensions at /my-extension/service]
  • 30. RESTful Resource
      import java.io.IOException;
      import javax.ws.rs.GET;
      import javax.ws.rs.Path;
      import javax.ws.rs.PathParam;
      import javax.ws.rs.Produces;
      import javax.ws.rs.core.Context;
      import javax.ws.rs.core.MediaType;
      import javax.ws.rs.core.Response;
      // Imports for ObjectMapper (Jackson), CypherExecutor (Neo4j server) and the
      // application's own ColleagueFinder are omitted; they depend on the versions in use.

      @Path("/similar-skills")
      public class ColleagueFinderExtension {
          private static final ObjectMapper MAPPER = new ObjectMapper();
          private final ColleagueFinder colleagueFinder;

          // The server injects the Cypher executor when it creates the resource
          public ColleagueFinderExtension( @Context CypherExecutor cypherExecutor ) {
              this.colleagueFinder = new ColleagueFinder( cypherExecutor.getExecutionEngine() );
          }

          // Handles GET /similar-skills/{name}, relative to where the extension is mounted
          @GET
          @Produces(MediaType.APPLICATION_JSON)
          @Path("/{name}")
          public Response getColleagues( @PathParam("name") String name ) throws IOException {
              String json = MAPPER.writeValueAsString( colleagueFinder.findColleaguesFor( name ) );
              return Response.ok().entity( json ).build();
          }
      }
  • 31. JAX-RS Annotations (same code as slide 30): @Path, @GET, @Produces, @PathParam
  • 32. Inject Database/Cypher Execution Engine (same code as slide 30): the @Context CypherExecutor constructor parameter
  • 33. 1. Get Close to the Data [diagram: Application; MATCH, MATCH, CREATE, DELETE, MERGE, MATCH] Single request, many operations – Reduce network latencies
  • 34. 2. Multiple Implementation Options [diagram: REST API; Extensions; Cypher; Traversal Framework; Graph Algo Package; Core API]
  • 35. 3. Control Request/Response Format: JSON, CSV, protobuf, etc. [examples: { users: [ { id: 1234}, { id: 9876} ] } and the byte sequence 1a 03 08 96 01] Domain-specific representations – Compact – Conserve bandwidth
  • 36. 4. Control HTTP Headers [diagram: Application; Reverse Proxy] Request: GET /my-extension/service/top-10 Response: HTTP/1.1 200 OK Cache-Control: max-age=60
  • 37. 5. Integrate with Backend Systems [diagram: Application; REST API; Extensions; RDBMS; LDAP]
  • 38. Migrating to Extensions • Re-implement original query inside extension • Modify request/response formats and headers • Refactor implementation to use lower parts of the stack where necessary • Measure, measure, measure
  • 39. Scaling Reads - Throughput
  • 40. Scale Horizontally For High Read Throughput [diagram: Application]
  • 41. Scale Horizontally For High Read Throughput [diagram: Application; Load Balancer; Master, Slave, Slave]
  • 42. Scale Horizontally For High Read Throughput [diagram: Application; Read Load Balancer; Write Load Balancer; Master, Slave, Slave]
  • 43. Configure HAProxy as Read Load Balancer
      global
          daemon
          maxconn 256

      defaults
          mode http
          timeout connect 5000ms
          timeout client 50000ms
          timeout server 50000ms

      frontend http-in
          bind *:80
          default_backend neo4j-slaves

      backend neo4j-slaves
          option httpchk GET /db/manage/server/ha/slave
          server s1 10.0.1.10:7474 maxconn 32 check
          server s2 10.0.1.11:7474 maxconn 32 check
          server s3 10.0.1.12:7474 maxconn 32 check

      listen admin
          bind *:8080
          stats enable
  • 44. Configure HAProxy as Read Load Balancer (same configuration as slide 43)
      Responses to GET /db/manage/server/ha/slave:
        Instance   Status          Body
        Master     404 Not Found   false
        Slave      200 OK          true
        Unknown    404 Not Found   UNKNOWN
  • 45. This Isn’t The Throughput You Were Looking For [diagram: Application; Load Balancer; servers 1, 2, 3] MATCH (c:Country{name:'Norway'}) ... MATCH (c:Country{name:'Zambia'}) ... MATCH (c:Country{name:'Australia'}) ...
  • 46. Cache Sharding Using Consistent Routing [diagram: Application; Load Balancer; servers 1, 2, 3] A-I: 1, J-R: 2, S-Z: 3 MATCH (c:Country{name:'Brazil'}) ... MATCH (c:Country{name:'Japan'}) ... MATCH (c:Country{name:'Zimbabwe'}) ...
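
The routing rule on the cache sharding slide (A-I to server 1, J-R to server 2, S-Z to server 3) can be sketched as a small function. This is an illustration of the idea only; the class name, method name and sample country names are hypothetical, and a real load balancer would more likely hash the routing key than compare letter ranges.

```java
final class CountryRouter {
    /** Maps a country name to server 1, 2 or 3 using the ranges A-I, J-R, S-Z. */
    static int serverFor(String countryName) {
        if (countryName == null || countryName.isEmpty()) {
            throw new IllegalArgumentException("country name is required");
        }
        char first = Character.toUpperCase(countryName.charAt(0));
        if (first < 'A' || first > 'Z') {
            throw new IllegalArgumentException("unsupported initial: " + first);
        }
        if (first <= 'I') {
            return 1;
        }
        if (first <= 'R') {
            return 2;
        }
        return 3;
    }

    public static void main(String[] args) {
        for (String country : new String[] {"Brazil", "Japan", "Zimbabwe"}) {
            System.out.println(country + " -> server " + serverFor(country));
        }
    }
}
```

Because the same key always reaches the same server, each server's cache only has to hold the part of the graph behind its own range.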
  • 47. Configure HAProxy for Cache Sharding
      global
          daemon
          maxconn 256

      defaults
          mode http
          timeout connect 5000ms
          timeout client 50000ms
          timeout server 50000ms

      frontend http-in
          bind *:80
          default_backend neo4j-slaves

      backend neo4j-slaves
          balance url_param country_code
          server s1 10.0.1.10:7474 maxconn 32
          server s2 10.0.1.11:7474 maxconn 32
          server s3 10.0.1.12:7474 maxconn 32

      listen admin
          bind *:8080
          stats enable
  • 48. Configure HAProxy for Cache Sharding (same configuration as slide 47)
  • 49. Scaling Writes - Throughput
  • 50. Factors Impacting Write Performance • Managing transactional state – Creating and committing are expensive operations • Contending for locks – Nodes and relationships
  • 51. Improving Write Throughput • Delay taking expensive locks • Batch/queue writes
  • 52. Delay Expensive Locks • Identify contended nodes • Involve them as late as possible in a transaction
  • 53. Add Linked List Item + Update Pointers
  • 54. Add Linked List Item + Update Pointers Locked
  • 62. Batch Writes • Multiple CREATE/MERGE statements per request – Good for integration with backend systems • Queue – Good for small, online transactions
  • 63. Single-Threaded Queue [diagram: Write, Write, Write; Queue; Single Thread; Batch]
  • 64. Queue Location Options [diagram: Application; Application]
  • 65. Benefits of Batched Writes • Less transactional state management – Create/commit per batch rather than per write • No contention for locks – No deadlocks • Query consolidation – Reduce the amount of work inside the database
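
The single-threaded queue can be sketched with standard java.util.concurrent classes. This is a minimal in-memory sketch under stated assumptions, not the implementation used in the talk: `BatchingWriter` and its `commitBatch` callback are hypothetical names, and the callback stands in for "open one transaction, apply every write in the batch, commit".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class BatchingWriter<T> implements AutoCloseable {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final Consumer<List<T>> commitBatch;
    private final int maxBatchSize;
    private final Thread worker;
    private volatile boolean running = true;

    BatchingWriter(int maxBatchSize, Consumer<List<T>> commitBatch) {
        this.maxBatchSize = maxBatchSize;
        this.commitBatch = commitBatch;
        this.worker = new Thread(this::drainLoop, "batch-writer");
        this.worker.start();
    }

    /** Called by any number of request threads; never touches the database. */
    void submit(T write) {
        queue.add(write);
    }

    /** The only thread that writes, so writes never contend with each other for locks. */
    private void drainLoop() {
        try {
            while (running || !queue.isEmpty()) {
                T first = queue.poll(50, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue; // nothing queued yet; re-check the running flag
                }
                List<T> batch = new ArrayList<>();
                batch.add(first);
                queue.drainTo(batch, maxBatchSize - 1); // take whatever else is waiting
                commitBatch.accept(batch);              // one create/commit per batch
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting work and waits until everything already queued is committed. */
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller might construct it as `new BatchingWriter<String>(100, batch -> ...)`, call `submit` from request threads, and `close` it on shutdown. Because the queue lives in memory, writes that are queued but not yet committed are lost if the process crashes.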
  • 66. Query Consolidation
      MATCH sam
      MATCH jenny
      CREATE sam-[:KNOWS]-jenny
      MATCH sam
      MATCH sarah
      CREATE sam-[:KNOWS]-sarah
      CREATE address1
      CREATE address2
      DELETE address1
      MATCH sam
      CREATE sam-[:LIVES_AT]-address2
  • 67. Eliminate Duplicate Lookups (same statements as slide 66)
  • 69. Eliminate Duplicate Lookups (repeated MATCH sam removed)
      MATCH sam
      MATCH jenny
      CREATE sam-[:KNOWS]-jenny
      MATCH sarah
      CREATE sam-[:KNOWS]-sarah
      CREATE address1
      CREATE address2
      DELETE address1
      CREATE sam-[:LIVES_AT]-address2
  • 71. Eliminate Unnecessary Writes (same statements as slide 69)
  • 73. Eliminate Unnecessary Writes (CREATE address1 and DELETE address1 removed)
      MATCH sam
      MATCH jenny
      CREATE sam-[:KNOWS]-jenny
      MATCH sarah
      CREATE sam-[:KNOWS]-sarah
      CREATE address2
      CREATE sam-[:LIVES_AT]-address2
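
The two consolidation steps can be expressed as a small filter over the queued statements. The sketch below works on the slides' simplified pseudo-statements as plain strings; it is a toy under stated assumptions, not a Cypher rewriter. It assumes that each statement is a verb followed by one argument, and that a variable which is created and later deleted in the same batch is never referenced in between. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class QueryConsolidation {
    static List<String> consolidate(List<String> statements) {
        // Pass 1: find variables that are created and then deleted in this batch.
        Set<String> created = new HashSet<>();
        Set<String> cancelled = new HashSet<>();
        for (String s : statements) {
            String verb = verbOf(s);
            String arg = argOf(s);
            if (verb.equals("CREATE") && arg.matches("\\w+")) {
                created.add(arg);
            }
            if (verb.equals("DELETE") && created.contains(arg)) {
                cancelled.add(arg);
            }
        }
        // Pass 2: drop repeated lookups and the cancelled create/delete pairs.
        Set<String> matched = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String s : statements) {
            String verb = verbOf(s);
            String arg = argOf(s);
            boolean duplicateLookup = verb.equals("MATCH") && !matched.add(arg);
            boolean unnecessaryWrite =
                    (verb.equals("CREATE") || verb.equals("DELETE")) && cancelled.contains(arg);
            if (!duplicateLookup && !unnecessaryWrite) {
                result.add(s);
            }
        }
        return result;
    }

    private static String verbOf(String s) {
        int i = s.indexOf(' ');
        return i < 0 ? s : s.substring(0, i);
    }

    private static String argOf(String s) {
        int i = s.indexOf(' ');
        return i < 0 ? "" : s.substring(i + 1);
    }

    public static void main(String[] args) {
        List<String> batch = List.of(
                "MATCH sam", "MATCH jenny", "CREATE sam-[:KNOWS]-jenny",
                "MATCH sam", "MATCH sarah", "CREATE sam-[:KNOWS]-sarah",
                "CREATE address1", "CREATE address2", "DELETE address1",
                "MATCH sam", "CREATE sam-[:LIVES_AT]-address2");
        consolidate(batch).forEach(System.out::println);
    }
}
```

Consolidation of this kind is only possible because the writes are collected into a batch first; individual transactions never see each other's statements.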
  • 74. Tradeoff: added latency in exchange for higher throughput. In-memory or durable queues? • Lost writes in event of crash • Transactional dequeue?
  • 75. Further Reading http://maxdemarzi.com/2013/09/05/scaling-writes/ http://maxdemarzi.com/2014/07/01/scaling-concurrent-writes-in-neo4j/
  • 77. Memory • SLC (single-level cell) SSD w/SATA • Lots of RAM – 8-12G heap – Explicitly memory-map store files
  • 78. Object Cache • 2G for 12G heap • No object cache – consistent throughput at expense of latency
  • 79. AWS • HVM (hardware virtual machine) over PV (paravirtual) • EBS-optimized instances • Provisioned IOPS