2. Why do we need to tune?
‣ No query planner is ever perfect
‣ You know your domain better than the database
3. The planners - the Rule planner
‣ This is the original planner
‣ It consists of rules that use the indexes to produce the query execution plan
‣ All write queries only use the Rule planner
4. The planners - the Cost planner
‣ This is our new, cost-based planner
‣ Introduced in 2.2.0
‣ It uses the statistics service in Neo4j to assign costs to various query execution plans, picking the cheapest one
‣ All read-only queries use this by default
5. Configuring a planner
‣ Read-only queries can still be run with Rule
• Prepend query with CYPHER planner=rule
• Set dbms.cypher.planner to RULE
‣ http://neo4j.com/docs/stable/how-are-queries-executed.html
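A minimal sketch of the per-query override, reusing the Movie query from later in this deck. It assumes a 2.2/2.3-era server. The `planner=cost` form is not on the slide; it is included here as the assumed counterpart of `planner=rule`.

```cypher
// Run a read-only query with the Rule planner, for this statement only.
CYPHER planner=rule
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie;

// Explicitly request the Cost planner (already the default for read-only queries).
CYPHER planner=cost
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie;
```

The prefix applies to a single query, so it is handy for comparing plans; the `dbms.cypher.planner` setting changes the default for every query on the server.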
7. How do I view a query plan?
‣ EXPLAIN
• shows the execution plan without actually executing it or returning any results.
‣ PROFILE
• executes the statement and returns the results along with profiling information.
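As a rough illustration, both keywords are simply prepended to the query. The Movie query below is borrowed from later in this deck; the exact plan output depends on the server version and the data.

```cypher
// EXPLAIN: show the plan only. The query is not run and no rows come back.
EXPLAIN
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie;

// PROFILE: run the query, return its rows, and annotate each operator
// in the plan with the rows it produced and the db hits it caused.
PROFILE
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie;
```

EXPLAIN is the safer choice for a query that might be very slow, since nothing is executed.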
11. What is our goal?
At a high level, the goal is simple: get the number of db hits down.
12. What is a database hit?
“An abstract unit of storage engine work.”
13. Execution plan operators
‣ Operators to look out for
• All nodes scan: expensive
• Label scan: cheaper
• Node index seek: cheapest
• Node index scan: used for range queries
‣ http://neo4j.com/docs/2.3.0-M03/execution-plans.html
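A hedged sketch of query shapes that would typically lead to the first three operators. It assumes the Movie data used in this deck and that an index on :Movie(title) exists; the operator actually chosen depends on the planner, the statistics, and the Neo4j version, so check with EXPLAIN.

```cypher
// No label and no index: every node has to be visited (all nodes scan).
EXPLAIN
MATCH (n {title: "The Matrix"})
RETURN n;

// Label only: just the nodes carrying :Movie are visited (label scan).
EXPLAIN
MATCH (m:Movie)
RETURN m;

// Label plus an indexed property: the query can start from an index
// lookup (node index seek), assuming CREATE INDEX ON :Movie(title) was run.
EXPLAIN
MATCH (m:Movie {title: "The Matrix"})
RETURN m;
```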
19. Finding The Matrix
MATCH (movie {title: "The Matrix"})
RETURN movie
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie
20. Tip: Use indexes and constraints
‣ Indexes for non unique values
‣ Constraints for unique values
CREATE INDEX ON :Movie(title)
CREATE INDEX ON :Person(name)
CREATE CONSTRAINT ON (g:Genre)
ASSERT g.name IS UNIQUE
21. How does Neo4j use indexes?
‣ Indexes are only used to find the starting
point for queries.
‣ Relational: use index scans to look up rows in tables and join them with rows from other tables
‣ Graph: use indexes to find the starting points for a query
24. Finding The Matrix
(same query, without and with an index)
MATCH (movie:Movie {title: "The Matrix"})
RETURN movie
25. Actors who appeared together
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
RETURN COUNT(*)
27. Tip: Enforce index usage
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
USING INDEX a:Person(name)
USING INDEX b:Person(name)
RETURN COUNT(*)
29. Actors who appeared together
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
RETURN COUNT(*)
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()<-[:ACTS_IN]-
(b:Person {name:"Meg Ryan"})
USING INDEX a:Person(name)
USING INDEX b:Person(name)
RETURN COUNT(*)
30. Tom Hanks’ colleagues’ movies
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
32. Tip: Reduce cardinality of work in progress
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)
WITH DISTINCT coActor
MATCH (coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
34. Tom Hanks’ colleagues’ movies
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-
(coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m1)<-[:ACTS_IN]-(coActor)
WITH DISTINCT coActor
MATCH (coActor)-[:ACTS_IN]->(m2)
RETURN distinct m2.title
35. Counting number of movies
MATCH (n:Actor)-[:ACTS_IN]->()
RETURN n, COUNT(*) AS count
ORDER BY count DESC
37. Tip: Use `SIZE` for fast counting
MATCH (n:Actor)
RETURN n,
SIZE((n)-[:ACTS_IN]->())
AS count
ORDER BY count DESC
38. Counting number of movies
MATCH (n:Actor)-[:ACTS_IN]->()
RETURN n, COUNT(*) AS count
ORDER BY count DESC
MATCH (n:Actor)
RETURN n,
SIZE((n)-[:ACTS_IN]->()) AS count
ORDER BY count DESC
39. Hints
USING INDEX
• Force the use of a specific index
MATCH (a:Person {name:"Tom Hanks"})
-[:ACTS_IN]->()
USING INDEX a:Person(name)
RETURN count(*)
‣ http://neo4j.com/docs/2.3.0-M03/query-using.html
40. Hints
USING SCAN
• Forces a label scan on lower cardinality labels
MATCH (a:Actor)-->(m:Movie:Comedy)
USING SCAN m:Comedy
RETURN count(distinct a)
43. Use parameters
MATCH (p:Person {name: {name}})
-[:ACTS_IN]->(m)
RETURN m.title
MATCH (p:Person {name:"Tom Hanks"})
-[:ACTS_IN]->(m)
RETURN m.title
44. Avoid Cartesian products
‣ Easy to do this inadvertently:
MATCH (a:Actor), (m:Movie)
RETURN count(a), count(m)
‣ This is correct, and performs better
MATCH (a:Actor)
WITH count(a) as a_count
MATCH (m:Movie)
RETURN a_count, count(m)
47. Only RETURN what you need
‣ This is not recommended:
MATCH (a:Actor)
RETURN a
‣ Use this instead:
MATCH (a:Actor)
RETURN a.name, a.birthdate, a.height
48. Keep it short ‘n sweet
‣ Keep queries as short as possible
• Better to have many, smaller queries than one larger one
‣ Keep read and write queries separate
• If not, only the RULE planner will be used
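One possible way to split a mixed query, as a sketch only. The `seen` property and the `{ids}` parameter are hypothetical names introduced for illustration, and the application is assumed to pass the ids returned by the first query into the second.

```cypher
// Mixed read and write in one statement: planned by the Rule planner.
MATCH (p:Person {name: {name}})-[:ACTS_IN]->(m)
SET m.seen = true
RETURN m.title;

// Split version, step 1: a read-only query, eligible for the Cost planner.
MATCH (p:Person {name: {name}})-[:ACTS_IN]->(m)
RETURN collect(id(m)) AS ids;

// Split version, step 2: a small write that only touches the nodes
// found in step 1 (ids supplied by the application as a parameter).
MATCH (m)
WHERE id(m) IN {ids}
SET m.seen = true;
```

Whether the split pays off depends on how expensive the read part is; PROFILE both variants before committing to one.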
49. tl;dr
‣ View query plans with EXPLAIN and PROFILE
‣ Use labels
‣ Index your starting points
‣ Reduce work in progress
‣ Use SIZE for fast relationship counting
‣ Remember the hints
50. Thanks for coming
‣ And don’t forget: if the tips aren’t working, ask us for help on Stack Overflow
Mark Needham @markhneedham
Petra Selmer @Aethelraed