Understanding Graph Databases with Neo4j and Cypher

An introduction to graph database concepts, explained by comparison with the widely used relational databases and the SQL query language. Neo4j and Cypher are used to show how graph databases work in practice.

Published in: Education


  1. 1. Understanding Graph Databases with Neo4j and Cypher
      Group Members: S.S. Niranga (MS-14901836), Nipuna Pannala (MS-14902208), Ruhaim Izmeth (MS-14901218)
  2. 2. Trends in Data Data is getting bigger: “Every 2 days we create as much information as we did up to 2003” – Eric Schmidt, Google
  3. 3. The History of Graph Theory
      ● 1736: Leonhard Euler writes a paper on the “Seven Bridges of Königsberg”
      ● 1845: Gustav Kirchhoff publishes his electrical circuit laws
      ● 1852: Francis Guthrie poses the “Four Color Problem”
      ● 1878: Sylvester publishes an article in Nature magazine that describes graphs
      ● 1936: Dénes Kőnig publishes a textbook on Graph Theory
      ● 1941: Ramsey and Turán define Extremal Graph Theory
      ● 1959: De Bruijn publishes a paper summarizing Enumerative Graph Theory
      ● 1959: Erdős, Rényi and Gilbert define Random Graph Theory
      ● 1969: Heinrich Heesch publishes a computer-oriented method for attacking the “Four Color Problem” (Appel and Haken complete the proof in 1976)
      ● 2003: Commercial Graph Database products start appearing on the market
  4. 4. What is a Graph Database? “A traditional relational database may tell you the average age of everyone in this room... but a graph database will tell you who is most likely to buy you a beer!”
  5. 5. What does a Graph database look like?
  6. 6. What is a Graph Database?
      ● A database with an explicit graph structure
      ● Each node knows its adjacent nodes
      ● As the number of nodes increases, the cost of a local step (or hop) remains the same
      ● Plus an index for lookups
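The bullets above can be illustrated with a toy in-memory structure. This is only a sketch under stated assumptions, not how Neo4j stores data: the class names `Node` and `TinyGraph` are hypothetical, and the "index" is a plain dictionary. The point it tries to show is that the index is consulted once to find a starting node, after which each hop only reads the neighbour list held by the node itself.

```python
class Node:
    """A node that keeps direct references to its adjacent nodes."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []


class TinyGraph:
    """Hypothetical toy graph: an index for lookups plus per-node adjacency."""

    def __init__(self):
        self.index = {}  # name -> Node, used only to find a starting node

    def add_node(self, name):
        node = self.index.get(name)
        if node is None:
            node = Node(name)
            self.index[name] = node
        return node

    def connect(self, a, b):
        # Store the neighbour on both endpoints so a hop never scans other nodes.
        node_a, node_b = self.add_node(a), self.add_node(b)
        node_a.neighbours.append(node_b)
        node_b.neighbours.append(node_a)

    def hop(self, name):
        # One index lookup for the start node, then a purely local step.
        return [n.name for n in self.index[name].neighbours]


graph = TinyGraph()
graph.connect("Ian", "Our Man in Havana")
graph.connect("Graham Greene", "Our Man in Havana")
print(graph.hop("Our Man in Havana"))
```

Adding more unrelated nodes to `graph` would not change the work done by `hop` for an existing node, which is the property the slide describes as a constant cost per local step.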
  7. 7. Compared to Relational Databases: relational databases are optimized for aggregation; graph databases are optimized for connections
  8. 8. Complexity Vs Size
  9. 9. What to Choose? http://db-engines.com/en/ranking/graph+dbms
  10. 10. What is Neo4j?
      ● Neo4j is an open-source graph database, implemented in Java.
      ● Neo4j version 1.0 was released in February 2010.
      ● Neo4j version 2.0 was released in December 2013.
      ● Neo4j was developed by Neo Technology, Inc.
      ● The Neo Technology board of directors consists of Rod Johnson (founder of the Spring Framework), Magnus Christerson (Vice President of Intentional Software Corp), Nikolaj Nyholm (CEO of Polar Rose), Sami Ahvenniemi (Partner at Conor Venture Partners) and Johan Svensson (CTO of Neo Technology).
  11. 11. Entities in Graph DBs (Neo4j)
      ● Nodes
      ● Relationships
      ● Properties
      ● Labels
      ● Paths
      ● Traversal
      ● Schema (indexes and constraints)
  12. 12. Neo4j Properties Ex.
  13. 13. Ex. Neo4j Labels
  14. 14. Ex. Neo4j Nodes
  15. 15. Neo4j Relationships Ex.
  16. 16. Neo4j Paths Ex.
  17. 17. Introducing - Cypher Query Language for Neo4j
  18. 18. Relational Schema
      Person (p_id, p_name, p_type)
      Book (b_id, b_title)
      Wrote (p_id, b_id)
      Purchased (p_id, b_id, pur_date)
  19. 19. Cypher - Few Keywords
      General Clauses
      ● Return
      ● Order by
      ● Limit
      Writing Clauses
      ● Create
      ● Merge
      ● Set
      ● Delete
      ● Remove
      Reading Clauses
      ● Match
      ● Optional Match
      ● Where
      ● Aggregation
      Functions
      ● Predicates
      ● Scalar functions
      ● Collection functions
      ● Mathematical functions
      ● String functions
      See the full list at the Cypher RefCard: http://neo4j.com/docs/stable/cypher-refcard/
  20. 20. Cypher Demo http://console.neo4j.org/ or if Neo4j is locally installed http://localhost:7474
  21. 21. Cypher Creating nodes
      CREATE (:Person);
      CREATE (:Person { name:"John Le Carre" });
      CREATE ({ name:"John Le Carre" });
      CREATE (:Person:Author { name:"John Le Carre" });
      CREATE (:Person:Author { name:"Graham Greene" }),
             (:Book { title:"Tinker, Tailor, Soldier, Spy" }),
             (:Book { title:"Our Man in Havana" }),
             (:Person { name:"Ian" }),
             (:Person { name:"Alan" });
  22. 22. Cypher Modifying nodes
      MATCH (p:Person { namme:"Alan" }) SET p += {name2 : "Alan2"};
      MATCH (p:Person { namme:"Alan" }) SET p.name = "Alan";
      MATCH (p:Person { namme:"Alan" }) SET p = {name : "Alan"};
      CREATE (:Person { namme:"Alan" });
      MATCH (p:Person { name2:"Alan2" }) DELETE p;
      MATCH (p:Person { namme:"Alan" }) REMOVE p.namme;
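The difference between `SET p += {...}`, `SET p = {...}` and `REMOVE p.prop` on the slide above is easy to miss. The following sketch models a node's properties as a plain Python dictionary to show the intended effect of each form. It is an analogy only: the helper names `set_merge`, `set_replace` and `remove_property` are hypothetical and nothing here talks to Neo4j.

```python
def set_merge(props, new_props):
    """Like `SET p += {...}`: add or overwrite the given keys, keep the rest."""
    merged = dict(props)
    merged.update(new_props)
    return merged


def set_replace(props, new_props):
    """Like `SET p = {...}`: drop every existing property, keep only the new map."""
    return dict(new_props)


def remove_property(props, key):
    """Like `REMOVE p.key`: drop one property; a missing key is simply ignored."""
    return {k: v for k, v in props.items() if k != key}


alan = {"namme": "Alan"}
print(set_merge(alan, {"name2": "Alan2"}))   # both properties are present
print(set_replace(alan, {"name": "Alan"}))   # only the new property is left
print(remove_property(alan, "namme"))        # the node has no properties left
```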
  23. 23. Cypher Relationships
  24. 24. Cypher - Creating Relationships
      CREATE (john:Person:Author { name:"John Le Carre" }),
             (b:Book { title:"Tinker, Tailor, Soldier, Spy" }),
             (john)-[:WROTE]->(b);
      MATCH (p:Person { name:"Ian" }), (b:Book { title:"Our Man in Havana" })
      MERGE (p)-[:PURCHASED { date:"09-09-2011" }]->(b);
      MATCH (graham:Person:Author { name:"Graham Greene" }), (b:Book { title:"Our Man in Havana" })
      MERGE (graham)-[:WROTE]->(b);
      MATCH (t:Book { title:"Tinker, Tailor, Soldier, Spy" }), (i:Person { name:"Ian" }), (a:Person { name:"Alan" })
      MERGE (i)-[:PURCHASED { date:"03-02-2011" }]->(t)<-[:PURCHASED { date:"05-07-2011" }]-(a);
  25. 25. Cypher - Modifying Relationships
      MATCH (graham:Person:Author { name:"Graham Greene" }), (b:Book { title:"Our Man in Havana" })
      MERGE (graham)-[:WORTE]->(b);
      MATCH (graham:Person { name:"Graham Greene" })-[r:WORTE]->(b:Book { title:"Our Man in Havana" })
      DELETE r;
      MATCH (p:Person { name:"Ian" })-[r]->(b:Book { title:"Our Man in Havana" })
      SET r.date = "09-09-2012";
  26. 26. Cypher - Querying DBs: Find All Books
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      SELECT * FROM Book
      Cypher Query:
      MATCH (b:Book) RETURN b
      Cypher Result:
      +-----------------------------------------------+
      | b                                             |
      +-----------------------------------------------+
      | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
      | Node[3]{title:"Our Man in Havana"}            |
      +-----------------------------------------------+
      2 rows
      2 ms
  27. 27. Cypher - Querying DBs: Find All Authors
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      SELECT * FROM Person WHERE p_type = 'Author'
      Cypher Query:
      MATCH (a:Author) RETURN a
      Cypher Result:
      +-------------------------------+
      | a                             |
      +-------------------------------+
      | Node[0]{name:"John Le Carre"} |
      | Node[1]{name:"Graham Greene"} |
      +-------------------------------+
      2 rows
      8 ms
  28. 28. Cypher - Querying DBs: Find All Authors and the Books written by them
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      SELECT p.p_name, b.b_title FROM Person p, Wrote w, Book b WHERE p.p_type = 'Author' AND w.p_id = p.p_id AND w.b_id = b.b_id
      Cypher Query:
      MATCH (a:Author)-[:WROTE]->(b:Book) RETURN a,b
      Cypher Result:
      +-------------------------------------------------------------------------------+
      | a                             | b                                             |
      +-------------------------------------------------------------------------------+
      | Node[0]{name:"John Le Carre"} | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
      | Node[1]{name:"Graham Greene"} | Node[3]{title:"Our Man in Havana"}            |
      +-------------------------------------------------------------------------------+
      2 rows
      12 ms
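For readers who want to try the relational side of this comparison without installing a database server, the three-table join can be run against an in-memory SQLite database from Python's standard library. This is a sketch under assumptions: the column types, the integer ids and the choice of SQLite are not from the slides, only the table names, column names and sample people and books are.

```python
import sqlite3

# In-memory database; the ids and column types below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (p_id INTEGER PRIMARY KEY, p_name TEXT, p_type TEXT);
CREATE TABLE Book (b_id INTEGER PRIMARY KEY, b_title TEXT);
CREATE TABLE Wrote (p_id INTEGER, b_id INTEGER);
INSERT INTO Person VALUES (1, 'John Le Carre', 'Author'),
                          (2, 'Graham Greene', 'Author'),
                          (3, 'Ian', NULL);
INSERT INTO Book VALUES (1, 'Tinker, Tailor, Soldier, Spy'),
                        (2, 'Our Man in Havana');
INSERT INTO Wrote VALUES (1, 1), (2, 2);
""")

# The join from the slide: the Wrote table plays the role of the WROTE relationship.
rows = conn.execute("""
    SELECT p.p_name, b.b_title
    FROM Person p, Wrote w, Book b
    WHERE p.p_type = 'Author' AND w.p_id = p.p_id AND w.b_id = b.b_id
    ORDER BY p.p_id
""").fetchall()

for name, title in rows:
    print(name, "wrote", title)
```

Each extra hop in the relational version costs another join through a link table, whereas the Cypher pattern only grows by one more `-[...]->(...)` segment.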
  29. 29. Cypher - Querying DBs: Find Books written by Graham Greene
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      SELECT b.b_title FROM Person p, Wrote w, Book b WHERE p.p_type = 'Author' AND w.p_id = p.p_id AND w.b_id = b.b_id AND p.p_name = 'Graham Greene'
      Cypher Query:
      MATCH (a:Author)-[:WROTE]->(b:Book) WHERE a.name = 'Graham Greene' RETURN b
      Cypher Result:
      +------------------------------------+
      | b                                  |
      +------------------------------------+
      | Node[3]{title:"Our Man in Havana"} |
      +------------------------------------+
      1 row
      13 ms
  30. 30. Cypher - Querying DBs: Find names of all persons, the books they purchased and the date the purchase was made
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      SELECT p.p_name, pur.pur_date, b.b_title FROM Person p, Book b, Purchased pur WHERE pur.p_id = p.p_id AND b.b_id = pur.b_id
      Cypher Query:
      MATCH (a)-[r:PURCHASED]->(b) RETURN a,r.date,b
      Cypher Result:
      +-------------------------------------------------------------------------------------+
      | a                    | r.date       | b                                             |
      +-------------------------------------------------------------------------------------+
      | Node[4]{name:"Ian"}  | "09-09-2011" | Node[3]{title:"Our Man in Havana"}            |
      | Node[4]{name:"Ian"}  | "03-02-2011" | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
      | Node[5]{name:"Alan"} | "05-07-2011" | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
      +-------------------------------------------------------------------------------------+
      3 rows
  31. 31. Cypher - Querying DBs: Find how Graham Greene is related to Ian
      Relational schema: Person (p_id, p_name, p_type), Wrote (p_id, b_id), Book (b_id, b_title), Purchased (p_id, b_id, pur_date)
      SQL:
      I won’t attempt!!!
      Cypher Query:
      MATCH (a:Author)-[r*]-(p:Person { name:'Ian' }) WHERE a.name = 'Graham Greene' RETURN a,r,p
      Cypher Result:
      +--------------------------------------------------------------------------------------------------------+
      | a                             | r                                                | p                   |
      +--------------------------------------------------------------------------------------------------------+
      | Node[1]{name:"Graham Greene"} | [:WROTE[1] {},:PURCHASED[0] {date:"09-09-2011"}] | Node[4]{name:"Ian"} |
      +--------------------------------------------------------------------------------------------------------+
      1 row
      38 ms
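To give a feel for what the variable-length pattern `-[r*]-` asks the database to do, here is a small breadth-first search over the relationships created on the earlier slides. It is a simplified stand-in, not Neo4j's traversal: it ignores direction (as the undirected Cypher pattern does) but returns only one shortest chain of relationship types, whereas `[r*]` returns every matching path. The function name `shortest_connection` and the triple layout are assumptions made for this example.

```python
from collections import deque

# (start, relationship type, end) triples taken from the earlier slides.
RELATIONSHIPS = [
    ("John Le Carre", "WROTE", "Tinker, Tailor, Soldier, Spy"),
    ("Graham Greene", "WROTE", "Our Man in Havana"),
    ("Ian", "PURCHASED", "Our Man in Havana"),
    ("Ian", "PURCHASED", "Tinker, Tailor, Soldier, Spy"),
    ("Alan", "PURCHASED", "Tinker, Tailor, Soldier, Spy"),
]


def shortest_connection(start, goal, relationships):
    """Return the relationship types along one shortest undirected path, or None."""
    adjacency = {}
    for a, rel_type, b in relationships:
        adjacency.setdefault(a, []).append((rel_type, b))
        adjacency.setdefault(b, []).append((rel_type, a))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel_type, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [rel_type]))
    return None


print(shortest_connection("Graham Greene", "Ian", RELATIONSHIPS))
```

The printed chain corresponds to the `r` column in the Cypher result: Graham Greene wrote a book that Ian purchased.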
  32. 32. Support for Graph Algorithms
      ● shortestPath
      ● allSimplePaths
      ● allPaths
      ● dijkstra (optionally with cost_property and default_cost parameters)
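As an illustration of the last bullet, the sketch below implements Dijkstra's algorithm in plain Python with two parameters named after the ones on the slide. It assumes that `cost_property` names the relationship property holding the weight and that `default_cost` is used when a relationship lacks that property; this reading, the edge-tuple layout and the undirected treatment of edges are assumptions for the example, not a description of Neo4j's implementation.

```python
import heapq


def dijkstra(edges, start, goal, cost_property="cost", default_cost=1.0):
    """Cheapest path over (a, b, properties) edges, treated as undirected.

    Returns (total_cost, [nodes on the path]) or None if goal is unreachable.
    """
    adjacency = {}
    for a, b, props in edges:
        cost = props.get(cost_property, default_cost)
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    best = {start: 0.0}
    heap = [(0.0, start, [start])]
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == goal:
            return dist, path
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for neighbour, cost in adjacency.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour, path + [neighbour]))
    return None


# Hypothetical sample data: the C-B edge has no "cost" property.
edges = [
    ("A", "B", {"cost": 5}),
    ("A", "C", {"cost": 1}),
    ("C", "B", {}),
]
print(dijkstra(edges, "A", "B"))
print(dijkstra(edges, "A", "B", default_cost=10.0))
```

With the default cost of 1.0 the route through C is cheaper; raising `default_cost` to 10.0 makes the direct A to B edge win.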
  33. 33. Neo4j - Default locking behavior for Concurrency
      ● When adding, changing or removing a property on a node or relationship, a write lock will be taken on the specific node or relationship.
      ● When creating or deleting a node, a write lock will be taken for the specific node.
      ● When creating or deleting a relationship, a write lock will be taken on the specific relationship and both its nodes.
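The three rules above can be restated as a small lookup function. This is a toy model of the rules as written on the slide, not Neo4j's lock manager: the operation names, the string identifiers and the function `write_locks` are all hypothetical.

```python
# Operation names are made up for this sketch.
SINGLE_ENTITY_OPS = {"set_property", "remove_property", "create_node", "delete_node"}
RELATIONSHIP_OPS = {"create_relationship", "delete_relationship"}


def write_locks(operation, target, endpoints=()):
    """Return the set of entities that would be write-locked for an operation."""
    if operation in SINGLE_ENTITY_OPS:
        # Property changes and node create/delete lock only the entity itself.
        return {target}
    if operation in RELATIONSHIP_OPS:
        # Creating or deleting a relationship also locks both of its nodes.
        return {target, *endpoints}
    raise ValueError(f"unknown operation: {operation}")


print(sorted(write_locks("set_property", "node:Ian")))
print(sorted(write_locks(
    "create_relationship",
    "rel:PURCHASED",
    endpoints=("node:Ian", "node:Our Man in Havana"),
)))
```

One consequence worth noticing: two transactions that add relationships to the same popular node contend for that node's lock, even though they touch different relationships.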
  34. 34. Neo4j - Performance
      ● Because the JVM runs in a shared environment, the way the JVM is configured strongly affects performance.
      ● More optimized for querying than for create, update and delete operations; batch updates are recommended.
      ● Indexes can be set on nodes, relationships and their properties, which can boost query response times.
      ● Reports on query times and performance are mixed; upcoming releases are optimizing this.
  35. 35. Neo4j Capacity - Data size
      In Neo4j, data size is mainly limited by the address space of the primary keys for Nodes, Relationships, Properties and Relationship types. Currently, the address space is as follows:
      ● nodes: 2^35 (∼ 34 billion)
      ● relationships: 2^35 (∼ 34 billion)
      ● properties: 2^36 to 2^38 depending on property types (maximum ∼ 274 billion, always at least ∼ 68 billion)
      ● relationship types: 2^15 (∼ 32 000)
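The rounded figures above follow directly from the powers of two. A short check, using only the bit widths quoted on the slide (the dictionary labels are just names chosen for this example):

```python
# Bit widths as quoted on the slide; the keys are illustrative labels.
ADDRESS_BITS = {
    "nodes": 35,
    "relationships": 35,
    "properties (minimum)": 36,
    "properties (maximum)": 38,
    "relationship types": 15,
}


def capacity(bits):
    """Number of distinct ids available in an address space of the given width."""
    return 2 ** bits


for name, bits in ADDRESS_BITS.items():
    print(f"{name}: 2^{bits} = {capacity(bits):,}")
```

For instance, 2^35 is 34,359,738,368, which is where the "about 34 billion" limit for nodes and relationships comes from.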
  36. 36. Calling Neo4j
      Diagram labels: JVM, Server, Neo4j DB, Java Application, Web Application, Web REST API, Java API
      Officially supported languages
      ● Java
      ● .NET
      ● JavaScript
      ● Python
      ● Ruby
      ● PHP
  37. 37. Neo4j Editions
      Enterprise (NOT FREE)
      ● Enterprise Lock Manager
      ● High Performance Cache
      ● Clustering
      ● Hot Backups
      ● Advanced Monitoring
      Community (FREE, OPEN SOURCE)
  38. 38. If you’ve ever
      ● Joined more than 7 tables together
      ● Modeled a graph in a table
      ● Written a recursive CTE (Common Table Expression)
      ● Tried to write some crazy stored procedure with multiple recursive self and inner joins
      You should use Neo4j
  39. 39. Disadvantages
      ● The JVM should be configured properly to get optimal performance.
      ● A Neo4j DB cannot be distributed; it has to be replicated instead.
      ● Inappropriate for transactional information like accounting and banking.
  40. 40. Who uses Neo4j?
  41. 41. Thank you !!!