
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chicago 2013


In this talk we'll explore powerful analytic techniques for graph data. Firstly we'll discover some of the innate properties of (social) graphs from fields like anthropology and sociology. By understanding the forces and tensions within the graph structure and applying some graph theory, we'll be able to predict how the graph will evolve over time. To test just how powerful and accurate graph theory is, we'll also be able to (retrospectively) predict World War 1 based on a social graph and a few simple mechanical rules.
Then we'll see how graph matching can be used to extract online business intelligence (for powerful retail recommendations). In turn we'll apply these powerful techniques to modelling domains in Neo4j (a graph database) and show how Neo4j can be used to drive business intelligence.
Don't worry, there won't be much maths :-)

Published in: Technology, Education


  1. A Little Graph Theory for the Busy Developer
     Jim Webber, Chief Scientist, Neo Technology
     @jimwebber
  2. Roadmap
     • Imprisoned data
     • Graph models
     • Graph theory
       – Local properties, global behaviors
       – Predictive techniques
     • Graph matching
       – Real-time analytics for fun and profit
     • Fin
  3. http://www.flickr.com/photos/crazyneighborlady/355232758/
  4. http://gallery.nen.gov.uk/image82582-.html
  5. Aggregate-Oriented Data
     http://martinfowler.com/bliki/AggregateOrientedDatabase.html
     “There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. The advantage of not using an aggregate structure in the database is that it allows you to slice and dice your data different ways for different audiences. This is why aggregate-oriented stores talk so much about map-reduce.”
  6. complexity = f(size, connectedness, uniformity)
  7. http://www.bbc.co.uk/london/travel/downloads/tube_map.html
  8. Property graphs
     • Property graph model:
       – Nodes with properties
       – Named, directed relationships with properties
       – Relationships have exactly one start and end node
         • Which may be the same node
  9. [Diagram: example property graph. Episode nodes: "A Good Man Goes to War", "Victory of the Daleks". Relationship labels: stole from, loves, enemy, appeared in, companion.]
  10. Property graphs are very whiteboard-friendly
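The property graph model from slide 8 can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Neo4j stores data; the class names, the `outgoing` helper, and the sample nodes and relationships are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # A node carries arbitrary key/value properties.
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    name: str    # relationships are named
    start: Node  # directed: exactly one start node
    end: Node    # and exactly one end node (may be the same node as start)
    properties: dict = field(default_factory=dict)


def outgoing(node, relationships):
    """Relationships that start at the given node (compared by identity)."""
    return [r for r in relationships if r.start is node]


# Hypothetical sample data, invented for this sketch.
alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
knows = Relationship("KNOWS", alice, bob, {"since": 2010})
admires_self = Relationship("ADMIRES", alice, alice)  # start and end coincide

rels = [knows, admires_self]
print([r.name for r in outgoing(alice, rels)])
```

Keeping properties on the relationship as well as on the nodes is what later lets a relationship carry a weight or strength.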
  11. http://blogs.adobe.com/digitalmarketing/analytics/predictive-analytics/predictive-analytics-and-the-digital-marketer/
  12. http://en.wikipedia.org/wiki/File:Leonhard_Euler_2.jpg
      Meet Leonhard Euler
      • Swiss mathematician
      • Inventor of Graph Theory (1736)
  13. Königsberg (Prussia) - 1736
  14. [Diagram: land masses A, B, D, C]
  15. [Diagram: land masses A, B, D, C joined by bridges numbered 1 to 7]
  16. http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg
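Euler's argument for the Königsberg bridges comes down to counting how many bridges touch each land mass. The sketch below encodes the seven bridges as a multigraph and applies the degree rule: a connected graph has a walk that uses every edge exactly once only if zero or two nodes have odd degree. Which letter belongs to which land mass is an assumption here (the slide diagram does not survive in the transcript); the result does not depend on the labelling.

```python
from collections import Counter

# The seven bridges as a multigraph (parallel edges are repeated pairs).
# Assumed labelling: A is the central island, touched by five bridges.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many edge endpoints touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_eulerian_walk(edges):
    """Degree test only: assumes the graph is connected."""
    odd = [n for n, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_eulerian_walk(BRIDGES))
```

All four land masses have odd degree, so no walk crosses each bridge exactly once.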
  17. Triadic Closure [Diagram: nodes name: Kyle, name: Stan, name: Kenny]
  18. Triadic Closure [Diagram, before and after: nodes name: Kyle, name: Stan, name: Kenny; a FRIEND relationship is added]
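Triadic closure says that two nodes with a common neighbour tend to become linked themselves. A minimal sketch of how one might list the "open" triads that closure would be expected to fill in is shown below. The adjacency data is an assumption: the transcript does not say which two friendships exist before the new FRIEND link appears, so Stan is taken to be the common friend.

```python
from itertools import combinations

# Assumed starting point: Stan is friends with both Kyle and Kenny.
friends = {
    "Stan": {"Kyle", "Kenny"},
    "Kyle": {"Stan"},
    "Kenny": {"Stan"},
}


def open_triads(adj):
    """Pairs of nodes that share a neighbour but are not yet linked."""
    candidates = set()
    for neighbours in adj.values():
        for a, b in combinations(sorted(neighbours), 2):
            if b not in adj.get(a, set()):
                candidates.add((a, b))
    return sorted(candidates)


print(open_triads(friends))
```

Each pair returned is a candidate for a new relationship, which is the basis of "people you may know" style predictions.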
  19. Structural Balance [Diagram: nodes name: Cartman, name: Craig, name: Tweek]
  20. Structural Balance [Diagram, before and after: nodes name: Cartman, name: Craig, name: Tweek; a FRIEND relationship is added]
  21. Structural Balance [Diagram, before and after: nodes name: Cartman, name: Craig, name: Tweek; an ENEMY relationship is added]
  22. Structural Balance [Diagram, before and after: nodes name: Kyle, name: Stan, name: Kenny; a FRIEND relationship is added]
  23. Structural Balance is a key predictive technique. And it’s domain-agnostic.
  24. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  25. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  26. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  27. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  28. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  29. Allies and Enemies [Diagram: UK, Germany, France, Russia, Italy, Austria]
  30. Predicting WWI [Easley and Kleinberg]
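Structural balance can be checked mechanically: label each relationship +1 (friend) or -1 (enemy), and call a triangle balanced when the product of its three signs is positive (three friends, or two friends with a common enemy). The sketch below applies that rule to the six powers from the slides. The edge signs are an assumption, taken from the final pre-war configuration as described by Easley and Kleinberg (UK, France, Russia on one side; Germany, Austria, Italy on the other); the intermediate configurations on the slides are not recoverable from the transcript.

```python
from itertools import combinations

# Assumed final configuration: two opposing blocs.
ENTENTE = {"UK", "France", "Russia"}
ALLIANCE = {"Germany", "Austria", "Italy"}


def sign(a, b):
    """+1 if both powers sit in the same bloc (allies), otherwise -1 (enemies)."""
    same_bloc = {a, b} <= ENTENTE or {a, b} <= ALLIANCE
    return 1 if same_bloc else -1


def unbalanced_triangles(nodes, sign_of):
    """Triangles whose three edge signs multiply to a negative number."""
    return [
        (a, b, c)
        for a, b, c in combinations(sorted(nodes), 3)
        if sign_of(a, b) * sign_of(b, c) * sign_of(a, c) < 0
    ]


print(unbalanced_triangles(ENTENTE | ALLIANCE, sign))
```

An empty result means every triangle is balanced: the network has settled into two mutually hostile camps, which is the stable state the theory predicts.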
  31. Strong Triadic Closure
      If a node has strong relationships to two neighbours, then these neighbours must have at least a weak relationship between them. [Wikipedia]
  32. Triadic Closure (weak relationship) [Diagram: nodes name: Kenny, name: Stan, name: Cartman]
  33. Triadic Closure (weak relationship) [Diagram, before and after: nodes name: Kenny, name: Stan, name: Cartman; a "FRIEND 50%" relationship is added]
  34. Weak relationships
      • Relationships can have “strength” as well as intent
        – Think: weighting on a relationship in a property graph
      • Weak links play another super-important structural role in graph theory
        – They bridge neighbourhoods
  35. Local Bridges [Diagram: nodes name: Kenny, name: Stan, name: Kyle, name: Sally, name: Bebe, name: Wendy, name: Cartman; relationships labelled FRIEND, FRIEND 50%, and ENEMY]
  36. Local Bridge Property
      “If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong relationships, then any local bridge it is involved in must be a weak relationship.” [Easley and Kleinberg]
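The Local Bridge Property can be illustrated with a small weighted graph. In the sketch below an edge counts as a local bridge when its two endpoints have no neighbour in common, and a node violates Strong Triadic Closure when two of its strong neighbours have no relationship at all. The graph is hypothetical: it borrows names from the slides, but the exact edges and strengths in the original diagram are not recoverable, so these are assumptions.

```python
from itertools import combinations

# Hypothetical ties: two tight groups joined by a single weak relationship.
TIES = {
    ("Kenny", "Stan"): "strong",
    ("Kenny", "Kyle"): "strong",
    ("Stan", "Kyle"): "strong",
    ("Sally", "Bebe"): "strong",
    ("Sally", "Wendy"): "strong",
    ("Bebe", "Wendy"): "strong",
    ("Stan", "Wendy"): "weak",
}


def neighbours(ties):
    """Build an undirected adjacency map from the tie list."""
    adj = {}
    for u, v in ties:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def strength(ties, a, b):
    """Strength of the tie between a and b, or None if there is none."""
    return ties.get((a, b)) or ties.get((b, a))


def local_bridges(ties):
    """Edges whose endpoints share no common neighbour."""
    adj = neighbours(ties)
    return [(u, v) for (u, v) in ties if not (adj[u] & adj[v])]


def violates_strong_triadic_closure(ties, node):
    """True if two strong neighbours of node have no tie between them."""
    adj = neighbours(ties)
    strong = sorted(n for n in adj[node] if strength(ties, node, n) == "strong")
    return any(strength(ties, a, b) is None for a, b in combinations(strong, 2))


print(local_bridges(TIES))
print(violates_strong_triadic_closure(TIES, "Stan"))
```

If the Stan to Wendy tie were strong instead, Stan would have strong ties to both Kenny and Wendy with nothing between them, which breaks Strong Triadic Closure. That is why the bridge has to be weak.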
  37. University Karate Club
  38. Graph Partitioning
      • (NP) Hard problem
        – Recursively remove the spanning links between dense regions
        – Or recursively merge nodes into ever larger “subgraph” nodes
        – Choose your algorithm carefully – some are better than others for a given domain
      • Can use to (almost exactly) predict the break up of the karate club!
  39. University Karate Clubs (predicted by Graph Theory)
  40. University Karate Clubs (what actually happened!)
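One well-known way to "recursively remove the spanning links between dense regions" is the Girvan-Newman approach: repeatedly delete the edge with the highest betweenness until the graph falls apart. The slides do not name the algorithm that was used, so treat this as one possible sketch rather than the talk's method. The six-node graph is a made-up toy (two triangles joined by one link), not the real karate club data.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph.
    Each pair is counted from both ends, which does not change the ranking."""
    score = {}
    for s in adj:
        dist, paths = {s: 0}, {s: 1}
        preds = {n: [] for n in adj}
        order, queue = [], deque([s])
        while queue:  # breadth-first search, counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    paths[w] = 0
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):  # push path credit back towards the source
            for v in preds[w]:
                credit = paths[v] / paths[w] * (1 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + credit
                delta[v] += credit
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in group:
                group.add(n)
                stack.extend(adj[n] - group)
        seen |= group
        result.append(group)
    return result


def split_once(adj):
    """Remove highest-betweenness edges until one more component appears."""
    adj = {n: set(ns) for n, ns in adj.items()}  # work on a copy
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:  # no edges left to remove
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical toy graph: triangles a-b-c and d-e-f joined by the link c-d.
TOY = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
print(sorted(sorted(group) for group in split_once(TOY)))
```

Calling `split_once` again on each resulting part gives the recursive version; as the slide warns, other partitioning algorithms may suit a given domain better.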
  41. Cypher
      • Declarative graph pattern matching language
        – “SQL for graphs”
        – Columnar results
      • Supports graph matching commands and queries
        – Find me stuff like this…
        – Aggregation, ordering and limit, etc.
  42. [Diagram: example purchase graph.
      Customer: Firstname: Mickey, Surname: Smith, DoB: 19781006
      Products: SKU: 5e175641, Product: Badgers Nadgers Ale; SKU: 2555f258, Product: Peewee Pilsner; SKU: 49d102bc, Product: Baby Dry Nights; SKU: 49d102bc, Product: XBox 360
      Categories: beer, nappies, baby, alcoholic drinks, consumer electronics, console
      Relationships: BOUGHT (customer to product), MEMBER_OF (product to category, category to category)]
  43. [Diagram: pattern to match. Customer: Firstname: *, Surname: *, DoB: 1996 > x > 1972; Category: beer; Category: nappies; Category: game console; relationship: BOUGHT]
  44. [Diagram: pattern to match. Customer: Firstname: *, Surname: *, DoB: 1996 > x > 1972; Category: beer; Category: nappies; Category: game console; relationship: !BOUGHT]
  45. [Diagram: the pattern's nodes in Cypher notation: (beer), (nappies), (console), (daddy), and three anonymous nodes ()]
  46. Flatten the graph
      (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies)
      (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer)
      (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)
  47. Wrap in a Cypher MATCH clause
      MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
            (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
            (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)
  48. Cypher WHERE clause
      MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
            (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
            (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)
      WHERE b is null
  49. Full Cypher query
      START beer=node:categories(category='beer'),
            nappies=node:categories(category='nappies'),
            xbox=node:products(product='xbox 360')
      MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
            (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
            (daddy)-[b?:BOUGHT]->(xbox)
      WHERE b is null
      RETURN distinct daddy
  50. Results
      ==> +---------------------------------------------+
      ==> | daddy                                       |
      ==> +---------------------------------------------+
      ==> | Node[15]{name:"Rory Williams",dob:19880121} |
      ==> +---------------------------------------------+
      ==> 1 row
      ==> 0 ms
      ==>
      neo4j-sh (0)$
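To make the logic of the beer-and-nappies query concrete without a running database, the same pattern can be emulated over plain Python dictionaries. This is a sketch of what the query asks for, not of how Neo4j evaluates it. The purchase data is hypothetical (the transcript does not say who bought what), and the function and variable names are invented for the example.

```python
# Hypothetical BOUGHT relationships: customer -> products.
BOUGHT = {
    "Rory Williams": {"Peewee Pilsner", "Baby Dry Nights"},
    "Mickey Smith": {"Badgers Nadgers Ale", "Baby Dry Nights", "XBox 360"},
    "Amy Pond": {"Baby Dry Nights"},
}

# Hypothetical MEMBER_OF relationships: product -> categories.
MEMBER_OF = {
    "Peewee Pilsner": {"beer"},
    "Badgers Nadgers Ale": {"beer"},
    "Baby Dry Nights": {"nappies"},
    "XBox 360": {"console"},
}


def categories_bought(customer):
    """Follow BOUGHT then MEMBER_OF, like (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(c)."""
    return {cat for product in BOUGHT[customer] for cat in MEMBER_OF[product]}


def daddies_without_xbox():
    """Customers who bought beer and nappies but have no BOUGHT link to the XBox 360."""
    return sorted(
        customer
        for customer in BOUGHT
        if {"beer", "nappies"} <= categories_bought(customer)
        and "XBox 360" not in BOUGHT[customer]
    )


print(daddies_without_xbox())
```

The `not in` test plays the role of the optional relationship `b?` combined with `WHERE b is null` in the slide's query: the relationship is looked for, and the match is kept only when it is absent.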
  51. Facebook Graph Search
      Which sushi restaurants in NYC do my friends like?
  52. Graph Structure
  53. Cypher query
      START me=node:person(name = 'Jim'),
            location=node:location(location = 'New York'),
            cuisine=node:cuisine(cuisine = 'Sushi')
      MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)-[:LOCATED_IN]->(location),
            (restaurant)-[:SERVES]->(cuisine)
      RETURN restaurant
  54. Search structure
  55. What are graphs good for?
      • Recommendations
      • Pharmacology
      • Business intelligence
      • Social computing
      • Geospatial
      • MDM
      • Data center management
      • Web of things
      • Genealogy
      • Time series data
      • Product catalogue
      • Web analytics
      • Scientific computing
      • Indexing your slow RDBMS
      • And much more!
  56. Free O’Reilly book for everyone!
      http://graphdatabases.com
  57. Thanks for listening
      Neo4j: http://neo4j.org
      Me: @jimwebber
