From Event to Action: Accelerate Your Decision Making with Real-Time Automation
Graph tour keynote 2019
1. Graph Tour Washington DC
#1 Database for Connected Data
Jeff Morris
Head of Product Marketing
jeff@neo4j.com
5/7/19
2. I’m still listening to a lot of graph-y books
Adjacent Possibilities, Think in Maps, Connecting with People, JPL Innovation
Uniqueness of Individuals Practice, Practice, Practice
Food
Journey
Space
Journey
Human Senses
Innovation, Startups
4. Agenda
• Great Graph Stories are here in Washington DC
• State of the Graph in 2019
• Innovation Waves
• Looking ahead at Recommendations, AI and Graphs
5. Neo4j Is Helping The World To Make Sense of Data
ICIJ used Neo4j to uncover the
world’s largest journalistic
leak to date, The Panama
Papers
NASA uses Neo4j for a
“Lessons Learned” database
to improve effectiveness in
search missions in space
Neo4j is used to graph the
human body, map correlations,
identify cause & effect and
search for the cure for cancer
SAVING
DEMOCRACY
MISSION TO
MARS
CURING CANCER
6. In-Q-Tel’s Mission Economy
• Venture Capital sponsored by
National Intelligence
• Decomposes and reassembles
technology stacks into common
“genome” vocabulary
• Matches mission problems to
technology assemblies and
vendors
• Evaluates tech across
communications, Bio tech,
robotics, software, hardware, IoT
• Faster evaluations, better
innovations
9. 2.6 TB
11.5 million documents
Emails, Scanned Documents, Bank Statements etc…
Example graph (diagram): Person A, Person B, Acme Inc, Bank US, Bank Bahamas, Account 123, Address X
NODE
RELATIONSHIP
16. Business Problem
• Find relationships between people, corporations, accounts,
shell companies and offshore accounts
• Journalists are non-technical
• 2017 Leak from Appleby tax sheltering law firm matched
13.4 million account records with public business
registrations data from across Caribbean
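As a rough illustration of the kind of question the journalists needed answered, the sketch below finds the shortest chain of relationships between two entities with a breadth-first search. The entity names echo the earlier example diagram, but the edges, relationship types and function names are assumptions made up for this sketch; it is not ICIJ's data model or a Neo4j API.

```python
from collections import deque

# Hypothetical edges; only the entity names come from the slides.
EDGES = [
    ("Person A", "OFFICER_OF", "Acme Inc"),
    ("Acme Inc", "HAS_ACCOUNT", "Account 123"),
    ("Account 123", "HELD_AT", "Bank Bahamas"),
    ("Person B", "OWNS", "Account 123"),
    ("Person B", "REGISTERED_AT", "Address X"),
]


def build_adjacency(edges):
    """Index every edge in both directions so connections can be followed either way."""
    adjacency = {}
    for source, rel_type, target in edges:
        adjacency.setdefault(source, []).append((rel_type, target))
        adjacency.setdefault(target, []).append((rel_type, source))
    return adjacency


def shortest_connection(edges, start, goal):
    """Breadth-first search; returns the list of entities on a shortest path, or None."""
    adjacency = build_adjacency(edges)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _rel_type, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_connection(EDGES, "Person A", "Person B"))
```

In a graph database the same question would typically be expressed as a path query rather than hand-written search code.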
Solution and Benefits
• Exposed tax sheltering practices of Apple, Nike
• Revealed hidden connections among politicians and nations,
like Wilbur Ross & Putin’s son-in-law
• Triggered government tax evasion investigations in US, UK,
Europe, India, Australia, Bermuda, Canada and Cayman
Islands within 2 days.
• Granted $1M endowment from Golden Globes’ HFPA
Background
• International Consortium of Investigative Journalists (ICIJ),
Pulitzer Prize winning journalists
• Fourth blockbuster investigation using Neo4j to reveal
connections in text-based, and account-based data leaked
from offshore law firms and government records about the
“1% Elite”
• Appends to the Neo4j-based “Offshore Leaks Database”
ICIJ Paradise Papers INVESTIGATIVE JOURNALISM
Fraud Detection / Knowledge Graph
19. Background
• US IT consulting firm helped US Army streamline
equipment deployments and maintenance spending
• Saving lives by improving the operational readiness
of Army equipment like tanks, radios, transports,
aircraft, weaponry, etc.
Business Problem
• Needed to modernize procurement, budget and
logistics processes for equipment & spare parts
• Millions of connections among a tank’s bill-of-
materials, for example
• Improve “what if” cost calculations when planning
missions and troop deployments
• Mainframe systems required over 60 man-hours to
calculate changes, so planning took too long.
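A “what if” cost calculation over a bill of materials can be pictured as a recursive roll-up through the parts hierarchy. The sketch below is only an illustration: the parts, quantities, costs and function names are invented for the example and do not come from the Army or Calibre systems.

```python
# Hypothetical bill of materials: assembly -> list of (component, quantity).
BOM = {
    "tank": [("engine", 1), ("track", 2), ("radio", 1)],
    "engine": [("piston", 12), ("fuel pump", 1)],
    "track": [("track link", 80)],
}

# Hypothetical unit costs for leaf parts, in arbitrary currency units.
UNIT_COST = {"piston": 150.0, "fuel pump": 900.0, "track link": 40.0, "radio": 2500.0}


def rolled_up_cost(item, bom, unit_cost):
    """Cost of one unit of `item`, summed recursively over its sub-assemblies."""
    if item not in bom:  # leaf part: use its unit cost directly
        return unit_cost[item]
    return sum(qty * rolled_up_cost(child, bom, unit_cost) for child, qty in bom[item])


def what_if(item, bom, unit_cost, price_changes):
    """Re-run the roll-up with some unit costs overridden, leaving the inputs untouched."""
    adjusted = {**unit_cost, **price_changes}
    return rolled_up_cost(item, bom, adjusted)


baseline = rolled_up_cost("tank", BOM, UNIT_COST)
scenario = what_if("tank", BOM, UNIT_COST, {"piston": 200.0})
print(baseline, scenario)
```

With millions of connections in a real bill of materials, the point of the slide is that this kind of traversal is what the graph is built to do quickly.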
Solution and Benefits
• 118M nodes & 185M relationships
• Cut cost estimation times by 88%
• Improved parts delivery timing and accuracy
• DBA labor required dropped by 77%
• Equipment TCO more predictable
• Safer soldiers
US Army / Calibre Systems Equipment Logistics
Parts Assembly & Equipment Maintenance
21. 2000+
7/10
12/25
8/10
53K+
100+
300+
450+
Adoption
Top Retail Firms
Top Financial Firms
Top Software Vendors
Customers Partners
• Creator of the Neo4j Graph Platform
• 250+ employees
• HQ in Silicon Valley, other offices include
London, Munich, Paris and Malmö Sweden
• $80M Series E led by Morgan Stanley &
One Peak.
• $160M total raised to date
• 20M+ downloads & container pulls
• 300+ enterprise subscription customers,
over half of them with >$1B in revenue
Ecosystem
Startup Program Alumni
Enterprise customers
Partners
Meetup members
Events per year
Neo4j - The Graph Company
The Industry’s Largest Dedicated Investment in Graphs
22. The Rise of Connections in Data
Data connections are increasing as rapidly as data volumes
• Networks of People: e.g., employees, customers, suppliers, partners, influencers
• Business Processes: e.g., risk management, supply chain, payments
• Knowledge Networks: e.g., enterprise content, domain-specific content, eCommerce content
• Electronic Networks: e.g., on-prem & cloud computing, cellular, telco & Internet, IoT, blockchain
23. Neo4j Invented the Labeled Property Graph Model
Example graph (diagram): two PERSON nodes and a CAR node, with MARRIED TO and LIVES WITH relationships
name: “Dan”
born: May 29, 1970
twitter: “@dan”
name: “Ann”
born: Dec 5, 1975
since: Jan 10, 2011
brand: “Volvo”
model: “V70”
Latitude: 37.5629900°
Longitude: -122.3255300°
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
• Visibility security by user/role
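To make the model concrete, here is a minimal sketch of labeled nodes, typed and directed relationships, and name/value properties as plain Python objects. The class and function names are illustrative assumptions, not Neo4j's API, and the DRIVES relationship carrying the `since` property is a guess, since the slide text does not say which relationship that property belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # labels classify the node, e.g. {"Person"}
    properties: dict = field(default_factory=dict)  # name/value pairs


@dataclass
class Relationship:
    start: Node                      # relationships are directed: start -> end
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "MARRIED_TO", ann),
    Relationship(dan, "LIVES_WITH", ann),
    # Assumption: DRIVES is a made-up type used here to show a relationship property.
    Relationship(dan, "DRIVES", car, {"since": "2011-01-10"}),
]


def outgoing(node, rel_type):
    """Nodes reached from `node` by following relationships of the given type."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "MARRIED_TO")])
```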
24. Graph Databases are Designed for Connected Data
• TRADITIONAL DATABASES: Store and retrieve data. Real-time storage & retrieval. Max # of hops: up to 3
• BIG DATA TECHNOLOGY: Aggregate and filter data. Long-running queries, aggregation & filtering. Max # of hops: 1
• GRAPH DATABASES: Connections in data. Real-Time Connected Insights. Max # of hops: Millions
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code”
Volker Pacher, Senior Developer
25. Neo4j Graph Advantage: Foundational Components
1. Index-Free Adjacency: in memory and on flash/disk
2. ACID Foundation: required for safe writes
3. Full-Stack Clustering: causal consistency
4. Language, Drivers, Tooling: developer experience, graph efficiency, type safety
5. Graph Engine: cost-based optimizer, graph statistics, Cypher runtime
6. Hardware Optimizations: for next-gen infrastructure
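The idea behind index-free adjacency can be sketched by contrasting two ways of finding a node's neighbours: following references stored on the node itself, versus searching a separate sorted edge table the way a join would. This is a conceptual illustration only; the class, function names and sample data are assumptions and say nothing about how Neo4j's storage engine is actually implemented.

```python
import bisect


class Record:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct object references, no lookup structure needed


def neighbors_via_pointers(node):
    """Index-free style: just follow the references stored on the node."""
    return [n.name for n in node.neighbors]


def neighbors_via_index(sorted_edges, name):
    """Join style: binary-search a sorted (source, target) table, then scan matches."""
    start = bisect.bisect_left(sorted_edges, (name, ""))
    result = []
    for source, target in sorted_edges[start:]:
        if source != name:
            break
        result.append(target)
    return result


dan, ann, car = Record("Dan"), Record("Ann"), Record("V70")
dan.neighbors.extend([ann, car])
ann.neighbors.append(dan)

EDGE_TABLE = sorted([("Dan", "Ann"), ("Dan", "V70"), ("Ann", "Dan")])

print(neighbors_via_pointers(dan), neighbors_via_index(EDGE_TABLE, "Dan"))
```

Both calls return the same neighbours; the difference is that the table lookup pays a search cost that grows with the total number of edges on every hop, while following references does not.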
26. Strongly Differentiated Commercial Offering
Enterprise Edition is Highly Differentiated from Community Open Source Edition
Column groups on the slide: Database Features, Architectural Features, Graph Platform Features
Feature | Community | Enterprise
Date/Time data type¹ | ✔ | ✔
3D Geospatial data types¹ | ✔ | ✔
Native String Indexes, up to 5x faster writes¹ | ✔ | ✔
100B+ Bulk Importer¹ | ✔ | Resumable¹
Enterprise Cypher Runtime, up to 70% faster | – | ✔
Hot Backups | – | 2x Faster¹
ACID Transactions | ✔ | ✔
High-performance native API | ✔ | ✔
High-performance caching | ✔ | ✔
Cost-based query optimizer | ✔ | ✔
Graph algorithms library to support AI initiatives | ✔ | ✔
Massively parallel graph algorithms | – | ✔
Query monitoring with enriched metrics | – | ✔
User and role-based security | – | ✔
LDAP and Active Directory Integration | – | ✔
Kerberos security option | – | ✔
Multi-Clustering (partition of clusters)¹ | – | ✔
Automatic Cache Warming¹ | – | ✔
Rolling Upgrades¹ | – | ✔
Resumable Copy/Restore Cluster Member | – | ✔
New diagnostic metrics and support tools¹ | – | ✔
Property Blacklisting | – | ✔
Language drivers for Java, Python, C# & JavaScript | ✔ | ✔
Bolt Binary Protocol | ✔ | ✔
RPM, Debian, Docker, Azure & AWS Cloud Delivery | ✔ | ✔
Intra-cluster encryption secures all traffic across data centers and cloud zones | – | ✔
IPv6 support in clustered deployments | – | Available
High throughput, least-connected load balancing built into Bolt drivers | – | ✔
Causal Clustering, core and read-replica design at global scale for applications, analytics workflows, HA and DR | – | ✔
Enterprise Lock Manager accesses all cores on server | – | ✔
Labeled property graph model | ✔ | ✔
Native graph processing & storage | ✔ | ✔
Cypher graph query language | ✔ | ✔
Neo4j Browser with syntax highlighting | ✔ | ✔
Fast writes via native label indexes | ✔ | ✔
Composite Indexes | ✔ | ✔
Cypher for Apache Spark (CAPS) for big data analytics | ✔ | ✔
Graph size limitations | 34B nodes | None
Auto reuse of deleted space | – | ✔
Property existence constraints | – | ✔
Cypher query tracing, monitoring and metrics | – | ✔
Node Key schema constraints | – | ✔
Neo4j Desktop: Free developer-friendly package with full database and tools | – | ✔
¹ New in Neo4j 3.4
27. Neo4j Graph Platform Vision
Platform components (diagram): Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI, openCypher, Cloud
Users (diagram): Business Users, Developers, Admins, Data Analysts, Data Scientists, Applications
28. The Neo4j Graph Platform
Platform components (diagram): Development & Administration, Analytics Tooling, Graph Analytics, Graph Transactions, Data Integration, Discovery & Visualization, Drivers & APIs, AI
Neo4j Database 3.4 & 3.5
• Full Text Search
• Native Indexes (up to 5x faster writes)
• 100B+ bulk importer
Improved Admin Experience
• Rolling upgrades
• 2x faster backups
• Cache Warming on startup
• Improved diagnostics
Morpheus for Apache Spark
• Graph analytics in the data lake
• In-memory Spark graphs from Apache Hadoop, Hive, Gremlin and Spark
• Save graphs into Neo4j
• High-speed data exchange between Neo4j & data lake
• Progressive analysis using named graphs
Graph Data Science
• High speed graph algorithms
Neo4j Bloom
• New graph illustration and communication tool for non-technical users
• Explore and edit graph
• Search-based
• Create storyboards
• Foundation for graph data discovery
• Integrated with graph platform
Multi-Cluster routing built into Bolt drivers
• Date/Time data type
• 3-D Geospatial search
• Secure, Horizontal Multi-Clustering
• Property-value Security
32. Graphs Are VERY Hungry for Data
Graphs’ appetite to connect more data accelerates the ability to find
adjacent innovations
Customer iteration cycles from 2 weeks to 3 months
34. Largest Pool of Graph Technologists
• Downloads: 20M+ (8M+ from Neo4j Distribution, 12M+ from Docker)
• Events: 400+ (approximate number of Neo4j events per year)
• Meetups: 50k+ (number of Meetup members globally)
• Trained Developers: 50k+ trained/certified Neo4j professionals, 1k certified
35. Density Drives Value In Graphs
Metcalfe’s Law of the Network (V = n²)
5 hops deliver less value
100’s of hops deliver immense VALUE
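As a small worked example of the formula, the sketch below evaluates Metcalfe's Law for a few network sizes and, for comparison, counts the distinct pairwise connections possible among n nodes. The function names are illustrative, and the "value" is in arbitrary units.

```python
def metcalfe_value(n):
    """Network value under Metcalfe's Law, V = n^2, in arbitrary units."""
    return n ** 2


def possible_connections(n):
    """Distinct pairwise connections among n nodes: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for size in (10, 20, 40):
    print(size, metcalfe_value(size), possible_connections(size))
```

Doubling the number of nodes quadruples the value under this law, which is the sense in which denser, more connected graphs are argued to be worth more.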
36. Analysts See Unique Benefits of Graphs
“Neo4j continues to dominate the graph database market.”
“69% of enterprises have, or are planning to implement graphs over next 12 months”
October, 2017
“The most widely stated reason in the survey for selecting Neo4j was to drive innovation”
February, 2018
Critical Capabilities for DBMA
“In fact, the rapid rise of Neo4j and other graph technologies may signal that data connectedness is indeed a separate paradigm from the model consolidation happening across the rest of the NoSQL landscape.”
March, 2018
“Neo4j is the clear market leader in the graph space. It has the most users, it uses a widely adopted language that is much easier than Gremlin and in many respects, it has consistently been a lot more innovative than its competitors.”
“It is the Oracle or SQL Server of the graph database world.”
March, 2019
“Our research suggests that graph databases have the best chance to survive and thrive as a distinct category (versus the other NoSQL models) because connected data applications present serious performance problems that only a specialized graph DB can solve.”
March, 2019
37. Neo4j Has a Ten Year Head Start
Native Connectedness Differentiates Neo4j
Conceive
Code
Compute
Store
Non-Native Graph DB, Native Graph DB, RDBMS
Optimized for graph workloads
38. Graph Database Vendor Landscape
NEO4J SIGNIFICANTLY OUTPACES COMPETITION IN GRAPH LEADERSHIP & INVESTMENT, TECHNOLOGY
CAPABILITY, COMMUNITY BREADTH AND PRODUCT MATURITY
Graph Pioneer & Leader
Architectures optimized for
non-graph workloads.
Not easily adaptable for
graphs. Lack “minutes to
milliseconds” performance.
Few graph-expert resources
Nascent products fall
vastly short.
Graphs as a checkbox.
Slow performance.
Playing ‘catch-up,’ requiring
years to stabilize & grow.
Aggressive posture & claims
to secure PR.
Many fail the “kill -9” test
Graph pioneer & visionary.
Largest, most active community.
More customer successes than all
other vendors combined.
Strongest technology.
Diverse roadmap: cloud, DBaaS,
Spark, Algos for AI, GQL.
40. Highly Valuable Connected Data Use Cases
Drive Enterprise Adoption
Real-Time
Recommendations
Fraud
Detection
Network &
IT Operations
Master Data
Management
Identity & Access
Management
Knowledge
Graph
42. Background
• Over 7M citizens suffer from diabetes
• Connecting over 400 researchers
• Incorporates over 50 databases, 100k’s of Excel
workbooks, 30 databases of biological samples
• Sought to examine disease from as many angles as
possible.
Business Problem
• Genes are connected by proteins or to metabolites,
and patients are connected with their diets, etc…
• Needed to improve the utilization of immensely
technical data
• Needed to cater to doctors and researchers with
simple navigation, communication and connections
of the graph.
Solution and Benefits
• Dr. Alexander Jarasch, Head of Bioinformatics and
Data Management
• Scientists can conduct parallel research without
asking the same questions or repeating tests
• Built views like a liver sample knowledge graph
DZD - German Center for Diabetes Research
Medical Genomic Research
EE Customer since 2016
Q4
44. Background
• Fortune 100 heavy equipment manufacturer
• 27 Million warranty & service documents parsed
• Foundation for AI-based supply chain management
Business Problem
• Improve maintenance predictability
• Need a knowledge base for 27 million warranty
documents and maintenance orders
• Graphs gather context for AI to identify ‘prime
examples’ of connections among parts, suppliers,
customers and their mechanics, and to anticipate when
equipment will need servicing and by whom.
Solution and Benefits
• Text to knowledge graph
• Common ontology for complaints, symptoms & parts
• Anticipates when equipment will need servicing
• Improves customer and brand satisfaction
• Maximizes lifespan and value of equipment
Caterpillar Heavy Equipment Manufacturing
Parts Assembly & Equipment Maintenance
46. Background
• Social network of 10M graphic artists
• Peer-to-peer evaluation of art and works-in-progress
• Job sourcing site for creatives
• Massive, millions of updates (reads & writes) to
Activity Feed
• 150 Mongos to 48 Cassandras to 3 Neo4j’s!
Business Problem
• Artists subscribe, appreciate and curate “galleries”
of works of their own and from other artists
• Activities Feed is how everyone receives updates
• 1st implementation was 150 MongoDB instances
• 2nd implementation shrunk to 48 Cassandras, but it
was still too slow and required heavy IT overhead
Solution and Benefits
• 3rd implementation shrunk to 3 Neo4j instances
• Saved over $500k in annual AWS fees
• Reduced data footprint from 50TB to 40GB
• Significantly easier to introduce new features like
“New projects in your Network”
Adobe Behance Social Network of 10M Graphic Artists
Social Network
EE Customer since 2016
Q4
51. Background
• Largest Cable TV & Internet Provider in US
• 3rd Largest network on the planet
• xFi is consumer experience in 3M houses
• Internet, router, devices, security, voice & telephony
• Transformational customer experience
Business Problem
• Integrate all experience in a smart home
• Create innovative ideas based on cross-platform
and household member preferences
• Add integrated value of xFinity triple-play & quad-play
services (internet, VoIP, cable TV & home security)
Solution and Benefits
• Custom content per household member
• Security reminders (kids are home, garage left open)
• Serves millions of households
• Makes content recommendations based on
occupant, time of day, permissions and preferences
• Has Siri-like voice commands
COMCAST Xfinity xFi TELECOMMUNICATIONS
Smart Home / Internet of Things
EE Customer since 2016
Q4
53. Common Graph Entities are Analog
People
Locations
Processes
Devices
Objects
Motives
• Who – People
• What – Activities & Events
• Where – Locations
• When – Time
• Why – Motives & Feelings
• How – Processes, Devices &
Networks
Activities
54. The Whiteboard Model Is the Physical Model
Ideation is an analog
activity
• Easily understood
• Easily evolved
• Easy collaboration
between business
and IT
57. Graphs Drive Innovation
Context Paths
Auto-Graphs
Graph Layers
1st Order Graph
Cross-Connect
Cross-tech applications
Internet of Things operations
Transparent Neural
Networks
Blockchain-managed
systems
Adjacent graph layers inspire
new innovations
Metadata / Risk Management
Knowledge Graphs
AI- Powered Customer
Experiences
Connect unlike objects such
as people to products,
locations
Mobile app explosion
Recommendation engines
Fraud detectors
Desire for more context to
follow connections
Extract properties during
traversals
Connects like objects
People, computer networks,
telco, etc
58. Cypher: Powerful and Expressive Query Language
MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse)
RETURN spouse
MARRIED_TO
Dan Ann
NODE RELATIONSHIP TYPE
LABEL PROPERTY VARIABLE
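For readers new to Cypher, the following sketch spells out in plain Python what the MATCH pattern above asks for: start from nodes with the Person label and the property name "Dan", follow outgoing MARRIED_TO relationships, and bind whatever is at the other end to `spouse`. The data layout and the `match_outgoing` helper are assumptions made for illustration; a real query would be sent to the database rather than evaluated like this.

```python
# Hypothetical in-memory data mirroring the Dan/Ann example.
NODES = {
    1: {"labels": {"Person"}, "props": {"name": "Dan"}},
    2: {"labels": {"Person"}, "props": {"name": "Ann"}},
}
RELS = [(1, "MARRIED_TO", 2)]  # (start id, type, end id); direction matters


def match_outgoing(label, props, rel_type):
    """Yield end nodes for the pattern (:label {props})-[:rel_type]->(end)."""
    for start_id, r_type, end_id in RELS:
        start = NODES[start_id]
        if (
            r_type == rel_type
            and label in start["labels"]
            and all(start["props"].get(key) == value for key, value in props.items())
        ):
            yield NODES[end_id]


spouses = [n["props"]["name"] for n in match_outgoing("Person", {"name": "Dan"}, "MARRIED_TO")]
print(spouses)
```

Because the relationship is stored in one direction only, running the same pattern starting from Ann returns nothing, which mirrors the arrow in the Cypher pattern.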
59. The GQL Manifesto: https://gql.today/
• Introduced in May 2018: https://gql.today/
• An initiative to immediately
rally support for a unified
Graph Query Language
• Standards meetings are ongoing
• All community members
are encouraged to Vote
their support at
https://gql.today/#vote
63. Graph & ML Algorithms in Neo4j
+35
neo4j.com/graph-algorithms-book/
• Pathfinding & Search: finds optimal paths or evaluates route availability and quality
• Centrality / Importance: determines the importance of distinct nodes in the network
• Community Detection: detects group clustering or partition options
• Similarity: evaluates how alike nodes are
• Link Prediction: estimates the likelihood of nodes forming a future relationship
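To give a feel for the Centrality / Importance category, here is a minimal PageRank sketch using power iteration in plain Python. It is a textbook-style illustration under simple assumptions (a damping factor of 0.85, a fixed iteration count, and a made-up four-node graph), not the implementation in the Neo4j graph algorithms library.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank. `graph` maps each node to the nodes it links to;
    every link target must also appear as a key."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share from random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical graph: three nodes point at "hub", and "hub" points back at "a".
GRAPH = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(GRAPH)
print(sorted(ranks, key=ranks.get, reverse=True))
```

The node that many others point to ends up with the highest score, and the node it points to inherits much of that importance.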
64. Graph and ML Algorithms in Neo4j
Updated April 2019
neo4j.com/docs/graph-algorithms/current/
Pathfinding & Search
• Parallel Breadth First Search & DFS
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev, Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
Similarity
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbors
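Three of the measures listed under Similarity and Link Prediction are simple enough to sketch directly. The code below computes Jaccard Similarity, Common Neighbors and Adamic Adar over neighbour sets in plain Python; the small friendship graph and the function names are assumptions for illustration and are not the library's procedures.

```python
import math

# Hypothetical undirected graph stored as neighbour sets.
NEIGHBORS = {
    "ann": {"dan", "eve", "bob"},
    "dan": {"ann", "eve"},
    "eve": {"ann", "dan", "bob"},
    "bob": {"ann", "eve"},
}


def jaccard(a, b):
    """Size of the shared neighbourhood divided by the size of the combined one."""
    union = NEIGHBORS[a] | NEIGHBORS[b]
    return len(NEIGHBORS[a] & NEIGHBORS[b]) / len(union) if union else 0.0


def common_neighbors(a, b):
    """Number of nodes adjacent to both a and b."""
    return len(NEIGHBORS[a] & NEIGHBORS[b])


def adamic_adar(a, b):
    """Sum of 1 / log(degree) over common neighbours; rarer connectors count more."""
    return sum(1.0 / math.log(len(NEIGHBORS[z])) for z in NEIGHBORS[a] & NEIGHBORS[b])


# "dan" and "bob" are not connected yet, but share two neighbours.
print(jaccard("dan", "bob"), common_neighbors("dan", "bob"), adamic_adar("dan", "bob"))
```

High scores for a pair that is not yet connected are what a link-prediction step would read as a likely future relationship.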
65. Graph Analytics:
SparkCypher & Morpheus
Objective: Draw new users from the Spark ecosystem
to graphs & Neo4j
(Also bolsters Cypher as the de-facto query language)
71. Four Pillars of Graph-Enhanced AI
1. Knowledge Graphs: Context for Decisions
2. Connected Feature Extraction: Context for Accuracy
3. Graph-Accelerated AI: Context for Efficiency
4. AI Explainability: Context for Credibility
73. Data Network Effect
“A product, generally powered by machine learning, becomes smarter
as it gets more data from your users. The more users use your product,
the more data they contribute; the more data they contribute, the
smarter your product becomes.”
— Matt Turck