2. Value from Data Relationships
Common Graph Database Use Cases
Internal Applications
Master Data Management
Network and IT Operations
Fraud Detection
Customer-Facing Applications
Real-Time Recommendations
Graph-Based Search
Identity and Access Management
4. MDM Solutions with Graph Databases
[Diagram: master data domains drawn as graphs. Panel labels: Organizational Hierarchy, Product Subscriptions, CMDB, Network Inventory, Social Networks. Node labels: VP, Director, Manager, Staff; User, Customers, Accounts, Subscriptions; Fiber Link, Ocean Cable, Switch, Router, Service. Relationship labels: USER_ACCESS, CONTROLLED_BY, SUBSCRIBED_BY.]
5. MDM Isn’t Hierarchical
Typical MDM system structure …but MDM is really a network
[Diagram: the same entities (Patient, Agent, G.P., Surgeon, Partner, Insurance) drawn first as a hierarchy and then as a network.]
6. Challenges with Current MDM Systems
Lack of support for non-hierarchical or matrix data relationships
• Master data is never strictly hierarchical
• Systems are designed for fixed top-down hierarchy
• Non-hierarchical data is not supported
Inability to unlock value from data relationships
• Systems store only very simple data relationships
• Complex relationships and links not stored
Inflexible and expensive to maintain
• Changes to the model are expensive and time-consuming
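The first point above can be made concrete with a minimal Python sketch. A strict top-down hierarchy allows each record exactly one parent, so any record with several parents breaks the model. The `links` data and the function name are hypothetical, loosely based on the patient example from the previous slide, and are not taken from any real MDM system.

```python
from collections import defaultdict

# Hypothetical (child, parent) links; invented for illustration only.
links = [
    ("patient", "g.p."),
    ("patient", "surgeon"),
    ("patient", "insurance"),
    ("agent", "insurance"),
    ("surgeon", "partner"),
]

def nodes_with_multiple_parents(links):
    """A strict hierarchy allows one parent per node; report the violations."""
    parents = defaultdict(set)
    for child, parent in links:
        parents[child].add(parent)
    return {child: ps for child, ps in parents.items() if len(ps) > 1}

violations = nodes_with_multiple_parents(links)
```

Here `violations` contains the patient record, which relates to three other entities at once and so cannot sit in a single tree position.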
7. die Bayerische – Master Data Management
• Field sales unit needed easy access to policies and customer data in a variety of ways
• Growing business needed growing support
• Existing IBM DB2 system unable to meet performance requirements as it scaled
• Needed 24/7 system for sales unit outside the company
Mid-size German insurer
Founded in 1858
More than 500 employees
Project executed by Delvin GmbH, subsidiary of die Bayerische Versicherung
8. die Bayerische SOLUTION
• Enables field sales unit to flexibly search for insurance policies and personal data
• Raises the bar for insurance industry practices
• Supports the business as it scales, with great performance
• Ported metadata into Neo4j easily
9. Classmates – Social network
Online yearbook connecting friends from school, work and military in US and Canada
Founded as Memory Lane in Seattle
Develop new social networking capabilities to monetize yearbook-related offerings
• Show all the people I know in a yearbook
• Show yearbooks my friends appear in most often
• Show sections of a yearbook that my friends appear in most
• Show me other schools my friends attended
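The second query above ("yearbooks my friends appear in most often") amounts to a two-hop traversal followed by a count. The sketch below shows that shape in plain Python. The `friends` and `appears_in` dictionaries are hypothetical stand-ins for friendship and appearance relationships; they are not Classmates' actual data model.

```python
from collections import Counter

# Hypothetical toy data: person -> friends, person -> yearbooks they appear in.
friends = {"ann": {"bob", "cat", "dan"}}
appears_in = {
    "bob": {"Lincoln High 1990", "Lincoln High 1991"},
    "cat": {"Lincoln High 1990"},
    "dan": {"Roosevelt High 1990"},
}

def yearbooks_by_friend_count(person):
    """Rank yearbooks by how many of `person`'s friends appear in them."""
    counts = Counter()
    for friend in friends.get(person, ()):
        counts.update(appears_in.get(friend, ()))
    return counts.most_common()

top = yearbooks_by_friend_count("ann")
```

In a graph database the same question would be a single pattern match from the person through friends to yearbooks, rather than joins across several tables.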
10. Classmates SOLUTION
Neo4j provides a robust and scalable graph database solution
• 3-instance cluster with cache sharding and disaster-recovery
• 18ms response time for top 4 queries
• 100M nodes and 600M relationships in initial graph, including people, images, schools, yearbooks and pages
• Projected to grow to 1B nodes and 6B relationships
12. Network Graphs – Telco Example
PROBLEM
Need: Instantly diagnose problems in networks of 1B+ elements
But: Basing diagnosis solely on streaming machine data severely limits accuracy and effectiveness
SOLUTION
Real-time graph analytics provide actionable insight for the largest complex connected networks in the world
• The entire network lives in a graph
• Analyzes dependencies in real time
• Highly scalable with carrier-grade uptime requirements
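Analyzing dependencies, as described above, can be pictured as a walk over the network graph from a failed element to everything that relies on it. The following is a minimal Python sketch of that idea. The `depended_on_by` map and the element names are hypothetical, and a production system would run the traversal inside the graph database rather than in application code.

```python
from collections import deque

# Hypothetical dependency edges: each key is depended on by the listed elements.
depended_on_by = {
    "router-1": ["switch-1", "switch-2"],
    "switch-1": ["service-a"],
    "switch-2": ["service-a", "service-b"],
    "service-a": ["customer-x"],
    "service-b": ["customer-y"],
}

def impacted_by(failed):
    """Breadth-first walk up the dependency graph from a failed element."""
    seen, queue = set(), deque([failed])
    while queue:
        node = queue.popleft()
        for dependant in depended_on_by.get(node, ()):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen

impact = impacted_by("switch-2")
```

For the toy data, a failure of `switch-2` reaches both services and both customers, while `switch-1` is unaffected.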
14. Fraud Scenarios
Retail First Party Fraud
• Opening many lines of credit with no intention of paying back
• Accounts for $10B+ in annual losses at US banks(1)
Synthetic Identities and Fraud Rings
• Rings of synthetic identities committing fraud
Insurance – Whiplash for Cash
• Insurance scams using fake drivers, passengers and witnesses
eCommerce Fraud
• Online payment fraud
(1) Business Insider: http://www.businessinsider.com/how-to-use-social-networks-in-the-fight-against-first-party-fraud-2011-3
17. Doing Connected Analysis is Challenging
• Large amounts of data and relationships must be processed
• New data and relationships are continually being added
• Fraud rings must be uncovered in real-time to prevent fraud
18. Connected Analysis with Neo4j
Value
Effective in detecting some of the most impactful attacks, even from organized rings
Challenge
Extremely difficult with traditional technologies
For example, a ten-person fraud bust-out is $1.5M, assuming 100 false identities and 3 financial instruments per identity, each with a $5K credit limit
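The $1.5M bust-out figure on this slide follows directly from the three stated assumptions. The short Python check below uses only the numbers from the slide; the variable names are illustrative.

```python
# Figures taken from the slide; only the variable names are made up.
false_identities = 100
instruments_per_identity = 3
credit_limit_usd = 5_000

# Total exposure if every instrument is drawn to its limit.
exposure_usd = false_identities * instruments_per_identity * credit_limit_usd
print(f"${exposure_usd:,}")  # $1,500,000
```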
19. Modeling a Fraud Ring as a Graph
[Diagram: three account holders (Account Holder 1, 2, 3) connected to identifier nodes (SSN 2, Phone Number 2, Address 1) and to financial instruments (Bank Account, Credit Card, Unsecured Loan).]
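One way to read the fraud ring model is that account holders who share identifiers such as an SSN, phone number or address form a connected group. The Python sketch below finds such groups as connected components. The records, the identifier strings and the function name are all hypothetical, and this is an illustration of the idea rather than the method used by any particular product.

```python
from collections import defaultdict

# Hypothetical records: account holder -> identifiers they registered.
holders = {
    "holder-1": {"ssn:111", "phone:555-0100", "addr:1 Main St"},
    "holder-2": {"ssn:222", "phone:555-0100"},
    "holder-3": {"ssn:222", "addr:1 Main St"},
    "holder-4": {"ssn:444", "phone:555-0199"},
}

def candidate_rings(holders, min_size=2):
    """Group holders connected through any chain of shared identifiers."""
    by_identifier = defaultdict(set)
    for holder, identifiers in holders.items():
        for identifier in identifiers:
            by_identifier[identifier].add(holder)

    rings, seen = [], set()
    for start in holders:
        if start in seen:
            continue
        ring, stack = set(), [start]
        while stack:
            holder = stack.pop()
            if holder in ring:
                continue
            ring.add(holder)
            # Follow every identifier to the other holders who share it.
            for identifier in holders[holder]:
                stack.extend(by_identifier[identifier] - ring)
        seen |= ring
        if len(ring) >= min_size:
            rings.append(ring)
    return rings

rings = candidate_rings(holders)
```

With the toy data, holders 1 to 3 come out as one candidate ring because they are linked through a shared phone number, SSN and address, while holder 4 shares nothing and is left out. A shared identifier is only a signal for review, not proof of fraud.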
20. View of fraud ring in a graph database
Modeling Insurance Fraud as a Graph
[Diagram: Accident 1 and 2, Person 1 to 6 and Car 1 to 4, connected by INVOLVES, DRIVES, REPRESENTS, WITNESSES, ADJUSTS and HEALS relationships.]
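A simple signal that this kind of model makes easy to ask for is people who turn up in more than one accident, possibly in different roles. The Python sketch below illustrates the query on invented data. The `involvements` list and its role names are hypothetical and only loosely mirror the relationship types in the diagram.

```python
from collections import defaultdict

# Hypothetical (person, role, accident) edges; roles are illustrative.
involvements = [
    ("person-1", "driver", "accident-1"),
    ("person-2", "passenger", "accident-1"),
    ("person-3", "witness", "accident-1"),
    ("person-1", "witness", "accident-2"),
    ("person-2", "driver", "accident-2"),
    ("person-4", "passenger", "accident-2"),
]

def repeat_participants(edges):
    """Return people who appear in more than one accident, with their roles."""
    history = defaultdict(dict)
    for person, role, accident in edges:
        history[person][accident] = role
    return {p: roles for p, roles in history.items() if len(roles) > 1}

suspects = repeat_participants(involvements)
```

In the toy data, two people swap between driver, passenger and witness roles across the two accidents, which is the pattern the "whiplash for cash" scenario describes.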
21. Gartner’s Layered Fraud Prevention Approach (4)
(4) http://www.gartner.com/newsroom/id/1695014
Traditional Fraud Prevention: DISCRETE DATA ANALYSIS
• Layer 1, Endpoint-Centric: Analysis of users and their endpoints
• Layer 2, Navigation-Centric: Analysis of navigation behavior and suspect patterns
• Layer 3, Account-Centric: Analysis of anomaly behavior by channel
• Layer 4, Cross-Channel: Analysis of anomaly behavior correlated across channels
CONNECTED ANALYSIS
• Layer 5, Entity Linking: Analysis of relationships to detect organized crime and collusion
23. Real-Time Recommendations - Benefits
Online Retail
• Suggest related products and services
• Increase revenue and engagement
Media and Broadcasting
• Create an engaging experience
• Produce personalized content and offers
Logistics
• Recommend optimal routes
• Increase network efficiency
24. Real-Time Recommendations - Challenges
Make effective real-time recommendations
• Timing is everything in point-of-touch applications
• Base recommendations on current data, not last night’s batch load
Process large amounts of data and relationships for context
• Relevance is king: Make the right connections
• Drive traffic: Get users to do more with your application
Accommodate new data and relationships continuously
• Systems get richer with new data and relationships
• Recommendations become more relevant
25. Using Data Relationships for Recommendations
Collaborative filtering
Predict what users like based on the similarity of their behaviors, activities and preferences to those of others
Content-based filtering
Recommend items based on what users have liked in the past
[Diagram: two Person nodes connected to a Movie node.]
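Collaborative filtering, as described above, can be sketched as a short traversal: from a person to the movies they like, on to other people who like the same movies, and from there to movies the first person has not seen. The Python example below is a minimal illustration with invented data. The `likes` map and the overlap-based scoring are assumptions, not a description of any production recommender.

```python
from collections import Counter

# Hypothetical person -> liked movies data.
likes = {
    "alice": {"Alien", "Blade Runner", "Arrival"},
    "bob": {"Alien", "Blade Runner", "Dune"},
    "carol": {"Alien", "Dune", "Solaris"},
    "dave": {"Amelie"},
}

def recommend(person, likes):
    """Score unseen movies by overlap with people who share this person's tastes."""
    mine = likes[person]
    scores = Counter()
    for other, theirs in likes.items():
        if other == person:
            continue
        overlap = len(mine & theirs)  # number of shared likes
        for movie in theirs - mine:
            scores[movie] += overlap
    return [movie for movie, score in scores.most_common() if score > 0]

picks = recommend("alice", likes)
```

Movies liked only by people with no taste in common score zero and are dropped, so `dave`'s pick never reaches `alice`.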
26. Walmart – Retail Recommendations
World’s largest company by revenue
World’s largest retailer and private employer
SF-based global e-commerce division manages several websites
Founded in 1969
Bentonville, Arkansas
• Needed online customer recommendations to keep pace with competition
• Data connections provided predictive context, but were not in a usable format
• Solution had to serve many millions of customers and products while maintaining superior scalability and performance
27. Walmart SOLUTION
• Brings customers, preferences, purchases, products and locations into a graph model
• Uses data relationships to make product recommendations
• Solution deployed across Walmart divisions and websites
GRAPHS ARE EATING RETAIL
THE PROBLEM
COMPETITIVE PRESSURE DEMANDS ONLINE RECOMMENDATIONS.
CONNECTIONS HOLD PREDICTIVE CONTEXT
CONNECTIONS IN THE DATA NOT IN A USABLE FORMAT
THE SOLUTION
BRING THE DATA INTO A GRAPH SO THAT THE CONNECTIONS CAN BE USED TO MAKE PRODUCT RECOMMENDATIONS.
[Diagram: CUSTOMERS, ORDERS, PRODUCT, CATEGORY]
OTHER EXAMPLES
28. eBay – Real-time routing recommendations
C2C and B2C retail network
Full e-commerce functionality for individuals and businesses
Integrated with logistics vendors for product deliveries
• Needed an offering to compete with Amazon Prime and Google Express
• Enable customer-selected delivery inside 90 minutes
• Calculate best route option in real-time
• Scale to enable a variety of services
• Offer more predictable delivery times
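Calculating the best route, as required above, is a classic weighted shortest-path problem. The sketch below uses Dijkstra's algorithm in plain Python on a tiny invented road graph. The `roads` data, the node names and the choice of travel minutes as the weight are all assumptions; the source does not say which algorithm or cost model eBay used.

```python
import heapq

# Hypothetical road graph: node -> {neighbour: travel minutes}.
roads = {
    "store": {"a": 10, "b": 25},
    "a": {"b": 10, "customer": 40},
    "b": {"customer": 15},
    "customer": {},
}

def fastest_route(graph, source, target):
    """Dijkstra's algorithm; returns (total_minutes, path) or None."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == target:
            return minutes, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, cost in graph[node].items():
            if neighbour not in settled:
                heapq.heappush(queue, (minutes + cost, neighbour, path + [neighbour]))
    return None

route = fastest_route(roads, "store", "customer")
```

On the toy graph the fastest route goes through both intermediate stops (35 minutes) rather than taking the apparently more direct road (40 minutes).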
29. eBay Now SOLUTION
• Acquired UK-based Shutl, a leader in same-day delivery
• Used Neo4j to create eBay Now
• 1000 times faster than the prior MySQL-based solution
• Faster time-to-market
• Improved code quality with 10 to 100 times less query code
31. Curaspan – Graph-based Search
Leader in patient management for discharges and referrals
Manages patient referrals
4600+ health care facilities
Connects providers, payers via web-based patient management platform
Founded in 1999 in Newton, Massachusetts
• Improve poor performance of Oracle solution
• Support more complexity including granular, role-based access control
• Satisfy complex Graph Search queries by discharge nurses and intake coordinators
Find a skilled nursing facility within n miles of a given location, belonging to health care group XYZ, offering speech therapy and cardiac care, and optionally Italian language services
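To make the criteria in the example query concrete, here is a plain-Python filter over a few invented facility records. It is not a graph query and not Curaspan's implementation. The record fields, the coordinates and the `search` signature are hypothetical, and distance is computed with the standard haversine formula.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical facility records; field names are illustrative only.
facilities = [
    {"name": "Oak Grove", "group": "XYZ", "lat": 42.34, "lon": -71.21,
     "services": {"speech therapy", "cardiac care"}, "languages": {"English", "Italian"}},
    {"name": "Elm Court", "group": "XYZ", "lat": 42.36, "lon": -71.06,
     "services": {"speech therapy"}, "languages": {"English"}},
    {"name": "Far Pines", "group": "XYZ", "lat": 40.71, "lon": -74.00,
     "services": {"speech therapy", "cardiac care"}, "languages": {"English"}},
]

def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))

def search(lat, lon, max_miles, group, services, language=None):
    """Return names of facilities matching every required criterion."""
    return [
        f["name"] for f in facilities
        if f["group"] == group
        and services <= f["services"]  # all requested services offered
        and (language is None or language in f["languages"])
        and miles_between(lat, lon, f["lat"], f["lon"]) <= max_miles
    ]

matches = search(42.33, -71.20, 25, "XYZ", {"speech therapy", "cardiac care"})
```

The optional language requirement is modeled as a parameter that defaults to `None`, so the same call covers both the mandatory and the "optionally Italian" forms of the query.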
32. Curaspan SOLUTION
• Met fast, real-time performance demands
• Supported queries that span multiple hierarchies including provider and employee-permissions graphs
• Improved data model to handle adding more dimensions to the data such as insurance networks, service areas and care organizations
• Greatly simplified queries, reducing multi-page SQL statements to one Neo4j function
34. Telenor – Identity & Access Management
Oslo-based Telco
#1 in Nordic countries
#10 in world
Mission-critical system
Availability and responsiveness critical to customer satisfaction
Millions of plans, customers, admins, groups
• Highly interconnected data set with massive joins
Degrading relational performance
• Login took minutes to retrieve access rights
Nightly batch workaround
• Solved performance problem, but meant data was not current
Replace slow Sybase system
• Batch workaround reached 9 hours in 2014, longer than the nightly batch window
35. Telenor SOLUTION
• Modeling resource graph was straightforward, as the domain is a graph
• Moved authorization from Sybase to Neo4j
• Retired faulty nightly batch process
• Moved real-time response to milliseconds
• Showed fresh data, not yesterday’s snapshot
• Addressed customer retention risks
• Kept business running through aggressive data growth
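Resolving access rights over a resource graph, as in the solution above, comes down to following membership links and collecting what each level grants. The Python sketch below shows that traversal on invented data. The `member_of` and `grants` maps and all names in them are hypothetical and do not reflect Telenor's actual model.

```python
# Hypothetical membership and grant edges; not Telenor's actual model.
member_of = {
    "admin-1": ["group-sales"],
    "group-sales": ["group-emea"],
    "group-emea": [],
}
grants = {
    "group-sales": {"plan-100"},
    "group-emea": {"plan-200", "plan-300"},
}

def access_rights(principal):
    """Collect resources granted directly or through any ancestor group."""
    rights, stack, seen = set(), [principal], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        rights |= grants.get(node, set())
        stack.extend(member_of.get(node, ()))
    return rights

rights = access_rights("admin-1")
```

Because the walk only touches the nodes reachable from one principal, the cost depends on that principal's neighbourhood rather than on the size of the whole data set, which is the property that makes a live lookup feasible where a nightly batch was needed before.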
36. Value from Data Relationships
Common Graph Database Use Cases
Internal Applications
Master Data Management
Network and IT Operations
Fraud Detection
Customer-Facing Applications
Real-Time Recommendations
Graph-Based Search
Identity and Access Management
Scale: Neo4j can handle 34B nodes and 34B relationships
Top Uses:
Impact Analysis (e.g. Servers to Services to Users)
Root Cause Analysis
Network Design
Network Security Analysis
Top Queries:
Trace dependencies up from servers all the way to applications and users
Trace dependencies across virtual and physical layers of infrastructure
Identify routes & alternate paths between various points in the network
Find the best, shortest or least busy path, or the best location in the network to introduce a new service
Fraudsters have gotten smart: to pull off a large scam or theft, they coordinate multiple bits of activity within the shaded area.
This is the kind of analysis that needs to be done.
Challenges: very difficult to model and carry out, and even then it can be done only after the fact; almost impossible in real time.
Beyond this example, there are many other ways to detect fraud. By understanding the user across multiple channels of business, you can avoid being gamed by the customer.
Need to include all approaches to catch both rookies and experienced fraudsters.
Can do one or both, but able to do more: jump up category trees, etc.
There is valuable predictive information in understanding what people bought, but predicting what they are likely to buy required adopting a graph database: the data was in tables, which could not support rich queries for recommendations.
Slowest query on MySQL took longer than their fastest delivery
“We run our business on 7 lines of Cypher” – Volker
Different roles use the tool, and different roles are able to see different things.
Need a smart search, not just a keyword search: modeling the data according to its natural structure and then exposing it for search gives you enormous power when searching.