This webinar will discuss using graph-enhanced machine learning and AI to thwart fraud. On February 6th, Scott Heath from Expero and Amy Hodler from Neo4j will discuss how graph databases can be used to identify patterns and relationships in complex transactional data to detect fraud. The webinar is part of a series that will also cover building intelligent fraud prevention systems using machine learning and graphs, and obtaining funding for graph-enhanced fraud solutions.
2. Who We Are
SCOTT HEATH
Graph Practice Manager, Expero
Scott.Heath@experoinc.com
@experoinc
AMY HODLER
Analytics Program Manager, Neo4j
Amy.Hodler@neo4j.com
@amyhodler
3. 500+
7/10
12/25
8/10
53K+
100+
250+
450+
Adoption
Top Retail Firms
Top Financial Firms
Top Software Vendors
Customers Partners
• Creator of Neo4j Graph Platform
• ~200 employees
• HQ in Silicon Valley, other offices
include London, Munich, Paris and
Malmö
• $80M in funding from Fidelity,
Sunstone, Conor, Creandum, and
Greenbridge Capital
• 10M+ downloads
• 250+ enterprise subscription customers,
over 50% of them with $1B or more in
revenue
Neo4j - The Graph Company
Ecosystem
Startups in program
Enterprise customers
Partners
Meetup members
Events per year
Industry’s Largest Dedicated Investment in Graphs
5. NEO4J + EXPERO = COMPLETE ENTERPRISE SYSTEMS
Set of methods, tools
& protocols to build
software applications
UI and Visualization
enabling users to
perform self-service
Application Layers
- Micro services
- REST Server
End User
Open Source, COTS
& Custom
- React, Angular
- Keylines, Linkurious
- D3
Full Applications:
• Custom Industry function
• Dashboard
• Reporting
• Visualize Data
Structured/Unstructured data
Extract Source Data
Full Enterprise & Standardized Data
Extract & transform
source data to meet
mission needs, load data
into unified database
Open Source
& COTS
Resolve & persist
data; include multiple
software & hardware
elements
Legacy + Custom +
Industry Data and
Platforms
Source Data
- Legacy
- RDBMS
- Analytics
- Data Lakes
- Data Marts
SOURCE DATA
EXTRACT, TRANSFORM &
LOAD (ETL)
DATA & MIXED
DATA MODEL
GRAPH DATA
& PLATFORM
An entity-centric, schemaless,
and self-describing
information management
system
APPLICATION LAYERS USER INTERFACE (UI)
APIs
PRESENTATION
LOGIC
DATA
Source Apps
- SFDC
- SAP
- Oracle
6. Join Us - Webinar Series (Save the Dates!)
Thwart Fraud Using
Graph-Enhanced
ML & AI
You Are Here
Build Intelligent Fraud
Prevention with
ML and Graphs
Overview
Technical Aspects
Understand
Business Impact
Feb 13
9:00 PST / 12:00 EST
Lock Down Funding for
Graph-Enhanced
Fraud Solutions
Get
Funding
Feb 20
9:00 PST / 12:00 EST
8. Neo4j — Changing the World
ICIJ used Neo4j to uncover the world’s
largest journalistic leak to date, the Panama
Papers, exposing criminals, corruption and
extensive tax evasion.
The US space agency uses Neo4j for its
“Lessons Learned” database to connect
information and improve search
effectiveness in space missions.
eBay uses Neo4j to enable machine
learning through knowledge graphs
powering “conversational commerce”
Product Recommendations / Fraud Detection / Knowledge Graphs
9. Harnessing Connections Drives Business Value
Enhanced Decision
Making
Hyper
Personalization
Massive Data
Integration
Data Driven Discovery
& Innovation
Product Recommendations
Personalized Health Care
Media and Advertising
Fraud Prevention
Network Analysis
Law Enforcement
Drug Discovery
Intelligence and Crime Detection
Product & Process Innovation
360 view of customer
Compliance
Optimize Operations
Connected Data at the Center
AI & Machine
Learning
Price optimization
Product Recommendations
Resource allocation
Digital Transformation Megatrends
13. • Money Laundering
• Credit Card
• Check
• Identity Theft
• Combinations
• With nuances in each industry
• Insurance, retail, telecom...
Many Faces of Fraud
But there are commonalities
• ‘Smurfing’
• Transactions
• Actors
• Locations
• Devices
Which means there are common traits,
data, and patterns (or anti-patterns!)
that can be analyzed!
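As one illustration of a shared pattern, 'smurfing' (splitting money into deposits that each stay under a reporting limit) can be sketched as a sliding-window check per actor. This is a minimal sketch, not the presenters' method: the threshold, the window, the account ids, and the function name `find_smurfing` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed values for illustration only; the slides do not specify them.
REPORTING_THRESHOLD = 10_000   # hypothetical per-transaction reporting limit
WINDOW = timedelta(days=3)     # hypothetical look-back window


def find_smurfing(transactions):
    """Return actors whose sub-threshold deposits inside WINDOW add up past the threshold."""
    by_actor = defaultdict(list)
    for actor, when, amount in transactions:
        if amount < REPORTING_THRESHOLD:      # only deposits that dodge the limit
            by_actor[actor].append((when, amount))

    flagged = set()
    for actor, deposits in by_actor.items():
        deposits.sort()                       # oldest first
        start, running = 0, 0
        for when, amount in deposits:
            running += amount
            # Drop deposits that fell out of the look-back window.
            while when - deposits[start][0] > WINDOW:
                running -= deposits[start][1]
                start += 1
            if running >= REPORTING_THRESHOLD:
                flagged.add(actor)
                break
    return flagged


# Hypothetical sample data: (actor, timestamp, amount)
transactions = [
    ("acct-1", datetime(2019, 2, 1), 9_500),
    ("acct-1", datetime(2019, 2, 2), 9_500),
    ("acct-2", datetime(2019, 2, 1), 4_000),
    ("acct-2", datetime(2019, 3, 1), 9_000),
    ("acct-3", datetime(2019, 2, 1), 25_000),
]
print(find_smurfing(transactions))
```

Only `acct-1` is flagged here: its two deposits are each below the limit but exceed it together within the window. A production rule would also need the actor, location, and device links listed above, which is where a graph model helps.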
14. The Graph Advantage
• Pattern matching
• Relationship & association analysis
• Real-time monitoring and decisions
• Reflexive to dynamic changes
18. “There is No Network in Nature that
we know of that would be described by
the Random network model.”
- Albert-László Barabási
19. Small-World
High local clustering
and short average
path lengths. Hub and
spoke architecture.
Scale-Free
Hub and spoke
architecture
preserved at multiple
scales. High power law
distribution.
Random
Average distributions.
No structure or
hierarchical patterns.
20. Averages Approach on Structured Data?
[Figure: two degree-distribution plots. x-axis: Number of Links (k); y-axis: Nodes with k Links]
Average Distribution
- Random -
Most nodes have the same number of links
No highly connected nodes
Power Law Distribution
- Scale-Free -
Many nodes with only a few links
A few hubs with a large number of links
Source: Network Science - Barabási
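The contrast between the two distributions can be reproduced with a small simulation. The sketch below grows one graph by linking pairs uniformly at random and another by preferential attachment (new nodes favor already well-linked nodes), then compares mean and maximum degree. The graph sizes, the seed, and the function names are assumptions for illustration; only the standard library is used.

```python
import random


def random_graph_degrees(n, mean_degree, rng):
    """Random model: every pair of nodes is linked with the same probability."""
    p = mean_degree / (n - 1)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree


def preferential_attachment_degrees(n, m, rng):
    """Scale-free style growth: each new node adds m links, favoring high-degree nodes."""
    degree = [0] * n
    endpoints = []  # every node appears once per link it holds
    # Start from a fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            degree[i] += 1
            degree[j] += 1
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Picking from the endpoint list makes the choice proportional to degree.
            targets.add(rng.choice(endpoints))
        for t in targets:
            degree[new] += 1
            degree[t] += 1
            endpoints += [new, t]
    return degree


rng = random.Random(42)  # fixed seed so runs are repeatable
n = 1000
random_deg = random_graph_degrees(n, mean_degree=4, rng=rng)
scale_free_deg = preferential_attachment_degrees(n, m=2, rng=rng)

for name, deg in (("random", random_deg), ("scale-free", scale_free_deg)):
    print(name, "mean degree:", sum(deg) / n, "max degree:", max(deg))
```

Both graphs have a mean degree near 4, yet the preferential-attachment graph contains hubs far above that average. That is the point of the slide: an average alone says little about a network with hubs.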
21. Averages Approach on Structured Data?
[Figure: degree-distribution plot. x-axis: Number of Links (k); y-axis: Nodes with k Links]
Average Distribution
- Random -
Most nodes have the same number of links
No highly connected nodes
You’ll Also Miss the Structure
Hidden in Your Networks
- Scale-Free -
- Small World -
Art: Ulysses and the Sirens – Herbert James Draper
23. On Stage
Behind the Scene
Organizations
Multi-related
Processes
Knowledge
Business
Processes
Data
Structure
24. Structures Can Hide
Source: “Communities, modules and large-scale structure in networks” - Mark Newman
Source: “Hierarchical structure and the prediction of missing links in networks”; “Structure and inference in annotated networks” - A. Clauset, C. Moore, and M.E.J. Newman.
28. LOGICAL FLOW: SQL → Customer Applications
SOURCE DATA DATA MAPPING - SOURCE MAP ENTITY RESOLUTION (ER)
DATA, SEARCH
& ANALYTICS PLATFORM
APPLICATION
PROGRAMMING
INTERFACES (API)
USER INTERFACE (UI)
PRESENTATION
LOGIC
DATA
PERSON
Name / DOB / Products
PERSON
Name / DOB / Address
COMPANY
Shipper / Phone Number
SOCIAL
NETWORK
SHIPPING
ENTITIES
FRAUD
CUSTOMER 360
SUPPLY CHAIN
RECOMMENDATIONS
Map to Source
Microservices - API
Open Source
& Custom
Resolve and persist entities
within and across datasets
Use ML or Custom
Algorithms
An entity-centric, schemaless
view, and self describing
information management
system
Extract & transform or
Create ‘Map’ of data -
Federated Data Mapped
Set of methods, tools,
and protocols to build
software applications
Visualization tool
enabling mission users
to perform self-service
data analysis
Structured and
unstructured data (e.g.
social media, raid data)
SQL, Triple Store, Hadoop,
etc
COMPANY
Shipper / Address
Source Data
Schema
Denormalized
& Standardized
Data
COMPANY
PERSON
APIs
PERSON
Master Data Management
Machine Learning Machine Learning
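The entity resolution step (two PERSON records from different sources that share Name and DOB) can be sketched as a merge on a normalized key. The slide calls for ML or custom algorithms; the exact-match key below is a deliberately simple stand-in. The record layout, the field names, and the sample values are hypothetical.

```python
from collections import defaultdict


def normalize(name):
    """Lower-case and collapse whitespace so trivial variations match."""
    return " ".join(name.lower().split())


def resolve_entities(records):
    """Merge records that share a normalized (name, dob) key into one entity."""
    merged = defaultdict(lambda: {"sources": []})
    for record in records:
        key = (normalize(record["name"]), record["dob"])
        entity = merged[key]
        entity["sources"].append(record["source"])
        for field_name, value in record.items():
            if field_name != "source":
                # First value seen wins; a real system would reconcile conflicts.
                entity.setdefault(field_name, value)
    return list(merged.values())


# Hypothetical source records mirroring the slide: one system holds products,
# another holds the address for what is likely the same person.
records = [
    {"source": "orders", "name": "Dan Smith", "dob": "1970-05-29", "products": ["card"]},
    {"source": "shipping", "name": "dan  smith", "dob": "1970-05-29", "address": "1 Main St"},
    {"source": "orders", "name": "Ann Lee", "dob": "1975-12-05", "products": ["loan"]},
]

entities = resolve_entities(records)
for entity in entities:
    print(entity)
```

Once records are resolved into one entity, the products and the address hang off a single PERSON node, which is what makes the downstream fraud and customer 360 views possible.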
29. Graph databases store data based on
relationships, rather than transactions
Used For: Data analytics systems connecting disparate
structured or unstructured data
Graph Database
Used For: Transactional systems with structured data
Traditional Database
Person / Friend / Person_Friend
Graphs are suited for environments where the connections between data
points are just as important as the data points themselves.
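To make the comparison concrete, the sketch below answers the same "friends of friends" question twice: once over a Person_Friend join table, the relational shape, and once over an adjacency structure, the graph shape. It is a toy model in plain Python, not a database benchmark. The ids, the names other than Dan and Ann, and the function names are hypothetical, and friendships are treated as directed for brevity.

```python
# Relational shape: a Person table plus a Person_Friend join table.
person = {1: "Dan", 2: "Ann", 3: "Eve", 4: "Raj"}
person_friend = [(1, 2), (2, 3), (3, 4)]


def friends_of_friends_join(person_id):
    """Scan the join table once per hop, the way a self-join would."""
    direct = {b for a, b in person_friend if a == person_id}
    second = {b for a, b in person_friend if a in direct}
    return second - direct - {person_id}


# Graph shape: each node keeps direct references to its neighbors.
adjacency = {pid: set() for pid in person}
for a, b in person_friend:
    adjacency[a].add(b)


def friends_of_friends_graph(person_id):
    """Follow stored relationships hop by hop; no table scan per hop."""
    direct = adjacency[person_id]
    second = set()
    for friend in direct:
        second |= adjacency[friend]
    return second - direct - {person_id}


print([person[p] for p in friends_of_friends_join(1)])
print([person[p] for p in friends_of_friends_graph(1)])
```

Both functions return the same answer. The difference is cost: the join version rescans the whole table for every extra hop, while the graph version only touches the neighbors it actually visits.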
30. ONBOARD
name: “Dan”
born: May 29, 1970
twitter: “@dan”
name: “Ann”
born: Dec 5, 1975
since:
Jan 10, 2011
Name: “Consay”
Headquarters: “NJ”
Nodes
• Can have Labels to classify
• Labels have native indexes
MARRIED TO
LIVES WITH
TRADES
PERSON PERSON
Property Graph: Nodes + Relationships
Company
Relationships
• Relate nodes by type and direction
• Can have weights
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
amount:
$9,500
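The property graph model on this slide can be sketched with two small data classes: nodes carry labels and properties, relationships carry a type, a direction, and their own properties. The node values come from the slide. Which relationship the `since` and `amount` properties belong to is not recoverable from the extracted text, so their placement below is an assumption, as are the class and function names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str
    start: Node          # direction runs from start to end
    end: Node
    properties: dict = field(default_factory=dict)


dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
consay = Node({"Company"}, {"name": "Consay", "headquarters": "NJ"})

relationships = [
    Relationship("MARRIED_TO", dan, ann),
    # Assumed placement of the slide's "since" property.
    Relationship("LIVES_WITH", dan, ann, {"since": "2011-01-10"}),
    # Assumed placement of the slide's "amount" property.
    Relationship("TRADES", dan, consay, {"amount": 9500}),
]


def outgoing(node, rel_type):
    """Relationships of one type that start at the given node."""
    return [r for r in relationships if r.start is node and r.type == rel_type]


for rel in outgoing(dan, "TRADES"):
    print(rel.start.properties["name"], rel.type, rel.end.properties["name"], rel.properties)
```

Because properties live on the relationship itself, a question such as "which trades were just under a reporting limit" becomes a filter on `TRADES` relationships rather than a join across tables.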
31. Exploring The Data
Graph Gives us the Horsepower to See Differently
Dependencies
• Failure chains
• Order of operation
Matching /
Categorizing
Highlight variant of
dependencies
Clustering
Finding things closely
related to each other
(friends, fraud)
Flow / Cost
Find distribution
problems, efficiencies
Similarity
Similar paths or patterns
Centrality, Search
Which nodes are the most
connected or relevant
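Two of these views, centrality and clustering, can be sketched on a tiny graph. Degree centrality is used here as the simplest centrality measure, and connected components as the simplest stand-in for clustering; real deployments would likely use richer algorithms. The edge list and function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected links between accounts.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "F")]


def build_adjacency(edge_list):
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree_centrality(adjacency):
    """Share of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def connected_components(adjacency):
    """Group nodes that can reach each other, using breadth-first search."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
components = connected_components(adjacency)

print("most connected:", max(centrality, key=centrality.get))
print("groups:", [sorted(c) for c in components])
```

In a fraud setting the most connected node is a candidate hub (a shared device or address, for example), and each component is a candidate ring to review as a unit.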
35. Graph is the Power, ML is the Force Multiplier
● Drive Action Fast
○ Recommend ‘Suspected’ items and flag
them for investigation
○ Based on 10 years of history, suggest
new areas for investigation, e.g. ACH
patterns show where to look
● Avoid Revenue Loss
○ Predict potential patterns and potential
areas for investigation
● New Insights - ‘Treasure Map’
○ Customer Clustering and Similarity
○ Campaigns: React to found data
○ Intelligent graph insight
36. Graph Enhanced ML & AI
Knowledge Graphs
Provide Rich
Context for AI
AI Visibility
Human-Friendly
Graph Visualization
Graph Accelerated
ML & AI Development
Quickly Evaluate Datasets
and Features for Extraction
Graph Execution of AI
Operationalize Real-Time
OLAP and Monitoring
Graph Enriched Data
Preprocess and Augment
Machine Learning Data
Connected Source of Truth
Data Lineage for ML
System of Record for AI Decisions
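"Graph Enriched Data" can be read as computing features from the graph and attaching them to each row before model training. The sketch below derives three such features per account. The feature names, the account ids, and the set of previously flagged accounts are hypothetical, and no particular ML library is implied.

```python
def graph_features(adjacency, flagged):
    """Build one feature row per node from its neighborhood."""
    rows = {}
    for node, neighbors in adjacency.items():
        flagged_neighbors = len(neighbors & flagged)
        rows[node] = {
            "degree": len(neighbors),
            "flagged_neighbors": flagged_neighbors,
            # Guard against division by zero for isolated nodes.
            "flagged_ratio": flagged_neighbors / len(neighbors) if neighbors else 0.0,
        }
    return rows


# Hypothetical account graph and a set of accounts already confirmed as fraud.
adjacency = {
    "acct-1": {"acct-2", "acct-3"},
    "acct-2": {"acct-1"},
    "acct-3": {"acct-1"},
    "acct-4": set(),
}
flagged = {"acct-3"}

features = graph_features(adjacency, flagged)
for account, row in sorted(features.items()):
    print(account, row)
```

These columns can be appended to an existing tabular training set, so a model sees not only what an account did but also who it is connected to.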
37. Interactive Visualization
● Expert + Interactive
● Visualize potential risks
from any other source:
○ pattern monitoring
○ machine learning
○ other LOB
● Rich Visualizations
○ identify “emerging
risk connections”
○ connect-the-dots
across risk cases
AI + Machine Learning
● Semi -> Fully Automated
● Looking for risk patterns
“we have not seen before”
● Looking for recurring
patterns in transaction
streams
● More effective at finding
risks using a lower number of
data dimensions
Cooperative/Hybrid Fraud Detection Stages
Risk Pattern Monitoring
● Semi -> Fully Automated
● Looking for risk patterns
“we have seen before”
● Code programs to look for
specific patterns in
transaction streams
● Can look at any number of
dimensions in the data
● Fraud Rings constantly
working to “crack the code”
Anticipate / Guard / Discover
38. Graph + ML Fraud Analysis System
Extract financial history data
AI - Analysis says: this
company is committing
fraudulent transactions
Clustering - Find corporate look-alikes
for fraud analysis
40. Fraud Detection - Transactional Fraud by Individuals
Graph of individuals
suspected
fraud
suspected
fraud
Machine learning highlights fraudulent
transactions for bank review.
Subgraph of transactions
by an individual
41. Company Identity Lookalikes
Low Risk
High Risk
Average
Unstructured graph of
companies
Same graph, automatically clustered by their financial
history similarities by an unsupervised learning algorithm.
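The slide does not name the unsupervised algorithm, so the sketch below uses plain k-means as one common choice for grouping companies by financial-history similarity. The two features, the company ids, and every number are invented for illustration, and the deterministic start (first k points as centroids) is a simplification.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as the starting centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels


# Hypothetical features per company: (debt ratio, late-payment rate)
companies = {
    "co-a": (0.10, 0.20),
    "co-b": (0.15, 0.25),
    "co-c": (0.12, 0.18),
    "co-d": (0.90, 0.80),
    "co-e": (0.85, 0.85),
    "co-f": (0.95, 0.75),
}

labels = kmeans(list(companies.values()), k=2)
for name, label in zip(companies, labels):
    print(name, "cluster", label)
```

The clusters carry no risk label by themselves. An analyst, or a few known fraud cases that land in one cluster, is what turns "cluster 1" into "high risk".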
43. Methods to Visualize - ML in Your Application?
1) Entity Link Analysis
○ Transactions: Amounts, Locations, Types of Goods, Types of Stores, Sizes of Amounts
○ Known Data: Matching against known previous fraud data
2) Graph Traversals
○ Entities or Actors: People, Companies, Goods and Services
○ Amounts: Odd amounts, small amounts with similar numbers, repetitive locations
○ Known Data: Matching against known data
3) Geospatial Viewing
○ Locations: Physical locations, corporate entities
○ Devices: MAC addresses, IP and device
4) Timeline Analysis
○ Reviewing all Events: Locations, Actors, Entities, Transactions
○ Device Tracking: MAC addresses, IP and device
44. Example: Use AI/ML + GRAPH To Create Action
ML LINK:
Tie Data to Action :
Campaigns
● Activity
● Trends
● Loyalty
ML PERSONALIZATION:
Entity Link & Graph
Traversal:
● Use History
● User Context
● Background
45. Example: AI + Graph Customer Journey
ML Risk Analysis:
● Risk Factor
● Risk
● Sentiment
AI Clustering:
Entity Link & Graph
Traversal:
● Activity
● Trends
● Background
49. Insight for Graph Methodology
DISCOVERY INVENTION REALIZATION
TRACK &
MEASURE
ONGOING
SUPPORT
PROOF OF CONCEPT PILOT TURN-KEY MVP
DEVELOPMENT
TECHNOLOGY LIFE CYCLE
ASSESSMENTS : DIAGNOSE & PRESCRIBE - DATA, ARCHITECTURE, CODE, USER EXPERIENCE (Any Stage)
SUPPORT -
EXPERT SERVICES
50. Playbook: What are the Next Steps?
Prototype
Pilot
Delivery
Data Loading
DSE Platform
Data Discovery
Craft Visualization
Key Business Functions
Build Rapid Pilot -
Prototype
Validate Business Case and Platform Technology
● Key Customer Functionality
● Graph Data Platform - Specifications
● Working Graph System
● Real Data Set
Business
Problem
Go Live / Development / Discovery & Requirements / Testing
PLAY: Rapid Prototype
51. RAPID PILOT: See and Experience Your Data
Web UI
framework
React
Visualizations EXPERO GRAPH TOOLS +
(Open Source)
Graph
Platform
App Server (Generic Server)
Provisioning EXPERO GRAPH TOOLS
Ansible + Cloudburst
Compute
Cloud
AWS EC2
Data Sources CUSTOMER Data or (Synthetic Data)