This document discusses how knowledge graphs and graph analytics can be used for anomaly detection in financial services. It describes building time-sequenced graph data models (TSGDMs) from a base knowledge graph to model customer behavior over time. A champion model is trained on each time window to learn a statistical distribution, and outliers in that distribution, i.e., graph structural changes that are hard to reproduce, can indicate anomalous financial behavior worth investigating, such as money laundering. Scaling the graph snapshots by collections of nodes and edges allows behavior to be analyzed at levels ranging from micro to macro.
1. Graphs and Financial Services Analytics
Michael Moore, Ph.D. – Executive Director, Enterprise Knowledge Graphs + AI, EY Performance Improvement Advisory
Omar Azhar, M.S. – Manager, Machine Learning and Advanced Analytics, EY Financial Services Organization
Miguel Perez, Ph.D. (DND), M.S. – Senior, Machine Learning and Advanced Analytics, EY Financial Services Organization
8. Graph Analytics Use Cases
Common use cases for graph analytics:
► Recommendation engines
► Supply chain and network optimization
► Fraud networks
► Community detection (social network analysis)
► Impact analysis / network contagion
► Anomaly detection (the focus of this talk)
9. Anomalous Behavior Detection in Dynamic Graphs in Financial Services
Anomalies are not always about finding bad behavior. We are trying to find change in a network, or in behavior, that indicates a significant change in our assumptions.
• Customer Behavior: A life event such as a new job, a new house, or a marriage. Significant life changes are indicated by customers behaving in ways they previously did not; these are opportunities to provide new services.
• Transaction Networks at Scale: What defines an efficient flow of funds versus an inefficient one? Is efficiency correlated with the type of behavior?
How should we think about structuring this as a graph problem?
10. Let's start with a model everyone is familiar with: Customer 360
[Diagram: a Customer 360 graph connecting an FA Hub, Corporate Wiki, Call Logs, E-mail Logs, Social Network Data, a Financial Hub, an Accounts Hub and Transaction Logs]
Now that we have our graph model, we need to consider scale.
11. Scaling determines what snapshots you take of the graph for analysis
Micro: looking at the graph at the account level.
12. Scaling determines what snapshots you take of the graph for analysis
Moving up the scale: looking at the customer level.
14. How do we think about scaling in a graph problem?
Consider the business-defined scale:
• Scaling by collections of nodes: clumping nodes together -> a household node (see the sketch after this list)
• Generally defined by business and domain expertise
• Scaling by collections of edges: clumping edges together -> geometric time-averaging
• Requires business / domain knowledge as well as a little investigation: how do you tell what constitutes a full time cycle?
[Scale axis: micro (account) to macro (firm); coarse- versus fine-grained tuning]
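To make node-collection scaling concrete, here is a minimal sketch that collapses an account-level snapshot into a household-level snapshot by summing transfer amounts across household pairs. It assumes networkx; the account and household identifiers, and the rule of summing amounts, are illustrative assumptions rather than the talk's specific method.

```python
# A minimal sketch of "scaling by collections of nodes": collapsing
# account-level nodes into household-level nodes.
import networkx as nx

# Account-level snapshot: accounts as nodes, transfers as weighted edges.
G = nx.Graph()
G.add_edge("acct_1", "acct_2", amount=500.0)   # same household
G.add_edge("acct_1", "acct_3", amount=120.0)   # cross-household
G.add_edge("acct_2", "acct_3", amount=80.0)

# Business-defined mapping from accounts to households (hypothetical).
household = {"acct_1": "hh_A", "acct_2": "hh_A", "acct_3": "hh_B"}

# Coarsen: merge nodes that share a household, summing edge amounts.
H = nx.Graph()
for u, v, data in G.edges(data=True):
    hu, hv = household[u], household[v]
    if hu == hv:
        continue  # intra-household activity disappears at this scale
    if H.has_edge(hu, hv):
        H[hu][hv]["amount"] += data["amount"]
    else:
        H.add_edge(hu, hv, amount=data["amount"])

print(list(H.edges(data=True)))  # [('hh_A', 'hh_B', {'amount': 200.0})]
```

The same pattern extends to edge-collection scaling: instead of grouping nodes, group each node pair's edges over a time window and replace them with one time-averaged edge.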
15. Understanding your graph snapshot: different data models of the same underlying knowledge graph
Explore your graph snapshots. You will notice natural separation into clusters / segments in each snapshot; most of this is already captured by the segmentation models in place at most firms.
Can we use similar graph snapshots to describe expected behavior?
[Example snapshots: checking accounts; credit cards; similar customers by spend; households with similar incomes]
16. But how is this any different from what is already done today? Why graph?
Let's investigate how a single household shows up across two separate snapshots.
[Figure annotations: the parents; the college student; they all belong to the same household]
17. How should change in one snapshot change the nodes in another snapshot?
What does it mean for a node in one snapshot to change its data and move to another location in its snapshot? Can we model that?
18. We should expect diffusion of information across our graph data models (GDMs)
Example: a household moves to a lower-cost state -> the household retains its income but is effectively wealthier in the new state.
19. Information should spread across GDMs. It should go both ways, but not necessarily with the same weight
Example: a college student graduates and moves back in with his parents.
20. Can we now model this as expected change across our GDMs?
Identify node changes. What other types of change have a small impact in one GDM and a large impact in the other? Example: one family member moves -> household income is represented differently in one model versus another.
21-22. Expressing Behavior with Graph Snapshots
Compare graph snapshots to identify node behavioral change:
• Similar GDMs can give you a context-dependent way of expressing behavioral change. This means we can compute it directly from the graphs themselves.
• Expressing behavioral change is now deeply connected to expressing structural change across similar GDMs that are supported by the same underlying knowledge graph (a minimal sketch of snapshot comparison follows).
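As an illustration of comparing snapshots, the sketch below scores each node by how much its neighborhood changed between two snapshots of the same GDM. The Jaccard-distance scoring rule is an assumption chosen for simplicity, not the talk's specific measure; networkx is assumed.

```python
# A minimal sketch of scoring per-node behavioral change between two
# graph snapshots via neighborhood Jaccard distance.
import networkx as nx

def node_change_scores(g_prev: nx.Graph, g_curr: nx.Graph) -> dict:
    """Score each node by how much its neighborhood changed."""
    scores = {}
    for node in set(g_prev) | set(g_curr):
        prev = set(g_prev.neighbors(node)) if node in g_prev else set()
        curr = set(g_curr.neighbors(node)) if node in g_curr else set()
        union = prev | curr
        # Jaccard distance: 0 = identical neighborhood, 1 = fully changed.
        scores[node] = 1 - len(prev & curr) / len(union) if union else 0.0
    return scores

g1 = nx.Graph([("cust_a", "cust_b"), ("cust_a", "cust_c")])
g2 = nx.Graph([("cust_a", "cust_b"), ("cust_a", "cust_d")])
print(sorted(node_change_scores(g1, g2).items(), key=lambda kv: -kv[1]))
```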
23. Behavioral Change over Time
Time-Sequenced Graph Data Models (TSGDMs)
• A sequence of graph data models provides the context for behavioral change over time.
24-28. TSGDM – Assumptions – Semantic Compatibility
Time-Sequenced Graph Data Models – Necessary Conditions:
• (1) Intuitive edges that are semantically compatible with the parent KG and entity resolution
• (2) Obeys information-theoretic concerns about information propagation on a geometric structure
• (3) Uses an unsupervised architecture that correctly diffuses information in each time step (see the sketch after this list)
• (4) The architecture learns how we should describe behavioral change, not the other way around
• (5) Uses the learned statistical distribution to identify outliers
• (6) Ranks those outliers
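Conditions (2) and (3) can be illustrated with a single information-diffusion step of the kind used in graph convolutional architectures. This is a generic sketch using numpy with symmetric adjacency normalization; it is not the specific learned architecture described in the talk.

```python
# A minimal sketch of one information-diffusion step on a graph:
# features are averaged over each node's neighborhood using the
# symmetrically normalized adjacency matrix (GCN-style propagation).
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # toy adjacency matrix
X = np.array([[1.0], [0.0], [0.0]])      # one feature per node

A_hat = A + np.eye(3)                    # add self-loops
d = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
P = D_inv_sqrt @ A_hat @ D_inv_sqrt      # normalized propagation operator

X_next = P @ X                           # one diffusion (time) step
print(X_next)                            # node 0's signal has spread to its neighbors
```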
29. TSGDM – Using a learned statistical distribution to identify outliers
Take your customer transaction data and build a parent knowledge graph.
30. TSGDM – Using a learned statistical distribution to identify outliers
Scaling experimentation lets us study different schemas for candidate TSGDMs.
Comparing two similar GDMs provides context for the behavioral change of a node.
31. TSGDM – Using a learned statistical distribution to identify outliers
Apply the selected schema to each month of data, or another appropriate time scale: Month 1, Month 2, Month 3, Month 4, ..., Month X (see the sketch below).
Memory constraints will fix the number of time windows your architecture can learn from.
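A minimal sketch of applying one schema per month: transactions are grouped by calendar month and each batch is materialized as a snapshot graph. The column names and the accounts-and-transfers schema are hypothetical; pandas and networkx are assumed.

```python
# A minimal sketch of building a time-sequenced list of monthly
# graph snapshots from raw transaction records.
import pandas as pd
import networkx as nx

tx = pd.DataFrame({
    "src": ["a1", "a1", "a2", "a3"],
    "dst": ["a2", "a3", "a3", "a1"],
    "amount": [100.0, 50.0, 75.0, 20.0],
    "date": pd.to_datetime(
        ["2023-01-05", "2023-01-20", "2023-02-03", "2023-02-15"]),
})

snapshots = []
for month, batch in tx.groupby(tx["date"].dt.to_period("M")):
    g = nx.DiGraph()
    for row in batch.itertuples():
        # Schema: accounts are nodes, monthly transfers are weighted edges.
        if g.has_edge(row.src, row.dst):
            g[row.src][row.dst]["amount"] += row.amount
        else:
            g.add_edge(row.src, row.dst, amount=row.amount)
    snapshots.append((str(month), g))

for month, g in snapshots:
    print(month, g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")
```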
32. TSGDM – Using a learned statistical distribution to identify outliers
Learn a champion model on each time window batch.
33. TSGDM – Using a learned statistical distribution to identify outliers
Apply the champion model to each TSGDM and investigate the tail of each distribution:
• The log-scale compression error, or reconstruction error, tends to follow a power-law distribution.
• Graph structural changes that are harder to reproduce tend to be outliers! (A sketch follows.)
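To illustrate the reconstruction-error idea, the sketch below uses a truncated-SVD reconstruction of the adjacency matrix as a stand-in for a learned champion model; nodes whose connectivity the low-rank model cannot reproduce land in the tail of the error distribution. SVD is an assumption for illustration, not the unsupervised architecture used in the work.

```python
# A minimal sketch of scoring nodes by reconstruction error, with a
# rank-k SVD standing in for the learned champion model.
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 5
A = (rng.random((n, n)) < 0.1).astype(float)   # toy random graph
A = np.triu(A, 1)
A = A + A.T                                    # symmetric, no self-loops

U, s, Vt = np.linalg.svd(A)
A_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k reconstruction

# Per-node reconstruction error: how hard is each node to reproduce?
err = np.linalg.norm(A - A_hat, axis=1)
ranking = np.argsort(-err)                     # hardest-to-reproduce first
print("top-5 outlier candidates:", ranking[:5])
```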
34. TSGDM – Using a learned statistical distribution to identify outliers
• Create multiple champion models with some overlap in their time windows
• The overlap in the cumulative error between champion models identifies the outliers of interest
• Rank all nodes by their cumulative error for each champion model (see the sketch after this list)
• Key Takeaway: the harder a financial behavior is to replicate in this framework, the more likely that behavior is an anomaly
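The sketch below illustrates combining overlapping champion models: each model assigns every node a cumulative error over its window, and nodes that rank in the error tail of more than one model are flagged. The error values here are hypothetical stand-ins for real model outputs.

```python
# A minimal sketch of cross-model outlier ranking: nodes in the error
# tail of two overlapping champion models are flagged for review.
import numpy as np

nodes = [f"acct_{i}" for i in range(8)]
rng = np.random.default_rng(1)

# Hypothetical cumulative per-node errors from two champion models
# whose time windows overlap (e.g., months 1-6 and months 4-9).
err_model_a = dict(zip(nodes, rng.pareto(3.0, len(nodes))))
err_model_b = dict(zip(nodes, rng.pareto(3.0, len(nodes))))

def top_k(err: dict, k: int) -> set:
    """Return the k nodes with the largest cumulative error."""
    return set(sorted(err, key=err.get, reverse=True)[:k])

# Nodes in the tail of both models are the outliers of interest.
candidates = top_k(err_model_a, 3) & top_k(err_model_b, 3)
print("flagged for review:", candidates)
```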
35. Use Case: Anti-Money Laundering
Existing business problem: financial institutions are responsible for monitoring the transaction activity of client accounts in order to detect money laundering. Rule-based systems generate too many false-positive alerts that require expensive and subjective manual review; industry-standard performance is around 1:1,000 (roughly one productive alert per thousand).
36. Aggregated activity in real-world networks can demonstrate the efficiency of money flow in certain pockets of our economy
37. Aggregated activity in real-world networks can demonstrate the efficiency of money flow in certain pockets of our economy
[Figure: a normal, random dispersion of money flow that follows a natural path]
38. A few regions of high interconnectivity connected to spoke-like hubs: low reproducibility, potentially anomalous
39. A few regions of high interconnectivity connected to spoke-like hubs: low reproducibility, potentially anomalous
[Figure annotation: potentially higher connectedness than normal]
40. EY Cross-Sector Graph Experience: MDM, 360°, AML/Fraud, Recommenders

Fortune 100 Tech Company
Use Case: Global B2B Account 360° view and marketing attribution
Approach: Neo4j graph with 500M nodes and 2.2B relationships, representing all known business accounts, contacts and marketing touches. Mastered data from 17 disparate transactional sources in Azure Data Lake. Supported in-graph analytics for marketing attribution and next-best-action recommendations across global geographies.
Duration: 16 weeks to working graph

Fortune 100 Footwear Company
Use Case: Converged brick & mortar + online Shopper 360° view
Approach: Neo4j graph with 2B nodes and relationships, representing sales transactions for 40M shoppers across 275 physical stores and the ecommerce platform. Algorithmic extraction and profiling from raw XML records in AWS Hadoop, MDM record concordance and in-graph analytics for product associations, store analytics and recommendation services.
Duration: 12 weeks to working graph

Fortune 500 Cruise Line Company
Use Case: Shipboard and shoreside recommendation engine
Approach: Neo4j graph deployable to shipboard VMware data centers, with streaming updates from a large shoreside Neo4j graph integrating data from Azure Cerebro, Adobe Experience Manager and legacy transactional systems. In-graph analytics, a services API, and a recommendation engine for next best activity for passengers, surfaced via a mobile app.
Duration: 12 weeks to working graph

Fortune 100 Investment Firm
Use Case: Enhanced anti-money laundering and fraud detection using Graph+AI
Approach: Neo4j graph of an account 360° view representing the activity of 2M accounts over 4 years. MDM and entity extraction for account and party identity elements from an enterprise Oracle system. Network clustering, feature engineering and graph embeddings feeding a TensorFlow deep learning classifier for suspicious activity patterns across accounts and between parties.
Duration: 16 weeks to working graph

Fortune 100 Tech Company
Use Case: B2B local marketing events recommendation engine
Approach: Neo4j graph and personalized next-best-event recommendation engine for B2B field marketers. Reconciles physical and digital event attendees with corporate account structures for 10K accounts and 5M contacts. Entities mastered from transactional data in SQL Server and Azure Data Lake. Microservices APIs support data syndication to martech applications and PowerBI reporting.
Duration: 10 weeks to working graph
Editor's notes
Consider your different snapshots, consider the scalings that make sense for your data and the connectedness available in your data. Customer snapshots might not make sense for too small a time scale, so you have to investigate it
Does the anomalous structural change of a node over a 5 month window mean the same thing as the anomalous structural change over a 13 month window? Clearly not. It’s a contextual window for resolving what the architecture means by anomaly
Monthly snapshots make sense, so use them.
This is a non-convex optimization problem over the model weights and learned operators, making model performance very sensitive to initial conditions and complicating reproducibility.
This is hard to understand
If I apply
Here are some toy examples
Active area of research – here are some of the ideas guiding our R&D.
Here are some examples of what you might see
Redraw them – blue square – use eraser – make the scales all the same for the red ones
Hard to understand
We expect different
Simplify this down to one champion model – it ascribes what
Why is anomaly detection important?
Robust pattern detection -