Approach to AML Rule Thresholds
By Mayank Johri, Amin Ahmadi, Kevin Kinkade, Sam Day, Michael Spieler, Erik DeMonte
January 12, 2016
Introduction
Institutions constantly face the challenge of managing growing alert volumes from automated transaction
monitoring systems, new money laundering typologies to surveil, and increasingly robust regulatory guidance. The question is:
how will BSA/AML departments scale to meet demand while managing compliance cost? To set effective
baseline thresholds when configuring a new detection scenario, or to improve the efficacy of an existing one, institutions apply
statistical techniques and industry standards to identify the cut-off between “normal” and “abnormal” or “suspicious”
activity. These estimated thresholds are then either challenged or reinforced by the qualitative judgement of professional
investigators during a simulated ‘pseudo’ investigation, or ‘qualitative assessment’.
An effective AML transaction monitoring program includes a standardized process for tuning, optimizing, and testing
AML scenarios/typologies that is understandable, repeatable and consistent.
An appropriately tuned or optimized scenario seeks a balance between maximizing the identification of suspicious
activity and maximizing resource efficiency. The two competing objectives of tuning and optimization,
which must remain in constant balance, are:
(1) Reduce the number of ‘false positives’: alerts generated on transactions that do not require further investigation
or the filing of a Suspicious Activity Report (SAR).
(2) Reduce the number of ‘false negatives’: transactions that were not alerted but that do require further
investigation or the filing of a SAR.
Phases
The following outlines the phased process (Phase 0 through Phase 8) for the initial tuning:
 Phase 0 | Planning. The Policy Office (PO) works closely with the Analytics team to strategize the scenario,
stratification, and parameters that will be used to conduct a threshold analysis.
 Phase 1 | Assess Data. Analytics communicates to Information Technology (IT) which data fields will be required to
perform this analysis. IT then determines whether the ETL of these fields into the Transaction Monitoring System is
a near-term or long-term effort.
 Phase 2 | Query Data. Analytics queries the required transactional data for analysis.
 Phase 3 | Quantitative Analysis. Analytics stratifies the data as required (grouping by like attributes, or ‘non-
tunable parameters’, such as entity/consumer, cash-intensive business (CIB)/non-CIB, credit/debit, and high/medium-risk
destinations) to account for like-attribute behavior patterns.
Transformation
Once stratified, Analytics performs transformations on the data as required (such as a 90-day rolling count, sum, or standard
deviation).
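For illustration, a minimal pandas sketch of this kind of rolling transformation, assuming hypothetical field names (customer_id, txn_date, amount), since the actual data model will vary by institution:

```python
import pandas as pd

# Hypothetical transaction extract; real field names depend on the source system.
txns = pd.DataFrame({
    "customer_id": ["C1", "C1", "C1", "C2", "C2"],
    "txn_date": pd.to_datetime(
        ["2015-01-05", "2015-02-10", "2015-03-20", "2015-01-15", "2015-03-01"]),
    "amount": [9500.0, 4200.0, 8800.0, 1200.0, 15000.0],
})

# Time-based rolling windows require a sorted DatetimeIndex within each group.
txns = txns.sort_values(["customer_id", "txn_date"]).set_index("txn_date")

# 90-day rolling count, sum, and standard deviation of 'amount' per customer.
rolled = (
    txns.groupby("customer_id")["amount"]
        .rolling("90D")
        .agg(["count", "sum", "std"])
        .reset_index()
)
print(rolled)
```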
Exploratory Data Analysis
Analytics applies a variety of visual and statistical exploratory data analysis (EDA) techniques to the dataset to
understand the correlation and impact that one or more parameters may have on the scenario, and therefore ultimately
on the alert-to-case efficacy. The objective of EDA is to further explore the recommended parameters (count, amount,
standard deviation, etc.) proposed during the planning phase and to determine, with greater statistical precision, the best
combination of parameters.
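As a small illustration of the kind of check involved, the sketch below computes pairwise parameter correlations and distribution summaries on a hypothetical set of rolled-up fields; the column names are placeholders only.

```python
import pandas as pd

# Hypothetical per-customer aggregates produced by the transformation step.
events = pd.DataFrame({
    "count_90d": [3, 12, 5, 40, 8, 2],
    "amount_90d": [9500, 88000, 15000, 410000, 27000, 4000],
    "std_90d": [1200, 9000, 2500, 32000, 4100, 800],
})

# Pairwise correlations give a first view of how candidate parameters move together;
# highly correlated parameters may add little independent signal to the scenario.
print(events.corr(method="spearman"))

# Simple distribution summaries support the visual EDA (histograms, box plots, etc.).
print(events.describe(percentiles=[0.5, 0.75, 0.85, 0.95]))
```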
Segmentation
Once stratified and transformed, Analytics clusters the data’s ‘tunable parameters’ to account for ‘skewness’ in the data
population caused by outliers in order to yield a statistically accurate threshold that is representative of the 85th
percentile.
The 85th percentile is used as a standard when establishing a new rule to set an initial baseline threshold for defining the
cutoff between “normal” transactional data and “unusual” transactional data. For normally distributed data with a bell-
shaped curve (as depicted in the middle diagram below, figure 1.1), the mean value (i.e., the “expected” value) represents
the central tendency of the data, and the 85th percentile falls approximately one standard deviation (σ) above this central tendency.
The 85th percentile could represent a reasonably conservative cutoff line or “threshold” for unusual activity. This
baseline simply provides a starting point for further analysis, and is later refined through qualitative judgement and alert-
to-case efficacy.
If transactional data were always normally distributed, it would be easy to calculate one standard deviation above the
mean to identify where to draw the line representing the 85th percentile of the data (this technique is often referred to as
‘quantiling’), thus establishing the threshold. However, in real world applications transactional data is often not normally
distributed. Transactional data is frequently skewed by outliers (such as uniquely high-value customers); therefore, if
statistical techniques that assume a normal distribution (such as quantiling) are applied to determine the 85th percentile
(+1 standard deviation from the mean), the result will be a misrepresentative ‘threshold’ that is offset by the
outlier(s).
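The point can be illustrated with a short numpy sketch: on a right-skewed sample, the mean plus one standard deviation and the distribution-free 85th percentile diverge noticeably, which is why the quantiling shortcut misleads. The lognormal parameters below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative right-skewed "transaction amount" population (lognormal),
# standing in for data skewed by a few uniquely high-value customers.
amounts = rng.lognormal(mean=8.0, sigma=1.2, size=10_000)

normal_approx = amounts.mean() + amounts.std()   # mean + 1 sigma (assumes normality)
empirical_p85 = np.percentile(amounts, 85)       # distribution-free 85th percentile

print(f"mean + 1 std dev : {normal_approx:,.0f}")
print(f"empirical P85    : {empirical_p85:,.0f}")
# On skewed data the two can differ widely, so a threshold derived from
# mean + 1 sigma would be pulled toward the outliers.
```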
Figure 1.1 Distribution affected by Skewness
Clustering
To account for skewness in the data, Analytics employs the clustering technique known as ‘Partitioning Around Medoids’ (PAM), or
more specifically its large-data variant, ‘Clustering Large Applications’ (CLARA). Clustering is an alternative method of data segmentation
which is not predicated on the assumption that the data is normally distributed or that it has constant variance.
Clustering works by breaking the dataset into distinct groups (clusters), each formed around one common entity of the dataset
that represents the group. This partitioning allows a boundary (such as a target
threshold to distinguish normal from unusual activity) to be assigned more accurately.
The first step of the clustering model is to determine how many clusters to partition the data into. The methodology
used to identify the optimal number of clusters takes into account two variables:
 Approximation – How the clustering model fits to the current data set (“Error Measure”)
 Generalization – Cost of how well the clustering model could be re-performed with another similar data set
The model for clustering can be seen in the figure below. As the number of clusters (x-axis) increases, the model
becomes more complex and thus less stable. Increasing the number of clusters creates a more customized model
catered to the current data set, resulting in a high level of approximation. However, the generalization cost increases,
because re-performing the model on a similar data set becomes more difficult. Conversely, the fewer the clusters, the less
representative the model is of the current data set, but the more scalable it is to future similar data sets. An objective
function curve is plotted to map the tradeoff between the two competing objectives. This modelling methodology is
used to identify the inflection point of the objective function of the two variables: the optimal number of clusters that
accommodates both the current data set (approximation) and future data sets (generalization). Refer to figure 1.2
below for a conceptual visual of the modelling methodology used to identify the optimal number of clusters.
Figure 1.2 Cluster Modeling – Identification of Number of Clusters
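A conceptual sketch of the trade-off, assuming a within-cluster dissimilarity (“error measure”) has already been computed for a range of candidate cluster counts from repeated clustering runs; the values below are placeholders, and the ‘elbow’ heuristic used here is one common way to approximate the inflection point described above.

```python
import numpy as np

# Hypothetical within-cluster dissimilarity ("error measure") for k = 1..10,
# as it might come out of repeated k-medoids runs on the stratified data.
ks = np.arange(1, 11)
error = np.array([1000, 620, 430, 330, 280, 250, 232, 220, 212, 207], dtype=float)

# Normalise both axes to [0, 1] so neither scale dominates the heuristic.
x = (ks - ks.min()) / (ks.max() - ks.min())
y = (error - error.min()) / (error.max() - error.min())

# Elbow heuristic: after normalisation the endpoints sit on the line x + y = 1,
# so the point farthest from that chord approximates the inflection point
# between approximation (fit) and generalization (complexity cost).
distance_to_chord = np.abs(x + y - 1) / np.sqrt(2)
suggested_k = int(ks[distance_to_chord.argmax()])
print("suggested number of clusters:", suggested_k)
```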
The basic approach to CLARA clustering is to partition objects/observations into several similar subsets. Data is
partitioned based on ‘Euclidean’ distance to a common data point (called a medoid). A medoid, rather than being a
calculated quantity (as is the case with the mean), is a data point in the cluster that has the minimal
average dissimilarity to all other data points assigned to the same cluster. Euclidean distance is the most common
measure of dissimilarity. The advantage of medoid-based cluster analysis is that no assumption is made
about the structure of the data. Mean-based cluster analysis, by contrast, makes the implicit and restrictive
assumption that the data follows a Gaussian (bell-shaped) distribution.
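A minimal, self-contained sketch of the CLARA idea under stated assumptions: draw random subsamples, pick medoids on each subsample with a greedy PAM-style build step, assign the full population to the nearest medoid by Euclidean distance, and keep the medoid set with the lowest total dissimilarity. In practice a maintained implementation (for example the PAM/CLARA routines in R’s cluster package, or an equivalent k-medoids library) would normally be used instead of this illustration.

```python
import numpy as np

def greedy_medoid_build(X, k, rng):
    """PAM-style 'build' step: greedily pick k points minimising total distance."""
    medoids = [int(rng.integers(len(X)))]
    while len(medoids) < k:
        d_best = np.min(
            np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2), axis=1)
        gains = []
        for cand in range(len(X)):
            if cand in medoids:
                gains.append(np.inf)
                continue
            d_cand = np.linalg.norm(X - X[cand], axis=1)
            gains.append(np.minimum(d_best, d_cand).sum())
        medoids.append(int(np.argmin(gains)))
    return X[medoids]

def clara(X, k, n_samples=5, sample_size=200, seed=0):
    """CLARA-style clustering: PAM on subsamples, scored on the full data set."""
    rng = np.random.default_rng(seed)
    best_cost, best_medoids = np.inf, None
    for _ in range(n_samples):
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        medoids = greedy_medoid_build(X[idx], k, rng)
        # Assign every observation (not just the sample) to its nearest medoid.
        dists = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
        cost = dists.min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = np.argmin(
        np.linalg.norm(X[:, None, :] - best_medoids[None, :, :], axis=2), axis=1)
    return best_medoids, labels

# Illustrative skewed 2-D data (e.g., 90-day count vs amount); parameters hypothetical.
rng = np.random.default_rng(1)
X = np.column_stack([rng.lognormal(1.5, 0.6, 1000), rng.lognormal(8.0, 1.2, 1000)])
medoids, labels = clara(X, k=4)
print(medoids)
```

In practice the parameters would usually be standardized before distances are computed, so that no single unit of measure dominates the Euclidean distance.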
The next step is to determine the number of dimensions for parameter threshold analysis and to translate the
transactional data into ‘events’. An event is defined as a unique combination of all parameters for the identified scenario
or rule. The full transactional data set is translated into a population of events. Event bands are formed based on the
distribution of total events within the clusters. Event bands can be thought of as the boundaries between the clusters
(such that one or more parameters exhibit similarity).
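A small sketch of this translation, assuming hypothetical parameter columns: each unique combination of parameter values observed in the population is one event, carried with the number of observations behind it.

```python
import pandas as pd

# Hypothetical per-customer parameter values for one evaluation window.
obs = pd.DataFrame({
    "count_90d": [2, 2, 5, 5, 5, 12, 2],
    "amount_90d": [4000, 4000, 15000, 15000, 27000, 88000, 9000],
})

# Each unique combination of parameter values is one 'event'; the size column
# gives the number of observations behind that event.
events = (
    obs.groupby(["count_90d", "amount_90d"], as_index=False)
       .size()
       .rename(columns={"size": "observations"})
)
print(events)
```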
Event Banding with One Parameter
When a scenario has only one tunable parameter (such as ‘amount’), bands for this parameter are ideally generated in 5%
increments beginning at the 50th percentile, resulting in ten bands: P50, P55, P60, P65, P70, P75, P80, P85, P90, and
P95. The 50th percentile is chosen as a starting point to allow room for adjustment toward a more conservative
cluster/threshold, pending the results of the qualitative analysis. In other words, it is important to include clusters well
below, but still within reasonable consideration of, the target threshold definition of transaction activity that will be
considered quantitatively suspicious. Refer to Figure 1.3 below.
Figure 1.3 85th Percentile and BTL/ATL
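A sketch of generating candidate bands for a single tunable parameter, using an illustrative skewed ‘amount’ distribution and the P50-through-P95, 5%-increment scheme described above; the rounding step anticipates the socialization point discussed next.

```python
import numpy as np

rng = np.random.default_rng(7)
# Illustrative skewed 'amount' parameter values for one stratum.
amount = rng.lognormal(mean=8.0, sigma=1.2, size=5_000)

levels = np.arange(50, 100, 5)                  # P50, P55, ..., P95
bands = {f"P{p}": np.percentile(amount, p) for p in levels}

for name, value in bands.items():
    # Rounded values are easier to socialise with business partners.
    print(name, round(value, -2))
```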
Some parameters such as ‘transaction count’ have a discrete range of values, and therefore the bands may not be able to
be established exactly at the desired percentile level. In these cases, judgment is necessary to establish reasonable bands.
Depending on their values, the bands will often be rounded to nearby numbers of a similar order of magnitude
that are more easily socialized with internal and external business partners. Each of these bands corresponds to a
parameter value to be tested as a prospective threshold for the scenario.
If the clusters have ranges that are drastically different from one another, adjustment to the bands may be necessary
to make the clusters more reasonable while still maintaining a relatively even distribution of volume across the event
bands. This process is subjective and will differ from scenario to scenario, especially in cases where a specific value for a
parameter is inherent in the essence of the rule (e.g., $10,000 for cash structuring). In many cases the nature of the
customer segment and the activity being monitored may support creating fewer event bands due to the low
volume of activity for that segment.
Figure 1.4 Event Banding of 1 Parameter ‘Amount’
Event Banding with Two Parameters
When a scenario has two tunable parameters (such as ‘count’ and ‘amount’), an independent set of bands needs to be
established for each parameter, similar to the method used for one parameter.
Analysis of two tunable parameters may be thought of as ‘two-dimensional’: whereas one-parameter event banding is
based on a single axis, event banding with two parameters is affected by two axes (x and y). For
example, ‘count’ may represent the x-axis, while ‘amount’ may represent the y-axis. In this sense, the ultimate threshold
is determined by a combination of both axes, and so are the event bands. Including additional parameters likewise
adds additional dimensions and complexity.
As discussed above, while the 85th percentile is used to determine the threshold line, bands are created through
clustering techniques starting at the 50th percentile to account for data points below, but still within reasonable
consideration of, the target threshold definition of transaction activity that will be considered quantitatively suspicious. In
the diagram below, banding is shown for two parameters, count and value. Once the data is clustered, the 85th
percentile is identified per the distribution (upper right-hand table in Figure 1.5 below), and qualitative judgement is
exercised to set exact thresholds within the range that creates a model conducive to re-performance (refer to the
discussion of ‘generalization’ in the clustering modelling section above).
Figure 1.5 Event Banding of 2 Parameters ‘Value’/‘Count’
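A sketch of the two-parameter version, assuming hypothetical ‘count’ and ‘amount’ parameters: an independent set of percentile bands is generated for each axis, and event volumes are tallied per cell of the resulting grid, which is the surface on which the threshold combination near the 85th percentile is chosen.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)

# Illustrative event population for one stratum; parameter names are placeholders.
events = pd.DataFrame({
    "count_90d": rng.poisson(6, 5_000) + 1,
    "amount_90d": rng.lognormal(8.0, 1.2, 5_000),
})

levels = np.arange(50, 100, 5)                                  # P50 ... P95
count_bands = np.unique(np.percentile(events["count_90d"], levels).round())
amount_bands = np.percentile(events["amount_90d"], levels)

# Cross-tabulate events by the band each parameter value falls into; the grid of
# volumes is the surface on which a (count, amount) threshold pair is selected.
grid = pd.crosstab(
    pd.cut(events["count_90d"], bins=np.concatenate(([0], count_bands, [np.inf]))),
    pd.cut(events["amount_90d"], bins=np.concatenate(([0], amount_bands, [np.inf]))),
)
print(grid)
```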
Event Banding with more than Two Parameters
When a scenario has more than two tunable parameters (such as count, amount, and standard deviation), an independent
set of bands needs to be established for each parameter, following the same method used for two
parameters.
Select Threshold(s)
The output of Phase 3 is the ‘threshold’, or ‘event’ characteristics (a combination of thresholds in the case of
multiple parameters), which serve as the baseline for ‘suspicious’ activity. If the threshold is set too low, too many alerts may be generated,
which creates extraneous noise and strains BSA/AML investigative resources. Conversely, if the threshold is set too high,
suspicious activity may not generate alerts.
 Phase 4 | Sampling. Analytics applies the thresholds determined during the quantitative analysis phase to the
historical data in order to identify potential events for Above-the-Line (ATL) and Below-the-Line (BTL) analysis.
Events flagged as ‘ATL’ are essentially the same as alerts, except that because they are generated from
historical data they are referred to as ‘pseudo alerts’. The number of transactions that fall into the ATL or BTL
category will determine the number of random samples required for a statistically significant qualitative assessment.
The purpose of the samples is to evaluate the efficacy of Analytics’ calculated thresholds. In other words, if the
threshold is appropriately tuned, then a larger percentage of events marked ‘ATL’ should be classified as ‘suspicious’
by an independent FIU investigator compared to the ‘BTL’ events. Analytics then packages these sample ATL and
BTL transactions into a format that is understandable and readable by an FIU investigator (samples must include
the transactional detail fields required for the FIU to determine the nature of the transactions); a sketch of this flagging and sampling step appears after this list.
 Phase 5 | Training. Analytics orients the FIU investigators to the scenario, parameters and overall intent/spirit of
each rule so that during the qualitative analysis phase, the FIU investigators render appropriate independent
judgements for ATL and BTL samples.
 Phase 6 | Qualitative Analysis. FIU assesses the sampled transactions from a qualitative perspective. During this
phase, an independent FIU investigator analyzes each sampled pseudo alert as they would treat real alerts (without
any bias regardless of the alert’s classification as ATL or BTL). The investigator’s evaluation must include
consideration for the intent of each rule, and should include an assessment of both the qualitative and quantitative
fields associated with each alert. The FIU investigator will generally evaluate each transaction through a lens akin to:
“Given what is known from KYC, origin/destination of funds, beneficiary, etcetera, is it explainable that this
consumer/entity would transact this dollar amount at this frequency, velocity, pattern, etc.?” FIU provides
feedback to Analytics for each pseudo alert, classified as (a) ‘Escalate-to-Case’, (b) ‘Alert Cleared – No Investigation
Required (false positive)’, (c) ‘Alert Cleared – Error’, or (d) ‘Insufficient Information’. If the efficacy is deemed
appropriate, then a Business Review Session is scheduled to vote the rule into production.
 Phase 7 | Business Review Session. PO, Analytics and FIU present their findings for business review to voting
members.
 Phase 8 | Implementation. Analytics provides functional specifications to IT to implement the scenario within the
Transaction Monitoring System.
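As referenced in Phase 4, the sketch below flags historical events as ATL or BTL against a candidate threshold and draws random samples for the qualitative review. The threshold, field names, and margin of error are illustrative assumptions; the sample-size calculation is the standard proportion estimate at 95% confidence with a finite-population correction.

```python
import numpy as np
import pandas as pd

def required_sample_size(population, z=1.96, p=0.5, margin=0.05):
    """Sample size for estimating a proportion, with finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return int(np.ceil(n0 / (1 + (n0 - 1) / population)))

rng = np.random.default_rng(3)
# Hypothetical historical event population with one tunable parameter.
hist = pd.DataFrame({
    "event_id": np.arange(20_000),
    "amount_90d": rng.lognormal(8.0, 1.2, 20_000),
})

candidate_threshold = np.percentile(hist["amount_90d"], 85)   # from Phase 3
hist["flag"] = np.where(hist["amount_90d"] >= candidate_threshold, "ATL", "BTL")

# Draw a random, statistically sized sample of pseudo alerts from each side of the line.
samples = []
for flag, group in hist.groupby("flag"):
    n = required_sample_size(len(group))
    samples.append(group.sample(n=n, random_state=1))
pseudo_alerts = pd.concat(samples)

print(pseudo_alerts["flag"].value_counts())
```

In practice the sampled pseudo alerts would then be enriched with the transactional detail fields the FIU needs before being handed over for the qualitative assessment.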