Copyright © 2013, SAS Institute Inc. All rights reserved. #analytics2013
Credit Card Fraud Detection
Why Theory Doesn't Adjust to Practice
Alejandro Correa Bahnsen, University of Luxembourg
Andrés Gonzalez Montoya, Scotia Bank

Introduction
[Figure: Europe fraud evolution. Internet transactions (millions of euros), 2007 to 2012E; axis from €500 to €800.]

Introduction
[Figure: US fraud evolution. Online revenue lost due to fraud (billions of dollars), 2001 to 2012; axis from $0 to $5.0.]

Introduction
• Fraud levels are increasing around the world
• Different technologies and legal requirements make fraud harder to control
• There is a need for advanced fraud detection systems

Agenda
• Introduction
• Transaction flow
• Database
• Evaluation of algorithms
• If-Then rules (Expert Rules)
• Financial measure
• Predictive modeling
• Logistic Regression
• Cost Sensitive Logistic Regression

Simplified transaction flow
[Figure: transaction flow diagram; labels: Network, Fraud??]

Data
• Large European card processing company
• 2012 card present transactions
• 750,000 transactions
• 3,500 frauds
• 0.467% fraud rate
• 148,562 EUR lost due to fraud on the test dataset
[Figure: monthly timeline, Jan to Dec, split into train and test periods.]

Data
• Raw attributes

| Trx ID | Client ID | Date | Amount | Location | Type | Merchant Group | Fraud |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 2/1/12 6:00 | 580 | Ger | Internet | Airlines | No |
| 2 | 1 | 2/1/12 6:15 | 120 | Eng | Present | Car Rent | No |
| 3 | 2 | 2/1/12 8:20 | 12 | Bel | Present | Hotel | Yes |
| 4 | 1 | 3/1/12 4:15 | 60 | Esp | ATM | ATM | No |
| 5 | 2 | 3/1/12 9:18 | 8 | Fra | Present | Retail | No |
| 6 | 1 | 3/1/12 9:55 | 1210 | Ita | Internet | Airlines | Yes |

• Other attributes: age, country of residence, postal code, type of card

Data
• Derived attributes

| Trx ID | Client ID | Date | Amount | Location | Type | Merchant Group | Fraud | No. of Trx, same client, last 6 hours | Sum, same client, last 7 days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2/1/12 6:00 | 580 | Ger | Internet | Airlines | No | 0 | 0 |
| 2 | 1 | 2/1/12 6:15 | 120 | Eng | Present | Car Renting | No | 1 | 580 |
| 3 | 2 | 2/1/12 8:20 | 12 | Bel | Present | Hotel | Yes | 0 | 0 |
| 4 | 1 | 3/1/12 4:15 | 60 | Esp | ATM | ATM | No | 0 | 700 |
| 5 | 2 | 3/1/12 9:18 | 8 | Fra | Present | Retail | No | 0 | 12 |
| 6 | 1 | 3/1/12 9:55 | 1210 | Ita | Internet | Airlines | Yes | 1 | 760 |

– Combination of the following criteria:

| By | Group | Last | Function |
|---|---|---|---|
| Client | None | hour | Count |
| Credit Card | Transaction Type | day | Sum(Amount) |
| | Merchant | week | Avg(Amount) |
| | Merchant Category | month | |
| | Merchant Country | 3 months | |
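The two derived columns in the example table can be reproduced with a small sketch. It assumes the dates are day/month/year, that only earlier transactions of the same client count, and that the window excludes the current transaction; the function names are illustrative and not taken from the slides.

```python
from datetime import datetime, timedelta

# (trx_id, client_id, timestamp, amount), copied from the example table.
# Assumption: dates are day/month/year, so "2/1/12" is 2 January 2012.
TRANSACTIONS = [
    (1, 1, datetime(2012, 1, 2, 6, 0), 580),
    (2, 1, datetime(2012, 1, 2, 6, 15), 120),
    (3, 2, datetime(2012, 1, 2, 8, 20), 12),
    (4, 1, datetime(2012, 1, 3, 4, 15), 60),
    (5, 2, datetime(2012, 1, 3, 9, 18), 8),
    (6, 1, datetime(2012, 1, 3, 9, 55), 1210),
]


def aggregate(transactions, client_id, now, window, func):
    """Apply func to the amounts of the client's earlier transactions in the window."""
    amounts = [
        amount
        for _, cid, ts, amount in transactions
        if cid == client_id and now - window <= ts < now
    ]
    return func(amounts)


def derived_attributes(transactions):
    """Return (trx_id, count in last 6 hours, sum over last 7 days) per transaction."""
    rows = []
    for trx_id, cid, ts, _ in transactions:
        count_6h = aggregate(transactions, cid, ts, timedelta(hours=6), len)
        sum_7d = aggregate(transactions, cid, ts, timedelta(days=7), sum)
        rows.append((trx_id, count_6h, sum_7d))
    return rows
```

Other combinations from the criteria table (for example an average by merchant category over the last month) would follow by changing the grouping key, the window and the function.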

Evaluation
• Confusion matrix

| Predicted class ($p_i$) | True class: Fraud ($y_i=1$) | True class: Legitimate ($y_i=0$) |
|---|---|---|
| Fraud ($p_i=1$) | TP | FP |
| Legitimate ($p_i=0$) | FN | TN |

• Misclassification $= 1 - \frac{TP+TN}{TP+TN+FP+FN}$
• Recall $= \frac{TP}{TP+FN}$
• Precision $= \frac{TP}{TP+FP}$
• F-Score $= 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$
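The four measures can be computed directly from the confusion matrix counts. This is a minimal sketch; the function name and the convention of returning 0 when a denominator is empty are assumptions, not part of the slides.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Compute the slide's four measures from confusion matrix counts."""
    total = tp + fp + fn + tn
    misclassification = 1 - (tp + tn) / total
    # Assumption: report 0 when a ratio is undefined (no positives at all).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "misclassification": misclassification,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
    }
```

With a fraud rate below 1%, a model that never predicts fraud still has a very low misclassification rate, which is why recall, precision and F-Score are reported alongside it.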

Fraud Algorithms
• If-Then rules
• Predictive modeling
• Logistic Regression
• Decision Trees
• Random Forest
• Cost Sensitive Logistic Regression
[Figure: transaction flow diagram; labels: Network, Fraud??]

If-Then rules (Expert rules)
• “Purpose is to use facts and rules, taken from the knowledge of many human experts, to help make decisions.”
• Example of rules
  • More than 4 ATM transactions in one hour?
  • More than 2 transactions in 5 minutes?
  • Magnetic stripe transaction then internet transaction?

If-Then rules (Expert rules)
[Figure: transaction flow diagram; labels: Network, Fraud??]
If one or more rules are activated, the transaction is declined.
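The three example rules and the decline decision could be sketched as follows. The transaction fields (`time`, `type`), the type labels (`"ATM"`, `"Internet"`, `"MagStripe"`) and the choice to count the current transaction toward each threshold are assumptions made for illustration; `history` is assumed to hold earlier transactions of the same card.

```python
from datetime import datetime, timedelta


def recent(history, now, window, trx_type=None):
    """Earlier transactions inside the window, optionally of a single type."""
    return [
        t for t in history
        if now - window <= t["time"] < now
        and (trx_type is None or t["type"] == trx_type)
    ]


def many_atm_in_hour(trx, history):
    # More than 4 ATM transactions in one hour (current one included).
    if trx["type"] != "ATM":
        return False
    return len(recent(history, trx["time"], timedelta(hours=1), "ATM")) + 1 > 4


def burst_in_five_minutes(trx, history):
    # More than 2 transactions in 5 minutes (current one included).
    return len(recent(history, trx["time"], timedelta(minutes=5))) + 1 > 2


def stripe_then_internet(trx, history):
    # Magnetic stripe transaction immediately followed by an internet one.
    if trx["type"] != "Internet" or not history:
        return False
    previous = max(history, key=lambda t: t["time"])
    return previous["type"] == "MagStripe"


RULES = [many_atm_in_hour, burst_in_five_minutes, stripe_then_internet]


def should_decline(trx, history):
    """Decline when one or more rules are activated."""
    return any(rule(trx, history) for rule in RULES)
```

Each rule is a plain predicate, so adding one means appending a function to `RULES`.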

If-Then rules (Expert rules)
• Problems with rules
  • New fraud patterns are not detected
  • Only simple rules can be created
• Advantages of rules
  • Easy to implement
  • Very easy to interpret

If-Then rules (Expert rules)
Results

| Misclassification | Recall | Precision | F1-Score |
|---|---|---|---|
| 1.04% | 31% | 17% | 22% |

Financial evaluation
• Motivation
  • False positives carry a different cost than false negatives
  • Frauds range from a few to thousands of euros (dollars, pounds, etc.)
There is a need for a real comparison measure

Financial evaluation
• Cost matrix

| Predicted class ($p_i$) | True class: Fraud ($y_i=1$) | True class: Legitimate ($y_i=0$) |
|---|---|---|
| Fraud ($p_i=1$) | $C_a$ | $C_a$ |
| Legitimate ($p_i=0$) | $Amt_i$ | 0 |

where:
  $C_a$: administrative costs
  $Amt_i$: amount of transaction $i$

• Evaluation measure
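The formula for the evaluation measure is not preserved in this transcript. One reading that follows directly from the cost matrix is to sum the matrix entry of every transaction; the sketch below does that, and the function and parameter names are illustrative.

```python
def total_cost(y_true, y_pred, amounts, admin_cost):
    """Sum the cost matrix entry of each transaction.

    Predicted fraud costs admin_cost whatever the true class is;
    a missed fraud costs the transaction amount; a correctly
    accepted legitimate transaction costs nothing.
    """
    cost = 0.0
    for y, p, amount in zip(y_true, y_pred, amounts):
        if p == 1:
            cost += admin_cost
        elif y == 1:
            cost += amount
    return cost
```

Under this reading, predicting "legitimate" for every transaction gives a cost equal to the sum of the fraud amounts, which matches the idea of a "Cost No Model" baseline.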

If-Then rules
Results

| Misclassification | Recall | Precision | F1-Score | Cost | Cost No Model |
|---|---|---|---|---|---|
| 1.04% | 31% | 17% | 22% | € 95,520 | € 148,562 |

148,562 EUR are the losses due to fraud in the test database (2 months)

Predictive modeling
Predictive modeling is the use of statistical and mathematical techniques to discover patterns in data in order to make predictions

Predictive modeling
[Figure: scatter plot of normal and fraudulent transactions. Axes: number of transactions last day, amount of transaction. A later panel adds a third axis: amount spent on internet last month.]

Logistic Regression
• Model
• Cost Function
• Cost Matrix

| Predicted class ($p_i$) | True class: Fraud ($y_i=1$) | True class: Legitimate ($y_i=0$) |
|---|---|---|
| Fraud ($p_i=1$) | 0 | 1 |
| Legitimate ($p_i=0$) | 1 | 0 |
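The model and cost function formulas are not preserved in this transcript. The sketch below uses the standard textbook form of logistic regression (sigmoid of a linear score, average log loss), which treats both kinds of error symmetrically as the 0/1 cost matrix does; it is not necessarily the exact notation of the slides.

```python
import math


def predict_proba(theta, x):
    """Estimated probability of fraud: sigmoid of the linear score theta . x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    # Numerically stable sigmoid for large positive or negative scores.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(theta, X, y):
    """Average logistic (cross-entropy) loss over the data set."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for xi, yi in zip(X, y):
        h = min(max(predict_proba(theta, xi), eps), 1 - eps)
        total += -(yi * math.log(h) + (1 - yi) * math.log(1 - h))
    return total / len(y)
```

Nothing in this loss depends on the transaction amount, so a missed fraud of a few euros and one of thousands of euros are penalized identically.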
Copyright © 2013, SAS Institute Inc. All rights reserved. #analytics2013
€
148,196
€
148,562
Cost Cost No Model
0.52% 0%
2%
0%
Miss-cla Recall Precision F1-Score
Logistic Regression
Results
148,562 EUR are the losses due to fraud in the test database (2 months)

Logistic Regression
Sub-sampling procedure:
Select all the frauds and a random sample of the legitimate transactions.

| Fraud percentage | Transactions in training set |
|---|---|
| 0.467% | 620,000 |
| 1% | 310,000 |
| 5% | 62,000 |
| 10% | 31,000 |
| 20% | 15,500 |
| 50% | 5,200 |
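The sub-sampling procedure can be sketched as follows: keep every fraud and draw just enough legitimate transactions to reach the target fraud percentage. The function name, the seed parameter and the rounding are assumptions for illustration.

```python
import random


def subsample(labels, fraud_share, seed=0):
    """Return sorted indices of all frauds plus a random sample of legitimate rows.

    labels: list of 0/1 values (1 = fraud).
    fraud_share: target fraction of frauds in the sample, e.g. 0.10.
    """
    rng = random.Random(seed)
    frauds = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    # Number of legitimate rows needed so that frauds make up fraud_share.
    n_legit = round(len(frauds) * (1 - fraud_share) / fraud_share)
    n_legit = min(n_legit, len(legit))
    return sorted(frauds + rng.sample(legit, n_legit))
```

The higher the target percentage, the more legitimate transactions are discarded, which is why the training sets in the table shrink so quickly.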

Logistic Regression
Results: selecting the algorithm by cost

| Training set | Cost |
|---|---|
| No Model | € 148,562 |
| All | € 148,196 |
| 1% | € 142,510 |
| 5% | € 112,103 |
| 10% | € 79,838 |
| 20% | € 65,870 |
| 50% | € 46,530 |

[Figure: cost, recall, precision, misclassification and F1-score for each training set.]

Logistic Regression
• The best model selected using the traditional F1-Score does not give the best results in terms of cost
• The model selected by cost is trained using less than 1% of the database, meaning a lot of information is excluded
• The algorithm is trained to minimize the misclassification (approx.) but is then evaluated based on cost
• Why not train the algorithm to minimize the cost instead?

Cost Sensitive Logistic Regression
• Cost Matrix

| Predicted class ($p_i$) | True class: Fraud ($y_i=1$) | True class: Legitimate ($y_i=0$) |
|---|---|---|
| Fraud ($p_i=1$) | $C_a$ | $C_a$ |
| Legitimate ($p_i=0$) | $Amt_i$ | 0 |

• Cost Function
• Objective
Find $\theta$ that minimizes the cost function (Genetic Algorithms)
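The cost function itself is not preserved in this transcript. One plausible form, assumed here, replaces the hard prediction in the cost matrix with the model's estimated fraud probability, so each transaction contributes its expected cost. The names below are illustrative.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cost_sensitive_cost(theta, X, y, amounts, admin_cost):
    """Average expected cost under the cost matrix (assumed form).

    h is the estimated fraud probability of a transaction:
      fraud:      h * admin_cost + (1 - h) * amount
      legitimate: h * admin_cost
    """
    total = 0.0
    for xi, yi, amount in zip(X, y, amounts):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        if yi == 1:
            total += h * admin_cost + (1 - h) * amount
        else:
            total += h * admin_cost
    return total / len(y)
```

A function of this shape is not convex in `theta` in general, which is consistent with the slide's choice of a derivative-free optimizer such as a genetic algorithm; any search routine that only needs function evaluations could be applied to it.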

Cost Sensitive Logistic Regression
• Cost Function
• Gradient
• Hessian

Cost Sensitive Logistic Regression
[Figure: amount cumulative distribution (0% to 100%) for legitimate and fraud transactions; annotated amounts: €49, €124, €196, €370.]

Cost Sensitive Logistic Regression
Results

| Training set | Cost |
|---|---|
| No Model | € 148,562 |
| All | € 31,174 |
| 1% | € 37,785 |
| 5% | € 66,245 |
| 10% | € 67,264 |
| 20% | € 73,772 |
| 50% | € 85,724 |

[Figure: cost, recall, precision and F1-score for each training set.]

Cost Sensitive Logistic Regression
Results

| Algorithm | Cost |
|---|---|
| No Model | € 148,562 |
| If-Then rules | € 95,520 |
| Logistic Regression | € 46,530 |
| Cost Sensitive Logistic Regression | € 31,174 |
| Decision Trees | € 35,466 |
| Random Forests | € 34,203 |

[Figure: cost, recall, precision and F1-score for each algorithm.]

Conclusion
• Selecting models based on traditional statistics does not give the best results in terms of cost
• Models should be evaluated taking into account the real financial costs of the application
• Algorithms should be developed to incorporate those financial costs
Contact information
Alejandro Correa Bahnsen
University of Luxembourg
Luxembourg
al.bahnsen@gmail.com
http://www.linkedin.com/in/albahnsen
http://www.slideshare.net/albahnsen
Thank You!!
Alejandro Correa Bahnsen
Andres Gonzalez Montoya
• Hastie, T., & Tibshirani, R. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Beijing.
• Hand, D., Whitrow, C., Adams, N. M., Juszczak, P., & Weston, D. (2007). Performance criteria for plastic card fraud
detection tools. Journal of the Operational Research Society, 59, 956–962.
• Sheng, V., & Ling, C. (2006). Thresholding for making classifiers cost-sensitive. Proceedings of the National
Conference on Artificial Intelligence.
• Bhattacharyya, S., Jha, S., Tharakunnel, K., & Westland, J. C. (2011). Data mining for credit card fraud: A
comparative study. Decision Support Systems, 50(3), 602–613.
• Ling, C., & Sheng, V. (2008). Cost-sensitive learning and the class imbalance problem. In C. Sammut & G. I. Webb
(Eds.), Encyclopedia of Machine Learning (pp. 231–235). Springer.
• Moro, S., Laureano, R., & Cortez, P. (2011). Using data mining for bank direct marketing: An application of the
crisp-dm methodology. In EUROSIS (Ed.), European Simulation and Modeling Conference - ESM’2011 (pp. 117–
121). Guimares, Portugal.
References

Más contenido relacionado

La actualidad más candente

Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningStefano Tempesta
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detectionijtsrd
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Francesca Lazzeri, PhD
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learningijtsrd
 
Uses of analytics in the field of Banking
Uses of analytics in the field of BankingUses of analytics in the field of Banking
Uses of analytics in the field of BankingNiveditasri N
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
Predictive analytics solution for claims fraud detection
Predictive analytics solution for claims fraud detectionPredictive analytics solution for claims fraud detection
Predictive analytics solution for claims fraud detectionZensar Technologies Ltd.
 
Fraud detection ML
Fraud detection MLFraud detection ML
Fraud detection MLMaatougSelim
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learningdataalcott
 
Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentationHernan Huwyler
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learningSandeep Garg
 
Mclarens @ Data Science Sg
Mclarens @ Data Science SgMclarens @ Data Science Sg
Mclarens @ Data Science SgBenji Thian
 
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoFraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoInstitute of Contemporary Sciences
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithmsankit panigrahy
 
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"Lviv Startup Club
 
Ibm financial crime management solution 3
Ibm financial crime management solution 3Ibm financial crime management solution 3
Ibm financial crime management solution 3Sunny Fei
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine LearningScaleway
 

La actualidad más candente (20)

Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine Learning
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detection
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research Paper
 
Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...Operationalize deep learning models for fraud detection with Azure Machine Le...
Operationalize deep learning models for fraud detection with Azure Machine Le...
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learning
 
Uses of analytics in the field of Banking
Uses of analytics in the field of BankingUses of analytics in the field of Banking
Uses of analytics in the field of Banking
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
Predictive analytics solution for claims fraud detection
Predictive analytics solution for claims fraud detectionPredictive analytics solution for claims fraud detection
Predictive analytics solution for claims fraud detection
 
Fraud detection ML
Fraud detection MLFraud detection ML
Fraud detection ML
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentation
 
Fraud detection
Fraud detectionFraud detection
Fraud detection
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learning
 
Mclarens @ Data Science Sg
Mclarens @ Data Science SgMclarens @ Data Science Sg
Mclarens @ Data Science Sg
 
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoFraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
 
Credit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning AlgorithmsCredit card fraud detection using machine learning Algorithms
Credit card fraud detection using machine learning Algorithms
 
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"Liubomyr Bregman  "Financial Crime Detection using Advanced Analytics"
Liubomyr Bregman "Financial Crime Detection using Advanced Analytics"
 
Ibm financial crime management solution 3
Ibm financial crime management solution 3Ibm financial crime management solution 3
Ibm financial crime management solution 3
 
Creditcard
CreditcardCreditcard
Creditcard
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine Learning
 

Destacado

Fraud analytics detección y prevención de fraudes en la era del big data sl...
Fraud analytics detección y prevención de fraudes en la era del big data   sl...Fraud analytics detección y prevención de fraudes en la era del big data   sl...
Fraud analytics detección y prevención de fraudes en la era del big data sl...Alejandro Correa Bahnsen, PhD
 
Maximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningMaximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningAlejandro Correa Bahnsen, PhD
 
Classifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksClassifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksAlejandro Correa Bahnsen, PhD
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Alejandro Correa Bahnsen, PhD
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesAlejandro Correa Bahnsen, PhD
 

Destacado (10)

2011 advanced analytics through the credit cycle
2011 advanced analytics through the credit cycle2011 advanced analytics through the credit cycle
2011 advanced analytics through the credit cycle
 
Modern Data Science
Modern Data ScienceModern Data Science
Modern Data Science
 
Fraud analytics detección y prevención de fraudes en la era del big data sl...
Fraud analytics detección y prevención de fraudes en la era del big data   sl...Fraud analytics detección y prevención de fraudes en la era del big data   sl...
Fraud analytics detección y prevención de fraudes en la era del big data sl...
 
Maximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learningMaximizing a churn campaigns profitability with cost sensitive machine learning
Maximizing a churn campaigns profitability with cost sensitive machine learning
 
1609 Fraud Data Science
1609 Fraud Data Science1609 Fraud Data Science
1609 Fraud Data Science
 
Analytics - compitiendo en la era de la informacion
Analytics - compitiendo en la era de la informacionAnalytics - compitiendo en la era de la informacion
Analytics - compitiendo en la era de la informacion
 
Classifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural NetworksClassifying Phishing URLs Using Recurrent Neural Networks
Classifying Phishing URLs Using Recurrent Neural Networks
 
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...Maximizing a churn campaign’s profitability with cost sensitive predictive an...
Maximizing a churn campaign’s profitability with cost sensitive predictive an...
 
Demystifying machine learning using lime
Demystifying machine learning using limeDemystifying machine learning using lime
Demystifying machine learning using lime
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slides
 

Similar a 2013 credit card fraud detection why theory dosent adjust to practice

IRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET Journal
 
Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNeo4j
 
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Roger Barga
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarImpetus Technologies
 
Marketing in the Cloud
Marketing in the CloudMarketing in the Cloud
Marketing in the CloudScott Brinker
 
RoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarRoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarSmart Insights
 
Analytics in las vegas
Analytics in las vegasAnalytics in las vegas
Analytics in las vegasLon ODonnell
 
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...TigerGraph
 
AWS 金融服務概覽與區塊鍊案例分享
AWS 金融服務概覽與區塊鍊案例分享AWS 金融服務概覽與區塊鍊案例分享
AWS 金融服務概覽與區塊鍊案例分享Amazon Web Services
 
Decision CAMP 2013 - sako hidetoshi - blaze consulting japan - Using Business...
Decision CAMP 2013 - sako hidetoshi - blaze consulting japan - Using Business...Decision CAMP 2013 - sako hidetoshi - blaze consulting japan - Using Business...
Decision CAMP 2013 - sako hidetoshi - blaze consulting japan - Using Business...Decision CAMP
 
Share Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptxShare Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptxyatintaneja6
 
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
IRJET -  	  Fraud Detection in Credit Card using Machine Learning TechniquesIRJET -  	  Fraud Detection in Credit Card using Machine Learning Techniques
IRJET - Fraud Detection in Credit Card using Machine Learning TechniquesIRJET Journal
 
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...
Democratization - New Wave of Data Science (홍운표 상무, DataRobot) :: AWS Techfor...Amazon Web Services Korea
 
ATHENS Predictive Claim Subrogation - QueBIT
ATHENS Predictive Claim Subrogation - QueBITATHENS Predictive Claim Subrogation - QueBIT
ATHENS Predictive Claim Subrogation - QueBITQueBIT Consulting
 
How Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its CustomersHow Eastern Bank Uses Big Data to Better Serve and Protect its Customers
How Eastern Bank Uses Big Data to Better Serve and Protect its CustomersBrian Griffith
 

Similar a 2013 credit card fraud detection why theory dosent adjust to practice (20)

Data Science for Retail Broking
Data Science for Retail BrokingData Science for Retail Broking
Data Science for Retail Broking
 
Data Science for Retail Broking
Data Science for Retail BrokingData Science for Retail Broking
Data Science for Retail Broking
 
IRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention System
 
Next Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4jNext Generation Fraud Solutions using Neo4j
Next Generation Fraud Solutions using Neo4j
 
Trends in AML Compliance and Technology
Trends in AML Compliance and TechnologyTrends in AML Compliance and Technology
Trends in AML Compliance and Technology
 
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
Neo4j GraphTalk Copenhagen - Next Generation Solutions using Neo4j
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
Machine Learning For Stock Broking
Machine Learning For Stock BrokingMachine Learning For Stock Broking
Machine Learning For Stock Broking
 
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix WebinarFuture-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
Future-Proof Your Streaming Analytics Architecture- StreamAnalytix Webinar
 
Marketing in the Cloud
Marketing in the CloudMarketing in the Cloud
Marketing in the Cloud
 
RoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology WebinarRoMT - Part 2 Marketing Technology Webinar
RoMT - Part 2 Marketing Technology Webinar
 
Analytics in las vegas
Analytics in las vegasAnalytics in las vegas
Analytics in las vegas
 
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...
Detecting Fraud and AML Violations In Real-Time for Banking, Telecom and eCom...
 
AWS 金融服務概覽與區塊鍊案例分享
AWS 金融服務概覽與區塊鍊案例分享AWS 金融服務概覽與區塊鍊案例分享
2013 Credit Card Fraud Detection: Why Theory Doesn't Adjust to Practice

  • 1. Copyright © 2013, SAS Institute Inc. All rights reserved. #analytics2013 Credit Card Fraud Detection Why Theory Doesn't Adjust to Practice Alejandro Correa Bahnsen, Luxembourg University Andrés Gonzalez Montoya, Scotia Bank
  • 2. Introduction. [Chart: Europe fraud evolution, Internet transactions (millions of euros), 2007 to 2012E; vertical axis from €500 to €800]
  • 3. Introduction. [Chart: US fraud evolution, online revenue lost due to fraud (billions of dollars), 2001 to 2012; vertical axis from $0 to $5.0]
  • 4. Introduction: fraud levels are increasing around the world; different technologies and legal requirements make fraud harder to control; there is a need for advanced fraud detection systems.
  • 5. Agenda: Introduction; Transaction flow; Database; Evaluation of algorithms; If-Then rules (Expert Rules); Financial measure; Predictive modeling; Logistic Regression; Cost Sensitive Logistic Regression.
  • 6. Simplified transaction flow. [Diagram labels: Network, Fraud??]
  • 7. Data: a large European card processing company; 2012 card-present transactions; 750,000 transactions; 3,500 frauds; 0.467% fraud rate; 148,562 EUR lost due to fraud on the test dataset. [Timeline: Jan to Dec, split into Train and Test]
  • 8. Data. Raw attributes:
      TRXID | Client ID | Date        | Amount | Location | Type     | Merchant Group | Fraud
      1     | 1         | 2/1/12 6:00 | 580    | Ger      | Internet | Airlines       | No
      2     | 1         | 2/1/12 6:15 | 120    | Eng      | Present  | Car Rent       | No
      3     | 2         | 2/1/12 8:20 | 12     | Bel      | Present  | Hotel          | Yes
      4     | 1         | 3/1/12 4:15 | 60     | Esp      | ATM      | ATM            | No
      5     | 2         | 3/1/12 9:18 | 8      | Fra      | Present  | Retail         | No
      6     | 1         | 3/1/12 9:55 | 1210   | Ita      | Internet | Airlines       | Yes
      Other attributes: age, country of residence, postal code, type of card.
  • 9. Data. Derived attributes:
      Trx ID | Client ID | Date        | Amount | Location | Type     | Merchant Group | Fraud | No. of Trx, same client, last 6 hours | Sum, same client, last 7 days
      1      | 1         | 2/1/12 6:00 | 580    | Ger      | Internet | Airlines       | No    | 0                                     | 0
      2      | 1         | 2/1/12 6:15 | 120    | Eng      | Present  | Car Renting    | No    | 1                                     | 580
      3      | 2         | 2/1/12 8:20 | 12     | Bel      | Present  | Hotel          | Yes   | 0                                     | 0
      4      | 1         | 3/1/12 4:15 | 60     | Esp      | ATM      | ATM            | No    | 0                                     | 700
      5      | 2         | 3/1/12 9:18 | 8      | Fra      | Present  | Retail         | No    | 0                                     | 12
      6      | 1         | 3/1/12 9:55 | 1210   | Ita      | Internet | Airlines       | Yes   | 1                                     | 760
      Derived attributes are built as a combination of the following criteria:
      By                | Group            | Last     | Function
      Client            | None             | hour     | Count
      Credit Card       | Transaction Type | day      | Sum(Amount)
      Merchant          |                  | week     | Avg(Amount)
      Merchant Category |                  | month    |
      Merchant Country  |                  | 3 months |
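The two derived columns in the example table can be reproduced with a short sketch. This is only an illustration, not the authors' implementation: the function name `derive`, the tuple layout, and the reading of dates as day/month/year are assumptions, and the window is taken to cover strictly earlier transactions of the same client.

```python
from datetime import datetime, timedelta

# Rows copied from the example table: (trx_id, client_id, date, amount).
# Assumption: dates such as "2/1/12" are read as day/month/year.
TRANSACTIONS = [
    (1, 1, datetime(2012, 1, 2, 6, 0), 580),
    (2, 1, datetime(2012, 1, 2, 6, 15), 120),
    (3, 2, datetime(2012, 1, 2, 8, 20), 12),
    (4, 1, datetime(2012, 1, 3, 4, 15), 60),
    (5, 2, datetime(2012, 1, 3, 9, 18), 8),
    (6, 1, datetime(2012, 1, 3, 9, 55), 1210),
]


def derive(transactions, window, func):
    """Aggregate, for each transaction, the amounts of strictly earlier
    transactions of the same client that fall inside `window`."""
    result = {}
    for trx_id, client, when, _ in transactions:
        previous = [
            amount
            for _, other_client, other_when, amount in transactions
            if other_client == client and when - window <= other_when < when
        ]
        result[trx_id] = func(previous)
    return result


# "By client, last 6 hours, Count" and "By client, last 7 days, Sum(Amount)".
count_6h = derive(TRANSACTIONS, timedelta(hours=6), len)
sum_7d = derive(TRANSACTIONS, timedelta(days=7), sum)
```

Swapping `len` or `sum` for an average, or grouping by merchant instead of client, would give the other combinations listed in the criteria table.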
  • 10. Evaluation. Confusion matrix:
                                       | True class: Fraud ($y_i = 1$) | True class: Legitimate ($y_i = 0$)
      Predicted: Fraud ($p_i = 1$)      | TP                            | FP
      Predicted: Legitimate ($p_i = 0$) | FN                            | TN
      Measures:
      $\text{Misclassification} = 1 - \frac{TP + TN}{TP + TN + FP + FN}$
      $\text{Recall} = \frac{TP}{TP + FN}$
      $\text{Precision} = \frac{TP}{TP + FP}$
      $\text{F-Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
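As a minimal sketch, the four measures can be computed directly from the confusion-matrix counts. The function name `evaluate` and the counts used in the call are hypothetical and are not taken from the presentation's data.

```python
def evaluate(tp, fp, fn, tn):
    """Compute the four evaluation measures from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "misclassification": 1 - (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical counts, for illustration only.
measures = evaluate(tp=2, fp=2, fn=6, tn=90)
```

With a fraud rate below 1%, a model that flags nothing already has a very low misclassification rate, which is why recall, precision and F-Score are reported alongside it.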
  • 11. Agenda: Introduction; Transaction flow; Database; Evaluation of algorithms; If-Then rules (Expert Rules); Financial measure; Predictive modeling; Logistic Regression; Cost Sensitive Logistic Regression.
  • 12. Fraud Algorithms: If-Then rules; Predictive modeling; Logistic Regression; Decision Trees; Random Forest; Cost Sensitive Logistic Regression. [Diagram labels: Network, Fraud??]
  • 13. If-Then rules (Expert rules). “Purpose is to use facts and rules, taken from the knowledge of many human experts, to help make decisions.” Example rules: More than 4 ATM transactions in one hour? More than 2 transactions in 5 minutes? Magnetic stripe transaction, then internet transaction?
  • 14. If-Then rules (Expert rules). More than 4 ATM transactions in one hour? More than 2 transactions in 5 minutes? Magnetic stripe transaction, then internet transaction? If one or more rules are activated, decline the transaction. [Diagram labels: Network, Fraud??]
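The decision logic on this slide can be sketched as a list of rule functions combined with `any`. This is an illustration under stated assumptions rather than the deployed system: the dictionary layout, the function names, the choice to count the current transaction, and the reading of "then" as "immediately preceding transaction" are all hypothetical.

```python
from datetime import datetime, timedelta


def count_recent(history, trx, window, trx_type=None):
    """Count the client's transactions inside the window ending at `trx`.
    Assumption: the current transaction is included in the count."""
    start = trx["time"] - window
    earlier = [
        t for t in history
        if start <= t["time"] <= trx["time"]
        and (trx_type is None or t["type"] == trx_type)
    ]
    current = 1 if trx_type is None or trx["type"] == trx_type else 0
    return len(earlier) + current


def too_many_atm(history, trx):
    # More than 4 ATM transactions in one hour?
    return count_recent(history, trx, timedelta(hours=1), "ATM") > 4


def too_many_quick(history, trx):
    # More than 2 transactions in 5 minutes?
    return count_recent(history, trx, timedelta(minutes=5)) > 2


def stripe_then_internet(history, trx):
    # Magnetic stripe transaction, then internet transaction?
    # Assumption: "then" refers to the immediately preceding transaction.
    if not history or trx["type"] != "Internet":
        return False
    last = max(history, key=lambda t: t["time"])
    return last["type"] == "Magnetic stripe"


RULES = [too_many_atm, too_many_quick, stripe_then_internet]


def decline(history, trx):
    """Decline when one or more rules are activated.
    `history` holds the same client's earlier transactions."""
    return any(rule(history, trx) for rule in RULES)
```

Keeping each rule as a separate function mirrors the appeal of expert rules noted in the deck: each one is easy to implement and to read on its own.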
  • 15. If-Then rules (Expert rules). Problems with rules: new fraud patterns are not detected; only simple rules can be created. Advantages of rules: easy to implement; very easy to interpret.
  • 16. If-Then rules (Expert rules). Results: Misclassification 1.04%; Recall 31%; Precision 17%; F1-Score 22%.
  • 17. Financial evaluation. Motivation: false positives carry a different cost than false negatives; frauds range from a few to thousands of euros (dollars, pounds, etc.). There is a need for a real comparison measure.
  • 18. Financial evaluation. Cost matrix:
                                       | True class: Fraud ($y_i = 1$) | True class: Legitimate ($y_i = 0$)
      Predicted: Fraud ($p_i = 1$)      | $C_a$                         | $C_a$
      Predicted: Legitimate ($p_i = 0$) | $Amt_i$                       | 0
      where $C_a$ is the administrative cost and $Amt_i$ is the amount of transaction $i$.
      Evaluation measure: [formula not captured in the transcript]
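The slide's evaluation formula did not survive extraction, but the cost matrix itself implies a total: every transaction flagged as fraud costs the administrative cost, and every missed fraud costs its amount. A minimal sketch of that sum follows; the function name `total_cost`, the sample data, and the administrative cost of 5 are hypothetical.

```python
def total_cost(y_true, y_pred, amounts, admin_cost):
    """Sum the cost-matrix entry of every transaction.

    Predicted fraud (TP or FP): administrative cost.
    Missed fraud (FN): the transaction amount.
    Correctly passed legitimate transaction (TN): no cost.
    """
    cost = 0.0
    for y, p, amount in zip(y_true, y_pred, amounts):
        if p == 1:
            cost += admin_cost
        elif y == 1:
            cost += amount
    return cost


# Hypothetical data for illustration only.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
amounts = [100, 250, 40, 10]

model_cost = total_cost(y_true, y_pred, amounts, admin_cost=5)
# With no model nothing is flagged, so the cost is the sum of all fraud amounts.
no_model_cost = total_cost(y_true, [0, 0, 0, 0], amounts, admin_cost=5)
```

The no-model case matches how the deck reports its baseline: the cost without a model equals the total losses due to fraud.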
  • 19. If-Then rules. Results: Misclassification 1.04%; Recall 31%; Precision 17%; F1-Score 22%. Cost: €95,520, against €148,562 with no model. The 148,562 EUR are the losses due to fraud in the test database (2 months).
  • 20. Agenda: Introduction; Transaction flow; Database; Evaluation of algorithms; If-Then rules (Expert Rules); Financial measure; Predictive modeling; Logistic Regression; Cost Sensitive Logistic Regression.
  • 21. Predictive modeling is the use of statistical and mathematical techniques to discover patterns in data in order to make predictions.
  • 22. Predictive modeling. [Scatter plot: amount of transaction against number of transactions in the last day; points marked as normal transaction or fraud]
  • 23. Predictive modeling. [Scatter plot: amount of transaction against number of transactions in the last day; points marked as normal transaction or fraud]
  • 24. Predictive modeling. [Scatter plot: amount of transaction, number of transactions in the last day, and amount spent on internet in the last month; points marked as normal transaction or fraud]
  • 25. Logistic Regression. Model: [formula not captured in the transcript]. Cost function: [formula not captured in the transcript]. Cost matrix:
                                       | True class: Fraud ($y_i = 1$) | True class: Legitimate ($y_i = 0$)
      Predicted: Fraud ($p_i = 1$)      | 0                             | 1
      Predicted: Legitimate ($p_i = 0$) | 1                             | 0
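The model and cost-function formulas on this slide were not captured. For reference, the textbook form of logistic regression and its log-loss cost function is given below; the symbols $\theta$, $x_i$ and $N$ follow the usual convention and are assumed here, not taken from the slide.

```latex
% Standard logistic regression model (textbook form, assumed notation)
h_\theta(x_i) = \frac{1}{1 + e^{-\theta^{T} x_i}}

% Standard log-loss cost function over N training examples
J(\theta) = -\frac{1}{N} \sum_{i=1}^{N}
  \Big[ y_i \log h_\theta(x_i) + (1 - y_i) \log\big(1 - h_\theta(x_i)\big) \Big]
```

This cost function treats both kinds of error symmetrically, which corresponds to the 0/1 cost matrix shown on the slide.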
  • 26. Logistic Regression. Results: Misclassification 0.52%; Recall 0%; Precision 2%; F1-Score 0%. Cost: €148,196, against €148,562 with no model. The 148,562 EUR are the losses due to fraud in the test database (2 months).
  • 27. Logistic Regression. Sub-sampling procedure: select all the frauds and a random sample of the legitimate transactions.
      Fraud percentage | Number of transactions
      0.467%           | 620,000
      1%               | 310,000
      5%               | 62,000
      10%              | 31,000
      20%              | 15,500
      50%              | 5,200
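The sub-sampling procedure can be sketched as follows. The function name `subsample`, the fixed seed, and the toy label vector are assumptions made for illustration; the deck does not state how the random sample was drawn.

```python
import random


def subsample(labels, fraud_pct, seed=0):
    """Return sorted row indices that keep every fraud plus a random sample
    of legitimate rows, sized so frauds make up about `fraud_pct` of the result."""
    rng = random.Random(seed)
    frauds = [i for i, y in enumerate(labels) if y == 1]
    legit = [i for i, y in enumerate(labels) if y == 0]
    # Solve frauds / (frauds + n_legit) = fraud_pct for n_legit.
    n_legit = round(len(frauds) * (1 - fraud_pct) / fraud_pct)
    n_legit = min(n_legit, len(legit))
    return sorted(frauds + rng.sample(legit, n_legit))


# Toy data: 10 frauds among 1,000 transactions (a 1% fraud rate).
labels = [1] * 10 + [0] * 990
indices = subsample(labels, fraud_pct=0.10)
```

Raising the target fraud percentage shrinks the training set quickly, because only legitimate transactions are discarded.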
  • 28. Logistic Regression. Results: selecting the algorithm by cost. [Chart: Cost in EUR, with Recall, Precision, Misclassification and F1-Score in percent, for each training set]
      Training set | Cost
      No Model     | €148,562
      All          | €148,196
      1%           | €142,510
      5%           | €112,103
      10%          | €79,838
      20%          | €65,870
      50%          | €46,530
  • 29. Logistic Regression. The best model selected using the traditional F1-Score does not give the best results in terms of cost. The model selected by cost is trained using less than 1% of the database, meaning a lot of information is excluded. The algorithm is trained to (approximately) minimize the misclassification but is then evaluated based on cost. Why not train the algorithm to minimize the cost instead?
  • 30. Cost Sensitive Logistic Regression. Cost matrix:
                                       | True class: Fraud ($y_i = 1$) | True class: Legitimate ($y_i = 0$)
      Predicted: Fraud ($p_i = 1$)      | $C_a$                         | $C_a$
      Predicted: Legitimate ($p_i = 0$) | $Amt_i$                       | 0
      Cost function: [formula not captured in the transcript]. Objective: find the $\theta$ that minimizes the cost function (Genetic Algorithms).
  • 31. Cost Sensitive Logistic Regression. Cost function, gradient and Hessian: [formulas not captured in the transcript].
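Since the formulas on these slides were lost in extraction, the following is only one plausible reconstruction, not necessarily the authors' exact formulation: replace the hard prediction in the cost matrix with the predicted fraud probability and average the expected cost over the training set. The function names, the feature layout, and the sample values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x):
    """Predicted fraud probability for one feature vector."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def cs_cost(theta, X, y, amounts, admin_cost):
    """Average expected cost: flagged transactions cost admin_cost,
    missed frauds cost their amount (assumed soft version of the cost matrix)."""
    total = 0.0
    for x_i, y_i, amt_i in zip(X, y, amounts):
        h = predict(theta, x_i)
        total += y_i * (h * admin_cost + (1 - h) * amt_i) + (1 - y_i) * h * admin_cost
    return total / len(X)


def cs_gradient(theta, X, y, amounts, admin_cost):
    """Gradient of cs_cost: each example contributes
    (admin_cost - y_i * amt_i) * h * (1 - h) * x_i."""
    grad = [0.0] * len(theta)
    for x_i, y_i, amt_i in zip(X, y, amounts):
        h = predict(theta, x_i)
        slope = (admin_cost - y_i * amt_i) * h * (1 - h)
        for j, v in enumerate(x_i):
            grad[j] += slope * v
    return [g / len(X) for g in grad]


# Hypothetical data: a bias term plus one feature per transaction.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1, 0, 1]
amounts = [100.0, 20.0, 300.0]
theta = [0.1, -0.2]
gradient = cs_gradient(theta, X, y, amounts, admin_cost=5.0)
```

A cost of this form weights each fraud by its amount, so missing a large fraud is penalized more than missing a small one. The deck mentions Genetic Algorithms for the minimization; the gradient above would only be needed for a gradient-based optimizer.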
  • 32. Cost Sensitive Logistic Regression. [Chart: cumulative distribution of transaction amount (0% to 100%) for legitimate and fraud transactions; marked amounts: €49, €370, €124, €196]
  • 33. Cost Sensitive Logistic Regression. Results. [Chart: Cost in EUR, with Recall, Precision and F1-Score in percent, for each training set]
      Training set | Cost
      No Model     | €148,562
      All          | €31,174
      1%           | €37,785
      5%           | €66,245
      10%          | €67,264
      20%          | €73,772
      50%          | €85,724
  • 34. Cost Sensitive Logistic Regression. Results. [Chart: Cost in EUR, with Recall, Precision and F1-Score in percent, for each algorithm]
      Algorithm                          | Cost
      No Model                           | €148,562
      If-Then rules                      | €95,520
      Logistic Regression                | €46,530
      Cost Sensitive Logistic Regression | €31,174
      Decision Trees                     | €35,466
      Random Forests                     | €34,203
  • 35. Conclusion. Selecting models based on traditional statistics does not give the best results in terms of cost. Models should be evaluated taking into account the real financial costs of the application. Algorithms should be developed to incorporate those financial costs.
  • 36. Contact information: Alejandro Correa Bahnsen, University of Luxembourg, Luxembourg; al.bahnsen@gmail.com; http://www.linkedin.com/in/albahnsen; http://www.slideshare.net/albahnsen
  • 37. Thank You!! Alejandro Correa Bahnsen, Andres Gonzalez Montoya
  • 38. References:
      Hastie, T., & Tibshirani, R. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Beijing.
      Hand, D., Whitrow, C., Adams, N. M., Juszczak, P., & Weston, D. (2007). Performance criteria for plastic card fraud detection tools. Journal of the Operational Research Society, 59, 956–962.
      Sheng, V., & Ling, C. (2006). Thresholding for making classifiers cost-sensitive. Proceedings of the National Conference on Artificial Intelligence.
      Bhattacharyya, S., Jha, S., Tharakunnel, K., & Westland, J. C. (2011). Data mining for credit card fraud: A comparative study. Decision Support Systems, 50(3), 602–613.
      Ling, C., & Sheng, V. (2008). Cost-sensitive learning and the class imbalance problem. In C. Sammut & G. I. Webb (Eds.), Encyclopedia of Machine Learning (pp. 231–235). Springer.
      Moro, S., Laureano, R., & Cortez, P. (2011). Using data mining for bank direct marketing: An application of the CRISP-DM methodology. In EUROSIS (Ed.), European Simulation and Modeling Conference - ESM’2011 (pp. 117–121). Guimarães, Portugal.