E-COMMERCE FRAUD: MACHINE LEARNING MODELS
Ximena Bustamante
INTRODUCTION
According to Statista, “e-commerce losses to online payment fraud were estimated at 41 billion U.S. dollars globally in 2022, up from the previous year. The figure is expected to grow further to 48 billion U.S. dollars by 2023” (Statista, “Value of e-commerce losses to online payment fraud worldwide from 2020 to 2023”).
Machine learning algorithms are often used to identify potentially fraudulent transactions.
Come explore two models with me, logistic regression and decision trees, that were used to identify variables significantly correlated with fraud.
2023 E-Commerce Fraud Machine Learning Models-Ximena
Bustamante
DATASET
Variables
customerEmail
Multiple
Duplicated
customerPhone
customerDevice
customerIPAddress
customerBillingAddress
No_Transactions
No_Orders
No_Payments
transactionId
orderId
paymentMethodId
paymentMethodRegistrationFailure
paymentMethodType
paymentMethodProvider
transactionAmount
transactionFailed
orderState
Fraud
KEY INSIGHTS
SIGNIFICANT VARIABLES
• The dataset consisted of 19 variables: 18 independent variables and 1 dependent variable
• Only 7 of the 18 independent variables were found to be significant, and the algorithms were run on these
KEY INSIGHTS
LOGISTIC REGRESSION
• A logistic regression model was created with one dependent variable (fraud: Y/N) and 7 independent variables
• According to the confusion matrix used to evaluate it, the model was highly accurate
• As seen in the image on the right, it achieved 88% accuracy, 85% sensitivity, 91% specificity, 90% precision and 87% negative predictive value
• Out of 65 non-fraud transactions in the test data, it correctly identified 59
• Out of 64 fraud transactions in the test data, it correctly identified 55
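The percentages on this slide can be checked against the two counts given. The following is a minimal Python sketch, not the author's code (the original analysis was done in R). The helper name `confusion_metrics` is hypothetical, and treating fraud as the positive class is an assumption. Under that assumption the counts give 55 true positives, 9 false negatives, 59 true negatives and 6 false positives.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return standard confusion-matrix metrics as fractions."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Counts from the slide, with fraud as the positive class (an assumption):
# 55 of 64 fraud cases identified, 59 of 65 non-fraud cases identified.
logit = confusion_metrics(tp=55, fn=64 - 55, tn=59, fp=65 - 59)
for name, value in logit.items():
    print(f"{name}: {value:.1%}")
```

The computed values agree with the rounded figures on the slide. Sensitivity works out to about 85.9%, which the slide reports as 85%.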
KEY INSIGHTS
DECISION TREES
• A decision tree model was also created with the same dependent and independent variables
• According to the confusion matrix used to evaluate it, this model was also highly accurate
• This model achieved 96% sensitivity, 83% specificity, 85% positive predictive value and 95% negative predictive value
• Out of a total of 260 non-fraud transactions, it correctly identified 249
• Out of a total of 257 fraud transactions, it correctly identified 213
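Which class is treated as "positive" decides which number is called sensitivity and which is called specificity. The Python sketch below (the analysis itself was done in R; `rates` is a hypothetical helper) evaluates the decision-tree counts above under both conventions. The slide's figures appear to match the convention where non-fraud is the positive class. That is an inference from the arithmetic, not something the slides state.

```python
def rates(pos_correct, pos_total, neg_correct, neg_total):
    """Sensitivity, specificity, PPV and NPV for a chosen positive class."""
    tp, fn = pos_correct, pos_total - pos_correct
    tn, fp = neg_correct, neg_total - neg_correct
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Decision-tree counts from the slide: 249 of 260 non-fraud, 213 of 257 fraud.
non_fraud_positive = rates(249, 260, 213, 257)
fraud_positive = rates(213, 257, 249, 260)

for label, result in [("non-fraud positive", non_fraud_positive),
                      ("fraud positive", fraud_positive)]:
    print(label, {k: f"{v:.1%}" for k, v in result.items()})
```

With fraud as the positive class, the same counts mean the tree identified about 83% of the fraud transactions.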
DATA PROCESS-ACQUISITION,
PREPARATION, ANALYSIS AND
VISUALIZATION
DATA ACQUISITION, PREPARATION AND
ANALYSIS
EXCEL & ACCESS
• Data was acquired from Kaggle, and the analysis was inspired by Professor Hudson of the University of Illinois Urbana-Champaign (Machine Learning Algorithms with R in Business Analytics)
• Tables with transaction data and customer data were initially joined in Access and then explored in Excel
• Initial exploration of the data revealed multiple customer e-mails associated with one customer
• This led to the creation of a new binary variable that flags transactions for customers with MULTIPLE emails
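The MULTIPLE flag described above can be illustrated with a small Python sketch. This is not the author's Access/Excel procedure. The rows are toy data, and `customerId` is a hypothetical key, because the slides do not say how a single customer was identified. `customerEmail` and `Multiple` are the column names from the variable list.

```python
from collections import defaultdict

# Toy rows; `customerId` is a hypothetical key, `customerEmail` is from the dataset.
rows = [
    {"customerId": "c1", "customerEmail": "a@example.com"},
    {"customerId": "c1", "customerEmail": "b@example.com"},
    {"customerId": "c2", "customerEmail": "c@example.com"},
    {"customerId": "c2", "customerEmail": "c@example.com"},
]

# Collect the distinct e-mails seen for each customer.
emails_by_customer = defaultdict(set)
for row in rows:
    emails_by_customer[row["customerId"]].add(row["customerEmail"])

# Multiple = 1 when the row's customer has more than one distinct e-mail.
for row in rows:
    row["Multiple"] = int(len(emails_by_customer[row["customerId"]]) > 1)

print([row["Multiple"] for row in rows])  # [1, 1, 0, 0]
```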
DATA ACQUISITION, PREPARATION AND
ANALYSIS
POWER BI
• Power Query in Power BI was used to conduct a more in-depth analysis of the variables
• Based on “Column Distribution”, it was evident that some IP addresses, devices and billing addresses were being used by multiple customers (DUPLICATED)
• Thus, a new “Duplicated” column was created to reflect these transactions
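A sketch of the "Duplicated" idea in Python, standing in for the Power Query steps (which the slides do not show). The rows are toy data. Identifying a customer by `customerEmail` is an assumption made only for this illustration.

```python
from collections import defaultdict

# Columns whose values may be shared across customers (names from the dataset).
SHARED_FIELDS = ["customerIPAddress", "customerDevice", "customerBillingAddress"]

rows = [
    {"customerEmail": "a@example.com", "customerIPAddress": "10.0.0.1",
     "customerDevice": "d1", "customerBillingAddress": "1 Main St"},
    {"customerEmail": "b@example.com", "customerIPAddress": "10.0.0.1",
     "customerDevice": "d2", "customerBillingAddress": "2 Oak Ave"},
    {"customerEmail": "c@example.com", "customerIPAddress": "10.0.0.3",
     "customerDevice": "d3", "customerBillingAddress": "3 Elm Rd"},
]

# For every (field, value) pair, record which customers used it.
users = defaultdict(set)
for row in rows:
    for field in SHARED_FIELDS:
        users[(field, row[field])].add(row["customerEmail"])

# Duplicated = 1 when any of the row's values is shared by two or more customers.
for row in rows:
    row["Duplicated"] = int(
        any(len(users[(field, row[field])]) > 1 for field in SHARED_FIELDS)
    )

print([row["Duplicated"] for row in rows])  # [1, 1, 0]
```

The first two customers share an IP address, so both of their rows are flagged.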
DATA ACQUISITION, PREPARATION AND
ANALYSIS
R STUDIO
• RStudio was used to create the 2 Machine Learning (ML) algorithms
• For the complete code, please visit my GitHub repository
• To create both ML models, I loaded the necessary libraries, converted strings to factors, created a confusion matrix, visualized the balance of the dataset, split the data into training and testing sets, trained the models and then evaluated them on the test data, made predictions, and finally used the confusion matrix to measure accuracy
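The split, train, predict and evaluate steps listed above can be sketched in a few lines. The example below is a Python stand-in, not the author's R code. It uses a hand-written gradient-descent logistic regression instead of any library, a single made-up numeric feature, and a fixed train/test split. All names and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

def confusion_matrix(actual, predicted):
    """Counts keyed by (actual, predicted) label pairs."""
    counts = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts

# Toy data: one numeric feature, label 1 = fraud. Fixed split for repeatability.
train_x, train_y = [-3, -2, -1, 1, 2, 3], [0, 0, 0, 1, 1, 1]
test_x, test_y = [-2.5, -0.5, 0.5, 2.5], [0, 0, 1, 1]

w, b = train_logistic(train_x, train_y)

# Predict fraud when the estimated probability is at least 0.5.
predicted = [int(sigmoid(w * x + b) >= 0.5) for x in test_x]

cm = confusion_matrix(test_y, predicted)
accuracy = (cm[(0, 0)] + cm[(1, 1)]) / len(test_y)
print(cm, accuracy)
```

The toy data is cleanly separable, so the perfect test accuracy here says nothing about how the real models performed.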
DATA VISUALIZATION
POWER BI
• Power BI was used to create a map to show the geographical location of all transactions, color coded by fraud and non-fraud
CHALLENGES AND COOL TECHNIQUES
• Challenge: High number of correlated variables
• Cool Technique: Feature engineering. Created two columns (with binary values) to reflect transactions that had duplicated/multiple addresses, phone numbers and IP addresses, instead of creating one column for
• Challenge: Unbalanced dataset
• Cool Technique: Balanced it using RUS (random undersampling) to create a dataset with roughly the same number of fraud and non-fraud transactions
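Random undersampling keeps every row of the smaller class and draws a random subset of the larger class. A minimal Python sketch under stated assumptions: the helper `random_undersample` and the toy rows are hypothetical, the original work used R, and this version balances the classes exactly, while the slide says only "roughly".

```python
import random

def random_undersample(rows, label_key, seed=0):
    """Downsample every class to the size of the smallest class."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced

# Toy imbalanced data: 3 fraud rows and 12 non-fraud rows.
rows = [{"Fraud": 1, "id": i} for i in range(3)] + \
       [{"Fraud": 0, "id": i} for i in range(3, 15)]

balanced = random_undersample(rows, "Fraud")
counts = {label: sum(r["Fraud"] == label for r in balanced) for label in (0, 1)}
print(counts)  # {0: 3, 1: 3}
```

Undersampling discards majority-class rows, so the balanced dataset is smaller than the original.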
What If I Had More Time?
• If I had more time, I would have done social network analysis to see how transactions may be associated with one another
THANK YOU FOR CHECKING OUT MY PROJECT!
• Follow me for more project ideas
• If you have any questions, comments, feedback or JOB OFFERS, feel free to DM me
