SlideShare una empresa de Scribd logo
1 de 5
Descargar para leer sin conexión
DISTRIBUTED DATA MINING IN
  CREDIT CARD FRAUD DETECTION
INTRODUCTION
Credit card transactions grow in number, taking a larger share of any
country’s payment system and this is turn has led to a higher rate of
stolen account numbers and subsequent losses by banks. Hence,
improved fraud detection has become essential to maintain the
viability of the country’s payment system.
Banks have used early fraud warning systems for some years. Large-
scale data-mining techniques can improve on the state of the art in
commercial practice. Scalable techniques to analyze massive
amounts of transaction data that efficiently compute fraud detectors
in a timely manner is an important problem, especially for e-
commerce.
Besides scalability and efficiency, the fraud-detection task exhibits
technical problems that include skewed distributions of training data
and non-uniform cost per error, both of which have not been widely
studied in the knowledge-discovery and datamining community.
In this project, a deep survey is made and evaluates a number of
techniques that address these three main issues concurrently.
Our proposed methods of combining multiple learned fraud detectors
under a “cost model” are general and demonstrably useful; our
empirical results demonstrate that we can significantly reduce loss
due to fraud through distributed data mining of fraud models.


DATA MINING AND MACHINE LEARNING
The aim of data mining is to extract knowledge from large amounts of
data. This knowledge is nontrivial and hidden in the data. Machine
learning is often used in data mining.
DATA MINING: A DEFINITION
Art/Science of uncovering non-trivial, valuable information from
                       a large database
Emphasis on:
  Non-obvious (difficult)

  Useful (cost vs benefit)

  Large (automatic)



Yet, no rules, provided that the process is efficient in time, space and
human resources.
  Data Mining is the process of finding interesting trends or
  patterns in large datasets in order to guide future decisions.

  Related to exploratory data analysis (area of statistics) and
  knowledge discovery (area in artificial intelligence, machine
  learning).
  Data Mining is characterized by having VERY LARGE datasets.



DATA MINING VS. MACHINE LEARNING

  Size: Databases are usually very large so algorithms must scale
  well

  Design Purpose: Databases are not usually designed for data
  mining (but for other purposes), and thus, may not have
  convenient attributes
  Errors and Noise: Databases almost always contain errors


The aim of machine learning is to adapt to new circumstances, to
detect and extrapolate. A distinction can be made between
unsupervised and supervised machine learning algorithms.
PROPOSED SYSTEM

In today’s increasingly electronic society and with the rapid advances
of electronic commerce on the Internet, the use of credit cards for
purchases has become convenient and necessary.

Credit card transactions have become the de facto standard for
Internet and Webbased e-commerce. The US government estimates
that credit cards accounted for approximately US $13 billion in
Internet sales during 1998. This figure is expected to grow rapidly
each year.

However, the growing number of credit card transactions provides
more opportunity for thieves to steal credit card numbers and
subsequently commit fraud. When banks lose money because of
credit card fraud, cardholders pay for all of that loss through higher
interest rates, higher fees, and reduced benefits.

Hence, it is in both the banks’ and cardholders’ interest to reduce
illegitimate use of credit cards by early fraud detection. For many
years, the credit card industry has studied computing models for
automated detection systems; recently, these models have been the
subject of academic research, especially with respect to e-
commerce.

The credit card fraud-detection domain presents a number of
challenging issues for data mining:

  There are millions of credit card transactions processed each day.
  Mining such massive amounts of data requires highly efficient
  techniques that scale.

  The data are highly skewed—many more transactions are
  legitimate than fraudulent.

  Typical accuracy-based mining techniques can generate highly
  accurate fraud detectors by simply predicting that all transactions
  are legitimate, although this is equivalent to not detecting fraud at
  all.
Each transaction record has a different dollar amount and thus has
  a variable potential loss, rather than a fixed misclassification cost
  per error type, as is commonly assumed in cost-based mining
  techniques.


Our approach addresses the efficiency and scalability issues in
several ways. We divide large data set of labeled transactions (either
fraudulent or legitimate) into smaller subsets, apply mining
techniques to generate classifiers

in parallel, and combine the resultant base models by metalearning
from the classifiers’ behavior to generate a metaclassifier. Our
approach treats the classifiers as black boxes so that we can employ
a variety of learning algorithms.

Besides extensibility, combining multiple models computed over all
available data produces metaclassifiers that can offset the loss of
predictive performance that usually occurs when mining from data
subsets or sampling.

Furthermore, when we use the learned classifiers (for example,
during transaction authorization), the base classifiers can execute in
parallel, with the metaclassifier then combining their results. So, our
approach is highly efficient in generating these models and also
relatively efficient in applying them.

Another parallel approach focuses on parallelizing a particular
algorithm on a particular parallel architecture. However, a new
algorithm or architecture requires a substantial amount of parallel-
programming work.

Although our architecture and algorithm-independent approach is not
as efficient as some fine-grained parallelization approaches, it lets
users plug different off-the-shelf learning programs into a parallel and
distributed environment with relative ease and eliminates the need for
expensive parallel hardware.
We are going to use the ADACost algorithm.



SOFTWARE TOOLS
  • ASP .NET

  • Oracle Database


HARDWARE TOOLS
  • Pentium Server with Client

Más contenido relacionado

Más de ncct

Biomedical Wearable Device For Remote Monitoring Ofphysiological Signals
Biomedical Wearable Device For Remote Monitoring Ofphysiological SignalsBiomedical Wearable Device For Remote Monitoring Ofphysiological Signals
Biomedical Wearable Device For Remote Monitoring Ofphysiological Signalsncct
 
Digital Water Marking For Video Piracy Detection
Digital Water Marking For Video Piracy DetectionDigital Water Marking For Video Piracy Detection
Digital Water Marking For Video Piracy Detectionncct
 
Self Repairing Tree Topology Enabling Content Based Routing In Local Area Ne...
Self Repairing Tree Topology Enabling  Content Based Routing In Local Area Ne...Self Repairing Tree Topology Enabling  Content Based Routing In Local Area Ne...
Self Repairing Tree Topology Enabling Content Based Routing In Local Area Ne...ncct
 
Cockpit White Box
Cockpit White BoxCockpit White Box
Cockpit White Boxncct
 
Rail Track Inspector
Rail Track InspectorRail Track Inspector
Rail Track Inspectorncct
 
Botminer Clustering Analysis Of Network Traffic For Protocol And Structure...
Botminer   Clustering Analysis Of Network Traffic For Protocol  And Structure...Botminer   Clustering Analysis Of Network Traffic For Protocol  And Structure...
Botminer Clustering Analysis Of Network Traffic For Protocol And Structure...ncct
 
Bot Robo Tanker Sound Detector
Bot Robo  Tanker  Sound DetectorBot Robo  Tanker  Sound Detector
Bot Robo Tanker Sound Detectorncct
 
Distance Protection
Distance ProtectionDistance Protection
Distance Protectionncct
 
Bluetooth Jammer
Bluetooth  JammerBluetooth  Jammer
Bluetooth Jammerncct
 
Crypkit 1
Crypkit 1Crypkit 1
Crypkit 1ncct
 
I E E E 2009 Java Projects
I E E E 2009  Java  ProjectsI E E E 2009  Java  Projects
I E E E 2009 Java Projectsncct
 
B E Projects M C A Projects B
B E  Projects  M C A  Projects  BB E  Projects  M C A  Projects  B
B E Projects M C A Projects Bncct
 
J2 E E Projects, I E E E Projects 2009
J2 E E  Projects,  I E E E  Projects 2009J2 E E  Projects,  I E E E  Projects 2009
J2 E E Projects, I E E E Projects 2009ncct
 
J2 M E Projects, I E E E Projects 2009
J2 M E  Projects,  I E E E  Projects 2009J2 M E  Projects,  I E E E  Projects 2009
J2 M E Projects, I E E E Projects 2009ncct
 
Engineering College Projects, M C A Projects, B E Projects, B Tech Pr...
Engineering  College  Projects,  M C A  Projects,  B E  Projects,  B Tech  Pr...Engineering  College  Projects,  M C A  Projects,  B E  Projects,  B Tech  Pr...
Engineering College Projects, M C A Projects, B E Projects, B Tech Pr...ncct
 
B E M E Projects M C A Projects B
B E  M E  Projects  M C A  Projects  BB E  M E  Projects  M C A  Projects  B
B E M E Projects M C A Projects Bncct
 
I E E E 2009 Java Projects, I E E E 2009 A S P
I E E E 2009  Java  Projects,  I E E E 2009  A S PI E E E 2009  Java  Projects,  I E E E 2009  A S P
I E E E 2009 Java Projects, I E E E 2009 A S Pncct
 
Advantages Of Software Projects N C C T
Advantages Of  Software  Projects  N C C TAdvantages Of  Software  Projects  N C C T
Advantages Of Software Projects N C C Tncct
 
Engineering Projects
Engineering  ProjectsEngineering  Projects
Engineering Projectsncct
 
Software Projects Java Projects Mobile Computing
Software  Projects  Java  Projects  Mobile  ComputingSoftware  Projects  Java  Projects  Mobile  Computing
Software Projects Java Projects Mobile Computingncct
 

Más de ncct (20)

Biomedical Wearable Device For Remote Monitoring Ofphysiological Signals
Biomedical Wearable Device For Remote Monitoring Ofphysiological SignalsBiomedical Wearable Device For Remote Monitoring Ofphysiological Signals
Biomedical Wearable Device For Remote Monitoring Ofphysiological Signals
 
Digital Water Marking For Video Piracy Detection
Digital Water Marking For Video Piracy DetectionDigital Water Marking For Video Piracy Detection
Digital Water Marking For Video Piracy Detection
 
Self Repairing Tree Topology Enabling Content Based Routing In Local Area Ne...
Self Repairing Tree Topology Enabling  Content Based Routing In Local Area Ne...Self Repairing Tree Topology Enabling  Content Based Routing In Local Area Ne...
Self Repairing Tree Topology Enabling Content Based Routing In Local Area Ne...
 
Cockpit White Box
Cockpit White BoxCockpit White Box
Cockpit White Box
 
Rail Track Inspector
Rail Track InspectorRail Track Inspector
Rail Track Inspector
 
Botminer Clustering Analysis Of Network Traffic For Protocol And Structure...
Botminer   Clustering Analysis Of Network Traffic For Protocol  And Structure...Botminer   Clustering Analysis Of Network Traffic For Protocol  And Structure...
Botminer Clustering Analysis Of Network Traffic For Protocol And Structure...
 
Bot Robo Tanker Sound Detector
Bot Robo  Tanker  Sound DetectorBot Robo  Tanker  Sound Detector
Bot Robo Tanker Sound Detector
 
Distance Protection
Distance ProtectionDistance Protection
Distance Protection
 
Bluetooth Jammer
Bluetooth  JammerBluetooth  Jammer
Bluetooth Jammer
 
Crypkit 1
Crypkit 1Crypkit 1
Crypkit 1
 
I E E E 2009 Java Projects
I E E E 2009  Java  ProjectsI E E E 2009  Java  Projects
I E E E 2009 Java Projects
 
B E Projects M C A Projects B
B E  Projects  M C A  Projects  BB E  Projects  M C A  Projects  B
B E Projects M C A Projects B
 
J2 E E Projects, I E E E Projects 2009
J2 E E  Projects,  I E E E  Projects 2009J2 E E  Projects,  I E E E  Projects 2009
J2 E E Projects, I E E E Projects 2009
 
J2 M E Projects, I E E E Projects 2009
J2 M E  Projects,  I E E E  Projects 2009J2 M E  Projects,  I E E E  Projects 2009
J2 M E Projects, I E E E Projects 2009
 
Engineering College Projects, M C A Projects, B E Projects, B Tech Pr...
Engineering  College  Projects,  M C A  Projects,  B E  Projects,  B Tech  Pr...Engineering  College  Projects,  M C A  Projects,  B E  Projects,  B Tech  Pr...
Engineering College Projects, M C A Projects, B E Projects, B Tech Pr...
 
B E M E Projects M C A Projects B
B E  M E  Projects  M C A  Projects  BB E  M E  Projects  M C A  Projects  B
B E M E Projects M C A Projects B
 
I E E E 2009 Java Projects, I E E E 2009 A S P
I E E E 2009  Java  Projects,  I E E E 2009  A S PI E E E 2009  Java  Projects,  I E E E 2009  A S P
I E E E 2009 Java Projects, I E E E 2009 A S P
 
Advantages Of Software Projects N C C T
Advantages Of  Software  Projects  N C C TAdvantages Of  Software  Projects  N C C T
Advantages Of Software Projects N C C T
 
Engineering Projects
Engineering  ProjectsEngineering  Projects
Engineering Projects
 
Software Projects Java Projects Mobile Computing
Software  Projects  Java  Projects  Mobile  ComputingSoftware  Projects  Java  Projects  Mobile  Computing
Software Projects Java Projects Mobile Computing
 

Último

IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIES VE
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024Stephanie Beckett
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...FIDO Alliance
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaCzechDreamin
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?Mark Billinghurst
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctBrainSell Technologies
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyUXDXConf
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...CzechDreamin
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfFIDO Alliance
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsStefano
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024Stephen Perrenod
 
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptxBT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptxNeo4j
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 

Último (20)

IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptxBT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
BT & Neo4j _ How Knowledge Graphs help BT deliver Digital Transformation.pptx
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 

Distributed Data Mining In Credit Card Fraud Detection

  • 1. DISTRIBUTED DATA MINING IN CREDIT CARD FRAUD DETECTION INTRODUCTION Credit card transactions grow in number, taking a larger share of any country’s payment system and this is turn has led to a higher rate of stolen account numbers and subsequent losses by banks. Hence, improved fraud detection has become essential to maintain the viability of the country’s payment system. Banks have used early fraud warning systems for some years. Large- scale data-mining techniques can improve on the state of the art in commercial practice. Scalable techniques to analyze massive amounts of transaction data that efficiently compute fraud detectors in a timely manner is an important problem, especially for e- commerce. Besides scalability and efficiency, the fraud-detection task exhibits technical problems that include skewed distributions of training data and non-uniform cost per error, both of which have not been widely studied in the knowledge-discovery and datamining community. In this project, a deep survey is made and evaluates a number of techniques that address these three main issues concurrently. Our proposed methods of combining multiple learned fraud detectors under a “cost model” are general and demonstrably useful; our empirical results demonstrate that we can significantly reduce loss due to fraud through distributed data mining of fraud models. DATA MINING AND MACHINE LEARNING The aim of data mining is to extract knowledge from large amounts of data. This knowledge is nontrivial and hidden in the data. Machine learning is often used in data mining.
  • 2. DATA MINING: A DEFINITION Art/Science of uncovering non-trivial, valuable information from a large database Emphasis on: Non-obvious (difficult) Useful (cost vs benefit) Large (automatic) Yet, no rules, provided that the process is efficient in time, space and human resources. Data Mining is the process of finding interesting trends or patterns in large datasets in order to guide future decisions. Related to exploratory data analysis (area of statistics) and knowledge discovery (area in artificial intelligence, machine learning). Data Mining is characterized by having VERY LARGE datasets. DATA MINING VS. MACHINE LEARNING Size: Databases are usually very large so algorithms must scale well Design Purpose: Databases are not usually designed for data mining (but for other purposes), and thus, may not have convenient attributes Errors and Noise: Databases almost always contain errors The aim of machine learning is to adapt to new circumstances, to detect and extrapolate. A distinction can be made between unsupervised and supervised machine learning algorithms.
  • 3. PROPOSED SYSTEM In today’s increasingly electronic society and with the rapid advances of electronic commerce on the Internet, the use of credit cards for purchases has become convenient and necessary. Credit card transactions have become the de facto standard for Internet and Webbased e-commerce. The US government estimates that credit cards accounted for approximately US $13 billion in Internet sales during 1998. This figure is expected to grow rapidly each year. However, the growing number of credit card transactions provides more opportunity for thieves to steal credit card numbers and subsequently commit fraud. When banks lose money because of credit card fraud, cardholders pay for all of that loss through higher interest rates, higher fees, and reduced benefits. Hence, it is in both the banks’ and cardholders’ interest to reduce illegitimate use of credit cards by early fraud detection. For many years, the credit card industry has studied computing models for automated detection systems; recently, these models have been the subject of academic research, especially with respect to e- commerce. The credit card fraud-detection domain presents a number of challenging issues for data mining: There are millions of credit card transactions processed each day. Mining such massive amounts of data requires highly efficient techniques that scale. The data are highly skewed—many more transactions are legitimate than fraudulent. Typical accuracy-based mining techniques can generate highly accurate fraud detectors by simply predicting that all transactions are legitimate, although this is equivalent to not detecting fraud at all.
  • 4. Each transaction record has a different dollar amount and thus has a variable potential loss, rather than a fixed misclassification cost per error type, as is commonly assumed in cost-based mining techniques. Our approach addresses the efficiency and scalability issues in several ways. We divide large data set of labeled transactions (either fraudulent or legitimate) into smaller subsets, apply mining techniques to generate classifiers in parallel, and combine the resultant base models by metalearning from the classifiers’ behavior to generate a metaclassifier. Our approach treats the classifiers as black boxes so that we can employ a variety of learning algorithms. Besides extensibility, combining multiple models computed over all available data produces metaclassifiers that can offset the loss of predictive performance that usually occurs when mining from data subsets or sampling. Furthermore, when we use the learned classifiers (for example, during transaction authorization), the base classifiers can execute in parallel, with the metaclassifier then combining their results. So, our approach is highly efficient in generating these models and also relatively efficient in applying them. Another parallel approach focuses on parallelizing a particular algorithm on a particular parallel architecture. However, a new algorithm or architecture requires a substantial amount of parallel- programming work. Although our architecture and algorithm-independent approach is not as efficient as some fine-grained parallelization approaches, it lets users plug different off-the-shelf learning programs into a parallel and distributed environment with relative ease and eliminates the need for expensive parallel hardware.
  • 5. We are going to use the ADACost algorithm. SOFTWARE TOOLS • ASP .NET • Oracle Database HARDWARE TOOLS • Pentium Server with Client