SlideShare una empresa de Scribd logo
1 de 5
ASIET Kalady
Data mining
Techniques
Assignment
RespaPeter
10/8/2013
Data mining
Data mining is the analysis of (often large) observational data sets to find unsuspected
relationships and to summarize the data in novel ways that are both understandable and useful to
the data owner.
Data mining Techniques
In additiontousinga particulardata miningtool, internal auditors can choose from a variety of
data miningtechniques.The mostcommonlyusedtechniquesincludeartificial neuralnetworks,decision
trees, and the nearest-neighbor method. Each of these techniques analyzes data in different ways:
 Artificial neural networks are non-linear, predictive models that learn through training.
Although they are powerful predictive modeling techniques, some of the power comes at
the expense of ease of use and deployment.
One area where auditors can easily use them is when reviewing records to
identify fraud and fraud-like actions. Because of their complexity, they are better
employed in situations where they can be used and reused, such as reviewing credit card
transactions every month to check for anomalies.
 Decision trees are tree-shaped structures that represent decision sets. These decisions
generate rules, which then are used to classify data. Decision trees are the favored
technique for building understandable models. Auditors can use them to assess, for
example, whether the organization is using an appropriate cost-effective marketing
strategy that is based on the assigned value of the customer, such as profit.
 The nearest-neighbor method classifies dataset records based on similar data in a
historical dataset. Auditors can use this approach to define a document that is interesting
to them and ask the system to search for similar items.
The most commonly used techniques include artificial neural networks, decision trees,
and the nearest-neighbor method. Each of these techniques analyzes data in different ways:
 Association
Association (or relation) is probably the better known and most familiar and
straightforward data mining technique. Here, it make a simple correlation between two or
more items, often of the same type to identify patterns. For example, when tracking
people's buying habits, you might identify that a customer always buys cream when they
buy strawberries, and therefore suggest that the next time that they buy strawberries they
might also want to buy cream.
 Classification
It use to build up an idea of the type of customer, item, or object by describing
multiple attributes to identify a particular class. For example, you can easily classify cars
into different types (sedan, 4x4, convertible) by identifying different attributes (number
of seats, car shape, driven wheels). Given a new car, you might apply it into a particular
class by comparing the attributes with our known definition. You can apply the same
principles to customers, for example by classifying them by age and social group.
 Clustering
Clustering is the task of segmenting a diverse group into a number of similar sub groups
or clusters.The distingiush clustering from classification is that clustering is not rely
on predefined classes.
The records are grouped together on the basis of self similarity.clustering is often done as
a prelude to some other form of datamining.
 Prediction
Prediction is a wide topic and runs from predicting the failure of components or
machinery, to identifying fraud and even the prediction of company profits. Used in combination
with the other data mining techniques, prediction involves analyzing trends, classification,
pattern matching, and relation. By analyzing past events or instances, you can make a prediction
about an event.
Using the credit card authorization, for example, you might combine decision tree analysis of
individual past transactions with classification and historical pattern matches to identify whether
a transaction is fraudulent. Making a match between the purchase of flights to the US and
transactions in the US, it is likely that the transaction is valid.
 Sequential patterns
Oftern used over longer-term data, sequential patterns are a useful method for identifying
trends, or regular occurrences of similar events. For example, with customer data you can
identify that customers buy a particular collection of products together at different times of the
year. In a shopping basket application, you can use this information to automatically suggest that
certain items be added to a basket based on their frequency and past purchasing history.
 Decision trees
Related to most of the other techniques (primarily classification and prediction), the
decision tree can be used either as a part of the selection criteria, or to support the use and
selection of specific data within the overall structure. Within the decision tree, you start with a
simple question that has two (or sometimes more) answers. Each answer leads to a further
question to help classify or identify the data so that it can be categorized, or so that a prediction
can be made based on each answer.
Each of these approaches brings different advantages and disadvantages that need to be
considered prior to their use. Neural networks, which are difficult to implement, require all input
and resultant output to be expressed numerically, thus needing some sort of interpretation
depending on the nature of the data-mining exercise
The decisiontree techniqueisthe mostcommonlyused methodology, because it is simple and
straightforwardtoimplement.Finally,the nearest-neighbormethodreliesmore onlinkingsimilar items
and, therefore, works better for extrapolation rather than predictive enquiries.
A goodway to applyadvanceddataminingtechniquesis to have a flexible and interactive data
mining tool that is fully integrated with a database or data warehouse. Using a tool that operates
outside of the database or data warehouse is not as efficient
Regardlessof the technique used,the real value behind data mining is modeling — the process
of buildingamodel basedonuser-specifiedcriteriafromalreadycaptureddata.Once a model is built, it
can be used in similar situations where an answer is not known.
For example,an organization looking to acquire new customers can create a model of its ideal
customer that is based on existing data captured from people who previously purchased the product.
The model then is used to query data on prospective customers to see if they match the profile.
Modeling also can be used in audit departments to predict the number of auditors required to
undertake an audit plan based on previous attempts and similar work.

Más contenido relacionado

La actualidad más candente

Data mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedData mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedYugal Kumar
 
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSeditorijettcs
 
Classification and prediction in data mining
Classification and prediction in data miningClassification and prediction in data mining
Classification and prediction in data miningEr. Nawaraj Bhandari
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceIJDKP
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data MiningValerii Klymchuk
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data miningEr. Nawaraj Bhandari
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
5 data preparation and processing2
5 data preparation and processing25 data preparation and processing2
5 data preparation and processing2Mahmoud Alfarra
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesRajendran
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dmsumit621
 

La actualidad más candente (17)

Data mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedData mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updated
 
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
 
Classification and prediction in data mining
Classification and prediction in data miningClassification and prediction in data mining
Classification and prediction in data mining
 
Database
DatabaseDatabase
Database
 
Ghhh
GhhhGhhh
Ghhh
 
Recommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduceRecommendation system using bloom filter in mapreduce
Recommendation system using bloom filter in mapreduce
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Part1
Part1Part1
Part1
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
 
Data mininng trends
Data mininng trendsData mininng trends
Data mininng trends
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data mining
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
5 data preparation and processing2
5 data preparation and processing25 data preparation and processing2
5 data preparation and processing2
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Data model
Data modelData model
Data model
 
Cssu dw dm
Cssu dw dmCssu dw dm
Cssu dw dm
 
1234
12341234
1234
 

Destacado (6)

Why Datamining?
Why Datamining?Why Datamining?
Why Datamining?
 
Biometrics--The Technology of Tomorrow
Biometrics--The Technology of TomorrowBiometrics--The Technology of Tomorrow
Biometrics--The Technology of Tomorrow
 
Biometric Technology
Biometric TechnologyBiometric Technology
Biometric Technology
 
biometric technology
biometric technologybiometric technology
biometric technology
 
3d internet
3d internet3d internet
3d internet
 
Biometrics
BiometricsBiometrics
Biometrics
 

Similar a DataMining Techniq

EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSeditorijettcs
 
What Is Data Mining How It Works, Benefits, Techniques.pdf
What Is Data Mining How It Works, Benefits, Techniques.pdfWhat Is Data Mining How It Works, Benefits, Techniques.pdf
What Is Data Mining How It Works, Benefits, Techniques.pdfAgile dock
 
Running Head Data Mining in The Cloud .docx
Running Head Data Mining in The Cloud                            .docxRunning Head Data Mining in The Cloud                            .docx
Running Head Data Mining in The Cloud .docxhealdkathaleen
 
what is ..how to process types and methods involved in data analysis
what is ..how to process types and methods involved in data analysiswhat is ..how to process types and methods involved in data analysis
what is ..how to process types and methods involved in data analysisData analysis ireland
 
Data and Information Visualization part 2.pptx
Data and Information Visualization part 2.pptxData and Information Visualization part 2.pptx
Data and Information Visualization part 2.pptxLamees EL- Ghazoly
 
Mining internal sources of data
Mining internal sources of dataMining internal sources of data
Mining internal sources of datanomanbhutta
 
Data Mining for Big Data-Murat Yazıcı
Data Mining for Big Data-Murat YazıcıData Mining for Big Data-Murat Yazıcı
Data Mining for Big Data-Murat YazıcıMurat YAZICI, M.Sc.
 
Data Mining Presentation for College Harsh.pptx
Data Mining Presentation for College Harsh.pptxData Mining Presentation for College Harsh.pptx
Data Mining Presentation for College Harsh.pptxhp41112004
 

Similar a DataMining Techniq (20)

Data Mining
Data MiningData Mining
Data Mining
 
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
 
What Is Data Mining How It Works, Benefits, Techniques.pdf
What Is Data Mining How It Works, Benefits, Techniques.pdfWhat Is Data Mining How It Works, Benefits, Techniques.pdf
What Is Data Mining How It Works, Benefits, Techniques.pdf
 
Data mining-basic
Data mining-basicData mining-basic
Data mining-basic
 
Data Mining
Data MiningData Mining
Data Mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Unit 4 Advanced Data Analytics
Unit 4 Advanced Data AnalyticsUnit 4 Advanced Data Analytics
Unit 4 Advanced Data Analytics
 
Data Mining
Data MiningData Mining
Data Mining
 
Seminar Presentation
Seminar PresentationSeminar Presentation
Seminar Presentation
 
Running Head Data Mining in The Cloud .docx
Running Head Data Mining in The Cloud                            .docxRunning Head Data Mining in The Cloud                            .docx
Running Head Data Mining in The Cloud .docx
 
Chapter 1.pdf
Chapter 1.pdfChapter 1.pdf
Chapter 1.pdf
 
Data mining
Data miningData mining
Data mining
 
Data mining
Data miningData mining
Data mining
 
what is ..how to process types and methods involved in data analysis
what is ..how to process types and methods involved in data analysiswhat is ..how to process types and methods involved in data analysis
what is ..how to process types and methods involved in data analysis
 
Data and Information Visualization part 2.pptx
Data and Information Visualization part 2.pptxData and Information Visualization part 2.pptx
Data and Information Visualization part 2.pptx
 
Mining internal sources of data
Mining internal sources of dataMining internal sources of data
Mining internal sources of data
 
Data Mining for Big Data-Murat Yazıcı
Data Mining for Big Data-Murat YazıcıData Mining for Big Data-Murat Yazıcı
Data Mining for Big Data-Murat Yazıcı
 
Data Mining Presentation for College Harsh.pptx
Data Mining Presentation for College Harsh.pptxData Mining Presentation for College Harsh.pptx
Data Mining Presentation for College Harsh.pptx
 
Data mining
Data miningData mining
Data mining
 
Data Mining Lec1.pptx
Data Mining Lec1.pptxData Mining Lec1.pptx
Data Mining Lec1.pptx
 

Más de Respa Peter

Tpes of Softwares
Tpes of SoftwaresTpes of Softwares
Tpes of SoftwaresRespa Peter
 
Information technology for business
Information technology for business Information technology for business
Information technology for business Respa Peter
 
Types of sql injection attacks
Types of sql injection attacksTypes of sql injection attacks
Types of sql injection attacksRespa Peter
 
software failures
 software failures software failures
software failuresRespa Peter
 
Managing software development
Managing software developmentManaging software development
Managing software developmentRespa Peter
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithmRespa Peter
 
Matrix multiplicationdesign
Matrix multiplicationdesignMatrix multiplicationdesign
Matrix multiplicationdesignRespa Peter
 
Web services have made the development of mobile Web applications much easier...
Web services have made the development of mobile Web applications much easier...Web services have made the development of mobile Web applications much easier...
Web services have made the development of mobile Web applications much easier...Respa Peter
 
Matrix chain multiplication
Matrix chain multiplicationMatrix chain multiplication
Matrix chain multiplicationRespa Peter
 
Open shortest path first (ospf)
Open shortest path first (ospf)Open shortest path first (ospf)
Open shortest path first (ospf)Respa Peter
 

Más de Respa Peter (13)

Tpes of Softwares
Tpes of SoftwaresTpes of Softwares
Tpes of Softwares
 
Information technology for business
Information technology for business Information technology for business
Information technology for business
 
Types of sql injection attacks
Types of sql injection attacksTypes of sql injection attacks
Types of sql injection attacks
 
software failures
 software failures software failures
software failures
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Managing software development
Managing software developmentManaging software development
Managing software development
 
Data mining
Data miningData mining
Data mining
 
Knime
KnimeKnime
Knime
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Matrix multiplicationdesign
Matrix multiplicationdesignMatrix multiplicationdesign
Matrix multiplicationdesign
 
Web services have made the development of mobile Web applications much easier...
Web services have made the development of mobile Web applications much easier...Web services have made the development of mobile Web applications much easier...
Web services have made the development of mobile Web applications much easier...
 
Matrix chain multiplication
Matrix chain multiplicationMatrix chain multiplication
Matrix chain multiplication
 
Open shortest path first (ospf)
Open shortest path first (ospf)Open shortest path first (ospf)
Open shortest path first (ospf)
 

Último

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 

Último (20)

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 

DataMining Techniq

  • 2. Data mining Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner. Data mining Techniques In additiontousinga particulardata miningtool, internal auditors can choose from a variety of data miningtechniques.The mostcommonlyusedtechniquesincludeartificial neuralnetworks,decision trees, and the nearest-neighbor method. Each of these techniques analyzes data in different ways:  Artificial neural networks are non-linear, predictive models that learn through training. Although they are powerful predictive modeling techniques, some of the power comes at the expense of ease of use and deployment. One area where auditors can easily use them is when reviewing records to identify fraud and fraud-like actions. Because of their complexity, they are better employed in situations where they can be used and reused, such as reviewing credit card transactions every month to check for anomalies.  Decision trees are tree-shaped structures that represent decision sets. These decisions generate rules, which then are used to classify data. Decision trees are the favored technique for building understandable models. Auditors can use them to assess, for example, whether the organization is using an appropriate cost-effective marketing strategy that is based on the assigned value of the customer, such as profit.  The nearest-neighbor method classifies dataset records based on similar data in a historical dataset. Auditors can use this approach to define a document that is interesting to them and ask the system to search for similar items.
  • 3. The most commonly used techniques include artificial neural networks, decision trees, and the nearest-neighbor method. Each of these techniques analyzes data in different ways:  Association Association (or relation) is probably the better known and most familiar and straightforward data mining technique. Here, it make a simple correlation between two or more items, often of the same type to identify patterns. For example, when tracking people's buying habits, you might identify that a customer always buys cream when they buy strawberries, and therefore suggest that the next time that they buy strawberries they might also want to buy cream.  Classification It use to build up an idea of the type of customer, item, or object by describing multiple attributes to identify a particular class. For example, you can easily classify cars into different types (sedan, 4x4, convertible) by identifying different attributes (number of seats, car shape, driven wheels). Given a new car, you might apply it into a particular class by comparing the attributes with our known definition. You can apply the same principles to customers, for example by classifying them by age and social group.  Clustering Clustering is the task of segmenting a diverse group into a number of similar sub groups or clusters.The distingiush clustering from classification is that clustering is not rely on predefined classes. The records are grouped together on the basis of self similarity.clustering is often done as a prelude to some other form of datamining.  Prediction
  • 4. Prediction is a wide topic and runs from predicting the failure of components or machinery, to identifying fraud and even the prediction of company profits. Used in combination with the other data mining techniques, prediction involves analyzing trends, classification, pattern matching, and relation. By analyzing past events or instances, you can make a prediction about an event. Using the credit card authorization, for example, you might combine decision tree analysis of individual past transactions with classification and historical pattern matches to identify whether a transaction is fraudulent. Making a match between the purchase of flights to the US and transactions in the US, it is likely that the transaction is valid.  Sequential patterns Oftern used over longer-term data, sequential patterns are a useful method for identifying trends, or regular occurrences of similar events. For example, with customer data you can identify that customers buy a particular collection of products together at different times of the year. In a shopping basket application, you can use this information to automatically suggest that certain items be added to a basket based on their frequency and past purchasing history.  Decision trees Related to most of the other techniques (primarily classification and prediction), the decision tree can be used either as a part of the selection criteria, or to support the use and selection of specific data within the overall structure. Within the decision tree, you start with a simple question that has two (or sometimes more) answers. Each answer leads to a further question to help classify or identify the data so that it can be categorized, or so that a prediction can be made based on each answer. Each of these approaches brings different advantages and disadvantages that need to be considered prior to their use. Neural networks, which are difficult to implement, require all input and resultant output to be expressed numerically, thus needing some sort of interpretation depending on the nature of the data-mining exercise
  • 5. The decisiontree techniqueisthe mostcommonlyused methodology, because it is simple and straightforwardtoimplement.Finally,the nearest-neighbormethodreliesmore onlinkingsimilar items and, therefore, works better for extrapolation rather than predictive enquiries. A goodway to applyadvanceddataminingtechniquesis to have a flexible and interactive data mining tool that is fully integrated with a database or data warehouse. Using a tool that operates outside of the database or data warehouse is not as efficient Regardlessof the technique used,the real value behind data mining is modeling — the process of buildingamodel basedonuser-specifiedcriteriafromalreadycaptureddata.Once a model is built, it can be used in similar situations where an answer is not known. For example,an organization looking to acquire new customers can create a model of its ideal customer that is based on existing data captured from people who previously purchased the product. The model then is used to query data on prospective customers to see if they match the profile. Modeling also can be used in audit departments to predict the number of auditors required to undertake an audit plan based on previous attempts and similar work.