Associative Classification: Synopsis

HYBRID TECHNIQUE FOR ASSOCIATIVE
CLASSIFICATION: A NOVEL APPROACH

Jagdeep Singh

Table of Contents
- Introduction
  - Data Mining Process
  - Classification
  - Association
- Motivation
- Literature Survey
- Problem Formulation
- Objectives
- Methodology
- Facilities Required
- References

Data Mining
Data mining is the computational process of finding patterns
in large data sets, using methods at the intersection
of machine learning, artificial intelligence, statistics
and database systems. The main goal of the data mining
process is to extract information from the data and
convert it into an understandable structure
for further use.

Data Mining Process

Figure 1 : The Data Mining Process [10]

Classification
Classification is the problem of identifying to which of
a set of categories a new observation belongs, on the
basis of a training set of data containing observations
(or instances) whose category membership is known.

Association
Association rule learning is a method for discovering interesting
relations between variables in large databases. It is
intended to identify strong rules discovered in
databases using different measures of interestingness.

For example, the rule
{onions, potatoes} => {burger}
states that customers who buy onions and potatoes together
are also likely to buy burgers.

Example: The Weather Problem

ID  outlook   temperature  humidity  windy  play
1   sunny     hot          high      false  no
2   sunny     hot          high      true   no
3   overcast  hot          high      false  yes
4   rainy     mild         high      false  yes
5   rainy     cool         normal    false  yes
6   rainy     cool         normal    true   no
7   overcast  cool         normal    true   yes
8   sunny     mild         high      false  no
9   sunny     cool         normal    false  yes
10  rainy     mild         normal    false  yes
11  sunny     mild         normal    true   yes
12  overcast  mild         high      true   yes
13  overcast  hot          normal    false  yes
14  rainy     mild         high      true   no

Association rules for the Weather Problem
1. humidity=normal windy=FALSE (4) ==> play=yes (4)
2. temperature=cool (4) ==> humidity=normal (4)
3. outlook=overcast (4) ==> play=yes (4)
4. temperature=cool play=yes (3) ==> humidity=normal (3)
5. outlook=rainy windy=FALSE (3) ==> play=yes (3)
6. outlook=rainy play=yes (3) ==> windy=FALSE (3)
7. outlook=sunny humidity=high (3) ==> play=no (3)
8. outlook=sunny play=no (3) ==> humidity=high (3)
9. temperature=cool windy=FALSE (2) ==> humidity=normal play=yes (2)
10. temperature=cool humidity=normal windy=FALSE (2) ==> play=yes (2)
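
The counts in parentheses can be checked against the weather table. The following is a minimal Python sketch, not part of the original slides: the helper names (`count`, `rule_stats`) are illustrative, windy values are written in lowercase as in the table, and confidence is taken as the rule count divided by the antecedent count, which is the usual definition.

```python
# Weather data from the table: one tuple per row, in ID order.
COLUMNS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
RECORDS = [dict(zip(COLUMNS, row)) for row in ROWS]


def count(items):
    """Number of records containing every attribute=value pair in items."""
    return sum(
        all(record[attr] == value for attr, value in items.items())
        for record in RECORDS
    )


def rule_stats(antecedent, consequent):
    """Return (antecedent count, rule count, confidence) for a rule."""
    n_antecedent = count(antecedent)
    n_rule = count({**antecedent, **consequent})
    confidence = n_rule / n_antecedent if n_antecedent else 0.0
    return n_antecedent, n_rule, confidence


# Rule 1: humidity=normal windy=FALSE ==> play=yes
stats_rule1 = rule_stats({"humidity": "normal", "windy": "false"}, {"play": "yes"})
# Rule 7: outlook=sunny humidity=high ==> play=no
stats_rule7 = rule_stats({"outlook": "sunny", "humidity": "high"}, {"play": "no"})
print(stats_rule1)
print(stats_rule7)
```

Both rules cover exactly the records that match their antecedent, so their confidence is 1.0, in line with the equal counts on the two sides of each listed rule.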

Result: new prediction?

Outlook  Temp.  Humidity  Wind  Play
Sunny    Cool   High      True  ?
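
One way to predict the class of the new instance is to apply the listed rules whose consequent is only the class attribute play (rules 1, 3, 5, 7 and 10) and let the first matching rule decide. The sketch below assumes this first-match strategy and the listing order as rule priority; the slides do not state which prediction strategy is intended, and the names `CLASS_RULES` and `predict` are illustrative.

```python
# Rules from the listing whose consequent is the class attribute "play" only.
# Each entry is (antecedent, predicted class), kept in listing order.
CLASS_RULES = [
    ({"humidity": "normal", "windy": "false"}, "yes"),                         # rule 1
    ({"outlook": "overcast"}, "yes"),                                          # rule 3
    ({"outlook": "rainy", "windy": "false"}, "yes"),                           # rule 5
    ({"outlook": "sunny", "humidity": "high"}, "no"),                          # rule 7
    ({"temperature": "cool", "humidity": "normal", "windy": "false"}, "yes"),  # rule 10
]


def predict(instance, rules, default=None):
    """Return the class of the first rule whose antecedent matches the instance."""
    for antecedent, label in rules:
        if all(instance.get(attr) == value for attr, value in antecedent.items()):
            return label
    return default  # no rule fires; a default class would be an extra assumption


new_instance = {
    "outlook": "sunny",
    "temperature": "cool",
    "humidity": "high",
    "windy": "true",
}
prediction = predict(new_instance, CLASS_RULES)
print(prediction)
```

Under these assumptions only rule 7 (outlook=sunny, humidity=high) matches the new instance, so the sketch predicts play=no.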

Literature Survey
- Liao et al. [8] report on data mining techniques and applications through a
  survey of the literature from 2000 to 2011. The paper surveys three areas of
  data mining research: knowledge types, analysis types, and architecture types.
  The discussion deals with future progress in social science and engineering
  methodologies that implement data mining techniques, and with the development
  of problem-oriented applications.

- The first association rule mining algorithm was the Apriori algorithm [3],
  developed by Agrawal and Srikant. In each pass, Apriori generates the candidate
  itemsets using only the itemsets that had large support in the previous pass,
  without considering the transactions in the database.
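
The candidate generation step described above can be sketched in Python. This is a simplified illustration rather than the exact procedure from [3]: candidates are formed by taking unions of frequent (k-1)-itemsets instead of the lexicographic prefix join, and then pruned if any (k-1)-subset is not frequent. The function name `apriori_gen` and the example itemsets are illustrative.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets.

    frequent_prev: a set of frozensets that all have the same size k-1.
    No transactions are read here; only the previous pass is used.
    """
    prev = set(frequent_prev)
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue  # the two itemsets differ in more than one item
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Frequent 3-itemsets of a hypothetical database.
L3 = {frozenset(s) for s in [(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)]}
C4 = apriori_gen(L3)
print(C4)
```

Here {1, 3, 4, 5} is discarded because its subset {1, 4, 5} is not frequent, leaving {1, 2, 3, 4} as the only candidate whose support still has to be counted against the database.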

- Kwon et al. [9] evaluated which data set features most affect the
  performance of classification algorithms. Determining which algorithm is
  most effective for which data set is a complex problem. The authors
  experimentally examine how data set characteristics affect algorithm
  performance in terms of elapsed time and accuracy.

- B. Liu et al. [2] presented associative classification, which integrates
  classification rule mining and association rule mining. The integration is
  done by mining a special subset of association rules whose consequents are
  restricted to the class labels, called Class Association Rules (CARs).
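
To make the idea of CARs concrete, the sketch below enumerates rules on the weather data whose consequent is restricted to the class attribute play. It is a brute-force illustration, not the rule generator from [2]; the function name `mine_cars`, the thresholds (minimum rule count 3, minimum confidence 1.0) and the antecedent length limit of 2 are assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations

COLUMNS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
RECORDS = [dict(zip(COLUMNS, row)) for row in ROWS]


def mine_cars(records, class_attr, min_count, min_conf, max_len=2):
    """Enumerate rules of the form antecedent ==> class_attr=label."""
    attrs = [a for a in records[0] if a != class_attr]
    items = sorted({(a, r[a]) for r in records for a in attrs})
    cars = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            # An antecedent may use each attribute at most once.
            if len({attr for attr, _ in antecedent}) < size:
                continue
            covered = [r for r in records if all(r[a] == v for a, v in antecedent)]
            # Count how often each class label occurs with this antecedent.
            for label, n in Counter(r[class_attr] for r in covered).items():
                confidence = n / len(covered)
                if n >= min_count and confidence >= min_conf:
                    cars.append((dict(antecedent), label, n, confidence))
    return cars


cars = mine_cars(RECORDS, "play", min_count=3, min_conf=1.0)
for antecedent, label, n, confidence in cars:
    print(antecedent, "==> play =", label, f"(count={n}, confidence={confidence})")
```

Because the consequent is fixed to the class attribute, every rule found this way can be used directly for prediction, which is what distinguishes CARs from general association rules.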

Problem Formulation
- Associative classification suffers from inefficiency because association
  rule mining often generates a very large number of rules. Many of these
  rules are insignificant, while good rules with relatively low support are
  not produced. Selecting high-quality rules from among them takes effort.

- Most associative classification algorithms adopt the exhaustive search
  method of the well-known Apriori algorithm to discover the rules, and so
  require multiple passes over the database. Furthermore, they find frequent
  itemsets in one phase and generate the rules in a separate phase, which
  consumes more resources such as storage and processing time.

Objectives
- Propose a framework that can generate
  Class Association Rules (CARs) efficiently.
- Evaluate the proposed approach.
- Compare the proposed algorithm with
  other state-of-the-art techniques.

Methodology
- Review the classification and association rule
  generation methods.
- Understand the existing associative
  classification model.
- Implement a classification system based on
  association rules and compare the performance of
  several model construction methods or algorithms in
  the Weka environment.
- Compare the proposed approach with existing
  methods.

Facilities Required
- Data mining tools such as Weka are used for the
  implementation of the proposed project
  work.

References
[1] Tom M. Mitchell, “Machine Learning”, 1st ed. U.K.: McGraw-Hill, 1997.
[2] Bing Liu, Wynne Hsu, and Yiming Ma, “Integrating classification and association rule
    mining”, in Knowledge Discovery and Data Mining, New York, vol. 2, pp. 80–86,
    1998.
[3] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules”, in VLDB,
    pp. 487–499, Santiago, Chile, September 12-15, 1994.
[4] Wenmin Li, Jiawei Han, and Jian Pei, “CMAR: Accurate and efficient classification
    based on multiple class-association rules”, in ICDM'01 Proc. of the 2001 IEEE
    International Conference on Data Mining, pp. 369–376, IEEE Computer Society,
    Washington, DC, USA, 2001.
[5] X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules”, Proc.
    SIAM Int. Conf. on Data Mining, pp. 331–335, San Francisco, CA, May 2003.
[6] Fadi Abdeljaber Thabtah, “A review of associative classification mining”, Knowledge
    Engineering Review, vol. 1, pp. 37–65, 2007.
[7] T.V.Mahendra, N.Deepika and N.Keasava Rao, “Data Mining for High Performance
    Data Cloud using Association Rule Mining”, International Journal of Advanced
    Research in Computer Science and Software Engineering, vol. 2, Issue 1, 2012.
[8] S. H. Liao, P. H. Chu, and P. Y. Hsiao, “Data mining techniques and applications – A
    decade review from 2000 to 2011”, Elsevier Expert Systems with Applications, vol.
    39, pp. 11303–11311, 2012.
[9] Ohbyung Kwon and Jae Mun Sim, “Effects of data set features on the performances
    of classification algorithms”, Expert Systems with Applications, vol. 40, pp.
    1847–1857, 2013.
[10] http://www.infovis-wiki.net/index.php?title=File:Fayyad96kdd-process.png
