C4.5 algorithm and Multivariate Decision Trees

                                                   Thales Sehn Korting

                 Image Processing Division, National Institute for Space Research – INPE
                                   São José dos Campos – SP, Brazil

                                                tkorting@dpi.inpe.br


                        Abstract
   The aim of this article is to give a brief description
of the C4.5 algorithm, used to create Univariate Decision
Trees. We also discuss Multivariate Decision Trees and
their process for classifying instances using more than
one attribute per node in the tree. We discuss how they
work and how to implement the algorithms that build such
trees, including examples of Univariate and Multivariate
results.



1. Introduction
   Describing the Pattern Recognition process, the goal
is to learn (or to "teach" a machine) how to classify
objects through the analysis of a set of instances whose
classes¹ are known [5].

   ¹ Mutually exclusive labels, such as "buildings",
     "deforestation", etc.

   Since we know the classes of an instance set (or
training set), we can use several algorithms to discover
how the attribute vectors of the instances behave, in
order to estimate the classes of new instances. One way
to do this is through Decision Trees (DTs).

   Figure 1. Simple example of a classification process.

   A tree is either a leaf node labeled with a class, or
a structure containing a test, linked to two or more
nodes (or subtrees) [5]. So, to classify some instance,
we first take its attribute vector and apply it to the
tree. The tests are performed on these attributes,
reaching one leaf or another to complete the
classification process, as in Figure 1.
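To make this traversal concrete, here is a minimal Python
sketch (ours, not the paper's implementation); the Node
structure and the attribute and class names are
hypothetical, loosely echoing the WEKA example in
Section 2.4:

# Minimal sketch of the tree structure described above
# (illustrative). A tree is either a leaf labeled with a
# class, or a test node linked to two or more subtrees.

class Node:
    def __init__(self, attribute=None, children=None, label=None):
        self.attribute = attribute      # attribute tested at this node
        self.children = children or {}  # attribute value -> subtree
        self.label = label              # class label, if this is a leaf

def classify(node, instance):
    """Walk the tree with the instance's attribute vector until a leaf."""
    while node.label is None:
        node = node.children[instance[node.attribute]]
    return node.label

# Example: a one-test tree over a symbolic attribute.
tree = Node(attribute="health_plan", children={
    "full": Node(label="good"),
    "none": Node(label="bad"),
})
print(classify(tree, {"health_plan": "full"}))  # -> good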
   If we have n attributes for our instances, we have an
n-dimensional space for the classes, and the DT creates
hyperplanes (or partitions) to divide this space among
the classes. A 2D space is shown in Figure 2, where the
lines represent the hyperplanes in this dimension.

   DTs can deal with one attribute per test node or with
more than one. The former approach is called the
Univariate DT, and the latter the Multivariate method.
This article explains the construction of Univariate DTs
and the C4.5 algorithm used to build such trees
(Section 2). After this, we discuss the Multivariate
approach and how to construct such trees (Section 3). At
the end of each approach (Uni- and Multivariate), we show
some results for different test cases.

2. C4.5 Algorithm

   This section explains one of the algorithms used to
create Univariate DTs. This one, called C4.5, is based
on the ID3² algorithm, which tries to find small (or
simple) DTs. We start by presenting some premises on
which this algorithm is based, and then we discuss the
inference of the weights and tests in the nodes of the
trees.

   ² ID3 stands for Iterative Dichotomiser 3.

   Figure 2. Partitions created in a DT.

2.1. Construction

   Some premises guide this algorithm, such as the
following [4]:

   • if all cases are of the same class, the tree is a
     leaf, and the leaf is returned labelled with this
     class;
   • for each attribute, calculate the potential
     information provided by a test on the attribute
     (based on the probabilities of each case having a
     particular value for the attribute). Also calculate
     the gain in information that would result from a
     test on the attribute (based on the probabilities of
     each case with a particular value for the attribute
     being of a particular class);
   • depending on the current selection criterion, find
     the best attribute to branch on.

2.2. Counting gain

   This process uses the "Entropy", i.e. a measure of
the disorder of the data. The Entropy of y is calculated
by

      Entropy(y) = − Σ_{j=1}^{n} (|y_j| / |y|) log(|y_j| / |y|)

iterating over all possible values of y. The conditional
Entropy is

      Entropy(j|y) = (|y_j| / |y|) log(|y_j| / |y|)

and finally, we define Gain by

      Gain(y, j) = Entropy(y) − Entropy(j|y)

The aim is to maximize the Gain, i.e. the reduction in
overall entropy obtained by splitting attribute y at
value j.
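As a concrete illustration (ours, not from the paper),
here is a short Python sketch of these quantities. The
base-2 logarithm and the size-weighted conditional term
follow the usual ID3/C4.5 formulation, which is an
assumption on our part where the text is terse:

from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(y) = -sum_j (|y_j|/|y|) log(|y_j|/|y|), base-2 log."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(instances, labels, attribute):
    """Gain = Entropy(y) minus the entropy remaining after splitting
    on `attribute`, each branch weighted by its relative size."""
    total = len(labels)
    # group labels by the instance's value for this attribute
    groups = {}
    for inst, lab in zip(instances, labels):
        groups.setdefault(inst[attribute], []).append(lab)
    remainder = sum((len(subset) / total) * entropy(subset)
                    for subset in groups.values())
    return entropy(labels) - remainder

# Example: this attribute separates the classes perfectly.
data = [{"windy": "yes"}, {"windy": "yes"},
        {"windy": "no"}, {"windy": "no"}]
labels = ["bad", "bad", "good", "good"]
print(information_gain(data, labels, "windy"))  # -> 1.0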
2.3. Pruning

   This is an important step for the result because of
outliers. All data sets contain a small subset of
instances that are not well defined and differ from the
other ones in their neighborhood.

   After the complete creation of the tree, which must
classify all the instances in the training set, the tree
is pruned. This is done to reduce classification errors
caused by specialization in the training set, and thus
to make the tree more general.

2.4. Results

   To show concrete examples of the application of the
C4.5 algorithm, we used the WEKA system [6]. One
training set was used that considers some aspects of
working people, like vacation time, working hours, and
health plan. The resulting classes describe the working
conditions, i.e. good or bad. Figure 3 shows the
resulting DT, using the C4.5 implementation from WEKA.

   Another example deals with levels of contact lenses,
according to some characteristics of the patients.
Results are shown in Figure 4.

   Figure 3. Simple Univariate DT, created by the C4.5
   algorithm. In blue are the tests; in green and red,
   the resulting classes.

   Figure 4. Another Univariate DT, created by the C4.5
   algorithm. In blue are the tests, and in red the
   resulting classes.

3. Multivariate DTs

   Multivariate DTs, as inductive-learning methods, are
able to generalize well when dealing with attribute
correlation. Also, the results are easy for humans to
interpret, i.e. we can understand the influence of each
attribute on the whole process [2].

   One problem when using simple (or Univariate) DTs is
that, along a path, they may test some attributes more
than once. Sometimes this hurts the performance of the
system, because with a simple transformation of the
data, such as principal components, we can reduce the
correlation between the attributes and achieve the same
classification with a single test. The aim of
Multivariate DTs is to perform different tests with the
data, as in Figure 5.


   The purpose of the Multivariate approach is to use
more than one attribute in the test nodes. In the
example of Figure 5, we can change the whole set of
tests by the simple test x + y ≥ 8. But how can we
develop an algorithm that is able to "discover" such
planes? This is the content of the following sections.

   Figure 5. Problem with the Univariate approach [2].
   It performs several tests, while the blue line
   (Multivariate) is much more efficient.

   We can think of this approach as a linear combination
of the attributes at each internal node. For example,
consider an instance with attributes y = y_1, y_2, ..., y_n
belonging to class C_j. The tests at each node of the
tree will follow the form

      Σ_{i=1}^{n+1} w_i y_i > 0

where w_1, w_2, ..., w_{n+1} are real-valued coefficients
[3]. We also consider that the attributes y_1, y_2, ..., y_n
can be real-valued; some approaches deal with symbolic
ones as well, most of the time by mapping them onto a
numeric scale.
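A sketch of this node test (illustrative): the attribute
vector is extended with a constant 1 so that w_{n+1}
acts as a threshold. The weights below encode the
x + y ≥ 8 example from Figure 5, treating ≥ as > for
simplicity:

import numpy as np

def multivariate_test(w, y):
    """Evaluate the node test sum_{i=1}^{n+1} w_i * y_i > 0, where y
    is the attribute vector extended with a constant 1 so the last
    weight acts as a threshold."""
    y_ext = np.append(y, 1.0)
    return float(np.dot(w, y_ext)) > 0.0

# x + y >= 8 rewritten as x + y - 8 > 0.
w = np.array([1.0, 1.0, -8.0])
print(multivariate_test(w, np.array([5.0, 4.0])))  # 5 + 4 - 8 > 0 -> True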
   Multivariate and Univariate DTs share some properties
when modelling the tree, especially at the stage of
pruning statistically invalid branches.

3.1. Tree Construction

   The first step in this phase is to have a set of
training instances. Each of them has a attributes and an
associated class. This is the default procedure for all
classification methods.

   Through a top-down decision-tree algorithm and a
merit selection criterion, the process chooses the best
test to split the data, creating a branch. After the
first split we have two partitions, on which the
algorithm performs the same top-down analysis to make
more partitions, according to the criteria.

   One of the stopping criteria is when some partition
contains just a single class; this node then becomes a
leaf, with an associated class.

   But we want to know how the process splits the data,
and here lies the difference between Multi- and
Univariate DTs.

   Considering a multiclass instance set, we can
represent the multivariate tests with a Linear Machine
(LM) [2].


LM: Let y be an instance description consisting of 1
   and the n features that describe the instance. Then
   each discriminant function g_i(y) has the form

      g_i(y) = w_i^T y

   where w_i is a vector of n + 1 coefficients. The LM
   infers instance y to belong to class i iff

      (∀ j ≠ i) g_i(y) > g_j(y)

   Some methods for training an LM have been proposed.
We can start the weight vectors with a default value for
all w_i, i = 1, ..., N. Here, we show the absolute error
correction rule and the thermal perceptron.
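To make the decision rule concrete, here is a minimal
sketch of LM prediction (illustrative, not taken from
[2]); the function name and toy weights are hypothetical:

import numpy as np

def lm_predict(W, y):
    """Linear Machine: W has one row w_i of n+1 coefficients per
    class, y is [1, y_1, ..., y_n]. The LM assigns the class whose
    discriminant g_i(y) = w_i^T y is largest."""
    return int(np.argmax(W @ y))

# Two classes, two features (plus the leading 1).
W = np.array([[0.0, 1.0, -1.0],    # g_0(y) = y1 - y2
              [0.0, -1.0, 1.0]])   # g_1(y) = y2 - y1
y = np.array([1.0, 2.0, 5.0])
print(lm_predict(W, y))  # g_1 > g_0 -> 1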
3.1.1. Absolute Error Correction rule: One approach for
updating the weights of the discriminant functions is
the absolute error correction rule, which adjusts w_i,
where i is the class to which the instance belongs, and
w_j, where j is the class to which the LM incorrectly
assigns the instance. The correction is accomplished by

      w_i ← w_i + c·y

and

      w_j ← w_j − c·y

where c is the smallest integer exceeding

      ((w_j − w_i)^T y) / (2 y^T y)

which guarantees that the updated LM will classify the
instance correctly.
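A sketch of one such update (our transcription; the
explicit rounding step is our reading of "smallest
integer"):

import math
import numpy as np

def absolute_error_correction(W, y, true_i, wrong_j):
    """One update of the absolute error correction rule: move w_i
    toward the instance and w_j away, with c the smallest integer
    such that the corrected LM classifies the instance as class i."""
    w_i, w_j = W[true_i], W[wrong_j]
    c = (w_j - w_i) @ y / (2 * (y @ y))
    c = math.floor(c) + 1  # smallest integer strictly above the quotient
    W[true_i] = w_i + c * y
    W[wrong_j] = w_j - c * y
    return W

After the update, g_i(y) − g_j(y) grows by 2c·y^T y, so
choosing c strictly above the quotient flips the
decision in favor of the true class.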

3.1.2. Thermal Perceptron: For instances that are not
linearly separable, one method is the "thermal
perceptron" [1], which also adjusts w_i and w_j, and
works with the constants

      c = B / (B + k)

and

      k = ((w_j − w_i)^T y) / (2 y^T y)

The process follows this algorithm:

1. B = 2;
2. If LM is correct for all instances
   Or B < 0.001, RETURN
3. Otherwise, for each misclassified instance
   3.1. Compute correction c and
        update w[i] and w[j]
   3.2. Adjust B <- aB - b
        with a = 0.99 and b = 0.0005
4. Back to step 2

   The basic idea of this algorithm is to correct the
weight vectors until all instances are classified
correctly or, in the worst case, until a certain number
of iterations is reached. This limit is enforced by the
update of B, which decreases according to B = aB − b;
with a = 0.99 and b = 0.0005, B undergoes a small,
nearly linear decrease at each step until it falls below
the threshold.
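A runnable sketch of this loop (ours), reusing the
per-class weight rows from the LM sketch above; the
exact point at which B is tested is a minor
interpretation of the pseudocode:

import numpy as np

def train_thermal(W, instances, labels, a=0.99, b=0.0005):
    """Thermal perceptron training loop, following the pseudocode
    above. W: one weight row per class; each instance is a row
    [1, y1, ..., yn]; labels are class indices."""
    B = 2.0                                  # step 1
    while B >= 0.001:                        # part of step 2
        errors = 0
        for y, true_i in zip(instances, labels):
            pred_j = int(np.argmax(W @ y))   # current LM decision
            if pred_j == true_i:
                continue                     # correctly classified
            errors += 1                      # step 3: misclassified
            k = (W[pred_j] - W[true_i]) @ y / (2 * (y @ y))
            c = B / (B + k)                  # thermal correction (3.1)
            W[true_i] = W[true_i] + c * y
            W[pred_j] = W[pred_j] - c * y
            B = a * B - b                    # step 3.2: anneal B
        if errors == 0:                      # step 2: correct for all
            break
    return W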
3.2. Pruning

   When pruning Multivariate DTs, one must consider that
this can result in more classification errors than gains
in generalization. Generally, just some features (or
attributes) are removed from the multivariate tests,
instead of pruning the whole node. [2] states that a
multivariate test with n − 1 features is more general
than one based on n features.
3.3. Results

   Figure 6 shows a good example, performing the
classification with simple tests, even on a complicated
data set.

4. Conclusion

   In this article we discussed Decision Trees, in both
the Univariate and the Multivariate approaches. The C4.5
algorithm implements one way to build Univariate DTs,
and some results were shown. Regarding the Multivariate
approach, we first discussed the advantages of using it,
and then showed how to build such trees with the Linear
Machine approach, using the Absolute Error Correction
and the Thermal Perceptron rules.

   DTs are a powerful tool for classification,
especially when the results need to be interpreted by
humans. Multivariate DTs deal well with attribute
correlation, presenting advantages in the tests compared
with the Univariate approach.

References

[1] C. Brodley and P. Utgoff. Multivariate Versus
    Univariate Decision Trees. 1992.
[2] C. Brodley and P. Utgoff. Multivariate decision
    trees. Machine Learning, 19(1):45–77, 1995.
[3] S. Murthy, S. Kasif, and S. Salzberg. A System for
    Induction of Oblique Decision Trees. arXiv preprint
    cs.AI/9408103, 1994.
[4] J. Quinlan. C4.5: Programs for Machine Learning.
    Morgan Kaufmann, 1992.
[5] J. Quinlan. Learning decision tree classifiers. ACM
    Computing Surveys (CSUR), 28(1):71–72, 1996.
[6] WEKA (Data Mining Software). Available at
    http://www.cs.waikato.ac.nz/ml/weka/. 2006.




   Figure 6. Multivariate DT, created by the OC1
   algorithm (Oblique Classifier 1) [3].
