Introduction to Machine
       Learning
                  Lecture 5


               Albert Orriols i Puig
              aorriols@salle.url.edu

     Artificial Intelligence – Machine Learning
         Enginyeria i Arquitectura La Salle
                Universitat Ramon Llull
Recap of Lecture 4

       The input consists of examples described by different characteristics (features)




Recap of Lecture 4
        Different problems in machine learning
                Classification: Find the class to which a new instance belongs
                          E.g.: Find whether a new patient has cancer or not

                Numeric prediction: A variation of classification in which the output
                consists of numeric classes
                          E.g.: Find the frequency of cancerous cells found

                Regression: Find a function that fits your examples
                          E.g.: Find a function that controls your chain process

                Association: Find associations among your problem attributes or
                variables
                          E.g.: Find relations such as: a patient with high blood pressure is
                          more likely to have heart disease

                Clustering: Process to cluster/group the instances into classes
                          E.g.: Group clients whose purchases are similar


Today’s Agenda

        Reviewing the Goal of Data Classification
        What’s a Decision Tree
        How to build Decision Trees:
                ID3
                From ID3 to C4.5
        Run C4.5 on Weka




The Goal of Data Classification

                  Data Set → Classification Model. How?




       The classification model can be implemented in several ways:
               • Rules
               • Decision trees
               • Mathematical formulae




The Goal of Data Classification
        Data can have complex structures
        We will accept the following type of data:
                Data described by features which have a single measurement
                Features can be
                          Nominal: @attribute color {green, orange, red}
                          Continuous: @attribute length real [0,10]
                The data can have unknown values
                          I could have lost, or never have measured, the attribute of
                          a particular example




Classifying Plants
        Let’s classify different plants into three classes:
                Iris-setosa, iris-virginica, and iris-versicolor




                Weka format
                Dataset publicly available at the UCI repository
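
                The slide's screenshot of the data file is lost. As a sketch, the iris data
                in Weka's ARFF format looks roughly like this (attribute names follow the
                UCI iris data set):

                    @relation iris

                    @attribute sepallength real
                    @attribute sepalwidth real
                    @attribute petallength real
                    @attribute petalwidth real
                    @attribute class {Iris-setosa, Iris-versicolor, Iris-virginica}

                    @data
                    5.1,3.5,1.4,0.2,Iris-setosa
                    7.0,3.2,4.7,1.4,Iris-versicolor
                    6.3,3.3,6.0,2.5,Iris-virginica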

Classifying Plants
        The decision tree built from this data set:



                                                   Internal node: test of the
                                                   value of a given attribute
                                                   Branch: value of an
                                                   attribute
                                                   Leaf: predicted class
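
        One lightweight way to hold such a tree in code is a nested mapping: an internal
        node maps an attribute name to one branch per value, and a leaf is just the
        predicted class. The fragment below is hypothetical (the tree in the lost figure
        is Weka's output, which tests continuous thresholds); it only illustrates the
        node/branch/leaf structure, in Python:

            iris_tree = {
                "petalwidth": {                      # internal node: tests an attribute
                    "low": "Iris-setosa",            # leaf: predicted class
                    "mid": {                         # branch leading to another test
                        "petallength": {
                            "short": "Iris-versicolor",
                            "long": "Iris-virginica",
                        }
                    },
                    "high": "Iris-virginica",
                }
            }

            def classify(tree, example):
                # Follow branches until a leaf (a plain class label) is reached.
                while isinstance(tree, dict):
                    attribute, branches = next(iter(tree.items()))
                    tree = branches[example[attribute]]
                return tree

            print(classify(iris_tree, {"petalwidth": "mid", "petallength": "short"}))
            # prints: Iris-versicolor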




        How could I automatically generate these types of trees?


Types of DT Builders
        So, where is the trick?
                Choose the attribute to test at each internal node, and choose the best
                partition
        Several algorithms can generate decision trees:
                Naïve Bayes Tree
                Random forests
                CART (classification and regression tree)
                ID3
                C4.5
        We are going to start from ID3 and finish with C4.5
                C4.5 is the most influential decision tree builder algorithm


ID3
        Ross Quinlan started with
                ID3 (Quinlan, 1979)
                C4.5 (Quinlan, 1993)


        Some assumptions in the basic algorithm
                All attributes are nominal
                We do not have unknown values


        With these assumptions, Quinlan designed a heuristic
        approach to infer decision trees from labeled data



Description of ID3
ID3(D, Target, Atts)                                        (Mitchell, 1997)
returns: a decision tree that correctly classifies the given examples

variables
D: training set of examples
Target: attribute whose value is to be predicted by the tree
Atts: list of other attributes that may be tested by the learned decision tree

create a Root node for the tree
if all examples in D are positive then Root ← +
else if all examples in D are negative then Root ← –
else if Atts = ∅ then Root ← most common value of Target in D
else
        A ← the best decision attribute from Atts
        Root ← A
        for each possible value vi of A
              add a new tree branch with A = vi
              Dvi ← the subset of D that has value vi for A
              if Dvi = ∅ then add a leaf ← most common value of Target in D
              else add the subtree ID3(Dvi, Target, Atts − {A})


Slides 12 through 16 repeat this pseudocode while a figure is built up alongside it,
illustrating one recursive step: starting from the full training set D = {d1, d2, ..., dL}
and the attribute list Atts = {a1, a2, ..., ak}, the root node tests the chosen attribute
ai; one branch is added per value ai = v1, ..., ai = vn; and each branch receives the
subset of examples D' with that value, together with the reduced attribute list
Atts = {a1, .., ai−1, ai+1, .., ak}, on which ID3 recurses.
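
Below is a minimal runnable sketch of this pseudocode in Python. It is an illustration
under stated assumptions, not Quinlan's implementation: examples are assumed to be dicts
of nominal attribute values; "the best decision attribute" is chosen with the information
gain measure defined later in this lecture; and, unlike the pseudocode, branches are only
grown for attribute values that actually occur in D.

    from collections import Counter
    from math import log2

    def entropy(examples, target):
        # Entropy (in bits) of the class distribution of `examples`.
        counts = Counter(ex[target] for ex in examples)
        total = len(examples)
        return -sum((n / total) * log2(n / total) for n in counts.values())

    def information_gain(examples, attribute, target):
        # Expected reduction in entropy obtained by splitting on `attribute`.
        total = len(examples)
        remainder = 0.0
        for value in {ex[attribute] for ex in examples}:
            subset = [ex for ex in examples if ex[attribute] == value]
            remainder += len(subset) / total * entropy(subset, target)
        return entropy(examples, target) - remainder

    def id3(examples, target, attributes):
        # Returns a class label (leaf) or {attribute: {value: subtree}} (node).
        labels = [ex[target] for ex in examples]
        if len(set(labels)) == 1:          # all examples share one class
            return labels[0]
        if not attributes:                 # no attributes left: majority class
            return Counter(labels).most_common(1)[0][0]
        best = max(attributes,             # A <- the best decision attribute
                   key=lambda a: information_gain(examples, a, target))
        tree = {best: {}}
        for value in {ex[best] for ex in examples}:
            subset = [ex for ex in examples if ex[best] == value]
            tree[best][value] = id3(subset, target,
                                    [a for a in attributes if a != best])
        return tree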
Which Attribute Should I Select First?

        Which is the best choice?
                We have 29 positive examples and 35 negative ones
                Should I test attribute 1 or attribute 2 at this node?




Let’s Rely on Information Theory
        Use the concept of Entropy
            Characterizes the impurity of an
            arbitrary collection of examples
            Given S:


                    $\mathrm{Entropy}(S) = -p_+ \log_2 p_+ - p_- \log_2 p_-$

                  where $p_+$ and $p_-$ are the proportions of positive and
                  negative examples in S.


            Extension to c classes:

                    $\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$

            Examples:
                    $p_+ = 0$ → entropy = 0
                    $p_+ = 1$ → entropy = 0
                    $p_+ = 0.5$, $p_- = 0.5$ → entropy = 1 (maximum)
                    $p_+ = 9/14$, $p_- = 5/14$ → entropy =
                    $-(9/14)\log_2(9/14) - (5/14)\log_2(5/14) = 0.940$
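
            A quick numeric check of these examples (a small sketch; for two
            classes the entropy depends only on $p_+$):

                from math import log2

                def binary_entropy(p_pos):
                    # Entropy of a two-class collection, given the proportion of positives.
                    if p_pos in (0.0, 1.0):
                        return 0.0              # log2(0) is undefined; the limit is 0
                    return -p_pos * log2(p_pos) - (1 - p_pos) * log2(1 - p_pos)

                print(binary_entropy(0.0))      # 0.0
                print(binary_entropy(1.0))      # 0.0
                print(binary_entropy(0.5))      # 1.0 (the maximum)
                print(binary_entropy(9 / 14))   # 0.940...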


Entropy
        What does this measure mean?
                Entropy is the minimum number of bits needed to encode the
                classification of a randomly drawn member of S.
                          $p_+ = 1$: the receiver knows the class, so no message is sent; entropy = 0.
                          $p_+ = 0.5$: 1 bit is needed.

                An optimal-length code assigns $-\log_2 p$ bits to a message having probability p.
                The idea is to assign shorter codes to the more probable messages
                and longer codes to the less likely ones.
                Thus, the expected number of bits to encode the + or – label of a random
                member of S is:

                             $\mathrm{Entropy}(S) = p_+(-\log_2 p_+) + p_-(-\log_2 p_-)$




Information Gain

      Measures the expected reduction in entropy caused by partitioning the
      examples according to the given attribute
      Gain(S,A): the number of bits saved when encoding the target value of an
      arbitrary member of S, knowing the value of attribute A.
      Expected reduction in entropy caused by knowing the value of A.
              $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$

            where values(A) is the set of all possible values of A, and $S_v$ is the
       subset of S for which attribute A has value v




Remember the Example?
        Which is the best choice?
                We have 29 positive examples and 35 negative ones
                Should I test attribute 1 or attribute 2 at this node?




                           Gain(A1) = 0.993 − (26/64)·0.70 − (38/64)·0.74 = 0.269
                           Gain(A2) = 0.993 − (51/64)·0.93 − (13/64)·0.61 = 0.128
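
        As a numeric check (a sketch, not from the slides): the per-branch class
        counts sit in the lost figure, so I assume the splits from Mitchell (1997)
        that match the subset sizes and entropies above, A1 → [21+, 5−] and
        [8+, 30−], A2 → [18+, 33−] and [11+, 2−]:

            from math import log2

            def H(pos, neg):
                # Two-class entropy from class counts.
                total = pos + neg
                return -sum(p * log2(p) for p in (pos / total, neg / total) if p > 0)

            def gain(pos, neg, splits):
                # Information gain of a split, given (pos, neg) counts per branch.
                total = pos + neg
                return H(pos, neg) - sum((p + n) / total * H(p, n) for p, n in splits)

            print(gain(29, 35, [(21, 5), (8, 30)]))   # ~0.266 for A1 (0.269 above,
                                                      # where subset entropies were rounded)
            print(gain(29, 35, [(18, 33), (11, 2)]))  # ~0.121 for A2 (0.128 when rounded)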



Yet Another Example
        The textbook example
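
        The figures of this slide and the next are lost; "the textbook example"
        presumably refers to Mitchell's (1997) PlayTennis data set. As a usage
        sketch, running the id3() function from the ID3 slide on that data
        reproduces the textbook tree:

            attributes = ["outlook", "temperature", "humidity", "wind"]
            rows = [
                ("sunny",    "hot",  "high",   "weak",   "no"),
                ("sunny",    "hot",  "high",   "strong", "no"),
                ("overcast", "hot",  "high",   "weak",   "yes"),
                ("rain",     "mild", "high",   "weak",   "yes"),
                ("rain",     "cool", "normal", "weak",   "yes"),
                ("rain",     "cool", "normal", "strong", "no"),
                ("overcast", "cool", "normal", "strong", "yes"),
                ("sunny",    "mild", "high",   "weak",   "no"),
                ("sunny",    "cool", "normal", "weak",   "yes"),
                ("rain",     "mild", "normal", "weak",   "yes"),
                ("sunny",    "mild", "normal", "strong", "yes"),
                ("overcast", "mild", "high",   "strong", "yes"),
                ("overcast", "hot",  "normal", "weak",   "yes"),
                ("rain",     "mild", "high",   "strong", "no"),
            ]
            play_tennis = [dict(zip(attributes + ["play"], row)) for row in rows]

            print(id3(play_tennis, "play", attributes))
            # {'outlook': {'overcast': 'yes',
            #              'sunny': {'humidity': {'high': 'no', 'normal': 'yes'}},
            #              'rain': {'wind': {'weak': 'yes', 'strong': 'no'}}}}
            # (branch order may vary: branches are grown from a Python set)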




Yet Another Example




Hypothesis Search Space




To Sum Up
        ID3 is a strong system that
                Uses hill-climbing search based on the information gain
                measure to search through the space of decision trees
                Outputs a single hypothesis.
                Never backtracks. It converges to locally optimal solutions.
                Uses all training examples at each step, contrary to methods
                that make decisions incrementally.
                Uses statistical properties of all examples: the search is less
                sensitive to errors in individual training examples.
                Can handle noisy data by modifying its termination criterion to
                accept hypotheses that imperfectly fit the data.




Next Class


        From ID3 to C4.5. C4.5 extends ID3 and enables the
        system to:
                Be more robust in the presence of noise, avoiding overfitting
                Deal with continuous attributes
                Deal with missing data
                Convert trees to rules



