Note to other teachers and users of these slides. Andrew would be
delighted if you found this source material useful in giving your own
lectures. Feel free to use these slides verbatim, or to modify them to fit
your own needs. PowerPoint originals are available. If you make use of
a significant portion of these slides in your own lecture, please include
this message, or the following link to the source repository of Andrew’s
tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and
corrections gratefully received.




                 Learning Gaussian
                  Bayes Classifiers
                                   Andrew W. Moore
                                  Associate Professor
                              School of Computer Science
                              Carnegie Mellon University
                                       www.cs.cmu.edu/~awm
                                         awm@cs.cmu.edu
                                           412-268-7599


         Copyright © 2001, Andrew W. Moore                                           Sep 10th, 2001




        Maximum Likelihood learning of
         Gaussians for Classification
• Why we should care
• 3 seconds to teach you a new learning
  algorithm
• What if there are 10,000 dimensions?
• What if there are categorical inputs?
• Examples “out the wazoo”




Copyright © 2001, Andrew W. Moore                                           Gaussian Bayes Classifiers: Slide 2




Why we should care
• One of the original “Data Mining” algorithms
• Very simple and effective
• Demonstrates the usefulness of our earlier
  groundwork








Where we were at the end of the MLE lecture…

                                          Categorical inputs only  Real-valued inputs only  Mixed Real / Cat okay
Classifier (Inputs → predict category)    Joint BC, Naïve BC                                Dec Tree
Density Estimator (Inputs → probability)  Joint DE, Naïve DE       Gauss DE
Regressor (Inputs → predict real no.)

This lecture…

                                          Categorical inputs only  Real-valued inputs only  Mixed Real / Cat okay
Classifier (Inputs → predict category)    Joint BC, Naïve BC       Gauss BC                 Dec Tree
Density Estimator (Inputs → probability)  Joint DE, Naïve DE       Gauss DE
Regressor (Inputs → predict real no.)







Road Map
[Diagram of lecture topics; the connecting arrows are not preserved. Topics shown: Probability, PDFs, Density Estimation, Decision Trees, Gaussians, MLE, Bayes Classifiers, MLE of Gaussians, and (added on the second build of the slide) Gaussian Bayes Classifiers.]





Gaussian Bayes Classifier Assumption
• The i’th record in the database is created using the following algorithm:
1. Generate the output (the “class”) by drawing $y_i \sim \mathrm{Multinomial}(p_1, p_2, \ldots, p_{N_y})$.
2. Generate the inputs from a Gaussian PDF that depends on the value of $y_i$:
   $\mathbf{x}_i \sim N(\boldsymbol{\mu}_{y_i}, \boldsymbol{\Sigma}_{y_i})$.
Test your understanding. Given $N_y$ classes and $m$ input attributes, how many distinct scalar parameters need to be estimated?
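
The two-step generative assumption above can be sketched in a few lines of NumPy. This is only an illustration: the function name `sample_records` and the example priors, means and covariances are hypothetical values chosen for the demo, not taken from the slides.

```python
import numpy as np

def sample_records(priors, means, covs, n_records, seed=0):
    """Draw records from the assumed generative model (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    priors = np.asarray(priors, dtype=float)
    # Step 1: draw each class label y_i from the multinomial over classes.
    y = rng.choice(len(priors), size=n_records, p=priors)
    # Step 2: draw x_i from the Gaussian that belongs to class y_i.
    X = np.stack([rng.multivariate_normal(means[c], covs[c]) for c in y])
    return X, y

# Made-up parameters for two classes and m = 2 input attributes.
priors = [0.7, 0.3]
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]

X, y = sample_records(priors, means, covs, n_records=500)
print(X.shape, y.shape)
```

Each row of `X` is one record's inputs and the matching entry of `y` is its class, so the pair can be fed straight into a fitting routine.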

MLE Gaussian Bayes Classifier
Let $DB_i$ = subset of database $DB$ in which the output class is $y = i$.

\[ p_i^{\mathrm{mle}} = \frac{|DB_i|}{|DB|} \]

$(\boldsymbol{\mu}_i^{\mathrm{mle}}, \boldsymbol{\Sigma}_i^{\mathrm{mle}})$ = MLE Gaussian for $DB_i$:

\[ \boldsymbol{\mu}_i^{\mathrm{mle}} = \frac{1}{|DB_i|} \sum_{\mathbf{x}_k \in DB_i} \mathbf{x}_k \]

\[ \boldsymbol{\Sigma}_i^{\mathrm{mle}} = \frac{1}{|DB_i|} \sum_{\mathbf{x}_k \in DB_i} \left(\mathbf{x}_k - \boldsymbol{\mu}_i^{\mathrm{mle}}\right)\left(\mathbf{x}_k - \boldsymbol{\mu}_i^{\mathrm{mle}}\right)^T \]
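
A minimal sketch of these estimates in NumPy, assuming the inputs arrive as an array `X` (one row per record) with a label array `y`. The function name `fit_gaussian_bc` and the toy data are hypothetical. Note that the covariance divides by |DB_i|, as in the MLE formula, rather than by |DB_i| − 1.

```python
import numpy as np

def fit_gaussian_bc(X, y):
    """Per-class MLE of the prior, mean and covariance (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    priors, means, covs = [], [], []
    for c in classes:
        Xc = X[y == c]                          # DB_i: records whose class is c
        priors.append(len(Xc) / len(X))         # |DB_i| / |DB|
        mu = Xc.mean(axis=0)                    # mean of the records in DB_i
        diff = Xc - mu
        means.append(mu)
        covs.append(diff.T @ diff / len(Xc))    # MLE covariance, divides by |DB_i|
    return classes, np.array(priors), np.array(means), np.array(covs)

# Toy data: four records of class 0 and two of class 1.
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0],
              [5.0, 5.0], [7.0, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1])
classes, priors, means, covs = fit_gaussian_bc(X, y)
print(priors)
print(means)
print(covs)
```

With only two records, class 1 gets a singular covariance (zero variance in the second input), which is a small preview of the overfitting issue raised later in the deck.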







Gaussian Bayes Classification

\[ P(y = i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid y = i)\, P(y = i)}{p(\mathbf{x})} \]

\[ P(y = i \mid \mathbf{x}) = \frac{\dfrac{1}{(2\pi)^{m/2}\, |\boldsymbol{\Sigma}_i|^{1/2}} \exp\left[-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i)\right] p_i}{p(\mathbf{x})} \]

How do we deal with that?
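
One common way to handle the denominator, offered here as an assumption rather than as the answer given in the lecture, is to note that p(x) is the same for every class. It can then be recovered by summing the numerators over all classes, or ignored entirely when only the most probable class is wanted. The sketch below works in log space for numerical stability; the helper names and the example parameters are hypothetical.

```python
import numpy as np

def log_gaussian(x, mu, cov):
    """Log of the multivariate Gaussian density N(x; mu, cov)."""
    m = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)            # log |cov|
    maha = diff @ np.linalg.solve(cov, diff)      # (x - mu)^T cov^{-1} (x - mu)
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + maha)

def posterior(x, priors, means, covs):
    """P(y = i | x) for every class i."""
    log_joint = np.array([
        log_gaussian(x, mu, cov) + np.log(p)      # log of p(x | y = i) P(y = i)
        for p, mu, cov in zip(priors, means, covs)
    ])
    log_joint -= log_joint.max()                  # guard against underflow
    joint = np.exp(log_joint)
    return joint / joint.sum()                    # the sum plays the role of p(x)

# Made-up two-class example with equal priors and unit covariances.
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])
covs = np.array([np.eye(2), np.eye(2)])

post_mid = posterior(np.array([2.0, 2.0]), priors, means, covs)
post_near0 = posterior(np.array([0.0, 0.0]), priors, means, covs)
print(post_mid, post_near0)
```

The point halfway between the two means is assigned equal posterior probability for both classes, while a point at the first mean is assigned almost entirely to class 0.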








Here is a dataset

age  employment        education     edunum  marital                …  job                relation       race         gender  hours_worked  country         wealth
39   State_gov         Bachelors     13      Never_married          …  Adm_clerical       Not_in_family  White        Male    40            United_States   poor
51   Self_emp_not_inc  Bachelors     13      Married                …  Exec_managerial    Husband        White        Male    13            United_States   poor
39   Private           HS_grad       9       Divorced               …  Handlers_cleaners  Not_in_family  White        Male    40            United_States   poor
54   Private           11th          7       Married                …  Handlers_cleaners  Husband        Black        Male    40            United_States   poor
28   Private           Bachelors     13      Married                …  Prof_specialty     Wife           Black        Female  40            Cuba            poor
38   Private           Masters       14      Married                …  Exec_managerial    Wife           White        Female  40            United_States   poor
50   Private           9th           5       Married_spouse_absent  …  Other_service      Not_in_family  Black        Female  16            Jamaica         poor
52   Self_emp_not_inc  HS_grad       9       Married                …  Exec_managerial    Husband        White        Male    45            United_States   rich
31   Private           Masters       14      Never_married          …  Prof_specialty     Not_in_family  White        Female  50            United_States   rich
42   Private           Bachelors     13      Married                …  Exec_managerial    Husband        White        Male    40            United_States   rich
37   Private           Some_college  10      Married                …  Exec_managerial    Husband        Black        Male    80            United_States   rich
30   State_gov         Bachelors     13      Married                …  Prof_specialty     Husband        Asian        Male    40            India           rich
24   Private           Bachelors     13      Never_married          …  Adm_clerical       Own_child      White        Female  30            United_States   poor
33   Private           Assoc_acdm    12      Never_married          …  Sales              Not_in_family  Black        Male    50            United_States   poor
41   Private           Assoc_voc     11      Married                …  Craft_repair       Husband        Asian        Male    40            *MissingValue*  rich
34   Private           7th_8th       4       Married                …  Transport_moving   Husband        Amer_Indian  Male    45            Mexico          poor
26   Self_emp_not_inc  HS_grad       9       Never_married          …  Farming_fishing    Own_child      White        Male    35            United_States   poor
33   Private           HS_grad       9       Never_married          …  Machine_op_inspct  Unmarried      White        Male    40            United_States   poor
38   Private           11th          7       Married                …  Sales              Husband        White        Male    50            United_States   poor
44   Self_emp_not_inc  Masters       14      Divorced               …  Exec_managerial    Unmarried      White        Female  45            United_States   rich
41   Private           Doctorate     16      Married                …  Prof_specialty     Husband        White        Male    60            United_States   rich
:    :                 :             :       :                      :  :                  :              :            :       :             :               :




  48,000 records, 16 attributes [Kohavi 1995]




Predicting wealth from age
[Figures not preserved]

Wealth from hours worked
[Figure not preserved]

Wealth from years of education
[Figure not preserved]




age, hours → wealth
[Figures not preserved]

Having 2 inputs instead of one helps in two ways:
1. Combining evidence from two 1d Gaussians
2. Off-diagonal covariance distinguishes class “shape”
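
The second point can be illustrated with a small made-up example (the numbers below are hypothetical, not from the dataset). Two classes share the same mean and the same per-attribute variances and differ only in the sign of their off-diagonal covariance. With the full covariance the two classes assign very different densities to the same point; once the off-diagonal terms are dropped, they become indistinguishable.

```python
import numpy as np

def log_gaussian(x, mu, cov):
    """Log of the multivariate Gaussian density N(x; mu, cov)."""
    m = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + maha)

mu = np.zeros(2)
cov_a = np.array([[1.0, 0.8], [0.8, 1.0]])     # inputs positively correlated
cov_b = np.array([[1.0, -0.8], [-0.8, 1.0]])   # inputs negatively correlated
x = np.array([1.0, 1.0])                       # a point along the "a" direction

# Full covariance: the class "shape" separates the two classes.
full_a = log_gaussian(x, mu, cov_a)
full_b = log_gaussian(x, mu, cov_b)

# Diagonal only: the off-diagonal information is thrown away.
diag_a = log_gaussian(x, mu, np.diag(np.diag(cov_a)))
diag_b = log_gaussian(x, mu, np.diag(np.diag(cov_b)))

print(full_a - full_b)   # positive: x looks much more like class a
print(diag_a - diag_b)   # zero: the two classes look identical
```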




age, edunum → wealth
[Figures not preserved]

hours, edunum → wealth
[Figures not preserved]

Accuracy
[Figure not preserved]




An “MPG” example
[Figures not preserved]

Things to note:
• Class boundaries can be weird shapes (hyperconic sections)
• Class regions can be non-simply-connected
• But it’s impossible to model arbitrarily weirdly shaped regions
• Test your understanding: With one input, must classes be simply connected?
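
One way to explore the last question numerically is to classify a grid of one-dimensional inputs under two made-up classes that share a mean but have different variances. The priors, means and variances below are hypothetical. The sketch counts how many times the predicted class changes along the axis.

```python
import numpy as np

def log_joint_1d(x, prior, mu, var):
    """log of P(y) * N(x; mu, var) for a single real-valued input."""
    return np.log(prior) - 0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x - mu) ** 2 / var

xs = np.linspace(-10.0, 10.0, 2001)                  # grid with step 0.01
narrow = log_joint_1d(xs, prior=0.5, mu=0.0, var=1.0)
wide = log_joint_1d(xs, prior=0.5, mu=0.0, var=9.0)

# Predicted class at every grid point.
pred = np.where(wide > narrow, "wide", "narrow")

# Number of places along the axis where the prediction switches class.
changes = int(np.sum(pred[1:] != pred[:-1]))
print(changes, pred[0], pred[1000], pred[-1])
```

In this example the prediction switches twice: the "wide" class wins in both tails and the "narrow" class wins in the middle, so the region assigned to "wide" consists of two separate intervals.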




Overfitting dangers
• Problem with “Joint” Bayes classifier:
  The number of parameters is exponential in the number of dimensions.
  This means we just memorize the training data, and can overfit.
• Problemette with Gaussian Bayes classifier:
  The number of parameters is quadratic in the number of dimensions.
  With 10,000 dimensions and only 1,000 datapoints we could overfit.

Question: Any suggested solutions?
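
The gap between exponential and quadratic growth can be made concrete with a little arithmetic. The sketch below counts free parameters per class. It assumes, for the joint classifier, that every categorical input takes the same number of values (`arity`); both function names are hypothetical.

```python
def joint_bc_params_per_class(arity, m):
    """Free parameters of a full joint table over m categorical inputs.

    The table has arity**m cells that must sum to 1, leaving arity**m - 1.
    """
    return arity ** m - 1

def gaussian_bc_params_per_class(m):
    """Free parameters of one Gaussian over m real-valued inputs.

    m mean entries plus m(m+1)/2 distinct entries of a symmetric covariance.
    """
    return m + m * (m + 1) // 2

joint_20_binary = joint_bc_params_per_class(arity=2, m=20)
gauss_10k = gaussian_bc_params_per_class(m=10_000)
print(joint_20_binary)   # 1048575
print(gauss_10k)         # 50015000
```

With 10,000 dimensions a single class already needs about 50 million numbers, far more than 1,000 datapoints can pin down.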




General: $O(m^2)$ parameters

\[ \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1m} & \sigma_{2m} & \cdots & \sigma_m^2 \end{pmatrix} \]




Aligned: $O(m)$ parameters

\[ \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_1^2 & 0 & 0 & \cdots & 0 & 0 \\ 0 & \sigma_2^2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \sigma_3^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_{m-1}^2 & 0 \\ 0 & 0 & 0 & \cdots & 0 & \sigma_m^2 \end{pmatrix} \]




Spherical: $O(1)$ cov parameters

\[ \boldsymbol{\Sigma} = \begin{pmatrix} \sigma^2 & 0 & 0 & \cdots & 0 & 0 \\ 0 & \sigma^2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \sigma^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma^2 & 0 \\ 0 & 0 & 0 & \cdots & 0 & \sigma^2 \end{pmatrix} \]
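
The three covariance structures (general, aligned, spherical) can be compared side by side. The sketch below computes the maximum likelihood covariance under each constraint for a tiny made-up dataset and reports the number of covariance parameters. The function names and the data are hypothetical. The aligned estimate keeps only the diagonal of the general estimate, and the spherical estimate averages that diagonal into a single variance.

```python
import numpy as np

def constrained_covariance(X, kind="general"):
    """MLE covariance of the rows of X under one of three constraints."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    S = diff.T @ diff / len(X)              # general MLE covariance
    if kind == "general":
        return S
    if kind == "aligned":
        return np.diag(np.diag(S))          # keep the per-attribute variances only
    if kind == "spherical":
        m = X.shape[1]
        return np.eye(m) * np.trace(S) / m  # one shared variance for all attributes
    raise ValueError(f"unknown kind: {kind}")

def n_cov_params(m, kind):
    """Number of distinct covariance parameters for m attributes."""
    return {"general": m * (m + 1) // 2, "aligned": m, "spherical": 1}[kind]

X = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 4.0], [2.0, 6.0]])
general = constrained_covariance(X, "general")
aligned = constrained_covariance(X, "aligned")
spherical = constrained_covariance(X, "spherical")

for kind in ("general", "aligned", "spherical"):
    print(kind, n_cov_params(10_000, kind))
```

Fewer parameters mean less risk of overfitting, at the price of ignoring correlations (aligned) or differences in scale between attributes as well (spherical).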




BCs that have both real and categorical inputs?

                                          Categorical inputs only  Real-valued inputs only  Mixed Real / Cat okay
Classifier (Inputs → predict category)    Joint BC, Naïve BC       Gauss BC                 Dec Tree, BC Here???
Density Estimator (Inputs → probability)  Joint DE, Naïve DE       Gauss DE
Regressor (Inputs → predict real no.)

Easy! Guess how?


Copyright © 2001, Andrew W. Moore                            Gaussian Bayes Classifiers: Slide 40




BCs that have both real and categorical inputs?

                                               Categorical     Real-valued     Mixed Real /
                                               inputs only     inputs only     Cat okay

Classifier (Inputs → Predict category)         Joint BC        Gauss BC        Dec Tree
                                               Naïve BC                        Gauss/Joint BC
                                                                               Gauss Naïve BC

Density Estimator (Inputs → Probability)       Joint DE        Gauss DE        Gauss/Joint DE
                                               Naïve DE                        Gauss Naïve DE

Regressor (Inputs → Predict real no.)




Mixed Categorical / Real Density Estimation
• Write x = (u, v) = (u_1, u_2, …, u_q, v_1, v_2, …, v_{m-q}), where u = (u_1, …, u_q) is real valued and v = (v_1, …, v_{m-q}) is categorical valued

  P(x | M) = P(u, v | M)

  (where M is any Density Estimation Model)

Not sure which tasty DE to enjoy? Try our… Joint / Gauss DE Combo

  P(u, v | M) = P(u | v, M) P(v | M)

  P(u | v, M): Gaussian with parameters depending on v
  P(v | M): Big “m-q”-dimensional lookup table




MLE learning of the Joint / Gauss DE Combo

P(u, v | M) = P(u | v, M) P(v | M)

µ_v = Mean of u among records matching v
\[ \mu_v = \frac{1}{R_v} \sum_{k \text{ s.t. } v_k = v} u_k \]

Σ_v = Cov. of u among records matching v
\[ \Sigma_v = \frac{1}{R_v} \sum_{k \text{ s.t. } v_k = v} (u_k - \mu_v)(u_k - \mu_v)^T \]

q_v = Fraction of records that match v
\[ q_v = \frac{R_v}{R} \]

R_v = # records that match v

u | v, M ~ N(µ_v, Σ_v),  P(v | M) = q_v
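The estimates for µ_v, Σ_v and q_v can be sketched in a few lines. This is only an illustration under a loud assumption: there is a single real-valued attribute (q = 1), so Σ_v reduces to a scalar variance. The names `fit_joint_gauss_de` and `joint_gauss_density` and the toy records are hypothetical.

```python
import math
from collections import defaultdict


def fit_joint_gauss_de(records):
    """records: list of (u, v) pairs; u is a float, v a tuple of categorical values.

    Returns {v: (mu_v, var_v, q_v)} using the maximum-likelihood estimates.
    Assumes every observed v has at least two distinct u values (non-zero variance).
    """
    groups = defaultdict(list)
    for u, v in records:
        groups[v].append(u)
    R = len(records)
    model = {}
    for v, us in groups.items():
        R_v = len(us)
        mu_v = sum(us) / R_v
        var_v = sum((u - mu_v) ** 2 for u in us) / R_v  # MLE divides by R_v, not R_v - 1
        model[v] = (mu_v, var_v, R_v / R)
    return model


def joint_gauss_density(model, u, v):
    """p(u, v | M) = N(u; mu_v, var_v) * q_v; zero if v never occurred in training."""
    if v not in model:
        return 0.0
    mu_v, var_v, q_v = model[v]
    gauss = math.exp(-((u - mu_v) ** 2) / (2 * var_v)) / math.sqrt(2 * math.pi * var_v)
    return gauss * q_v


# Toy data (made up): one real attribute, one categorical attribute with values "a"/"b".
records = [(1.0, ("a",)), (3.0, ("a",)), (10.0, ("b",)), (14.0, ("b",))]
model = fit_joint_gauss_de(records)
density = joint_gauss_density(model, 2.0, ("a",))
```

The lookup table is the dictionary keyed by v; each entry carries its own Gaussian parameters, which is exactly what makes the table grow quickly as categorical attributes are added.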




Gender and Hours Worked*

*As with all the results from the UCI “adult census” dataset, we can’t draw any real-world conclusions since it’s such a non-real-world sample




What we just did: Joint / Gauss DE Combo

What we do next: Joint / Gauss BC Combo




Joint / Gauss BC Combo

\[
P(Y = i \mid u, v) = \frac{p(u, v \mid M_i)\, P(Y = i)}{p(u, v)}
= \frac{p(u \mid v, M_i)\, p(v \mid M_i)\, P(Y = i)}{p(u, v)}
= \frac{N(u; \mu_{i,v}, \Sigma_{i,v})\, q_{i,v}\, p_i}{p(u, v)}
\]

µ_{i,v} = Mean of u among records matching v and in which y=i
Σ_{i,v} = Cov. of u among records matching v and in which y=i
q_{i,v} = Fraction of “y=i” records that match v
p_i = Fraction of records that match “y=i”

N(u; µ_{i,v}, Σ_{i,v}) is rather so-so notation for “Gaussian with mean µ_{i,v} and covariance Σ_{i,v} evaluated at u”
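A rough sketch of the Joint / Gauss BC Combo, again assuming a single real-valued attribute so that Σ_{i,v} is a scalar variance. The denominator p(u, v) is obtained by summing the numerators over the classes. The function names and the toy records are hypothetical.

```python
import math
from collections import defaultdict


def fit_joint_gauss_bc(records):
    """records: list of (u, v, y); u a float, v a categorical tuple, y a class label.

    Returns (params, priors) with params[(y, v)] = (mu, var, q) and priors[y] = p_y.
    Assumes every observed (y, v) group has non-zero variance.
    """
    by_class = defaultdict(list)
    for u, v, y in records:
        by_class[y].append((u, v))
    params, priors = {}, {}
    for y, rows in by_class.items():
        priors[y] = len(rows) / len(records)
        groups = defaultdict(list)
        for u, v in rows:
            groups[v].append(u)
        for v, us in groups.items():
            mu = sum(us) / len(us)
            var = sum((u - mu) ** 2 for u in us) / len(us)
            params[(y, v)] = (mu, var, len(us) / len(rows))
    return params, priors


def posterior(params, priors, u, v):
    """P(Y=i | u, v) for every class i.

    Assumes at least one class saw this v in training, otherwise the sum is zero.
    """
    scores = {}
    for y, p_y in priors.items():
        if (y, v) not in params:
            scores[y] = 0.0  # this v never occurred with class y
            continue
        mu, var, q = params[(y, v)]
        gauss = math.exp(-((u - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        scores[y] = gauss * q * p_y
    total = sum(scores.values())  # plays the role of p(u, v)
    return {y: s / total for y, s in scores.items()}


# Toy data (made up): two classes, one categorical value, one real attribute.
records = [
    (1.0, ("a",), "lo"), (3.0, ("a",), "lo"),
    (9.0, ("a",), "hi"), (11.0, ("a",), "hi"),
]
params, priors = fit_joint_gauss_bc(records)
near_lo = posterior(params, priors, 2.0, ("a",))
midway = posterior(params, priors, 6.0, ("a",))
```

In this toy setup a query at u = 6, halfway between the two class means, gets equal posterior mass for both classes, while a query at u = 2 is assigned almost entirely to "lo".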




Gender, Hours→Wealth




Joint / Gauss DE Combo and Joint / Gauss BC Combo: The downside

• (Yawn…we’ve done this before…)
  More than a few categorical attributes blah blah blah massive table blah blah lots of parameters blah blah just memorize training data blah blah blah do worse on future data blah blah need to be more conservative blah
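To put a number on "massive table … lots of parameters", here is a small counting sketch. It assumes a full covariance matrix in every lookup-table cell and counts free parameters only. The function name and the example arities are made up for illustration.

```python
import math


def joint_gauss_de_param_count(q, arities):
    """Free parameters of the Joint / Gauss DE Combo.

    q: number of real-valued attributes.
    arities: number of distinct values of each categorical attribute.
    """
    cells = math.prod(arities)             # one table cell per combination of v
    per_cell = q + q * (q + 1) // 2        # mean vector plus symmetric covariance
    return cells * per_cell + (cells - 1)  # the cell probabilities q_v sum to 1


few = joint_gauss_de_param_count(q=2, arities=[2])                 # 11
many = joint_gauss_de_param_count(q=2, arities=[2, 5, 7, 10, 16])  # 67199
```

With only five categorical attributes the count already runs to tens of thousands, so most cells would be fitted from a handful of records or none at all.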




Naïve/Gauss combo for Density Estimation

\[
p(u, v \mid M) = \underbrace{\Big[ \prod_{j=1}^{q} p(u_j \mid M) \Big]}_{\text{Real}} \; \underbrace{\Big[ \prod_{j=1}^{m-q} P(v_j \mid M) \Big]}_{\text{Categorical}}
\]

\[
u_j \mid M \sim N(\mu_j, \sigma_j^2), \qquad v_j \mid M \sim \mathrm{Multinomial}[q_{j1}, q_{j2}, \ldots, q_{jN_j}]
\]

How many parameters?

\[
\mu_j = \frac{1}{R} \sum_k u_{kj}, \qquad
\sigma_j^2 = \frac{1}{R} \sum_k (u_{kj} - \mu_j)^2, \qquad
q_{jh} = \frac{\#\text{ of records in which } v_j = h}{R}
\]
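The three estimators above translate almost line for line into code. The following is a minimal sketch, assuming the real attributes arrive as rows of floats and the categorical attributes as rows of labels. The names `fit_naive_gauss_de` and `naive_gauss_density` and the toy data are hypothetical.

```python
import math
from collections import Counter


def fit_naive_gauss_de(U, V):
    """U: list of real-valued rows (q floats each); V: list of categorical rows.

    Returns (mus, variances, tables) where tables[j][h] = q_jh.
    Assumes every real attribute has non-zero variance.
    """
    R = len(U)
    q = len(U[0])
    mus = [sum(row[j] for row in U) / R for j in range(q)]
    variances = [sum((row[j] - mus[j]) ** 2 for row in U) / R for j in range(q)]
    tables = []
    for j in range(len(V[0])):
        counts = Counter(row[j] for row in V)
        tables.append({h: c / R for h, c in counts.items()})
    return mus, variances, tables


def naive_gauss_density(model, u, v):
    """p(u, v | M) = prod_j N(u_j; mu_j, sigma_j^2) * prod_j q_j[v_j]."""
    mus, variances, tables = model
    p = 1.0
    for u_j, mu, var in zip(u, mus, variances):
        p *= math.exp(-((u_j - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    for v_j, table in zip(v, tables):
        p *= table.get(v_j, 0.0)  # unseen value gets probability 0 under plain MLE
    return p


# Toy data (made up): one real attribute, one categorical attribute.
U = [[1.0], [3.0], [1.0], [3.0]]
V = [["x"], ["x"], ["y"], ["x"]]
model = fit_naive_gauss_de(U, V)
density = naive_gauss_density(model, [2.0], ["y"])

# Free parameters: 2 per real attribute plus (N_j - 1) per categorical attribute,
# counting only the categorical values actually observed in the data.
n_params = 2 * len(model[0]) + sum(len(t) - 1 for t in model[2])
```

One way to answer "How many parameters?": the count grows as a sum over attributes, 2q plus the sum of (N_j − 1), rather than as a product of the arities.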




Naïve/Gauss DE Example




Naïve / Gauss BC

\[
P(Y = i \mid u, v) = \frac{p(u, v \mid Y = i)\, P(Y = i)}{p(u, v)}
\]
\[
= \frac{1}{p(u, v)} \prod_{j=1}^{q} p(u_j \mid \mu_{ij}, \sigma_{ij}^2) \prod_{j=1}^{m-q} P(v_j \mid q_{ij})\, P(Y = i)
\]
\[
= \frac{1}{p(u, v)} \prod_{j=1}^{q} N(u_j; \mu_{ij}, \sigma_{ij}^2) \prod_{j=1}^{m-q} q_{ij}[v_j]\, p_i
\]

µ_ij   = Mean of u_j among records in which y=i
σ²_ij  = Var. of u_j among records in which y=i
q_ij[h] = Fraction of “y=i” records in which v_j = h
p_i    = Fraction of records that match “y=i”
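A possible sketch of the Naïve / Gauss BC follows. It works in log space, which is an implementation choice added here (not something the slides specify) to avoid underflow when many attributes are multiplied together. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_gauss_naive_bc(U, V, y):
    """Fit per-class means, variances, categorical tables and priors by MLE.

    Assumes every real attribute has non-zero variance within each class.
    """
    rows_by_class = defaultdict(list)
    for u_row, v_row, label in zip(U, V, y):
        rows_by_class[label].append((u_row, v_row))
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        q = len(rows[0][0])
        mus = [sum(u[j] for u, _ in rows) / n for j in range(q)]
        variances = [sum((u[j] - mus[j]) ** 2 for u, _ in rows) / n for j in range(q)]
        tables = [
            {h: c / n for h, c in Counter(v[j] for _, v in rows).items()}
            for j in range(len(rows[0][1]))
        ]
        model[label] = (mus, variances, tables, n / len(y))
    return model


def predict_proba(model, u, v):
    """P(Y=i | u, v); assumes at least one class gives the query non-zero probability."""
    log_scores = {}
    for label, (mus, variances, tables, prior) in model.items():
        s = math.log(prior)
        for u_j, mu, var in zip(u, mus, variances):
            s += -0.5 * math.log(2 * math.pi * var) - (u_j - mu) ** 2 / (2 * var)
        for v_j, table in zip(v, tables):
            p = table.get(v_j, 0.0)
            s += math.log(p) if p > 0 else float("-inf")
        log_scores[label] = s
    top = max(log_scores.values())
    weights = {k: math.exp(s - top) for k, s in log_scores.items()}
    total = sum(weights.values())  # normalising replaces the division by p(u, v)
    return {k: w / total for k, w in weights.items()}


# Toy data (made up): one real attribute, one categorical attribute, two classes.
U = [[1.0], [3.0], [9.0], [11.0]]
V = [["x"], ["x"], ["x"], ["y"]]
y = ["lo", "lo", "hi", "hi"]
model = fit_gauss_naive_bc(U, V, y)
proba_x = predict_proba(model, [6.0], ["x"])
proba_y = predict_proba(model, [6.0], ["y"])
```

At u = 6 the two Gaussians tie, so the categorical attribute decides: "x" is twice as likely under "lo" as under "hi", and "y" was never seen with "lo", which under plain MLE rules that class out entirely.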




Gauss / Naïve BC Example




Learn Wealth from 15 attributes

Learn Wealth from 15 attributes
Same data, except all real values discretized to 3 levels
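The slides do not say how the real values were discretized. As one plausible illustration only, the sketch below assumes equal-frequency (tercile) cut points taken from the sample itself. The function name and the toy values are hypothetical.

```python
def discretize_three_levels(values):
    """Map each real value to level 0, 1 or 2 using tercile cut points of the sample."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[n // 3]          # values below this go to level 0
    high_cut = ordered[(2 * n) // 3]   # values at or above this go to level 2
    return [0 if x < low_cut else 1 if x < high_cut else 2 for x in values]


# Toy data (made up): six real values, two falling into each level.
levels = discretize_three_levels([35.0, 40.0, 20.0, 60.0, 45.0, 10.0])
```

After this step every attribute is categorical, so the purely categorical Joint or Naïve models apply with no Gaussian component.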




Learn Race from 15 attributes




What you should know
• A lot of this should have just been a
  corollary of what you already knew
• Turning Gaussian DEs into Gaussian BCs
• Mixing Categorical and Real-Valued








                      Questions to Ponder
• Suppose you wanted to create an example
  dataset where a BC involving Gaussians
  crushed decision trees like a bug. What
  would you do?
• Could you combine Decision Trees and
  Bayes Classifiers? How? (maybe there is
  more than one possible way)



Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Gaussian Bayes Classifiers

  • 1. Note to other teachers and users of these slides. Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
    Learning Gaussian Bayes Classifiers. Andrew W. Moore, Associate Professor, School of Computer Science, Carnegie Mellon University. www.cs.cmu.edu/~awm, awm@cs.cmu.edu, 412-268-7599. Copyright © 2001, Andrew W. Moore. Sep 10th, 2001.
    Maximum Likelihood learning of Gaussians for Classification: • Why we should care • 3 seconds to teach you a new learning algorithm • What if there are 10,000 dimensions? • What if there are categorical inputs? • Examples “out the wazoo”
  • 2. Why we should care: • One of the original “Data Mining” algorithms • Very simple and effective • Demonstrates the usefulness of our earlier groundwork
    Where we were at the end of the MLE lecture (methods by input type):
    Classifier (inputs → predicted category). Categorical inputs only: Joint BC, Naïve BC. Real-valued inputs only: none yet. Mixed real / categorical okay: Dec Tree.
    Density Estimator (inputs → probability). Categorical inputs only: Joint DE, Naïve DE. Real-valued inputs only: Gauss DE. Mixed real / categorical okay: none yet.
    Regressor (inputs → predicted real no.). No methods yet.
  • 3. This lecture (methods by input type):
    Classifier (inputs → predicted category). Categorical inputs only: Joint BC, Naïve BC. Real-valued inputs only: Gauss BC. Mixed real / categorical okay: Dec Tree.
    Density Estimator (inputs → probability). Categorical inputs only: Joint DE, Naïve DE. Real-valued inputs only: Gauss DE. Mixed real / categorical okay: none yet.
    Regressor (inputs → predicted real no.). No methods yet.
    Road Map (diagram nodes): Probability, PDFs, Gaussians, MLE, MLE of Gaussians, Density Estimation, Bayes Classifiers, Decision Trees.
  • 4. Road Map (diagram nodes): Probability, PDFs, Gaussians, MLE, MLE of Gaussians, Density Estimation, Bayes Classifiers, Decision Trees, and now Gaussian Bayes Classifiers.
    Gaussian Bayes Classifier Assumption: • The i’th record in the database is created using the following algorithm: 1. Generate the output (the “class”) by drawing $y_i \sim \mathrm{Multinomial}(p_1, p_2, \ldots, p_{N_y})$. 2. Generate the inputs from a Gaussian PDF that depends on the value of $y_i$: $x_i \sim N(\mu_i, \Sigma_i)$, where the subscript on $\mu$ and $\Sigma$ refers to the class drawn in step 1.
    Test your understanding. Given $N_y$ classes and m input attributes, how many distinct scalar parameters need to be estimated?
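The generative assumption and the parameter-counting question can be made concrete with a small sketch. This is only an illustration, not part of the original slides: the function names are hypothetical, the sampler is restricted to one real input (m = 1) to stay within the standard library, and the count assumes each class gets its own full symmetric covariance matrix.

```python
import random


def gaussian_bc_param_count(n_classes: int, m: int) -> int:
    """Free scalar parameters with one full-covariance Gaussian per class."""
    priors = n_classes - 1               # p_1..p_Ny sum to 1, so one is determined
    means = n_classes * m                # one m-vector per class
    covs = n_classes * m * (m + 1) // 2  # one symmetric m x m matrix per class
    return priors + means + covs


def sample_record(priors, means, stds, rng=random):
    """Draw one (class, input) record for the 1-D case (an assumed simplification).

    Step 1: draw the class from the multinomial over classes.
    Step 2: draw the input from that class's Gaussian.
    """
    y = rng.choices(range(len(priors)), weights=priors)[0]
    x = rng.gauss(means[y], stds[y])
    return y, x
```

For example, `gaussian_bc_param_count(2, 2)` counts 1 prior, 4 mean entries and 6 covariance entries.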
  • 5. MLE Gaussian Bayes Classifier: Let $DB_i$ = subset of database DB in which the output class is y = i. Then $p_i^{mle} = |DB_i| / |DB|$.
    MLE Gaussian Bayes Classifier (continued): $(\mu_i^{mle}, \Sigma_i^{mle})$ = MLE Gaussian for $DB_i$.
  • 6. MLE Gaussian Bayes Classifier (continued): Let $DB_i$ = subset of database DB in which the output class is y = i. $(\mu_i^{mle}, \Sigma_i^{mle})$ = MLE Gaussian for $DB_i$, that is,
    $\mu_i^{mle} = \frac{1}{|DB_i|} \sum_{x_k \in DB_i} x_k$ and $\Sigma_i^{mle} = \frac{1}{|DB_i|} \sum_{x_k \in DB_i} (x_k - \mu_i^{mle})(x_k - \mu_i^{mle})^T$.
    Gaussian Bayes Classification: $P(y = i \mid x) = \frac{p(x \mid y = i)\, P(y = i)}{p(x)}$
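The three MLE quantities (class prior, class mean, class covariance) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: `fit_gaussian_bc` is a hypothetical name, inputs are lists of equal-length float lists, and the covariance divides by $|DB_i|$ (the MLE), not by $|DB_i| - 1$.

```python
from collections import defaultdict


def fit_gaussian_bc(xs, ys):
    """Return {label: (prior, mean, cov)} using the MLE estimates per class."""
    groups = defaultdict(list)           # DB_i: records whose output class is i
    for x, y in zip(xs, ys):
        groups[y].append(x)

    total = len(xs)
    m = len(xs[0])
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        prior = n / total                # p_i = |DB_i| / |DB|
        mean = [sum(r[j] for r in rows) / n for j in range(m)]
        cov = [
            [sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / n
             for b in range(m)]
            for a in range(m)
        ]
        model[label] = (prior, mean, cov)
    return model
```

A class with a single record gets an all-zero covariance, which cannot be inverted; a practical implementation would need more data per class or some regularization.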
  • 7. Gaussian Bayes Classification: $P(y = i \mid x) = \frac{p(x \mid y = i)\, P(y = i)}{p(x)} = \frac{\frac{1}{(2\pi)^{m/2} |\Sigma_i|^{1/2}} \exp\left[-\frac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right] p_i}{p(x)}$
    How do we deal with that (the denominator p(x))?
    Here is a dataset (columns: age employment education edunum marital … job relation race gender hours_worked country wealth):
    39 State_gov Bachelors 13 Never_married … Adm_clerical Not_in_family White Male 40 United_States poor
    51 Self_emp_not_inc Bachelors 13 Married … Exec_managerial Husband White Male 13 United_States poor
    39 Private HS_grad 9 Divorced … Handlers_cleaners Not_in_family White Male 40 United_States poor
    54 Private 11th 7 Married … Handlers_cleaners Husband Black Male 40 United_States poor
    28 Private Bachelors 13 Married … Prof_specialty Wife Black Female 40 Cuba poor
    38 Private Masters 14 Married … Exec_managerial Wife White Female 40 United_States poor
    50 Private 9th 5 Married_spouse_absent … Other_service Not_in_family Black Female 16 Jamaica poor
    52 Self_emp_not_inc HS_grad 9 Married … Exec_managerial Husband White Male 45 United_States rich
    31 Private Masters 14 Never_married … Prof_specialty Not_in_family White Female 50 United_States rich
    42 Private Bachelors 13 Married … Exec_managerial Husband White Male 40 United_States rich
    37 Private Some_college 10 Married … Exec_managerial Husband Black Male 80 United_States rich
    30 State_gov Bachelors 13 Married … Prof_specialty Husband Asian Male 40 India rich
    24 Private Bachelors 13 Never_married … Adm_clerical Own_child White Female 30 United_States poor
    33 Private Assoc_acdm 12 Never_married … Sales Not_in_family Black Male 50 United_States poor
    41 Private Assoc_voc 11 Married … Craft_repair Husband Asian Male 40 *MissingValue* rich
    34 Private 7th_8th 4 Married … Transport_moving Husband Amer_Indian Male 45 Mexico poor
    26 Self_emp_not_inc HS_grad 9 Never_married … Farming_fishing Own_child White Male 35 United_States poor
    33 Private HS_grad 9 Never_married … Machine_op_inspct Unmarried White Male 40 United_States poor
    38 Private 11th 7 Married … Sales Husband White Male 50 United_States poor
    44 Self_emp_not_inc Masters 14 Divorced … Exec_managerial Unmarried White Female 45 United_States rich
    41 Private Doctorate 16 Married … Prof_specialty Husband White Male 60 United_States rich
    (further rows omitted)
    48,000 records, 16 attributes [Kohavi 1995]
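One standard way to handle the denominator p(x) in the posterior is to note that it is the same for every class and, by the law of total probability, equals the sum of the numerators over all classes. The sketch below shows this for a single real input (an assumed simplification of the m-dimensional formula); the function names and the parameter values in the usage note are made up for illustration and are not fitted to the census data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior(x, params):
    """params: {label: (prior, mean, var)}; returns {label: P(y = label | x)}."""
    # Numerators p(x | y = i) P(y = i), one per class.
    joint = {
        label: prior * gaussian_pdf(x, mean, var)
        for label, (prior, mean, var) in params.items()
    }
    # p(x) = sum_j p(x | y = j) P(y = j), shared by all classes.
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}
```

With hypothetical parameters `{"poor": (0.5, 0.0, 1.0), "rich": (0.5, 2.0, 1.0)}`, the point x = 1.0 sits midway between the two means, so both posteriors are 0.5. For inputs far from every class mean the densities can underflow to zero, so a more careful version would work with log-probabilities.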
  • 8. Predicting wealth from age [figure, shown on two consecutive slides]
  • 9. Wealth from hours worked [figure]
    Wealth from years of education [figure]
  • 10. age, hours → wealth [figure, shown on two consecutive slides]
  • 11. age, hours → wealth. Having 2 inputs instead of one helps in two ways: 1. Combining evidence from two 1d Gaussians 2. Off-diagonal covariance distinguishes class “shape”
  • 12. age, edunum → wealth [figure, shown on two consecutive slides]
  • 13. hours, edunum → wealth [figure, shown on two consecutive slides]
  • 14. Accuracy [figure]
    An “MPG” example [figure]
  • 15. An “MPG” example [figure]. Things to note: • Class boundaries can be weird shapes (hyperconic sections) • Class regions can be non-simply-connected • But it’s impossible to model arbitrarily weirdly shaped regions • Test your understanding: With one input, must classes be simply connected?
  • 16. Overfitting dangers: • Problem with “Joint” Bayes classifier: the number of parameters is exponential in the number of dimensions. This means we just memorize the training data, and can overfit. • Problemette with Gaussian Bayes classifier: the number of parameters is quadratic in the number of dimensions. With 10,000 dimensions and only 1,000 datapoints we could overfit. Question: Any suggested solutions?
  • 17. General: $O(m^2)$ parameters.
    $\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1m} & \sigma_{2m} & \cdots & \sigma_m^2 \end{pmatrix}$
  • 18. Aligned: $O(m)$ parameters.
    $\Sigma = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_m^2 \end{pmatrix}$
  • 19. Spherical: $O(1)$ cov parameters.
    $\Sigma = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix}$
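The three covariance shapes trade flexibility for parameter count. The sketch below counts the free covariance parameters of each shape and shows one way to restrict an already-estimated covariance matrix. The function names and the `kind` strings are hypothetical; the restriction relies on the standard results that the MLE for an axis-aligned Gaussian keeps only the diagonal of the full MLE covariance, and the MLE for a spherical Gaussian averages that diagonal.

```python
def cov_param_count(m: int, kind: str) -> int:
    """Number of free scalars in one m x m covariance matrix of the given shape."""
    if kind == "general":
        return m * (m + 1) // 2   # symmetric matrix: diagonal plus upper triangle
    if kind == "aligned":
        return m                  # one variance per input attribute
    if kind == "spherical":
        return 1                  # a single shared variance
    raise ValueError(f"unknown covariance kind: {kind}")


def constrain_cov(cov, kind: str):
    """Restrict a full covariance (list of lists) to the requested shape."""
    m = len(cov)
    if kind == "general":
        return [row[:] for row in cov]
    diag = [cov[j][j] for j in range(m)]
    if kind == "spherical":
        shared = sum(diag) / m    # average variance across attributes
        diag = [shared] * m
    elif kind != "aligned":
        raise ValueError(f"unknown covariance kind: {kind}")
    return [[diag[a] if a == b else 0.0 for b in range(m)] for a in range(m)]
```

With m = 10,000 the general form has 50,005,000 covariance parameters per class, against 10,000 for the aligned form and 1 for the spherical form.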
  • 20. BCs that have both real and categorical inputs?
    Classifier (inputs → predicted category). Categorical inputs only: Joint BC, Naïve BC. Real-valued inputs only: Gauss BC. Mixed real / categorical okay: Dec Tree, BC Here???
    Density Estimator (inputs → probability). Categorical inputs only: Joint DE, Naïve DE. Real-valued inputs only: Gauss DE.
    Regressor (inputs → predicted real no.). No methods yet.
    Easy! Guess how?
  • 21. BCs that have both real and categorical inputs?
    Classifier (inputs → predicted category). Categorical inputs only: Joint BC, Naïve BC. Real-valued inputs only: Gauss BC. Mixed real / categorical okay: Dec Tree, Gauss/Joint BC, Gauss Naïve BC.
    Density Estimator (inputs → probability). Categorical inputs only: Joint DE, Naïve DE. Real-valued inputs only: Gauss DE. Mixed real / categorical okay: Gauss/Joint DE, Gauss Naïve DE.
    Regressor (inputs → predicted real no.). No methods yet.
  • 22. Mixed Categorical / Real Density Estimation: • Write $x = (u, v) = (u_1, u_2, \ldots, u_q, v_1, v_2, \ldots, v_{m-q})$, where u is real valued and v is categorical valued. $P(x \mid M) = P(u, v \mid M)$ (where M is any Density Estimation Model).
    Joint / Gauss DE Combo (“Not sure which tasty DE to enjoy? Try our … Combo”): $P(u, v \mid M) = P(u \mid v, M)\, P(v \mid M)$, where $P(u \mid v, M)$ is a Gaussian with parameters depending on v, and $P(v \mid M)$ is a big “m-q”-dimensional lookup table.
  • 23. MLE learning of the Joint / Gauss DE Combo: $P(u, v \mid M) = P(u \mid v, M)\, P(v \mid M)$, with $u \mid v, M \sim N(\mu_v, \Sigma_v)$ and $P(v \mid M) = q_v$.
    $\mu_v$ = Mean of u among records matching v $= \frac{1}{R_v} \sum_{k\ \mathrm{s.t.}\ v_k = v} u_k$
    $\Sigma_v$ = Cov. of u among records matching v $= \frac{1}{R_v} \sum_{k\ \mathrm{s.t.}\ v_k = v} (u_k - \mu_v)(u_k - \mu_v)^T$
    $q_v$ = Fraction of records that match v $= R_v / R$
    $R_v$ = # records that match v
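The MLE recipe for the Joint / Gauss combo amounts to grouping records by their categorical part v and fitting one Gaussian per group. The sketch below does this for a single real input u (an assumed simplification, so the covariance reduces to a variance); the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_joint_gauss_de(us, vs):
    """us: list of floats, vs: list of categorical tuples.

    Returns {v: (q_v, mu_v, var_v)} with MLE (divide-by-R_v) estimates.
    """
    groups = defaultdict(list)
    for u, v in zip(us, vs):
        groups[tuple(v)].append(u)

    total = len(us)                       # R
    model = {}
    for v, values in groups.items():
        r_v = len(values)                 # R_v: records that match v
        mu = sum(values) / r_v
        var = sum((u - mu) ** 2 for u in values) / r_v
        model[v] = (r_v / total, mu, var)
    return model


def density(model, u, v):
    """Evaluate P(u, v | M) = N(u; mu_v, var_v) * q_v."""
    entry = model.get(tuple(v))
    if entry is None:
        return 0.0                        # unseen combination: MLE probability 0
    q_v, mu, var = entry
    return q_v * math.exp(-0.5 * (u - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
```

A group with a single record, or with identical u values, has zero variance and makes `density` divide by zero; this is the small-sample face of the lookup-table problem when there are many categorical combinations.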
  • 24. Gender and Hours Worked* [figure]
    *As with all the results from the UCI “adult census” dataset, we can’t draw any real-world conclusions since it’s such a non-real-world sample.
    Joint / Gauss DE Combo: what we just did.
  • 25. Joint / Gauss BC Combo: what we do next.
    Joint / Gauss BC Combo: $P(Y = i \mid u, v) = \frac{p(u, v \mid M_i)\, P(Y = i)}{p(u, v)} = \frac{p(u \mid v, M_i)\, P(v \mid M_i)\, P(Y = i)}{p(u, v)} = \frac{N(u; \mu_{i,v}, \Sigma_{i,v})\, q_{i,v}\, p_i}{p(u, v)}$
  • 26. Joint / Gauss BC Combo: $P(Y = i \mid u, v) = \frac{p(u, v \mid M_i)\, P(Y = i)}{p(u, v)} = \frac{N(u; \mu_{i,v}, \Sigma_{i,v})\, q_{i,v}\, p_i}{p(u, v)}$, where
    $\mu_{i,v}$ = Mean of u among records matching v and in which y=i
    $\Sigma_{i,v}$ = Cov. of u among records matching v and in which y=i
    $q_{i,v}$ = Fraction of “y=i” records that match v
    $p_i$ = Fraction of records that match “y=i”
    $N(u; \mu_{i,v}, \Sigma_{i,v})$ is rather so-so notation for “Gaussian with mean $\mu_{i,v}$ and covariance $\Sigma_{i,v}$ evaluated at u”.
    Gender, Hours→Wealth [figure]
  • 27. Gender, Hours→Wealth [figure]
    Joint / Gauss DE Combo and Joint / Gauss BC Combo: The downside • (Yawn…we’ve done this before…) More than a few categorical attributes blah blah blah massive table blah blah lots of parameters blah blah just memorize training data blah blah blah do worse on future data blah blah need to be more conservative blah
  • 28. Naïve/Gauss combo for Density Estimation: $p(u, v \mid M) = \left[\prod_{j=1}^{q} p(u_j \mid M)\right] \left[\prod_{j=1}^{m-q} P(v_j \mid M)\right]$, where the first product covers the real attributes and the second the categorical attributes, with $u_j \mid M \sim N(\mu_j, \sigma_j^2)$ and $v_j \mid M \sim \mathrm{Multinomial}[q_{j1}, q_{j2}, \ldots, q_{jN_j}]$. How many parameters?
    MLE estimates: $\mu_j = \frac{1}{R} \sum_k u_{kj}$, $\sigma_j^2 = \frac{1}{R} \sum_k (u_{kj} - \mu_j)^2$, $q_{jh} = \frac{\#\text{ of records in which } v_j = h}{R}$
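Under the naïve factorization every attribute is fitted on its own, which keeps the parameter count linear in the number of attributes. The sketch below is an illustration with hypothetical names: it fits the per-attribute estimates and counts free parameters as two per real attribute plus $N_j - 1$ per categorical attribute (the last multinomial probability is fixed by the sum-to-one constraint).

```python
from collections import Counter


def naive_gauss_param_count(q: int, arities) -> int:
    """q real attributes; arities[j] = N_j, the number of values of categorical attribute j."""
    return 2 * q + sum(n - 1 for n in arities)


def fit_naive_gauss_de(us, vs):
    """us: list of length-q float lists, vs: list of categorical tuples.

    Returns (means, variances, tables), where tables[j][h] estimates q_jh.
    """
    r = len(us)                                       # R, the number of records
    q = len(us[0])
    means = [sum(u[j] for u in us) / r for j in range(q)]
    variances = [sum((u[j] - means[j]) ** 2 for u in us) / r for j in range(q)]
    tables = []
    for j in range(len(vs[0])):
        counts = Counter(v[j] for v in vs)            # records in which v_j = h
        tables.append({h: n / r for h, n in counts.items()})
    return means, variances, tables
```

For instance, two real attributes plus categorical attributes with 3 and 4 values need 4 + 2 + 3 = 9 parameters, however the attributes interact in the data.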
  • 29. Naïve/Gauss DE Example [figure, shown on two consecutive slides]
  • 30. Naïve / Gauss BC: $P(Y = i \mid u, v) = \frac{p(u, v \mid Y = i)\, P(Y = i)}{p(u, v)} = \frac{1}{p(u, v)} \prod_{j=1}^{q} p(u_j \mid \mu_{ij}, \sigma_{ij}^2) \prod_{j=1}^{m-q} P(v_j \mid q_{ij})\, P(Y = i) = \frac{1}{p(u, v)} \prod_{j=1}^{q} N(u_j; \mu_{ij}, \sigma_{ij}^2) \prod_{j=1}^{m-q} q_{ij}[v_j]\, p_i$
    $\mu_{ij}$ = Mean of $u_j$ among records in which y=i
    $\sigma_{ij}^2$ = Var. of $u_j$ among records in which y=i
    $q_{ij}[h]$ = Fraction of “y=i” records in which $v_j = h$
    $p_i$ = Fraction of records that match “y=i”
    Gauss / Naïve BC Example [figure]
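Putting the Naïve / Gauss BC formula together, a classifier multiplies the class prior, one 1-D Gaussian per real attribute and one table lookup per categorical attribute, then normalizes across classes. The following is a minimal sketch with hypothetical function names and made-up toy data in the usage note; it assumes every class has nonzero variance in each real attribute and that at least one class assigns the query a nonzero score.

```python
import math
from collections import Counter, defaultdict


def fit_naive_gauss_bc(us, vs, ys):
    """Return {class: (p_i, means, variances, tables)} from MLE estimates."""
    by_class = defaultdict(list)
    for u, v, y in zip(us, vs, ys):
        by_class[y].append((u, v))

    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        q = len(rows[0][0])
        means = [sum(u[j] for u, _ in rows) / n for j in range(q)]
        variances = [sum((u[j] - means[j]) ** 2 for u, _ in rows) / n for j in range(q)]
        tables = [
            {h: k / n for h, k in Counter(v[j] for _, v in rows).items()}
            for j in range(len(rows[0][1]))
        ]
        model[y] = (n / len(us), means, variances, tables)
    return model


def posterior(model, u, v):
    """Return {class: P(Y = class | u, v)} under the naive factorization."""
    joint = {}
    for y, (p_i, means, variances, tables) in model.items():
        score = p_i
        for u_j, mu, var in zip(u, means, variances):
            score *= math.exp(-0.5 * (u_j - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        for v_j, table in zip(v, tables):
            score *= table.get(v_j, 0.0)   # value never seen with this class: MLE gives 0
        joint[y] = score
    evidence = sum(joint.values())         # p(u, v), shared by all classes
    return {y: s / evidence for y, s in joint.items()}
```

As a toy usage example, fitting on `us = [[30.0], [40.0], [50.0], [60.0]]`, `vs = [("F",), ("M",), ("M",), ("M",)]`, `ys = ["poor", "poor", "rich", "rich"]` and querying `posterior(model, [45.0], ("M",))` weighs two equally distant Gaussians, so the categorical table decides the result.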
  • 31. Gauss / Naïve BC Example [figure]
    Learn Wealth from 15 attributes [figure]
  • 32. Learn Wealth from 15 attributes: same data, except all real values discretized to 3 levels [figure]
    Learn Race from 15 attributes [figure]
  • 33. What you should know: • A lot of this should have just been a corollary of what you already knew • Turning Gaussian DEs into Gaussian BCs • Mixing Categorical and Real-Valued
    Questions to Ponder: • Suppose you wanted to create an example dataset where a BC involving Gaussians crushed decision trees like a bug. What would you do? • Could you combine Decision Trees and Bayes Classifiers? How? (maybe there is more than one possible way)