Textbook slides modified by Prof M.Shashi as per the AU syllabus
Data Generation Process
 The process that generated the data is not completely known and hence is modelled as a random process.
 The outcome of a random process is modelled as a
random variable.
 Based on the available information or features, the value of a random variable is not predictable with certainty and hence is non-deterministic.
 Probability theory deals with the study and analysis of such random processes.
Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)
Probability and Inference
 Result of tossing a coin is ∈ {Heads,Tails}
 Random var X ∈{1,0}
 po denotes the probability of heads, P(X = 1)
 This implies P(X = 0) = 1 − po
 X is Bernoulli distributed and its probability is expressed as
P(X = x) = po^x (1 − po)^(1−x)
 Data sample: X = {x^t}, t = 1, …, N
Estimation: po = #{Heads} / #{Tosses} = Σt x^t / N
 Prediction of next toss:
Infer Heads if po > ½, Tails otherwise
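A minimal Python sketch of this maximum-likelihood estimate and prediction rule (the toss sample below is hypothetical):

```python
# Maximum-likelihood estimate of the Bernoulli parameter po from a
# sample of coin tosses (1 = heads, 0 = tails), plus the resulting
# prediction rule for the next toss.
def estimate_po(sample):
    return sum(sample) / len(sample)   # po = #{Heads} / #{Tosses}

def predict(po):
    return "Heads" if po > 0.5 else "Tails"

sample = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # hypothetical tosses
po = estimate_po(sample)                 # 0.7
```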
Classification
 The values of observable input variables are the basis for
prediction
 Credit scoring: Inputs are income and savings.
Output is low-risk vs high-risk
 Input: x = [x1, x2]^T, Output: C ∈ {0,1}
 Prediction:
choose C = 1 if P(C=1 | x1, x2) > 0.5, choose C = 0 otherwise
or equivalently
choose C = 1 if P(C=1 | x1, x2) > P(C=0 | x1, x2), choose C = 0 otherwise
 Error = 1 − max{ P(C=1 | x1, x2), P(C=0 | x1, x2) }
Bayes’ Rule
P(C | x) = p(x | C) P(C) / p(x)

posterior = likelihood of x in C × prior / probability of the evidence x, irrespective of C

For binary classification:
p(x) = p(x | C=1) P(C=1) + p(x | C=0) P(C=0)
P(C=0 | x) + P(C=1 | x) = 1
Bayes’ Rule: K>2 Classes
P(Ci | x) = p(x | Ci) P(Ci) / p(x) = p(x | Ci) P(Ci) / Σk=1..K p(x | Ck) P(Ck)

with P(Ci) ≥ 0 and Σi=1..K P(Ci) = 1

choose Ci if P(Ci | x) = maxk P(Ck | x)
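The normalization above can be sketched in Python; the likelihoods and priors below are hypothetical numbers for K = 3 classes:

```python
# Bayes' rule for K classes: the posterior P(Ci|x) is the class
# likelihood p(x|Ci) times the prior P(Ci), normalized by the
# evidence p(x) = sum_k p(x|Ck) P(Ck).
def posteriors(likelihoods, priors):
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

lik = [0.5, 0.2, 0.1]   # hypothetical p(x|Ck) at some input x
pri = [0.2, 0.6, 0.2]   # hypothetical priors P(Ck), summing to 1
post = posteriors(lik, pri)
best = max(range(len(post)), key=lambda i: post[i])  # argmax class
```

By construction the posteriors sum to 1, so the argmax is well defined.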
Decision Making considering
Losses and Risks
 Loss incurred by False Positives and False Negatives
may not be equal in domains like finance, health and
disaster management.
 Action to assign an input to Ci: αi
 Loss of αi when the true state is Ck : λik
 Expected risk in taking the action αi is
R(αi | x) = Σk=1..K λik P(Ck | x)

choose αi if R(αi | x) = mink R(αk | x)
Losses and Risks: 0/1 Loss Case
λik = 0 if i = k, 1 if i ≠ k

R(αi | x) = Σk≠i λik P(Ck | x) = Σk≠i P(Ck | x) = 1 − P(Ci | x)
All errors are equally costly, so for minimum risk, choose the most probable class.
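A small numeric check of the 0/1-loss case, with hypothetical posteriors; the minimum-risk action coincides with the most probable class:

```python
# Expected risk R(ai|x) = sum_k lambda_ik P(Ck|x). With the 0/1
# loss matrix the risks reduce to 1 - P(Ci|x), so the minimum-risk
# action is the most probable class.
def risk(i, lam, post):
    return sum(lam[i][k] * post[k] for k in range(len(post)))

post = [0.5, 0.3, 0.2]   # hypothetical posteriors P(Ck|x)
lam01 = [[0 if i == k else 1 for k in range(3)] for i in range(3)]
risks = [risk(i, lam01, post) for i in range(3)]
best = min(range(3), key=lambda i: risks[i])  # class 0 = argmax posterior
```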
Losses and Risks: Reject as CK+1
(used when misclassification is costlier than manual handling,
e.g., sorting mail with an optical digit recognizer)
λik = 0 if i = k, λ if i = K+1, 1 otherwise, with 0 < λ < 1

R(αK+1 | x) = Σk=1..K λ P(Ck | x) = λ
R(αi | x) = Σk≠i P(Ck | x) = 1 − P(Ci | x)

choose Ci if P(Ci | x) > P(Ck | x) for all k ≠ i and P(Ci | x) > 1 − λ
reject otherwise
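The reject rule can be sketched as follows; the posteriors and λ values are hypothetical:

```python
# Decision with a reject option under 0/1/lambda loss: pick the most
# probable class only when its posterior exceeds 1 - lam; otherwise
# rejecting (risk lam) is the cheaper action.
def decide(post, lam):
    i = max(range(len(post)), key=lambda k: post[k])
    return i if post[i] > 1 - lam else "reject"

post = [0.6, 0.3, 0.1]          # hypothetical posteriors
confident = decide(post, 0.5)   # 0.6 > 1 - 0.5, choose class 0
cautious = decide(post, 0.3)    # 0.6 <= 1 - 0.3, reject
```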
Classification using Discriminant
Functions
choose Ci if gi(x) = maxk gk(x)

Ri = { x | gi(x) = maxk gk(x) }

gi(x) can be defined as −R(αi | x), or P(Ci | x), or p(x | Ci) P(Ci)
The discriminant functions divide the feature space into
K decision regions R1, ..., RK
K = 2 Classes
 Dichotomizer (K=2) vs Polychotomizer (K>2)
A single discriminant function is often used for 2-class classification
 g(x) = g1(x) – g2(x)
 Decision rule: choose C1 if g(x) > 0, C2 otherwise
 Log odds: g(x) = log [ P(C1 | x) / P(C2 | x) ]
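A sketch of a dichotomizer using the log odds as the single discriminant; `p_c1` stands in for P(C1|x) and is assumed given:

```python
import math

# Dichotomizer: single discriminant g(x) = log[P(C1|x) / P(C2|x)];
# choose C1 when g(x) > 0. In the binary case P(C2|x) = 1 - p_c1.
def g(p_c1):
    return math.log(p_c1 / (1 - p_c1))

def choose(p_c1):
    return "C1" if g(p_c1) > 0 else "C2"
```

g is positive exactly when p_c1 > 0.5, so this reproduces the most-probable-class rule.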
Utility Theory for making Rational
Decisions under uncertainty
 Probability of state k given evidence x: P(Sk | x)
 Utility of action αi when state is k: Uik
 Expected utility:
 Maximizing expected utility is equivalent to
minimizing expected risk.
EU(αi | x) = Σk Uik P(Sk | x)

Choose αi if EU(αi | x) = maxj EU(αj | x)
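A sketch of expected-utility maximization; the utility matrix and posteriors below are hypothetical:

```python
# Expected utility EU(ai|x) = sum_k U_ik P(Sk|x); the rational
# choice is the action that maximizes it.
def expected_utility(U, post):
    return [sum(U[i][k] * post[k] for k in range(len(post)))
            for i in range(len(U))]

U = [[10, -5],   # utility of action a0 in states S0, S1
     [0, 2]]     # utility of action a1
post = [0.4, 0.6]                # P(Sk|x)
eu = expected_utility(U, post)   # [1.0, 1.2]
best = max(range(len(eu)), key=lambda i: eu[i])   # a1
```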
Value of Information
• Observable features like blood tests, MRI scans, etc. are
costly and should not be asked for unless they are needed
for diagnosis.
• Value of information has to be assessed in such
domains.
• Observed features: x; newly added features: z
• The expected utility of the best action before and after
adding z is given by
• The value of the information given by z is EU(x, z) − EU(x);
z is useful only if this difference is greater than 0.
EU(x) = maxj Σk Ujk P(Sk | x)
EU(x, z) = maxj Σk Ujk P(Sk | x, z)
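The comparison can be sketched as follows, under a hypothetical utility matrix where observing z sharpens the posteriors:

```python
# Value of information of a new feature z: compare the expected
# utility of the best action before (P(Sk|x)) and after (P(Sk|x,z))
# observing z. All numbers here are hypothetical.
def best_eu(U, post):
    return max(sum(U[i][k] * post[k] for k in range(len(post)))
               for i in range(len(U)))

U = [[1, -1],   # act: gain 1 in state S0, lose 1 in state S1
     [0, 0]]    # do nothing
post_x = [0.5, 0.5]    # given x alone, acting is a coin flip: EU = 0
post_xz = [0.9, 0.1]   # after also observing z
value = best_eu(U, post_xz) - best_eu(U, post_x)   # > 0, z is useful
```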
Bayesian Belief Network
Bayesian Networks
 A directed acyclic graphical (DAG) model that represents the
interactions among random variables: nodes denote the variables
and directed edges denote direct dependencies between them.
 The nodes in the DAG structure have conditional probabilities
as parameters to be learned based on a set of known
examples or through domain knowledge.
 Bayesian networks represent conditional independences between
certain nodes, which helps break the problem of finding the joint
distribution of many variables into local structures:
P(X1, …, Xd) = ∏i P(Xi | parents(Xi))
 Accordingly, for the Bayesian network shown in the diagram,
P(C,S,R,W,F) = P(C) P(S|C) P(R|C) P(W|S,R) P(F|R)
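The factorization can be turned into a small sketch; every conditional probability table below is a hypothetical illustration (the slide's diagram is not reproduced here):

```python
# Joint probability from the factorization
# P(C,S,R,W,F) = P(C) P(S|C) P(R|C) P(W|S,R) P(F|R).
p_c = 0.5                                   # P(C=1)
p_s_given_c = {1: 0.1, 0: 0.5}              # P(S=1 | C)
p_r_given_c = {1: 0.8, 0: 0.1}              # P(R=1 | C)
p_w_given_sr = {(1, 1): 0.95, (1, 0): 0.9,  # P(W=1 | S, R)
                (0, 1): 0.9,  (0, 0): 0.1}
p_f_given_r = {1: 0.7, 0: 0.05}             # P(F=1 | R)

def bern(p1, v):
    """P(X=v) for a binary variable with P(X=1) = p1."""
    return p1 if v == 1 else 1 - p1

def joint(c, s, r, w, f):
    return (bern(p_c, c)
            * bern(p_s_given_c[c], s)
            * bern(p_r_given_c[c], r)
            * bern(p_w_given_sr[(s, r)], w)
            * bern(p_f_given_r[r], f))
```

Summing `joint` over all 32 assignments gives 1, confirming the factorization defines a valid distribution.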
Bayesian Networks contd…
 In Bayesian networks the input and output variables are not
explicitly designated. Based on the available evidence, belief
propagates through the network to infer the probabilities of the
other variables.
 Hidden variables may also be represented by some of the
nodes and their conditional probabilities are estimated based
on the values of their parents representing related observed
variables.
 Handles both numeric and categorical variables
 The structure should be created by a human expert after
identifying the causal relationships among the variables and
the local hierarchies.
Influence Diagrams: Graphical models that generalize
Bayesian Networks for Decision Making
 An Influence Diagram contains
 chance nodes representing the random
variables of a BN,
 decision nodes representing the choice of
action/classification, and
 a utility node for utility estimation.
Bayesian Network (BN)
for Classification
Association Rules
 Association rule: X → Y
 People who buy X are also likely to buy Y.
 A rule implies association, not necessarily causation.
 To find such associations, the frequent itemsets must first be
found in the transaction database.
 The number of transactions that cover an itemset is referred
to as its support.
 An itemset is considered frequent if its support meets a
minimum support threshold.
Apriori Property
 All subsets of a frequent itemset are frequent. Hence, if a
set is found to be infrequent, none of its supersets can be
frequent, and they can all be pruned.
 For (X,Y,Z), a 3-item set, to be frequent (have enough
support), (X,Y), (X,Z), and (Y,Z) should be frequent.
 If (X,Y) is not frequent, none of its supersets can be
frequent.
 Once we find the frequent k-item sets, we convert them
to rules: X, Y → Z, ...
and X → Y, Z, ...
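A toy sketch of the level-wise search with Apriori pruning; the transactions and support threshold are hypothetical:

```python
from itertools import combinations

# Level-wise frequent itemset mining with Apriori pruning on a toy
# transaction database; support is a raw transaction count here.
transactions = [{"X", "Y", "Z"}, {"X", "Y"}, {"X", "Z"},
                {"Y", "Z"}, {"X", "Y", "Z"}]
minsup = 3  # minimum support count

def support(itemset):
    return sum(itemset <= t for t in transactions)

items = sorted({i for t in transactions for i in t})
frequent = {frozenset([i]) for i in items if support({i}) >= minsup}
level, k = set(frequent), 2
while level:
    # join step: build k-item candidates from frequent (k-1)-item sets
    candidates = {a | b for a in level for b in level if len(a | b) == k}
    # prune step (Apriori property): every (k-1)-subset must be frequent
    candidates = {c for c in candidates
                  if all(frozenset(s) in frequent
                         for s in combinations(c, k - 1))}
    level = {c for c in candidates if support(c) >= minsup}
    frequent |= level
    k += 1
```

On this data all singletons and pairs are frequent, while {X, Y, Z} (support 2) is pruned.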
Association measures
 Support (X → Y): P(X, Y) = #{transactions covering X and Y} / #{all transactions}
 Confidence (X → Y): P(Y | X) = P(X, Y) / P(X) = #{transactions covering X and Y} / #{transactions covering X}
 Lift (X → Y): P(X, Y) / (P(X) P(Y)) = P(Y | X) / P(Y)
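The three measures computed on a hypothetical transaction list:

```python
# Support, confidence and lift for the rule X -> Y.
transactions = [{"X", "Y"}, {"X", "Y"}, {"X"}, {"Y"}, {"X", "Y", "Z"}]
N = len(transactions)

def p(*items):
    """Fraction of transactions covering all the given items."""
    return sum(set(items) <= t for t in transactions) / N

support = p("X", "Y")                    # P(X, Y)
confidence = p("X", "Y") / p("X")        # P(Y | X)
lift = p("X", "Y") / (p("X") * p("Y"))   # P(Y|X) / P(Y)
```

Here a lift below 1 indicates X and Y co-occur slightly less often than independence would predict.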
Conclusion
 Discussed the formalism for optimal decision making
under uncertainty
 The concepts of probability theory are useful for modelling
uncertainty, and the utility of making a choice or decision is
estimated accordingly.
 The next chapters focus on how to estimate these
probabilities from a given dataset. They are categorised
as:
 Parametric approaches
 Semiparametric and nonparametric approaches