Constructing Decision Trees
A Decision Tree Example
The weather data example.
ID code   Outlook    Temperature   Humidity   Windy   Play
a         Sunny      Hot           High       False   No
b         Sunny      Hot           High       True    No
c         Overcast   Hot           High       False   Yes
d         Rainy      Mild          High       False   Yes
e         Rainy      Cool          Normal     False   Yes
f         Rainy      Cool          Normal     True    No
g         Overcast   Cool          Normal     True    Yes
h         Sunny      Mild          High       False   No
i         Sunny      Cool          Normal     False   Yes
j         Rainy      Mild          Normal     False   Yes
k         Sunny      Mild          Normal     True    Yes
l         Overcast   Mild          High       True    Yes
m         Overcast   Hot           Normal     False   Yes
n         Rainy      Mild          High       True    No
~continues
[Figure: Decision tree for the weather data. Outlook is at the root: the sunny branch leads to a Humidity test (high → no, normal → yes), the overcast branch leads to yes, and the rainy branch leads to a Windy test (false → yes, true → no).]
The Process of Constructing a
Decision Tree
• Select an attribute to place at the root of the
decision tree and make one branch for every
possible value.
• Repeat the process recursively for each
branch.
Which Attribute Should Be Placed
at a Certain Node
• One common approach is based on the
information gained by placing a certain
attribute at this node.
Information Gained by Knowing
the Result of a Decision
• In the weather data example, there are 9 instances for which the decision to play is "yes" and 5 instances for which it is "no". The information gained by knowing the result of the decision is therefore
$$-\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940 \text{ bits.}$$
The General Form for Calculating
the Information Gain
• Entropy of a decision = $-P_1 \log_2 P_1 - P_2 \log_2 P_2 - \cdots - P_n \log_2 P_n$, where $P_1, P_2, \ldots, P_n$ are the probabilities of the $n$ possible outcomes.
Information Further Required If
“Outlook” Is Placed at the Root
[Figure: Outlook placed at the root. The sunny branch contains 2 "yes" and 3 "no" instances, the overcast branch 4 "yes", and the rainy branch 3 "yes" and 2 "no".]

$$\text{Information further required} = \frac{5}{14}\times 0.971 + \frac{4}{14}\times 0 + \frac{5}{14}\times 0.971 = 0.693 \text{ bits.}$$
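The weighted sum above can be checked with a short sketch that reuses the `entropy` helper from the previous example; the branch counts come from the figure, while the code structure itself is an illustrative assumption.

```python
# Outcome counts (yes, no) at each Outlook branch.
branches = {"sunny": (2, 3), "overcast": (4, 0), "rainy": (3, 2)}

total = sum(sum(counts) for counts in branches.values())          # 14 instances
info_required = sum(
    sum(counts) / total * entropy(counts) for counts in branches.values()
)
print(info_required)                                              # ~0.69 bits
```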
Information Gained by Placing
Each of the 4 Attributes
• Gain(outlook) = 0.940 bits – 0.693 bits
= 0.247 bits.
• Gain(temperature) = 0.029 bits.
• Gain(humidity) = 0.152 bits.
• Gain(windy) = 0.048 bits.
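For completeness, here is a self-contained sketch that recomputes all four gains directly from the weather table; the helpers `class_entropy` and `gain` are illustrative names rather than anything from the slides.

```python
from collections import Counter
from math import log2

# The 14 weather instances: (Outlook, Temperature, Humidity, Windy, Play).
data = [
    ("Sunny", "Hot", "High", False, "No"),       ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),   ("Rainy", "Mild", "High", False, "Yes"),
    ("Rainy", "Cool", "Normal", False, "Yes"),   ("Rainy", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"), ("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),   ("Rainy", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"), ("Rainy", "Mild", "High", True, "No"),
]

def class_entropy(rows):
    """Entropy (bits) of the Play labels in the given rows."""
    counts = Counter(row[-1] for row in rows)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def gain(rows, attr_index):
    """Information gain of splitting the rows on the attribute at attr_index."""
    remainder = 0.0
    for value in {row[attr_index] for row in rows}:
        subset = [row for row in rows if row[attr_index] == value]
        remainder += len(subset) / len(rows) * class_entropy(subset)
    return class_entropy(rows) - remainder

for name, index in [("outlook", 0), ("temperature", 1), ("humidity", 2), ("windy", 3)]:
    print(name, round(gain(data, index), 3))   # 0.247, 0.029, 0.152, 0.048
```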
The Strategy for Selecting an
Attribute to Place at a Node
• Select the attribute that gives us the largest
information gain.
• In this example, it is the attribute “Outlook”.
[Figure: Outlook at the root. The sunny branch contains 2 "yes" and 3 "no", the overcast branch 4 "yes", and the rainy branch 3 "yes" and 2 "no".]
The Recursive Procedure for
Constructing a Decision Tree
• The operation discussed above is applied to each
branch recursively to construct the decision tree.
• For example, for the branch “Outlook = Sunny”,
we evaluate the information gained by applying
each of the remaining 3 attributes.
• Gain(Outlook=sunny;Temperature) = 0.971 – 0.4 =
0.571
• Gain(Outlook=sunny;Humidity) = 0.971 – 0 = 0.971
• Gain(Outlook=sunny;Windy) = 0.971 – 0.951 = 0.02
• Similarly, we also evaluate the information
gained by applying each of the remaining 3
attributes for the branch “Outlook = rainy”.
• Gain(Outlook=rainy;Temperature) = 0.971 –
0.951 = 0.02
• Gain(Outlook=rainy;Humidity) = 0.971 – 0.951
= 0.02
• Gain(Outlook=rainy;Windy) =0.971 – 0 =
0.971
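The recursion itself can be sketched as follows; this is a hedged illustration that reuses `data` and `gain` from the previous sketch, introduces a hypothetical `build_tree` function returning nested dictionaries, and ignores the pruning issues discussed next.

```python
from collections import Counter

def build_tree(rows, attr_indices):
    """Recursively place the best attribute at each node (ID3-style sketch)."""
    labels = [row[-1] for row in rows]
    # Stop when the node is pure or no attributes are left; predict the majority label.
    if len(set(labels)) == 1 or not attr_indices:
        return Counter(labels).most_common(1)[0][0]
    # Place the attribute with the largest information gain at this node.
    best = max(attr_indices, key=lambda i: gain(rows, i))
    remaining = [i for i in attr_indices if i != best]
    return {
        (best, value): build_tree([r for r in rows if r[best] == value], remaining)
        for value in {row[best] for row in rows}
    }

print(build_tree(data, [0, 1, 2, 3]))   # nested dict keyed by (attribute index, value)
```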
The Over-fitting Issue
• Over-fitting is caused by creating decision rules that work accurately on the training set but are based on an insufficient number of samples.
• As a result, these decision rules may not
work well in more general cases.
Example of the Over-fitting Problem
in Decision Tree Construction
[Figure: a subroot with 11 "Yes" and 9 "No" samples (prediction = "Yes") is split on a binary attribute Ai. The Ai=0 child has 3 "Yes" and 0 "No" samples (prediction = "Yes"); the Ai=1 child has 8 "Yes" and 9 "No" samples (prediction = "No").]

$$\text{Entropy at the subroot} = -\frac{11}{20}\log_2\frac{11}{20} - \frac{9}{20}\log_2\frac{9}{20} = 0.993 \text{ bits}$$

$$\text{Average entropy at the children} = \frac{17}{20}\left(-\frac{8}{17}\log_2\frac{8}{17} - \frac{9}{17}\log_2\frac{9}{17}\right) = 0.848 \text{ bits}$$
• Hence, with the binary split, we gain more
information.
• However, if we look at the pessimistic error rate, i.e. the upper bound of the confidence interval of the error rate, we may reach a different conclusion.
• The formula for the pessimistic error rate is

$$e = \frac{r + \dfrac{z_c^2}{2n} + z_c\sqrt{\dfrac{r}{n} - \dfrac{r^2}{n} + \dfrac{z_c^2}{4n^2}}}{1 + \dfrac{z_c^2}{n}}$$

where r is the observed error rate, n is the number of samples, and z_c corresponds to the confidence level c specified by the user.
• Note that the pessimistic error rate is a function of the confidence level used.
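A small sketch of this formula in Python; the function name `pessimistic_error_rate` and its argument layout are assumptions, and z = 1.645 corresponds to the 95% confidence level used in the calculations below.

```python
from math import sqrt

def pessimistic_error_rate(errors, n, z=1.645):
    """Upper confidence bound on the error rate for `errors` mistakes in `n` samples."""
    r = errors / n                                   # observed error rate
    numerator = r + z * z / (2 * n) + z * sqrt(r / n - r * r / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)
```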
• The pessimistic error rates under 95% confidence (z_c = 1.645) are

$$e_{9/20} = \frac{0.45 + \frac{1.645^2}{40} + 1.645\sqrt{\frac{0.45}{20} - \frac{0.45^2}{20} + \frac{2.706}{1600}}}{1 + \frac{1.645^2}{20}} = 0.6278$$

$$e_{0/3} = \frac{\frac{1.645^2}{6} + 1.645\sqrt{\frac{1.645^2}{36}}}{1 + \frac{1.645^2}{3}} = 0.4742$$

$$e_{8/17} = \frac{\frac{8}{17} + \frac{1.645^2}{34} + 1.645\sqrt{\frac{8}{17\cdot 17} - \frac{(8/17)^2}{17} + \frac{2.706}{1156}}}{1 + \frac{1.645^2}{17}} = 0.6598$$
• Therefore, the average pessimistic error rate at the children is

$$\frac{3}{20}\times 0.4742 + \frac{17}{20}\times 0.6598 = 0.632 > 0.6278.$$

• Since the pessimistic error rate increases with the split, we do not want to keep the children. This practice is called "tree pruning".
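Reusing the hypothetical `pessimistic_error_rate` helper above, the comparison can be reproduced roughly as follows:

```python
e_root  = pessimistic_error_rate(9, 20)   # ~0.628: subroot predicts "Yes", 9 errors in 20
e_left  = pessimistic_error_rate(0, 3)    # ~0.474: Ai=0 child, 0 errors in 3
e_right = pessimistic_error_rate(8, 17)   # ~0.660: Ai=1 child, 8 errors in 17

e_children = 3 / 20 * e_left + 17 / 20 * e_right
print(round(e_children, 3), round(e_root, 3))   # ~0.632 vs ~0.628, so the error grows
```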
Tree Pruning Based on the χ² Test of Independence
• We construct the corresponding
contingency table
        Ai=0   Ai=1   Total
Yes       3      8      11
No        0      9       9
Total     3     17      20

[Figure: the same split as before: a subroot with 11 "Yes" and 9 "No" samples, an Ai=0 child with 3 "Yes" and 0 "No", and an Ai=1 child with 8 "Yes" and 9 "No".]
The χ² statistic is

$$\chi^2 = \frac{\left(3 - \frac{11\times 3}{20}\right)^2}{\frac{11\times 3}{20}} + \frac{\left(8 - \frac{11\times 17}{20}\right)^2}{\frac{11\times 17}{20}} + \frac{\left(0 - \frac{9\times 3}{20}\right)^2}{\frac{9\times 3}{20}} + \frac{\left(9 - \frac{9\times 17}{20}\right)^2}{\frac{9\times 17}{20}} \approx 2.89.$$
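A minimal sketch of this statistic for a contingency table given as a list of rows; the function name `chi_square` and the table layout are assumptions made for the example.

```python
def chi_square(table):
    """Pearson chi-square statistic of a contingency table (list of rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: Yes, No; columns: Ai=0, Ai=1.
print(round(chi_square([[3, 8], [0, 9]]), 2))   # ~2.89
```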
• Therefore, we should not split the subroot node if we require that the χ² statistic be larger than χ²_{k,0.05}, where k is the number of degrees of freedom of the corresponding contingency table.
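The critical value χ²_{k,0.05} can be looked up with SciPy's χ² distribution; the sketch below assumes the `chi_square` helper defined above and uses `chi2.ppf`, the quantile (inverse CDF) function.

```python
from scipy.stats import chi2

k = (2 - 1) * (2 - 1)                    # degrees of freedom of the 2x2 table
threshold = chi2.ppf(0.95, df=k)         # ~3.841
print(chi_square([[3, 8], [0, 9]]) > threshold)   # False, so do not split
```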
Constructing Decision Trees Based on the χ² Test of Independence
• Using the following example, we can
construct a contingency table accordingly.
[Figure: a subroot with 75 "Yes"s out of 100 samples (prediction = "Yes") split on a three-valued attribute Ai into children with 20 "Yes"s out of 25 samples (Ai=0), 10 "Yes"s out of 25 samples (Ai=1), and 45 "Yes"s out of 50 samples (Ai=2).]

        Ai=0   Ai=1   Ai=2   Total
Yes      20     10     45      75
No        5     15      5      25
Total    25     25     50     100
• Therefore, we may say that the split is
statistically robust.
$$\chi^2 = \frac{\left(20 - \frac{1}{4}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{3}{4}\cdot 100} + \frac{\left(10 - \frac{1}{4}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{3}{4}\cdot 100} + \frac{\left(45 - \frac{1}{2}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{1}{2}\cdot\frac{3}{4}\cdot 100} + \frac{\left(5 - \frac{1}{4}\cdot\frac{1}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{1}{4}\cdot 100} + \frac{\left(15 - \frac{1}{4}\cdot\frac{1}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{1}{4}\cdot 100} + \frac{\left(5 - \frac{1}{2}\cdot\frac{1}{4}\cdot 100\right)^2}{\frac{1}{2}\cdot\frac{1}{4}\cdot 100} = 22.67 > \chi^2_{2,0.05} = 5.991.$$
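The same value can be checked with the `chi_square` sketch from earlier; the column order below follows the contingency table as reconstructed above.

```python
table = [[20, 10, 45],    # "Yes" counts for Ai = 0, 1, 2
         [ 5, 15,  5]]    # "No"  counts for Ai = 0, 1, 2
print(round(chi_square(table), 2))                  # ~22.67
print(chi_square(table) > chi2.ppf(0.95, df=2))     # True, the split is significant
```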
Assume that we have another attribute
Aj to consider
        Aj=0   Aj=1   Total
Yes      25     50      75
No        0     25      25
Total    25     75     100

[Figure: a subroot with 75 "Yes"s out of 100 samples split on Aj; the Aj=0 child has 25 "Yes"s out of 25 samples and the Aj=1 child has 50 "Yes"s out of 75 samples.]

$$\chi^2 = \frac{\left(25 - \frac{1}{4}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{3}{4}\cdot 100} + \frac{\left(50 - \frac{3}{4}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{3}{4}\cdot\frac{3}{4}\cdot 100} + \frac{\left(0 - \frac{1}{4}\cdot\frac{1}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{1}{4}\cdot 100} + \frac{\left(25 - \frac{1}{4}\cdot\frac{3}{4}\cdot 100\right)^2}{\frac{1}{4}\cdot\frac{3}{4}\cdot 100} = 11.11 > \chi^2_{1,0.05} = 3.841.$$
• Now, both Ai and Aj pass our criterion. How
should we make our selection?
• We can make our selection based on the significance levels of the two contingency tables.
• For the split on Aj (χ² = 11.11 with 1 degree of freedom), the significance level is

$$\alpha_1 = 1 - F_{\chi^2_1}(11.11) = \operatorname{Prob}\!\left(N(0,1) \le -\sqrt{11.11}\right) + \operatorname{Prob}\!\left(N(0,1) \ge \sqrt{11.11}\right) = 2\operatorname{Prob}\!\left(N(0,1) \ge 3.33\right) \approx 0.0008 = 8\times 10^{-4}.$$
• For the split on Ai (χ² = 22.67 with 2 degrees of freedom), the significance level is

$$\alpha_2 = 1 - F_{\chi^2_2}(22.67) = e^{-\frac{1}{2}(22.67)} \approx 1.19\times 10^{-5}.$$

• Since α₂ < α₁, the split on Ai is the more significant one. Therefore, Ai is preferred over Aj.
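Both significance levels can be checked with SciPy's χ² survival function (`chi2.sf`, i.e. 1 − F); this sketch simply compares the two tail probabilities.

```python
from scipy.stats import chi2

alpha_Aj = chi2.sf(11.11, df=1)    # ~8.6e-4, significance level of the split on Aj
alpha_Ai = chi2.sf(22.67, df=2)    # ~1.2e-5, significance level of the split on Ai
print(alpha_Ai < alpha_Aj)         # True, so the split on Ai is the more significant one
```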
Termination of Split due to Low Significance Level
• If a subtree is as follows:

[Figure: a subroot with 15 "Yes"s out of 20 samples split into children with 9 "Yes"s out of 10 samples, 4 "Yes"s out of 5 samples, and 2 "Yes"s out of 5 samples.]

• χ² = 4.543 < 5.991 = χ²_{2,0.05}.
• In this case, we do not want to carry out the split.
A More Realistic Example and
Some Remarks
• In the following example, a bank wants to
derive a credit evaluation tree for future use
based on the records of existing customers.
• As the data set shows, it is highly likely that
the training data set contains inconsistencies.
• Furthermore, some values may be missing.
• Therefore, for most cases, it is impossible to
derive perfect decision trees, i.e. decision
trees with 100% accuracy.
~continues
Attributes (Education, Annual Income, Age, Own House, Sex) and Class (Credit ranking):

Education     Annual Income   Age     Own House   Sex      Credit ranking
College       High            Old     Yes         Male     Good
High school   -----           Middle  Yes         Male     Good
High school   Middle          Young   No          Female   Good
College       High            Old     Yes         Male     Poor
College       High            Old     Yes         Male     Good
College       Middle          Young   No          Female   Good
High school   High            Old     Yes         Male     Poor
College       Middle          Middle  -----       Female   Good
High school   Middle          Young   No          Male     Poor
~continues
• A quality measure of decision trees can be based on accuracy. There are alternative measures depending on the nature of the application.
• Overfitting is a problem caused by making the derived decision tree fit the training set too closely. As a result, the decision tree may work less accurately in the real world.
~continues
• There are two situations in which
overfitting may occur:
• insufficient number of samples at the subroot.
• some attributes are highly branched.
• A conventional practice for handling
missing values is to treat them as possible
attribute values. That is, each attribute has
one additional attribute value corresponding
to the missing value.
Alternative Measures of Quality of
Decision Trees
• The recall rate and precision are two widely used measures:

$$\text{Recall rate} = \frac{|C \cap C'|}{|C|}, \qquad \text{Precision} = \frac{|C \cap C'|}{|C'|}$$

• where C is the set of samples in the class and C' is the set of samples which the decision tree puts into the class.
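A minimal sketch of these two measures as set operations in Python; the function names and the toy sets are illustrative assumptions.

```python
def recall_rate(c, c_prime):
    """|C ∩ C'| / |C|: the fraction of the class that the tree recovers."""
    return len(c & c_prime) / len(c)

def precision(c, c_prime):
    """|C ∩ C'| / |C'|: the fraction of the tree's positives that belong to the class."""
    return len(c & c_prime) / len(c_prime)

C       = {"a", "b", "c", "d"}     # samples that truly belong to the class
C_prime = {"b", "c", "e"}          # samples the decision tree puts into the class
print(recall_rate(C, C_prime), precision(C, C_prime))   # 0.5 and ~0.667
```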
~continues
• A situation in which the recall rate is the
main concern:
• “A bank wants to find all the potential credit
card customers”.
• A situation in which precision is the main
concern:
• “A bank wants to find a decision tree for credit
approval.”