Machine Learning Specialization
Overfitting in decision trees
Emily Fox & Carlos Guestrin
University of Washington
©2015-2016 Emily Fox & Carlos Guestrin
Review of loan default prediction

[Figure: loan applications flow into an intelligent loan application review system, which labels each application as Safe (✓) or Risky (✘).]
Decision tree review

T(xi) = traverse the decision tree: the loan application's input xi is passed down the tree to produce the prediction ŷi.

[Figure: decision tree for a loan application. Starting at the root, split on Credit (excellent → Safe; fair → split on Term; poor → split on Income, then Term), with leaf predictions of Safe or Risky.]
Overfitting review
Overfitting in logistic regression

Overfitting occurs if there exists a w* such that:
•  training_error(w*) > training_error(ŵ)
•  true_error(w*) < true_error(ŵ)

[Figure: classification error vs. model complexity; training error keeps decreasing with complexity while true error eventually increases.]
Overfitting ⇒ overconfident predictions

[Figure: decision boundaries of logistic regression with degree-6 features vs. degree-20 features; the degree-20 boundary is far more contorted.]
Overfitting in decision trees
Decision stump (depth 1): split on x[1]

y values: - / +. Root node: 18 / 13. Splitting on x[1]: the branch x[1] < -0.07 contains 13 / 3, and the branch x[1] >= -0.07 contains 4 / 11.
Training error reduces with depth

Tree depth       depth = 1   depth = 2   depth = 3   depth = 5   depth = 10
Training error   0.22        0.13        0.10        0.03        0.00

[Figure: the decision boundary grows more intricate at each depth.]

What happens when we increase depth?
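A minimal sketch (not from the slides) of reproducing this kind of depth sweep with scikit-learn on synthetic data; the exact error values depend on the dataset:

from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

# Synthetic 2-D data; training error falls as the allowed depth grows.
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
for depth in (1, 2, 3, 5, 10):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X, y)
    print(f"depth {depth:2d}: training error = {1 - clf.score(X, y):.2f}")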
Deeper trees ⇒ lower training error

[Figure: training error vs. tree depth; the curve falls to 0.0 by depth 10.]
Training error = 0: is this model perfect?

The depth-10 tree (training error = 0.0) is NOT perfect: it has simply memorized the training data, and its highly irregular decision boundary is unlikely to generalize.
Why does training error decrease with depth?

Loan status counts (Safe / Risky) at the root: 22 / 18. Splitting on Credit: excellent → 9 / 0 (predict Safe), good → 9 / 4 (predict Safe), fair → 4 / 14 (predict Risky).

Tree              Training error
(root)            0.45
split on credit   0.20

Training error improved by 0.25 because of the split.
Feature split selection algorithm

•  Given a subset of data M (a node in the tree)
•  For each feature hi(x):
   1.  Split the data of M according to feature hi(x)
   2.  Compute the classification error of the split
•  Choose the feature h*(x) with the lowest classification error

By design, each split reduces training error (see the sketch below).
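A minimal sketch of this split selection step, assuming pandas is available, the data sits in a DataFrame with categorical feature columns, and the target column is binary; this is illustrative code, not the course's implementation:

import pandas as pd

def node_error(labels):
    # Classification error if the node predicts its majority class
    if len(labels) == 0:
        return 0.0
    return 1.0 - labels.value_counts().max() / len(labels)

def best_split(data, features, target):
    # Return (feature, weighted classification error) of the lowest-error split
    best_feature, best_err = None, float("inf")
    for feature in features:
        err = sum(
            node_error(group[target]) * len(group)
            for _, group in data.groupby(feature)
        ) / len(data)
        if err < best_err:
            best_feature, best_err = feature, err
    return best_feature, best_err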
Decision trees overfitting on loan data
Principle of Occam's razor: simpler trees are better
Principle of Occam's Razor

"Among competing hypotheses, the one with the fewest assumptions should be selected." (William of Occam, 13th century)

Example: a patient has symptoms S1 and S2.
•  Diagnosis 1 (2 diseases): diseases D1 and D2, where D1 explains S1 and D2 explains S2.
•  Diagnosis 2 (1 disease): a single disease D3 explains both S1 and S2. ← SIMPLER
Occam's Razor for decision trees

When two trees have similar classification error on the validation set, pick the simpler one.

Complexity      Train error   Validation error
Simple          0.23          0.24
Moderate        0.12          0.15
Complex         0.07          0.15
Super complex   0.00          0.18   ← overfit

The moderate and complex trees have the same validation error, so by Occam's Razor prefer the moderate (simpler) one.
Which tree is simpler?

[Figure: two candidate trees; the shallower one is SIMPLER.]
Modified tree learning problem

Find a "simple" decision tree with low classification error.

[Figure: candidate trees T1(X), T2(X), ..., T4(X) arranged from simple trees to complex trees.]
How do we pick simpler trees?

1.  Early stopping: stop the learning algorithm before the tree becomes too complex.
2.  Pruning: simplify the tree after the learning algorithm terminates.
Early stopping for learning decision trees
Deeper trees ⇒ increasing complexity

Model complexity increases with depth.

[Figure: decision boundaries at depth = 1, depth = 2, and depth = 10.]
Early stopping condition 1: limit the depth of the tree
Restrict tree learning to shallow trees?

[Figure: classification error vs. tree depth; training error keeps falling for complex (deep) trees while true error rises. A max_depth cutoff separates simple trees from complex trees.]
Early stopping condition 1: limit the depth of the tree

Stop building the tree when depth = max_depth.

[Figure: classification error vs. tree depth, with the max_depth cutoff marked.]
Picking a value for max_depth

Choose max_depth using a validation set or cross-validation.

[Figure: classification error vs. tree depth, with the chosen max_depth marked.]
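A minimal sketch (not from the slides) of choosing max_depth by cross-validation with scikit-learn; the data here is synthetic and the candidate depths are illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Try several depths and keep the one with the best cross-validated accuracy
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 5, 10, 20]},
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print("best max_depth:", search.best_params_["max_depth"])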
Early stopping condition 2: use classification error to limit the depth of the tree
Decision tree recursion review

Loan status counts (Safe / Risky) at the root: 22 / 18. Splitting on Credit: excellent → 16 / 0 (predict Safe), fair → 1 / 2, poor → 5 / 16.

Recurse: build a decision stump with the subset of data where Credit = fair, and another with the subset where Credit = poor.
Split selection for credit = poor

Consider the node where Credit = poor (5 Safe / 16 Risky):

Splits for credit = poor   Classification error
(no split)                 0.24
split on term              0.24
split on income            0.24

No split improves classification error ⇒ Stop!
Early stopping condition 2: no split improves classification error

Since no split improves classification error at the Credit = poor node, early stopping condition 2 turns it into a leaf (predict Risky). Recursion continues by building a decision stump on the subset of data where Credit = fair.

Splits for credit = poor   Classification error
(no split)                 0.24
split on term              0.24
split on income            0.24
Practical notes about stopping when classification error doesn't decrease

1.  Typically, add a "magic" parameter ε:
    -  stop if the error doesn't decrease by more than ε.
2.  There are some pitfalls to this rule (see the pruning section).
3.  It is nonetheless very useful in practice.
Early stopping condition 3: stop if the number of data points contained in a node is too small
Can we trust nodes with very few points?

In the Credit split (root 22 Safe / 18 Risky; excellent 16 / 0, fair 1 / 2, poor 5 / 16), the Credit = fair node contains only 3 data points. Stop recursing there and make it a leaf that predicts Risky.
Early stopping condition 3: stop when the number of data points in a node ≤ Nmin

Example with Nmin = 10: the Credit = fair node holds only 3 data points, so early stopping condition 3 makes it a leaf (predict Risky).
Summary of decision trees with early stopping
Early stopping: summary

1.  Limit tree depth: stop splitting after a certain depth.
2.  Classification error: do not consider any split that does not cause a sufficient decrease in classification error.
3.  Minimum node "size": do not split an intermediate node which contains too few data points.

(A rough mapping of these three conditions onto library hyperparameters is sketched below.)
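A minimal sketch, assuming scikit-learn: its DecisionTreeClassifier exposes hyperparameters that roughly mirror the three early stopping conditions (note that condition 2 is approximated by an impurity-based threshold, not classification error):

from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(
    max_depth=5,                 # condition 1: limit tree depth
    min_impurity_decrease=0.01,  # condition 2 (approximate): require a sufficient
                                 # decrease in impurity (Gini/entropy) to split
    min_samples_split=10,        # condition 3: do not split nodes with fewer than 10 points
)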
Greedy decision tree learning

•  Step 1: Start with an empty tree.
•  Step 2: Select a feature to split the data on.
•  For each split of the tree:
   •  Step 3: If there is nothing more to do (stopping conditions 1 & 2, or early stopping conditions 1, 2 & 3), make predictions.
   •  Step 4: Otherwise, go to Step 2 and continue (recurse) on this split.

A minimal recursive sketch of this procedure follows.
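A minimal recursive sketch of greedy tree learning, reusing node_error and best_split from the earlier sketch; the dict-based tree representation and the parameter names (max_depth, epsilon, n_min) are illustrative assumptions, not the course's code:

def build_tree(data, features, target, depth=0,
               max_depth=10, epsilon=0.0, n_min=1):
    error_no_split = node_error(data[target])
    prediction = data[target].mode()[0]          # majority class at this node

    if (error_no_split == 0.0 or not features    # stopping conditions 1 & 2
            or depth >= max_depth                # early stopping condition 1
            or len(data) <= n_min):              # early stopping condition 3
        return {"leaf": True, "prediction": prediction}

    feature, split_error = best_split(data, features, target)
    if error_no_split - split_error <= epsilon:  # early stopping condition 2
        return {"leaf": True, "prediction": prediction}

    children = {
        value: build_tree(group, [f for f in features if f != feature],
                          target, depth + 1, max_depth, epsilon, n_min)
        for value, group in data.groupby(feature)
    }
    return {"leaf": False, "feature": feature, "children": children}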
Overfitting in decision trees: pruning (OPTIONAL)
Stopping condition summary

•  Stopping conditions:
   1.  All examples have the same target value.
   2.  No more features to split on.
•  Early stopping conditions:
   1.  Limit tree depth.
   2.  Do not consider splits that do not cause a sufficient decrease in classification error.
   3.  Do not split an intermediate node which contains too few data points.
Exploring some challenges with early stopping conditions
Challenge with early stopping condition 1

It is hard to know exactly when to stop, i.e., how to set max_depth.

[Figure: classification error vs. tree depth; the ideal cutoff between simple and complex trees is not known in advance.]
Is early stopping condition 2 a good idea?

[Figure: classification error vs. tree depth; the algorithm stops because of a zero decrease in classification error.]
Early stopping condition 2: don't stop if error doesn't decrease???

Dataset: y = x[1] xor x[2]

x[1]    x[2]    y
False   False   False
False   True    True
True    False   True
True    True    False

Root node (2 True / 2 False): Error = 2/4 = 0.5

Tree     Classification error
(root)   0.5
Consider a split on x[1]

Splitting on x[1] gives two children, each with 1 True / 1 False, so the classification error is still 2/4 = 0.5.

Tree            Classification error
(root)          0.5
Split on x[1]   0.5
Consider a split on x[2]

Splitting on x[2] likewise gives two children with 1 True / 1 False each, so the error remains 0.5.

Tree            Classification error
(root)          0.5
Split on x[1]   0.5
Split on x[2]   0.5

Neither feature improves training error… Stop now???
Final tree with early stopping condition 2

With early stopping condition 2, the tree is just the root (2 True / 2 False), predicting True.

Tree                              Classification error
with early stopping condition 2   0.5
Without early stopping condition 2

Split the root on x[1] anyway (even though it does not reduce the error), then split each child on x[2]. Every leaf is now pure, and the tree classifies the XOR data perfectly.

Tree                                  Classification error
with early stopping condition 2       0.5
without early stopping condition 2    0.0
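A minimal sketch (not from the slides) of the same XOR effect using scikit-learn: a depth-1 tree cannot reduce the training error, while a depth-2 tree fits the data perfectly:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                     # y = x[1] xor x[2]

for depth in (1, 2):
    clf = DecisionTreeClassifier(max_depth=depth).fit(X, y)
    print(f"depth {depth}: training error = {1 - clf.score(X, y):.2f}")
# Expected: depth 1 -> 0.50, depth 2 -> 0.00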
Early stopping condition 2: pros and cons

•  Pros:
   -  A reasonable heuristic for early stopping that avoids useless splits.
•  Cons:
   -  Too short-sighted: we may miss "good" splits that occur right after "useless" splits.
Tree pruning
Two approaches to picking simpler trees

1.  Early stopping: stop the learning algorithm before the tree becomes too complex.
2.  Pruning: simplify the tree after the learning algorithm terminates. (Complements early stopping.)
Pruning intuition: train a complex tree, simplify later

Complex tree → simplify → simpler tree.
Pruning motivation

[Figure: classification error vs. tree depth; rather than stopping at a simple tree too early, grow a complex tree and simplify it after it is built.]
Example 1: which tree is simpler?

[Figure: a deep tree that splits on Credit, Term, and Income vs. a depth-1 tree that splits only on Credit. The depth-1 tree is SIMPLER.]
Example 2: which tree is simpler???

[Figure: a tree that splits on Credit into five leaves (excellent, good, fair, bad, poor) vs. a tree that splits on Term into two leaves (3 years → Risky, 5 years → Safe).]
Simple measure of the complexity of a tree

L(T) = number of leaf nodes in tree T.

[Figure: the five-leaf Credit tree from Example 2.]
Which tree has lower L(T)?

For the two trees of Example 2: L(T1) = 5 for the Credit tree and L(T2) = 2 for the Term tree, so the Term tree is SIMPLER.
Balance simplicity & predictive power

[Figure: the deep tree splitting on Credit, Term, and Income is too complex, with a risk of overfitting; a tree that predicts Risky at the root is too simple, with high classification error.]
Desired total quality format

Want to balance:
i.  how well the tree fits the data
ii.  the complexity of the tree

Total cost = measure of fit + measure of complexity

Here the measure of fit is the classification error (a large value means a bad fit to the training data), and a large measure of complexity means the tree is likely to overfit; the total cost balances the two.
Consider a specific total cost

Total cost = classification error + number of leaf nodes
           = Error(T) + L(T)
Balancing fit and complexity

Total cost C(T) = Error(T) + λ L(T), where λ ≥ 0 is a tuning parameter.

•  If λ = 0: complexity is ignored and we get standard tree learning, which minimizes training error with a potentially very deep tree.
•  If λ = ∞: training error is ignored and the optimal "tree" is just the root node.
•  If λ is in between: the total cost trades off fit against complexity.
Use total cost to simplify trees

Total-quality-based pruning turns a complex tree into a simpler tree.
Tree pruning algorithm
Pruning intuition

[Figure: tree T, the full loan tree splitting on Credit, Term, and Income; one of the deeper Term splits is highlighted.]
Step 1: Consider a split

[Figure: tree T with one of its deeper Term splits marked as the candidate for pruning.]
Step 2: Compute the total cost C(T) of the tree containing the candidate split

C(T) = Error(T) + λ L(T), with λ = 0.03

Tree   Error   #Leaves   Total
T      0.25    6         0.43
Step 3: "Undo" the candidate split, giving Tsmaller

Replace the split by a leaf node and compute the total cost of Tsmaller.

Tree       Error   #Leaves   Total
T          0.25    6         0.43
Tsmaller   0.26    5         0.41
Step 4: Prune if the total cost is lower: C(Tsmaller) ≤ C(T)

Tree       Error   #Leaves   Total
T          0.25    6         0.43
Tsmaller   0.26    5         0.41

Tsmaller has worse training error but lower overall cost, so prune: replace the split by a leaf node. YES!
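Checking the arithmetic with λ = 0.03 (the value implied by the totals in the table): C(T) = 0.25 + 0.03 · 6 = 0.43 and C(Tsmaller) = 0.26 + 0.03 · 5 = 0.41, so pruning lowers the total cost even though the training error rises slightly.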
Step 5: Repeat Steps 1-4 for every split

Traverse the tree and decide, for each split, whether it can be "pruned".
Decision tree pruning algorithm

•  Start at the bottom of tree T and traverse up, applying prune_split to each decision node M.
•  prune_split(T, M):
   1.  Compute the total cost of tree T using C(T) = Error(T) + λ L(T).
   2.  Let Tsmaller be the tree after pruning the subtree below M.
   3.  Compute the total cost of Tsmaller: C(Tsmaller) = Error(Tsmaller) + λ L(Tsmaller).
   4.  If C(Tsmaller) < C(T), prune to Tsmaller.

A minimal sketch of this bottom-up procedure follows.
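A minimal sketch (not the course's code) of bottom-up pruning for the dict-based trees produced by the build_tree sketch above; lam plays the role of λ. Each node is compared against its leaf replacement on that node's own data which, after dividing by the total number of training points, is equivalent to comparing C(T) with C(Tsmaller), since the rest of the tree is unchanged:

def num_leaves(tree):
    if tree["leaf"]:
        return 1
    return sum(num_leaves(c) for c in tree["children"].values())

def misclassified(tree, data, target):
    # Number of points in `data` that `tree` misclassifies
    if tree["leaf"]:
        return (data[target] != tree["prediction"]).sum()
    return sum(
        misclassified(tree["children"][value], group, target)
        for value, group in data.groupby(tree["feature"])
        if value in tree["children"]
    )

def prune(tree, data, target, lam, n_total=None):
    if n_total is None:
        n_total = len(data)
    if tree["leaf"]:
        return tree
    # Start at the bottom of the tree: prune the children first, then this node
    for value, child in list(tree["children"].items()):
        subset = data[data[tree["feature"]] == value]
        if len(subset):
            tree["children"][value] = prune(child, subset, target, lam, n_total)
    leaf = {"leaf": True, "prediction": data[target].mode()[0]}
    cost_keep = misclassified(tree, data, target) / n_total + lam * num_leaves(tree)
    cost_prune = misclassified(leaf, data, target) / n_total + lam * 1
    return leaf if cost_prune < cost_keep else tree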
Summary of overfitting in decision trees
What you can do now…

•  Identify when overfitting occurs in decision trees.
•  Prevent overfitting with early stopping:
   -  limit tree depth;
   -  do not consider splits that do not reduce classification error;
   -  do not split intermediate nodes with only a few points.
•  Prevent overfitting by pruning complex trees:
   -  use a total cost formula that balances classification error and tree complexity;
   -  use the total cost to merge potentially complex trees into simpler ones.
Thank you to Dr. Krishna Sridhar, Staff Data Scientist, Dato, Inc.

©2015-2016 Emily Fox & Carlos Guestrin
