Deepak George
Senior Data Scientist – Machine Learning
Decision Tree Ensembles
Bagging, Random Forest & Gradient Boosting Machines
December 2015
About Me
 Education
 Computer Science Engineering – College Of Engineering Trivandrum
 Business Analytics & Intelligence – Indian Institute Of Management Bangalore
 Career
 Mu Sigma
 Accenture Analytics
 Data Science
 1st Prize Best Data Science Project (BAI 5) – IIM Bangalore
 Top 10% (out of 1100) finish in Kaggle Coupon Purchase Prediction (Recommender System)
 SAS Certified Statistical Business Analyst: Regression and Modeling Credentials
 Statistical Learning – Stanford University
 Passion
 Photography, Football, Data Science, Machine Learning
 Contact
 Deepak.george14@iimb.ernet.in
 linkedin.com/in/deepakgeorge7
Bias-Variance Tradeoff
Expected test MSE decomposes into variance, squared bias, and irreducible error:
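A standard statement of the decomposition, in the usual textbook form (the slide presumably showed this as an equation image):

$E\left[(y_0 - \hat{f}(x_0))^2\right] = \mathrm{Var}(\hat{f}(x_0)) + [\mathrm{Bias}(\hat{f}(x_0))]^2 + \mathrm{Var}(\varepsilon)$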
 Bias
Error introduced by approximating a complicated relationship with a much simpler model.
 Difference between the truth and what you
expect to learn
 Underfitting
 Variance
Amount by which the model would change if we estimated it using different training data.
If a model has high variance, then small changes in the training data can result in large changes in the model.
 Overfitting
Bias-Variance Tradeoff
[Figure: three fits illustrating underfitting, an ideal learner, and overfitting]
Bagging
 Problem: Decision trees have low bias but suffer from high variance
 Goal: Reduce the variance of decision trees
 Hint: Given a set of n independent observations Z1, . . . , Zn, each with variance σ², the variance of the mean of the observations is σ²/n.
 In other words, averaging a set of observations reduces variance.
 Theoretically: Take multiple independent samples S1, S2, . . . , Sn from the population
 Fit “bushy”/deep decision trees on each sample
 Trees are grown deep and are not pruned
 Variance reduces linearly & bias remains unchanged
 Practically: We only have one sample/training set, not the population.
 So take bootstrap samples, i.e. multiple samples drawn with replacement from the single sample
 Variance reduces sub-linearly & bias often increases slightly because bootstrap samples are correlated.
 Final Classifier: Average of predictions for regression or majority vote
for classification.
 The high variance introduced by deep decision trees is mitigated by averaging the predictions of the individual trees (a runnable sketch follows the figure below).
[Figure: a toy population of labeled observations (Alice, Bob, Carol, . . .) contrasted with (a) multiple independent samples S1, S2, . . . , Sn drawn from the population and (b) bootstrap samples S1, S2, . . . , Sn drawn with replacement from a single sample, annotated with the expected loss L(h) = E_{(x,y)~P(x,y)}[f(h(x), y)]]
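To make the procedure concrete, here is a minimal bagging sketch in R (an illustration, not from the slides; it uses the Boston data that appears later in the deck): deep, unpruned rpart trees are grown on B bootstrap samples and their predictions are averaged.

library(rpart)
library(MASS) #Contains Boston dataframe
set.seed(42)
B <- 100
n <- nrow(Boston)
#Grow a deep, unpruned tree on each of B bootstrap samples
trees <- lapply(1:B, function(b) {
  idx <- sample(n, n, replace = TRUE) #bootstrap sample: n draws with replacement
  rpart(medv ~ ., data = Boston[idx, ],
        control = rpart.control(cp = 0, minsplit = 2)) #deep, unpruned tree
})
#Bagged prediction = average of the B tree predictions
pred <- rowMeans(sapply(trees, predict, newdata = Boston))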
Bootstrap sampling
 A bootstrap sample should have the same sample size as the original sample.
 Sampling with replacement results in repeated values.
 On average, a bootstrap sample uses only about 2/3 of the observations in the original sample (see the calculation below).
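The 2/3 figure follows from a short, standard calculation (added here for completeness): each of the n draws misses a given observation with probability 1 − 1/n, so

$P(\text{observation } i \text{ not in the bootstrap sample}) = (1 - 1/n)^n \xrightarrow{n \to \infty} e^{-1} \approx 0.368$

leaving roughly 63.2% of the original observations in each bootstrap sample. A one-line empirical check in R:

n <- 10000
length(unique(sample(n, n, replace = TRUE))) / n #about 0.632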
Random Forest
 Problem: Bagging still has relatively high variance because the bagged trees are correlated
 Goal: Reduce variance of Bagging
 Solution: Along with sampling of data in Bagging, take samples of features also!
 In other words, in building a random forest, at each split in the tree we use only a random subset of features instead of all the features.
 This de-correlates the trees.
 A widely used rule of thumb is that √(number of predictors) is a good approximate value for the predictor subset size (mtry/max_features).
 Evaluation: A bootstrap sample uses only approximately 2/3 of the observations of the original sample.
 The remaining training data (out-of-bag, OOB) are used to estimate error and variable importance, as in the sketch below.
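A minimal sketch of the OOB idea with the randomForest package (an illustration; it assumes the Boston data used on the next slide, and the settings are arbitrary):

library(randomForest)
library(MASS) #Contains Boston dataframe
set.seed(1861)
rf <- randomForest(medv ~ ., data = Boston, ntree = 500, importance = TRUE)
rf #the printed "Mean of squared residuals" is the OOB error estimate
importance(rf) #%IncMSE is permutation importance computed on OOB data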
Random Forest – Key Hyperparameters
 Hyperparameters are knobs to control the bias-variance tradeoff of any machine learning algorithm.
 Key hyperparameters
 Max Features – De-correlates the trees
 Number of trees in the forest – A higher number reduces variance further
Random Forest – R Implementation
library(randomForest)
library(MASS) #Contains Boston dataframe
library(caret)
View(Boston)
#Cross-validation
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 2, number = 5, allowParallel = TRUE)
#Grid search
rf.grid <- expand.grid(mtry = 2:13)
set.seed(1861) ## make reproducible here, but not if generating many random samples
#Hyperparameter tuning
rf_tune <- train(medv ~ .,
                 data = Boston,
                 method = "rf",
                 trControl = cv.ctrl,
                 tuneGrid = rf.grid,
                 ntree = 1000,
                 importance = TRUE)
#Cross Validation results
rf_tune
plot(rf_tune)
#Variable Importance
varImp(rf_tune)
plot(varImp(rf_tune), top = 10)
Boosting
 Intuition: Ensemble many “weak” classifiers (typically decision trees) to produce a final “strong” classifier
 Weak classifier: error rate is only slightly better than random guessing.
 Boosting is a forward stagewise additive model
 Boosting sequentially applies the weak classifiers, one by one, to repeatedly reweighted versions of the data.
 Each new weak learner in the sequence tries to correct the misclassifications made by the previous weak learners.
 Initially all of the weights are set to wi = 1/N
 At each successive step the observation weights are individually modified and a new weak learner is fitted to the reweighted observations.
 At step m, observations that were misclassified by the classifier Gm−1(x) induced at the previous step have their weights increased, whereas the weights are decreased for those that were classified correctly.
 The final “strong” classifier is based on a weighted vote of the weak classifiers (a runnable sketch follows).
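A minimal AdaBoost.M1 sketch in R with decision stumps (an illustration of the procedure described above, assuming labels y in {−1, +1}; the function name and settings are hypothetical, not from the slides):

library(rpart)
adaboost <- function(X, y, M = 50) {
  n <- nrow(X)
  w <- rep(1 / n, n) #initial weights wi = 1/N
  stumps <- vector("list", M)
  alpha <- numeric(M)
  d <- data.frame(X, y = factor(y))
  for (m in 1:M) {
    #fit a stump to the current observation weights
    stumps[[m]] <- rpart(y ~ ., data = d, weights = w,
                         control = rpart.control(maxdepth = 1))
    pred <- ifelse(predict(stumps[[m]], d)[, "1"] > 0.5, 1, -1)
    err <- sum(w * (pred != y)) / sum(w) #weighted error rate
    alpha[m] <- log((1 - err) / err) #classifier weight
    w <- w * exp(alpha[m] * (pred != y)) #up-weight misclassified observations
    w <- w / sum(w)
  }
  list(stumps = stumps, alpha = alpha) #final classifier: sign(sum_m alpha_m * Gm(x))
}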
AdaBoost – Illustration
[Figure: Step 1 fits the first weak learner to the input data; scatterplot axes X1 vs. X2]
Initially all observations are assigned equal weight (1/N).
Observations that are misclassified in the ith iteration are given higher weights in the (i+1)th iteration.
Observations that are correctly classified in the ith iteration are given lower weights in the (i+1)th iteration.
AdaBoost – Illustration (continued)
[Figure: Steps 2 and 3, each refitting on the reweighted observations]
AdaBoost – Illustration (continued)
[Figure: the final ensemble/model combining the weak classifiers]
AdaBoost – Algorithm
[Figure: the AdaBoost.M1 algorithm, shown as an image on the original slide; a standard statement follows]
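For reference, the standard statement of AdaBoost.M1 (the formulation in Hastie, Tibshirani & Friedman, which matches the description two slides back):

1. Initialize the observation weights $w_i = 1/N$, $i = 1, \dots, N$.
2. For $m = 1$ to $M$:
   (a) Fit a classifier $G_m(x)$ to the training data using weights $w_i$.
   (b) Compute $\mathrm{err}_m = \sum_{i} w_i I(y_i \neq G_m(x_i)) / \sum_{i} w_i$.
   (c) Compute $\alpha_m = \log\big((1 - \mathrm{err}_m)/\mathrm{err}_m\big)$.
   (d) Set $w_i \leftarrow w_i \cdot \exp\big[\alpha_m I(y_i \neq G_m(x_i))\big]$.
3. Output $G(x) = \mathrm{sign}\big[\sum_{m=1}^{M} \alpha_m G_m(x)\big]$.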
Gradient Boosting Machines
 Generalizing AdaBoost to work with arbitrary loss functions resulted in GBM.
 Gradient Boosting = Gradient Descent + Boosting
 GBM uses gradient descent, which can optimize any differentiable loss function.
 In AdaBoost, “shortcomings” are identified by high-weight data points.
 In Gradient Boosting, “shortcomings” are identified by negative gradients (also called pseudo-residuals).
 In GBM, instead of the reweighting used in AdaBoost, each new tree is fit to the negative gradients of the current model.
 Each tree in GBM is a successive gradient descent step.
 AdaBoost is equivalent to forward stagewise additive modeling using the exponential loss function (a squared-loss sketch of the gradient view is given below).
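A minimal gradient-boosting sketch in R for squared loss (an illustration, assuming the Boston data; with squared loss the negative gradient is simply the residual, so each tree is fit to the current residuals):

library(rpart)
library(MASS) #Contains Boston dataframe
set.seed(1)
M <- 100 #number of trees
nu <- 0.1 #learning rate (shrinkage)
F_hat <- rep(mean(Boston$medv), nrow(Boston)) #initial constant model
trees <- vector("list", M)
for (m in 1:M) {
  r <- Boston$medv - F_hat #pseudo-residuals = negative gradient of squared loss
  d <- transform(Boston, r = r)
  trees[[m]] <- rpart(r ~ . - medv, data = d, control = rpart.control(maxdepth = 3))
  F_hat <- F_hat + nu * predict(trees[[m]], Boston) #shrunken gradient step
}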
Gradient Boosting – Algorithm
[Figure: the generic gradient tree boosting algorithm, shown as an image on the original slide; a standard statement follows]
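For reference, Friedman's gradient tree boosting algorithm in its standard form (as in Hastie, Tibshirani & Friedman):

1. Initialize $F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{N} L(y_i, \gamma)$.
2. For $m = 1$ to $M$:
   (a) Compute pseudo-residuals $r_{im} = -\left[\partial L(y_i, F(x_i)) / \partial F(x_i)\right]_{F = F_{m-1}}$.
   (b) Fit a regression tree to the $r_{im}$, giving terminal regions $R_{jm}$, $j = 1, \dots, J_m$.
   (c) Compute $\gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L(y_i, F_{m-1}(x_i) + \gamma)$ for each region.
   (d) Update $F_m(x) = F_{m-1}(x) + \sum_{j=1}^{J_m} \gamma_{jm} I(x \in R_{jm})$.
3. Output $\hat{f}(x) = F_M(x)$.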
GBM – Key Hyperparameters
 GBM has 3 types of hyperparameters (a package-level mapping is sketched after the list)
 Tree structure
 Max depth of the trees – Controls the degree of feature interactions
 Min samples per leaf – Minimum number of samples in a leaf node.
 Number of Trees
 Shrinkage
 Learning rate – Slows learning by shrinking tree predictions.
 Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly.
 Stochastic Gradient Boosting
 Subsample – Fit each tree on a random subset of the training set rather than the complete training data.
 Max features – Use a random subset of features for each tree.
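A sketch of how these hyperparameter types map onto the gbm package's arguments (an illustrative mapping; the values are arbitrary):

library(gbm)
library(MASS) #Contains Boston dataframe
set.seed(1)
fit <- gbm(medv ~ ., data = Boston,
           distribution = "gaussian",
           n.trees = 1000, #number of trees
           interaction.depth = 4, #tree structure: max depth
           n.minobsinnode = 10, #tree structure: min samples per leaf
           shrinkage = 0.01, #shrinkage: learning rate
           bag.fraction = 0.8) #stochastic GBM: subsample fraction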
Tree Ensembles – Interpretation
GBM – R Implementation
library(xgboost)
library(MASS) #Contains Boston dataframe
library(caret)
#Cross-validation
cv.ctrl <- trainControl(method = "repeatedcv", repeats = 2, number = 5, allowParallel = TRUE)
#Grid search
xgb.grid <- expand.grid(nrounds = 1000, eta = c(0.005, 0.01, 0.05, 0.1), max_depth = c(4, 5, 6, 7, 8))
set.seed(1860)
#Model training
xgb_tune <- train(medv ~ .,
                  data = Boston,
                  method = "xgbTree",
                  trControl = cv.ctrl,
                  tuneGrid = xgb.grid,
                  importance = TRUE,
                  subsample = 0.8)
#Cross Validation results
xgb_tune
plot(xgb_tune)
#Variable Importance
plot(varImp(xgb_tune), top = 10)
End
Questions?