Deep Dive into Hyperparameter Tuning
About Me
Shubhmay Potdar
Sr. Software Engineer @ eQ-Technologic
Contents
1. Introduction to Hyperparameter Tuning
2. Grid and Random Search
3. Sobol Sequences
4. Introduction to Sequential Model-Based Optimization (SMBO)
a. Bayesian Optimization
b. Tree-structured Parzen Estimator (TPE)
5. Evolutionary Algorithms: CMA-ES
6. Particle Based Methods: Particle Swarm Optimization
7. Multi Fidelity Methods: Successive Halving and HyperBand
8. Libraries and Services for Hyperparameter Tuning
9. Future Scope for Research
Hyperparameters
What are hyperparameters?
In machine learning, hyperparameters are the
configuration settings assigned to the
learning algorithm before training; their values are not
learned from the data during training.
1. Depth of tree ( Decision Tree)
2. No. of trees (Random Forest)
3. Regularization Parameters (XGBoost)
4. No. of layers (Deep Neural Network)
Why are they required?
Good combinations are likely to give the best
results.
They define the complexity, capacity to learn, and structure of
the model.
Choosing suitable values helps reduce
the risk of overfitting and underfitting.
Exploration Problem
Hyperparameter tuning
can be seen as an
exploration problem
The true structure of the
underlying function is
unknown
The aim is to explore as
many regions as possible
within some constraints
Four Steps in Hyperparameter Tuning
1. Objective function: what we want to minimize, in this case the validation error of a machine learning model with respect to the hyperparameters
2. Domain space: hyperparameter values to search over
3. Optimization algorithm: method for constructing the surrogate model and choosing the next hyperparameter values to evaluate
4. Result history: stored outcomes from evaluations of the objective function, consisting of the hyperparameters and validation loss
Grid Search
❖ Select values for each hyperparameter
to test and try all combinations
❖ Expensive to evaluate all combinations
Bergstra, James and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13 (2012): 281-305.
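A minimal sketch of grid search in Python. The function `validation_error` is a hypothetical stand-in for training and validating a real model, and the hyperparameter names and values are illustrative assumptions only.

```python
from itertools import product

# Hypothetical grid: every listed value of every hyperparameter is tried.
grid = {
    "max_depth": [2, 4, 8],
    "n_estimators": [50, 100],
    "learning_rate": [0.01, 0.1],
}

def validation_error(params):
    # Toy stand-in for "train the model, return validation error".
    return (abs(params["max_depth"] - 4)
            + abs(params["n_estimators"] - 100) / 100
            + abs(params["learning_rate"] - 0.1))

names = list(grid)
# Cartesian product of all value lists: 3 * 2 * 2 = 12 combinations.
candidates = [dict(zip(names, values)) for values in product(*grid.values())]
best = min(candidates, key=validation_error)
print(len(candidates), best)
```

The number of combinations is the product of the list lengths, which is why the cost grows quickly as hyperparameters are added.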
Random Search
❖ Select values randomly for every
hyperparameter
❖ Evaluations are independent and can be
run in parallel
❖ Specify distribution of parameters for
effective sampling
Bergstra, James and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13 (2012): 281-305.
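A minimal sketch of random search, showing how a distribution is specified per hyperparameter. The parameter names, ranges, and the toy `validation_error` are assumptions made for illustration; the learning rate is drawn log-uniformly because it usually spans several orders of magnitude.

```python
import random

def sample_config(rng):
    # One independent draw per hyperparameter, each from its own distribution.
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform on [1e-4, 1e-1]
        "max_depth": rng.randint(2, 10),             # uniform integer, inclusive
        "subsample": rng.uniform(0.5, 1.0),          # uniform float
    }

def validation_error(params):
    # Toy stand-in for training and validating a model.
    return abs(params["learning_rate"] - 0.01) + abs(params["max_depth"] - 6) / 10

rng = random.Random(42)
trials = [sample_config(rng) for _ in range(20)]
# Trials do not depend on each other, so this loop could be distributed.
results = sorted(((validation_error(c), c) for c in trials), key=lambda t: t[0])
best_error, best_config = results[0]
print(best_error, best_config)
```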
Sobol Sequences
A Sobol sequence is a low-discrepancy
quasi-random sequence
Sobol sequences were designed to cover the
unit hypercube with lower discrepancy than
completely random sampling
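A small sketch of how the first two Sobol dimensions can be generated with the Gray-code construction, written in plain Python to keep it self-contained. It assumes the standard direction numbers for dimensions 1 and 2 only; in practice a library implementation would normally be used. The mapping to a learning rate and a tree depth at the end is a hypothetical example.

```python
BITS = 30  # fixed-point resolution of the generated coordinates

def direction_numbers(dim):
    # m_k = 1 for the first dimension; m_k = 2*m_{k-1} XOR m_{k-1} for the second.
    m = [1]
    for _ in range(1, BITS):
        m.append(1 if dim == 0 else (2 * m[-1]) ^ m[-1])
    # Store v_k = m_k / 2^k as integers scaled by 2^BITS.
    return [m[k] << (BITS - 1 - k) for k in range(BITS)]

def sobol_2d(n):
    v = [direction_numbers(0), direction_numbers(1)]
    x = [0, 0]
    points = []
    for i in range(n):
        points.append((x[0] / 2 ** BITS, x[1] / 2 ** BITS))
        # Gray-code step: use the position of the lowest zero bit of i.
        c = 0
        while (i >> c) & 1:
            c += 1
        x = [x[d] ^ v[d][c] for d in range(2)]
    return points

points = sobol_2d(16)
# Map the unit square onto a hypothetical search space.
configs = [{"learning_rate": 10 ** (-4 + 3 * u), "max_depth": 2 + int(9 * w)}
           for u, w in points]
print(points[:4])
```

Each coordinate of the first 16 points lands in a different sixteenth of the unit interval, which is the even coverage that plain random sampling does not guarantee.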
Preview SMBO
Can we do better than grid and random search?
Can we take a guided tour in our search for optimal
parameters?
We know that the cost of evaluating our training algorithm
is large in most cases
And we have no guarantee that a given set of
parameters will give the optimal solution
https://pixabay.com/en/light-bulb-ideas-sketch-i-think-487859/
Bayesian Optimization
Bayesian optimization is a framework
that is useful in the following scenarios:
❖ The objective function has no
closed form
❖ There is no access to gradients
❖ Evaluations are noisy
❖ The objective is expensive to evaluate
Bayesian Optimization - Main Components
Surrogate function:
Approximates the objective function; the next
point is chosen by optimizing an acquisition
function computed from it
Common choices are Gaussian Process,
Random Forest, Gradient Boosted
Machines
Acquisition function:
Helps to select the next point for evaluation
Trades off exploring unknown
regions against exploiting known regions
Common choices are Expected
Improvement, Upper Confidence Bound,
Probability of Improvement, Thompson
Sampling etc.
Bayesian Optimization - Algorithm
Gaussian Process
Expected Improvement
f∗ - current best (lowest) observed value
Improvement over f∗ if we sample a point x with outcome Y: I(x) = max(f∗ − Y, 0)
If f is modelled using a GP, the expectation of I(x) has a closed form in terms of ϕ and Φ, the PDF and CDF of the standard normal
distribution, respectively
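For reference, the standard closed form for minimization is EI(x) = (f∗ − μ(x)) Φ(Z) + σ(x) ϕ(Z) with Z = (f∗ − μ(x)) / σ(x), where μ(x) and σ(x) denote the GP posterior mean and standard deviation at x (these symbols are introduced here, not taken from the slide). A minimal sketch using only the standard library:

```python
import math

def norm_pdf(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # Standard normal cumulative distribution via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, f_best):
    """EI for minimization, given posterior mean mu and std sigma at a point."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm_cdf(z) + sigma * norm_pdf(z)

# A point predicted worse than f_best still has positive EI if it is uncertain.
print(expected_improvement(mu=5.0, sigma=1.0, f_best=3.0))
```

The first term rewards points whose predicted mean is already below f∗ (exploitation); the second rewards points with high uncertainty (exploration).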
Challenges
How to design a surrogate function that models
the objective function and is also cheap to
evaluate
How to design an acquisition function that
balances exploration and
exploitation
https://pixabay.com/en/overcoming-stone-roll-slide-strong-2127669/
Drawbacks
❖ Fitting a GP costs O(n^3) in the number of observations n
❖ The GP has hyperparameters of its own
❖ Difficult to parallelize
❖ Can get stuck in local minima
Tree-structured Parzen Estimator (TPE)
TPE explores more in the
regions where a high
share of the good values were found
during earlier exploration.
Algorithm
❖ Sample N candidates at random and evaluate the model
❖ Divide the N candidates into two groups
➢ Group 1 - the best observations
➢ Group 2 - all the rest
❖ Estimate the densities of both groups using a Parzen
window density estimator
❖ Use Expected Improvement as the acquisition function
❖ Draw M samples from group 1
❖ Score the M samples by l(x)/g(x), which ranks them the same way as EI (where l(x) is the
density of x under the first group and g(x) is the
density of x under the second group)
❖ Evaluate the model where the score is maximum
❖ Repeat from the second step until the iteration budget is exhausted
Source: http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html
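The loop above can be sketched for a single real-valued hyperparameter as follows. This is a simplified illustration, not the implementation used by any library: it assumes Gaussian kernels with a fixed bandwidth, a 25% share of "best" observations, and a toy objective with its minimum at 0.3.

```python
import math
import random

def parzen_density(x, centers, bandwidth):
    # Gaussian-kernel (Parzen window) density estimate at x.
    norm = len(centers) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers) / norm

def tpe_suggest(history, rng, gamma=0.25, n_candidates=24, bandwidth=0.1):
    ordered = sorted(history, key=lambda t: t[1])
    n_good = max(1, int(gamma * len(ordered)))
    good = [x for x, _ in ordered[:n_good]]   # group 1: best observations
    bad = [x for x, _ in ordered[n_good:]]    # group 2: all the rest
    # Draw candidates from l(x): pick a good point and perturb it, clamped to [0, 1].
    candidates = [min(1.0, max(0.0, rng.gauss(rng.choice(good), bandwidth)))
                  for _ in range(n_candidates)]
    # Keep the candidate with the largest ratio l(x) / g(x).
    return max(candidates,
               key=lambda x: parzen_density(x, good, bandwidth)
               / (parzen_density(x, bad, bandwidth) + 1e-12))

def objective(x):
    return (x - 0.3) ** 2  # toy validation loss

rng = random.Random(0)
history = [(x, objective(x)) for x in (rng.uniform(0.0, 1.0) for _ in range(10))]
initial_best = min(loss for _, loss in history)
for _ in range(30):
    x = tpe_suggest(history, rng)
    history.append((x, objective(x)))
best_x, best_loss = min(history, key=lambda t: t[1])
print(best_x, best_loss)
```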
TPE - Algorithm
Source: http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html
Evolutionary Algorithm
❖ Evaluate the objective function at
certain points
❖ Based on the fitness results of the
current solutions, produce the next
generation of candidate solutions
that is more likely to produce even
better results than the current
generation
❖ The iterative process will stop once
the best known solution is
satisfactory for the user
Source: http://blog.otoro.net/2017/10/29/visual-evolution-strategies/
Algorithm
1. Start with N candidates
2. Calculate the fitness score of each
candidate solution
3. Isolate the best 25% of the population in
the current generation
4. Using only the best solutions, along with
the mean μ(g) of the current generation,
calculate the covariance matrix C(g+1) of
the next generation
5. Sample a new set of candidate solutions
using the updated mean μ(g+1) and
covariance matrix C(g+1)
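A two-dimensional sketch of the simplified loop described in these steps. It is deliberately not full CMA-ES, which also adapts a global step size and uses evolution paths and weighted recombination. The population size, number of generations, and the toy objective are assumptions made for illustration.

```python
import math
import random

def sample_gaussian_2d(mean, cov, rng):
    # Draw from N(mean, cov) using a hand-written 2x2 Cholesky factor.
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    l11 = math.sqrt(max(a, 1e-12))
    l21 = b / l11
    l22 = math.sqrt(max(c - l21 * l21, 1e-12))
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)

def simplified_es(objective, mean, sigma, rng, pop_size=40, generations=30):
    cov = [[sigma ** 2, 0.0], [0.0, sigma ** 2]]
    for _ in range(generations):
        population = [sample_gaussian_2d(mean, cov, rng) for _ in range(pop_size)]
        population.sort(key=objective)        # lower objective = fitter
        elite = population[: pop_size // 4]   # best 25%
        # Covariance of the elite around the mean of the *current* generation.
        cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in elite) / len(elite)
                for j in range(2)] for i in range(2)]
        # Updated mean for the next generation.
        mean = tuple(sum(p[i] for p in elite) / len(elite) for i in range(2))
    return mean

def objective(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2  # toy loss, minimum at (1, -2)

final_mean = simplified_es(objective, (0.0, 0.0), 1.0, random.Random(0))
print(final_mean, objective(final_mean))
```

Measuring the elite's spread around the old mean, rather than around its own mean, stretches the search distribution in the direction of recent progress.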
CMA-ES
Schaffer-2D Function Rastrigin-2D Function
Source: http://blog.otoro.net/2017/10/29/visual-evolution-strategies/
Particle Swarm Optimization
❖ heuristic optimization technique
❖ simulates a set of particles that are moving around in the search space
❖ for hyperparameter search, position of a particle represents a set of
hyperparameters and its movement is influenced by the goodness of the
objective function value
Particle Swarm Optimization Algorithm
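A minimal sketch of the commonly used global-best PSO update, in which each particle's velocity is pulled toward its own best position and the swarm's best position. The inertia weight `w`, the coefficients `c1` and `c2`, and the toy objective are assumed values, not taken from the slides.

```python
import random

def pso(objective, bounds, rng, n_particles=20, iterations=50, w=0.7, c1=1.5, c2=1.5):
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move and clamp to the search domain.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(p):
    return sum(x * x for x in p)  # toy objective, minimum at the origin

best_pos, best_val = pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)], random.Random(0))
print(best_pos, best_val)
```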
Particle Swarm Optimization
Source: https://pyswarms.readthedocs.io/en/latest/examples/visualization.html
Multi-Fidelity Optimization
❖ The idea is to replace full
evaluation with cheap
approximations
➢ using a subset of the data
➢ cross-validation on a few folds
➢ a few iterations of the algorithm
❖ Reject the significantly worse
performing configurations
Hyperband
❖ Employs a pure exploration approach
❖ The idea is to try a large number of
random configurations
❖ By spending computation more efficiently, it can try
more hyperparameter configurations
❖ Most machine learning algorithms are
iterative
❖ If we are running a set of parameters and
the progress looks terrible, it might be a
good idea to quit and just try a new set of
hyperparameters
Successive Halving
❖ One way to implement such a scheme is
called successive halving
❖ First try out N hyperparameter settings for
some fixed amount of time T
❖ Keep the N/2 best performing settings
and run them for time 2T
❖ Repeating this procedure log2(M) times,
we end up with N/M configurations run
for time MT
Source: https://pdfs.semanticscholar.org/2442/ad6a385b9bcfcdca09b28e74b122eba8fdac.pdf
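A minimal sketch of successive halving. The function `evaluate(config, budget)` is a hypothetical stand-in for training a configuration for a given budget and returning its validation loss; here it is a toy, deterministic function whose loss improves with budget.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations each round, multiplying the budget by eta."""
    survivors = list(configs)
    budget = min_budget
    schedule = []  # (number of configurations, budget per configuration) per round
    while len(survivors) > 1:
        schedule.append((len(survivors), budget))
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0], schedule

def evaluate(config, budget):
    # Toy loss: distance of a learning rate from 0.1, plus a term that shrinks with budget.
    return (config - 0.1) ** 2 + 1.0 / budget

configs = [0.02 * i for i in range(16)]  # 16 hypothetical learning rates
winner, schedule = successive_halving(configs, evaluate)
print(winner, schedule)
```

With N = 16 and T = 1, the rounds run 16, 8, 4, and 2 configurations at budgets 1, 2, 4, and 8.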
max_iter = 81
eta = 3
B = 5*max_iter

       S = 4       S = 3       S = 2       S = 1       S = 0
 i   n_i  r_i    n_i  r_i    n_i  r_i    n_i  r_i    n_i  r_i
 0    81    1     27    3      9    9      6   27      5   81
 1    27    3      9    9      3   27      2   81
 2     9    9      3   27      1   81
 3     3   27      1   81
 4     1   81
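The bracket schedule above can be reproduced with a few lines of integer arithmetic. This sketch floors the per-bracket budget ratio before scaling, which is the rounding that yields these particular n_i values; other formulations of Hyperband round differently and can give larger n_i for some brackets. The function name is hypothetical.

```python
def hyperband_schedule(max_iter=81, eta=3):
    # Largest s with eta**s <= max_iter, computed without floating point.
    s_max = 0
    while eta ** (s_max + 1) <= max_iter:
        s_max += 1
    total_budget = (s_max + 1) * max_iter  # B = 5 * max_iter for the defaults
    schedule = {}
    for s in range(s_max, -1, -1):
        # Initial number of configurations and initial resource for bracket s.
        n = (total_budget // max_iter // (s + 1)) * eta ** s
        r = max_iter // eta ** s
        # Successive halving inside the bracket: (n_i, r_i) for each rung i.
        schedule[s] = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
    return schedule

schedule = hyperband_schedule()
for s, rungs in schedule.items():
    print("S =", s, rungs)
```

Bracket S = 4 starts many configurations on a tiny budget, while S = 0 is plain random search at the full budget, so the brackets hedge between aggressive and conservative early stopping.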
Suggestions
If all hyperparameters are real-valued and one can only
afford a few dozen function evaluations, we recommend the
use of Gaussian process-based Bayesian optimization
For large and conditional configuration spaces we suggest
either the random forest-based SMAC or TPE due to their
proven strong performance
For purely real-valued spaces and relatively cheap objective
functions, for which we can afford more than hundreds of
evaluations, use CMA-ES
Libraries
Optunity - https://optunity.readthedocs.io/en/latest/
DEAP - https://github.com/DEAP/deap
SMAC3 - https://github.com/automl/SMAC3
Tune - https://ray.readthedocs.io/en/latest/tune.html
GPyOpt - https://sheffieldml.github.io/GPyOpt/
Scikit-optimize - https://scikit-optimize.github.io/
Hyperopt - https://github.com/hyperopt/hyperopt
Hyperband - https://github.com/zygmuntz/hyperband
Thanks

More Related Content

What's hot

Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for ClassificationPrakash Pimpale
 
Feature selection
Feature selectionFeature selection
Feature selectionDong Guo
 
Hyperparameter Optimization with Hyperband Algorithm
Hyperparameter Optimization with Hyperband AlgorithmHyperparameter Optimization with Hyperband Algorithm
Hyperparameter Optimization with Hyperband AlgorithmDeep Learning Italia
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentationAyanaRukasar
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learningHaris Jamil
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesPier Luca Lanzi
 
House Price Prediction.pptx
House Price Prediction.pptxHouse Price Prediction.pptx
House Price Prediction.pptxCodingWorld5
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent methodSanghyuk Chun
 
Ensemble learning Techniques
Ensemble learning TechniquesEnsemble learning Techniques
Ensemble learning TechniquesBabu Priyavrat
 
Stochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxStochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxShubham Jaybhaye
 
Linear regression
Linear regressionLinear regression
Linear regressionMartinHogg9
 
Understanding Bagging and Boosting
Understanding Bagging and BoostingUnderstanding Bagging and Boosting
Understanding Bagging and BoostingMohit Rajput
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionalityNikhil Sharma
 

What's hot (20)

Support Vector Machines for Classification
Support Vector Machines for ClassificationSupport Vector Machines for Classification
Support Vector Machines for Classification
 
Regularization
RegularizationRegularization
Regularization
 
Feature selection
Feature selectionFeature selection
Feature selection
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Hyperparameter Optimization with Hyperband Algorithm
Hyperparameter Optimization with Hyperband AlgorithmHyperparameter Optimization with Hyperband Algorithm
Hyperparameter Optimization with Hyperband Algorithm
 
Xgboost
XgboostXgboost
Xgboost
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers Ensembles
 
House Price Prediction.pptx
House Price Prediction.pptxHouse Price Prediction.pptx
House Price Prediction.pptx
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Ensemble learning Techniques
Ensemble learning TechniquesEnsemble learning Techniques
Ensemble learning Techniques
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
Stochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxStochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptx
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Understanding Bagging and Boosting
Understanding Bagging and BoostingUnderstanding Bagging and Boosting
Understanding Bagging and Boosting
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionality
 

Similar to Deep Dive into Hyperparameter Tuning

Meta Machine Learning: Hyperparameter Optimization
Meta Machine Learning: Hyperparameter OptimizationMeta Machine Learning: Hyperparameter Optimization
Meta Machine Learning: Hyperparameter OptimizationPriyatham Bollimpalli
 
Deep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyDeep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyJason Tsai
 
Advanced Hyperparameter Optimization for Deep Learning with MLflow
Advanced Hyperparameter Optimization for Deep Learning with MLflowAdvanced Hyperparameter Optimization for Deep Learning with MLflow
Advanced Hyperparameter Optimization for Deep Learning with MLflowDatabricks
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? HackerEarth
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016MLconf
 
Towards automating machine learning: benchmarking tools for hyperparameter tu...
Towards automating machine learning: benchmarking tools for hyperparameter tu...Towards automating machine learning: benchmarking tools for hyperparameter tu...
Towards automating machine learning: benchmarking tools for hyperparameter tu...PyData
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16MLconf
 
Ensemble hybrid learning technique
Ensemble hybrid learning techniqueEnsemble hybrid learning technique
Ensemble hybrid learning techniqueDishaSinha9
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.pptyang947066
 
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Jisu Han
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
Cutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneCutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneXiaoweiJiang7
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsGIScRG
 
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OStrata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OSri Ambati
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudSigOpt
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural NetworksDatabricks
 

Similar to Deep Dive into Hyperparameter Tuning (20)

Meta Machine Learning: Hyperparameter Optimization
Meta Machine Learning: Hyperparameter OptimizationMeta Machine Learning: Hyperparameter Optimization
Meta Machine Learning: Hyperparameter Optimization
 
Deep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical MethodologyDeep Learning: Chapter 11 Practical Methodology
Deep Learning: Chapter 11 Practical Methodology
 
Advanced Hyperparameter Optimization for Deep Learning with MLflow
Advanced Hyperparameter Optimization for Deep Learning with MLflowAdvanced Hyperparameter Optimization for Deep Learning with MLflow
Advanced Hyperparameter Optimization for Deep Learning with MLflow
 
How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ? How to Win Machine Learning Competitions ?
How to Win Machine Learning Competitions ?
 
presentation.ppt
presentation.pptpresentation.ppt
presentation.ppt
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
 
Towards automating machine learning: benchmarking tools for hyperparameter tu...
Towards automating machine learning: benchmarking tools for hyperparameter tu...Towards automating machine learning: benchmarking tools for hyperparameter tu...
Towards automating machine learning: benchmarking tools for hyperparameter tu...
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
Dr. Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf SEA - 5/20/16
 
ML_in_QM_JC_02-10-18
ML_in_QM_JC_02-10-18ML_in_QM_JC_02-10-18
ML_in_QM_JC_02-10-18
 
AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)AutoML lectures (ACDL 2019)
AutoML lectures (ACDL 2019)
 
Ensemble hybrid learning technique
Ensemble hybrid learning techniqueEnsemble hybrid learning technique
Ensemble hybrid learning technique
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Cutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tuneCutting edge hyperparameter tuning made simple with ray tune
Cutting edge hyperparameter tuning made simple with ray tune
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
 
Strata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2OStrata San Jose 2016: Scalable Ensemble Learning with H2O
Strata San Jose 2016: Scalable Ensemble Learning with H2O
 
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana CloudUsing SigOpt to Tune Deep Learning Models with Nervana Cloud
Using SigOpt to Tune Deep Learning Models with Nervana Cloud
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 

Recently uploaded

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 

Recently uploaded (20)

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 🥵 Book Your One night Stand
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

Deep Dive into Hyperparameter Tuning

  • 2. About Me Shubhmay Potdar Sr. Software Engineer @ eQ-Technologic
  • 3. Contents 1. Introduction to Hyperparameter Tuning 2. Grid and Random Search 3. Sobol Sequences 4. Introduction to Sequential based Model Optimization a. Bayesian Optimization b. Tree of Parzen Estimator 5. Evolutionary Algorithms: CMA-ES 6. Particle Based Methods: Particle Swarm Optimization 7. Multi Fidelity Methods: Successive Halving and HyperBand 8. Libraries and Services for Hyperparameter Tuning 9. Future Scope for Research
  • 4. Hyperparameters What are hyperparameters? In machine learning, hyperparameters are a set of configurations that are assigned to the learning algorithm and whose values cannot be estimated from data. 1. Depth of tree (Decision Tree) 2. No. of trees (Random Forest) 3. Regularization Parameters (XGBoost) 4. No. of layers (Deep Neural Network) Why are they required? Good combinations are likely to give the best results. They define the complexity, ability to learn, and structure of the model. Choosing suitable values helps to reduce the chances of overfitting and underfitting.
  • 5. Exploration Problem Hyperparameter tuning can be seen as an exploration problem. The true structure of the underlying function is unknown. The aim is to explore as many regions as possible within some constraints.
  • 6. Four Steps in Hyperparameter Tuning 1. Objective Function: what we want to minimize, in this case the validation error of a machine learning model with respect to the hyperparameters 2. Domain Space: hyperparameter values to search over 3. Optimization algorithm: method for constructing the surrogate model and choosing the next hyperparameter values to evaluate 4. Result history: stored outcomes from evaluations of the objective function consisting of the hyperparameters and validation loss
  • 7. Grid Search ❖ Select values for each hyperparameter to test and try all combinations ❖ Expensive to evaluate all combinations Bergstra, James and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13 (2012): 281-305.
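A minimal sketch of grid search as described on this slide, assuming a toy objective. The function `toy_validation_error` and the hyperparameter names and values are hypothetical stand-ins for a real train-and-validate routine; the evaluation count shows why the cost grows multiplicatively with every added hyperparameter.

```python
from itertools import product


def grid_search(objective, grid):
    """Evaluate every combination in `grid` and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    n_evals = 0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)  # one (expensive) model evaluation
        n_evals += 1
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evals


def toy_validation_error(params):
    # Hypothetical stand-in for training a model and measuring validation error.
    return (params["max_depth"] - 5) ** 2 + (params["learning_rate"] - 0.1) ** 2


grid = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 1.0]}
best_params, best_score, n_evals = grid_search(toy_validation_error, grid)
print(best_params, best_score, n_evals)
```

With three values for each of two hyperparameters the loop runs 3 × 3 = 9 evaluations; a third hyperparameter with three values would make it 27.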
  • 8. Random Search ❖ Select values randomly for every hyperparameter ❖ Evaluations are independent, so they can be run in parallel ❖ Specify a distribution for each parameter for effective sampling Bergstra, James and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” Journal of Machine Learning Research 13 (2012): 281-305.
  • 9. Sobol Sequences A Sobol sequence is a low-discrepancy quasi-random sequence. Sobol sequences were designed to cover the unit hypercube with lower discrepancy than completely random sampling.
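A small sketch of random search, under the assumption that the learning rate is sampled log-uniformly and the tree depth uniformly over integers. The ranges, parameter names, and `toy_validation_error` are illustrative choices, not taken from the slides.

```python
import random


def sample_params(rng):
    """Draw one configuration from the (assumed) search distributions."""
    return {
        # Log-uniform: every decade between 1e-4 and 1 is equally likely.
        "learning_rate": 10 ** rng.uniform(-4, 0),
        # Uniform over the integers 2..10.
        "max_depth": rng.randint(2, 10),
    }


def random_search(objective, n_trials, seed=0):
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = sample_params(rng)
        # Each trial is independent of the others, so this loop could be
        # distributed across workers without any coordination.
        trials.append((params, objective(params)))
    best_params, best_score = min(trials, key=lambda t: t[1])
    return best_params, best_score, trials


def toy_validation_error(params):
    # Hypothetical stand-in for training a model and measuring validation error.
    return (params["max_depth"] - 5) ** 2 + (params["learning_rate"] - 0.1) ** 2


best_params, best_score, trials = random_search(toy_validation_error, n_trials=20)
print(best_params, best_score)
```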
  • 10. Preview SMBO Can we do better than grid and random search ? Can we have a guided tour in our journey for finding optimal parameters ? We know that the cost of evaluation of our training algorithm is significantly large in most cases And obviously we are not guaranteed that the given set of parameters will give the optimal solution https://pixabay.com/en/light-bulb-ideas-sketch-i-think-487859/
  • 11. Bayesian Optimization Bayesian optimization is a framework that is useful in the following scenarios: ❖ The objective function has no closed form ❖ There is no access to gradients ❖ Observations are noisy ❖ The objective is expensive to evaluate
  • 12. Bayesian Optimization - Main Components Surrogate Function: Approximates the objective function; the next point is chosen by optimizing an acquisition function computed from it. Common choices are Gaussian Process, Random Forest, Gradient Boosted Machines. Acquisition function: Helps to select the next point for evaluation by trading off exploring unknown regions against exploiting known regions. Common choices are Expected Improvement, Upper Confidence Bound, Probability of Improvement, Thompson Sampling etc.
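  • 15. Expected Improvement f∗ - best value observed so far. Quantify the improvement over f∗ if we sample a point x: I(x) = max(f∗ − Y, 0), where Y is the (random) objective value at x. If f is modelled using a GP with posterior mean μ(x) and standard deviation σ(x), then EI(x) = E[I(x)] = (f∗ − μ(x)) Φ(Z) + σ(x) ϕ(Z) with Z = (f∗ − μ(x)) / σ(x), where ϕ, Φ are the PDF, CDF of the standard normal distribution, respectively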
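The improvement I(x) = max(f∗ − Y, 0) has a well-known closed-form expectation when Y is Gaussian. The sketch below assumes minimization and a surrogate that reports a posterior mean and standard deviation at x (called `mu` and `sigma` here; these names are not from the slides), and evaluates EI = (f∗ − mu) Φ(z) + sigma ϕ(z) with z = (f∗ − mu) / sigma.

```python
import math


def normal_pdf(z):
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best):
    """Expected improvement over f_best (minimization) for Y ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


# A point predicted to be slightly worse than f_best can still be worth
# evaluating when the surrogate is uncertain about it.
print(expected_improvement(mu=0.6, sigma=0.05, f_best=0.5))
print(expected_improvement(mu=0.6, sigma=0.50, f_best=0.5))
```

The second call returns a larger value than the first, which is the exploration side of the trade-off: the same predicted mean becomes more attractive as the uncertainty grows.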
  • 16. Challenges How to design a surrogate function that models the objective function and is also cheap to evaluate. How to design an acquisition function that guarantees a trade-off between exploration and exploitation. https://pixabay.com/en/overcoming-stone-roll-slide-strong-2127669/
  • 17. Drawbacks ❖ Complexity of GP is O(n^3) in the number of observations n ❖ The GP has hyperparameters of its own ❖ Difficult to parallelize ❖ Can get stuck in local minima
  • 18. Tree-structured Parzen Estimator We tend to explore more in the regions where a high percentage of the best values were found during earlier exploration.
  • 19. Algorithm ❖ Sample N candidates at random and evaluate the model ❖ Divide the N candidates into two groups ➢ Group 1 - contains the best observations ➢ Group 2 - all the rest ❖ Estimate the densities of both groups using a Parzen window density estimator ❖ Use Expected Improvement as the acquisition function ❖ Draw M samples from group 1 ❖ Score each of the M samples by the ratio l(x)/g(x), which is largest where EI is largest (l(x) is the density of the first group and g(x) is the density of the second group) ❖ Evaluate the model where this score is maximum ❖ Repeat from step 2 until the iteration budget is exhausted Source: http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html
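The loop on this slide can be sketched for a single real-valued hyperparameter as follows. This is a simplified illustration, not the Hyperopt implementation: the fixed kernel bandwidth, the `gamma` split fraction, and the toy objective are all assumptions made for the example.

```python
import math
import random


def parzen_density(x, points, bandwidth):
    """Gaussian-kernel (Parzen window) density estimate at x."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm


def tpe_minimize(objective, low, high, n_init=10, n_iter=30,
                 n_candidates=24, gamma=0.25, seed=0):
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # assumed fixed kernel width
    # Sample initial candidates at random and evaluate them.
    history = [(x, objective(x))
               for x in (rng.uniform(low, high) for _ in range(n_init))]
    for _ in range(n_iter):
        # Split observations into the best fraction and the rest.
        history.sort(key=lambda t: t[1])
        n_good = max(1, int(gamma * len(history)))
        good = [x for x, _ in history[:n_good]]
        rest = [x for x, _ in history[n_good:]]
        # Draw candidates from the "good" density l(x): pick a good point
        # and perturb it with the kernel, clipping to the search range.
        candidates = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                      for _ in range(n_candidates)]

        # Score candidates by l(x) / g(x) and evaluate the best one.
        def ratio(x):
            return (parzen_density(x, good, bandwidth)
                    / (parzen_density(x, rest, bandwidth) + 1e-12))

        x_next = max(candidates, key=ratio)
        history.append((x_next, objective(x_next)))
    best_x, best_y = min(history, key=lambda t: t[1])
    return best_x, best_y, history


def toy_objective(x):
    # Hypothetical validation error with its minimum at x = 2.
    return (x - 2.0) ** 2


best_x, best_y, history = tpe_minimize(toy_objective, low=-5.0, high=5.0)
print(best_x, best_y)
```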
  • 20. TPE - Algorithm Source: http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html
  • 21. Evolutionary Algorithm ❖ Evaluate the objective function at certain points ❖ Based on the fitness results of the current solutions, produce the next generation of candidate solutions that is more likely to produce even better results than the current generation ❖ The iterative process will stop once the best known solution is satisfactory for the user Source: http://blog.otoro.net/2017/10/29/visual-evolution-strategies/
  • 22. Algorithm 1. Start with N candidates 2. Calculate the fitness score of each candidate solution 3. Isolate the best 25% of the population in the current generation 4. Using only the best solutions, along with the mean μ(g) of the current generation, calculate the covariance matrix C(g+1) of the next generation 5. Sample a new set of candidate solutions using the updated mean μ(g+1) and covariance matrix C(g+1)
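The steps on this slide can be sketched as a simplified covariance-adapting evolution strategy. Note that this is not full CMA-ES, which additionally adapts a global step size and uses evolution paths; the population size, generation count, and toy objective are assumptions for illustration.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor L with L * L^T = a (a symmetric, PSD)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # Guard against tiny negative values from rounding.
                low[i][j] = math.sqrt(max(a[i][i] - s, 1e-12))
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_gaussian(mean, cov, rng):
    """Draw one sample from N(mean, cov)."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def simple_cma_es(objective, mean, sigma=1.0, pop_size=40, n_gen=30, seed=0):
    rng = random.Random(seed)
    dim = len(mean)
    mean = list(mean)
    cov = [[sigma ** 2 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    best_x, best_val = list(mean), objective(mean)
    for _ in range(n_gen):
        # Sample candidates from the current mean and covariance.
        population = [sample_gaussian(mean, cov, rng) for _ in range(pop_size)]
        # Score the candidates and keep the best 25%.
        population.sort(key=objective)
        elite = population[: max(2, pop_size // 4)]
        elite_val = objective(elite[0])
        if elite_val < best_val:
            best_x, best_val = elite[0], elite_val
        # Covariance of the elite around the CURRENT mean, then update the mean.
        cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in elite) / len(elite)
                for j in range(dim)] for i in range(dim)]
        mean = [sum(x[i] for x in elite) / len(elite) for i in range(dim)]
    return best_x, best_val


def toy_objective(x):
    # Hypothetical objective with its minimum at (3, -1).
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best_x, best_val = simple_cma_es(toy_objective, mean=[0.0, 0.0])
print(best_x, best_val)
```

Measuring the elite's spread around the old mean rather than their own mean is what stretches the search distribution in the direction of progress.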
  • 23. CMA-ES Schaffer-2D Function Rastrigin-2D Function Source: http://blog.otoro.net/2017/10/29/visual-evolution-strategies/
  • 24. Particle Swarm Optimization ❖ A heuristic optimization technique ❖ Simulates a set of particles that move around in the search space ❖ For hyperparameter search, the position of a particle represents a set of hyperparameters, and its movement is influenced by the objective function values found so far
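A minimal sketch of the standard particle swarm update, where each velocity mixes inertia, attraction to the particle's own best position, and attraction to the swarm's best position. The coefficient values, the bounds, and the toy objective (a position of log10 learning rate and tree depth) are assumptions for illustration.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iter=50,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: personal_val[i])
    global_best, global_val = personal_best[g][:], personal_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and keep it inside the search range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            val = objective(positions[i])
            if val < personal_val[i]:
                personal_best[i], personal_val[i] = positions[i][:], val
                if val < global_val:
                    global_best, global_val = positions[i][:], val
    return global_best, global_val


def toy_validation_error(position):
    # Hypothetical objective: position = (log10 learning rate, tree depth).
    log_lr, depth = position
    return (log_lr + 2.0) ** 2 + 0.1 * (depth - 6.0) ** 2


bounds = [(-4.0, 0.0), (2.0, 10.0)]
best_position, best_value = particle_swarm_minimize(toy_validation_error, bounds)
print(best_position, best_value)
```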
  • 26. Particle Swarm Optimization Source: https://pyswarms.readthedocs.io/en/latest/examples/visualization.html
  • 27. Multi-Fidelity Optimization ❖ The idea is to replace full evaluations with cheap approximations ➢ using a subset of the data ➢ cross-validation on a few folds ➢ a few iterations of the algorithm ❖ Reject significantly worse-performing configurations
  • 28. Hyperband ❖ Employs a pure exploration approach ❖ The idea is to try a large number of random configurations ❖ By computing more efficiently, it can try more hyperparameter configurations ❖ Most machine learning algorithms are iterative ❖ If we are running a set of parameters and the progress looks terrible, it might be a good idea to quit and just try a new set of hyperparameters
  • 29. Successive Halving ❖ One way to implement such a scheme is called successive halving ❖ First try out N hyperparameter settings for some fixed amount of time T ❖ Keep the N/2 best performing configurations and run them for time 2T ❖ Repeating this procedure log2(M) times, we end up with N/M configurations run for MT time Source: https://pdfs.semanticscholar.org/2442/ad6a385b9bcfcdca09b28e74b122eba8fdac.pdf
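A short sketch of successive halving as described on this slide: evaluate all configurations on a small budget, keep the better half, double the budget, and repeat. The `toy_evaluate` function, which pretends that the loss estimate improves with budget, and the candidate learning rates are hypothetical.

```python
def successive_halving(configs, evaluate, initial_budget=1):
    """Return the surviving config and the (n_configs, budget) of each round."""
    survivors = list(configs)
    budget = initial_budget
    rounds = []
    while len(survivors) > 1:
        rounds.append((len(survivors), budget))
        # Rank every surviving configuration at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the better half and give it twice the budget next round.
        survivors = ranked[: max(1, len(ranked) // 2)]
        budget *= 2
    return survivors[0], rounds


def toy_evaluate(config, budget):
    # Hypothetical loss: distance from a good learning rate plus a term
    # that shrinks as more budget (e.g. epochs) is spent.
    return (config["learning_rate"] - 0.1) ** 2 + 1.0 / budget


configs = [{"learning_rate": k / 100} for k in range(1, 17)]
winner, rounds = successive_halving(configs, toy_evaluate)
print(winner, rounds)
```

Starting from N = 16 configurations the rounds are (16, 1), (8, 2), (4, 4), (2, 8), so most of the budget per configuration is spent only on the few that survive.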
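  • 30. max_iter = 81, eta = 3, B = 5*max_iter. Each bracket s runs successive halving; the pairs are (n_i, r_i) = (number of configurations, resources per configuration) at each step i:
      s = 4: (81, 1), (27, 3), (9, 9), (3, 27), (1, 81)
      s = 3: (27, 3), (9, 9), (3, 27), (1, 81)
      s = 2: (9, 9), (3, 27), (1, 81)
      s = 1: (6, 27), (2, 81)
      s = 0: (5, 81)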
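The bracket schedule above for max_iter = 81 and eta = 3 can be reproduced with a few lines of integer arithmetic. This sketch assumes the floor-based rounding that yields exactly those numbers; other Hyperband implementations round the number of configurations per bracket differently, so treat the rounding as an assumption.

```python
def hyperband_schedule(max_iter=81, eta=3):
    """Return {s: [(n_i, r_i), ...]} for every Hyperband bracket s."""
    # Largest s such that eta**s <= max_iter (integer version of log_eta).
    s_max = 0
    while eta ** (s_max + 1) <= max_iter:
        s_max += 1
    total_budget = (s_max + 1) * max_iter  # B = 5 * max_iter when s_max = 4
    schedule = {}
    for s in range(s_max, -1, -1):
        # Initial number of configurations and initial resource per config.
        n = (total_budget // max_iter // (s + 1)) * eta ** s
        r = max_iter // eta ** s
        # Successive halving inside the bracket: keep 1/eta of the configs
        # and give each survivor eta times more resources.
        schedule[s] = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
    return schedule


for s, steps in hyperband_schedule().items():
    print(f"s = {s}: {steps}")
```

Bracket s = 4 is the most aggressive (many configurations, tiny initial budget), while s = 0 is plain random search with every configuration trained for the full max_iter.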
  • 31. Suggestions If all hyperparameters are real-valued and one can only afford a few dozen function evaluations, we recommend the use of Gaussian process-based Bayesian optimization. For large and conditional configuration spaces, we suggest either the random forest-based SMAC or TPE, due to their proven strong performance. For purely real-valued spaces and relatively cheap objective functions, for which we can afford more than hundreds of evaluations, use CMA-ES.
  • 32. Library Optunity - https://optunity.readthedocs.io/en/latest/ Deap - https://github.com/DEAP/deap Smac3 - https://github.com/automl/SMAC3 Tune - https://ray.readthedocs.io/en/latest/tune.html GPyOpt - https://sheffieldml.github.io/GPyOpt/ Scikit-optimize - https://scikit-optimize.github.io/ Hyperopt - https://github.com/hyperopt/hyperopt Hyperband - https://github.com/zygmuntz/hyperband