
Automating Machine Learning - Is it feasible?

Facing a machine learning problem for the first time can be overwhelming. Hundreds of methods exist for tackling problems such as classification, regression or clustering. Selecting the appropriate method is challenging, especially when little prior knowledge is available. In addition, most models require a number of hyperparameters to be optimised to perform well. Preparing the data for the learning algorithm is also a labour-intensive process that includes cleaning outliers and imperfections, feature selection, data transformations such as PCA, and more. A workflow connecting preprocessing methods and predictive models is called a multicomponent predictive system (MCPS). This talk introduces the problem of automating the composition and optimisation of MCPSs, and shows how they can be adapted to changing environments.

Published in: Science
  1. Automating Machine Learning: Is it feasible? Manuel Martin Salvador, Smart Technology Research Group, Bournemouth University. June 2nd, 2016.
  2. Index: 1. Recent life-changing applications of Machine Learning; 2. Multicomponent Predictive Systems (MCPS); 3. Automating the composition and optimisation of MCPS; 4. Adapting MCPS to changing environments; 5. Conclusion and future work.
  3. Recent life-changing applications of Machine Learning
  4. Gene Discovery. A mutation in NR1H3 protein can trigger Multiple Sclerosis (Dessa Sadovnick and Carles Vilariño-Güell, University of British Columbia). Source: http://msgeneticslab.med.ubc.ca/gene-discovery/
  5. Microsoft Seeing AI. Source: https://www.youtube.com/watch?v=R2mC-NUAmMk
  6. Autonomous Vehicles. Source: https://www.youtube.com/watch?v=dk3oc1Hr62g
  7. Instant Translation. Source: https://www.skype.com/en/features/skype-translator/
  8. Multicomponent Predictive Systems
  9. Predictive Modelling: Labelled Data → Supervised Learning Algorithm → Predictive Model
  10. Classification and Regression
  11. Data is imperfect: missing values, noise, high dimensionality, outliers. Image credits: Question mark: http://commons.wikimedia.org/wiki/File:Question_mark_road_sign,_Australia.jpg; Noise: http://www.flickr.com/photos/benleto/3223155821/; Outliers: http://commons.wikimedia.org/wiki/File:Diagrama_de_caixa_com_outliers_and_whisker.png; 3D plot: http://salsahpc.indiana.edu/plotviz/
  12. Multicomponent Predictive System (MCPS): Data → Preprocessing → Predictive Model → Postprocessing → Predictions
  13. Multicomponent Predictive System (MCPS), extended diagram: Data, three Preprocessing components, three Predictive Models, Postprocessing, Predictions
  14. How to model MCPS? Options ranked by expressive power. Function composition, Y = h(g(f(X))): not enough for modelling parallel paths. Directed Acyclic Graph: not enough to model process state. Petri net: very flexible, with a robust mathematical background. (Diagrams: X → f → g → h → Y.)
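
To make the first option concrete, here is a minimal sketch of an MCPS written as plain function composition, Y = h(g(f(X))). The three steps (`f`, `g`, `h`) are hypothetical stand-ins chosen for illustration, not components from the talk. The chain has exactly one path from input to output, which is why this representation cannot express parallel branches.

```python
from functools import reduce
from typing import Callable


def compose(*steps: Callable) -> Callable:
    """Chain steps left to right: compose(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda value, step: step(value), steps, x)


# Hypothetical stand-ins: two preprocessing steps and a trivial "model".
def f(xs):
    """Drop missing values."""
    return [x for x in xs if x is not None]


def g(xs):
    """Min-max scale to [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def h(xs):
    """Threshold classifier standing in for a predictive model."""
    return [int(x > 0.5) for x in xs]


pipeline = compose(f, g, h)
predictions = pipeline([2.0, None, 4.0, 10.0])
print(predictions)
```
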
  15. Petri net: mathematical modelling language invented in 1939 by Carl Adam Petri (by his own account, at age 13; formally introduced in his 1962 dissertation). Elements: token, place, transition, arc. N = (P, T, F)
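
The slide gives only the tuple N = (P, T, F). For reference, the usual textbook reading of the three symbols is sketched below; the preset and postset notation is standard but is not shown in the transcript, so treat it as an added assumption.

```latex
% P: finite set of places, T: finite set of transitions, F: flow relation (arcs)
N = (P, T, F), \qquad P \cap T = \emptyset, \qquad
F \subseteq (P \times T) \cup (T \times P)

% Preset and postset of a node x \in P \cup T
{}^{\bullet}x = \{\, y \mid (y, x) \in F \,\}, \qquad
x^{\bullet} = \{\, y \mid (x, y) \in F \,\}
```

Arcs only ever connect a place to a transition or a transition to a place, never two nodes of the same kind.
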
  16. Example of Petri net, following a Patient token. Places: Reception, Waiting Room, Consulting Room, Exit. Transitions: Check in, Call in, Examination and diagnosis. (Slides 17 to 23 repeat this diagram as animation frames.)
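
The patient example can be simulated in a few lines. This is a minimal sketch, assuming a linear net in which each transition moves the single Patient token to the next place; the `PetriNet` class and its method names are illustrative and not taken from the talk.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    places: set
    transitions: set
    arcs: set  # flow relation F: (place, transition) or (transition, place) pairs
    marking: Counter = field(default_factory=Counter)  # tokens per place

    def inputs(self, t):
        return [src for src, dst in self.arcs if dst == t]

    def outputs(self, t):
        return [dst for src, dst in self.arcs if src == t]

    def enabled(self, t):
        """A transition is enabled when every input place holds a token."""
        return all(self.marking[p] >= 1 for p in self.inputs(t))

    def fire(self, t):
        """Consume one token from each input place, produce one in each output."""
        if not self.enabled(t):
            raise ValueError(f"transition {t!r} is not enabled")
        for p in self.inputs(t):
            self.marking[p] -= 1
        for p in self.outputs(t):
            self.marking[p] += 1


net = PetriNet(
    places={"Reception", "Waiting Room", "Consulting Room", "Exit"},
    transitions={"Check in", "Call in", "Examination and diagnosis"},
    arcs={
        ("Reception", "Check in"), ("Check in", "Waiting Room"),
        ("Waiting Room", "Call in"), ("Call in", "Consulting Room"),
        ("Consulting Room", "Examination and diagnosis"),
        ("Examination and diagnosis", "Exit"),
    },
    marking=Counter({"Reception": 1}),  # the Patient token starts at Reception
)

for transition in ["Check in", "Call in", "Examination and diagnosis"]:
    net.fire(transition)
    print(transition, "->", dict(+net.marking))
```

Unlike a plain graph, the marking records where the token currently is, which is the "process state" that the talk says a DAG cannot model.
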
  24. Petri nets can be more complex. Source: http://bit.ly/1XZQhYZ
  25. Modelling MCPS as Petri net. A Petri net is an MCPS iff all the following conditions apply: (1) the Petri net is a workflow net; (2) the Petri net is well-handled and acyclic; (3) the places P \ {i, o} have only a single input and a single output; (4) the Petri net is 1-sound; (5) the Petri net is safe; (6) all the transitions with multiple inputs or outputs are AND-join or AND-split, respectively.
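
Some of these conditions are purely structural and easy to check mechanically. The sketch below covers only three of them: the source place `i` has no inputs and the sink place `o` has no outputs, every other place has exactly one input and one output, and the net is acyclic. Soundness, safeness and the well-handled property need a behavioural analysis and are deliberately left out. The function name and return format are assumptions made for illustration.

```python
def check_mcps_structure(places, transitions, arcs, source="i", sink="o"):
    """Return a list of violated structural conditions (empty if none found)."""
    nodes = places | transitions
    preds = {n: set() for n in nodes}
    succs = {n: set() for n in nodes}
    for src, dst in arcs:
        succs[src].add(dst)
        preds[dst].add(src)

    problems = []
    if preds[source]:
        problems.append("source place has incoming arcs")
    if succs[sink]:
        problems.append("sink place has outgoing arcs")
    for p in sorted(places - {source, sink}):
        if len(preds[p]) != 1 or len(succs[p]) != 1:
            problems.append(f"place {p!r} needs exactly one input and one output")

    # Kahn's algorithm: the net is acyclic iff every node can be removed.
    indegree = {n: len(preds[n]) for n in nodes}
    ready = [n for n, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for nxt in succs[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if removed != len(nodes):
        problems.append("net contains a cycle")
    return problems


places = {"i", "p1", "o"}
transitions = {"t1", "t2"}
arcs = {("i", "t1"), ("t1", "p1"), ("p1", "t2"), ("t2", "o")}
print(check_mcps_structure(places, transitions, arcs))
print(check_mcps_structure(places, transitions, arcs | {("t2", "p1")}))
```
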
  26. Hierarchical MCPS with parallel paths. Diagram labels: i, dummy, dummy, o.
  27. Hierarchical MCPS with parallel paths. Diagram labels: i, dummy, dummy, o; RandomSubspace; Random Feature Selection; Decision Tree; Mean.
  28. Any questions so far?
  29. Automating the composition and optimisation of MCPS
  30. Algorithm Selection: What are the best algorithms to process my data?
  31. Hyperparameter Optimisation: How to tune the hyperparameters to get the best performance?
  32. CASH problem for MCPS: Combined Algorithm Selection and Hyperparameter configuration problem. Equation annotations: MCPSs, hyperparameters, k-fold cross validation, objective function (e.g. classification error), training dataset, validation dataset. References: Thornton, C., Hutter, F., Hoos, H. H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proc. of the 19th ACM SIGKDD. (2013) 847–855. Martin Salvador, M., Budka, M., Gabrys, B.: Automatic composition and optimisation of multicomponent predictive systems. IEEE Transactions on Knowledge and Data Engineering. Under review, available at http://bit.ly/automatic-mcps-paper (submitted on 01/04/2016).
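
The equation itself did not survive in the transcript. The block below is a reconstruction in the form used by the cited Auto-WEKA paper, with the algorithm replaced by an MCPS; the symbol names (θ for an MCPS, Θ for the set of candidate MCPSs, λ for its hyperparameters) are assumptions and may differ from the slide.

```latex
% \theta^{(j)}: j-th candidate MCPS, \Lambda^{(j)}: its hyperparameter space
% \mathcal{L}: objective function (e.g. classification error)
% D_{train}^{(i)}, D_{valid}^{(i)}: training and validation parts of fold i
(\theta^{*}, \lambda^{*}) \in
  \operatorname*{arg\,min}_{\theta^{(j)} \in \Theta,\; \lambda \in \Lambda^{(j)}}
  \frac{1}{k} \sum_{i=1}^{k}
  \mathcal{L}\!\left(\theta^{(j)}_{\lambda},\, D_{train}^{(i)},\, D_{valid}^{(i)}\right)
```

Read left to right: pick the MCPS and hyperparameter setting whose average loss over the k cross-validation folds is lowest.
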
  33. Search space. Three search spaces are compared: PREV, NEW and FULL. Components in the diagram: Predictor and Meta-Predictor (shown for each space), plus Missing Value Handling, Outlier Detection and Handling, Data Transformation, Dimensionality Reduction and Sampling. Hyperparameters: PREV 756, NEW 1186, FULL 1564.
  34. Optimisation strategies. Grid search: exhaustive exploration of the whole search space; not feasible in high-dimensional spaces. Random search: explores the search space randomly for a given time. Bayesian optimisation: assumes that there is a function mapping the hyperparameters to the objective, and tries to explore the most promising parts of the search space. Reference: Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, 6683 LNCS, 507–523.
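
The first two strategies are simple enough to sketch side by side. The example below is illustrative only: `objective` is a toy function standing in for a cross-validation error, and random search is budgeted by a number of evaluations rather than by wall-clock time as in the talk. Bayesian optimisation is not shown because it needs a surrogate model.

```python
import itertools
import random


def objective(config):
    """Toy stand-in for a cross-validation error; lower is better."""
    return (config["x"] - 0.3) ** 2 + (config["y"] - 0.7) ** 2


def grid_search(values_per_dim):
    """Evaluate every combination of the listed values."""
    names = list(values_per_dim)
    candidates = (
        dict(zip(names, combo))
        for combo in itertools.product(*values_per_dim.values())
    )
    return min(candidates, key=objective)


def random_search(bounds, n_evaluations, seed=0):
    """Evaluate configurations sampled uniformly within the bounds."""
    rng = random.Random(seed)
    best, best_error = None, float("inf")
    for _ in range(n_evaluations):
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        error = objective(config)
        if error < best_error:
            best, best_error = config, error
    return best


grid = {"x": [0.0, 0.25, 0.5, 0.75, 1.0], "y": [0.0, 0.25, 0.5, 0.75, 1.0]}
bounds = {"x": (0.0, 1.0), "y": (0.0, 1.0)}

best_grid = grid_search(grid)                           # 5 * 5 = 25 evaluations
best_random = random_search(bounds, n_evaluations=25)   # same budget
print(best_grid, best_random)
```

The grid needs one evaluation per combination, so its cost is the product of the number of values per dimension. With five values each, the search spaces on the previous slide (756 to 1564 hyperparameters) would be far out of reach, which is the infeasibility the slide refers to.
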
  35. Auto-WEKA for MCPS: WEKA methods as search space; one-click black box; Data + Time Budget → MCPS. Our contribution: recursive extension of complex hyperparameters in the search space; composition and optimisation of MCPSs (including WEKA filters, predictors and meta-predictors). https://github.com/dsibournemouth/autoweka
  36. Evaluated strategies. 1. WEKA-Def: all the predictors and meta-predictors are run using WEKA’s default hyperparameter values. 2. Random search: the search space is randomly explored. 3. SMAC: Sequential Model-based Algorithm Configuration incrementally builds a Random Forest as surrogate model. 4. TPE: the Tree-structured Parzen Estimator incrementally builds a surrogate model from Parzen (kernel density) estimators. References: Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. Learning and Intelligent Optimization, 6683 LNCS, 507–523. Bergstra, J., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for Hyper-Parameter Optimization. Advances in NIPS 24, pp. 1–9.
  37. Experiments: 21 datasets (classification problems); budget: 30 CPU-hours per run; 25 runs with different seeds; timeout: 30 minutes; memout: 3 GB RAM.
  38. Training and testing process
  39. Holdout error (% misclassification)
  40. Convergence analysis: 10-fold CV error of best solutions over time (each color is a different run/seed)
  41. MCPS similarity analysis (FULL search space). Formula annotations: weight for the i-th transition; Hamming distance at the i-th transition. Plot regions: low error variance and high MCPS similarity; low error variance and low MCPS similarity; high error variance and low MCPS similarity.
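
The formula is missing from the transcript; only its two annotations survive. One plausible reading is a weighted Hamming distance over the transitions of two MCPSs, sketched below. Normalising by the total weight, and the example component names and weights, are assumptions made for illustration, not the definition used in the talk.

```python
def weighted_hamming(config_a, config_b, weights):
    """Weighted share of positions (transitions) where two MCPSs differ."""
    if not (len(config_a) == len(config_b) == len(weights)):
        raise ValueError("all sequences must have the same length")
    total = sum(weights)
    differing = sum(
        w for a, b, w in zip(config_a, config_b, weights) if a != b
    )
    return differing / total


# Illustrative MCPSs: one component per transition, predictor weighted double.
mcps_a = ["ReplaceMissingValues", "Normalize", "PrincipalComponents", "J48"]
mcps_b = ["ReplaceMissingValues", "Standardize", "PrincipalComponents", "RandomForest"]
weights = [1, 1, 1, 2]

distance = weighted_hamming(mcps_a, mcps_b, weights)
print(distance)
```

A distance of 0 means the two runs found the same components at every transition; values near 1 mean they share almost nothing, which corresponds to the "low MCPS similarity" regions of the plot.
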
  42. MCPS similarity analysis: clustering. Waveform dataset and SMAC strategy.
  43. Available software for Bayesian optimisation. SMAC: Sequential Model-based Algorithm Configuration. Auto-WEKA: toolbox including random search, SMAC and TPE for WEKA predictors. Auto-WEKA for MCPS: extension of Auto-WEKA for MCPSs. Auto-Sklearn: toolbox for automating scikit-learn. Spearmint: Python library for Bayesian optimisation with Gaussian Processes. Hyperopt: Python library for random search and TPE. HPOLib: common interface for SMAC, Spearmint and Hyperopt.
  44. Any questions so far?
  45. Adapting MCPS to changing environments
  46. Maintaining an MCPS. Data distribution can change over time and affect predictions, because of external factors (e.g. weather conditions, new regulations) and internal factors (e.g. quality of materials, equipment deterioration). Source: INFER project
  47. Training and testing process: 1. Training data is provided; 2. Best MCPS found is selected; 3. New batch of unlabelled data requires prediction; 4. MCPS generates predictions; 5. True labels are provided; 6. Predictive accuracy is reported; 7. MCPS is adapted using the last batch of labelled data.
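
The seven steps form a test-then-adapt loop, which can be sketched as follows. Everything here is a simplification: `MajorityClassModel` is a toy stand-in for an MCPS that ignores the input features, and the three strategy names follow my reading of the labels used later in the talk (Baseline: never adapt; Batch: refit on the last batch only; Cumulative: refit on all labelled data so far). The SMAC variants, which also re-run the search, are not covered.

```python
from collections import Counter


class MajorityClassModel:
    """Toy stand-in for an MCPS: predicts the most frequent training label."""

    def fit(self, labels):
        self.prediction = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n):
        return [self.prediction] * n


def run(training_labels, batches, strategy):
    """Return the error per batch; strategy: 'baseline', 'batch', 'cumulative'."""
    model = MajorityClassModel().fit(training_labels)      # steps 1-2
    seen = list(training_labels)
    errors = []
    for true_labels in batches:
        predicted = model.predict(len(true_labels))        # steps 3-4
        wrong = sum(p != t for p, t in zip(predicted, true_labels))
        errors.append(wrong / len(true_labels))            # steps 5-6
        seen.extend(true_labels)
        if strategy == "batch":                            # step 7: last batch
            model = MajorityClassModel().fit(true_labels)
        elif strategy == "cumulative":                     # step 7: all data
            model = MajorityClassModel().fit(seen)
    return errors


training = ["a"] * 6 + ["b"]
batches = [["b", "b", "b", "b"], ["a", "a", "a", "b"], ["a", "a", "a", "a"]]

for strategy in ("baseline", "batch", "cumulative"):
    print(strategy, run(training, batches, strategy))
```

In this toy stream the Batch strategy overreacts to one unusual batch and does worse on the next one, while Cumulative stays stable. Which behaviour wins depends on the data, which is the point the per-dataset results on the following slides make.
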
  48. Evaluated strategies
  49. Datasets from chemical production processes
  50. Average classification error (%)
  51. Average classification error per batch (%). Strategies: Baseline, Batch, Batch+SMAC, Cumulative, Cumulative+SMAC. Datasets shown: drier and thermalox. Plot annotations: "Batch adaptation doesn’t help! :(" and "Batch adaptation does help! :)".
  52. MCPS similarity analysis on the catalyst dataset: Batch+SMAC and Cumulative+SMAC. Plot annotations: "Same components, only hyperparameters are adapted" and "Large difference between batches".
  53. Conclusion and future work. Automatic machine learning is becoming a reality. There is a variety of open-source software but also commercial products (e.g. SigOpt and IBM Watson). The domain expert still plays a crucial role (e.g. defining the search space). Smart techniques to reduce the search space are needed. Maintaining MCPSs in a production environment is key for success. There is a gap in adaptive surrogate models for Bayesian optimisation methods.
  54. Thanks! Publications with Marcin Budka and Bogdan Gabrys: “Towards automatic composition of Multicomponent Predictive Systems”, HAIS 2016 (published), http://bit.ly/towards-mcps-paper; “Automatic composition and optimisation of Multicomponent Predictive Systems”, IEEE TKDE (under review), http://bit.ly/automatic-mcps-paper; “Adapting Multicomponent Predictive Systems using hybrid adaptation Strategies with Auto-WEKA in process industry”, AutoML at ICML 2016 (accepted), http://bit.ly/adapting-mcps-paper; “Effects of change propagation resulting from adaptive preprocessing in Multicomponent Predictive Systems”, KES 2016 (accepted), http://bit.ly/change-propagation-mcps-paper. Slides available at http://www.slideshare.net/draxus. Contact: Manuel Martin Salvador, msalvador@bournemouth.ac.uk
