
Scalable Automatic Machine Learning with H2O


In this presentation, Parul Pandey will provide a history and overview of the field of “Automatic Machine Learning” (AutoML), followed by a detailed look inside H2O’s open source AutoML algorithm. H2O AutoML provides an easy-to-use interface that automates data pre-processing and the training and tuning of a large selection of candidate models (including multiple stacked ensemble models for superior model performance). The result of an AutoML run is a “leaderboard” of H2O models that can be easily exported for use in production. AutoML is available in all H2O interfaces (R, Python, Scala, web GUI) and, thanks to the distributed nature of the H2O platform, can scale to very large datasets. The presentation will end with a demo of H2O AutoML in R and Python, including a handful of code examples to get you started using automatic machine learning on your own projects.
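As a rough idea of what an AutoML run looks like from Python, here is a minimal sketch using the `h2o` package's `H2OAutoML` class. The helper name `run_automl`, its parameters, and the choice to treat the target as a categorical column (binary or multiclass classification) are assumptions made for illustration; they are not taken from the talk. The `h2o` imports sit inside the function so that defining it does not require the package or a Java runtime.

```python
def run_automl(csv_path, target, max_models=10, seed=1):
    """Train H2O AutoML on a CSV file and return (leaderboard, leader).

    Hypothetical helper for illustration only; requires the h2o
    package and a Java runtime when it is actually called.
    """
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()  # start or connect to a local H2O cluster
    frame = h2o.import_file(csv_path)

    # Assumption: the task is classification, so the target is a factor.
    frame[target] = frame[target].asfactor()
    predictors = [col for col in frame.columns if col != target]

    # Cap the number of base models so the run stays short.
    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(x=predictors, y=target, training_frame=frame)

    # The leaderboard is an H2OFrame of models ranked by a default metric;
    # the leader is the top-ranked model.
    return aml.leaderboard, aml.leader
```

A call might look like `leaderboard, best = run_automl("train.csv", "response")`, where the file name and column name are placeholders for your own data.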

Parul's Bio:
Parul is a Data Science Evangelist at H2O.ai. She combines data science, evangelism and community in her work. Her focus is on spreading information about H2O and Driverless AI to as many people as possible. She is also an active writer and has contributed to various national and international publications.

Published in: Technology


  1. Scalable Automatic Machine Learning with H2O. Parul Pandey, Data Science Evangelist, H2O.ai
  2. What is H2O? H2O.ai, the company: • Founded in 2012 • Advised by Stanford Professors Hastie, Tibshirani & Boyd • Headquarters: Mountain View, California, USA. H2O, the platform: • Open Source Software (Apache 2.0 Licensed) • R, Python, Scala, Java and Web Interfaces • Distributed Machine Learning Algorithms for Big Data
  3. H2O Tools
  4. H2O in Industry
  5. Agenda • H2O Platform • Automatic Machine Learning (AutoML) • H2O AutoML Overview • Demo
  6. H2O Platform
  7. H2O Machine Learning Platform • Open source, distributed (multi-core + multi-node) implementations of cutting edge ML algorithms. • Core algorithms written in high performance Java. • APIs available in R, Python, Scala; web GUI. • Easily deploy models to production as pure Java code. • Works on Hadoop, Spark, AWS, your laptop, etc.
  8. H2O Machine Learning Features • Supervised & unsupervised machine learning algos (GBM, RF, DNN, GLM, Stacked Ensembles, etc.) • Imputation, normalization & auto one-hot-encoding • Automatic early stopping • Cross-validation, grid search & random search • Variable importance, model evaluation metrics, plots
  9. Intro to Automatic Machine Learning
  10. Aspects of Automatic Machine Learning • Data Prep • Model Generation • Ensembles
  11. H2O’s AutoML
  12. H2O AutoML • Basic data pre-processing (as in all H2O algos). • Trains a random grid of algorithms like GBMs, DNNs, GLMs, etc. using a carefully chosen hyper-parameter space. • Individual models are tuned using cross-validation. • Two Stacked Ensembles are trained (“All Models” ensemble & a lightweight “Best of Family” ensemble). • Returns a sorted “Leaderboard” of all models. • All models can be easily exported to production.
  13. https://www.h2o.ai/blog/a-deep-dive-into-h2os-automl/
  14. Random Grid Search & Stacking • Random Grid Search combined with Stacked Ensembles is a powerful combination. • Ensembles perform particularly well if the models they are based on (1) are individually strong, and (2) make uncorrelated errors. • Stacking uses a second-level metalearning algorithm to find the optimal combination of base learners.
  15. Who is it for?
  16. H2O AutoML in R
  17. H2O AutoML in Python
  18. H2O AutoML in Flow GUI
  19. H2O AutoML Leaderboard: Example Leaderboard for binary classification
  20. H2O AutoML Tutorial
  21. Learn H2O AutoML! • Docs: https://tinyurl.com/h2o-automl-docs • R & Py tutorials: https://tinyurl.com/h2o-automl-tutorials • Blog: A Deep Dive into H2O’s AutoML
  22. H2O Resources • Documentation: http://docs.h2o.ai • Tutorials: https://github.com/h2oai/h2o-tutorials • Slidedecks: https://github.com/h2oai/h2o-meetups • Videos: https://www.youtube.com/user/0xdata • Stack Overflow: https://stackoverflow.com/tags/h2o • Google Group: https://tinyurl.com/h2ostream • Gitter: http://gitter.im/h2oai/h2o-3 • Events & Meetups: http://h2o.ai/events
  23. Contribute to H2O! Get in touch over email, Gitter or JIRA. https://github.com/h2oai/h2o-3/blob/master/CONTRIBUTING.md
  24. Thank you!
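
The random grid search and stacking idea from slide 14 can be illustrated with a small, dependency-free sketch. This is not H2O's implementation: the toy data, the two model "families" (1-D ridge regression and k-nearest neighbours), the hyper-parameter ranges, and the single-weight metalearner fitted on a separate holdout set are all assumptions chosen to keep the example short. H2O itself trains its metalearner on cross-validated predictions from many base models.

```python
import math
import random

def make_data(n, rng):
    """Toy 1-D regression data: y = sin(3x) + 0.5x + Gaussian noise."""
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [math.sin(3 * x) + 0.5 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys

def fit_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression (penalised slope, free intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k):
    """k-nearest-neighbour regressor: average the k closest targets."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def random_search(train, valid, n_models, rng):
    """Alternate between families, sample hyper-parameters at random,
    and return a leaderboard sorted by validation MSE (best first)."""
    leaderboard = []
    for i in range(n_models):
        if i % 2 == 0:
            params = {"lam": 10 ** rng.uniform(-3, 2)}
            family, model = "ridge", fit_ridge(*train, params["lam"])
        else:
            params = {"k": rng.randint(1, 15)}
            family, model = "knn", fit_knn(*train, params["k"])
        leaderboard.append({"family": family, "params": params,
                            "model": model, "mse": mse(model, *valid)})
    leaderboard.sort(key=lambda row: row["mse"])
    return leaderboard

def best_of_family(leaderboard):
    """Keep the top-ranked model of each family (leaderboard is sorted)."""
    best = {}
    for row in leaderboard:
        best.setdefault(row["family"], row)
    return list(best.values())

def stack_two(model_a, model_b, xs, ys):
    """Second-level learner: least-squares weight w for w*a + (1-w)*b."""
    pa = [model_a(x) for x in xs]
    pb = [model_b(x) for x in xs]
    num = sum((y - b) * (a - b) for a, b, y in zip(pa, pb, ys))
    den = sum((a - b) ** 2 for a, b in zip(pa, pb))
    w = 0.5 if den == 0 else num / den
    return lambda x: w * model_a(x) + (1 - w) * model_b(x)

rng = random.Random(42)
train, valid, blend = make_data(80, rng), make_data(40, rng), make_data(40, rng)
leaderboard = random_search(train, valid, n_models=10, rng=rng)
first, second = best_of_family(leaderboard)
ensemble = stack_two(first["model"], second["model"], *blend)
for row in leaderboard[:3]:
    print(row["family"], row["params"], round(row["mse"], 4))
print("ensemble MSE on blend set:", round(mse(ensemble, *blend), 4))
```

Because the blend weight is the least-squares optimum on the holdout set, the ensemble's error on that set can never exceed that of either base model there; how much it helps on new data depends on how strong and how uncorrelated the base models are, which is the point the slide makes.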
