Introduction to Machine Learning with H2O-3 (1)

In this virtual meetup, we give an introduction to H2O-3, the number one open source machine learning platform, and show how you can use it to build models for different use cases.

  1. Machine Learning at Scale. Sept 29th, 2020. Franklin Velasquez, Technical Marketing Engineer and Academic Program Manager. renga-260827183/ Introduction to Machine Learning with H2O-3
  2. […] is the open source leader in AI and Machine Learning. Democratize AI for Everyone.
  3. Democratizing AI. Our mission to use AI for Good permeates into everything we do. Trusted Partner, Impact/Social, Community.
  4. Snapshot
     • We are Established: Founded in Silicon Valley, 2012. Funding: $147M | Series D. Investors: Goldman Sachs, Ping An, Wells Fargo, NVIDIA, Nexus Ventures.
     • We Make World-class AI Technology: H2O Open Source Machine Learning. H2O Driverless AI: Automatic Machine Learning. H2O Q: AI platform for business users.
     • We are Global: Mountain View, NYC, London, Paris, Ottawa, Prague, Chennai, Singapore.
     • 240, 1K, 20K, 180K: Universities, Companies Using H2O Open Source, Meetup Members, Best AI Team.
     • We are Passionate about Customers: 4X customers, 2 years, all industries, all continents. Aetna/CVS, Allergan, AT&T, Capital One, CBA, Citi, Coca Cola, Bradesco, Disney, Franklin Templeton, Genentech, Kaiser Permanente, Lego, Merck, Pepsi, Reckitt Benckiser, Roche.
  5. Spans Industries and Use Cases. Save Time. Save Money. Gain a Competitive Edge.
     • Financial Services
        • Wholesale / Commercial Banking: Know Your Customers (KYC), Anti-Money Laundering (AML).
        • Card / Payments Business: Transaction frauds, Collusion fraud, Real-time targeting, Credit risk scoring, In-context promotion.
        • Retail Banking: Deposit fraud, Customer churn prediction, Auto-loan.
     • Healthcare and Life Science: Early cancer detection, Product recommendations, Personalized prescription matching, Medical claim fraud detection, Flu season prediction, Drug discovery, ER and hospital management, Remote patient monitoring, Medical test predictions.
     • Telecom: Predictive maintenance, Avoidable truck-rolls, Customer churn prediction, Improved customer viewing experience, Master data management, In-context promotions, Intelligent ad placements, Personalized program recommendations.
     • Marketing and Retail: Funnel predictions, Personalized ads, Fraud detection, Next best offer, Next best action, Customer segmentation, Customer churn, Customer recommendations, Ad predictions and fraud.
  6. Our Team is Made up of the World’s Leading Data Scientists. Your projects are backed by 10% of the World’s Data Science Grandmasters and a Team of Experts who are relentless in solving your critical problems.
  7. Gartner 2020: […] is a Visionary in Two MQs
     • 2020 Cloud AI for Developer Services MQ (new MQ for 2020). Strengths: 1. Automation 2. Ease of Use & Explainability 3. Excellent Customer Support
     • 2020 Data Science and Machine Learning MQ. Named a Visionary, with the strongest “Completeness of Vision” in the entire quadrant. Strengths: 1. Automation 2. Explainability 3. High-Performance ML Components
  8. The Platform
     • H2O Open Source: In-memory, distributed machine learning algorithms with H2O Flow GUI. H2O open source engine integration with Spark. 100% open source, Apache V2 Licensed. Enterprise support subscriptions. Interface using R, Python for ML training on massive datasets.
     • H2O Driverless AI: Automatic feature engineering, ML training and interpretability, from ingest to deployment. Open and Extensible AutoML. User licenses on a per seat basis annually. GUI-based interface, along with R & Python API, for end-to-end data science.
     • H2O Q: A new and innovative platform to make your own AI apps. Rapid & Easy SDK to build interactive, low latency AI apps. Easy and intuitive platform to have AI answer your question.
     • H2O ModelOps: Highly flexible and scalable model deployment and monitoring platform. AI deployment platform built for DevOps and MLOps. Scalable to support high throughput and low latency model scoring environments. Comprehensive model monitoring.
     • App Marketplace
  9. Introduction to H2O-3
  10. H2O Open Source AI Platform: 100% open source, distributed in-memory machine learning with linear scalability.
      • Rapid Model Deployment: Highly portable models deployed in Java (POJO) and Model Object Optimized (MOJO). Automated and streamlined scoring service deployment with REST API.
      • Scalability and Performance: Distributed in-memory computing platform. Distributed algorithms.
      • Other slide labels: Cloud Integration, Acceleration, Big Data Ecosystem, Open Source, Flexible Interface, Smart and Fast Algorithms, H2O Flow.
  11. H2O Machine Learning Features
      • Supervised & Unsupervised machine learning algorithms: GBM, RF, DNN, GLM, Stacked Ensembles, AutoML, etc.
      • Imputation, normalization and auto one-hot-encoding
      • Automatic early stopping
      • Automatic ML at Scale: DRF, XRT, GBM, XGBoost, GLM, DNN, Stacked Ensemble
      • Cross-validation, grid search and random search
      • Variable importance, model evaluation metrics, plots
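The "Automatic ML at Scale" and cross-validation items on slide 11 could look roughly like the sketch below. It assumes the `h2o` Python package is installed and a Java runtime is available. The function name `run_automl`, its arguments, and the choice of 5 folds and 10 models are illustrative assumptions, not values taken from the talk.

```python
def run_automl(csv_path: str, target: str, max_models: int = 10):
    """Run H2O AutoML on a CSV file; return the best model and the leaderboard.

    Hypothetical helper for illustration only. The target column is assumed
    to be categorical, so it is converted to a factor (classification).
    """
    # Imported inside the function so this file loads even without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()  # start a local H2O cluster or connect to a running one
    frame = h2o.import_file(csv_path)
    frame[target] = frame[target].asfactor()
    predictors = [c for c in frame.columns if c != target]

    # nfolds=5 requests 5-fold cross-validation for each candidate model.
    aml = H2OAutoML(max_models=max_models, nfolds=5, seed=1)
    aml.train(x=predictors, y=target, training_frame=frame)
    return aml.leader, aml.leaderboard
```

Calling `run_automl(path, target)` with a CSV path and a target column name of your own would return the top-ranked model together with the leaderboard frame that ranks all trained candidates.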
  12. H2O Machine Learning Methods
      • Supervised Learning: Statistical Analysis, Decision Tree Ensembles
      • Unsupervised Learning: Clustering, Dimensionality Reduction, Anomaly Detection
      • Neural Networks: Multilayer Perceptron, Deep Learning
      • Stacking, Aggregator, AutoML, Term Embeddings
  13. H2O Distributed Computing
      • H2O Cluster: Multi-node cluster with shared memory model. All computations in memory. Each node sees only some rows of the data. No limit on cluster size.
      • H2O Frame: Distributed data frames (collection of vectors). Columns are distributed (across nodes) arrays. Works just like R’s data.frame or Python Pandas DataFrame.
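To make the "each node sees only some rows" idea from slide 13 concrete, here is a toy, single-process sketch in plain Python. It is not H2O code: `Node`, `distribute`, and `distributed_mean` are hypothetical names, and the real cluster distributes compressed column chunks across JVMs rather than Python lists.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical worker holding one contiguous block of rows, column by column."""
    columns: Dict[str, List[float]]


def distribute(frame: Dict[str, List[float]], n_nodes: int) -> List[Node]:
    """Split every column into row blocks so each node sees only some rows."""
    n_rows = len(next(iter(frame.values())))
    chunk = -(-n_rows // n_nodes)  # ceiling division
    nodes = []
    for i in range(n_nodes):
        lo, hi = i * chunk, min((i + 1) * chunk, n_rows)
        nodes.append(Node({name: col[lo:hi] for name, col in frame.items()}))
    return nodes


def distributed_mean(nodes: List[Node], column: str) -> float:
    """Map: each node computes a local (sum, count). Reduce: combine the partials."""
    partials = [(sum(n.columns[column]), len(n.columns[column])) for n in nodes]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count
```

The point of the sketch is that a summary statistic such as a column mean never requires moving raw rows between nodes, only small partial results.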
  14. Python Interface Overview

      | Action | Pandas or scikit-learn | H2O |
      | --- | --- | --- |
      | Reading data | pandas.read_csv(data_path) | h2o.import_file(data_path) |
      | Summarizing data | pandas_frame.describe() | h2o_frame.describe() |
      | Summary statistics | pandas_frame.mean() | h2o_frame.mean() |
      | Combining rows | pandas.concat(list[frame1,frame2]) | h2o_frame.rbind(h2o_frame2) |
      | Combining columns | pandas.concat(list[frame1,frame2],axis = 1) | h2o_frame.cbind(h2o_frame2) |
      | Data selection | pandas_frame[:, :] | h2o_frame[:, :] |
      | Transforming columns | np.log(pandas_frame[x]), np.sqrt(pandas_frame[x]) | h2o_frame[x].log(), h2o_frame[x].sqrt() |
      | Building Random Forest | model = RandomForestClassifier(n_estimators = 100); model = …, y_frame) | model = H2ORandomForestClassifier(n_trees = 100); model = model.train(x, y, train_frame) |
      | Model Prediction | model.predict | model.predict |
      | Model Metrics | metrics.auc | metrics = model.model_performance(frame); metrics.auc() |
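The H2O column of slide 14 can be strung together into one small workflow, sketched below under a few assumptions. The slide writes the random forest class as `H2ORandomForestClassifier(n_trees = 100)`; in the `h2o` Python package the class is `H2ORandomForestEstimator` and the parameter is `ntrees`, which is what the sketch uses. The function name, the 80/20 split, and the binary target are illustrative choices, not part of the talk.

```python
def train_random_forest(csv_path: str, target: str, ntrees: int = 100):
    """Import a CSV into H2O, train a random forest, and return (model, AUC).

    Hypothetical helper for illustration. AUC is only defined for a binary
    target, so the target column is assumed to have exactly two classes.
    """
    # Imported inside the function so this file loads even without h2o installed.
    import h2o
    from h2o.estimators import H2ORandomForestEstimator

    h2o.init()  # start a local H2O cluster or connect to a running one
    frame = h2o.import_file(csv_path)          # "Reading data"
    frame[target] = frame[target].asfactor()   # categorical target => classification
    train, valid = frame.split_frame(ratios=[0.8], seed=42)
    predictors = [c for c in frame.columns if c != target]

    model = H2ORandomForestEstimator(ntrees=ntrees, seed=42)   # "Building Random Forest"
    model.train(x=predictors, y=target, training_frame=train)

    perf = model.model_performance(valid)      # "Model Metrics"
    return model, perf.auc()
```

Unlike scikit-learn's `fit(X, y)`, `train` takes column names plus a single frame, because the data lives in the cluster rather than in the Python process.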
  15. Reading Data into H2O with Python. STEP 1 (Python user): h2o_df = h2o.import_file("../data/allyears2k.csv")
  16. Reading Data into H2O with Python. STEP 2: 2.1 Python function call, h2o.import_file(). Steps 2.2 to 2.4: HTTP call to the H2O cluster; the H2O cluster initiates a distributed ingest; data.csv is read from HDFS/S3/Local File/URL.
  17. Reading Data into H2O with Python. STEP 3 (steps 3.1 to 3.4): data.csv, read from HDFS/S3/Local File/URL, becomes a distributed H2O Frame in the H2O cluster; a pointer to the dataframe is returned; the h2o_df object is created in Python. The h2o_df object holds the cluster IP, the cluster port, and a pointer to the data.
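Slides 15 to 17 describe the Python object as a pointer (cluster IP, cluster port, key) rather than a copy of the data. The toy sketch below mimics that split in plain Python. `FakeCluster` and `FrameHandle` are hypothetical stand-ins, not H2O classes, and the "cluster" is just an in-process dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FrameHandle:
    """Client-side object: knows where the data lives but holds none of it."""
    cluster_ip: str
    cluster_port: int
    key: str


class FakeCluster:
    """Hypothetical stand-in for an H2O cluster that stores parsed rows by key."""

    def __init__(self, ip: str, port: int) -> None:
        self.ip = ip
        self.port = port
        self._store: Dict[str, List[List[str]]] = {}

    def ingest(self, key: str, csv_text: str) -> FrameHandle:
        """Parse the CSV on the 'cluster' side and return only a pointer to it."""
        rows = [line.split(",") for line in csv_text.strip().splitlines()]
        self._store[key] = rows
        return FrameHandle(self.ip, self.port, key)

    def nrows(self, key: str) -> int:
        """Answer a question about the frame without shipping the rows back."""
        return len(self._store[key]) - 1  # first row is the header


cluster = FakeCluster("127.0.0.1", 54321)
h2o_df = cluster.ingest("data.csv", "a,b\n1,2\n3,4\n")
print(h2o_df)                     # only IP, port and key on the client side
print(cluster.nrows(h2o_df.key))  # the row count is computed where the data lives
```

This is why operations on a large frame stay cheap on the Python side: each call sends a small request that names the key, and the work happens in the cluster.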
  18. H2O Open Source Architecture: Clusters, Model Object Optimized (MOJO)
  19. DEMO
  20. Questions?
  21. Thank you
  22. Learning Center
      • What?: Self-paced H2O-3 tutorials and Driverless AI tutorials. Instructor-led courses: AI and ML Foundations (Free). Knowledge Achievement: Badges.
      • Aquarium: Cloud learning environments. Driverless AI, H2O-3, Sparkling Water, DataTable.
  23. Resources. H2O-3 Documentation - ml. Python Module - 4/docs-website/h2o-py/docs/intro.html. R Module - index.html.