Accelerate ML Deployment with H2O Driverless AI on AWS

These slides were presented by Dmitry Baev, Pratap Ramamurthy and Karthik Kannappan at our AWS DevDay in Toronto, Canada, on July 17, 2019.

Published in: Software

  1. Accelerate ML Deployment with H2O Driverless AI on AWS
  2. Confidential. Overview. Company: founded in Silicon Valley in 2012; funded: $75M; investors: Wells Fargo, NVIDIA, Nexus Ventures, Paxion Ventures. Products: H2O Open Source Machine Learning (14,000 organizations); H2O Driverless AI – Automatic Machine Learning. Team: 130 AI experts (expert data scientists, Kaggle Grandmasters, distributed computing, visualization). Global: Mountain View, NYC, London, Prague, India.
  3. Growing Worldwide Open Source Community: 14,000 companies using H2O; 155,000 data scientists; 120K meetup members; H2O World (NYC, London, SF) with thousands attending live and online.
  4. Product Suite.
     Open source: in-memory, distributed machine learning algorithms with H2O Flow GUI; H2O AI open source engine integration with Spark; lightning-fast machine learning on GPUs.
       • 100% open source – Apache V2 licensed
       • Built for data scientists – interface using R, Python on H2O Flow (interactive notebook interface)
       • Enterprise support subscriptions
     Automatic feature engineering, machine learning and interpretability:
       • Enterprise software
       • Built for domain users, analysts and data scientists – GUI-based interface for end-to-end data science
       • Fully automated machine learning from ingest to deployment
       • User licenses on a per-seat basis (annual subscription)
  5. Driverless AI is ideal for Enterprise AI. Time: time to insights is slow (time for a data scientist to build a model: months). Talent: lack of AI talent (~100 data science experts in the world). Trust: lack of trust in AI (explainable AI?). Data is a team sport.
  6. Driverless AI Delivers AI for Enterprise. Time: time to insight (GPU-accelerated ML, automatic pipelines, months to hours). Talent: Kaggle Grandmasters, expert data scientists. Trust: explainability and transparency (machine learning interpretability, Auto Doc, Auto Visualization).
  7. Supervised Learning. What's the pattern? Can we create a model to guess 'Default'?
     Training data (Age | Income | Last Month Payment | Default):
       47 | $183,342 | Yes | False
       29 | $84,823 | No | True
       58 | $95,853 | Yes | False
       63 | $43,824 | Yes | True
     Test data (Age | Income | Last Month Payment; Default unknown):
       61 | $73,679 | Yes
       73 | $54,428 | No
       59 | $90,453 | Yes
       43 | $83,041 | Yes
  8. Supervised Learning Techniques. Regression: How much will a customer spend? Classification: Will a customer make a purchase? Yes or no. (Plot axes: X and y for regression; xi and xj with yes/no classes for classification.)
  9. Machine Learning Workflow: an iterative process from data to deployment. Step 1: Import and Explore Data. Step 2: Feature Engineering. Step 3: Model Building / Tuning. Step 4: Final Model Selection. Step 5: Model Deployment.
  10. Challenges in the Machine Learning Workflow: weeks or even months per model. Diagram: Data Integration + Data Quality and Transformation; Modeling Table (Features, Target); Model Building; Model. Optimization is a highly iterative process:
       • Insight – Visualization
       • Cross Validation
       • Feature Engineering
       • Model Selection
       • Hyper Parameter Optimization
       • Feature Selection
       • Ensemble
       • Understanding/Interpreting the results
       • Deploy/Productionize
  11. H2O Driverless AI Delivers Automatic Machine Learning: automatic AI and ML in a single platform; performs the function of an expert data scientist; delivers insights and interpretability; provides easy-to-understand results and visualizations. Test Drive for Driverless AI.
  12. Driverless AI Platform Capabilities: Automatic Visualization; Automatic Feature Engineering; Automatic Model and Ensemble Selection; Machine Learning Recipes (Time Series, NLP); GPU Acceleration; Machine Learning Interpretability (MLI); Scoring Pipeline and Deployment; Troubleshooting and Docs.
  13. H2O Driverless AI – How it Works (powered by GPU acceleration).
       Step 1, Drag and Drop Data: bring data in from cloud, big data and desktop systems (SQL, local, Amazon S3, HDFS, Google BigQuery, Azure Blob Storage, Snowflake).
       Step 2, Automatic Visualization: understand the data shape, outliers, missing values, etc.
       Step 3, Automatic Model Optimization: use best practice model recipes and the power of high performance computing to iterate across thousands of possible models including advanced feature engineering and parameter tuning. Model recipes: i.i.d. data, time-series, more on the way. Advanced feature engineering + algorithm + model tuning; survival of the fittest.
       Step 4, Automatic Scoring Pipelines: deploy ultra-low latency Python or Java automatic scoring pipelines that include feature transformations and models.
       Also shown: modelling dataset (X, Y); machine learning interpretability; model documentation; deploy low-latency scoring to production.
  14. Driverless AI Components: Automatic Visualization; Machine Learning Interpretability; Machine Learning Experimentation; Project Management; ML Recipes; Scoring Pipeline and Deployment; Feature Engineering.
  15. Driverless AI Powered By Open Source: RuleFit, FTRL, GLM.
  16. 2 Months for Grandmasters – 2 Hours for Driverless AI. Driverless AI: Top 10 in BNP Paribas Kaggle Competition; 10th place on the private leaderboard (LB) at Kaggle, out of 2,926. Single run, fully automated: 2h on DGX Station, 6h on PC.
  17. MLI – Machine Learning Interpretation: gain confidence in models before deploying them!
  18. Scoring Pipeline and Deployment. Scoring pipeline of Driverless AI models: Driverless AI instance; RESTful endpoint in EC2; RESTful endpoint in Lambda; SageMaker.
  19. Scoring pipeline in Amazon SageMaker: download the MOJO; upload the Docker image into ECR; create a SageMaker model; deploy the endpoint.
  20. Driverless AI Across Industries: Insurance, Healthcare, Manufacturing, Retail, Ad Tech / MarTech, Financial Services.
  21. Industry Use Cases. Save time. Save money. Gain a competitive advantage.
       Financial Services:
         Wholesale / Commercial Banking: Know Your Customers (KYC); Anti-Money Laundering (AML).
         Card / Payments Business: transaction fraud; collusion fraud; real-time targeting; credit risk scoring; in-context promotion.
         Retail Banking: deposit fraud; customer churn prediction; auto-loan.
       Healthcare: early cancer detection; product recommendations; personalized prescription matching; medical claim fraud detection; flu season prediction; drug discovery; ER and hospital management; remote patient monitoring; medical test predictions.
       Telecom: predictive maintenance; avoidable truck-rolls; customer churn prediction; improved customer viewing experience; master data management; in-context promotions; intelligent ad placements; personalized program recommendations.
       Marketing and Retail: funnel predictions; personalized ads; credit scoring; fraud detection; next best offer; next best customer; smart profiling; prediction; customer recommendations; ad predictions and spend.
  22. H2O Driverless AI Delivers Value in Every Industry. Customer case studies:
       Healthcare: near perfect scores; increased customer satisfaction.
       Marketing: 2.5X performance; outperforms alternative digital marketing.
       Financial Services: +6% accuracy; matched 10 years of machine learning expertise.
       Manufacturing: 1 month savings; accurately predicting supply chain.
       “Driverless AI is giving amazing results in terms of feature and model performance.” (Venkatesh Ramanathan, Senior Data Scientist, PayPal)
       “Driverless AI helped us gain an edge with our Intelligent Marketing Cloud for our clients. AI to do AI, truly is improving our system on a daily basis.” (Martin Stein, Chief Product Officer, G5)
       “H2O Driverless AI feature engineering is better than anything I've seen out there right now. And the scoring pipeline generation is probably one of the bigger pluses for me. These features alone have provided us with a true competitive edge in agile manufacturing. It's a massive time saver.” (Dr. Robert Coop, AI and ML Manager, Stanley Black & Decker)
       “Driverless AI powers our data science team to operate efficiently and experiment at scale… with this latest innovation, we have the opportunity to impact care at large.” (Bharath Sudarshan, Director of Data Science, Armada Health)
  23. Community Slack Workspace: online chat to ask questions, discuss use cases, give feedback and more. Join the Community Slack Workspace today; you will receive an email with login details and next steps. Check out the Community Guide for more info.
  24. Interactive Demo
  25. Register for a Test Drive of DAI on AWS
  26. Create Your Account and Launch DAI
  27. Thank You
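
Slide 7 asks whether a model can guess 'Default' from the training table. As a minimal sketch of that idea, the code below fits a one-rule classifier (a decision stump) to the four training rows from the slide and applies it to the four test rows. The stump is an illustrative stand-in chosen for brevity; it is not how Driverless AI builds models, and the names `fit_stump` and `predict` are hypothetical.

```python
# Toy illustration of slide 7. Rows are copied from the slide;
# "Last Month Payment" Yes/No is encoded as True/False.
TRAIN = [
    # (age, income, paid_last_month, default)
    (47, 183342, True, False),
    (29, 84823, False, True),
    (58, 95853, True, False),
    (63, 43824, True, True),
]
TEST = [
    # (age, income, paid_last_month); default is unknown
    (61, 73679, True),
    (73, 54428, False),
    (59, 90453, True),
    (43, 83041, True),
]
N_FEATURES = 3


def fit_stump(rows):
    """Return (feature_index, threshold, label_below) with the fewest training errors.

    Candidate thresholds are midpoints between consecutive distinct values
    of each feature. Ties keep the first candidate found.
    """
    best = None
    for f in range(N_FEATURES):
        values = sorted({float(r[f]) for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for label_below in (True, False):
                errors = sum(
                    (label_below if float(r[f]) < thr else not label_below) != r[3]
                    for r in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, f, thr, label_below)
    _, f, thr, label_below = best
    return f, thr, label_below


def predict(stump, row):
    """Apply the stump: rows below the threshold get label_below, others the opposite."""
    f, thr, label_below = stump
    return label_below if float(row[f]) < thr else not label_below


stump = fit_stump(TRAIN)
predictions = [predict(stump, row) for row in TEST]
print(stump, predictions)
```

On these rows the stump settles on an income threshold, since income alone separates the four training examples. With only four rows this says nothing about real credit risk; it only shows the fit-then-predict shape of supervised learning.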
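
Slide 10 lists cross validation among the iterative steps of model optimization. The sketch below shows one common form, k-fold splitting, in plain Python: each row lands in exactly one validation fold and is used for training in the others. The slides do not describe which validation scheme Driverless AI uses, so this is a generic illustration; `k_fold_indices` is a hypothetical helper name.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_rows: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k contiguous folds.

    Fold sizes differ by at most one row. Rows are not shuffled here; for
    i.i.d. data one would usually shuffle first, while time-series data
    needs an order-preserving split instead.
    """
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_rows) if i < start or i >= stop]
        yield train, validation
        start = stop


folds = list(k_fold_indices(10, 3))
for train_idx, valid_idx in folds:
    print(len(train_idx), valid_idx)
```

A model would be fit on each `train_idx` and scored on the matching `valid_idx`, and the scores averaged to compare candidate features or hyperparameters.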
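
Slide 19 outlines the SageMaker path: download the MOJO, upload a Docker image into ECR, create a SageMaker model, deploy an endpoint. The sketch below covers only the last two steps, using the `create_model`, `create_endpoint_config` and `create_endpoint` operations of the boto3 SageMaker client. It assumes the image already in ECR contains the MOJO and a scoring server, which the slides do not spell out. The image URI, role ARN, resource names and instance type are placeholders, and `deploy` is a hypothetical helper.

```python
from typing import Any, Dict


def build_model_request(model_name: str, image_uri: str, role_arn: str) -> Dict[str, Any]:
    """Arguments for create_model; the container image is assumed to bundle the MOJO."""
    return {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri},
        "ExecutionRoleArn": role_arn,
    }


def build_endpoint_config_request(
    config_name: str,
    model_name: str,
    instance_type: str = "ml.m5.large",  # placeholder instance type
    instance_count: int = 1,
) -> Dict[str, Any]:
    """Arguments for create_endpoint_config with a single production variant."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    }


def deploy(client: Any, name: str, image_uri: str, role_arn: str) -> str:
    """Create the model, the endpoint config, and the endpoint; return the endpoint name.

    `client` is expected to behave like boto3.client("sagemaker").
    Endpoint creation is asynchronous, so the endpoint is not ready on return.
    """
    config_name = f"{name}-config"
    client.create_model(**build_model_request(name, image_uri, role_arn))
    client.create_endpoint_config(**build_endpoint_config_request(config_name, name))
    client.create_endpoint(EndpointName=name, EndpointConfigName=config_name)
    return name
```

With AWS credentials configured, a call might look like `deploy(boto3.client("sagemaker"), "dai-scorer", "<account>.dkr.ecr.<region>.amazonaws.com/<repo>:latest", "<role-arn>")`, where every argument is a placeholder to replace. Passing the client in keeps the request-building logic testable without touching AWS.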