Bring Your Own Recipes
Make Your Own AI
Confidential2 Confidential2
H2O Aquarium
• aquarium.h2o.ai
• H2O.ai’s software-as-a-service platform for training and initial exploration
• Recommended for training, workshops, and tutorials
• Driverless AI Test Drive
• https://github.com/h2oai/tutorials/blob/master/DriverlessAI/Test-Drive/test-drive.md
• Your data will disappear once the session's time period ends
• Run as many times as needed
Confidential3
Make Your Own AI: Agenda
• Where does BYOR fit into Driverless AI?
• What are custom recipes?
• Tutorial: Using custom recipes
• What does it take to write a recipe?
• Example deep dive with the experts
Confidential4
Key Capabilities of H2O Driverless AI
• Automatic Feature Engineering
• Automatic Visualization
• Machine Learning Interpretability (MLI)
• Automatic Scoring Pipelines
• Natural Language Processing
• Time Series Forecasting
• Flexibility of Data & Deployment
• NVIDIA GPU Acceleration
• Bring-Your-Own Recipes
Confidential5
Driverless AI Across Industries
Confidential6
The Workflow of Driverless AI
(Workflow diagram)
• Data sources: SQL, HDFS, Snowflake, Amazon S3, Google BigQuery, Azure Blob Storage
• 1 Drag and Drop Data
  • Modelling dataset (X, Y)
• 2 Automatic Visualization
• 3 Bring Your Own Recipes
  • Upload your own recipe(s): Transformations, Algorithms, Scorers
  • Driverless AI executes automation on your recipes: feature engineering, model selection, hyper-parameter tuning, overfitting protection
  • Driverless AI automates model scoring and deployment using your recipes
• 4 Automatic Model Optimization
  • Advanced Feature Engineering + Algorithm + Model Tuning (Survival of the Fittest)
  • Model Recipes: i.i.d. data, time-series, more on the way
• 5 Automatic Scoring Pipelines
  • Deploy low-latency scoring to production
• Model Documentation
Confidential7
What is a Recipe…
• Machine learning pipelines model prepared data to answer a business question
• Transformations are applied to the original data to make it clean and as predictive as possible
• Additional datasets may be brought in to add insights
• The data is modeled using an algorithm to find the optimal rules to solve the problem
• We determine the best model by using a specific metric, or scorer
• BYOR stands for Bring Your Own Recipe. It lets domain scientists solve their problems faster and with more precision by adding their expertise in the form of Python code snippets
• By providing your own custom recipes, you can gain control over the optimization choices that Driverless AI makes to best solve your machine learning problems
Confidential8
…and Why Do You Care?
• Flexibility, extensibility and customizations built into the Driverless AI platform
• New open source recipes built by the data science community, curated by Kaggle Grand Masters @ H2O.ai
• Data scientists can focus on domain-specific functions to build customizations
• 1-click upload of your recipes – models, scorers and transformations
• Driverless AI treats custom recipes as first-class citizens in the automatic machine learning workflow
• Every business can have a recipe cookbook for collaborative data science within their organization
Confidential9
https://h2oai.github.io/tutorials/
Confidential11
The Writing Recipes Process
• First, write and test your idea on sample data before wrapping it as a recipe
• Download the Driverless AI Recipes Repository for easy access to examples
• Use the Recipe Templates to ensure you have all required components
https://github.com/h2oai/driverlessai-recipes
Confidential12
What does it take to write a custom recipe?
• Somewhere to write .py files
• To use or test your recipe, you need Driverless AI 1.7.0 or later
  • BYOR is not available in the current LTS release series (1.6.X)
• To test your code locally, you need
  • Python 3.6, numpy, datatable, and the Driverless AI Python client
  • A Python development environment such as PyCharm or Spyder
• To write recipes, you need
  • The ability to write Python code
Confidential13
The Testing Recipes Process
• Upload to Driverless AI to automatically test on sample data
or
• Use the DAI Python or R client to automate this process
or
• Test locally using a dummy version of the RecipeTransformer class we will be extending
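The third option, testing locally against a dummy base class, could look roughly like the sketch below. It is an illustration only and not the Driverless AI API: `DummyCustomTransformer` and `LogLengthTransformer` are hypothetical names, and plain Python lists stand in for the datatable frames a real recipe would receive.

```python
import math


class DummyCustomTransformer:
    """Hypothetical stand-in for the base class a real recipe would extend."""

    def fit_transform(self, X, y=None):
        raise NotImplementedError

    def transform(self, X):
        raise NotImplementedError


class LogLengthTransformer(DummyCustomTransformer):
    """Toy recipe: maps each value to log(1 + number of characters)."""

    def fit_transform(self, X, y=None):
        # This transform learns nothing from the data, so reuse transform().
        return self.transform(X)

    def transform(self, X):
        return [math.log1p(len(str(value))) for value in X]


# Try the idea on a tiny sample before wrapping it as a real recipe.
sample = ["h2o", "driverless", ""]
features = LogLengthTransformer().fit_transform(sample)
print(features)
```

Once the logic behaves as expected on sample data, the same two methods can be moved into a class that extends the real base class and uploaded for the automatic tests.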
Confidential14
What if I get stuck writing a custom recipe?
• Use error messages and stack traces from Driverless AI and your Python development environment to pinpoint what is causing the problem
• Write to the Driverless AI Experiment Logs (Example in Advanced Options below)
• Read the FAQ and look at the templates: https://github.com/h2oai/driverlessai-recipes
• Follow along with the tutorial (Coming Soon): https://h2oai.github.io/tutorials/
• Ask on the community channel: https://www.h2o.ai/community/
Confidential15
Build Your Own Recipe
Full customization of the entire ML pipeline through a scikit-learn-style Python API
Custom Feature Engineering – fit_transform & transform
• Custom statistical transformations and embeddings for numbers, categories, text, date/time, time-series, image, audio, zip, lat/long, ICD, ...
Custom Optimization Functions – f(id, actual, predicted, weight)
• Ranking, Pricing, Yield Scoring, Cost/Reward, any Business Metrics
Custom ML Algorithms – fit & predict
• Access to ML ecosystem: H2O-3, sklearn, Keras, PyTorch, CatBoost, etc.
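To make the scorer and algorithm shapes concrete, here is a minimal sketch in plain Python. Everything in it is assumed for illustration: `cost_weighted_error`, its cost parameters, and `MajorityClassModel` are hypothetical names, the `id` argument from the slide is left out, and none of this is the Driverless AI base-class API.

```python
def cost_weighted_error(actual, predicted, sample_weight=None,
                        miss_cost=5.0, false_alarm_cost=1.0):
    """Hypothetical business metric: average weighted cost of wrong
    binary predictions (lower is better)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(actual)
    total_cost = 0.0
    for y, p, w in zip(actual, predicted, sample_weight):
        if y == 1 and p == 0:
            total_cost += w * miss_cost          # missed positive
        elif y == 0 and p == 1:
            total_cost += w * false_alarm_cost   # false alarm
    return total_cost / sum(sample_weight)


class MajorityClassModel:
    """Toy algorithm with the fit/predict shape: always predicts the
    most common training label."""

    def fit(self, X, y):
        labels = list(y)
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


X_train, y_train = [[0], [1], [2], [3]], [0, 0, 0, 1]
model = MajorityClassModel().fit(X_train, y_train)
predictions = model.predict(X_train)
score = cost_weighted_error(y_train, predictions)
print(predictions, score)  # [0, 0, 0, 0] 1.25
```

The scorer is just a function of actual values, predictions, and optional weights, which is why any business metric that can be written this way can drive model selection.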
Confidential16
https://h2oai.github.io/tutorials/
Confidential17 Confidential17
Dive into H2O
https://www.eventbrite.com/e/dive-into-h2o-new-york-tickets-76351721053
Confidential18
More details:
FAQ / Architecture Diagram etc.
https://github.com/h2oai/driverlessai-recipes
Confidential19
Bring Your Own Recipes
• What is BYOR?
• Building a Transformer
• Building a Scorer
• Building a Model Algorithm
• Advanced Options
• Writing Recipes Help
Confidential20
Advanced Options: Importing Packages
• Install and use the exact version of the exact package you need for your recipe
• _global_modules_needed_by_name
  • Use before class definition for when there are multiple recipes in one file that need the package
• _modules_needed_by_name
  • Use in the class definition

    """Row-by-row similarity between two text columns based on FuzzyWuzzy"""
    # https://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
    # https://github.com/seatgeek/fuzzywuzzy
    from h2oaicore.transformer_utils import CustomTransformer
    import datatable as dt
    import numpy as np

    _global_modules_needed_by_name = ['nltk==3.4.3']
    import nltk
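The slide's snippet shows only the file-level list. The sketch below places the two declarations side by side. It is a hedged illustration: `StandInTransformer`, `FuzzyExampleTransformer`, and `requirements_for` are hypothetical, the unpinned `fuzzywuzzy` entry is an example value, and the helper only shows how the two lists relate, not how Driverless AI actually resolves or installs packages.

```python
# File level: shared by every recipe defined in this file.
_global_modules_needed_by_name = ["nltk==3.4.3"]


class StandInTransformer:
    """Hypothetical placeholder for the real CustomTransformer base class."""
    _modules_needed_by_name = []


class FuzzyExampleTransformer(StandInTransformer):
    # Class level: only this recipe needs the package.
    _modules_needed_by_name = ["fuzzywuzzy"]


def requirements_for(recipe_class):
    """Illustrative helper: file-level pins first, then class-level ones."""
    return _global_modules_needed_by_name + list(recipe_class._modules_needed_by_name)


print(requirements_for(FuzzyExampleTransformer))  # ['nltk==3.4.3', 'fuzzywuzzy']
```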
Confidential21
Advanced Options: Similar Recipes
• Extend your custom recipes when there are multiple options or similar methods and you want all of them to be tested

    class FuzzyQRatioTransformer(FuzzyBaseTransformer, CustomTransformer):
        _method = "QRatio"

    class FuzzyWRatioTransformer(FuzzyBaseTransformer, CustomTransformer):
        _method = "WRatio"

    class ZipcodeTypeTransformer(ZipcodeLightBaseTransformer, CustomTransformer):
        def get_property_name(self, value):
            return 'zip_code_type'

    class ZipcodeCityTransformer(ZipcodeLightBaseTransformer, CustomTransformer):
        def get_property_name(self, value):
            return 'city'
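The pattern behind these one-line subclasses is a shared base class that looks up the variant named by a class attribute. The sketch below reproduces that idea with the standard library's `difflib` in place of FuzzyWuzzy, so it runs anywhere. `SimilarityBase` and its subclasses are hypothetical names, and the real `FuzzyBaseTransformer` may be organized differently.

```python
import difflib


class SimilarityBase:
    """Shared logic; subclasses only choose which similarity method to call."""

    _method = None  # name of a difflib.SequenceMatcher method

    def transform(self, pairs):
        scores = []
        for left, right in pairs:
            matcher = difflib.SequenceMatcher(None, left, right)
            # Dispatch on the class attribute, as the recipes do with _method.
            scores.append(getattr(matcher, self._method)())
        return scores


class RatioTransformer(SimilarityBase):
    _method = "ratio"


class QuickRatioTransformer(SimilarityBase):
    _method = "quick_ratio"


pairs = [("new york", "new york"), ("boston", "austin")]
for recipe in (RatioTransformer, QuickRatioTransformer):
    print(recipe.__name__, recipe().transform(pairs))
```

Because each variant is its own class, each can be offered to the experiment as a separate candidate and compared on equal terms.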
Confidential22
Advanced Options: Recipe Parameters
• set_default_params
  • Parameters of models or transformers
  • Access in functions with self.params

    from h2oaicore.models import CustomModel
    from h2oaicore.systemutils import physical_cores_count

    class ExtraTreesModel(CustomModel):
        _display_name = "ExtraTrees"
        _description = "Extra Trees Model based on sklearn"

        def set_default_params(self, accuracy=None, time_tolerance=None,
                               interpretability=None, **kwargs):
            # Values passed in kwargs override the defaults; n_estimators is capped at 1000
            self.params = dict(
                random_state=kwargs.get("random_state", 1234),
                n_estimators=min(kwargs.get("n_estimators", 100), 1000),
                criterion="gini" if self.num_classes >= 2 else "mse",
                n_jobs=self.params_base.get('n_jobs', max(1, physical_cores_count)),
            )
Confidential23
Advanced Options: Recipe Parameters
• mutate_params
  • Random permutations of parameter options for transformers and models
  • The options chosen in the final model can be found in the Auto Doc

    import numpy as np

    class ExtraTreesModel(CustomModel):
        _display_name = "ExtraTrees"
        _description = "Extra Trees Model based on sklearn"

        def mutate_params(self, accuracy=10, **kwargs):
            # Higher accuracy settings allow larger forests
            if accuracy > 8:
                estimators_list = [100, 200, 300, 500, 1000, 2000]
            elif accuracy >= 5:
                estimators_list = [50, 100, 200, 300, 400, 500]
            else:
                estimators_list = [10, 50, 100, 150, 200, 250, 300]
            # Modify certain parameters for tuning
            self.params["n_estimators"] = int(np.random.choice(estimators_list))
            self.params["criterion"] = np.random.choice(["gini", "entropy"]) if self.num_classes >= 2 \
                else np.random.choice(["mse", "mae"])
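To try the two parameter hooks without a Driverless AI installation, the sketch below uses a stand-in base class and the standard library's `random` module. `StandInModel` and `ToyTreesModel` are hypothetical, and the candidate lists are shortened for illustration; this shows how defaults and mutations interact, not the platform's tuning logic.

```python
import random


class StandInModel:
    """Hypothetical placeholder for CustomModel with only what this sketch needs."""

    def __init__(self, num_classes=2):
        self.num_classes = num_classes
        self.params = {}


class ToyTreesModel(StandInModel):
    def set_default_params(self, **kwargs):
        # Caller-supplied values win, but n_estimators is capped at 1000.
        self.params = dict(
            random_state=kwargs.get("random_state", 1234),
            n_estimators=min(kwargs.get("n_estimators", 100), 1000),
            criterion="gini" if self.num_classes >= 2 else "mse",
        )

    def mutate_params(self, accuracy=10, **kwargs):
        # Higher accuracy settings unlock larger candidate values.
        candidates = [100, 500, 2000] if accuracy > 8 else [10, 50, 100]
        self.params["n_estimators"] = random.choice(candidates)


model = ToyTreesModel(num_classes=2)
model.set_default_params(n_estimators=5000)
capped = model.params["n_estimators"]  # 1000 after the cap

random.seed(0)  # fixed seed so a local test run is repeatable
model.mutate_params(accuracy=3)
mutated = model.params["n_estimators"]
print(capped, model.params["criterion"], mutated)
```

Note that a mutation overwrites the default directly, so values chosen in `mutate_params` are not subject to the cap applied in `set_default_params`.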
Confidential24
Advanced Options: Writing to Logs
• Leave notes in the experiment logs

    from h2oaicore.systemutils import make_experiment_logger, loggerinfo, loggerwarning
    ...
    # Default to None so the logger name is defined when there is no experiment context
    logger = None
    if self.context and self.context.experiment_id:
        logger = make_experiment_logger(
            experiment_id=self.context.experiment_id,
            tmp_dir=self.context.tmp_dir,
            experiment_tmp_dir=self.context.experiment_tmp_dir
        )
    ...
    loggerinfo(logger, "Prophet will use {} workers for fitting".format(n_jobs))
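When a recipe is tested locally, `h2oaicore` is not importable, so a small stand-in for the logging call can keep the same call sites working. The sketch below is an assumption-heavy illustration: `log_info` and `ListHandler` are hypothetical, and the fallback to `print` when no logger exists is a choice made here, not documented behavior of `loggerinfo`.

```python
import logging


def log_info(logger, message):
    """Hypothetical local stand-in for loggerinfo that tolerates a missing logger."""
    if logger is None:
        print(message)
    else:
        logger.info(message)


# Outside an experiment there is no context, so the logger stays None.
logger = None
log_info(logger, "Prophet will use {} workers for fitting".format(4))

# With a logger object, the same call is routed to the log instead.
captured = []


class ListHandler(logging.Handler):
    """Collects messages in a list so the sketch can inspect them."""

    def emit(self, record):
        captured.append(record.getMessage())


local_logger = logging.getLogger("recipe_sketch")
local_logger.setLevel(logging.INFO)
local_logger.propagate = False
local_logger.addHandler(ListHandler())
log_info(local_logger, "fit finished")
```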

Más contenido relacionado

La actualidad más candente

From Chatbots to Augmented Conversational Assistants
From Chatbots to Augmented Conversational AssistantsFrom Chatbots to Augmented Conversational Assistants
From Chatbots to Augmented Conversational AssistantsDatabricks
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2OSri Ambati
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Sri Ambati
 
Driverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on LabDriverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on LabSri Ambati
 
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...Sri Ambati
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Sri Ambati
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Sri Ambati
 
Weave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any KubernetesWeave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any KubernetesWeaveworks
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneSri Ambati
 
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Databricks
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoSri Ambati
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceDatabricks
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformDatabricks
 
Infrastructure Solutions for Deploying AI/ML/DL Workloads at Scale
Infrastructure Solutions for Deploying AI/ML/DL Workloads at ScaleInfrastructure Solutions for Deploying AI/ML/DL Workloads at Scale
Infrastructure Solutions for Deploying AI/ML/DL Workloads at ScaleRobb Boyd
 
Magdalena Stenius: MLOPS Will Change Machine Learning
Magdalena Stenius: MLOPS Will Change Machine LearningMagdalena Stenius: MLOPS Will Change Machine Learning
Magdalena Stenius: MLOPS Will Change Machine LearningLviv Startup Club
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 
H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI WorkshopSri Ambati
 
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Weaveworks
 

La actualidad más candente (20)

From Chatbots to Augmented Conversational Assistants
From Chatbots to Augmented Conversational AssistantsFrom Chatbots to Augmented Conversational Assistants
From Chatbots to Augmented Conversational Assistants
 
Machine Learning with H2O
Machine Learning with H2OMachine Learning with H2O
Machine Learning with H2O
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
 
Driverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on LabDriverless AI - Intro + Interactive Hands-on Lab
Driverless AI - Intro + Interactive Hands-on Lab
 
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...
Jakub Hava, H2O.ai - Productionizing Apache Spark Models using H2O - H2O Worl...
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
 
Weave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any KubernetesWeave GitOps - continuous delivery for any Kubernetes
Weave GitOps - continuous delivery for any Kubernetes
 
MLOps with Kubeflow
MLOps with Kubeflow MLOps with Kubeflow
MLOps with Kubeflow
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
 
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
 
Productionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness MarketplaceProductionizing Machine Learning in Our Health and Wellness Marketplace
Productionizing Machine Learning in Our Health and Wellness Marketplace
 
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML PlatformHow to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform
 
Infrastructure Solutions for Deploying AI/ML/DL Workloads at Scale
Infrastructure Solutions for Deploying AI/ML/DL Workloads at ScaleInfrastructure Solutions for Deploying AI/ML/DL Workloads at Scale
Infrastructure Solutions for Deploying AI/ML/DL Workloads at Scale
 
Magdalena Stenius: MLOPS Will Change Machine Learning
Magdalena Stenius: MLOPS Will Change Machine LearningMagdalena Stenius: MLOPS Will Change Machine Learning
Magdalena Stenius: MLOPS Will Change Machine Learning
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
H2O Driverless AI Workshop
H2O Driverless AI WorkshopH2O Driverless AI Workshop
H2O Driverless AI Workshop
 
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
 

Similar a Get Started with Driverless AI Recipes - Hands-on Training

KrishnaToolComparisionPPT.pdf
KrishnaToolComparisionPPT.pdfKrishnaToolComparisionPPT.pdf
KrishnaToolComparisionPPT.pdfQA or the Highway
 
Magento 2 Workflows
Magento 2 WorkflowsMagento 2 Workflows
Magento 2 WorkflowsRyan Street
 
The Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With PuppetThe Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With PuppetMike Merideth
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP StackLorna Mitchell
 
Introduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitIntroduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitJennifer Davis
 
Test Automation you'll actually Like - Gauge by ThoughtWorks
Test Automation you'll actually Like - Gauge by ThoughtWorksTest Automation you'll actually Like - Gauge by ThoughtWorks
Test Automation you'll actually Like - Gauge by ThoughtWorksGemunu Priyadarshana
 
Beyond Domino Designer
Beyond Domino DesignerBeyond Domino Designer
Beyond Domino DesignerPaul Withers
 
The Art & Zen of Managing Nagios with Puppet
The Art & Zen of Managing Nagios with PuppetThe Art & Zen of Managing Nagios with Puppet
The Art & Zen of Managing Nagios with PuppetVictorOps
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleSri Ambati
 
Emerging chef patterns and practices
Emerging chef patterns and practicesEmerging chef patterns and practices
Emerging chef patterns and practicesOwain Perry
 
Automated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAutomated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAlberto Molina Coballes
 
Release Management with Visual Studio Team Services and Office Dev PnP
Release Management with Visual Studio Team Services and Office Dev PnPRelease Management with Visual Studio Team Services and Office Dev PnP
Release Management with Visual Studio Team Services and Office Dev PnPPetter Skodvin-Hvammen
 
Continuous Integration at Mollie
Continuous Integration at MollieContinuous Integration at Mollie
Continuous Integration at Molliewillemstuursma
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...Evans Ye
 
WordPress Under Control (Boston WP Meetup)
WordPress Under Control (Boston WP Meetup)WordPress Under Control (Boston WP Meetup)
WordPress Under Control (Boston WP Meetup)Matt Bernhardt
 
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit EuropeAutomation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit EuropeAppDynamics
 
DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly
DevOpsGuys - DevOps Automation - The Good, The Bad and The UglyDevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly
DevOpsGuys - DevOps Automation - The Good, The Bad and The UglyDevOpsGroup
 

Similar a Get Started with Driverless AI Recipes - Hands-on Training (20)

KrishnaToolComparisionPPT.pdf
KrishnaToolComparisionPPT.pdfKrishnaToolComparisionPPT.pdf
KrishnaToolComparisionPPT.pdf
 
Magento 2 Workflows
Magento 2 WorkflowsMagento 2 Workflows
Magento 2 Workflows
 
The Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With PuppetThe Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With Puppet
 
Tool up your lamp stack
Tool up your lamp stackTool up your lamp stack
Tool up your lamp stack
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP Stack
 
Introduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitIntroduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen Summit
 
Test Automation you'll actually Like - Gauge by ThoughtWorks
Test Automation you'll actually Like - Gauge by ThoughtWorksTest Automation you'll actually Like - Gauge by ThoughtWorks
Test Automation you'll actually Like - Gauge by ThoughtWorks
 
Beyond Domino Designer
Beyond Domino DesignerBeyond Domino Designer
Beyond Domino Designer
 
The Art & Zen of Managing Nagios with Puppet
The Art & Zen of Managing Nagios with PuppetThe Art & Zen of Managing Nagios with Puppet
The Art & Zen of Managing Nagios with Puppet
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
 
Emerging chef patterns and practices
Emerging chef patterns and practicesEmerging chef patterns and practices
Emerging chef patterns and practices
 
Automated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAutomated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. Ansible
 
Release Management with Visual Studio Team Services and Office Dev PnP
Release Management with Visual Studio Team Services and Office Dev PnPRelease Management with Visual Studio Team Services and Office Dev PnP
Release Management with Visual Studio Team Services and Office Dev PnP
 
Continuous Integration at Mollie
Continuous Integration at MollieContinuous Integration at Mollie
Continuous Integration at Mollie
 
Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platform
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
WordPress Under Control (Boston WP Meetup)
WordPress Under Control (Boston WP Meetup)WordPress Under Control (Boston WP Meetup)
WordPress Under Control (Boston WP Meetup)
 
Ci2
Ci2Ci2
Ci2
 
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit EuropeAutomation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
 
DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly
DevOpsGuys - DevOps Automation - The Good, The Bad and The UglyDevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly
DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly
 

Más de Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thSri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMsSri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the WaySri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OSri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersSri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email AgainSri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 

Get Started with Driverless AI Recipes - Hands-on Training

  • 1. Bring Your Own Recipes: Make Your Own AI
  • 2. H2O Aquarium
      • aquarium.h2o.ai
      • H2O.ai’s software-as-a-service platform for training and initial exploration
      • Recommended for training, workshops and tutorials
      • Driverless AI Test Drive: https://github.com/h2oai/tutorials/blob/master/DriverlessAI/Test-Drive/test-drive.md
      • Your data will disappear after the time period
      • Run as many times as needed
  • 3. Make Your Own AI: Agenda
      • Where does BYOR fit into Driverless AI?
      • What are custom recipes?
      • Tutorial: Using custom recipes
      • What does it take to write a recipe?
      • Example deep dive with the experts
  • 4. Key Capabilities of H2O Driverless AI
      • Automatic Feature Engineering
      • Automatic Visualization
      • Machine Learning Interpretability (MLI)
      • Automatic Scoring Pipelines
      • Natural Language Processing
      • Time Series Forecasting
      • Flexibility of Data & Deployment
      • NVIDIA GPU Acceleration
      • Bring-Your-Own Recipes
  • 6. The Workflow of Driverless AI (diagram)
      • Data sources: SQL, HDFS, Snowflake, Amazon S3, Google BigQuery, Azure Blob Storage
      • Steps: 1 Drag and Drop Data; 2 Automatic Visualization; 3 Bring Your Own Recipes; 4 Automatic Model Optimization; 5 Automatic Scoring Pipelines
      • Model recipes: i.i.d. data, time-series, more on the way
      • Automatic Model Optimization: Advanced Feature Engineering + Algorithm + Model Tuning (Survival of the Fittest)
      • Bring Your Own Recipes:
          • Upload your own recipe(s): Transformations, Algorithms, Scorers
          • Driverless AI executes automation on your recipes: feature engineering, model selection, hyper-parameter tuning, overfitting protection
          • Driverless AI automates model scoring and deployment using your recipes
      • Outputs: Model Documentation; Automatic Scoring Pipeline; Deploy Low-latency Scoring to Production
  • 7. What is a Recipe…
      • Machine learning pipelines model prepped data to solve a business question
      • Transformations are done on the original data to ensure it’s clean and most predictive
      • Additional datasets may be brought in to add insights
      • The data is modeled using an algorithm to find the optimal rules to solve the problem
      • We determine the best model by using a specific metric, or scorer
      • BYOR stands for Bring Your Own Recipe; it allows domain scientists to solve their problems faster and with more precision by adding their expertise in the form of Python code snippets
      • By providing your own custom recipes, you can gain control over the optimization choices that Driverless AI makes to best solve your machine learning problems
  • 8. …and Why Do You Care?
      • Flexibility, extensibility and customizations built into the Driverless AI platform
      • New open source recipes built by the data science community, curated by Kaggle Grand Masters @ H2O.ai
      • Data scientists can focus on domain-specific functions to build customizations
      • 1-click upload of your recipes – models, scorers and transformations
      • Driverless AI treats custom recipes as first-class citizens in the automatic machine learning workflow
      • Every business can have a recipe cookbook for collaborative data science within their organization
  • 10. H2O Aquarium
      • aquarium.h2o.ai
      • H2O.ai’s software-as-a-service platform for training and initial exploration
      • Recommended for training, workshops and tutorials
      • Driverless AI Test Drive: https://github.com/h2oai/tutorials/blob/master/DriverlessAI/Test-Drive/test-drive.md
      • Your data will disappear after the time period
      • Run as many times as needed
  • 11. The Writing Recipes Process
      • First write and test your idea on sample data before wrapping it as a recipe
      • Download the Driverless AI Recipes Repository for easy access to examples
      • Use the Recipe Templates to ensure you have all required components
      • https://github.com/h2oai/driverlessai-recipes
  • 12. What does it take to write a custom recipe?
      • Somewhere to write .py files
      • To use or test your recipe you need Driverless AI 1.7.0 or later
          • BYOR is not available in the current LTS release series (1.6.X)
      • To test your code locally you need
          • Python 3.6, numpy, datatable, and the Driverless AI Python client
          • A Python development environment such as PyCharm or Spyder
      • To write recipes you need
          • The ability to write Python code
  • 13. The Testing Recipes Process
      • Upload to Driverless AI to automatically test on sample data, or
      • Use the DAI Python or R client to automate this process, or
      • Test locally using a dummy version of the RecipeTransformer class we will be extending
  • 14. What if I get stuck writing a custom recipe?
      • Use error messages and stack traces from Driverless AI and your Python development environment to pinpoint what is causing the problem
      • Write to the Driverless AI Experiment Logs (example in Advanced Options below)
      • Read the FAQ and look at the templates: https://github.com/h2oai/driverlessai-recipes
      • Follow along with the tutorial (Coming Soon): https://h2oai.github.io/tutorials/
      • Ask on the community channel: https://www.h2o.ai/community/
  • 15. Build Your Own Recipe
      • Full customization of the entire ML pipeline through a scikit-learn-style Python API
      • Custom Feature Engineering – fit_transform & transform
          • Custom statistical transformations and embeddings for numbers, categories, text, date/time, time-series, image, audio, zip, lat/long, ICD, ...
      • Custom Optimization Functions – f(id, actual, predicted, weight)
          • Ranking, Pricing, Yield Scoring, Cost/Reward, any Business Metrics
      • Custom ML Algorithms – fit & predict
          • Access to ML ecosystem: H2O-3, sklearn, Keras, PyTorch, CatBoost, etc.
  • 17. Dive into H2O
      • https://www.eventbrite.com/e/dive-into-h2o-new-york-tickets-76351721053
  • 18. More details: FAQ / Architecture Diagram etc.
      • https://github.com/h2oai/driverlessai-recipes
  • 19. Bring Your Own Recipes
      • What is BYOR?
      • Building a Transformer
      • Building a Scorer
      • Building a Model Algorithm
      • Advanced Options
      • Writing Recipes Help
  • 20. Advanced Options: Importing Packages
      • Install and use the exact version of the exact package you need for your recipe
      • _global_modules_needed_by_name
          • Use before the class definition when multiple recipes in one file need the package
      • _modules_needed_by_name
          • Use in the class definition

        """Row-by-row similarity between two text columns based on FuzzyWuzzy"""
        # https://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
        # https://github.com/seatgeek/fuzzywuzzy
        from h2oaicore.transformer_utils import CustomTransformer
        import datatable as dt
        import numpy as np

        _global_modules_needed_by_name = ['nltk==3.4.3']
        import nltk
  • 21. Advanced Options: Similar Recipes
      • Extend your custom recipes when there are multiple options or similar methods and you want all of them to be tested

        class FuzzyQRatioTransformer(FuzzyBaseTransformer, CustomTransformer):
            _method = "QRatio"

        class FuzzyWRatioTransformer(FuzzyBaseTransformer, CustomTransformer):
            _method = "WRatio"

        class ZipcodeTypeTransformer(ZipcodeLightBaseTransformer, CustomTransformer):
            def get_property_name(self, value):
                return 'zip_code_type'

        class ZipcodeCityTransformer(ZipcodeLightBaseTransformer, CustomTransformer):
            def get_property_name(self, value):
                return 'city'
  • 22. Advanced Options: Recipe Parameters
      • set_default_params
          • Parameters of models or transformers
          • Access in functions with self.params

        from h2oaicore.systemutils import physical_cores_count

        class ExtraTreesModel(CustomModel):
            _display_name = "ExtraTrees"
            _description = "Extra Trees Model based on sklearn"

            def set_default_params(self, accuracy=None, time_tolerance=None,
                                   interpretability=None, **kwargs):
                self.params = dict(
                    random_state=kwargs.get("random_state", 1234),
                    n_estimators=min(kwargs.get("n_estimators", 100), 1000),
                    criterion="gini" if self.num_classes >= 2 else "mse",
                    n_jobs=self.params_base.get('n_jobs', max(1, physical_cores_count)))
  • 23. Advanced Options: Recipe Parameters
      • mutate_params
          • Random permutations of parameter options for transformers and models
          • Can get the options chosen in the final model from Auto Doc

        class ExtraTreesModel(CustomModel):
            _display_name = "ExtraTrees"
            _description = "Extra Trees Model based on sklearn"

            def mutate_params(self, accuracy=10, **kwargs):
                if accuracy > 8:
                    estimators_list = [100, 200, 300, 500, 1000, 2000]
                elif accuracy >= 5:
                    estimators_list = [50, 100, 200, 300, 400, 500]
                else:
                    estimators_list = [10, 50, 100, 150, 200, 250, 300]
                # Modify certain parameters for tuning
                self.params["n_estimators"] = int(np.random.choice(estimators_list))
                self.params["criterion"] = np.random.choice(["gini", "entropy"]) \
                    if self.num_classes >= 2 else np.random.choice(["mse", "mae"])
  • 24. Advanced Options: Writing to Logs
      • Leave notes in the experiment logs

        from h2oaicore.systemutils import make_experiment_logger, loggerinfo, loggerwarning
        ...
        if self.context and self.context.experiment_id:
            logger = make_experiment_logger(
                experiment_id=self.context.experiment_id,
                tmp_dir=self.context.tmp_dir,
                experiment_tmp_dir=self.context.experiment_tmp_dir
            )
        ...
        loggerinfo(logger, "Prophet will use {} workers for fitting".format(n_jobs))
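
The slides suggest testing a recipe locally against a dummy version of the transformer base class before uploading it. The sketch below shows one way that might look using only the Python standard library. DummyCustomTransformer and TextSimilarityTransformer are hypothetical names invented for this illustration, not the h2oaicore API, and difflib stands in for FuzzyWuzzy so the example runs without extra packages. The real base class and data types (datatable frames) will differ.

```python
"""Local smoke test for a recipe-style transformer, without Driverless AI."""
from difflib import SequenceMatcher


class DummyCustomTransformer:
    """Hypothetical stand-in for the real base class (assumption, not h2oaicore)."""

    def fit_transform(self, X, y=None):
        raise NotImplementedError

    def transform(self, X):
        raise NotImplementedError


class TextSimilarityTransformer(DummyCustomTransformer):
    """Row-by-row similarity between two text columns, scaled to 0..100."""

    def fit_transform(self, X, y=None):
        # Nothing is learned from the data, so fitting reduces to transforming.
        return self.transform(X)

    def transform(self, X):
        # X is a list of (text_a, text_b) rows; one integer score per row.
        return [
            round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())
            for a, b in X
        ]


# Small sample data, in the spirit of "test the idea before wrapping it".
sample = [("New York", "new york"), ("Boston", "Austin"), ("abc", "xyz")]
scores = TextSimilarityTransformer().fit_transform(sample)
print(scores)
```

Once the logic behaves as expected on sample rows, the same fit_transform and transform bodies could be moved into a class that extends the real CustomTransformer, following the templates in the driverlessai-recipes repository.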

Editor's notes

  1. Took ~6 minutes without pre-warming. Datasets: /data/Smalldata/gbm_test/titanic.csv and /data/Kaggle/. CreditCard/CreditCard-train.csv. This is not up to date: https://github.com/h2oai/tutorials/blob/master/DriverlessAI/aquarium/aquarium.md
  2. DAI quick overview. Types of problems we can handle: time series, NLP, binary, multiclass, regression. New engineered features to get new value out of your data. Not a black box: MLI and AutoDoc. Production-ready code (including all data transformations). Recipes augment this process with your business knowledge.
  3. Driverless AI is a general-purpose platform that is applicable across industries. It is not built for a single vertical or use case, but can be used for a wide range of basically all supervised problems. Name a few use cases and industries. Domain scientists and SMEs are king when it comes to knowing their data and how to use it. Combining the two turbocharges time to solution. This is where recipes come in: we allow this expert knowledge to be added to Driverless AI, which refines the process for an individual use case. Horizontal, not vertical: core capabilities are agnostic, and specific datasets and use cases can be refined by domain expertise. Meant to save time; not to replace people but to augment them, making them more efficient and providing guidance on how to operate on the data.
  4. At a very high level, here’s how Driverless AI works: Ingest data from any data source: Hadoop, Snowflake, S3 object storage, Google BigQuery – Driverless AI is agnostic about the data source. Use Automatic Visualization and its various plots, graphics and charts to look at the data, and understand the data shape, outliers, missing values and so on. This is where a data scientist can quickly spot things such as bias in the data. Based on the problem type, Driverless AI will use recipes to do advanced feature engineering (automatically), while the model continues to iterate across thousands of choices, does parameter tuning, and looks for the best fit of the model. Finally, another amazing feature of Driverless AI is that it can build an automatic scoring pipeline, which means it can generate Python and Java code to deploy low latency scoring of that model into production. Imagine taking that scored model and propagating it across every edge device – on smart phones, or in cars, to continuously generate value. Through this process, Machine Learning Interpretability gives the data scientist the reason codes and insight into what model was generated and which features were used to build the model. Automatic documentation gives one an in-depth explanation of the entire feature engineering process. This satisfies that desire to have trust in AI with explainability. This entire process is done through a graphical user interface, making it easy for even a novice data scientist to be productive immediately. Of course, acceleration to achieve faster time to insight is important, and an IBM Power System server with GPUs (such as the AC922) will give the highest level of acceleration to gain results and insights faster. The slide makes reference to IID (Independent and Identically Distributed) data. 
This refers to data where the individual rows (or observations) are essentially independent of each other, unlike time series data, where rows are related because they are collected over time. For example, predicting whether a customer applying for a new mortgage is likely to default at some point in the future would use IID data such as age, income, current net worth and so on. For time series data, think of a utility company that tracks resource utilization over the course of days, weeks and years, where usage trends over those periods are a factor in the prediction.
  5. Recipes: bring in your own domain knowledge. Bring in existing IP and reuse it on top of the engine. Newer data scientists can use their seniors' IP ->
  6. Took ~6 minutes without pre-warming. Datasets: /data/Smalldata/gbm_test/titanic.csv and /data/Kaggle/. CreditCard/CreditCard-train.csv. This is not up to date: https://github.com/h2oai/tutorials/blob/master/DriverlessAI/aquarium/aquarium.md