SlideShare una empresa de Scribd logo
1 de 26
Descargar para leer sin conexión
Accelerate Your Machine
Learning Pipeline with
AutoML and MLflow
Elena Boiarskaia
Senior Data Scientist at H2O.ai
Eric Gudgion
Senior Principal Solutions Architect at H2O.ai
Agenda
§ Challenges with ML
§ Integration solution
§ Data Scientist workflow
§ Live Demo
§ Dev Ops workflow
Challenges with Machine Learning
▪ Feature selection
▪ Feature
transformation
▪ Select which
features
transformers work
▪ Reuse engineered
features
▪ Select scoring
metric
▪ Algorithm selection
▪ Hyperparameter
tuning
▪ Ensemble methods
• Model Training
• Feature Engineering
▪ Track experiments
▪ Select which model
to deploy
▪ Deploy models in
variety of
environments
• Model Deployment
▪ User friendly
interface
▪ Build visualizations
and dashboards
▪ Update visuals with
model predictions
• Presenting Results
Solution via Integration
Driverless AI:
▪ Automated feature
selection
▪ Automated feature
engineering
▪ Custom feature
transformers,
models and scoring
metrics
Driverless AI:
▪ Genetic algorithm
▪ Algorithm selection
▪ Hyperparameter
tuning
▪ Explainability
▪ Stand alone model
object (MOJO)
• Model Training
• Feature Engineering
Mlflow:
▪ Track experiments
▪ Reproducible
projects
▪ Model deployment
▪ Model registry
• Model Deployment
Wave:
▪ Rapid app
development
▪ Python based
▪ Create dashboards
and visualizations
▪ Realtime apps
connected to
models and data
sources
• Presenting Results
Data Scientist Workflow
Example Notebook
Workflow
Update Update table with Databricks Delta
Score Score new data with Driverless AI model on Databricks
Log Log Driverless AI model in MLFlow
Train Send data to Driverless AI and train model
Prepare Prepare data with Spark on Databricks
Manage Store and manage data with Databricks Delta
Live Demo
Example Wave App Workflow
Import Data
Set up Driverless AI experiment
Automatically generate Databricks notebook to run and log experiment
Send notebook to a Databricks cluster to
run
Trains Driverless AI
model
Logs model in MLFlow
Dev Ops Workflow
Model Deployment Questions to Consider
• Batch
• Real-time
Type
• SLA
• Data size
• Concurrent requests
Scoring Latency
• Errors
• Metrics
Monitoring
Import Notebook
• Auto generated
• Integrate with existing
pipeline
• Easy 4 Step process
• Can schedule notebook
via UI
Simple Template Example
• Library provided for
different
environments
• Executes score using
C++ code
Internal Scorer
• Invoke call from
Databricks Worker nodes
• Notebook friendly API
• Create DataFrame
• Specify model to score
• Returns DataFrame with
predictions
Predictions
(MOJO)
Data
Warehouse
Example Execution
• Fast…
• Data loaded into
DataFrame
• Model specified
External Rest API Scorer
• Invoke call from Databricks
Worker nodes
• Notebook friendly API
• Create DataFrame
• Specify model to score
• Call Endpoint
• Call multiple models in one
call
• Returns DataFrame per model
• Handy to test new vs. old model
Predictions
(MOJO)
Data
Warehouse
Other Model
Users
Rest Endpoint
Example Execution
• Rest Endpoint runs
in the customers
VPC
• Feeds into Model
Monitoring
External Batch Scorer
• Use Databricks as a JDBC
data source
• Supports Delta Tables
and any Tables
accessible with SQL
• Connects via JDBC
connection
Predictions
(MOJO)
Data
Warehouse
Batch Job
CPU CPU CPU
Saving Results
§ Update existing table or
Insert into a new table
§ Update method
▪ Single row
▪ Bulk Upload
§ Catching runtime data
changes
Conclusion
Databricks and H2O.ai integration offers:
• End to end pipeline from data management to model
deployment
• Highly scalable model training and scoring
• Leverage advanced automated ML with Driverless AI
• Advanced feature engineering and feature selection
• Highly accurate and explainable model
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.

Más contenido relacionado

La actualidad más candente

Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
Databricks
 

La actualidad más candente (20)

Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Azure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene PolonichkoAzure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene Polonichko
 
Databricks Overview for MLOps
Databricks Overview for MLOpsDatabricks Overview for MLOps
Databricks Overview for MLOps
 
Announcing Databricks Cloud (Spark Summit 2014)
Announcing Databricks Cloud (Spark Summit 2014)Announcing Databricks Cloud (Spark Summit 2014)
Announcing Databricks Cloud (Spark Summit 2014)
 
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflowAutomatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
 
Azure AI platform - Automated ML workshop
Azure AI platform - Automated ML workshopAzure AI platform - Automated ML workshop
Azure AI platform - Automated ML workshop
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDPBuild Large-Scale Data Analytics and AI Pipeline Using RayDP
Build Large-Scale Data Analytics and AI Pipeline Using RayDP
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
 
Encrypting and Protecting Your Data in Neo4j(Jeff_Tallman).pptx
Encrypting and Protecting Your Data in Neo4j(Jeff_Tallman).pptxEncrypting and Protecting Your Data in Neo4j(Jeff_Tallman).pptx
Encrypting and Protecting Your Data in Neo4j(Jeff_Tallman).pptx
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Getting Started with Databricks SQL Analytics
Getting Started with Databricks SQL AnalyticsGetting Started with Databricks SQL Analytics
Getting Started with Databricks SQL Analytics
 
Managing the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflowManaging the Complete Machine Learning Lifecycle with MLflow
Managing the Complete Machine Learning Lifecycle with MLflow
 
Wix's ML Platform
Wix's ML PlatformWix's ML Platform
Wix's ML Platform
 
Power BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data SolutionsPower BI for Big Data and the New Look of Big Data Solutions
Power BI for Big Data and the New Look of Big Data Solutions
 
Databricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI InitiativesDatabricks + Snowflake: Catalyzing Data and AI Initiatives
Databricks + Snowflake: Catalyzing Data and AI Initiatives
 

Similar a Accelerate Your ML Pipeline with AutoML and MLflow

Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
Lviv Startup Club
 

Similar a Accelerate Your ML Pipeline with AutoML and MLflow (20)

With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?With Automated ML, is Everyone an ML Engineer?
With Automated ML, is Everyone an ML Engineer?
 
Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
Vitalii Bondarenko and Eugene Berko "Cloud AI Platform as an accelerator of e...
 
Predicting Flights with Azure Databricks
Predicting Flights with Azure DatabricksPredicting Flights with Azure Databricks
Predicting Flights with Azure Databricks
 
05 entity framework
05 entity framework05 entity framework
05 entity framework
 
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
[DSC Europe 23] Petar Zecevic - ML in Production on Databricks
 
A practical guidance of the enterprise machine learning
A practical guidance of the enterprise machine learning A practical guidance of the enterprise machine learning
A practical guidance of the enterprise machine learning
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
 
How to JavaOne 2016 - Generate Customized Java 8 Code from Your Database [TUT...
How to JavaOne 2016 - Generate Customized Java 8 Code from Your Database [TUT...How to JavaOne 2016 - Generate Customized Java 8 Code from Your Database [TUT...
How to JavaOne 2016 - Generate Customized Java 8 Code from Your Database [TUT...
 
JavaOne2016 - How to Generate Customized Java 8 Code from Your Database [TUT4...
JavaOne2016 - How to Generate Customized Java 8 Code from Your Database [TUT4...JavaOne2016 - How to Generate Customized Java 8 Code from Your Database [TUT4...
JavaOne2016 - How to Generate Customized Java 8 Code from Your Database [TUT4...
 
Testing Tools Online Training.pdf
Testing Tools Online Training.pdfTesting Tools Online Training.pdf
Testing Tools Online Training.pdf
 
Machine learning
Machine learningMachine learning
Machine learning
 
Testing Tools course training institute hyderabad – Best software training in...
Testing Tools course training institute hyderabad – Best software training in...Testing Tools course training institute hyderabad – Best software training in...
Testing Tools course training institute hyderabad – Best software training in...
 
Deep-Dive to Application Insights
Deep-Dive to Application Insights Deep-Dive to Application Insights
Deep-Dive to Application Insights
 
GCP Deployment- Vertex AI
GCP Deployment- Vertex AIGCP Deployment- Vertex AI
GCP Deployment- Vertex AI
 
Productionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflowProductionalizing Models through CI/CD Design with MLflow
Productionalizing Models through CI/CD Design with MLflow
 
Micro strategy 7i
Micro strategy 7iMicro strategy 7i
Micro strategy 7i
 
OrigoDB - take the red pill
OrigoDB - take the red pillOrigoDB - take the red pill
OrigoDB - take the red pill
 
Bringing DevOps to the Database
Bringing DevOps to the DatabaseBringing DevOps to the Database
Bringing DevOps to the Database
 
Real world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.comReal world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.com
 
Azure Application insights - An Introduction
Azure Application insights - An IntroductionAzure Application insights - An Introduction
Azure Application insights - An Introduction
 

Más de Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

Más de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack Detection
 

Último

+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
ptikerjasaptiker
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
cnajjemba
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
wsppdmt
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 

Último (20)

5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptx
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 

Accelerate Your ML Pipeline with AutoML and MLflow

  • 1. Accelerate Your Machine Learning Pipeline with AutoML and MLflow Elena Boiarskaia Senior Data Scientist at H2O.ai Eric Gudgion Senior Principal Solutions Architect at H2O.ai
  • 2. Agenda § Challenges with ML § Integration solution § Data Scientist workflow § Live Demo § Dev Ops workflow
  • 3. Challenges with Machine Learning ▪ Feature selection ▪ Feature transformation ▪ Select which features transformers work ▪ Reuse engineered features ▪ Select scoring metric ▪ Algorithm selection ▪ Hyperparameter tuning ▪ Ensemble methods • Model Training • Feature Engineering ▪ Track experiments ▪ Select which model to deploy ▪ Deploy models in variety of environments • Model Deployment ▪ User friendly interface ▪ Build visualizations and dashboards ▪ Update visuals with model predictions • Presenting Results
  • 4. Solution via Integration Driverless AI: ▪ Automated feature selection ▪ Automated feature engineering ▪ Custom feature transformers, models and scoring metrics Driverless AI: ▪ Genetic algorithm ▪ Algorithm selection ▪ Hyperparameter tuning ▪ Explainability ▪ Stand alone model object (MOJO) • Model Training • Feature Engineering Mlflow: ▪ Track experiments ▪ Reproducible projects ▪ Model deployment ▪ Model registry • Model Deployment Wave: ▪ Rapid app development ▪ Python based ▪ Create dashboards and visualizations ▪ Realtime apps connected to models and data sources • Presenting Results
  • 6. Example Notebook Workflow Update Update table with Databricks Delta Score Score new data with Driverless AI model on Databricks Log Log Driverless AI model in MLFlow Train Send data to Driverless AI and train model Prepare Prepare data with Spark on Databricks Manage Store and manage data with Databricks Delta
  • 8. Example Wave App Workflow Import Data Set up Driverless AI experiment Automatically generate Databricks notebook to run and log experiment Send notebook to a Databricks cluster to run Trains Driverless AI model Logs model in MLFlow
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 15. Model Deployment Questions to Consider • Batch • Real-time Type • SLA • Data size • Concurrent requests Scoring Latency • Errors • Metrics Monitoring
  • 16. Import Notebook • Auto generated • Integrate with existing pipeline • Easy 4 Step process • Can schedule notebook via UI
  • 17. Simple Template Example • Library provided for different environments • Executes score using C++ code
  • 18. Internal Scorer • Invoke call from Databricks Worker nodes • Notebook friendly API • Create DataFrame • Specify model to score • Returns DataFrame with predictions Predictions (MOJO) Data Warehouse
  • 19. Example Execution • Fast… • Data loaded into DataFrame • Model specified
  • 20. External Rest API Scorer • Invoke call from Databricks Worker nodes • Notebook friendly API • Create DataFrame • Specify model to score • Call Endpoint • Call multiple models in one call • Returns DataFrame per model • Handy to test new vs. old model Predictions (MOJO) Data Warehouse Other Model Users Rest Endpoint
  • 21. Example Execution • Rest Endpoint runs in the customers VPC • Feeds into Model Monitoring
  • 22. External Batch Scorer • Use Databricks as a JDBC data source • Supports Delta Tables and any Tables accessible with SQL • Connects via JDBC connection Predictions (MOJO) Data Warehouse Batch Job CPU CPU CPU
  • 23.
  • 24. Saving Results § Update existing table or Insert into a new table § Update method ▪ Single row ▪ Bulk Upload § Catching runtime data changes
  • 25. Conclusion Databricks and H2O.ai integration offers: • End to end pipeline from data management to model deployment • Highly scalable model training and scoring • Leverage advanced automated ML with Driverless AI • Advanced feature engineering and feature selection • Highly accurate and explainable model
  • 26. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.