Intro to ML-Ops
- Presented by Avinash Patil,
DevOps and Budding ML-Ops
“Machine Learning means building a model from example inputs to make
data-driven predictions vs. following strictly static program instructions.”
Machine Learning Workflow
An orchestrated and repeatable pattern which systematically transforms and
processes information to create prediction solutions.
1. Asking the right question
2. Preparing data
3. Selecting the algorithm
4. Training the model
5. Testing the model
What is ML-Ops
★ MLOps is about building a scalable team: ML Researchers,
Data Engineers, Product Managers, DevOps.
★ Extension of DevOps that treats ML as a first-class citizen.
★ Infrastructure and tooling to productionize ML.
[Diagram labels: Software Engineering, Developer Operations, Machine Learning, ML-Ops]
Continuous Delivery for Machine Learning (CD4ML) :
a software engineering approach in which a cross-functional team produces machine learning
applications based on code, data, and models in small and safe increments that can be reproduced and
reliably released at any time, in short adaptation cycles
Challenges in a Typical Organization
Common functional silos in large organizations can create barriers, stifling the ability to automate the end-to-end process of
deploying ML applications to production.
I. Organizational Challenges: Different teams are involved, and each handover is like throwing work over the wall.
II. Technical Challenges: How to make the process reproducible and auditable. Because these teams use different
tools and follow different workflows, it becomes hard to automate the process end-to-end.
Technical Components of CD4ML
1. Discoverable and Accessible Data: Data Pipeline; collect data and make it available as a “Data Lake”
2. Reproducible Model Training: ML Pipeline; split data into Training and Validation Sets
3. Model Serving: Embedded Model / Model Deployed as a Service / Model Published as Data
4. Testing and Quality in Machine Learning: Validating Data Schemas, Component Integration, Model Quality, Model
Bias and Fairness
5. Experiment Tracking: Version-control the data and use Git versioning for data science experiments
6. Model Deployment: Deploy models to production and gather enough data to make statistically significant decisions
7. Continuous Delivery Orchestration: Provision and execute the ML Pipeline, releases, and automated governance
stages
8. Model Monitoring and Observability: Integrate tools for log aggregation, metrics, and ML model behavioral data
Discoverable and Accessible Data:
★ Gather data from your core transactional systems
★ Also bring in data sources from outside your organization
★ Organize data volumes as a Data Lake or a collection of real-time data streams
★ Data Pipeline: Transform, clean up, and de-normalize multiple files
★ Use Amazon S3 / Google Cloud Storage
★ Version Control the derived/transformed data as an artifact.
Reproducible Model Training
★ Process that takes data and code as input, and produces a trained ML model
as the output. This process usually involves data cleaning and pre-processing,
feature engineering, model and algorithm selection, model optimization and
evaluation.
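As a rough illustration of the idea above, the sketch below treats training as a function of data and parameters only. The toy "model" (a mean predictor), the `fingerprint` helper, and all parameter names are assumptions made for this example, not part of any specific tool. The point is that a fixed seed plus recorded data and parameter fingerprints lets two runs be compared and reproduced.

```python
import hashlib
import json
import random


def fingerprint(obj):
    """Stable SHA-256 of a JSON-serializable object (data, params, ...)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def split(rows, validation_ratio, seed):
    """Shuffle with a fixed seed so the split is identical on every run."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_ratio))
    return shuffled[:cut], shuffled[cut:]


def train(rows, params):
    """Toy training step: the 'model' predicts the mean training target."""
    train_rows, val_rows = split(rows, params["validation_ratio"], params["seed"])
    mean = sum(y for _, y in train_rows) / len(train_rows)
    val_error = sum(abs(y - mean) for _, y in val_rows) / len(val_rows)
    return {
        "model": {"mean": mean},
        "validation_mae": val_error,
        # Record which data and parameters produced this model.
        "data_version": fingerprint(rows),
        "params_version": fingerprint(params),
    }


rows = [[x, 2.0 * x] for x in range(10)]      # hypothetical (feature, target) pairs
params = {"seed": 42, "validation_ratio": 0.2}
run_a = train(rows, params)
run_b = train(rows, params)                    # same inputs, same output
```

Here `run_a == run_b` holds because nothing in the process depends on hidden state. Changing a single row changes `data_version`, which is what makes a run auditable later.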
Model Serving
★ Embedded Model: The model artifact is packaged together with the consuming application. E.g.
a serialized object file (Pickle in Python), or MLeap as a common format for TensorFlow and scikit-learn models.
★ Model Deployed as a Separate Service: The model is decoupled and wrapped in a service that consuming
applications call. Release versions are easy to upgrade, but because it is a distinct service, it
may introduce some latency. E.g. wrap your model for deployment to an MLaaS platform such as AWS
SageMaker.
★ Model Published as Data: The model is also treated and published independently, but the consuming
application ingests it as data at runtime. We have seen this used in streaming/real-time scenarios
where the application can subscribe to events that are published whenever a new model version is
released, and ingest them into memory while continuing to predict using the previous version.
E.g. Apache Spark model serving through a REST API.
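The "Model Published as Data" pattern can be sketched as follows. This is a minimal illustration under stated assumptions: the model is a toy threshold rule, the published payload is JSON (chosen here for simplicity instead of Pickle or MLeap), and `ModelHolder` and `on_model_published` are hypothetical names standing in for whatever event subscription the application uses.

```python
import json
import threading


class ThresholdModel:
    """Toy model: predicts 1 when the input exceeds a threshold."""

    def __init__(self, version, threshold):
        self.version = version
        self.threshold = threshold

    def predict(self, x):
        return 1 if x > self.threshold else 0


class ModelHolder:
    """Keeps serving the current model until a new one is fully loaded."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def on_model_published(self, payload):
        # Parse and build the new model first, outside the lock, so requests
        # keep being served by the previous version while loading happens.
        spec = json.loads(payload)
        new_model = ThresholdModel(spec["version"], spec["threshold"])
        with self._lock:
            self._model = new_model  # swap the reference in one step

    def predict(self, x):
        with self._lock:
            model = self._model
        return model.version, model.predict(x)


holder = ModelHolder(ThresholdModel("v1", threshold=10))
before = holder.predict(7)                                   # served by v1
holder.on_model_published(json.dumps({"version": "v2", "threshold": 5}))
after = holder.predict(7)                                    # served by v2
```

In this run `before` is `("v1", 0)` and `after` is `("v2", 1)`. If the payload is malformed, `on_model_published` raises before the swap, so the previous version stays in place.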
Testing and Quality in ML
★ Validating Data
★ Validating Component Integration
★ Validating Model Quality
★ Validating Model Fairness and Bias
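Two of the checks listed above, data validation and fairness, can be sketched with plain Python. The column names, the `SCHEMA` dictionary, and the group labels are made up for illustration. A real pipeline would likely use a dedicated validation library, so treat this as a sketch of the idea only.

```python
SCHEMA = {"age": int, "income": float, "segment": str}  # hypothetical columns


def schema_errors(row, schema=SCHEMA):
    """Return a list of human-readable schema violations for one row."""
    errors = []
    for column, expected in schema.items():
        if column not in row:
            errors.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            actual = type(row[column]).__name__
            errors.append(f"{column}: expected {expected.__name__}, got {actual}")
    return errors


def accuracy(pairs):
    """Fraction of (predicted, actual) pairs that match."""
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


def accuracy_by_group(records):
    """records: (group, predicted, actual) tuples -> accuracy per group."""
    groups = {}
    for group, predicted, actual in records:
        groups.setdefault(group, []).append((predicted, actual))
    return {group: accuracy(pairs) for group, pairs in groups.items()}


good_row = {"age": 31, "income": 52000.0, "segment": "a"}
bad_row = {"age": "31", "income": 52000.0}
records = [("a", 1, 1), ("a", 0, 0), ("b", 1, 0), ("b", 1, 1)]
per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
```

A test stage could fail the build when `schema_errors` returns anything, or when `gap` between groups exceeds a limit the team has agreed on.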
Experiment Tracking
★ Because ML modelling is research-centric, Data Scientists conduct new experiments
to analyse data
★ Track experiments following a version-control philosophy
★ Integrate branches of experiments with the model training process
★ DVC and MLflow Tracking can be used
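To make the tracking idea concrete, here is a small stand-in that records each run as one JSON line. It deliberately does not use the DVC or MLflow APIs; `ExperimentLog` and its fields are assumptions for this sketch. The point is only what a tracked run needs to carry: parameters, metrics, and the code and data versions that produced them.

```python
import io
import json


class ExperimentLog:
    """Append-only log of experiment runs, one JSON record per line."""

    def __init__(self, stream):
        self._stream = stream

    def log_run(self, name, params, metrics, code_version, data_version):
        record = {
            "name": name,
            "params": params,
            "metrics": metrics,
            "code_version": code_version,  # e.g. a Git commit hash
            "data_version": data_version,  # e.g. a hash of the dataset
        }
        self._stream.seek(0, io.SEEK_END)  # always append at the end
        self._stream.write(json.dumps(record, sort_keys=True) + "\n")
        return record

    def runs(self):
        self._stream.seek(0)
        return [json.loads(line) for line in self._stream if line.strip()]

    def best(self, metric):
        """Run with the highest value for the given metric."""
        return max(self.runs(), key=lambda run: run["metrics"][metric])


log = ExperimentLog(io.StringIO())  # an in-memory stream keeps the sketch self-contained
log.log_run("run-1", {"depth": 3}, {"accuracy": 0.81}, "abc123", "data-v1")
log.log_run("run-2", {"depth": 5}, {"accuracy": 0.86}, "abc123", "data-v1")
best_run = log.best("accuracy")
```

With the two runs above, `best_run["name"]` is `"run-2"`. Because each record names its code and data versions, the winning experiment can be checked out and retrained later.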
Model Deployment
★ Multiple Models: Publish APIs through which consuming applications get predictions from
different models
★ Shadow Models: When replacing a model in production, deploy the new version side by side with the current one as a shadow
model and send it the same production traffic before promoting it
★ Competing Models: A more complex scenario that runs multiple versions of a model in
production, like an A/B test, with routing rules and enough collected data to make statistically
significant decisions
★ Online Learning Model: A model that makes online, real-time decisions and
continuously improves its performance with the sequential arrival of data
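The shadow and competing-model patterns above can be sketched in a few lines. Everything here is illustrative: the models are plain functions, and `assign_variant` and `predict_with_shadow` are hypothetical helper names. The routing uses a hash of the user id so the same user always sees the same variant.

```python
import hashlib


def assign_variant(user_id, treatment_share=0.5):
    """Deterministically route a user to model 'A' or 'B' (competing models)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "B" if bucket < treatment_share else "A"


def predict_with_shadow(features, production_model, shadow_model, shadow_log):
    """Serve the production prediction; record the shadow one for comparison."""
    result = production_model(features)
    entry = {"features": features, "production": result}
    try:
        entry["shadow"] = shadow_model(features)
    except Exception as error:  # a failing shadow must never affect users
        entry["shadow_error"] = repr(error)
    shadow_log.append(entry)
    return result


def current_model(features):
    return features["amount"] > 100


def candidate_model(features):
    return features["amount"] > 80


shadow_log = []
served = predict_with_shadow({"amount": 90}, current_model, candidate_model, shadow_log)
variant = assign_variant("user-42")
```

In this run `served` is `False` (the production answer), while the log shows that the candidate would have answered `True`. Collecting such disagreements is what informs the decision to promote the shadow model.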
Continuous Delivery Orchestration
★ Model automated and manual ML governance stages into our deployment pipeline, to help detect
model bias, fairness, or to introduce explainability for humans to decide if the model should further
progress towards production or not.
★ Machine Learning Pipeline: to perform model training and evaluation within the GoCD agent, as well
as executing the basic threshold test to decide if the model can be promoted or not. If the model is
good, we perform a dvc push command to publish it as an artifact.
★ Application Deployment Pipeline: to build and test the application code, to fetch the promoted model
from the upstream pipeline using dvc pull, to package a new combined artifact that contains the
model and the application as a Docker image, and to deploy them to a Kubernetes production
cluster.
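The "basic threshold test" that decides promotion could look roughly like this. The metric names and minimum values are assumptions. In the pipeline described above, a passing result would be followed by `dvc push`, which is not invoked here.

```python
def promotion_decision(metrics, thresholds):
    """Compare evaluation metrics with minimum thresholds.

    Returns a dict with a boolean 'promote' flag and the list of failures.
    """
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif value < minimum:
            failures.append(f"{name}: {value} < {minimum}")
    return {"promote": not failures, "failures": failures}


THRESHOLDS = {"accuracy": 0.80, "recall": 0.70}  # hypothetical minimums

good = promotion_decision({"accuracy": 0.86, "recall": 0.75}, THRESHOLDS)
bad = promotion_decision({"accuracy": 0.78}, THRESHOLDS)
```

`good["promote"]` is `True`, while `bad` lists both the low accuracy and the missing recall metric. A pipeline stage would typically exit with a non-zero status when `promote` is `False` so that downstream stages do not run.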
Model Monitoring and Observability
★ Model inputs: what data is being fed to the models, giving visibility into any training-serving skew.
★ Model outputs: what predictions and recommendations the models are making from these inputs, to
understand how the model is performing with real data.
★ Model interpretability outputs: metrics such as model coefficients, ELI5, or LIME outputs that allow
further investigation to understand how the models are making predictions to identify potential
overfit or bias that was not found during training.
★ Model outputs and decisions: what predictions our models are making given the production input
data, and also which decisions are being made with those predictions. Sometimes the application
might choose to ignore the model and make a decision based on predefined rules (or to avoid future
bias).
★ User action and rewards: based on further user action, we can capture reward metrics to
understand if the model is having the desired effect. For example, if we display product
recommendations, we can track when the user decides to purchase the recommended product as a
reward.
★ Model fairness: analysing input data and output predictions against known features that could bias,
such as race, gender, age, income groups, etc.
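As one possible way to watch model inputs for training-serving skew, the sketch below measures how far the serving mean of a feature has moved from the training mean, in units of the training standard deviation. The feature values and the alert limit of 3 are assumptions for illustration. Production systems often use richer distribution tests.

```python
import statistics


def skew_score(training_values, serving_values):
    """Shift of the serving mean, in training standard deviations."""
    mean = statistics.mean(training_values)
    stdev = statistics.pstdev(training_values)
    shift = abs(statistics.mean(serving_values) - mean)
    if stdev == 0:
        # Constant training feature: any shift at all counts as skew.
        return 0.0 if shift == 0 else float("inf")
    return shift / stdev


training = [10, 12, 11, 9, 13, 10, 11, 12]   # hypothetical feature values
similar = [11, 10, 12, 11]
drifted = [20, 22, 21, 19]

ALERT_LIMIT = 3.0                             # assumed alerting threshold
alerts = {
    "similar": skew_score(training, similar) > ALERT_LIMIT,
    "drifted": skew_score(training, drifted) > ALERT_LIMIT,
}
```

Here only the `drifted` batch raises an alert. The same score can be emitted as a metric to the log aggregation and monitoring tools mentioned earlier.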
End to End CD4ML Process
Practical Example:
References :
➢ https://mlflow.org
➢ https://martinfowler.com/articles/cd4ml.html
➢ https://github.com/ThoughtWorksInc/cd4ml-workshop
➢ https://www.slideshare.net/ThoughtWorks/continuous-delivery-for-machine-learning-198815316
➢ https://dvc.org/
➢ https://mleap-docs.combust.ml/getting-started/
