Bodywork - GitOps for Machine Learning

Bodywork – GitOps for Machine Learning
Continuous deployment for data science
and machine learning teams
Alex Ioannides, March 2021

1.Continuous deployment in machine learning
2.Why machine learning projects are hard to deploy
3.What’s Bodywork and how does it help?
4.Arriving at GitOps
5.Case studies and demos
AGENDA

• Co-founder of Bodywork Machine Learning, the creators of Bodywork.
• ML engineer for Oracle AI Apps division, focusing on MLOps.
• Built the ML functions for Perfect Channel and LiveMore Capital.
• 8 years in financial services – Credit Suisse, Standard Bank, Moodys.
• PhD in computational neuroscience from UCL.
• Recovering theoretical physicist (going back a long time now…).
ABOUT ME

I am not a software engineer, and this is not going to be a lecture on
DevOps best-practices for data scientists and ML engineers.
I have, however, worked with some brilliant developers and learnt a lot
from them in the process. This talk will be centered from a ML
engineer’s frame-of-reference – i.e., what are useful best practices for
ML and why.
ABOUT ME

CI / CD
CONTINUOUS DEPLOYMENT IN ML

Continuous Integration

Aside - Writing tests for ML Systems
Example Unit Tests for ML:
• Feature transformation pipelines yield expected results.
• Training routines yield ‘valid’ models for ‘valid’ data.
• Unseen categories and/or outliers are handled gracefully.
Example Integration Tests for ML:
• You can read-from and write-to object storge or your model registry,
feature store, etc.
• Requests to scoring services with REST APIs yield expected responses.
➡️ “Effective Testing for Machine Learning Systems” by Jeremy Jordan

Continuous Deployment
?

Continuous Deployment – push models to cloud object storage

Continuous Deployment – push models to a registry

Continuous Deployment – push container images to a cloud platform

Continuous Deployment – push container images to a Kubernetes cluster

Aside – Docker and Kubernetes are ideal for building ML Systems
Docker:
• Reproducible environments.
• Compose ML pipelines using containers as building-blocks.
Kubernetes:
• Provides all the resources you could want for building a MLOps
platform – e.g., jobs, services and easy networking.
• Resilience and horizontal-scaling are built-in from the bottom-up.
➡️ “Deploying Python ML Models with Flask, Docker and Kubernetes" by Me

Example – serve a pre-trained model via a microservice with a REST API
WHY ML PROJECTS ARE HARD TO DEPLOY

WHY ML PROJECTS ARE HARD TO DEPLOY
Example – serve a pre-trained model via a microservice with a REST API
• Multiple points of failure.
• Requires a more-than-basic understanding of Docker and Linux.
• Needs experience with container orchestration (e.g., Kubernetes).
• Doesn’t scale easily.
• Maintaining Docker images is an extra responsibility for ML engineers.
• This is not Machine Learning - this is DevOps.

WHAT’S BODYWORK AND HOW DOES IT HELP?

Continuous Deployment – from GitHub to Kubernetes with Bodywork

Tackle deployment problems head-on and separately from project codebase:
• Create a generic Linux container with Python and Git installed.
• Use Git to pull project code into the cluster environment and then
dynamically install requirements - removes the need to build, push and
manage container images on a project-by-project basis.
• Each stage runs a Python executable defining a task – either as a
discrete batch job or the deployment of a long-running service.
• Combine multiple stages into workflows, using a workflow-controller.

Manage complex pipelines and service topologies, with high concurrency.

Use existing code with Bodywork’s project format – no new APIs to learn!

Enables CI/CD for ML projects.

ARRIVING AT GITOPS
➡️ www.gitops.tech

ARRIVING AT GITOPS
“The core idea of GitOps is having a Git repository that always contains
declarative descriptions of the infrastructure currently desired in the
production environment and an automated process to make the production
environment match the described state in the repository.
If you want to deploy a new application or update an existing one, you
only need to update the repository - the automated process handles
everything else. It’s like having cruise control for managing your
applications in production.”

(1) Serving a model via a microservice with a REST API
CASE STUDIES AND DEMOS
➡️ https://github.com/bodywork-ml/bodywork-serve-model-project

(2) Train-and-serve ML pipeline
➡️ https://github.com/bodywork-ml/bodywork-ml-pipeline-project

(3) Jupyter pipelines
➡️ https://github.com/bodywork-ml/bodywork-jupyter-pipeline-project

(4) ML Dashboards with Plotly
➡️ https://github.com/bodywork-ml/bodywork-ml-dashboard-project

GET IN TOUCH
github.com/bodywork-ml
bodywork.readthedocs.io/en/latest/
www.bodyworkml.com
@BodyworkML

Bodywork - GitOps for Machine Learning

Bodywork - GitOps for Machine Learning

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Similar a Bodywork - GitOps for Machine Learning

Similar a Bodywork - GitOps for Machine Learning (20)

Último

Último (20)

Bodywork - GitOps for Machine Learning

Notas del editor