This talk was jointly organized by BSPIN and ASQ Bengaluru LMC.
It covered the following topics:
- Industry 4.0
- Agile, CI/CD, DevOps
- DevOps and MLOps
- Evolution of MLOps
- MLOps Capabilities
- AI Platform Pipelines
- Training and Tuning AI Platform
- Case Study
1. Introducing MLOps
Anish Cheriyan, Vimal Das K
Inputs from Jayaraj J
9th April, 2022
Event organized jointly by BSPIN and ASQ Bengaluru LMC
Image: https://medium.com/analytics-vidhya/applications-and-types-of-machine-learning-c177a844bf38
2. Agenda
★ Industry 4.0
★ Agile, CI/CD, DevOps
★ DevOps and MLOps
★ Evolution of MLOps
★ MLOps Capabilities
★ AI Platform Pipelines
★ Training and Tuning AI Platform
★ Case Study
3. BSPIN has been active since 1992 with the support of individuals and organizations in Bangalore.
BSPIN’s mission is to help the Indian software industry achieve breakthroughs in software quality and productivity through active practice, enabled by collaboration, learning, sharing, and innovation at the practitioner level.
BSPIN (Bangalore SPIN) is currently the largest operational SPIN across the globe. More details about BSPIN are available at www.bspin.org
For MEMBERSHIP-
https://bspin.org/?page_id=1480#!/SignUp/Up
membership@bspin.org
4. ● ASQ is a global community of people passionate about quality, who use their tools, ideas, and expertise
to make our world work better. ASQ: The Global Voice of Quality.
● ASQ is a global organization with members in more than 130 countries. Headquartered in Milwaukee,
Wisconsin, we also operate centers in Mexico, India, and China. Our Society consists of member-led
communities that help members connect with other quality professionals and practitioners, advance their
knowledge and careers, and grow as thought leaders.
For MEMBERSHIP-
https://asq.org.in/membership/
5. Self-Driving Car
AI usage for Cancer Detection
https://www.drugtargetreview.com/news/34555/ai-system-detects-cancer-tumours-missed-by-conventional-diagnostics/
Image: https://medium.com/analytics-vidhya/applications-and-types-of-machine-learning-c177a844bf38
6. 2018: Self-driving Uber car that hit and killed woman did not recognize that pedestrians jaywalk
https://www.nbcnews.com/tech/tech-news/self-driving-uber-car-hit-killed-woman-did-not-recognize-n1079281
7. Industry 4.0 and ABCs
Image: https://hrishikeshiyengar.wordpress.com/2021/01/31/components-of-industry-4-0-the-heart-and-soul/
9. What is MLOps?
• An approach, like DevOps, developed in the context of ML engineering
• Unifies ML system development and operations
• Standardized processes and technology capabilities for building, deploying, and operationalizing ML systems rapidly and reliably
13. Challenges of Practical Applications of ML
• Avoiding training-serving skews that are due to inconsistencies in data.
• Handling concerns about model fairness and adversarial attacks.
• Maintaining the veracity of models by continuously retraining.
• Performing ongoing experimentation of new data sources.
• Preparing and maintaining high-quality data for training ML models.
• Tracking models in production to detect performance degradation.
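Training-serving skew, the first challenge listed above, can be illustrated with a simple comparison of feature statistics. The sketch below is not the method from the talk. The function name `detect_skew`, the standard-deviation-based threshold, and the sample values are all assumptions made for this example.

```python
from statistics import mean, stdev

def detect_skew(training_values, serving_values, threshold=3.0):
    """Flag skew when the serving mean drifts far from the training mean.

    Drift is measured in units of the training standard deviation.
    `threshold` is an illustrative assumption, not a recommended value.
    """
    train_mean = mean(training_values)
    train_std = stdev(training_values)
    if train_std == 0:
        # Constant training feature: any change in the serving mean counts as skew.
        return mean(serving_values) != train_mean
    drift = abs(mean(serving_values) - train_mean) / train_std
    return drift > threshold

# Hypothetical values for one numeric feature.
training = [10.0, 12.0, 11.0, 13.0, 9.0]
serving_similar = [11.0, 10.0, 12.0]
serving_shifted = [40.0, 42.0, 39.0]

print(detect_skew(training, serving_similar))
print(detect_skew(training, serving_shifted))
```

A production check would usually compare full distributions and data schemas rather than a single mean, but the idea of comparing serving data against a training baseline is the same.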
14. Benefits of MLOps
• Shorter development cycles, and as a result, shorter time to market.
• Better collaboration between teams.
• Increased reliability, performance, scalability, and security of ML systems.
• Streamlined operational and governance processes.
• Increased return on investment of ML projects.
15. Relationship of Data Engineering, ML Engineering, and Application Engineering
• Data engineering involves ingesting, integrating, curating, and refining data to facilitate a broad spectrum of operational tasks, data analytics tasks, and ML tasks.
• ML models are built and deployed in production using curated data that is usually created by the data engineering team.
19. Experimentation
Lets data scientists and ML researchers collaboratively perform exploratory data analysis (EDA), create prototype model architectures, and implement training routines.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
20. Data Processing
Lets you prepare and transform large amounts of data for ML at scale in ML development, in continuous training pipelines, and in prediction serving.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
21. Model training
Lets you efficiently and cost-effectively run powerful algorithms for training ML models.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
22. Model evaluation
Lets you assess the effectiveness of your model, interactively during experimentation and automatically in production.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
23. Model serving
Lets you deploy and serve your models in production environments.
Key functionalities in model serving include support for near-real-time, low-latency prediction and logging, among others.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
24. Online experimentation
Lets you understand how newly trained models perform in production settings compared to the current models before you release the new model to production.
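A comparison between the current model and a newly trained one can be sketched as a small A/B split. This is a minimal illustration under stated assumptions, not the setup described in the talk: the names `assign_variant` and `compare_variants`, the 10% default share, the hash-based routing, and the sample outcomes are all hypothetical.

```python
import hashlib

def assign_variant(request_id, candidate_share=0.1):
    """Deterministically route a request to the 'candidate' or 'current' model.

    Hashing the request ID keeps the assignment stable across calls.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 1000  # stable bucket in the range 0..999
    return "candidate" if bucket < candidate_share * 1000 else "current"

def compare_variants(outcomes):
    """Return the success rate per variant from (variant, success) pairs."""
    totals, successes = {}, {}
    for variant, success in outcomes:
        totals[variant] = totals.get(variant, 0) + 1
        successes[variant] = successes.get(variant, 0) + int(success)
    return {variant: successes[variant] / totals[variant] for variant in totals}

# Hypothetical logged outcomes: did the prediction turn out to be correct?
outcomes = [
    ("current", True), ("current", False),
    ("candidate", True), ("candidate", True),
]
rates = compare_variants(outcomes)
print(rates)
```

In practice the decision to release the candidate would also need enough traffic per variant for the difference to be statistically meaningful.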
25. Model monitoring
Lets you track the efficiency and effectiveness of the deployed models in production to ensure predictive quality and business continuity.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
26. ML pipelines
Let you instrument, orchestrate, and automate complex ML training and prediction pipelines in production.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
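The idea of orchestrating a pipeline as an ordered chain of steps can be sketched in a few lines. This is an illustrative toy, not a real orchestration tool: `run_pipeline` and the stand-in steps `validate`, `transform`, and `train` are hypothetical names, and the "model" is only the mean of the inputs.

```python
def run_pipeline(steps, data):
    """Run named steps in order, passing each step's output to the next.

    Returns the final output and a log of the step names that ran.
    """
    log = []
    for name, step in steps:
        data = step(data)
        log.append(name)
    return data, log

# Hypothetical stand-in steps; real steps would call data and training services.
def validate(rows):
    # Drop missing records.
    return [row for row in rows if row is not None]

def transform(rows):
    # Convert every record to a float feature.
    return [float(row) for row in rows]

def train(rows):
    # The "model" is just the mean of the inputs, for illustration only.
    return {"mean_prediction": sum(rows) / len(rows)}

steps = [("validate", validate), ("transform", transform), ("train", train)]
model, log = run_pipeline(steps, [1, None, 2, 3])
print(model, log)
```

The returned log is a stand-in for the instrumentation a production pipeline would record for each run.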
27. Model registry
Lets you govern the lifecycle of the ML models in a central repository. This ensures the quality of the production models and enables model discovery.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
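A central registry that tracks model versions and their lifecycle stage might look like the in-memory sketch below. It is a simplified assumption of how such a registry could behave, not the API of any particular product: `ModelRegistry`, `ModelVersion`, the stage names, the model name, and the `hdfs://` URIs are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ModelVersion:
    name: str
    version: int
    artifact_uri: str
    stage: str = "staging"
    metrics: dict = field(default_factory=dict)

class ModelRegistry:
    """In-memory registry that keeps every version of every model."""

    def __init__(self):
        self._versions = {}

    def register(self, name, artifact_uri, metrics=None):
        """Store a new version; version numbers start at 1 per model name."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, artifact_uri,
                             metrics=metrics or {})
        versions.append(entry)
        return entry

    def promote(self, name, version):
        """Move one version to production and archive the previous one."""
        for entry in self._versions[name]:
            if entry.stage == "production":
                entry.stage = "archived"
        target = self._versions[name][version - 1]
        target.stage = "production"
        return target

    def production_model(self, name):
        """Return the version currently in production, or None."""
        for entry in self._versions.get(name, []):
            if entry.stage == "production":
                return entry
        return None

registry = ModelRegistry()
# Hypothetical artifact locations and metric values.
first = registry.register("ticket-router", "hdfs://models/ticket-router/1.zip", {"rmse": 0.42})
second = registry.register("ticket-router", "hdfs://models/ticket-router/2.zip", {"rmse": 0.39})
registry.promote("ticket-router", 1)
registry.promote("ticket-router", 2)
print(registry.production_model("ticket-router"))
```

Keeping archived versions rather than deleting them is what makes rollback and model discovery possible.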
28. Dataset & feature repository
Lets you unify the definition and the storage of the ML data assets. Helps data scientists and ML researchers save time on data preparation and feature engineering.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
29. ML metadata & artifact repository
Enables reproducibility and debugging of complex ML tasks and pipelines. Metadata about ML artifacts, such as descriptive statistics, data schemas, trained models, and evaluation results, is tracked in it.
Image: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
31. Problem Statement
Automatically classify customer support tickets and route them to the right support agent.
Automated customer support routing
Data Selection & Exploration
- Ticket info
- Trip info
- Customer info
Feature engineering
- Ticket message
- Time after trip
- …
Model prototyping & validation
- Learning-to-rank approach
- Retrieval-based pointwise ranking
Training pipeline
- Batch jobs
- Yarn/Mesos cluster
Data pipeline
- Data transformation
- Kafka (pub/sub)
- Samza (stream processing)
- Cassandra (for training)
Model Refresh
- Model tuning
- Model evaluation
- Model validation
Service integration
- Offline mode
- Online mode
- Library mode
CI/CD pipeline
- Dynamic model loading
- Artifacts validation
- Serving validations
Online experimentation
- A/B testing
Monitoring integration
- Kafka
- Kibana (dashboards)
Model monitoring
- RMSLE
- RMSE
- R-squared
Data & feature repository
- HDFS (data)
- Cassandra (feature metadata)
Public Launch
- Gradual rollout
- Online/Batch/Embedded inference
Model repository
- HDFS (zip archive)
- Cassandra (model metadata)
Lifecycle stages shown: ML Development, Training operationalization, Continuous training, Model deployment, Prediction serving, Model monitoring
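The three model monitoring metrics named in the case study (RMSLE, RMSE, and R-squared) have standard definitions, sketched below in plain Python. The function names and the sample values are assumptions for illustration. The slide does not say how the case-study system computes them.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; inputs must be non-negative."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2
               for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed values and model predictions.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(rmse(y_true, y_pred), rmsle(y_true, y_pred), r_squared(y_true, y_pred))
```

RMSLE penalizes relative rather than absolute error, which is why it is often tracked alongside RMSE when target values span several orders of magnitude.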
33. Take Away
Delivering business value through ML is not only about building the best ML model for
the use case at hand, but also about building an integrated ML system that operates
continuously to adapt to changes in the dynamics of the business environment.
Such an ML system involves
❑ Collecting, processing, and managing ML datasets and features;
❑ Training and evaluating models at scale;
❑ Serving the model for predictions;
❑ Monitoring the model performance in production; and
❑ Tracking model metadata and artifacts.
… and MLOps enables building such an ML System.
35. References
• Practitioners guide to MLOps: A framework for continuous delivery and automation of machine learning. Khalid Salama, Jarek Kazmierczak, Donna Schut. Google Cloud White Paper, 2021.
• Engineering MLOps: Rapidly build, test, and manage production-ready machine learning life cycles at scale. Emmanuel Raj.