In this talk, I will discuss various approaches to accelerate deep learning solutions from a notebook or research environment to a production environment, and how these solutions can be transformed into an enterprise-level, end-to-end deep learning solution that any software application can consume as a service, with a practical use-case example.
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to Production
1. Enterprise DL - Accelerating Deep
Learning Solutions to Production
Aditya Bhattacharya
Lead ML Engineer, West Pharmaceuticals
AI Researcher, MUST Research
2. About Me
My Associations:
• Lead ML Engineer, West Pharmaceuticals
• AI Researcher, MUST Research
My Interests: Vision, Text, Speech
- ADITYA BHATTACHARYA
3. Objectives of this discussion
Discussions on accelerating DL solutions from a notebook or research
environment to a production environment
Discussions on making DL solutions scalable and sustainable
5. Topics to be discussed
• Typical Data Science Workflow and impact of deep learning solutions
• Why do we need a scalable solution?
• Importance of Process Pipelines
• Importance of an API Layer and User Interface for a scalable solution
• Deep Learning As A Service
• How to make the solution sustainable?
• Importance of Monitoring Layer and Model Performance Metrics
• Feedback mechanism based on confidence interval
6. Typical Data Science Workflow
1. Business Understanding
2. Data Mining/Collection Process
3. Data Cleaning
4. Exploratory Data Analysis
5. Feature Engineering
6. Predictive Modelling
7. Data Visualization and Model Metrics
7. Impact of deep learning solutions
Why do we go for a DL solution knowing some of its drawbacks?
Why not classical ML approach?
• Classical ML approaches require a lot of research on the dataset and effort for
feature engineering
• When dealing with unstructured data, classical ML techniques require a much
cleaner dataset to reach high accuracy
• Model accuracy with classical ML approaches is usually not good enough and
not comparable with human-level performance
In short,
DL techniques are far more accurate, reliable, and easier to implement,
particularly with unstructured data.
[Figures: Image Generation, Image Classification Flow, Neural Style Transfer, Neural Network]
8. Why do we need a scalable solution?
• Organizations invest heavily in data science, machine learning, and deep
learning research to improve their internal processes, enhance the customer
experience, and improve their existing products and solutions.
• All organizations want to make data- and analytics-driven progress.
• Deep learning and AI solutions will become a basic expectation of all digital
products and services in the near future.
Hence, DL solutions should be moved from the research environment to the production
environment and baked seamlessly into products and services.
10. Process Pipelines
• Data Pipeline –
For better accuracy, all DL models require a continuous flow of high-volume data at high velocity.
So the analytics layer requires a well-established data pipeline for continuous synchronization of
data from the data layer. Also, since the data layer can have multiple data sources (both structured
and unstructured), continuous data flow to the analytics layer can only be achieved using data
pipelines.
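The data-pipeline idea above can be sketched as a simple batching generator that merges records from several sources (structured and unstructured) and streams them to the analytics layer in fixed-size batches. The source contents and batch size here are illustrative assumptions, not part of the original deck.

```python
# Minimal sketch of a data pipeline: merge records from multiple data
# sources and yield fixed-size batches to the analytics layer.

def stream_batches(sources, batch_size=4):
    """Merge records from several data sources and yield fixed-size batches."""
    buffer = []
    for source in sources:
        for record in source:
            buffer.append(record)
            if len(buffer) == batch_size:
                yield buffer
                buffer = []
    if buffer:  # flush any remaining records as a final, smaller batch
        yield buffer

# Two hypothetical sources: one structured (tabular), one unstructured (text)
structured = [{"id": i, "value": i * 10} for i in range(5)]
unstructured = [{"id": i, "text": f"doc-{i}"} for i in range(3)]

batches = list(stream_batches([structured, unstructured], batch_size=4))
```

In a real deployment this role is played by a streaming or ETL framework rather than a generator, but the contract is the same: the analytics layer consumes batches without caring how many sources feed them.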
11. Process Pipelines
• Deployment Pipeline:
The output of the analytics layer in a deep learning solution is usually the predictive model (which is
nothing but a file containing either the learned weights and biases of the trained model or the model
configuration). These trained model “files” should be stored in cloud-based storage so that the
retraining process is not required the next time. This is done through deployment pipelines.
• Application Integration Pipelines:
These are typically the API endpoints that can access the model “files” and generate predictions or results at
run time when called.
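The deployment-pipeline step can be sketched as a save/load round trip for a model artifact. The local directory here stands in for the cloud storage bucket, and the weight values are made up for illustration; a real pipeline would use the framework's own serialization (e.g. saved weights or model configuration files).

```python
# Sketch of a deployment step: persist trained model weights so the
# service can reload them later instead of retraining from scratch.
import json
import pathlib

def save_model_artifact(weights, version, store_dir="model_store"):
    """Write the model weights to a versioned artifact file and return its path."""
    path = pathlib.Path(store_dir)
    path.mkdir(exist_ok=True)
    artifact = path / f"model_{version}.json"
    artifact.write_text(json.dumps(weights))
    return artifact

def load_model_artifact(artifact):
    """Restore the weights from a previously saved artifact."""
    return json.loads(pathlib.Path(artifact).read_text())

weights = {"w": [0.1, 0.2], "b": 0.05}
artifact = save_model_artifact(weights, "v1")
restored = load_model_artifact(artifact)
```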
12. Deep Learning As A Service
DL as a Service will only be possible through API endpoints that any
application can consume
• Importance of exposing model results through API
• The API Layer makes sure that there is no tight coupling between the analytics layer
and the application layer
• The model can be retrained or updated at any time, and the running service in
production will not be affected.
• Importance of a user interface to consume the service
• An AI product is incomplete without a user interface that can tap the API endpoints
and fetch results from the analytics layer.
• The user interface can be a hardware interface, a software interface, or nowadays
even a voice interface!
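The loose coupling described above can be sketched as a framework-agnostic prediction endpoint: the handler reloads the latest model artifact on each request, so the application layer never touches the analytics layer directly, and a retrained model is picked up without redeploying the service. The linear model, the `load_latest_model` helper, and the request/response shapes are all illustrative assumptions.

```python
# Framework-agnostic sketch of the API layer for DL-as-a-Service.
import json

def load_latest_model():
    # In production this would fetch the newest versioned artifact from
    # cloud storage; a hard-coded linear model stands in here.
    return {"w": 2.0, "b": 1.0}

def predict_endpoint(request_body):
    """Handle a JSON request like {"x": 3.0} and return a JSON response."""
    model = load_latest_model()  # re-read storage each call, so an updated
                                 # model is served with no downtime
    x = json.loads(request_body)["x"]
    y = model["w"] * x + model["b"]
    return json.dumps({"prediction": y})

response = predict_endpoint('{"x": 3.0}')
```

The same handler body would sit behind a route in any web framework; the key design choice is that the endpoint depends only on the stored model artifact, not on the training code.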
14. Sustainable Solution
• Monitoring Layer
• Model Performance at Production
Performance Evaluation Metrics
Model Versioning
Confidence Intervals
• Feedback Layer
• Rule Based Actions Triggered based on production metrics
Over-fitting or under-fitting problem
Re-train model with more data
Hyperparameter Tuning
Improvement in Feature Engineering
Cost and resource optimization
Scrap the model and build a new one!
15. Monitoring Layer
Model Performance at Production
• Performance Evaluation Metrics
Accuracy
Precision and Recall
F1-score
AUC–ROC Score
(Which one to consider?)
• Model Versioning – How to keep track of historical model performance?
• Confidence Intervals – Deciding the threshold metric score based on which the feedback loop
functions
• A/B Testing – Statistical comparison between different versions of the model in production
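The evaluation metrics listed above can be computed directly from the counts in a binary classifier's confusion matrix. The counts below are made-up numbers for illustration.

```python
# Sketch of the standard evaluation metrics from confusion-matrix counts.

def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1-score from raw counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of all positive predictions, how many were right
    recall = tp / (tp + fn)             # of all actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 80 TP, 20 FP, 10 FN, 90 TN
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
```

Which metric to consider depends on the problem: on imbalanced data, precision, recall, F1, or AUC are usually more informative than plain accuracy.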
Model Version | Storage Link | AUC Score | Confidence Interval | Deployment Date
CNN_Simple_v1 | www.mycloudstoragelink.com | 0.75 | (-0.1, 0.1) | 01-01-2020
LeNet_5_v1 | www.mycloudstoragelink.com | 0.80 | (-0.05, 0.05) | 01-02-2020
LeNet_5_v2 | www.mycloudstoragelink.com | 0.82 | (-0.05, 0.05) | 01-03-2020
ResNet_v1 | www.mycloudstoragelink.com | 0.95 | (-0.02, 0.02) | 01-04-2020
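The model-versioning table above maps naturally onto a small registry in code: each entry records the version, its AUC, the confidence interval, and the deployment date, and the monitoring layer can query it for the current champion. The values mirror the illustrative table, and the registry structure itself is an assumption (real systems use a model registry service).

```python
# Sketch of a model-versioning registry mirroring the monitoring table.
registry = [
    {"version": "CNN_Simple_v1", "auc": 0.75, "ci": (-0.10, 0.10), "deployed": "01-01-2020"},
    {"version": "LeNet_5_v1",    "auc": 0.80, "ci": (-0.05, 0.05), "deployed": "01-02-2020"},
    {"version": "LeNet_5_v2",    "auc": 0.82, "ci": (-0.05, 0.05), "deployed": "01-03-2020"},
    {"version": "ResNet_v1",     "auc": 0.95, "ci": (-0.02, 0.02), "deployed": "01-04-2020"},
]

def best_model(registry):
    """Return the registry entry with the highest AUC score."""
    return max(registry, key=lambda entry: entry["auc"])

champion = best_model(registry)
```

Keeping every historical version in the registry is what makes A/B tests and rollbacks possible: an older version can be redeployed from its storage link without retraining.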
16. Feedback Layer
Why do we need a feedback loop?
• Whenever the production metric score falls below the confidence
interval, there has to be a feedback mechanism to trigger certain
necessary actions
[Chart: model accuracy in production over time, with the maximum and the within-CI range marked]
The model performance is expected to vary and
even gradually decrease over time
Typical feedback actions to improve robustness of model:
Over-fitting or under-fitting problem
Re-train model with more data
Hyperparameter Tuning
Improvement in Feature Engineering
Cost and resource optimization
Scrap the model and build a new one!
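The feedback mechanism described above can be sketched as a rule-based trigger: when the production metric falls below the lower bound of the confidence interval around the baseline score, a corrective action is flagged. The thresholds and the choice of retraining as the default action are illustrative assumptions; a real feedback layer would select among the actions listed above.

```python
# Minimal sketch of a rule-based feedback trigger driven by the
# confidence interval around the baseline metric score.

def feedback_action(baseline_auc, production_auc, ci):
    """Return a corrective action when performance drifts below the CI."""
    lower_bound = baseline_auc + ci[0]   # ci is (lower_delta, upper_delta)
    if production_auc >= lower_bound:
        return "no_action"
    # Drift detected: retraining with fresh data is used as the default
    # action here; other options include hyperparameter tuning or
    # rebuilding the model entirely.
    return "retrain_with_more_data"

# Baseline AUC 0.95 with CI (-0.02, 0.02): production score 0.90 breaches
# the lower bound of 0.93, so a retrain is triggered.
action = feedback_action(baseline_auc=0.95, production_auc=0.90, ci=(-0.02, 0.02))
```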
17. The complete picture
[Architecture diagram: Data Layer → Analytics Layer → API Layer → Middleware → User Interface Layer,
with the Monitoring Layer and Feedback Layer alongside]
18. • Lead ML Engineer, West Pharmaceuticals
• AI Researcher, MUST Research
- ADITYA BHATTACHARYA
Questions?
- Want to connect over LinkedIn?
- Or email me at: aditya.bhattacharya2016@gmail.com