Continuous Learning Systems:
Building ML systems that learn from
their mistakes
This work was done when the authors were at Freshworks
Anuj Gupta
Head of Machine Learning, Vahan
Saurabh Arora, Satyam Saxena, Navaneethan Santhanam
Agenda
1. Understanding the Problem Statement
● Background
● Metrics that matter
● Observations
2. Solution v1.0
3. Issues
4. Solution v2.0
a. Building Feedback loop
b. Global + local
5. Results
6. Conclusions and Way Forward
Background
● Customer support on social media is now a must for all B2C brands.
● Ex: @AppleSupport, @AmazonHelp, @BofA_Help.
● Twitter and Facebook have launched dedicated features for this.
● Most CRM suites support Customer Service@social.
Metrics that matter
● Owing to the public nature of conversations, brands
care about 2 things:
a. Reply fast
b. Reply well
Both these contribute to how a brand is perceived.
● To measure (a), 2 key metrics are:
a. Average First Response Time (AFRT)
b. Average Response Time (ART)
● Many of our customers (CS teams of brands) had pretty high AFRT/ART.
● Ask: Reduce AFRT/ART.
● Traffic on a brand’s social channel is not just questions or requests. It’s a lot more than that!
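A minimal sketch of how the two metrics above could be computed. The slides do not define them formally, so this assumes AFRT is the mean delay to the first agent reply per conversation and ART is the mean delay over all agent replies. The function names and the `(timestamp, sender)` data layout are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean

def response_times(conversation):
    """Return response delays (in seconds) for one conversation.

    `conversation` is a time-ordered list of (timestamp, sender) tuples,
    where sender is "customer" or "agent". Each agent reply is matched to
    the earliest customer message still waiting for a reply.
    """
    delays = []
    waiting_since = None
    for ts, sender in conversation:
        if sender == "customer" and waiting_since is None:
            waiting_since = ts
        elif sender == "agent" and waiting_since is not None:
            delays.append((ts - waiting_since).total_seconds())
            waiting_since = None
    return delays

def afrt_and_art(conversations):
    """Assumed definitions: AFRT averages first replies, ART averages all replies."""
    per_conv = [response_times(c) for c in conversations]
    answered = [d for d in per_conv if d]  # conversations with at least one reply
    afrt = mean(d[0] for d in answered)
    art = mean(x for d in answered for x in d)
    return afrt, art

t0 = datetime(2024, 1, 1, 9, 0)
conversations = [
    [(t0, "customer"), (t0 + timedelta(minutes=10), "agent"),
     (t0 + timedelta(minutes=20), "customer"), (t0 + timedelta(minutes=50), "agent")],
    [(t0, "customer"), (t0 + timedelta(minutes=20), "agent")],
]
afrt, art = afrt_and_art(conversations)
print(f"AFRT = {afrt / 60:.0f} min, ART = {art / 60:.0f} min")
```

Unanswered conversations are left out of both averages here; a real dashboard might instead count them up to the current time.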
Observations
[Figure: example inbound messages, one marked ✅ Actionable and three marked ❌ Noise/spam]
Observations
● The average number of replies sent per agent per day was relatively low (~12-15), yet the
ART/FRT were pretty high.
● Of the total inbound traffic on support handles, only a fraction of tickets were being replied to,
typically ~5%-40%.
● Between any 2 messages that were responded to, there were a lot of
messages that were not responded to (~3-30).
Most of the time was going into finding
actionable conversations.
Solution v1.0
• Noise filter for CS@social
• Model it as a (binary) classification problem: Actionable vs. Noise/Spam.
• Acquire a good-quality dataset.
• Engineer features – there are some very good indicators.
• Train-test-tune, ~75% accuracy. Deploy.
Issues
● Performance varied across brands.
● While the model worked very well for some brands, for others it did very badly.
● As time* went by, even the models that had performed well started doing badly.
*within a couple of weeks of deployment
• Our data was changing.
Behind the Scene
Non-stationary distributions
A stationary process is time-independent: its averages remain more or less constant.
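A rough way to see what "averages remain more or less constant" means on a label stream: compare the actionable rate across consecutive windows. This is only an illustrative check, not a formal stationarity test; the window size, tolerance and synthetic streams are assumptions.

```python
from statistics import mean

def windowed_means(labels, window):
    """Mean of a 0/1 label stream over consecutive, non-overlapping windows."""
    return [mean(labels[i:i + window])
            for i in range(0, len(labels) - window + 1, window)]

def looks_stationary(labels, window, tolerance):
    """True if every window mean stays within `tolerance` of the first window."""
    means = windowed_means(labels, window)
    return all(abs(m - means[0]) <= tolerance for m in means)

# Synthetic streams: 1 = actionable, 0 = noise.
stable = [1, 0, 0, 0] * 50                         # actionable rate stays at 0.25
drifting = [1, 0, 0, 0] * 25 + [1, 1, 0, 0] * 25   # rate moves from 0.25 to 0.5

print(windowed_means(stable, 40))
print(windowed_means(drifting, 40))
print(looks_stationary(stable, 40, 0.1), looks_stationary(drifting, 40, 0.1))
```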
• The world of CS@social is not just black (noise) and white (actionable).
• It also has a spectrum of grey in between:
a. “Hi”, “Hello”, “Good mornings”
b. “Any new offers today”
c. “The recent ad you launched is very good. Keep it up”
d. Quizzes, engagement posts
• Some brands respond to such traffic, some do not.
• Noise and actionable were merely 2 extremes of this spectrum.
• Definition of noise and actionable was not consistent across various brands.
• Boundary (in the grey region) separating noise from actionable varies from brand to brand.
• A single common classifier for all is doomed to fail!
Behind the Scene
In a Nutshell
• Based on the last few slides, the degradation in model performance shouldn’t come
as a surprise.
• One-model-fits-all is not going to work.
• Non-stationary distributions are not specific to Twitter data. In general, they are
found in other domains as well:
o Monitoring & Anomaly detection (one-class classification) in adversarial setting
o Recommendations (where the user preferences are continuously changing; evolving labels)
o Stock market predictions (concept drift; evolving distributions).
• Build a per-brand model to have brand-specific learning.
• Learn from mistakes: In our system, by looking at which messages are being
replied to and which are not, we know (with a small delay) whether the classification done
by the system is right or wrong.
• The model is not utilizing these signals to improve.
• If feedback is utilized well, the model can:
• With time, adapt to the brand’s definition of noise and actionable.
• Adapt to variations/changes in features.
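The "learn from mistakes" signal above can be sketched as a tiny labeling rule: once enough time has passed, a message that was replied to is treated as actionable and one that was not is treated as noise. The function name and the +1/-1 encoding are hypothetical, and how long to wait before treating a message as "not replied to" is left open, as in the slides.

```python
ACTIONABLE, NOISE = 1, -1

def feedback_from_reply(predicted_label, was_replied):
    """Derive the true label from agent behaviour and flag model mistakes.

    Intended to be called only after the (unspecified) waiting period has
    elapsed, so that `was_replied` is final.
    """
    true_label = ACTIONABLE if was_replied else NOISE
    is_mistake = true_label != predicted_label
    return true_label, is_mistake

# The model said "noise", but an agent replied: this is a mistake to learn from.
label, mistake = feedback_from_reply(NOISE, was_replied=True)
print(label, mistake)
```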
Towards Solution: Exploration
Incorporate feedback
• Frequently retrain your model on the updated data and deploy the same.
o Training, testing, fine-tuning – 45K models.
Compute heavy. Doesn’t scale at all.
o Lose all old learnings.
• Keep learning from feedback: Model adapts to the new incoming data.
What worked for us
Global Model
Batch trained
Large Corpus
No short term updates
Local model
Fast learner
Short term updates
● 2 models - Global + Local
● Global model is common for all
brands
○ Trained on large dataset
● 1 Local model per brand
Local
• Goals
o Improves with feedback.
• Desired properties
o Fast learner (light compute)
▪ Incorporates most feedback successfully
(After a model update, if the same data point is presented, it must correctly predict its class label.)
o Avoid catastrophic forgetting
(After a model update, if the last N data points are presented, it should predict their class labels with higher accuracy.)
Building feedback loop
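[Diagram: a Tweet enters the ML model, which outputs <Tweet, YP>; the true label arrives as <Tweet, YT>; if YT ≠ YP, the pair is fed back to the model.]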
Possible approaches to incorporate feedback
● Mini-batches: works fine if the velocity of feedback data is high (don’t
have to wait long to accumulate a mini-batch of feedback).
Many applications don’t have high velocity.
● Instant feedback, tiny batches: very few data points can skew
the model.
Building feedback loop
• We model a feedback point <Tweet, YT> as a data point presented to the local model
in an online setting.
• Thus, a bunch of feedback points = an incoming data stream.
• We used online learning.
• Online learning:
Data is modeled as a stream.
The model makes a prediction (YP) when presented with a data point (X).
The environment reveals the correct class label (YT).
If YP ≠ YT, update the model with <X, YT>.
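The online protocol above can be sketched as a short loop. The learner here is a plain perceptron over bag-of-words counts, used only as a stand-in so the example runs with the standard library; the class, the `featurize` helper and the toy stream are all hypothetical.

```python
from collections import defaultdict

class Perceptron:
    """A minimal mistake-driven linear learner (stand-in for the local model)."""

    def __init__(self):
        self.w = defaultdict(float)

    def predict(self, x):
        score = sum(self.w[f] * v for f, v in x.items())
        return 1 if score >= 0 else -1

    def update(self, x, y_true):
        # Move the weights toward the correct label.
        for f, v in x.items():
            self.w[f] += y_true * v

def featurize(text):
    """Bag-of-words counts plus a bias feature."""
    x = defaultdict(float)
    x["<bias>"] = 1.0
    for tok in text.lower().split():
        x[tok] += 1.0
    return dict(x)

def run_online(model, stream):
    """Predict, observe the true label, update only on mistakes."""
    mistakes = 0
    for text, y_true in stream:
        x = featurize(text)
        y_pred = model.predict(x)      # model predicts YP for X
        if y_pred != y_true:           # environment reveals YT
            mistakes += 1
            model.update(x, y_true)    # update with <X, YT>
    return mistakes

# Toy stream: +1 = actionable, -1 = noise.
stream = [
    ("my order has not arrived", 1),
    ("good morning", -1),
    ("refund not received for my order", 1),
    ("good morning everyone", -1),
]
model = Perceptron()
mistakes = run_online(model, stream)
print("mistakes:", mistakes)
```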
Online Algorithms
http://scikit-learn.org/stable/auto_examples/linear_model/plot_sgd_comparison.html
Crammer’s PA-II
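A minimal sketch of the PA-II update rule from reference 1 (Crammer et al.), using plain dictionaries as sparse vectors. The function name `pa2_update` and the default `C=1.0` are assumptions for illustration, not the deployed code. Note that PA-II also updates on correctly classified points whose margin is below 1, which is slightly more aggressive than the "update only if YP ≠ YT" rule.

```python
def pa2_update(w, x, y, C=1.0):
    """Apply one PA-II step for a single example and return the step size.

    w: dict feature -> weight (modified in place)
    x: dict feature -> value
    y: true label, +1 or -1
    C: aggressiveness parameter; larger C allows bigger corrective steps
    """
    margin = y * sum(w.get(f, 0.0) * v for f, v in x.items())
    loss = max(0.0, 1.0 - margin)                 # hinge loss
    if loss == 0.0:
        return 0.0                                # passive: margin already >= 1
    sq_norm = sum(v * v for v in x.values())
    tau = loss / (sq_norm + 1.0 / (2.0 * C))      # PA-II step size
    for f, v in x.items():
        w[f] = w.get(f, 0.0) + tau * y * v        # aggressive: move toward y
    return tau

w = {}
x = {"refund": 1.0, "order": 1.0}
tau_first = pa2_update(w, x, y=+1)   # first feedback point
tau_second = pa2_update(w, x, y=+1)  # same point again: smaller correction
print(tau_first, tau_second, w)
```

The `1 / (2C)` term in the denominator is what distinguishes PA-II from the basic PA step: it damps the update, which helps when some feedback is noisy.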
• Dataset – 150K tweets, time sequenced
• Feedback incorporation improves accuracy:
o Trained the model (offline batch mode) on the first 100K data points.
o On the test set (last 50K data points) it gave 75% accuracy (offline batch mode).
o Then ran the model on the test data (50K data points) in online fashion:
The model made a total of 9028 mistakes.
These mistakes were instantaneously fed into the local model as feedback.
This gives an accuracy of ~82% across the test set.
○ We gained ~7% accuracy by incorporating feedback.
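The numbers above can be checked with a few lines of arithmetic. This assumes the ~82% figure is the running accuracy, i.e. each of the 9028 mistakes is counted as an error before the model is updated on it.

```python
test_points = 50_000
mistakes = 9_028
batch_accuracy = 0.75          # offline batch-mode accuracy on the same test set

online_accuracy = 1 - mistakes / test_points
gain = online_accuracy - batch_accuracy

print(f"online accuracy: {online_accuracy:.2%}, gain over batch: {gain:.2%}")
```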
Results of Local: Improving accuracy
[Plot: accuracy of the local model; x-axis: # of test points]
We also tested the local model by feeding it wrong feedback.
Combining global and local
• Scores from the global and local models are combined into a single score, and a
threshold is applied to arrive at a prediction.
• We got an accuracy of ~82%.
[Diagram: Global and Local scores feed into a combined score. Plot x-axis: # of test points]
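The slides do not say how the two scores are combined. One simple possibility, shown purely as an assumption, is a weighted average followed by a threshold; `local_weight`, `threshold` and the function name are hypothetical.

```python
def combined_prediction(global_score, local_score, local_weight=0.5, threshold=0.5):
    """Blend two scores in [0, 1] and threshold the result.

    A weighted average is an assumed combination rule, not the one from the talk.
    """
    combined = (1.0 - local_weight) * global_score + local_weight * local_score
    label = "actionable" if combined >= threshold else "noise"
    return combined, label

# The global model leans actionable, but this brand's local model has learned otherwise.
print(combined_prediction(0.7, 0.2))                      # equal weighting
print(combined_prediction(0.7, 0.2, local_weight=0.25))   # trust the global model more
```

Lowering `local_weight` is one way to limit the influence of a local model that has started to overfit to feedback.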
Pros:
• Improved running accuracy
• Personalization: The notion of spam varies from brand to brand. Some
brands treat ‘Hi’, ‘Hello’ as spam while others treat them as actionable. By
learning from feedback, the model adapts to the notions of the brand.
• Local is lightweight and fast, thus easy to bootstrap, deploy and scale.
Cons:
● Local can overfit to feedback and thus become biased.
● Need to monitor for bias.
● Reset local as and when it becomes biased.
Future Work
• Instead of a single global model, have vertical-specific global models
• Try other online algorithms
• Handle drift
• Not incorporate every feedback? Update only on the most important ones.
References
1. “Online Passive-Aggressive Algorithms” - Crammer et al., JMLR 2006
2. “The learning behind gmail priority inbox” – Aberdeen et al., LCCC: NIPS Workshop 2010
3. “Learning with drift detection” – Gama et al., BSAI 2004
4. “Adaptive regularization of weight vectors” - Crammer et al., ANIPS 2009
5. LIBOL - A Library for Online Learning Algorithms. https://github.com/LIBOL/LIBOL
Thank You
Please feel free to reach out post this talk or on the interwebs.
@anujgupta82
https://www.linkedin.com/in/anujgupta-82/

 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Building ML Systems That Learn From Mistakes

  • 1. Continuous Learning Systems: Building ML systems that learn from their mistakes. Anuj Gupta (Head of Machine Learning, Vahan), Saurabh Arora, Satyam Saxena, Navaneethan Santhanam. This work was done when the authors were at Freshworks.
  • 2. Agenda 1. Understanding the Problem Statement ● Background ● Metrics that matter ● Observations 2. Solution v1.0 3. Issues 4. Solution v2.0 a. Building feedback loop b. Global + local 5. Results 6. Conclusions and Way Forward
  • 3. Background ● Customer support on social media is now a must for all B2C brands. ● Examples: @AppleSupport, @AmazonHelp, @BofA_Help. ● Twitter and Facebook have launched dedicated features for this. ● Most CRM suites support customer service on social (CS@social).
  • 4. Metrics that matter ● Owing to the public nature of conversations, brands care about 2 things: a. Reply fast b. Reply well. Both contribute to how a brand is perceived. ● To measure (a), 2 key metrics are: a. Average First Response Time (AFRT) b. Average Response Time (ART)
  • 5. ● Many of our customers (CS teams of brands) had pretty high AFRT/ART. ● Ask: reduce AFRT/ART. ● Traffic on a brand’s social channel is not just questions or requests. It’s a lot more than that!
  • 7. Observations ● The average number of replies sent per agent per day was relatively low (~12-15), yet ART/FRT were pretty high. ● Of the total inbound traffic on support handles, only a fraction of tickets were being replied to, typically ~5%-40%. ● Between 2 messages that were responded to, there were a lot of messages that were not responded to (~3-30). Most of the time was going into finding actionable conversations.
  • 8. Solution v1.0 • Noise filter for CS@social. • Model it as a (binary) classification problem: actionable vs. noise/spam. • Acquire a good-quality dataset. • Engineer features; there are some very good indicators. • Train-test-tune, ~75% accuracy. Deploy.
  • 9. Issues ● Performance varied across brands. ● While the model worked very well for some brands, for others it did very badly. ● As time went by (within a couple of weeks of deployment), even the models that performed well started doing badly.
  • 10. Behind the Scenes • Our data was changing: non-stationary distributions. • A stationary process is time-independent, i.e. its averages remain more or less constant.
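
The notion of a stationary process on the slide above can be illustrated with a minimal sketch: compare the average of a label stream over consecutive windows. The two streams, the window size, and the helper name `window_means` are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
from statistics import mean


def window_means(values, window):
    """Mean of each consecutive, non-overlapping window of `values`."""
    return [
        mean(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]


# Hypothetical label streams: 1 = actionable, 0 = noise.
# Stationary: the actionable rate stays at 25% throughout.
stationary = [1, 0, 0, 0] * 6
# Drifting: the actionable rate jumps from 25% to 75% halfway through.
drifting = [1, 0, 0, 0] * 3 + [1, 1, 1, 0] * 3

stationary_means = window_means(stationary, 4)
drifting_means = window_means(drifting, 4)

print(stationary_means)  # windowed averages stay constant
print(drifting_means)    # windowed averages shift over time
```

A model trained on the early part of the drifting stream would see an actionable rate that no longer holds later on, which is the kind of change the slide describes.
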
  • 11. Behind the Scenes • The world of CS@social is not just black (noise) and white (actionable). • It also has a spectrum of grey in between: a. “Hi”, “Hello”, “Good morning” b. “Any new offers today” c. “The recent ad you launched is very good. Keep it up” d. Quizzes, engagement posts • Some brands respond to such traffic, some do not. • Noise and actionable were merely the 2 extremes of this spectrum. • The definition of noise and actionable was not consistent across brands. • The boundary (in the grey region) separating noise from actionable varies from brand to brand. • A single common classifier for all is doomed to fail!
  • 12. In a Nutshell • Given the last few slides, the degradation in model performance shouldn’t come as a surprise. • One model fits all is not going to work. • Non-stationary distributions are not specific to Twitter data; they are found in other domains as well: o Monitoring and anomaly detection (one-class classification) in adversarial settings o Recommendations (where user preferences are continuously changing; evolving labels) o Stock market predictions (concept drift; evolving distributions).
  • 13. Towards Solution: Exploration • Build a per-brand model to get brand-specific learning. • Learn from mistakes: in our system, by looking at which messages are replied to and which are not, we know (with a small delay) whether the classification done by the system was right or wrong. • The model is not utilizing these signals to improve. • If feedback is utilized well, the model can: o With time, adapt to the brand’s definition of noise and actionable. o Adapt to variations/changes in features.
  • 14. Incorporate feedback • Frequently retrain your model on the updated data and redeploy it. o Training, testing, fine-tuning – 45K models. Compute heavy. Doesn’t scale at all. o Lose all old learnings. • Keep learning from feedback: the model adapts to the new incoming data.
  • 15. What worked for us ● 2 models: Global + Local ● Global model is common to all brands ○ Batch trained on a large corpus ○ No short-term updates ● 1 Local model per brand ○ Fast learner ○ Short-term updates
  • 16. Local • Goals o Improves with feedback. • Desired properties o Fast learner (light compute) ▪ Incorporates most feedback successfully (after a model update, if the same data point is presented again, it must correctly predict its class label). o Avoids catastrophic forgetting (after a model update, if the last N data points are presented, it should predict their class labels with higher accuracy).
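  • 17. Building feedback loop [Diagram: Tweet → ML model → <Tweet, YP>; the true label arrives later as <Tweet, YT>; if YT ≠ YP, the pair <Tweet, YT> is fed back to the ML model.]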
  • 18. Possible approaches to incorporate feedback ● Mini-batches: work fine if the velocity of feedback data is high (you don’t have to wait long to accumulate a mini-batch of feedback). Many applications don’t have high velocity. ● Instant feedback, tiny batches: very few data points can skew the model.
  • 19. Building feedback loop • We model a feedback point <Tweet, YT> as a data point presented to the local model in an online setting. • Thus, a bunch of feedback = an incoming data stream. • We used online learning. • Online learning: data is modeled as a stream. o The model makes a prediction (YP) when presented with a data point (X). o The environment reveals the correct class label (YT). o If YP ≠ YT, update the model with <X, YT>.
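
The predict / reveal / update-on-mistake loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not say which online algorithm was deployed, so the sketch assumes the basic Passive-Aggressive update from the first reference, and `PassiveAggressive`, `featurize`, `run_online`, and the example tweets are all hypothetical names and data.

```python
def dot(w, x):
    """Sparse dot product between weight dict `w` and feature dict `x`."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


class PassiveAggressive:
    """Minimal binary Passive-Aggressive learner; labels are +1 / -1."""

    def __init__(self):
        self.w = {}

    def score(self, x):
        return dot(self.w, x)

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def update(self, x, y_true):
        # Hinge loss on this point; move the weights just enough
        # to reach a margin of 1 on it.
        loss = max(0.0, 1.0 - y_true * self.score(x))
        norm_sq = sum(v * v for v in x.values())
        if loss > 0.0 and norm_sq > 0.0:
            tau = loss / norm_sq
            for f, v in x.items():
                self.w[f] = self.w.get(f, 0.0) + tau * y_true * v


def featurize(tweet):
    """Bag-of-words counts; a stand-in for the real feature set."""
    feats = {}
    for tok in tweet.lower().split():
        feats[tok] = feats.get(tok, 0.0) + 1.0
    return feats


def run_online(model, stream):
    """Predict each point, then update only when the prediction was wrong."""
    mistakes = 0
    for tweet, y_true in stream:
        x = featurize(tweet)
        if model.predict(x) != y_true:
            mistakes += 1
            model.update(x, y_true)
    return mistakes


# Hypothetical feedback stream: +1 = actionable, -1 = noise.
stream = [
    ("my order has not arrived", 1),
    ("good morning", -1),
    ("order still not arrived", 1),
    ("good morning everyone", -1),
]
model = PassiveAggressive()
mistakes = run_online(model, stream)
print(mistakes)
```

With this update rule, a point that triggered an update is classified with margin 1 immediately afterwards, which matches the "incorporates feedback successfully" property listed for the local model. Whether old points stay correct (catastrophic forgetting) is not guaranteed by this sketch.
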
  • 21. Results of Local • Dataset: 150K tweets, time sequenced. • Feedback incorporation improves accuracy: o Trained the model (offline batch mode) on the first 100K data points. o On the test set (last 50K data points) it gave 75% accuracy (offline batch mode). o Then ran the model on the test data (50K data points) in online fashion. The model made a total of 9028 mistakes. These mistakes were instantaneously fed into the local model as feedback. This gives an accuracy of ~82% across the test set. o We gained ~7 percentage points of accuracy by incorporating feedback.
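
The accuracy figures on this slide follow directly from the reported counts. A short check of the arithmetic, using only the numbers stated above (variable names are illustrative):

```python
test_points = 50_000      # last 50K tweets, used as the test stream
mistakes = 9_028          # mistakes made while running online
batch_accuracy = 0.75     # accuracy of the offline batch model

# Running accuracy when every mistake is fed back as feedback.
online_accuracy = 1 - mistakes / test_points
gain = online_accuracy - batch_accuracy

print(f"{online_accuracy:.1%}")  # 81.9%
print(f"{gain * 100:.1f} points")  # 6.9 points
```
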
  • 22. [Plot: accuracy improving with the # of test points.] We also tested the local model by feeding it wrong feedback.
  • 23. Combining global and local • Scores from the global and local models are combined into a single score, and a threshold is applied to arrive at a prediction. • We got an accuracy of ~82%. [Diagram: Global + Local → combined score]
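
The slide does not say how the two scores are combined. One simple possibility, shown here purely as an assumption, is a weighted average followed by a threshold. The function `combine`, the weight `local_weight`, the threshold value, and the example scores are all hypothetical.

```python
def combine(global_score, local_score, local_weight=0.5, threshold=0.0):
    """Blend global and local scores, then threshold the result.

    Assumed convention: positive scores lean towards "actionable",
    negative scores towards "noise".
    """
    combined = (1.0 - local_weight) * global_score + local_weight * local_score
    label = "actionable" if combined >= threshold else "noise"
    return combined, label


# Hypothetical case: the global model mildly favours "actionable" for a
# greeting, but this brand's local model has learned to treat it as noise.
combined, label = combine(global_score=0.4, local_score=-1.2)
print(label)
```

Raising `local_weight` lets brand-specific feedback dominate sooner, at the cost of making the prediction more sensitive to a local model that has overfit to recent feedback.
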
  • 24. [Plot; x-axis: # of test points]
  • 25. Pros: • Improved running accuracy. • Personalization: the notion of spam varies from brand to brand. Some brands treat ‘Hi’, ‘Hello’ as spam while others treat them as actionable. By learning from feedback, the model adapts to the notions of the brand. • Local is lightweight and fast, thus easy to bootstrap, deploy and scale. Cons: ● Local can overfit to feedback and thus become biased. ● Need to monitor bias. ● Reset local as and when it becomes biased.
  • 26. Future Work • Instead of a single global model, have vertical-specific global models. • Try other online algorithms. • Handle drift. • Not incorporate every feedback? Update only on the most important ones.
  • 27. References 1. “Online Passive-Aggressive Algorithms” – Crammer et al., JMLR 2006 2. “The Learning Behind Gmail Priority Inbox” – Aberdeen et al., LCCC: NIPS Workshop 2010 3. “Learning with Drift Detection” – Gama et al., BSAI 2004 4. “Adaptive Regularization of Weight Vectors” – Crammer et al., ANIPS 2009 5. LIBOL - A Library for Online Learning Algorithms. https://github.com/LIBOL/LIBOL
  • 28. Thank You Please feel free to reach out post this talk or on the interwebs. @anujgupta82 https://www.linkedin.com/in/anujgupta-82/