#exadelML1
Custom ML Models for Each User
A Case Study Based on ML1, an ML-powered Jira Plug-in
by Exadel
AGENDA
1. About Me & Exadel
2. About ML1
3. Multi-User ML Solutions
4. ML Pipeline
5. Monitoring & Improvement
6. Implementation
About Me
Siamion Karasik
An ML engineer at Exadel
The leader of the Exadel Python community and an active member of Exadel AI & DS communities
Interested in NLP, problem-solving, and writing
About Exadel
Exadel is a software engineering company that delivers the digital platforms, products, and applications our clients need to run and grow their businesses.
Exadel at a Glance
● Established in 1998
● ISO 27001 Certified
● 23 Offices in USA, Europe, Asia
● 25+ Solutions
● 20+ Open-source projects
● 1200+ Engineers
Artificial Intelligence
The Exadel AI Practice examines existing products and processes to discover how modern AI/ML solutions can be applied to add value and then brings them to life.
The Origins of ML1
Technical Support at Exadel
Before: Jira ticket ("GIT help: I have an issue with GIT") → a Support Engineer assigns the category (JC_Git_Management) → a Support Engineer assigns the resolver
Now: Jira ticket ("GIT help: I have an issue with GIT") → the Auto-Assignment Plugin assigns the category (JC_Git_Management) and the resolver → Support Engineer
ML1 Demo
About ML1
ML1 is an AI-powered Jira plug-in that predicts field values in issues/tickets
● Predicting Values: Users can select any field to predict with their ML model
● Training Schedule: Training can be set to a schedule for automatically improved accuracy
● Training Report: Users can get up-to-date information on the success of their model training
ML1 at Exadel
Our own Technical Support department uses ML1 to simplify the process of creating and processing Jira tickets. Here are just a few of the benefits that we’ve seen so far:
● Greatly Reduced Assignment Time: ML1 decreased the amount of time necessary to assign a task from 10 minutes to 10 seconds
● Saved Time for Our Employees: With around 10,000 tasks per year, ML1 saved our Technical Support team approximately 500 man-hours
● Saved Money on Labor Costs: Even when the number of technical support tasks increased by 15%, we didn’t have to hire new technical support staff
ML1 is available for free on the Atlassian Marketplace!
One-model-for-all vs. A-model-for-each
One-model-for-all ML Solution
[Diagram: training data from User 1, User 2, and User 3 is pooled; one ML algorithm trains a single ML model that predicts for all users; metrics and feedback data drive a feedback loop that retrains the model.]
Sometimes One-for-All Doesn’t Work
● IoT
● Legal restrictions
● Each client has a custom ML problem, as in the case of ML1
A-model-for-each ML Solution
[Diagram: User 1, User 2, and User 3 each have their own training data, their own ML model trained by the shared ML algorithm, their own metrics, and their own feedback data; each model predicts only for its own user and is retrained through its own feedback loop.]
The ML Story
Choosing the ML Pipeline
Multiclass text classification
42 unbalanced classes and ~2500 samples
Experimented with:
● TfidfVectorizer, Word2Vec, TruncatedSVD
● Linear models (Logistic Regression, SVM)
● Tree-based models (Random Forest, Boosting)
Choosing the ML Pipeline
In the end, this simple pipeline works best on our Jira data:
Jira ticket ("GIT help: I have an issue with GIT") → Concatenate Title + Description → TfidfVectorizer → Logistic Regression → Predicted Category ("GIT Support")
● TfidfVectorizer learns user-specific words
● Logistic Regression does not require many samples
● 70% accuracy
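The pipeline on this slide maps naturally onto scikit-learn, the library that provides TfidfVectorizer. The block below is a minimal sketch under stated assumptions, not ML1's actual code: the sample tickets, the "VPN Support" category, the helper names `ticket_text` and `build_pipeline`, and the `max_iter` setting are all illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def ticket_text(title: str, description: str) -> str:
    """Concatenate Title + Description into one string, as on the slide."""
    return f"{title} {description}"


def build_pipeline():
    """TfidfVectorizer followed by Logistic Regression (hyperparameters are assumptions)."""
    return make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))


# Hypothetical training tickets: (title, description, category).
tickets = [
    ("Git push fails", "Cannot push to the git repository", "GIT Support"),
    ("Git clone error", "Git clone fails with an error", "GIT Support"),
    ("Git access", "Need access to a git project", "GIT Support"),
    ("Git merge", "Need help with a git merge conflict", "GIT Support"),
    ("VPN drops", "VPN connection drops every hour", "VPN Support"),
    ("VPN login", "Cannot connect to VPN from home", "VPN Support"),
    ("VPN client", "VPN client install request", "VPN Support"),
    ("VPN laptop", "New laptop requires VPN setup", "VPN Support"),
]

texts = [ticket_text(title, description) for title, description, _ in tickets]
labels = [category for _, _, category in tickets]

model = build_pipeline().fit(texts, labels)
predicted = model.predict([ticket_text("GIT help", "I have an issue with GIT")])
print(predicted[0])
```

Because the vectorizer is fitted inside the pipeline, each user's model learns its own vocabulary from that user's tickets, which is the "user-specific words" point above.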
Training with Unknown Data
● With ML1, training data is provided by users at runtime
● We do not have control over the training data set size and quality
● So the question is: will our pipeline work for others?
Walking in Someone Else’s Shoes
We tried another data set and experimented (GitHub)
● How much extra accuracy will we get with every 1k samples?
● How many features should we select?
Quantifying the “Shortage of Data”
4-fold validation (k=4)
[Diagram: the data set (0% to 100%) is split into four 25% slices; in each of Fold 1 to Fold 4 a different slice is the testing set and the remaining 75% is the training set.]
high std (cross-validation scores) ⇒ shortage of data
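The idea of reading a shortage of data from the spread of cross-validation scores can be sketched with the standard library alone. The block below is illustrative: the function names, the contiguous fold layout, and the 0.05 cut-off for a "high" standard deviation are assumptions, not values from the talk.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

STD_THRESHOLD = 0.05  # hypothetical cut-off for a "high" std; tune per use case


def kfold_indices(n_samples: int, k: int = 4) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; each contiguous fold is the test set once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        splits.append((train, test))
        start = stop
    return splits


def score_spread(scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and (population) standard deviation of the per-fold scores."""
    return mean(scores), pstdev(scores)


def looks_data_starved(scores: Sequence[float], threshold: float = STD_THRESHOLD) -> bool:
    """Flag a model whose per-fold scores vary more than the threshold."""
    return pstdev(scores) > threshold


# Hypothetical per-fold accuracies for two models.
stable_scores = [0.70, 0.71, 0.69, 0.70]
noisy_scores = [0.55, 0.80, 0.62, 0.74]
print(score_spread(stable_scores), looks_data_starved(stable_scores))
print(score_spread(noisy_scores), looks_data_starved(noisy_scores))
```

In practice the per-fold scores would come from training and evaluating the pipeline on each split; only the spread check is shown here.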
Data Representation Score
What if there are many small classes?
● Rule of thumb in ML: there should be at least K samples per class
● representation_score = sum(k for k in Counter(y).values() if k >= K) / len(y)
● We can’t ensure a high-quality model if representation_score is low
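Class   Samples   At least K = 20 samples?
C1      40        yes
C2      30        yes
C3      10        no
C4      10        no
C5      10        no
K = 20: 70 samples in well-represented classes, 30 samples in small classes ⇒ representation_score = 70%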
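The representation_score formula from this slide can be wrapped in a small function and checked against the slide's own example (classes C1 to C5 with 40, 30, 10, 10, and 10 samples, K = 20). The function wrapper and the empty-input check are additions for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def representation_score(y: Sequence[Hashable], K: int) -> float:
    """Fraction of samples that belong to classes with at least K samples."""
    if len(y) == 0:
        raise ValueError("y must contain at least one label")
    return sum(k for k in Counter(y).values() if k >= K) / len(y)


# The example from the slide: 40 + 30 samples are in classes with >= 20 samples.
y = ["C1"] * 40 + ["C2"] * 30 + ["C3"] * 10 + ["C4"] * 10 + ["C5"] * 10
print(f"{representation_score(y, K=20):.0%}")  # prints "70%"
```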
Monitoring & Improvement
Monitoring & Improvement Questions
How do we monitor a multi-model ML solution?
● Accuracy
● Data drift
● Explainability
How do we improve the system overall?
● AutoML?
● Federated Learning?
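For the accuracy item in the monitoring list, one possible approach is to keep a rolling accuracy per model, computed from feedback data. The sketch below is an assumption about how this could be done, not a description of ML1: the class name, the window size, the threshold, and the idea that "actual" is the value the user finally kept are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class FeedbackMonitor:
    """Rolling accuracy per model, computed from the most recent feedback events."""

    def __init__(self, window: int = 100) -> None:
        self._window = window
        self._hits: Dict[str, Deque[bool]] = {}

    def record(self, model_id: str, predicted: str, actual: str) -> None:
        """Store whether the prediction matched the value the user kept."""
        hits = self._hits.setdefault(model_id, deque(maxlen=self._window))
        hits.append(predicted == actual)

    def accuracy(self, model_id: str) -> Optional[float]:
        """Accuracy over the window, or None if the model has no feedback yet."""
        hits = self._hits.get(model_id)
        if not hits:
            return None
        return sum(hits) / len(hits)

    def degraded(self, threshold: float) -> List[str]:
        """Model ids whose rolling accuracy is below the threshold."""
        return sorted(
            model_id
            for model_id, hits in self._hits.items()
            if sum(hits) / len(hits) < threshold
        )


monitor = FeedbackMonitor(window=50)
monitor.record("user-1", predicted="GIT Support", actual="GIT Support")
monitor.record("user-1", predicted="GIT Support", actual="VPN Support")
print(monitor.accuracy("user-1"), monitor.degraded(threshold=0.7))
```

Keeping the statistics keyed by model id means a drop in one user's model does not hide behind a healthy global average.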
How ML1 Works
How Does ML1 Work?
ML1 uses the historical data from any set of permissions to automatically predict the value of any field
ML1’s Server Under the Hood
A single Docker container
Solves multiclass and multilabel text classification
Accepts training data right from the client
Trains & serves a separate ML model for every target
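ML1’s Multimodel Server
[Diagram: Jira calls the ML Server with POST /train?modelId and POST /predict?modelId. Training: the training data goes to Training Data Storage, the ML algorithm trains an ML model, the model is saved to Model File Storage, and the run is recorded in the Training Info DB. Prediction: the ML model is read from Model File Storage (with a cache in front), applied to the input data, and the predictions are returned to Jira.]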
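The train and predict flow with a model store and a cache can be sketched in a few lines. This is a hypothetical, in-process illustration rather than ML1's server: `MultiModelServer`, its method names, the JSON file layout, and the cache size are assumptions, and `MajorityClassifier` is a toy stand-in for the real ML algorithm. A real service would also validate `model_id` before using it in a file name.

```python
import json
import tempfile
from collections import Counter, OrderedDict
from pathlib import Path
from typing import List, Sequence


class MajorityClassifier:
    """Toy stand-in for the ML algorithm: always predicts the most common label."""

    def __init__(self, label: str = "") -> None:
        self.label = label

    def fit(self, texts: Sequence[str], labels: Sequence[str]) -> "MajorityClassifier":
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts: Sequence[str]) -> List[str]:
        return [self.label for _ in texts]


class MultiModelServer:
    """One model per model_id: saved to a file on train, read through an LRU cache on predict."""

    def __init__(self, model_dir: Path, cache_size: int = 8) -> None:
        self._dir = Path(model_dir)
        self._dir.mkdir(parents=True, exist_ok=True)
        self._cache: "OrderedDict[str, MajorityClassifier]" = OrderedDict()
        self._cache_size = cache_size

    def _path(self, model_id: str) -> Path:
        return self._dir / f"{model_id}.json"

    def _remember(self, model_id: str, model: MajorityClassifier) -> None:
        # Mark the model as most recently used and evict the oldest if needed.
        self._cache[model_id] = model
        self._cache.move_to_end(model_id)
        while len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)

    def train(self, model_id: str, texts: Sequence[str], labels: Sequence[str]) -> None:
        model = MajorityClassifier().fit(texts, labels)
        self._path(model_id).write_text(json.dumps({"label": model.label}))
        self._remember(model_id, model)

    def predict(self, model_id: str, texts: Sequence[str]) -> List[str]:
        model = self._cache.get(model_id)
        if model is None:  # cache miss: read the model file
            path = self._path(model_id)
            if not path.exists():
                raise KeyError(f"no trained model for {model_id!r}")
            model = MajorityClassifier(json.loads(path.read_text())["label"])
        self._remember(model_id, model)
        return model.predict(texts)


with tempfile.TemporaryDirectory() as tmp:
    server = MultiModelServer(Path(tmp), cache_size=1)
    server.train("project-a", ["t1", "t2", "t3"], ["GIT", "GIT", "VPN"])
    server.train("project-b", ["t1", "t2", "t3"], ["VPN", "VPN", "GIT"])
    print(server.predict("project-a", ["new ticket"]))
```

The cache keeps recently used models in memory so that repeated predictions for the same model id do not touch storage.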
Multimodel ML Server from Scratch
An article with code examples to help you write a multimodel server using software engineering best practices:
Questions?
THANK YOU!
Want to know more about ML at Exadel? Connect to our Zoom session in 5 minutes:
https://tinyurl.com/ExadelML
CONTACT US Siamion Karasik - ML Engineer - skarasik@exadel.com
APPENDIX
How Do You Use ML1?
Step 1: Install the ML1 Plugin for Jira in your organization
Step 2: Install the ML Server
Step 3: Enable field prediction in project settings and set configurations
Step 4: Train your model
Step 5: Autocomplete the selected Jira field
In just five simple steps, Jira administrators can have ML1 up and running
