AI use cases

machine
learning
Machine Learning Use Cases
(Overview)
Aman Jain
Sr. Analyst – Data Science
Infosys | Data & Analytics

Agenda
• Basic framework for approaching ML projects (5 mins)
• Model deployment and serving options (5 mins)
• Discussion on the following ML use cases (30 mins):
    • Text classification
    • Text similarity
    • Named entity recognition
    • Text summarization
    • Topic modeling
    • Scene text recognition
    • Chatbot
    • Image classification
    • Image similarity
    • Object detection
    • Image segmentation
    • Pose estimation
    • Facial analytics
    • Object tracking
    • Action recognition
• Next steps (5 mins)
• Q&A (10 mins)

Basic framework for approaching ML projects
Step 1: Problem
Understand the business and current process (AS-IS process). Understand the
problem/objective and define the problem statement. Identify the key performance
indicators. Etc.
Step 2: Plan
Design the solution framework. Propose a solution. AS-IS vs. TO-BE process. Create the
project plan with a timeline. Define the scope and initial set of hypotheses. Etc.
Step 3: Data
Acquire the data. Perform exploratory analysis. Add data version control. Preprocess and
prepare the data for modeling. Etc.
Step 4: Model
Design/select the model architecture. Model training. Optimization and tuning. Model
validation. DEV and UAT testing. Etc.
Step 5: Solution
Containerization. Model serving via API and UI. Model pruning and quantization
for serving at the edge. Technical and business-level documentation. Knowledge transfer
via recorded sessions. Etc.

ML Ops (snapshot)
Stage | Prototype | Production
Training | Scikit-learn, PyTorch, TensorFlow, Teachable Machine | Scikit-learn, PyTorch, TensorFlow, Amazon SageMaker, Databricks
Serving | Gradio, Streamlit, H2O Wave | Flask API, FastAPI, Plotly Dash, Tableau, PowerBI
Containerization | Colab | Docker
Hosting on Cloud | Colab, Heroku | AWS Elastic Beanstalk, Kubernetes
Hosting at Edge | ml5.js (browser) | TF.js (browser), ml5.js (browser), Jetson Nano, TFLite (Android mobile)
Reproducible Pipelines | MLflow | DVC, Kubeflow, MLflow
Experiment Management | TensorBoard, MLflow | W&B, TensorBoard, MLflow
CI/CD | GitHub Actions | GitHub Actions, CircleCI, Jenkins
Explainability | LIME, SHAP, TF-explain, What-If | Fiddler, Arthur
Feature Stores | Tecton | Tecton, Hopsworks
Data Collection | Snowflake | Delta lakes, Snowflake, Fivetran

Image Classification - Intro
Image classification model
analyzes an image and
identifies the ‘class’ the image
falls under. (Or a probability of
the image being part of a
‘class’). A class is essentially a
label, for instance, 'car',
'animal', 'building', and so on.
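As a rough illustration of "a probability of the image being part of a class", the sketch below turns raw model scores into class probabilities with a softmax and picks the most likely label. The scores and the `classify` helper are made up for illustration; a real classifier (for example one from TorchVision or TFHub) would produce the scores.

```python
import math


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, class_names):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical scores for one image; the class names come from the slide figure.
CLASS_NAMES = ["T-shirt", "Skirt", "Cap"]
label, prob = classify([2.0, 0.5, -1.0], CLASS_NAMES)
print(f"{label}: {prob:.2f}")
```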
Applications
Automated Image
Organization, Backbone for
advanced tasks like object
detection, pose estimation,
action recognition etc.
Tools
TorchVision, TFHub
[Figure: Image Classifier; example output labels: T-shirt, Skirt, Cap]

Image Classification - Methods
ResNet
Deep Residual Learning for Image Recognition. CVPR, 2016.
A very popular model that is often used as a backbone CNN to extract visual representations. It
achieves a Top 1 accuracy of 76.1 on ImageNet (1000 categories).
MobileNet
Searching for MobileNetV3. ICCV, 2019.
A lean mobile network that achieves an accuracy of 76.0 on ImageNet (1000 categories).
EfficientNet
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML, 2019.
It achieves a Top 1 accuracy of 81.3 on ImageNet (1000 categories).
BiT
Big Transfer (BiT): General Visual Representation Learning. arXiv, 2020.
It achieves a Top 1 accuracy of 85.4 on ImageNet (1000 categories).
ViT
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR, 2021.
It showed that the reliance on CNNs is not necessary and a pure transformer applied directly to
sequences of image patches can perform very well on image classification tasks. This model achieves
a Top 1 accuracy of 87.8 on ImageNet (1000 categories).

Image Classification – Use Cases
Flower Classification
Gradio App available. Check out this notion.
Traffic Sign Classification
Train a 43-class image classifier from scratch in Keras. This is available as Streamlit App. A tutorial video is also
available here on the notion.
STL-10 Object Classification
Fine-tune a 10-class classifier in PyTorch. Check out the notion here.
Plant Disease Classification
Available as a Streamlit App
Brain Tumor Classification
Available as a Streamlit App
TorchVision Pre-trained Classifiers
PyTorch TorchVision provides more than 10 pre-trained image classification models, which can be easily fine-tuned
on a custom image dataset. Here I experimented with VGG11, AlexNet, ResNet18, and MobileNetV2.
EfficientNet Fine-tuning
Fine-tune EfficientNet in TF Keras to build a dog classifier. There are 120 classes of dogs. The data is available in
TensorFlow datasets. The notion is available here.
BiT Fine-tuning
Fine-tune the Big Transfer (BiT) few-shot model. This model is available in TFHub. Check out the Colab.

Image Similarity - Intro
Image similarity is the
measure of how similar two
images are. In other words,
it quantifies the degree of
similarity between intensity
patterns in two images.
Applications
Duplicate product
detection, image clustering,
visual search, product
recommendations
Tools
TFHub
[Figure: Image Similarity Finder; a query image is compared against a database and candidate images are returned with match scores (96%, 92%, 87%, 42%)]

Image Similarity - Methods
DeepRank
DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval. arXiv, 2017.
ConvNets
Pre-trained models like MobileNet, EfficientNet, BiT-L/BiT-M can be used to convert images into
vectors. These models can be found on TFHub. For more accuracy, fine-tuning can be done.
FAISS
Billion-scale similarity search with GPUs. arXiv, 2017.
FAISS is a library for efficient similarity search and clustering of dense vectors.
Siamese Network
A Siamese network is a neural network that contains two or more identical subnetworks. The purpose
of this network is to measure the similarity of, or compare the relationship between, two comparable
things. Unlike the classification task that uses cross-entropy as the loss function, the Siamese
network usually uses contrastive loss or triplet loss.
Similarity Measures
L1 (Manhattan distance), L2 (Euclidean distance), Hinge Loss for Triplets.
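The measures listed above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming embeddings are plain lists of floats; the function names, the default `margin` of 1.0, and the 0.5 scaling in the contrastive loss are choices made here, not values taken from the projects in this deck.

```python
import math


def l1_distance(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    gap = l2_distance(anchor, positive) - l2_distance(anchor, negative)
    return max(0.0, gap + margin)


def contrastive_loss(a, b, same, margin=1.0):
    """Pull matching pairs together; push non-matching pairs beyond `margin`."""
    d = l2_distance(a, b)
    return 0.5 * d ** 2 if same else 0.5 * max(0.0, margin - d) ** 2


# Made-up 2-D embeddings, only to exercise the functions.
anchor, positive, negative = [1.0, 2.0], [1.0, 3.0], [4.0, 6.0]
print(l1_distance(anchor, negative), l2_distance(anchor, negative))
print(triplet_hinge_loss(anchor, positive, negative))
```

In a Siamese setup, the subnetworks would produce the embeddings and one of these losses would be minimized during training.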

Image Similarity – Use Cases
Multi-endpoint API Similarity System
The task was to build an API that supports multiple endpoints. Each endpoint supports a separate similarity
system. We built 2 endpoints: endpoint 1 finds the top-K most similar fashion images and endpoint 2 finds the
top-K most similar food images. Check out the notion here.
Beanstalk Image Similarity System
There are 2 endpoints in the API - one for training and the other for inference. During training, the system
receives a zipped file of images. At inference time, the trained system receives an image over the inference
endpoint and sends back the top-K most similar images with a confidence score. The API was deployed on AWS
Elastic Beanstalk. Check out the notion here.
Image + Text Similarity
Using the textual details and images of products, find the matching product across different groups. Around 35
GB of retail product images were scraped and used to build the system. Check out the notion here.
Siamese Network Image Similarity on MNIST
Siamese networks are powerful networks, responsible for significant improvements in face recognition,
signature verification, and prescription pill identification applications. The objective was to build image pairs for the
Siamese network, train the Siamese network with TF Keras, and then compare image similarity with this Siamese
network.
Visual Recommendation
Use image similarity to recommend visually similar products to users based on what they searched. Check out the
notion here.
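Several of the systems above return the top-K most similar images with a score. A minimal brute-force sketch of that lookup is shown below, assuming each image has already been converted to an embedding vector by a pre-trained model. The tiny `DATABASE`, its item names, and the use of cosine similarity as the score are illustrative assumptions; at larger scale, a library such as FAISS would replace the linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query, database, k=3):
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-D embeddings; real ones would have hundreds of dimensions.
DATABASE = {
    "shirt_a": [0.9, 0.1, 0.0],
    "shirt_b": [0.8, 0.2, 0.1],
    "shoe_a": [0.0, 0.1, 0.9],
    "bag_a": [0.1, 0.9, 0.2],
}
results = top_k_similar([1.0, 0.1, 0.0], DATABASE, k=2)
for name, score in results:
    print(f"{name}: {score:.0%}")
```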

Object Detection - Intro
Object detection is a computer vision
technique that allows us to identify
and locate objects in an image or
video.
Applications
Crowd counting, Self-driving cars,
Video surveillance, Face detection,
Anomaly detection
Scope
Detect objects in images and videos,
2-dimensional bounding boxes, Real-
time
Tools
Detectron2, TF Object Detection API,
OpenCV, TFHub, TorchVision
[Figure: Object Detector; example detections labeled Bench, Flower, Lamp]

Object Detection - Methods
Faster R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv, 2016.
SSD (Single Shot Detector)
SSD: Single Shot MultiBox Detector. ECCV, 2016.
YOLO (You Only Look Once)
YOLOv3: An Incremental Improvement. arXiv, 2018.
EfficientDet
EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020.
It achieved 55.1 AP on COCO test-dev with 77M parameters.
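Detection metrics such as COCO AP are built on the intersection over union (IoU) between a predicted and a ground-truth box: a prediction counts as a match only when its IoU exceeds a threshold. The sketch below computes IoU for two axis-aligned boxes, assuming the `(x_min, y_min, x_max, y_max)` corner format; the example coordinates are made up.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])

    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
ground_truth = (5, 5, 15, 15)
print(f"IoU = {box_iou(predicted, ground_truth):.3f}")
```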

Object Detection – Use Cases
Automatic License Plate Recognition
Recognition of vehicle license plate numbers using various methods, including the YOLO4 object detector
and Tesseract OCR. Check out the notion here.
Object Detection App
This is available as a Streamlit app. It detects common objects. 3 models are available for this task -
Caffe MobileNet-SSD, Darknet YOLO3-tiny, and Darknet YOLO3. Along with common objects, this app
also detects human faces and fire. Check out the notion here.
Logo Detector
Build a REST API to detect logos in images. The API will receive 2 zip files - 1) a set of images in which we
must find the logo and 2) an image of the logo. Deployed the model in AWS Elastic Beanstalk.
Check out the notion here.
TF Object Detection API Experiments
The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that
makes it easy to construct, train, and deploy object detection models. We ran inference on pre-trained
models, few-shot training on a single class, few-shot training on multiple classes, and
conversion to a TFLite model. Check out the notion here.
Pre-trained Inference Experiments
Inference on 6 pre-trained models - Inception-ResNet (TFHub), SSD-MobileNet (TFHub), PyTorch
YOLO3, PyTorch SSD, PyTorch Mask R-CNN, and EfficientDet. Check out the notion here and here.

Object Detection – Use Cases II
Object Detection App
TorchVision Mask R-CNN model Gradio App. Check out the notion here.
Real-time Object Detector in OpenCV
Build a model to detect common objects like scissors, cups, bottles, etc. using the MobileNet SSD
model in the OpenCV toolkit. It will take input from the camera and detect objects in real-time.
Check out the notion here. Available as a Streamlit app also (this app is not real-time).
EfficientDet Fine-tuning
Fine-tune EfficientDet model on new classes. Check out the notion here.
YOLO4 Fine-tuning
Fine-tune YOLO4 model on new classes. Check out the notion here.
Detectron2 Fine-tuning
Fine-tune Detectron2 Mask R-CNN (with PointRend) model on new classes. Check out the notion
here.

Image Segmentation - Intro
Image segmentation is the
task of assigning labels to
each pixel of an image.
Applications
Medical imaging, self-driving
cars, satellite imaging
Scope
Segment objects in images
and videos, 2-dimensional
pixel masks, Real-time,
Semantic and Instance masks
Tools
Detectron2, TFHub,
TorchVision, DeepLab
[Figure: Segmentation System; example segmented regions labeled Bench, Flower, Lamp]

Image Segmentation - Methods
U-Net
U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv, 2015.
It was originally designed to perform medical image segmentation, but it works well on a wide
variety of tasks, from segmenting cells on microscope images to detecting ships or houses on photos
taken from satellites.
Mask R-CNN
Mask R-CNN. arXiv, 2017.
The Mask R-CNN framework is built on top of Faster R-CNN. So, for a given image, Mask R-CNN,
in addition to the class label and bounding box coordinates for each object, will also return the object
mask.
DeepLabV3+
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. arXiv, 2018.
It achieves a mean IOU score of 89% on the PASCAL VOC 2012 dataset.
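Mean IoU, the metric quoted above, averages the per-class intersection over union between the predicted and the ground-truth label masks. A minimal sketch follows, assuming masks are nested lists of integer class ids; the tiny 2x3 masks are made up, and skipping classes absent from both masks is one common convention rather than the only one.

```python
def mean_iou(predicted, target, num_classes):
    """Average per-class IoU over classes that appear in either mask."""
    ious = []
    for class_id in range(num_classes):
        intersection = union = 0
        for pred_row, target_row in zip(predicted, target):
            for pred_px, target_px in zip(pred_row, target_row):
                in_pred = pred_px == class_id
                in_target = target_px == class_id
                if in_pred and in_target:
                    intersection += 1
                if in_pred or in_target:
                    union += 1
        if union:  # skip classes absent from both masks
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# 0 = background, 1 = object; one pixel is mislabeled in the prediction.
predicted_mask = [[0, 0, 1],
                  [0, 1, 1]]
target_mask = [[0, 0, 1],
               [0, 0, 1]]
print(f"mIoU = {mean_iou(predicted_mask, target_mask, num_classes=2):.3f}")
```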

Image Segmentation – Use Cases
Satellite Image Segmentation for Agricultural Fields
An image with 1800 x 1135 resolution and 60 channels. Five-band images of the agricultural land were captured every
month for 12 months. There are 8 types of cropland. The task is to classify all unlabeled pixels into one of these 8
categories. A U-Net model was trained from scratch on patches. Check out this notion.
Detectron2 Fine-tuning
Fine-tune Detectron2 Mask R-CNN (with PointRend) model on new classes. It supports semantic, instance, and
panoptic segmentation. We fine-tuned on balloons, chipsets, and faces. Check out this notion.
Industrial Use Cases for Image Segmentation
Experimented with 3 industrial use cases - Carvana Vehicle Image Masking, Airbus Ship Detection, and Severstal
Steel Defect Detection. Check out this notion.
Real-time segmentation on Videos
Real-time tracking and segmentation with SiamMask, semantic segmentation with LightNet++ and instance
segmentation with YOLACT. Check out this notion.
Image Segmentation Exercises
Thresholding with Otsu and Ridler–Calvard, Image segmentation with self-organizing maps, RandomWalk
segmentation with Scikit-image, Skin color segmentation with the GMM–EM algorithm, Medical image
segmentation, Deep semantic segmentation, Deep instance segmentation. Check out this notion.
TorchVision Inference Experiments
FCN-ResNet and DeepLabV3 (both are available in TorchVision library) inference. Available as a Streamlit app.
Check out this notion.

Pose Estimation - Intro
Pose estimation is a computer vision task
that infers the pose of a person or object
in an image or video. This is typically
done by identifying, locating, and
tracking a number of key points on a
given object or person. For objects, this
could be corners or other significant
features. And for humans, these key
points represent major joints like an
elbow or knee.
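Once key points are available, downstream features are often simple geometry on them. As a hedged illustration (not taken from any of the projects in this deck), the sketch below computes the angle at a joint from three 2D key points, the kind of value a pose corrector might threshold. The shoulder, elbow, and wrist coordinates are made up.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at key point b, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm1 = math.hypot(*v1)
    norm2 = math.hypot(*v2)
    if norm1 == 0 or norm2 == 0:
        raise ValueError("key points must not coincide with the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (norm1 * norm2)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
    return math.degrees(math.acos(cosine))


# Hypothetical (x, y) pixel coordinates from a pose model.
shoulder, elbow, wrist = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
print(f"elbow angle: {joint_angle(shoulder, elbow, wrist):.1f} degrees")
```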
Applications
Activity recognition, motion capture, fall
detection, plank pose corrector, yoga
pose identifier, body ratio estimation
Scope
2D skeleton map (3D mapping coming
soon), Human Poses (product poses
coming soon), Single and Multi-pose,
Real-time
Tools
TensorFlow PoseNet API
[Figure: Pose Classifier; example output poses: Running, Standing, Sitting, Fall]

Pose Estimation - Methods
OpenPose
OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. arXiv, 2016.
A standard bottom-up model that supports real-time multi-person 2D pose estimation. The authors
of the paper have shared two models – one is trained on the Multi-Person Dataset (MPII) and the
other is trained on the COCO dataset. The COCO model produces 18 points, while the MPII model
outputs 15 points.
PoseNet
PoseNet is a machine learning model that allows for real-time human pose estimation. PoseNet can
be used to estimate either a single pose or multiple poses. PoseNet v1 uses a MobileNet
backbone and v2 a ResNet backbone.

Pose Estimation – Use Cases
OpenPose Experiments
Four types of experiments with pre-trained OpenPose model - Single and Multi-Person Pose Estimation with
OpenCV, Multi-Person Pose Estimation with PyTorch and Pose Estimation on Videos. Check out this notion.
Pose Estimation Inference Experiments
Experimented with pre-trained pose estimation models. Check out this notion for experiments with the OpenPifPaf
model, this one for the TorchVision Keypoint R-CNN model, and this notion for the Detectron2 model.
Pose Detection on the Edge
Train the pose detector using Teachable Machine, employing the PoseNet model (multi-person real-time pose
estimation) as the backbone, and serve it to the web browser using ml5.js. This system will infer the end user's pose
in real-time via a web browser. Check out this link and this notion.
Pose Detection on the Edge using OpenVINO
Optimize the pre-trained pose estimation model using the OpenVINO toolkit to make it ready to serve at the edge
(e.g., small embedded devices) and create an OpenVINO inference engine for real-time inference. Check out this
notion.

Face Analytics - Intro
Analyze the facial features like
age, gender, emotion, and
identity.
Applications
Identity verification, emotion
detection
Scope
Human faces only, Real-time
Tools
OpenCV, dlib
[Figure: Face Analyzer; example outputs: "She is Jane, Age 39, Looks Happy" and "He is Mike, Age 13, Looks Angry"]

Face Analytics - Methods
FaceNet
FaceNet: A Unified Embedding for Face Recognition and Clustering. CVPR, 2015.
RetinaFace
RetinaFace: Single-stage Dense Face Localisation in the Wild. arXiv, 2019.
FER+
Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution.
arXiv, 2016.

Face Analytics – Use Cases
Automatic Attendance System via Webcam
We use the Face Recognition library and OpenCV to create a real-time webcam-based attendance system
that automatically recognizes faces and logs attendance into an Excel sheet. Check out this
notion.
Detectron2 Fine-tuning for face detection
Fine-tuned Detectron2 on a human face dataset to detect faces in images and videos. Check out
this notion.

Text Classification - Intro
Text classification is a supervised
learning method for learning and
predicting the category or the
class of a document given its text
content. The state-of-the-art
methods are based on neural
networks of different
architectures as well as pre-
trained language models or word
embeddings.
Applications
Spam classification, sentiment
analysis, email classification,
service ticket classification,
question and comment
classification
Scope
Multiclass and Multilabel
classification
Tools
TorchText, Spacy, NLTK, FastText,
HuggingFace, pyss3
[Figure: Text Classifier; example output labels: Toxic, Urgent, Info, Spam]

Text Classification - Methods
FastText
Bag of Tricks for Efficient Text Classification. arXiv, 2016.
FastText is an open-source library, developed by the Facebook AI Research lab. Its main focus is on
achieving scalable solutions for the tasks of text classification and representation while processing
large datasets quickly and accurately.
XLNet
XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv, 2019.
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv, 2018.
TextCNN
What Does a TextCNN Learn? arXiv, 2018.
Embedding
Feature extraction using either a pre-trained embedding model (e.g., GloVe, FastText embeddings)
or a custom-trained embedding model (e.g., using Doc2Vec) and then training an ML classifier (e.g.,
SVM, Logistic regression) on these extracted features.
Bag-of-words
Feature extraction using count-based methods (e.g., CountVectorizer, TF-IDF) and then training an ML classifier (e.g.,
SVM, Logistic regression) on these extracted features.
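The bag-of-words recipe above can be sketched without any library: count words per document, then fit a logistic regression on the count vectors. This is a toy illustration under stated assumptions; the four training sentences, the helper names, and the hyperparameters are made up, and in practice scikit-learn's vectorizers and classifiers would replace the hand-written parts.

```python
import math
from collections import Counter


def build_vocab(texts):
    """Sorted list of every distinct lowercase token in the corpus."""
    return sorted({word for text in texts for word in text.lower().split()})


def vectorize(text, vocab):
    """Bag-of-words count vector for one text (unknown words are ignored)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def train_logistic_regression(features, labels, epochs=200, learning_rate=0.5):
    """Fit weights and a bias with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - target
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def spam_probability(text, vocab, weights, bias):
    """Predicted probability that the text belongs to class 1 (spam)."""
    x = vectorize(text, vocab)
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up training set: label 1 = spam, label 0 = not spam.
TEXTS = ["win a free prize now", "free money offer",
         "meeting at noon tomorrow", "project status update"]
LABELS = [1, 1, 0, 0]

vocab = build_vocab(TEXTS)
weights, bias = train_logistic_regression([vectorize(t, vocab) for t in TEXTS], LABELS)
print(spam_probability("free prize offer", vocab, weights, bias))
print(spam_probability("project meeting tomorrow", vocab, weights, bias))
```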

Text Classification – Use Cases
Email Classification
The objective is to build an email classifier, trained on 700K emails and 300+ categories. Preprocessing pipeline to
handle HTML and template-based content. Ensemble of FastText and BERT classifier. Check out this notion.
User Sentiment towards Vaccine
Based on the tweets of the users, and manually annotated labels (label 0 means against vaccines and label 1 means
in favor of vaccines), build a binary text classifier. A 1D-CNN was trained on the training dataset. Check out this notion.
ServiceNow IT Ticket Classification
Based on the short description, along with a long description if available for that ticket, identify the subject of the
incident ticket in order to automatically classify it into a set of pre-defined categories. E.g., if a customer wrote "Oracle
connection giving error", this ticket type should be labeled as "Database". Check out this notion.
Toxic Comment Classification
Check out this notion.
Pre-trained Transformer Experiments
Experiment with different types of text classification models that are available in the HuggingFace Transformer
library. Wrapped experiment-based inference as a Streamlit app.
Long Docs Classification
Check out this Colab.
BERT Sentiment Classification
Scrape app review data from the Android Play Store. Fine-tune a BERT model to classify each review as positive, neutral
or negative. Then deploy the model as an API using FastAPI. Check out this notion.

Text Similarity - Intro
Text similarity determines how
'close' two pieces of text are both
in surface closeness (lexical
similarity) and meaning (semantic
similarity).
Applications
Duplicate document detection,
text clustering, product
recommendations
Scope
No scope decided yet
Tools
Sentence Transformer Library,
Universal Sentence Encoder
Model (TFHub), Scikit-learn
[Figure: Document Similarity Finder; a query document is compared against other documents and candidates are returned with match scores (96%, 92%, 87%, 79%, 42%)]

Text Similarity - Methods
BERT
Use transfer learning to fine-tune a BERT encoder. This encoder will work as a feature extractor. E.g.,
the most common version of BERT converts any given text into a numeric vector of length 768 (this
vector is also known as a contextual embedding).
Bag-of-words
Extract features using models like TF-IDF, CountVectorizer.
DeepRank
DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval. arXiv, 2017.
FAISS
Billion-scale similarity search with GPUs. arXiv, 2017.
FAISS is a library for efficient similarity search and clustering of dense vectors.
Similarity Measures
L1 (Manhattan distance), L2 (Euclidean distance), Hinge Loss for Triplets.
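To make the bag-of-words route concrete, the sketch below builds TF-IDF vectors by hand and compares documents with cosine similarity. It is a minimal illustration: the whitespace tokenizer and the `log(N / df) + 1` weighting are assumptions chosen for brevity (libraries such as scikit-learn use slightly different smoothing), and the third document is made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


docs = [
    "0.5mm steel wire grade q195",
    "grade195 steel wire with 0.5mm thickness",
    "I forgot my account password",
]
vecs = tfidf_vectors(docs)
print(f"doc 0 vs doc 1: {cosine(vecs[0], vecs[1]):.2f}")
print(f"doc 0 vs doc 2: {cosine(vecs[0], vecs[2]):.2f}")
```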

Text Similarity – Use Cases
Semantic Relation Estimation
To maintain a level of coherence and similarity among various letters and speeches, a model was built that will help
in assessing this document similarity. In approach 1, TF-IDF with Latent semantic indexing was used to extract
features and cosine similarity as the distance metric. In approach 2, BERT with PCA was used for feature extraction
and 3 distance measures - L1, L2, and cosine, for similarity calculation. Check out this notion.
Finding Hardware Parts in Warehouse
There are millions of hardware items (e.g., 0.5mm steel wire grade q195) in the warehouse and customers generally
ask for items in natural language (e.g., grade195 steel wire with 0.5mm thickness). A text similarity system was
built using an ensemble of 3 bag-of-words based count vectorizer models with different tokenization
processes and n-gram ranges. Check out this notion.
Image + Text Similarity
Using the textual details and images of products, find the matching product across different groups. Around 35
GB of retail product images were scraped and used to build the system. Check out the notion here.
Text Recommendation
For the given BRM text, recommend top-5 GAO text. We used universal sentence encoder to encode the text and
calculated cosine similarity within group. Then an item-based recommender model was used to find most suitable
top-K candidates in GAO based on the interaction history. Check out this notion.

Entity Recognition - Intro
NER models classify each
word/phrase in the document
into a pre-defined category. In
other words, these models
identify named entities
(classes/labels) in the given text
document.
Applications
Text parsing, Keyword detection,
Opinion mining, Affinity towards
brands
Scope
Tools
Doccano, Flair, Spacy,
HuggingFace Transformer Library
[Figure: Named Entity Recognizer; input document "Alex is going to catch a flight for Miami tomorrow." with entities tagged as Person, Vehicle, Location, and Time]

Entity Recognition - Methods
Flair-NER
Pooled Contextualized Embeddings for Named Entity Recognition. NAACL, 2019.
Contextual string embeddings are a recent type of contextualized word embedding that were shown
to yield state-of-the-art results when utilized in a range of sequence labeling tasks. This model
achieves an F1 score of 93.2 on the CoNLL-03 dataset.
Spacy-NER
Incremental parsing with bloom embeddings and residual CNNs.
spaCy v2.0's Named Entity Recognition system features a sophisticated word embedding strategy
using subword features and "Bloom" embeddings, a deep convolutional neural network with residual
connections, and a novel transition-based approach to named entity parsing.
Transformer-NER
Fine-tuning of transformer-based models like BERT, RoBERTa and ELECTRA.

Entity Recognition – Use Cases
Name and Address Parsing
Parse names (person [first, middle and last name], household or corporation) and address (street,
city, state, country, zip) from the given text. We used Doccano for annotation and trained a Flair NER
model on GPU. Check out this notion.
NER Methods Experiment
Data is extracted from the GMB (Groningen Meaning Bank) corpus and annotated using the BIO scheme. 10
different NER models were trained and compared on this dataset. A frequency-based tagging model
was taken as the baseline. Classification, CRF, LSTM, LSTM-CRF, Char-LSTM, Residual-LSTM ELMo,
BERT tagger, Spacy tagger and an interpretable tagger with Keras and LIME were trained. Check out
this notion.
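In the BIO scheme mentioned above, each token is tagged B-<label> (beginning of an entity), I-<label> (inside the same entity), or O (outside any entity). The sketch below groups such tags back into entity spans. The tag sequence and label names are hypothetical model output for the example sentence used earlier in this deck; treating a stray I- tag as O is a simplification chosen here.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) pairs."""
    entities = []
    current_tokens, current_label = [], None

    def flush():
        # Close the entity being built, if any.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:  # "O", or an I- tag that does not continue the open entity
            flush()
            current_tokens, current_label = [], None
    flush()
    return entities


tokens = "Alex is going to catch a flight for Miami tomorrow".split()
tags = ["B-Person", "O", "O", "O", "O", "O", "B-Vehicle", "O", "B-Location", "B-Time"]
print(decode_bio(tokens, tags))
```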

Text Summarization - Intro
To take the appropriate action, we
need the latest information, yet the
amount of available information
keeps growing. An automatic and
accurate summarization feature
helps us understand the topics in
less time.
Applications
News Summarization, Social media
Summarization, Entity timelines,
Storylines of event, Domain specific
summaries, Sentence Compression,
Event understanding,
Summarization of user generated
content
Scope
Extractive and Abstractive summary
Tools
HuggingFace Transformer Library
[Figure: Text Summarizer. Input documents: (1) "Thankfully, the project is starting tomorrow." (2) "It was planned to start 2 months ago but got late." (3) "Please be ready by 9 AM for project onboarding." Output summary: "Although got late by 2 months, project is starting tomorrow. Be ready by 9 AM for the onboarding."]

Text Summarization - Methods
ProphetNet
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv, 2020.
PEGASUS
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. arXiv, 2019.
BERTSum
Fine-tune BERT for Extractive Summarization. arXiv, 2019.
Seq2Seq PointerGenerator
Get To The Point: Summarization with Pointer-Generator Networks. arXiv, 2017.

Text Summarization – Use Cases
Enron Email Summarization
Email overload can be a difficult problem to manage for both work and personal email inboxes. With the average
office worker receiving between 40 and 90 emails a day, it has become difficult to extract the most important
information in an optimal amount of time. A system that can create concise and coherent summaries of all emails
received within a timeframe can reclaim a large amount of time. Check out this notion.
PDF Summarization over mail
Built a system that receives a PDF over Outlook mail and creates a word cloud for this PDF. It then sends this word
cloud back as an attachment to that email. Check out this notion.
BART Text Summarization
Document text summarization using the BART transformer model and visual API using the Plotly Dash app. Check
out this notion.
Transformers Summarization Experiment
Experiment with various transformers for text summarization using HuggingFace library. Summarization of 4 books.
CNN-Daily Mail and InShorts News Summarization
Check out this notion for CNN-Daily Mail and this one for InShorts.
Covid-19 article summarization
Used BERT and GPT-2 for article summarization related to covid-19. Check out this notion.

Topic Modeling - Intro
Topic modeling is an unsupervised
machine learning technique that’s
capable of scanning a set of
documents, detecting word and
phrase patterns within them, and
automatically clustering word
groups and similar expressions that
best characterize a set of
documents.
Applications
Categorize documents without
labels, Identify emerging themes
and topics
Scope
Tools
Gensim
[Figure: Topic Finder. Input documents: (1) "My laptop has crashed." (2) "I forgot my account password." (3) "Getting blue screen error." The documents are grouped under the topics "Hardware Failure" and "Account Reset".]

Topic Modeling - Methods
LSA
The core idea is to take a matrix of what we have — documents and terms — and decompose it into
a separate document-topic matrix and a topic-term matrix.
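The decomposition described above is commonly written as a truncated singular value decomposition. The notation below is an assumption introduced here for illustration: X is the document-term matrix with m documents and n terms, and k is the chosen number of topics.

```latex
X \approx U_k \Sigma_k V_k^{\top},
\qquad
X \in \mathbb{R}^{m \times n},\quad
U_k \in \mathbb{R}^{m \times k},\quad
\Sigma_k \in \mathbb{R}^{k \times k},\quad
V_k \in \mathbb{R}^{n \times k}
```

Under this notation, the rows of U_k \Sigma_k give each document's weights over the k topics (the document-topic matrix), and the rows of V_k^{\top} give each topic's weights over the terms (the topic-term matrix).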

Topic Modeling – Use Cases
Identify Themes and Emerging Issues in ServiceNow Incident Tickets
Extracted key phrases from the incident ticket descriptions and trained an LSA topic model on these
phrases to identify emerging themes and incident topics. This enabled a proactive approach to
manage and contain the issues and thus increasing CSAT. Check out this notion.
IT Support Ticket Management
In Helpdesk, almost 30–40% of incident tickets are not routed to the right team. These tickets keep
bouncing between teams, and by the time a ticket reaches the right team, the issue might have
spread widely and reached top management, inviting a lot of trouble. To solve this issue, we built a
system with 6 deliverables: Key Phrase Analysis; Topic Modeling; Ticket Classification; Trend,
Seasonality and Outlier Analysis; a PowerBI Dashboard to visually represent the KPIs; and division of
tickets into standard vs. non-standard template responses. Check out this notion.

Text Recognition - Intro
Text—as a fundamental tool of
communicating information—scatters
throughout natural scenes, e.g., street
signs, product labels, license plates, etc.
Automatically reading text in natural
scene images is an important task in
machine learning and gains increasing
attention due to a variety of
applications.
Applications
Indexing of multimedia archives,
recognizing signs in driver assisted
systems, providing scene information
to visually impaired people, identifying
vehicles by reading their license plates.
Scope
Detection/ Localization and Recognition
at the same time, Real-time
Tools
OpenCV, Tesseract, PaddleOCR
[Figure: Scene Text Recognizer; text such as "Welcome to Coffee House", "Jane’s Garden", and "Open 9 am – 11 pm" is detected and read from scene images]

Text Recognition - Methods
Semantic Reasoning Networks
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks. arXiv, 2020.
Differentiable Binarization
Real-time Scene Text Detection with Differentiable Binarization. arXiv, 2019.
CRAFT
Character Region Awareness for Text Detection. arXiv, 2019.
EAST
EAST: An Efficient and Accurate Scene Text Detector. arXiv, 2017.

Text Recognition – Use Cases
Scene Text Detection with EAST Tesseract
Detect the text in images and videos using EAST model. Read the characters using Tesseract. Check
out this notion.
Scene Text Recognition with DeepText
Detect and Recognize text in images with an end-to-end model named DeepText. Check out this
notion.
Automatic License Plate Recognition
Read the characters on the license plate image using Tesseract OCR. Check out this notion.
Keras OCR Toolkit Experiment
Keras OCR is a deep learning-based toolkit for text recognition in images. Check out this notion.
OCR Experiments
Experiments with three OCR tools - Tesseract OCR, Easy OCR and Arabic OCR. Check out this and this
notion.

Object Tracking - Intro
Object tracking is the process of 1) taking an
initial set of object detections (such as an
input set of bounding box coordinates), 2)
creating a unique ID for each of the initial
detections, and then 3) tracking each of the
objects as they move around frames in a
video, maintaining the assignment of unique
IDs.
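The three steps above can be sketched as a small centroid tracker: each detection keeps the ID of the nearest previously seen centroid, or gets a new ID when none is close enough. The greedy nearest-centroid matching, the `max_distance` threshold, and the class name are simplifications chosen for illustration; they are not the method of any tracker listed in this deck.

```python
import math
from itertools import count


class CentroidTracker:
    """Assigns persistent IDs to boxes by greedy nearest-centroid matching."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance
        self.tracks = {}  # track id -> last known centroid
        self._ids = count(1)

    @staticmethod
    def _centroid(box):
        x_min, y_min, x_max, y_max = box
        return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

    def update(self, boxes):
        """Return {track_id: box} for the detections of the current frame."""
        unmatched = dict(self.tracks)
        new_tracks, assigned = {}, {}
        for box in boxes:
            centroid = self._centroid(box)
            best_id, best_distance = None, self.max_distance
            for track_id, previous in unmatched.items():
                distance = math.dist(centroid, previous)
                if distance <= best_distance:
                    best_id, best_distance = track_id, distance
            if best_id is None:
                best_id = next(self._ids)  # no track close enough: new object
            else:
                del unmatched[best_id]  # each track matches at most one box
            new_tracks[best_id] = centroid
            assigned[best_id] = box
        self.tracks = new_tracks  # tracks without a detection are dropped
        return assigned


tracker = CentroidTracker()
frame1 = tracker.update([(0, 0, 10, 10), (100, 100, 110, 110)])
frame2 = tracker.update([(102, 101, 112, 111), (2, 1, 12, 11)])
print(frame1)
print(frame2)
```

Production trackers such as DeepSORT add motion models and appearance features on top of this kind of association step.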
Applications
In-store consumer behavior tracking, Apply
security policies like crowd management,
traffic management, vision-based control,
human-computer interface, medical imaging,
augmented reality, robotics
Scope
Track objects in images and videos,
2-dimensional tracking, Bounding boxes and
pixel masks, Single and Multiple Object
Tracking
Tools
OpenCV, PyTorch
[Figure: Object Tracker. An 8-frame video (frames 1 to 8) is passed through the tracker, which labels the tracked objects with IDs: Person P1, Bird B1, Person P2.]

Object Tracking - Methods
FairMOT
On the Fairness of Detection and Re-Identification in Multiple Object Tracking. arXiv, 2020.
DeepSORT
Simple Online and Realtime Tracking with a Deep Association Metric. arXiv, 2017.
Detect objects with models such as YOLO or Mask R-CNN, then track them using DeepSORT.
GOTURN
Learning to Track at 100 FPS with Deep Regression Networks. arXiv, 2016.
CNN offline learning tracker.
MDNet
Real-Time MDNet. arXiv, 2018.
CNN online learning tracker.
ROLO
Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking. arXiv, 2016.
CNN + LSTM tracker.

Object Tracking – Use Cases
Pedestrian Tracking
Pedestrian Tracking with YOLOv3 and DeepSORT. Check out this notion.
Object Tracking
Object tracking with FRCNN and SORT. Check out this notion.
Object Tracking using OpenCV
Tested out 5 algorithms on videos - OpticalFlow, DenseFlow, Camshift, MeanShift and Single Object
Tracking with OpenCV. Check out this notion.
Social Distancing Violation Detection
Estimate the distance between the center points of detected people and track violations in real time.
People and Vehicle Counter Detection
Counting the number of people and vehicles passing through a particular point, captured by a CCTV
camera.
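For the social distancing use case above, the distance check can be sketched as follows: take the center of each tracked person's bounding box and flag every pair that is closer than a minimum distance. The helper names, the boxes and the 75-pixel threshold are assumptions for this example. Distances here are in image pixels; mapping them to real-world distances would need camera calibration, which is not covered.

```python
import math
from itertools import combinations


def centroid(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def find_violations(boxes_by_id, min_distance):
    """Return ID pairs whose box centers are closer than min_distance."""
    centers = {pid: centroid(box) for pid, box in boxes_by_id.items()}
    violations = []
    for (id_a, c_a), (id_b, c_b) in combinations(sorted(centers.items()), 2):
        if math.dist(c_a, c_b) < min_distance:
            violations.append((id_a, id_b))
    return violations


# Made-up detections for one frame, keyed by track ID.
people = {
    "P1": (0, 0, 40, 100),
    "P2": (50, 0, 90, 100),
    "P3": (400, 0, 440, 100),
}
violations = find_violations(people, min_distance=75)
```

Running the same check on every frame, with IDs supplied by a tracker, lets violations be followed over time rather than counted per frame.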

Action Recognition - Intro
Video action recognition is the
task of identifying human
activities/actions (e.g., eating,
playing) in videos. In other
words, this task classifies
segments of videos into a set of
pre-defined categories.
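The idea of classifying video segments into pre-defined categories can be sketched as: split the frame sequence into fixed-length clips, label each clip, then take a majority vote for the whole video. In the sketch below the clip classifier is a stand-in rule on made-up per-frame motion scores, not a trained model; in practice it would be a network such as those listed under Methods. All names, the clip length of 4 and the labels are assumptions for illustration.

```python
from collections import Counter


def split_into_clips(frames, clip_length, stride):
    """Cut a frame sequence into fixed-length clips (incomplete tail dropped)."""
    last_start = len(frames) - clip_length
    return [frames[s:s + clip_length] for s in range(0, last_start + 1, stride)]


def classify_video(frames, clip_classifier, clip_length=4, stride=4):
    """Label every clip, then pick the most frequent label for the video."""
    clips = split_into_clips(frames, clip_length, stride)
    labels = [clip_classifier(clip) for clip in clips]
    majority = Counter(labels).most_common(1)[0][0] if labels else None
    return labels, majority


def toy_clip_classifier(clip):
    """Hypothetical stand-in for a trained model: thresholds mean motion."""
    return "running" if sum(clip) / len(clip) > 0.5 else "eating"


# Each "frame" is reduced to a made-up motion score between 0 and 1.
frames = [0.9, 0.8, 0.7, 0.9, 0.1, 0.2, 0.1, 0.0, 0.8, 0.9, 0.9, 0.7]
labels, video_label = classify_video(frames, toy_clip_classifier)
```

Using a stride smaller than the clip length gives overlapping clips, which costs more compute but makes the vote less sensitive to where a clip boundary falls.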
Applications
Automated surveillance, elderly
behavior monitoring,
human-computer interaction,
content-based video retrieval,
and video summarization
Scope
Human Actions only
Tools
OpenCV
[Figure: Action Recognizer. 4-frame video clips (frames 1 to 4) are passed through the recognizer, which outputs an activity label for each clip, e.g. Running and Ice-Skating.]

Action Recognition - Methods
3D-ResNet
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
In this paper, the authors explore how existing state-of-the-art 2D architectures (such as ResNet, ResNeXt,
DenseNet, etc.) can be extended to video classification via 3D kernels.
R(2+1)D
This model was pre-trained on 65 million social media videos and fine-tuned on Kinetics400.

Action Recognition – Use Cases
Kinetics 3D CNN Human Activity Recognition
This dataset consists of 400 human activity recognition classes, at least 400 video clips per
class (downloaded via YouTube) and a total of 300,000 videos. Check out this notion.
Action Recognition using R(2+1)D Model
VGA Annotator was used for creating the video annotation for training. Check out this notion.

Chatbot - Intro
A chatbot automates some
tasks and handles
conversations with the user.
Applications
Customer support, Product
suggestion, Interactive FAQ,
Form filling, Question
Answering
Scope
Chat and Voice Support,
FAQ, Knowledge-based and
Contextual bot
Tools
DialogFlow, RASA,
DeepPavlov, Alexa Skill,
HuggingFace, ParlAI
Chatbot
[Figure: Chatbot. Example exchanges: "What’s my claim status?" answered with "Congrats Danny, your claim has been approved."; "I want to file a claim request!" answered with "Sure Nina, I have filed a request for you."]

Chatbot - Methods
RASA Chatbot
RASA supports contextual conversational AI. It provides an integrated framework for natural
language understanding, dialogue generation and management. It also supports multiple endpoints
(e.g., Facebook Messenger, WhatsApp) for easy deployment.
DialogFlow Chatbot
It is an API to easily create and deploy chatbots. It also supports voice interaction via Google Cloud
Voice API.
Alexa Skill
This API enables us to create an Alexa skill that can be used via Alexa services. It also supports voice
interaction via Alexa Voice API.
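The frameworks above combine natural language understanding (mapping a message to an intent) with dialogue management (choosing a response or action). The toy sketch below shows that split with keyword rules, using the insurance claim exchange from the intro slide. It does not use the RASA, DialogFlow or Alexa APIs. The intent names, keyword table and in-memory claims store are all hypothetical, and real frameworks use trained models rather than keyword matching.

```python
import re

# Hypothetical intents, each triggered when all of its keywords are present.
INTENT_KEYWORDS = {
    "file_claim": ("file", "claim"),
    "claim_status": ("claim", "status"),
}


def detect_intent(message):
    """Toy NLU step: keyword lookup instead of a trained classifier."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if all(keyword in words for keyword in keywords):
            return intent
    return "fallback"


def respond(message, user_name, claims):
    """Toy dialogue step: pick a reply (and update state) from the intent."""
    intent = detect_intent(message)
    if intent == "claim_status":
        status = claims.get(user_name)
        if status is None:
            return f"Sorry {user_name}, I could not find a claim for you."
        return f"{user_name}, your claim status is: {status}."
    if intent == "file_claim":
        claims[user_name] = "filed"
        return f"Sure {user_name}, I have filed a request for you."
    return "Sorry, I did not understand that."


# In-memory stand-in for a claims backend.
claims = {"Danny": "approved"}
status_reply = respond("What's my claim status?", "Danny", claims)
file_reply = respond("I want to file a claim request!", "Nina", claims)
```

Keeping intent detection separate from response selection mirrors how these frameworks let the understanding model and the dialogue policy be changed independently.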

Chatbot – Use Cases
RASA Chatbot
Categorized the services and selected the 4 most usable services for automation. Developed a text-based
chatbot application for this automation, implemented with the RASA framework (Python).
Check out this notion.
Insurance Voicebot
Automate the low-skill contact center services with the help of Voicebot AI technology. Context - Insurance Contact
Centre, Role - A virtual customer care representative, Skills – Claim status, Language – English (US), Technology –
Voice-enabled Goal-oriented Conversational AI Agents (Voicebots). Modules - Dialogflow Voicebot, Alexa Skill
Voicebot, Rasa with 3rd-party Voice API, Rasa powered Alexa skill, Rasa powered Google assistant, Rasa voicebot
with Mozilla tools, and DeepPavlov Voicebot. Check out this notion.
Wellness Tracker
A bot that logs daily wellness data to a spreadsheet (using the Airtable API), to help the user keep track of their
health goals. Connect the assistant to a messaging channel—Twilio—so users can talk to the assistant via text
message and WhatsApp. Check out this notion.
RASA Chatbot Experiments
Experiment with 4 chatbots in RASA: Financial Bot - A chatbot demonstrating how to build AI assistants for financial
services and banking, Movie Bot - A bot to book movie tickets, Cricket Bot - A bot that will bring the live info about
IPL cricket match as per user query, and Pokedex - This is a demonstration of a digital assistant that can answer
questions about Pokémon. Check out this notion.

Next Steps…
• Code & notebooks
• Hands-on tutorials
• Interactive sessions
• Notes & resources
• Tools, libraries & APIs
• Concepts & ideas
• Algorithms & math
• Generative Adversarial Networks
• Reinforcement Learning
• ML Ops & Data Engineering
• Recommender Systems
• Graph Neural Networks
• Machine Learning at scale
• Auto ML
• Audio Analytics
Horizontal
Vertical

Thank you.
aman.jain@infosys.com
