SlideShare una empresa de Scribd logo
1 de 31
Few Shot Learning
Asif Ali
M.E SCUT CHINA
Date: May 31, 2019
Contents
• Introduction
• Problem statement, Why?
• Approaches
– Meta learning
• Matching network
• MAML
– Metric Learning
• Relation Networks
• Prototypical Networks
– AUGMENTAION BASED
• Delta encoder
• Few shot learning through informative retrieval lens
Introduction
• The ability of deep neural networks to extract complex statistics and learn high level features from vast
datasets is proven. Yet current deep learning approaches suffer from poor sample efficiency in stark
contrast to human perception — even a child could recognise a giraffe after seeing a single picture.
• Fine-tuning a pre-trained model is a popular strategy to achieve high sample efficiency but it is a post-hoc
hack
Can machine learning do better?
Few-shot learning aims to solve these issues
Few shot learning
• Whereas most machine learning based object categorization algorithms require
training on hundreds or thousands of samples/images and very large datasets,
one/FEW-shot learning aims to learn information about object categories from
one, or only a few, training samples/images.
• It is estimated that a child has learned almost all of the 10 ~ 30 thousand object
categories in the world by the age of six. This is due not only to the human mind's
computational power, but also to its ability to synthesize and learn new object
classes from existing information about different, previously learned classes.
Problem statement
Using a large annotated offline dataset,
dog
elephant
monkey
Offline
trainin
g
perform for novel categories,
represented by just a few samples each.
knowledge
transfer
lemur
rabbit
mongoose
model
for novel
categories
given task
Online
training
…
Problem statement
Using a large annotated offline dataset,
knowledge
transfer
classifier
for novel
categories
classification
Online
training
Offline
trainin
g
perform for novel categories,
represented by just a few samples each.
dog
elephant
monkey
…
lemur
rabbit
mongoose
Problem statement
Using a large annotated offline dataset,
knowledge
transfer
detector
for novel
categories
detection
Online
training
Offline
trainin
g
perform for novel categories,
represented by just a few samples each.
dog
elephant
monkey
…
lemur
rabbit
mongoose
Problem statement
Using a large annotated offline dataset,
knowledge
transfer
regressor
for novel
categories
Online
training
Offline
trainin
g
perform regression for novel categories,
represented by just a few samples each.
dog
elephant
monkey
…
lemur
rabbit
mongoose
Why work on few-shot learning?
1. It brings the DL closer to real-world business usecases.
• Companies hesitate to spend much time and money on
annotated data for a solution that they may profit.
• Relevant objects are continuously replaced with new ones. DL
has to be agile.
2. It involves a bunch of exciting cutting-edge technologies.
Meta-
learning
methods
Networks
generating
networks
Data
synthesizers
Semantic
metric spaces
Graph neural
networks
Neural Turing
Machines
GANs
Meta-learning
Learn a learning strategy
to adjust well to a new
few-shot learning task
Data augmentation
Synthesize more data from the novel
classes to facilitate the regular
learning
Metric learning
Learn a `semantic` embedding
space using a distance loss
function
Few-
shot
learning Each category is
represented by just a
few examples
Learn to perform
classification,
detection, regression
The n-shot, k-way task
• The ability of a algorithm to perform few-shot learning is typically measured by its
performance on n-shot, k-way tasks. These are run as follows:
1. A model is given a query sample belonging to a new, previously unseen class
2. It is also given a support set, S, consisting of n examples each from k different unseen classes.
3. The algorithm then has to determine which of the support set classes the query sample belongs to
Training a
meta learner
to learn on
each task
Meta-Learning
Standard learning: datadatadata
instances
training a
learner on
the data
model
Meta learning: datadatatasks
mod
el
mod
el
learning
strategy
data knowledge
task-
specific
learner
task-agnostic
specific
classes
training data
target data
task-specific
meta-
learner
datadatatask
data
meta-
learner
New
task
Recurrent meta-learners
Matching Networks in Vinyals et.al., NIPS 2016
Distance-based classification: based on similarity between
the query and support samples in the embedding space
(adaptive metric):
𝑦 =
𝑖
𝑎 𝑥, 𝑥𝑖 𝑦𝑖 , 𝑎 𝑥, 𝑥𝑖 = 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 𝑓 𝑥, 𝑆 , 𝑔 𝑥𝑖, 𝑆
𝑓, 𝑔 - LSTM embeddings of 𝑥 dependent on the support set S
• Embedding space is class-agnostic
• LSTM attention mechanism adjusts the embedding to the task
to be elaborated later
Concept of episodes: test
conditions in the training.
• N new categories
• M training examples per
category
• one query example in {1..N}
categories.
• Typically, N=5, M=1, 5.
Method
miniImageNet
classification
accuracy 1/5
shot
Matching
networks
43.56 / 55.31
Optimization as a model for few-shot learning
• META-LEARN LSTM learn a general initialization
of the learner (classifier) network that allows for
quick convergence of training.
Problem: Gradient-based
optimization in high
capacity classifiers requires
many iterative steps over
many examples to perform
well.
Solution: an LSTM-based
meta-learner model to
learn the exact optimization
algorithm to train another
learner neural network
classifier in the few-shot
learning.
Optimizers
Optimize the learner to perform well after fine-tuning on the task data done by
a single (or few) step(s) of Gradient Descent.
MAML(Model-Agnostic Meta-Learning) Finn et.al., ICML 2017
Standard objective (task-specific, for task T):
min
θ
ℒT θ , learned via update θ′
= θ − α ∙ 𝛻θℒT(θ)
Meta-objective (across tasks):
min
θ T~p(ℑ) ℒT θ′ , learned via an update θ ← θ − β𝛻θ T~p(ℑ) ℒT θ′
reprinted from
Li et.al., 2017
Meta-SGD Li et.al., 2017
“Interestingly, the learning process can continue forever, thus enabling life-long learning,
and at any moment, the meta-learner can be applied to learn a learner for any new task.”
Meta-SGD Li et.al., 2017
Render α as a vector of size θ.
Method
miniImageNet
classification
accuracy 1/5
shot
Matching
networks
43.56 / 55.31
MAML 48.70 / 63.11
Meta-SGD 54.24 / 70.86
Metric Learning
Offline training
datadata
data
instan
ces
deep
embeddi
ng
model
Training: achieve good distributions for offline categories
Inference: Nearest Neightbour in the embedding space
query
semantic
embedding
space
class 3
class 1
class 2
class C
class A
class B
New task data d(q,A)
d(q,C)
d(q,B)
classification
Metric Learning
Relation networks, Sung et.al., CVPR 2018
Use the Siamese Networks principle :
• Concatenate embeddings of query and support samples
• Relation module is trained to produces score 1 for correct class and 0 for others
• Extends to zero-shot learning by replacing support embeddings with semantic features.
replicated from Sung et.al., Learning to
Compare - Relation network for few-shot
learning, CVPR 2018
Method
miniImageNet
classification
accuracy 1/5
shot
Matching
networks
43.56 / 55.31
MAML 48.70 / 63.11
Relation
networks
50.44 / 65.32
Meta-SGD 54.24 / 70.86
LEO 61.76 / 77.59
Metric Learning
Matching Networks,Vinyals et.al., NIPS 2016
Objective: maximize the cross-entropy for the non-parametric softmax
classifier (𝑥,𝑦) 𝑙𝑜𝑔𝑃 𝜃 𝑦 𝑥, 𝑆 , with
𝑃 𝜃 𝑦 𝑥, 𝑆 = 𝑠𝑜𝑓𝑡𝑚𝑎𝑥 𝑐𝑜𝑠 𝑓 𝑥, 𝑆 , 𝑔 𝑥𝑖, 𝑆
Each ca
by a sin
Prototypical Networks, Snell et al, 2016:
Each category is represented by it mean sample (prototype).
Objective: maximize the cross-entropy with the prototypes-based
Method
miniImageNet
classification
accuracy 1/5
shot
Matching
networks
43.56 / 55.31
MAML 48.70 / 63.11
Relation
networks
50.44 / 65.32
Prototypical
Networks
49.42 / 68.20
Meta-SGD 54.24 / 70.86
LEO 61.76 / 77.59
Prototypical Networks
• In Prototypical Networks Snell et al. apply a compelling inductive bias in the form of class
prototypes to achieve impressive few-shot performance — exceeding Matching Networks
without the complication of FCE. The key assumption is made is that there exists an
embedding in which samples from each class cluster around a single prototypical
representation which is simply the mean of the individual samples
Sample synthesis
Offline stage
datadata
data
instan
ces
train a
synthesizer
sampling from
class distribution
synthesizer
model
data knowledge
On new task data
datafew data
instances
synthesizer
model
novel classes
datadata
many
data
instances
train a
model
task
model
datadataoffline
data
More augmentation approaches
Δ-encoder Schwartz et.al., NeurIPS 2018
• Use a variant of autoencoder to capture the intra-class
difference between two class samples in the latent space.
• Transfer class distributions from training to novel classes.
Encoder
Decoder
𝑍
Sampled
target
Sampled
reference
Sampled
delta
New class
reference
Synthesized
new class
example
Synthesis
Eliyahu Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes and Alex M. Bronstein, 'Delta-encoder: an
effective sample synthesis method for few-shot object recognition', NeurIPS 2018.
Few shot learning through Information Retrieval lens
Goal: Ranking For classification
We want to classify the points by finding out which class is most similar one
So we are going to rank all the other w.r.t to some similarity measure
Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017.
https://arxiv.org/abs/1707.02610
Mean Average Precision
Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017.
https://arxiv.org/abs/1707.02610
PROBLEMS AHEAD
The mean Average Precision is a terrible loss function (for gradient descent purposes)
Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017.
https://arxiv.org/abs/1707.02610
Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017.
https://arxiv.org/abs/1707.02610
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models
Egor Zakharov, Aliaksandra Shysheya, Egor Burkov, Victor Lempitsky Submitted on 20 May 2019
Thank you

Más contenido relacionado

La actualidad más candente

Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning David Voyles
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Optimization as a model for few shot learning
Optimization as a model for few shot learningOptimization as a model for few shot learning
Optimization as a model for few shot learningKaty Lee
 
Feature Selection in Machine Learning
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine LearningUpekha Vandebona
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis PresentationWajdi Khattel
 
META-LEARNING.pptx
META-LEARNING.pptxMETA-LEARNING.pptx
META-LEARNING.pptxAyanaRukasar
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and ApplicationsEmanuele Ghelfi
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
 
Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNNAshray Bhandare
 
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...Edge AI and Vision Alliance
 
Machine Learning
Machine LearningMachine Learning
Machine LearningRahul Kumar
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Appsilon Data Science
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNNNoura Hussein
 
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNINGMLReview
 

La actualidad más candente (20)

Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Optimization as a model for few shot learning
Optimization as a model for few shot learningOptimization as a model for few shot learning
Optimization as a model for few shot learning
 
Feature Selection in Machine Learning
Feature Selection in Machine LearningFeature Selection in Machine Learning
Feature Selection in Machine Learning
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
One shot learning
One shot learningOne shot learning
One shot learning
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis Presentation
 
Meta learning tutorial
Meta learning tutorialMeta learning tutorial
Meta learning tutorial
 
META-LEARNING.pptx
META-LEARNING.pptxMETA-LEARNING.pptx
META-LEARNING.pptx
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
Meta-Learning Presentation
Meta-Learning PresentationMeta-Learning Presentation
Meta-Learning Presentation
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNN
 
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)Introduction to Generative Adversarial Networks (GANs)
Introduction to Generative Adversarial Networks (GANs)
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 

Similar a Few shot learning/ one shot learning/ machine learning

ML crash course
ML crash courseML crash course
ML crash coursemikaelhuss
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptxMonicaTimber
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
 
Machine Duping 101: Pwning Deep Learning Systems
Machine Duping 101: Pwning Deep Learning SystemsMachine Duping 101: Pwning Deep Learning Systems
Machine Duping 101: Pwning Deep Learning SystemsClarence Chio
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningMLAI2
 
Visual concept learning
Visual concept learningVisual concept learning
Visual concept learningVaibhav Singh
 
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshop
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshopJRs presentation-few-shot-learning-overview @ AI4Media WP5 workshop
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshopHannes Fassold
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningPramit Choudhary
 
AI & ML in Defence Systems - Sunil Chomal
AI & ML in Defence Systems   - Sunil ChomalAI & ML in Defence Systems   - Sunil Chomal
AI & ML in Defence Systems - Sunil ChomalSunil Chomal
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.ASHOK KUMAR
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachinePulse
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
MACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXMACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXmlaij
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for BeginnersSanghamitra Deb
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningNatasha Grant
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE
 

Similar a Few shot learning/ one shot learning/ machine learning (20)

ML crash course
ML crash courseML crash course
ML crash course
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
Machine Duping 101: Pwning Deep Learning Systems
Machine Duping 101: Pwning Deep Learning SystemsMachine Duping 101: Pwning Deep Learning Systems
Machine Duping 101: Pwning Deep Learning Systems
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Visual concept learning
Visual concept learningVisual concept learning
Visual concept learning
 
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshop
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshopJRs presentation-few-shot-learning-overview @ AI4Media WP5 workshop
JRs presentation-few-shot-learning-overview @ AI4Media WP5 workshop
 
Deep Meta Learning
Deep Meta Learning Deep Meta Learning
Deep Meta Learning
 
Model Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep LearningModel Evaluation in the land of Deep Learning
Model Evaluation in the land of Deep Learning
 
AI & ML in Defence Systems - Sunil Chomal
AI & ML in Defence Systems   - Sunil ChomalAI & ML in Defence Systems   - Sunil Chomal
AI & ML in Defence Systems - Sunil Chomal
 
presentation.ppt
presentation.pptpresentation.ppt
presentation.ppt
 
Large Scale Distributed Deep Networks
Large Scale Distributed Deep NetworksLarge Scale Distributed Deep Networks
Large Scale Distributed Deep Networks
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World Applications
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
MACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXMACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOX
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
 

Último

Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfsmsksolar
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptNANDHAKUMARA10
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...soginsider
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stageAbc194748
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksMagic Marks
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadhamedmustafa094
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxnuruddin69
 

Último (20)

Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptx
 

Few shot learning/ one shot learning/ machine learning

  • 1. Few Shot Learning Asif Ali M.E SCUT CHINA Date: May 31, 2019
  • 2. Contents • Introduction • Problem statement, Why? • Approaches – Meta learning • Matching network • MAML – Metric Learning • Relation Networks • Prototypical Networks – AUGMENTAION BASED • Delta encoder • Few shot learning through informative retrieval lens
  • 3. Introduction • The ability of deep neural networks to extract complex statistics and learn high level features from vast datasets is proven. Yet current deep learning approaches suffer from poor sample efficiency in stark contrast to human perception — even a child could recognise a giraffe after seeing a single picture. • Fine-tuning a pre-trained model is a popular strategy to achieve high sample efficiency but it is a post-hoc hack Can machine learning do better? Few-shot learning aims to solve these issues
  • 4. Few shot learning • Whereas most machine learning based object categorization algorithms require training on hundreds or thousands of samples/images and very large datasets, one/FEW-shot learning aims to learn information about object categories from one, or only a few, training samples/images. • It is estimated that a child has learned almost all of the 10 ~ 30 thousand object categories in the world by the age of six. This is due not only to the human mind's computational power, but also to its ability to synthesize and learn new object classes from existing information about different, previously learned classes.
  • 5. Problem statement Using a large annotated offline dataset, dog elephant monkey Offline trainin g perform for novel categories, represented by just a few samples each. knowledge transfer lemur rabbit mongoose model for novel categories given task Online training …
  • 6. Problem statement Using a large annotated offline dataset, knowledge transfer classifier for novel categories classification Online training Offline trainin g perform for novel categories, represented by just a few samples each. dog elephant monkey … lemur rabbit mongoose
  • 7. Problem statement Using a large annotated offline dataset, knowledge transfer detector for novel categories detection Online training Offline trainin g perform for novel categories, represented by just a few samples each. dog elephant monkey … lemur rabbit mongoose
  • 8. Problem statement Using a large annotated offline dataset, knowledge transfer regressor for novel categories Online training Offline trainin g perform regression for novel categories, represented by just a few samples each. dog elephant monkey … lemur rabbit mongoose
  • 9. Why work on few-shot learning? 1. It brings the DL closer to real-world business usecases. • Companies hesitate to spend much time and money on annotated data for a solution that they may profit. • Relevant objects are continuously replaced with new ones. DL has to be agile. 2. It involves a bunch of exciting cutting-edge technologies. Meta- learning methods Networks generating networks Data synthesizers Semantic metric spaces Graph neural networks Neural Turing Machines GANs
  • 10. Meta-learning Learn a learning strategy to adjust well to a new few-shot learning task Data augmentation Synthesize more data from the novel classes to facilitate the regular learning Metric learning Learn a `semantic` embedding space using a distance loss function Few- shot learning Each category is represented by just a few examples Learn to perform classification, detection, regression
  • 11. The n-shot, k-way task • The ability of a algorithm to perform few-shot learning is typically measured by its performance on n-shot, k-way tasks. These are run as follows: 1. A model is given a query sample belonging to a new, previously unseen class 2. It is also given a support set, S, consisting of n examples each from k different unseen classes. 3. The algorithm then has to determine which of the support set classes the query sample belongs to
  • 12. Training a meta learner to learn on each task Meta-Learning Standard learning: datadatadata instances training a learner on the data model Meta learning: datadatatasks mod el mod el learning strategy data knowledge task- specific learner task-agnostic specific classes training data target data task-specific meta- learner datadatatask data meta- learner New task
  • 13. Recurrent meta-learners Matching Networks in Vinyals et.al., NIPS 2016 Distance-based classification: based on similarity between the query and support samples in the embedding space (adaptive metric): 𝑦 = 𝑖 𝑎 𝑥, 𝑥𝑖 𝑦𝑖 , 𝑎 𝑥, 𝑥𝑖 = 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 𝑓 𝑥, 𝑆 , 𝑔 𝑥𝑖, 𝑆 𝑓, 𝑔 - LSTM embeddings of 𝑥 dependent on the support set S • Embedding space is class-agnostic • LSTM attention mechanism adjusts the embedding to the task to be elaborated later Concept of episodes: test conditions in the training. • N new categories • M training examples per category • one query example in {1..N} categories. • Typically, N=5, M=1, 5. Method miniImageNet classification accuracy 1/5 shot Matching networks 43.56 / 55.31
  • 14. Optimization as a model for few-shot learning • META-LEARN LSTM learn a general initialization of the learner (classifier) network that allows for quick convergence of training. Problem: Gradient-based optimization in high capacity classifiers requires many iterative steps over many examples to perform well. Solution: an LSTM-based meta-learner model to learn the exact optimization algorithm to train another learner neural network classifier in the few-shot learning.
  • 15. Optimizers Optimize the learner to perform well after fine-tuning on the task data done by a single (or few) step(s) of Gradient Descent. MAML(Model-Agnostic Meta-Learning) Finn et.al., ICML 2017 Standard objective (task-specific, for task T): min θ ℒT θ , learned via update θ′ = θ − α ∙ 𝛻θℒT(θ) Meta-objective (across tasks): min θ T~p(ℑ) ℒT θ′ , learned via an update θ ← θ − β𝛻θ T~p(ℑ) ℒT θ′ reprinted from Li et.al., 2017 Meta-SGD Li et.al., 2017 “Interestingly, the learning process can continue forever, thus enabling life-long learning, and at any moment, the meta-learner can be applied to learn a learner for any new task.” Meta-SGD Li et.al., 2017 Render α as a vector of size θ. Method miniImageNet classification accuracy 1/5 shot Matching networks 43.56 / 55.31 MAML 48.70 / 63.11 Meta-SGD 54.24 / 70.86
  • 16. Metric Learning Offline training datadata data instan ces deep embeddi ng model Training: achieve good distributions for offline categories Inference: Nearest Neightbour in the embedding space query semantic embedding space class 3 class 1 class 2 class C class A class B New task data d(q,A) d(q,C) d(q,B) classification
  • 17. Metric Learning Relation networks, Sung et.al., CVPR 2018 Use the Siamese Networks principle : • Concatenate embeddings of query and support samples • Relation module is trained to produces score 1 for correct class and 0 for others • Extends to zero-shot learning by replacing support embeddings with semantic features. replicated from Sung et.al., Learning to Compare - Relation network for few-shot learning, CVPR 2018 Method miniImageNet classification accuracy 1/5 shot Matching networks 43.56 / 55.31 MAML 48.70 / 63.11 Relation networks 50.44 / 65.32 Meta-SGD 54.24 / 70.86 LEO 61.76 / 77.59
  • 18. Metric Learning Matching Networks,Vinyals et.al., NIPS 2016 Objective: maximize the cross-entropy for the non-parametric softmax classifier (𝑥,𝑦) 𝑙𝑜𝑔𝑃 𝜃 𝑦 𝑥, 𝑆 , with 𝑃 𝜃 𝑦 𝑥, 𝑆 = 𝑠𝑜𝑓𝑡𝑚𝑎𝑥 𝑐𝑜𝑠 𝑓 𝑥, 𝑆 , 𝑔 𝑥𝑖, 𝑆 Each ca by a sin Prototypical Networks, Snell et al, 2016: Each category is represented by it mean sample (prototype). Objective: maximize the cross-entropy with the prototypes-based Method miniImageNet classification accuracy 1/5 shot Matching networks 43.56 / 55.31 MAML 48.70 / 63.11 Relation networks 50.44 / 65.32 Prototypical Networks 49.42 / 68.20 Meta-SGD 54.24 / 70.86 LEO 61.76 / 77.59
  • 19. Prototypical Networks • In Prototypical Networks Snell et al. apply a compelling inductive bias in the form of class prototypes to achieve impressive few-shot performance — exceeding Matching Networks without the complication of FCE. The key assumption is made is that there exists an embedding in which samples from each class cluster around a single prototypical representation which is simply the mean of the individual samples
  • 20. Sample synthesis Offline stage datadata data instan ces train a synthesizer sampling from class distribution synthesizer model data knowledge On new task data datafew data instances synthesizer model novel classes datadata many data instances train a model task model datadataoffline data
  • 21. More augmentation approaches Δ-encoder Schwartz et.al., NeurIPS 2018 • Use a variant of autoencoder to capture the intra-class difference between two class samples in the latent space. • Transfer class distributions from training to novel classes. Encoder Decoder 𝑍 Sampled target Sampled reference Sampled delta New class reference Synthesized new class example Synthesis Eliyahu Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes and Alex M. Bronstein, 'Delta-encoder: an effective sample synthesis method for few-shot object recognition', NeurIPS 2018.
  • 22. Few shot learning through Information Retrieval lens Goal: Ranking For classification We want to classify the points by finding out which class is most similar one So we are going to rank all the other w.r.t to some similarity measure Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017. https://arxiv.org/abs/1707.02610
  • 23. Mean Average Precision Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017. https://arxiv.org/abs/1707.02610
  • 24.
  • 25. PROBLEMS AHEAD The mean Average Precision is a terrible loss function (for gradient descent purposes)
  • 26.
  • 27.
  • 28. Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017. https://arxiv.org/abs/1707.02610
  • 29. Eleni Triantafillou, Richard Zemel, and Raquel Urtasun. Few-Shot Learning Through an Information Retrieval Lens, In Advances in Neural Information Processing Systems, 2252-2262, 2017. https://arxiv.org/abs/1707.02610
  • 30. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models Egor Zakharov, Aliaksandra Shysheya, Egor Burkov, Victor Lempitsky Submitted on 20 May 2019

Notas del editor

  1. Meta learning: “learn on other problems how to improve learning for our target problem”
  2. References: Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, and Daan Wierstra, Matching networks for one shot learning. In NIPS 2016 Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap, Meta-Learning with Memory-Augmented Neural Networks, ICML 2016
  3. References: Finn, Chelsea, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017 Li2017 - Z. Li, F. Zhou, F. Chen, and H. Li. Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv:1707.09835, 2017
  4. References Sung, Flood & Yang, Yongxin & Zhang, Li & Xiang, Tao & H. S. Torr, Philip & Hospedales, Timothy. Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018
  5. Eliyahu Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Rogerio Feris, Abhishek Kumar, Raja Giryes and Alex M. Bronstein, 'Delta-encoder: an effective sample synthesis method for few-shot object recognition', NeurIPS 2018. Chen, Z., Fu, Y., Zhang, Y., Jiang, Y. G., Xue, X., & Sigal, L. (2018). Semantic feature augmentation in few-shot learning. arXiv preprint arXiv:1804.05298