Large Scale Online Learning of Image Similarity
Through Ranking
from G. Chechik, V. Sharma, U. Shalit, S. Bengio – JMLR 2010

by Lukas Tencer
Motivation
• Needed by applications that compare any kind of data:
    – image, video, web-page, document
• Two levels of similarity:
    – Features (visual for images)
    – Semantic
• Large-scale learning: limited by computational cost, not by
  availability of data
• Which similarity does the user want to express, visual or semantic?
• The presented approach deals with semantic similarity once we have
  visual similarity
• Similarity learning requires pairwise distances, which are not always available
• Instead of pairwise distances, use relative distances; two images are
  close:
    – if they are returned by the same query
    – if they have the same label
Example of query
• Especially a problem in QVE (Query by Visual Example)

• Query:



• Images retrieved for “mount royal park” vs. visually similar images
Motivation II
• Relationship to classification:
   – A similarity measure can be used as a metric for
     classification
   – Good classification infers labels, which induce
     similarity across images
• Constraining the similarity matrix to be
  positive semidefinite:
   – for small data, prevents overfitting
   – for big data with enough samples, can
     be dropped to reduce computational cost
Problem Statement
• Learn a pairwise similarity function S from given data
  on the relative similarity of image pairs
• Given data P and relative similarities $r_{ij} = r(p_i, p_j)$
• We do not have access to all values of r; where
  a value is not available, it is set to 0
• Then $S(p_i, p_j)$ is required to satisfy:
$$S(p_i, p_i^+) > S(p_i, p_i^-), \quad \forall p_i, p_i^+, p_i^- \in P \text{ such that } r(p_i, p_i^+) > r(p_i, p_i^-)$$

• S is parameterized as a bilinear form:
$$S_W(p_i, p_j) = p_i^T W p_j, \quad \text{where } W \in \mathbb{R}^{d \times d}$$
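A minimal sketch of the bilinear similarity S_W(p_i, p_j) = p_i^T W p_j, assuming feature vectors and W are stored as plain Python lists (the function name and the toy matrices are illustrative, not from the slides). It also shows that, without a symmetry or positive-semidefinite constraint on W, the score need not be symmetric.

```python
def bilinear_similarity(p_i, p_j, W):
    """Return S_W(p_i, p_j) = p_i^T W p_j for plain Python lists."""
    d = len(p_i)
    # First compute the vector W p_j, then take its dot product with p_i.
    Wp_j = [sum(W[a][b] * p_j[b] for b in range(d)) for a in range(d)]
    return sum(p_i[a] * Wp_j[a] for a in range(d))


# Toy 2-dimensional example (values chosen only for illustration).
identity = [[1.0, 0.0], [0.0, 1.0]]
asymmetric = [[1.0, 2.0], [0.0, 1.0]]
p, q = [1.0, 0.0], [0.0, 1.0]

print(bilinear_similarity(p, q, identity))    # 0.0 (plain dot product)
print(bilinear_similarity(p, q, asymmetric))  # 2.0
print(bilinear_similarity(q, p, asymmetric))  # 0.0 (S_W is not symmetric here)
```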
Online Algorithm
• Passive-Aggressive (PA) family of online (iterative) learning
  algorithms
   – PA 1:
$$w_{t+1} = \arg\min_{w \in \mathbb{R}^n} \frac{1}{2} \|w - w_t\|^2 \quad \text{such that } l(w; (x_t, y_t)) = 0$$

   – Passive: if the loss is 0, w is left unchanged
   – Aggressive: if the loss is positive, the update enforces
     l(w; (x_t, y_t)) = 0 regardless of the required step size

   – PA 2: trade-off between proximity and the desired
     margin – a constrained optimization problem
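The passive/aggressive behaviour can be sketched for a binary linear classifier. This assumes the hinge loss max(0, 1 - y w·x) with labels in {-1, +1}, which the slide does not state explicitly; the step size loss / ||x||^2 is the smallest step that drives that loss to zero. The function name `pa_step` is hypothetical.

```python
def pa_step(w, x, y):
    """One hard-margin passive-aggressive step for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss (an assumption of this sketch)
    if loss == 0.0:
        return list(w)  # passive: the constraint already holds
    sq_norm = sum(xi * xi for xi in x)
    if sq_norm == 0.0:
        return list(w)  # a zero input cannot change the loss
    tau = loss / sq_norm  # aggressive: smallest step that zeroes the loss
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


w0 = [0.0, 0.0]
w1 = pa_step(w0, [1.0, 2.0], 1)   # loss was 1, so w moves to about [0.2, 0.4]
w2 = pa_step(w1, [1.0, 2.0], 1)   # margin is now about 1, so w barely changes
```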
Online Algorithm II
• We therefore search for S_W that ranks with a safety margin of 1:
$$S_W(p_i, p_i^+) > S_W(p_i, p_i^-) + 1$$
• The hinge loss function is defined as:
$$l_W(p_i, p_i^+, p_i^-) = \max\{0,\; 1 - S_W(p_i, p_i^+) + S_W(p_i, p_i^-)\}$$

$$L_W = \sum_{(p_i, p_i^+, p_i^-) \in P} l_W(p_i, p_i^+, p_i^-)$$
• Then the PA 2 constrained optimization problem is:
$$W^i = \arg\min_W \frac{1}{2} \|W - W^{i-1}\|_{Fro}^2 + C\xi$$

$$\text{such that } l_W(p_i, p_i^+, p_i^-) \le \xi \text{ and } \xi \ge 0$$
  where C is the parameter that controls the trade-off
  between margin enforcement and proximity of the solution to the previous one
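A sketch of one update step for the triplet problem above, under the assumption that the standard passive-aggressive closed form applies: with V = p_i (p_i^+ - p_i^-)^T, the step is W ← W + τV with τ = min(C, l_W / ||V||_Fro^2). The function names are hypothetical and plain Python lists are used to keep the example dependency-free.

```python
def similarity(p, q, W):
    """S_W(p, q) = p^T W q."""
    return sum(p[a] * W[a][b] * q[b]
               for a in range(len(p)) for b in range(len(q)))


def triplet_hinge_loss(p, p_pos, p_neg, W):
    """max{0, 1 - S_W(p, p_pos) + S_W(p, p_neg)}."""
    return max(0.0, 1.0 - similarity(p, p_pos, W) + similarity(p, p_neg, W))


def pa_triplet_step(W, p, p_pos, p_neg, C):
    """Return the updated matrix after one passive-aggressive step."""
    loss = triplet_hinge_loss(p, p_pos, p_neg, W)
    diff = [a - b for a, b in zip(p_pos, p_neg)]
    # ||V||_Fro^2 for the outer product V = p diff^T is ||p||^2 * ||diff||^2.
    sq_norm = sum(x * x for x in p) * sum(x * x for x in diff)
    if loss == 0.0 or sq_norm == 0.0:
        return [row[:] for row in W]  # passive: nothing to correct
    tau = min(C, loss / sq_norm)      # C caps the aggressiveness of the step
    return [[W[a][b] + tau * p[a] * diff[b] for b in range(len(diff))]
            for a in range(len(p))]


# Toy triplet (illustrative values): the identity matrix ranks it wrongly.
W0 = [[1.0, 0.0], [0.0, 1.0]]
query, pos, neg = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
W_large_C = pa_triplet_step(W0, query, pos, neg, C=10.0)
W_small_C = pa_triplet_step(W0, query, pos, neg, C=0.1)
```

With a large C the single step satisfies the margin for this triplet; with a small C the matrix stays closer to W0 and the loss is only reduced.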
Online Algorithm III




• A loss bound can be derived by rewriting the problem
  as a linear classification problem
Sampling strategy
• Uniformly sample pi from P
• Uniformly sample pi+ from images in the same category as pi
• Uniformly sample pi- from images that do not share a
  category with pi
   – pi- can be chosen at random from all images if the number of
     categories and the number of queries are very large
• If the relevance feedback r(pi,pj) is not just a binary function,
  the sampling of positive examples can be changed
  to prioritize samples with higher relevance
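The sampling steps above can be sketched as follows, assuming each image carries a single category label and that the positive example must differ from the query image (an assumption, the slide does not say). The function name `sample_triplet` and the toy labels are hypothetical.

```python
import random


def sample_triplet(labels, rng):
    """Sample indices (i, i_pos, i_neg) from a list of category labels."""
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)
    if len(by_label) < 2 or all(len(v) < 2 for v in by_label.values()):
        raise ValueError("need two categories and one category with two images")
    while True:
        i = rng.randrange(len(labels))  # uniform query image
        positives = [j for j in by_label[labels[i]] if j != i]
        if positives:  # retry when the query is alone in its category
            break
    negatives = [j for j in range(len(labels)) if labels[j] != labels[i]]
    return i, rng.choice(positives), rng.choice(negatives)


labels = ["cat", "cat", "dog", "dog", "bird"]
rng = random.Random(0)  # seeded only to make the sketch repeatable
triplets = [sample_triplet(labels, rng) for _ in range(100)]
```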
Image representation
• Bag-of-words approach (bag of local descriptors)
   – get regions of interest
   – calculate local descriptors
   – treat them independently
• Divide the image into overlapping square blocks
• Extract color and edge descriptors
   – Edge: uniform Local Binary Patterns – differences of intensities
     in a circular neighborhood
       • 2^8 possible sequences = 256-bin histogram
       • Non-uniform sequences can be merged → 59-bin histogram
   – Color: histograms from k-means clustering
       • Train a color codebook and map each block pixel to the closest value in the codebook
   – Concatenate both at the end
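The 59-bin figure can be checked with a short sketch. It uses the usual definition of a uniform pattern (at most two 0/1 transitions when the 8 bits are read circularly); the helper names are hypothetical.

```python
def circular_transitions(pattern, bits=8):
    """Count 0/1 changes between neighbouring bits, wrapping around."""
    count = 0
    for k in range(bits):
        current = (pattern >> k) & 1
        following = (pattern >> ((k + 1) % bits)) & 1
        count += current != following
    return count


# Uniform patterns each get their own bin; all others share one extra bin.
uniform = [p for p in range(256) if circular_transitions(p) <= 2]
bin_of = {p: idx for idx, p in enumerate(uniform)}
shared_bin = len(uniform)


def lbp_bin(pattern):
    """Map an 8-bit LBP code to its histogram bin."""
    return bin_of.get(pattern, shared_bin)


print(len(uniform), shared_bin + 1)  # 58 59
```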
Image representation II
• Aim for a high-dimensional sparse vector representation
• Each local descriptor is therefore treated as a visual term, and
  an image is represented as a binary vector indicating the
  presence/absence of each visual term
• Visual terms are weighted according to term frequency and
  inverse document frequency

• Parameters of the setup:
   –   20 bins for colors
   –   10000 visterm vocabulary size (approx. 70 nonzero values per image)
   –   Blocks of 64x64 pixels, overlapping every 32 pixels
   –   Blocks extracted at different scales, by downscaling images by
       a factor of 1.25 until fewer than 10 blocks remain
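A sketch of the term-frequency / inverse-document-frequency weighting, assuming each image is given as a list of visterm ids. The exact tf and idf variants (relative frequency, natural log of N / document frequency) are assumptions of this sketch, not taken from the slides.

```python
import math
from collections import Counter


def tfidf_vectors(images):
    """Map each image (a list of visterm ids) to a sparse {visterm: weight} dict."""
    n_images = len(images)
    doc_freq = Counter()
    for terms in images:
        doc_freq.update(set(terms))  # count each visterm once per image
    vectors = []
    for terms in images:
        counts = Counter(terms)
        total = len(terms)
        vectors.append({
            term: (count / total) * math.log(n_images / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Two toy "images"; visterm 2 occurs in both, so its weight is 0.
vectors = tfidf_vectors([[1, 1, 2], [2, 3]])
```

Storing only the visterms that occur in an image keeps the vector sparse, in line with the roughly 70 nonzero entries out of 10000 mentioned above.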
Experiments and evaluation
• Tested in 2 settings
   – Caltech256 dataset (30k images)
   – Web-Scale experiment (2.7 M images)
   – (other databases for testing image retrieval: MIRFLICKR-1M,
     Corel5k, Corel30k, UCID)
• Web-scale experiment:

   – Queries from Google Image Search with relevance feedback
   – The stopping condition for training is the value of mean average precision
     (160M iterations), ~ 4000 min on a single CPU
   – Evaluation criteria: mAP and precision at top k
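The two evaluation criteria can be sketched as follows, assuming each query yields a ranked list of 0/1 relevance flags (best match first) and that the ranking contains every relevant item, so average precision is normalized by the number of relevant items found. Function names are hypothetical.

```python
def precision_at_k(relevance, k):
    """Fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(relevance[:k]) / k


def average_precision(relevance):
    """Mean of precision values taken at the rank of each relevant result."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Average of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


rankings = [[1, 0, 1, 0], [0, 1]]
p_at_2 = precision_at_k(rankings[0], 2)       # 0.5
map_score = mean_average_precision(rankings)  # (5/6 + 1/2) / 2
```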
Failure cases
Scalability
•   Comparison with Large Margin Nearest Neighbor (LMNN)
•   Scales linearly with the number of images
Caltech 256 test
Discussion
• Metric learning can help capture semantic relationships once
  visual similarity is available
• Relevance feedback or a semantic similarity measure (class
  modeling) is required to capture semantic similarity
• Compared to raw visual similarity, precision at top k
  and mAP increase
   • recall is hard to measure for databases that are not fully
      annotated
• Online metric learning is an ongoing research problem (Davis 2007) (Jain
  2008) (Chechik 2010) and, even though applied here to images, can
  be used in other fields to capture semantic similarity
   • Images: object semantics vs. visual features
   • Documents: topics vs. textual features (dtf, tf-idf)
   • SBIR: relative object mapping vs. sketch features
Thank you for your attention
              Available at: http://www.slideshare.net/lukastencer

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 

Large Scale Online Learning of Image Similarity Through Ranking

  • 1. Large Scale Online Learning of Image Similarity Through Ranking, from G. Chechik, V. Sharma, U. Shalit, S. Bengio – JMLR 2010, by Lukas Tencer
  • 2. Motivation
      • Needed by applications that compare any kind of data:
          – images, videos, web pages, documents
      • Two levels of similarity:
          – Features (visual for images)
          – Semantic
      • Large-scale learning is limited by computational cost, not by the availability of data
      • Which similarity does the user want to express, visual or semantic?
      • The presented approach deals with semantic similarity once visual similarity is available
      • Similarity learning requires pairwise distances, which are not always available
      • Instead of pairwise distances, use relative similarity; two images are close:
          – if they are returned by the same query
          – if they have the same label
  • 3. Example of query
      • Especially a problem in QVE (Query by Visual Example)
      • Query:
      • Images retrieved for “mount royal park” vs. visually similar images
  • 4. Motivation II
      • Relationship to classification:
          – A similarity measure can be used as a metric for classification
          – A good classifier infers labels, which induce similarity across images
      • Positive semidefinite constraint on the similarity matrix:
          – for small data, it prevents overfitting
          – for big data with enough samples, it can be dropped to reduce computational cost
  • 5. Problem Statement
      • Learn a pairwise similarity function S from relative similarities between pairs of images
      • Given: data P and relative similarities $r_{ij} = r(p_i, p_j)$
      • We do not have access to all values of r; where a value is not available, it is set to 0
      • Then $S(p_i, p_j)$ is required to satisfy:
          $S(p_i, p_i^+) > S(p_i, p_i^-), \quad \forall p_i, p_i^+, p_i^- \in P$ such that $r(p_i, p_i^+) > r(p_i, p_i^-)$
      • S takes the bilinear form:
          $S_W(p_i, p_j) = p_i^T W p_j$, where $W \in \mathbb{R}^{d \times d}$
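As a rough illustration of the bilinear form S_W(p_i, p_j) = p_i^T W p_j, here is a minimal pure-Python sketch. The function name and the toy vectors are hypothetical and not taken from the slides; with W set to the identity, the score reduces to a plain dot product.

```python
def bilinear_similarity(p_i, W, p_j):
    """Return p_i^T W p_j, with W given as a list of d rows of length d."""
    return sum(p_i[a] * W[a][b] * p_j[b]
               for a in range(len(p_i))
               for b in range(len(p_j)))

# Hypothetical toy data: one query and two candidate images in 2 dimensions.
identity = [[1.0, 0.0], [0.0, 1.0]]
query, positive, negative = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]

s_pos = bilinear_similarity(query, identity, positive)
s_neg = bilinear_similarity(query, identity, negative)
# The ranking constraint from the slide asks for s_pos > s_neg.
```

Note that W is not required to be symmetric or positive semidefinite here, so S_W is a similarity score rather than a distance.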
  • 6. Online Algorithm
      • Passive-Aggressive family of online (iterative) learning algorithms
          – PA 1: $w_{t+1} = \arg\min_{w \in \mathbb{R}^n} \frac{1}{2}\|w - w_t\|^2$ such that $l(w; (x_t, y_t)) = 0$
          – Passive: if the loss is 0, the weights are left unchanged
          – Aggressive: if the loss is positive, the update enforces $l(w; (x_t, y_t)) = 0$ regardless of the step size required
          – PA 2: trades off proximity to the previous solution against the desired margin; a constrained optimization problem
  • 7. Online Algorithm II
      • So we are searching for S with a safety margin of 1:
          $S_W(p_i, p_i^+) > S_W(p_i, p_i^-) + 1$
      • The hinge loss function is defined as:
          $l_W(p_i, p_i^+, p_i^-) = \max\{0, 1 - S_W(p_i, p_i^+) + S_W(p_i, p_i^-)\}$
          $L_W = \sum_{(p_i, p_i^+, p_i^-) \in P} l_W(p_i, p_i^+, p_i^-)$
      • Then the PA 2 constrained optimization problem is:
          $W^i = \arg\min_W \frac{1}{2}\|W - W^{i-1}\|_{Fro}^2 + C\xi$ such that $l_W(p_i, p_i^+, p_i^-) \le \xi$ and $\xi \ge 0$
      • where C is the parameter that controls the trade-off between margin enforcement and proximity to the previous solution
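The slack-constrained problem above is usually solved in closed form in the passive-aggressive literature: step along V = p_i (p_i^+ - p_i^-)^T with step size tau = min(C, l_W / ||V||_Fro^2). That closed form is not written on the slides, so the sketch below should be read as an assumption-laden illustration; all names and the toy triplet are hypothetical.

```python
def similarity(p, W, q):
    """Bilinear score p^T W q."""
    return sum(p[a] * W[a][b] * q[b]
               for a in range(len(p)) for b in range(len(q)))

def hinge_loss(W, p, p_pos, p_neg):
    """max{0, 1 - S_W(p, p+) + S_W(p, p-)} for one triplet."""
    return max(0.0, 1.0 - similarity(p, W, p_pos) + similarity(p, W, p_neg))

def pa_step(W, p, p_pos, p_neg, C):
    """One passive-aggressive update; returns a new matrix, W is not mutated."""
    loss = hinge_loss(W, p, p_pos, p_neg)
    if loss == 0.0:
        return W  # passive: the margin is already satisfied
    diff = [a - b for a, b in zip(p_pos, p_neg)]
    V = [[p_a * d for d in diff] for p_a in p]  # outer product p (p+ - p-)^T
    norm_sq = sum(v * v for row in V for v in row)
    if norm_sq == 0.0:
        return W  # degenerate triplet (p+ == p- or p == 0): no useful direction
    tau = min(C, loss / norm_sq)  # aggressive step, capped by C
    return [[w + tau * v for w, v in zip(w_row, v_row)]
            for w_row, v_row in zip(W, V)]

# Hypothetical triplet on which the identity matrix ranks the negative first.
W0 = [[1.0, 0.0], [0.0, 1.0]]
p, p_pos, p_neg = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]

loss_before = hinge_loss(W0, p, p_pos, p_neg)
loss_capped = hinge_loss(pa_step(W0, p, p_pos, p_neg, C=0.1), p, p_pos, p_neg)
loss_full = hinge_loss(pa_step(W0, p, p_pos, p_neg, C=10.0), p, p_pos, p_neg)
```

With a large C the step is big enough to satisfy the margin on this triplet in one update; with a small C the update only moves part of the way, which is the proximity side of the trade-off.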
  • 8. Online Algorithm III
      • A loss bound can be derived by rewriting the problem as a linear classification problem
  • 9. Sampling strategy
      • Uniformly sample $p_i$ from P
      • Uniformly sample $p_i^+$ from the images in the same category as $p_i$
      • Uniformly sample $p_i^-$ from the images that do not share a category with $p_i$
          – $p_i^-$ can be chosen at random from all images if the number of categories and the number of queries are very large
      • If the relevance feedback $r(p_i, p_j)$ is not just a binary function, the sampling of positive examples can be changed to prioritize samples with higher relevance
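The sampling strategy can be sketched as follows for the simple case of one category label per image. The `sample_triplet` helper, the choice to exclude the query itself from the positives, and the toy labels are assumptions made for illustration only.

```python
import random

def sample_triplet(labels, rng):
    """Draw (p, p_pos, p_neg) image ids from a dict of image id -> category.

    Assumes at least one category has two or more images and that at least
    two categories exist; otherwise this loop would never terminate.
    """
    ids = sorted(labels)
    while True:
        p = rng.choice(ids)  # uniform over all images
        positives = [i for i in ids if i != p and labels[i] == labels[p]]
        negatives = [i for i in ids if labels[i] != labels[p]]
        if positives and negatives:
            return p, rng.choice(positives), rng.choice(negatives)

# Hypothetical toy labels.
toy_labels = {"a1": "cat", "a2": "cat", "b1": "dog", "b2": "dog", "c1": "car"}
rng = random.Random(0)
triplet = sample_triplet(toy_labels, rng)
```

Building the positive and negative lists on every call is linear in the number of images; a large-scale version would precompute an index from category to image ids.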
  • 10. Image representation
      • Bag-of-words approach (bag of local descriptors):
          – get regions of interest
          – calculate local descriptors
          – treat them independently
      • Divide the image into overlapping square blocks
      • Extract color and edge descriptors:
          – Edge: uniform Local Binary Patterns, based on differences of intensities in a circular neighborhood
              • 2^8 possible sequences = 256-bin histogram
              • Non-uniform sequences can be merged → 59-bin histogram
          – Color: histograms from k-means clustering
              • Train a color codebook and map each block pixel to the closest value in the codebook
          – Concatenate the two descriptors at the end
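The step from 256 bins to 59 bins can be checked directly: an 8-bit pattern is called uniform when it has at most two 0/1 transitions read circularly, there are 58 such patterns, and all remaining patterns share one extra bin. The sketch below is illustrative; the helper names are hypothetical, and the 3x3 square ring is a simplifying stand-in for a true circular neighborhood.

```python
def transitions(code, bits=8):
    """Count 0/1 changes when the bit pattern is read circularly."""
    return sum(((code >> k) & 1) != ((code >> ((k + 1) % bits)) & 1)
               for k in range(bits))

# Uniform patterns get one bin each; every non-uniform pattern shares the last bin.
uniform_codes = [c for c in range(256) if transitions(c) <= 2]
bin_of = {code: index for index, code in enumerate(uniform_codes)}
num_bins = len(uniform_codes) + 1

def lbp_bin(code):
    """Histogram bin for an 8-bit LBP code."""
    return bin_of.get(code, len(uniform_codes))

def lbp_code(patch):
    """LBP code of a 3x3 patch: threshold the 8 neighbours against the centre."""
    centre = patch[1][1]
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << k for k, value in enumerate(ring) if value >= centre)

flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
flat_bin = lbp_bin(lbp_code(flat_patch))
```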
  • 11. Image representation II
      • Aim for a high-dimensional sparse vector representation
      • Each local descriptor is thus treated as a visual term (visterm), and an image is represented as a binary vector indicating the presence or absence of each visual term
      • Visual terms are weighted according to term frequency and inverse document frequency
      • Parameters of the setup:
          – 20 bins for colors
          – 10,000 visterm vocabulary size (approx. 70 nonzero values per image)
          – Blocks of 64x64 pixels, overlapping every 32 pixels
          – Blocks extracted at different scales, by repeatedly downscaling the image by a factor of 1.25 until fewer than 10 blocks remain
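A sparse tf-idf weighting of visterms might look like the sketch below. The slides do not say which tf-idf variant was used, so the formula here (relative term frequency times the log of inverse document frequency) is one common choice taken as an assumption, and the toy visterm ids are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(images):
    """Map each image (a list of visterm ids) to a sparse dict id -> weight."""
    n_images = len(images)
    # Document frequency: in how many images each visterm occurs at least once.
    df = Counter(term for terms in images for term in set(terms))
    vectors = []
    for terms in images:
        tf = Counter(terms)
        vectors.append({term: (count / len(terms)) * math.log(n_images / df[term])
                        for term, count in tf.items()})
    return vectors

# Hypothetical toy corpus: two images described by visterm ids.
toy_images = [[1, 1, 2], [2, 3]]
toy_vectors = tfidf_vectors(toy_images)
```

Only visterms that occur in an image appear as keys, so each vector stays sparse regardless of vocabulary size; a visterm present in every image gets weight 0 under this variant.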
  • 12. Experiments and evaluation
      • Tested in 2 settings:
          – Caltech256 dataset (30k images)
          – Web-scale experiment (2.7M images)
          – (other databases for testing image retrieval: MIRFLICKR-1M, Corel5k, Corel30k, UCID)
      • Web-scale experiment:
          – Queries from Google Image Search and relevance feedback
          – The stopping condition for training is the value of mean average precision (160M iterations), ~4000 min on a single CPU
          – Evaluation criteria: mAP and precision at top k
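The two evaluation criteria can be sketched as follows, with each ranking given as a list of 0/1 relevance flags in ranked order. The function names are hypothetical, and average precision is normalized here by the number of relevant items present in the list, which assumes every relevant item appears somewhere in the ranking.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top-k results that are relevant."""
    return sum(ranked_relevance[:k]) / k

def average_precision(ranked_relevance):
    """Mean of precision values taken at the rank of each relevant result."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Average of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Hypothetical rankings for two queries (1 = relevant, 0 = not relevant).
toy_rankings = [[1, 0, 1, 0], [0, 1]]
toy_map = mean_average_precision(toy_rankings)
toy_p_at_2 = precision_at_k(toy_rankings[0], 2)
```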
  • 16. Scalability
      • Comparison with Large Margin Nearest Neighbor (LMNN)
      • Scales linearly with the number of images
  • 18. Discussion
      • Metric learning can help capture semantic relationships once visual similarity is available
      • Relevance feedback or a semantic similarity measure (class modeling) is required to capture semantic similarity
      • Compared with raw visual similarity, precision at top k and mAP increase
      • Recall is hard to measure for databases that are not fully annotated
      • Online metric learning is an ongoing research problem (Davis 2007) (Jain 2008) (Chechik 2010) and, although applied here to images, can be used in other fields to capture semantic similarity:
          – Images: object semantics vs. visual features
          – Documents: topics vs. textual features (dtf, tf-idf)
          – SBIR: relative object mapping vs. sketch features
  • 19. Thank you for your attention
      • Available at: http://www.slideshare.net/lukastencer