SlideShare una empresa de Scribd logo
1 de 63
Descargar para leer sin conexión
DEEP
LEARNING
WORKSHOP
Dublin City University
28-29 April 2017
Eva Mohedano
eva.mohedano@insight-centre.org
PhD Student
Insight Centre for Data Analytics
Dublin City University
Content-based
Image Retrieval
Day 2 Lecture 6
Overview
● What is content based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
2
Overview
● What is content based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
3
The problem: query by example
Given:
● An example query image that
illustrates the user's information
need
● A very large dataset of images
Task:
● Rank all images in the dataset
according to how likely they
are to fulfil the user's
information need
4
Classification
5
Query: This chair
Results from dataset classified as “chair”
Retrieval
6
Query: This chair
Results from dataset ranked by similarity to the query
Overview
● What is content based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
7
The retrieval pipeline
8
Image RepresentationsQuery
Image
Dataset
Image Matching Ranked List
Similarity score Image
.
.
.
0.98
0.97
0.10
0.01
v = (v1
, …, vn
)
v1
= (v11
, …, v1n
)
vk
= (vk1
, …, vkn
)
...
Euclidean distance
Cosine Similarity
Similarity
Metric .
.
.
The classic SIFT retrieval pipeline
9
v1
= (v11
, …, v1n
)
vk
= (vk1
, …, vkn
)
...
variable number of
feature vectors per image
Bag of Visual
Words
N-Dimensional
feature space
M visual words
(M clusters)
INVERTED FILE
word Image ID
1 1, 12,
2 1, 30, 102
3 10, 12
4 2,3
6 10
...
Large vocabularies (50k-1M)
Very fast!
Typically used with SIFT features
Convolutional neural networks and retrieval
10
Classification Object Detection Segmentation
Overview
● What is content based image retrieval?
● The classic SIFT retrieval pipeline
● Using off-the-shelf CNN features for retrieval
● Learning representations for retrieval
11
Off-the-shelf CNN representations
12Razavian et al. CNN features off-the-shelf: an astounding baseline for recognition. In CVPR
Babenko et al. Neural codes for image retrieval. ECCV 2014
FC layers as global feature
representation
Off-the-shelf CNN representations
Neural codes for retrieval [1]
● FC7 layer (4096D)
● L2
norm + PCA whitening + L2
norm
● Euclidean distance
● Only better than traditional SIFT approach after fine tuning on similar domain image dataset.
CNN features off-the-shelf: an astounding baseline for recognition [2]
● Extending Babenko’s approach with spatial search
● Several features extracted by image (sliding window approach)
● Really good results but too computationally expensive for practical situations
[1] Babenko et al, Neural codes for image retrieval CVPR 2014
[2] Razavian et al, CNN features off-the-shelf: an astounding baseline for recognition, CVPR 2014
13
Off-the-shelf CNN representations
14
sum/max pool conv features across filters
Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015
Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv:1511.05879.
Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv:1512.04065.
Off-the-shelf CNN representations
15
Descriptors from convolutional layers
Off-the-shelf CNN representations
16
Pooling features on conv layers allow to describe specific parts of an image
Off-the-shelf CNN representations
[2] Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015
[1] Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016
Apply spatial weighting on the features
before pooling them
Sum/max pooling operation of a conv
layer
[1] weighting based on ‘strength’ of
the local features
[2] weighting based on the
distance to the center of
the image
17
R-MAC
Regional Maximum Activation of Convolution.
Settings
- Fully convolutional off-the-shelf VGG16
- Pool5
- Spatial Max pooling
- High Resolution images
- Global descriptor based on aggregating region vectors
- Sliding window approach
R-MAC
Region1
R-MAC
Region1
Region2
R-MAC
Region1
Region2
Region3
R-MAC
Region1
Region2
Region3
Region4
R-MAC
Region1
Region2
Region3
Region4
...
R-MAC
Region1
Region2
Region3
Region4
...
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
Region vectors
512D
l2-PCAw-l2
Region1
Region2
Region3
Region4
...
RegionN
Global
vector
512D
Off-the-shelf CNN representations
27
BoW, VLAD encoding of conv features
Ng et al. Exploiting local features from deep networks for image retrieval. CVPR Workshops 2015
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
Off-the-shelf CNN representations
28
(336x256)
Resolution
conv5_1 from
VGG16
(42x32)
25K centroids 25K-D vector
Descriptors from convolutional layers
Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
Off-the-shelf CNN representations
29
(336x256)
Resolution
conv5_1 from
VGG16
(42x32)
25K centroids 25K-D vector
Off-the-shelf CNN representations
30
Paris Buildings 6k Oxford Buildings 5k
TRECVID Instance Search 2013
(subset of 23k frames)
[7] Kalantidis et al. Cross-dimensional Weighting for Aggregated
Deep Convolutional Features. arXiv:1512.04065.
Mohedano et al. Bags of Local Convolutional Features for Scalable
Instance Search. ICMR 2016
Off-the-shelf CNN representations
CNN representations
● L2
Normalization + PCA whitening + L2
Normalization
● Cosine similarity
● Convolutional features better than fully connected features
● Convolutional features keep spatial information → retrieval + object location
● Convolutional layers allows custom input size.
● If data labels available, fine tuning the network to the image domain improves
CNN representations.
31
CNN as a feature extractor
source : Zheng, SIFT Meets CNN: A Decade Survey of Instance Retrieval , 2016
32
Overview
● What is content based image retrieval?
● The classic SIFT retrieval pipeline
● Using off the shelf CNN features for retrieval
● Learning representations for retrieval
33
What loss function should we use?
Learning representations for retrieval
Siamese network: network to learn a function that maps input
patterns into a target space such that L2
norm in the target
space approximates the semantic distance in the input space.
Applied in:
● Dimensionality reduction [1]
● Face verification [2]
● Learning local image representations [3]
35
[1] Song et al.: Deep metric learning via lifted structured feature embedding. CVPR 2015
[2] Chopra et al. Learning a similarity metric discriminatively, with application to face verification CVPR' 2005
[3] Simo-Serra et al. Fracking deep convolutional image descriptors. CoRR, abs/1412.6537, 2014
Learning representations for retrieval
Siamese network: network to learn a function that maps input
patterns into a target space such that L2
norm in the target
space approximates the semantic distance in the input space.
36
Image credit: Simo-Serra et al. Fracking deep convolutional image descriptors. CoRR 2014
Margin: m
Hinge loss
Negative pairs: if
nearer than the
margin, pay a linear
penalty
The margin is a
hyperparameter
37
Learning representations for retrieval
Siamese network with triplet loss: loss function minimizes distance between query and positive
and maximizes distance between query and negative
38Schroff et al. FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015
CNN
w w
a p n
L2 embedding space
Triplet Loss
CNN CNN
How do we get the training
data?
Exploring Image datasets with GPS coordinates (Google Time Machine Data)
53.386037N, -6.255495E
Geolocation
DCU
Dataset Images with
Geolocations
Ranked List
N E
N E
N E
N E
N E
.
.
.
Visual Similarity
Top N
Place recognition problem
casted as a image retrieval
?
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
Radenović, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016
Gordo, Deep Image Retrieval: Learning global representations for image search, CVPR 2016
[1] Schonberger From single image query to detailed 3D reconstruction
Automatic data cleaning
Strong baseline (SIFT) +
geometric verification + 3D
camera estimation[1]
Further manual inspection
Further manual inspection
Radenović, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016
Generating training pairs
Select the hardest negative
Select triplet which
generates larger loss
Jayaraman and Grauman, Slow and steady feature analysis: higher order temporal coherence in video, ECCV 2016
NetVLAD
Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
VLAD
Local features
Vectors dimension D
Sparse local feature extraction Dense Local Feature extraction
Feature Space D-dim
Conv layer i
D
H
W
N = H*W local
descriptors of
dimension D
VLAD
Local features
Vectors dimension D
K cluster centers
Visual Vocabulary
(K vectors with dimension D)
Learnt from data with K-means
Feature Space D-dim
Local features
Vectors dimension D
K cluster centers
Visual Vocabulary
(K vectors with dimension D)
Learnt from data with K-means
Feature Space D-dim
VLAD vector
Sum of all residuals of the
assignments to the cluster 1
d1 d2
d3
d4
Dim D
Dim K x D
VLAD
V Matrix VLAD (residuals)
D
k
Hard Assignment
Not differentiable !
j
Soft Assignment
N Local features {Xi}
D
i
K Clusters features {Ck}
D
k
0.99
0.98
0.05
0.01
0.00
Cluster k=1
j
j
1
1
0
0
0
Softmax function :)
VLAD
VGG16 and Alexnet
Consider the positive
example closer to the query
Consider all negative
candidates
With distance inferior to a
certain margin
Fine-tuned R-MAC
R-MAC
Region1
Region2
Region3
Region4
...
RegionN
Region vectors
512D
l2-PCAw-l2
Region1
Region2
Region3
Region4
...
RegionN
Global
vector
512D
Learning representations for retrieval
54
Generated dataset to create the training triplets
Dataset: Landmarks dataset:
● 214K images of 672 famous landmark site.
● Dataset processing based on a matching
baseline: SIFT + Hessian-Affine keypoint
detector.
● Important to select the “useful” triplets.
Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
Learning representations for retrieval
55
Learning representations for retrieval
56
Comparison between R-MAC from off-the-shelf network and R-MAC retrained for retrieval
Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
Summary
57
● Pre-trained CNN are useful to generate image descriptors for retrieval
● Convolutional layers allow us to encode local information
● Knowing how to rank similarity is the primary task in retrieval
● Designing CNN architectures to learn how to rank
Questions?
58
Summary state-of-the-art
Layer Method Dim Oxford 5k Paris 6k
Off-the-shelf
CNN
representations
Global
representation
fc Neural Codes 128 0.433 -
conv
SPoC 256 0.657 -
CroW 256 0.684 0.765
R-MAC 256 0.561 0.729
conv
NetVLAD 256 0.555 0.643
BoCF 25k(*) 0.739 0.82
Multiple
representations
fc CNN-astounding 4-15k 0.68 0.795
conv
CNN-astounding2 32k 0.843 0.879
Learning
representations
for retrieval
Global
representation
conv
MAC 512 0.774 0.824
regionProposal+MAC 512 0.813 0.855
NetVLAD 512 0.676 0.749
59
PARIS dataset
Oxford dataset
TRECVID dataset
INSTRE dataset

Más contenido relacionado

La actualidad más candente

Image segmentation
Image segmentationImage segmentation
Image segmentationKuppusamy P
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance SegmentationHichem Felouat
 
Lec10: Medical Image Segmentation as an Energy Minimization Problem
Lec10: Medical Image Segmentation as an Energy Minimization ProblemLec10: Medical Image Segmentation as an Energy Minimization Problem
Lec10: Medical Image Segmentation as an Energy Minimization ProblemUlaş Bağcı
 
Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)Hsing-chuan Hsieh
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basicsBrodmann17
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Super Resolution
Super ResolutionSuper Resolution
Super Resolutionalokahuti
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution OverviewLEE HOSEONG
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Basit Rafiq
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaPreferred Networks
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNNJunho Cho
 
Codetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningCodetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningMatthew Opala
 
[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency
[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency
[論文解説]Unsupervised monocular depth estimation with Left-Right ConsistencyRyutaro Yamauchi
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classificationWael Badawy
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnnRahat Yasir
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationVikas Jain
 
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View SynthesisKento Doi
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learningSushant Shrivastava
 

La actualidad más candente (20)

Image segmentation
Image segmentationImage segmentation
Image segmentation
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance Segmentation
 
Lec10: Medical Image Segmentation as an Energy Minimization Problem
Lec10: Medical Image Segmentation as an Energy Minimization ProblemLec10: Medical Image Segmentation as an Energy Minimization Problem
Lec10: Medical Image Segmentation as an Energy Minimization Problem
 
Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)Introduction to Grad-CAM (complete version)
Introduction to Grad-CAM (complete version)
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Super Resolution
Super ResolutionSuper Resolution
Super Resolution
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution Overview
 
Edge detection
Edge detectionEdge detection
Edge detection
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi Kerola
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNN
 
Codetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningCodetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep Learning
 
[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency
[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency
[論文解説]Unsupervised monocular depth estimation with Left-Right Consistency
 
Computer Vision image classification
Computer Vision image classificationComputer Vision image classification
Computer Vision image classification
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
 
Object detection
Object detectionObject detection
Object detection
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[解説スライド] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 

Destacado

Content based image retrieval(cbir)
Content based image retrieval(cbir)Content based image retrieval(cbir)
Content based image retrieval(cbir)paddu123
 
Pattern recognition voice biometrics
Pattern recognition voice biometricsPattern recognition voice biometrics
Pattern recognition voice biometricsMazin Alwaaly
 
Content based image retrieval
Content based image retrievalContent based image retrieval
Content based image retrievalrubaiyat11
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image RetrievalAman Patel
 
Content Based Image and Video Retrieval Algorithm
Content Based Image and Video Retrieval AlgorithmContent Based Image and Video Retrieval Algorithm
Content Based Image and Video Retrieval AlgorithmAkshit Bum
 
Cbir final ppt
Cbir final pptCbir final ppt
Cbir final pptrinki nag
 
Need for Software Engineering
Need for Software EngineeringNeed for Software Engineering
Need for Software EngineeringUpekha Vandebona
 
Multimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital librariesMultimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital librariesMazin Alwaaly
 
CBIR For Medical Imaging...
CBIR For Medical  Imaging...CBIR For Medical  Imaging...
CBIR For Medical Imaging...Isha Sharma
 
Literature Review on Content Based Image Retrieval
Literature Review on Content Based Image RetrievalLiterature Review on Content Based Image Retrieval
Literature Review on Content Based Image RetrievalUpekha Vandebona
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image RetrievalPrem kumar
 
Content based image retrieval (cbir) using
Content based image retrieval (cbir) usingContent based image retrieval (cbir) using
Content based image retrieval (cbir) usingijcsity
 
Content based image retrieval using clustering Algorithm(CBIR)
Content based image retrieval using clustering Algorithm(CBIR)Content based image retrieval using clustering Algorithm(CBIR)
Content based image retrieval using clustering Algorithm(CBIR)Raja Sekar
 

Destacado (20)

Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Content based image retrieval(cbir)
Content based image retrieval(cbir)Content based image retrieval(cbir)
Content based image retrieval(cbir)
 
Pattern recognition voice biometrics
Pattern recognition voice biometricsPattern recognition voice biometrics
Pattern recognition voice biometrics
 
Content based image retrieval
Content based image retrievalContent based image retrieval
Content based image retrieval
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image Retrieval
 
Image retrieval
Image retrievalImage retrieval
Image retrieval
 
CBIR
CBIRCBIR
CBIR
 
Content Based Image and Video Retrieval Algorithm
Content Based Image and Video Retrieval AlgorithmContent Based Image and Video Retrieval Algorithm
Content Based Image and Video Retrieval Algorithm
 
Cbir final ppt
Cbir final pptCbir final ppt
Cbir final ppt
 
Need for Software Engineering
Need for Software EngineeringNeed for Software Engineering
Need for Software Engineering
 
Multimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital librariesMultimedia content based retrieval in digital libraries
Multimedia content based retrieval in digital libraries
 
CBIR For Medical Imaging...
CBIR For Medical  Imaging...CBIR For Medical  Imaging...
CBIR For Medical Imaging...
 
Literature Review on Content Based Image Retrieval
Literature Review on Content Based Image RetrievalLiterature Review on Content Based Image Retrieval
Literature Review on Content Based Image Retrieval
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image Retrieval
 
Content-based Image Retrieval
Content-based Image RetrievalContent-based Image Retrieval
Content-based Image Retrieval
 
Content based image retrieval (cbir) using
Content based image retrieval (cbir) usingContent based image retrieval (cbir) using
Content based image retrieval (cbir) using
 
Cbir ‐ features
Cbir ‐ featuresCbir ‐ features
Cbir ‐ features
 
Ga
GaGa
Ga
 
CBIR
CBIRCBIR
CBIR
 
Content based image retrieval using clustering Algorithm(CBIR)
Content based image retrieval using clustering Algorithm(CBIR)Content based image retrieval using clustering Algorithm(CBIR)
Content based image retrieval using clustering Algorithm(CBIR)
 

Similar a Deep Learning Workshop Content-Based Image Retrieval Lecture

Large Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfLarge Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfSamuCerezo
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learningNAVER Engineering
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術CHENHuiMei
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...IRJET Journal
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Jihong Kang
 
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...Hyeongmin Lee
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding捷恩 蔡
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in VisionSangmin Woo
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...CSCJournals
 
Aerial detection part3
Aerial detection part3Aerial detection part3
Aerial detection part3ssuser456ad6
 
Convolutional Patch Representations for Image Retrieval An unsupervised approach
Convolutional Patch Representations for Image Retrieval An unsupervised approachConvolutional Patch Representations for Image Retrieval An unsupervised approach
Convolutional Patch Representations for Image Retrieval An unsupervised approachUniversitat de Barcelona
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
Improving region based CNN object detector using bayesian optimization
Improving region based CNN object detector using bayesian optimizationImproving region based CNN object detector using bayesian optimization
Improving region based CNN object detector using bayesian optimizationAmgad Muhammad
 

Similar a Deep Learning Workshop Content-Based Image Retrieval Lecture (20)

Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
Image Retrieval (D4L5 2017 UPC Deep Learning for Computer Vision)
 
Large Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfLarge Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdf
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...
IRJET- Remote Sensing Image Retrieval using Convolutional Neural Network with...
 
Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval Class Weighted Convolutional Features for Image Retrieval
Class Weighted Convolutional Features for Image Retrieval
 
Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in Vision
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
 
Aerial detection part3
Aerial detection part3Aerial detection part3
Aerial detection part3
 
Convolutional Patch Representations for Image Retrieval An unsupervised approach
Convolutional Patch Representations for Image Retrieval An unsupervised approachConvolutional Patch Representations for Image Retrieval An unsupervised approach
Convolutional Patch Representations for Image Retrieval An unsupervised approach
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Improving region based CNN object detector using bayesian optimization
Improving region based CNN object detector using bayesian optimizationImproving region based CNN object detector using bayesian optimization
Improving region based CNN object detector using bayesian optimization
 
Depth estimation using deep learning
Depth estimation using deep learningDepth estimation using deep learning
Depth estimation using deep learning
 

Más de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Más de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Último

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in collegessuser7a7cd61
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 

Último (20)

Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 

Deep Learning Workshop Content-Based Image Retrieval Lecture

  • 1. DEEP LEARNING WORKSHOP Dublin City University 28-29 April 2017 Eva Mohedano eva.mohedano@insight-centre.org PhD Student Insight Centre for Data Analytics Dublin City University Content-based Image Retrieval Day 2 Lecture 6
  • 2. Overview ● What is content based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval 2
  • 3. Overview ● What is content based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval 3
  • 4. The problem: query by example Given: ● An example query image that illustrates the user's information need ● A very large dataset of images Task: ● Rank all images in the dataset according to how likely they are to fulfil the user's information need 4
  • 5. Classification 5 Query: This chair Results from dataset classified as “chair”
  • 6. Retrieval 6 Query: This chair Results from dataset ranked by similarity to the query
  • 7. Overview ● What is content based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval 7
  • 8. The retrieval pipeline 8 Image RepresentationsQuery Image Dataset Image Matching Ranked List Similarity score Image . . . 0.98 0.97 0.10 0.01 v = (v1 , …, vn ) v1 = (v11 , …, v1n ) vk = (vk1 , …, vkn ) ... Euclidean distance Cosine Similarity Similarity Metric . . .
  • 9. The classic SIFT retrieval pipeline 9 v1 = (v11 , …, v1n ) vk = (vk1 , …, vkn ) ... variable number of feature vectors per image Bag of Visual Words N-Dimensional feature space M visual words (M clusters) INVERTED FILE word Image ID 1 1, 12, 2 1, 30, 102 3 10, 12 4 2,3 6 10 ... Large vocabularies (50k-1M) Very fast! Typically used with SIFT features
  • 10. Convolutional neural networks and retrieval 10 Classification Object Detection Segmentation
  • 11. Overview ● What is content based image retrieval? ● The classic SIFT retrieval pipeline ● Using off-the-shelf CNN features for retrieval ● Learning representations for retrieval 11
  • 12. Off-the-shelf CNN representations 12Razavian et al. CNN features off-the-shelf: an astounding baseline for recognition. In CVPR Babenko et al. Neural codes for image retrieval. ECCV 2014 FC layers as global feature representation
  • 13. Off-the-shelf CNN representations Neural codes for retrieval [1] ● FC7 layer (4096D) ● L2 norm + PCA whitening + L2 norm ● Euclidean distance ● Only better than traditional SIFT approach after fine tuning on similar domain image dataset. CNN features off-the-shelf: an astounding baseline for recognition [2] ● Extending Babenko’s approach with spatial search ● Several features extracted by image (sliding window approach) ● Really good results but too computationally expensive for practical situations [1] Babenko et al, Neural codes for image retrieval CVPR 2014 [2] Razavian et al, CNN features off-the-shelf: an astounding baseline for recognition, CVPR 2014 13
  • 14. Off-the-shelf CNN representations 14 sum/max pool conv features across filters Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015 Tolias et al. Particular object retrieval with integral max-pooling of CNN activations. arXiv:1511.05879. Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv:1512.04065.
  • 16. Off-the-shelf CNN representations 16 Pooling features on conv layers allow to describe specific parts of an image
  • 17. Off-the-shelf CNN representations [2] Babenko and Lempitsky, Aggregating local deep features for image retrieval. ICCV 2015 [1] Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. ECCV 2016 Apply spatial weighting on the features before pooling them Sum/max pooling operation of a conv layer [1] weighting based on ‘strength’ of the local features [2] weighting based on the distance to the center of the image 17
  • 18. R-MAC Regional Maximum Activation of Convolution. Settings - Fully convolutional off-the-shelf VGG16 - Pool5 - Spatial Max pooling - High Resolution images - Global descriptor based on aggregating region vectors - Sliding window approach
  • 27. Off-the-shelf CNN representations 27 BoW, VLAD encoding of conv features Ng et al. Exploiting local features from deep networks for image retrieval. CVPR Workshops 2015 Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
  • 28. Off-the-shelf CNN representations 28 (336x256) Resolution conv5_1 from VGG16 (42x32) 25K centroids 25K-D vector Descriptors from convolutional layers Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
  • 29. Off-the-shelf CNN representations 29 (336x256) Resolution conv5_1 from VGG16 (42x32) 25K centroids 25K-D vector
  • 30. Off-the-shelf CNN representations 30 Paris Buildings 6k Oxford Buildings 5k TRECVID Instance Search 2013 (subset of 23k frames) [7] Kalantidis et al. Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv:1512.04065. Mohedano et al. Bags of Local Convolutional Features for Scalable Instance Search. ICMR 2016
  • 31. Off-the-shelf CNN representations CNN representations ● L2 Normalization + PCA whitening + L2 Normalization ● Cosine similarity ● Convolutional features better than fully connected features ● Convolutional features keep spatial information → retrieval + object location ● Convolutional layers allows custom input size. ● If data labels available, fine tuning the network to the image domain improves CNN representations. 31
  • 32. CNN as a feature extractor source : Zheng, SIFT Meets CNN: A Decade Survey of Instance Retrieval , 2016 32
  • 33. Overview ● What is content based image retrieval? ● The classic SIFT retrieval pipeline ● Using off the shelf CNN features for retrieval ● Learning representations for retrieval 33
  • 34. What loss function should we use?
  • 35. Learning representations for retrieval Siamese network: network to learn a function that maps input patterns into a target space such that L2 norm in the target space approximates the semantic distance in the input space. Applied in: ● Dimensionality reduction [1] ● Face verification [2] ● Learning local image representations [3] 35 [1] Song et al.: Deep metric learning via lifted structured feature embedding. CVPR 2015 [2] Chopra et al. Learning a similarity metric discriminatively, with application to face verification CVPR' 2005 [3] Simo-Serra et al. Fracking deep convolutional image descriptors. CoRR, abs/1412.6537, 2014
  • 36. Learning representations for retrieval Siamese network: network to learn a function that maps input patterns into a target space such that L2 norm in the target space approximates the semantic distance in the input space. 36 Image credit: Simo-Serra et al. Fracking deep convolutional image descriptors. CoRR 2014
  • 37. Margin: m Hinge loss Negative pairs: if nearer than the margin, pay a linear penalty The margin is a hyperparameter 37
  • 38. Learning representations for retrieval Siamese network with triplet loss: loss function minimizes distance between query and positive and maximizes distance between query and negative 38Schroff et al. FaceNet: A Unified Embedding for Face Recognition and Clustering, CVPR 2015 CNN w w a p n L2 embedding space Triplet Loss CNN CNN
  • 39. How do we get the training data?
  • 40. Exploring Image datasets with GPS coordinates (Google Time Machine Data) 53.386037N, -6.255495E Geolocation DCU Dataset Images with Geolocations Ranked List N E N E N E N E N E . . . Visual Similarity Top N Place recognition problem casted as a image retrieval ? Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
  • 41. Radenović, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016 Gordo, Deep Image Retrieval: Learning global representations for image search, CVPR 2016 [1] Schonberger From single image query to detailed 3D reconstruction Automatic data cleaning Strong baseline (SIFT) + geometric verification + 3D camera estimation[1] Further manual inspection Further manual inspection
  • 42. Radenović, CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples, CVPR 2016 Generating training pairs Select the hardest negative Select triplet which generates larger loss
  • 43. Jayaraman and Grauman, Slow and steady feature analysis: higher order temporal coherence in video, ECCV 2016
  • 44. NetVLAD Relja Arandjelović et al, NetVLAD: CNN architecture for weakly supervised place recognition, CVPR 2016
  • 45. VLAD Local features Vectors dimension D Sparse local feature extraction Dense Local Feature extraction Feature Space D-dim Conv layer i D H W N = H*W local descriptors of dimension D
  • 46. VLAD Local features Vectors dimension D K cluster centers Visual Vocabulary (K vectors with dimension D) Learnt from data with K-means Feature Space D-dim
  • 47. Local features Vectors dimension D K cluster centers Visual Vocabulary (K vectors with dimension D) Learnt from data with K-means Feature Space D-dim VLAD vector Sum of all residuals of the assignments to the cluster 1 d1 d2 d3 d4 Dim D Dim K x D VLAD
  • 48. V Matrix VLAD (residuals) D k Hard Assignment Not differentiable ! j Soft Assignment N Local features {Xi} D i K Clusters features {Ck} D k 0.99 0.98 0.05 0.01 0.00 Cluster k=1 j j 1 1 0 0 0 Softmax function :) VLAD
  • 49. VGG16 and Alexnet Consider the positive example closer to the query Consider all negative candidates With distance inferior to a certain margin
  • 50.
  • 51.
  • 54. Learning representations for retrieval 54 Generated dataset to create the training triplets Dataset: Landmarks dataset: ● 214K images of 672 famous landmark site. ● Dataset processing based on a matching baseline: SIFT + Hessian-Affine keypoint detector. ● Important to select the “useful” triplets. Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
  • 56. Learning representations for retrieval 56 Comparison between R-MAC from off-the-shelf network and R-MAC retrained for retrieval Gordo et al, Deep Image Retrieval: Learning global representations for image search. ECCV 2016
  • 57. Summary 57 ● Pre-trained CNN are useful to generate image descriptors for retrieval ● Convolutional layers allow us to encode local information ● Knowing how to rank similarity is the primary task in retrieval ● Designing CNN architectures to learn how to rank
  • 59. Summary state-of-the-art Layer Method Dim Oxford 5k Paris 6k Off-the-shelf CNN representations Global representation fc Neural Codes 128 0.433 - conv SPoC 256 0.657 - CroW 256 0.684 0.765 R-MAC 256 0.561 0.729 conv NetVLAD 256 0.555 0.643 BoCF 25k(*) 0.739 0.82 Multiple representations fc CNN-astounding 4-15k 0.68 0.795 conv CNN-astounding2 32k 0.843 0.879 Learning representations for retrieval Global representation conv MAC 512 0.774 0.824 regionProposal+MAC 512 0.813 0.855 NetVLAD 512 0.676 0.749 59