SlideShare a Scribd company logo
1 of 54
Download to read offline
Time-Sensitive Egocentric Image Retrieval for
Finding Objects in Lifelogs
Xavier Giró
Cristian Reyes
Author:
Advisors:
14th July 2016
Eva Mohedano Kevin McGuinness
Outline
1. Motivation
2. Methodology
3. Experiments
4. Results & Conclusions
5. Conclusions
2
Lifelogging
Extract value from this new data
3
4
Can’t find my phone
Motivation
5
Hundreds of images!
Review
Egocentric cameras may help
Last time seen: At the CAFE
6
Goal: Retrieve a useful image to find the object.
Task
7
How: Exploiting visual and temporal information.
Outline
1. Motivation
2. Methodology
3. Experiments
4. Results & Conclusions
5. Conclusions
8
System Overview
9
Visual Ranking - Descriptors
Convolutional Neural Networks
CNN design (vgg16) used for feature extraction.
Conv-5_1
10Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto.
Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.
Visual Ranking - Descriptors
Bag of Words
11Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto.
Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.
Visual Ranking - Queries
5 visual examples
12
Visual Ranking - Queries
3 masking strategies
Full Image
(FI)
Hard Bounding Box
(HBB)
Soft Bounding Box
(SBB)
13
Visual Ranking - Target
3 masking strategies
Full Image
(FI)
Center Bias
(CB)
Saliency Mask
(SM)
Saliency Map
14Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O’Connor, and Xavier Giro-i Nieto.
Shallow and deep convolutional networks for saliency prediction. CVPR 2016.
System Overview
15
Candidate Selection
Absolute
Threshold on Visual Similarity Scores (TVSS)
Adaptive
Nearest Neighbor Distance Ratio (NNDR)
2 thresholding strategies
Parameters
16
LEARNT
D. G. Loewe.
Distinctive image features from scale-invariant keypoints, 2004.
System Overview
17
Temporal aware reranking
Time-stamp sorting Time-stamp sorting
18
Redundancy
19
Temporal Diversity
20
Effect of Diversity
New locations introduced
21
Outline
1. Motivation
2. Methodology
3. Experiments
4. Results & Conclusions
5. Conclusions
22
Compare
different configurations
1. Dataset of images
2. Queries
3. Annotations
4. Metric
Evaluate
quantitatively
23
Dataset
EDUB1
● 4912 images
● 4 users
● 2 days/user
● Narrative Clip 1
NTCIR-Lifelog2
● 88185 images
● 3 users
● 30 days/user
● Autographer
General Behavior
Wide Angle Lens 24
1. Marc Bolaños and Petia Radeva.
Ego-object discovery.
2. C. Gurrin, H. Joho, F. Hopfgartner, L. Zhou, and R. Albatal.
NTCIR Lifelog: The first test collection for lifelog research.
25
EDUB NTCIR-Lifelog
Dataset - Query Definition
26
Annotation Strategy
2. Annotate only the last occurrence
3. Annotate all the scene of the last occurrence
→ Neighbor images may also be helpful
→ The system is expected to find all 31. Annotate the 3 last occurrences
27
Metric
Mean Average Precision Mean Reciprocal Rank
28
ALL RELEVANT IMAGES THE BEST RANKED RELEVANT IMAGE
9 Days 15 Days
Training
TRAINING TEST
29
Training - Codebook
~ 13,500
images
~18,000,000
feature vectors
25,000
clusters
30
Training - Thresholds
31
Training - Thresholds
32
TIME-STAMP SORTING
INTERLEAVING
FULL IMAGE
CENTER BIAS
SALIENCY MASK
SAME OPTIMAL THRESHOLDS
Parameters summary
33
Outline
1. Motivation
2. Methodology
3. Experiments
4. Results & Conclusions
5. Conclusions
34
Discussion & Conclusions
35
FI
SBB
HBB
QUERY APPROACH
Discussion & Conclusions
36
Full Image
Saliency Mask
Center Bias
TARGET APPROACH
Discussion & Conclusions
37
Adaptive: NNDR
Absolute: TVSS
Time-stamp reordering
+
Discussion & Conclusions
38
Adaptive: NNDR
Absolute: TVSS
Interleaving
+
Discussion & Conclusions
39
Discussion & Conclusions
The system helps the user
40
Diversity helps
Discussion & Conclusions
41
Discussion & Conclusions
Objects are not
always in the center
42
Discussion & Conclusions
Saliency Maps do not help here
But they do here
43
Discussion & Conclusions
Parameters are not
independent.
44
45
46
47
Lifelogging Tools and Applications
Acknowledgements
Financial Support
Noel E. O’Connor Cathal Gurrin
Albert Gil
48
49
50
f(Q) NNDR TVSS NNDR + I TVSS + I
Saliency Mask 0,201 0,217 0,210 0,226
Using Saliency Mask for target
f(Q) NNDR TVSS NNDR + I TVSS + I
Saliency Mask 0,176 0,271 0,190 0,279
Using Full Image for target
f(Q) NNDR TVSS NNDR + I TVSS + I
Saliency Mask 0,162 0,198 0,173 0,213
Using Center Bias for target
Visual Ranking - Descriptors
Bag of Words
A) There is a red ball in a white box.
B) The red box contains a white ball.
Sentence
A
Sentence
B
there 1 0
is 1 0
a 2 1
red 1 1
ball 1 1
in 1 0
white 1 1
box 1 1
the 0 1
contains 0 1
Cosine Similarity Score = 0.68
An example using text
51
52
+
Outline
1. Motivation
2. Methodology
3. Experiments
4. Results
5. Conclusions
53
Conclusions
● The system accomplishes its task.
● Thresholding and temporal reranking have improved performance
● Center Bias does not necessary improve performance.
● Saliency Maps have improved performance.
● Parameters are not independent when measuring with A-MRR.
● Good baseline for further research.
54

More Related Content

More from Universitat Politècnica de Catalunya

Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

Time sensitive egocentric image retrieval for finding objects in lifelogs