SlideShare a Scribd company logo
1 of 33
Download to read offline
Prof. Dr. Laura Leal-Taixé
Out with the Old?
CNN vs. SIFT-based visual localization
Visual Localization
Compute exact position and
orientation of query image
(relative to 3D scene model)
Classic Localization Pipeline
Extract Local Features
Establish 2D-3D Matches
Estimate Camera Pose
(RANSAC)
Learning visual
localization?
Where do we get training data?
Structure-from-Motion gives us plenty of data
TRAINING TESTING
Related work: PoseNet
Kendall et al. PoseNet:. ICCV 2015
Non-Linear Embedding
Pretrained
y R2048
Linear Regression
p R3
FC
FC
q R4
Outdoor localization: SIFT wins
• Experiments on Cambridge Landmarks
SIFT CNN
Walch, Hazirbas, Leal-Taixé, Sattler, Hilsenbeck, Cremers, Image-based localization using LSTMs for structured feature correlation. ICCV, 2017
Indoor localization: SIFT suffers
• Experiments on 7Scenes
Number of images that it failed to localize
SIFT CNN
SIFT-based methods do not work at all!
Our new dataset TUM-LSI: SIFT dies
https://github.com/NavVisResearch/NavVis-Indoor-Dataset
• One separate model is needed per scene
Limitations of current methods
Pretrained
y R2048
p R3
FC
FC
q R4
Kendall et al. PoseNet:. ICCV 2015 Brachmann et al. DSAC:. CVPR 2017
• One separate model is needed per scene
• The most used loss function requires the setting of a
hyperparameter which is again scene dependent
Limitations of current methods
Pretrained
y R2048
p R3
FC
FC
q R4
DOES NOT SCALE
• L2 loss
pose label pose
prediction
camera center
position
camera rotation
in quaternion
scaling factor to balance
the weighting between
the rotation error and the
position error
PoseNet loss functions
Kendall et al. PoseNet:. ICCV 2015
Need to find its value per scene à can range from 200 to 2500
No geometry information
• Geometric loss function: Minimize re-projection error of 3D points
visible in image
PoseNet loss functions
Kendall et al. Geometric Loss Functions for Camera Pose Regression with Deep Learning. CVPR 2017
Network needs to be pre-trained on L2 loss
One network per scene
More accurate results
Relative Pose Estimation
• Use a neural network to predict relative poses
• Loss function: PoseNet L2 loss
Laskar et al. Camera relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network. ICCVW 2017
Loss still depends on a scene-dependent
hyperparameter
One network for all scenes
• One separate model is needed per scene
• The most used loss function requires the setting of a
hyperparameter which is again scene dependent
Limitations of current methods
• One separate model is needed per scene
• The most used loss function requires the setting of a
hyperparameter which is again scene dependent
Proposed method
Relative Pose Estimation
with
Geometric Matching
Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018
Essential Matrix
Regression
One network
for all scenes
No
hyperparameter
on the loss
• One separate model is needed per scene
• The most used loss function requires the setting of a
hyperparameter which is again scene dependent
Proposed method
Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018
Essential Matrix
Regression
One network
for all scenes
No
hyperparameter
on the loss
Relative Pose Estimation
with
Geometric Matching
• Siamese architecture
– Shared weights on the input
image pair
– Feature concatenation
Regressing relative poses
I. Rocco et al. “Convolutional neural network architecture for geometric matching. CVPR 2017.
Geometric matching layer
• Mimic the geometrical matching process
Features
from the
siamese net
Fixed
operation,
no learnable
parameters
Relative Pose Estimation
with
Geometric Matching
• One separate model is needed per scene
• The most used loss function requires the setting of a
hyperparameter which is again scene dependent
Proposed method
Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018
One network
for all scenes
No
hyperparameter
on the loss
Essential Matrix
Regression
• Essential matrix between query and training image
• L2 loss function on the entries of the essential matrix
The loss is
hyperparameter
free!
Essential Matrix loss
Fixing the scale of translations to
get consistent predictions
Are we really learning an essential matrix?
• Ratio between first two eigenvalues is close to 1?
Are we really learning an essential matrix?
• Ratio between first two eigenvalues is close to 1?
• Third eigenvalue close to 0?
Are we really learning an essential matrix?
• Ratio between first two eigenvalues is close to 1?
• Third eigenvalue close to 0?
• We learn a good approximation of a true essential matrix
• Feature-based methods also project the resulting matrix to a rank 2
matrix (due to noise)
• Forcing this projection does not have any effect on the results on
any of the tested datasets
A CNN model for Image
retrieval to identify 30
nearest neighbors for the
query image
F. Radenovic ́et al.. CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples.. ECCV 2016
Full localization pipeline
EssNet to regress essential
matrices and estimate
relative poses
Full localization pipeline
Triangulation of the
position and averaging of
the rotation within a
RANSAC loop.
Full localization pipeline
Comparison to SOA
• Experiments on outdoor scenes, Cambridge Landmarks
Comparison to SOA
• Experiments on outdoor scenes, Cambridge Landmarks
• SOA positional error with better rotational error
Comparison to SOA
• Experiments on outdoor scenes, Cambridge Landmarks
• Only geometric loss is more accurate in rotation but 21% less
accurate in translation
Comparison to SOA
• Experiments on outdoor scenes, Cambridge Landmarks
One single
network
One network trained per scene
What can we do with more data?
• Training relative poses allows us to leverage more data
Trained on
Cambridge
Landmarks
Pre-trained on a new dataset and fine-
tuned on Cambridge Landmarks
Out with the Old?
• SIFT-based methods still outperform CNN-based methods but
there is potential in indoor scenes
• EssNet: relative pose estimation
– mimics the visual localization pipeline à takes advantage of multiple
view geometry
– One network to perform localization in any scene
• Generalization still has a long way to go: if no images of the test
scene are shown to the network, performance drops
40
Thank you!
Prof. Laura Leal-Taixé
https://dvl.in.tum.de

More Related Content

What's hot

How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...Dongmin Choi
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer VisionDongmin Choi
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetRishabh Indoria
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Dongmin Choi
 
YolactEdge Review [cdm]
YolactEdge Review [cdm]YolactEdge Review [cdm]
YolactEdge Review [cdm]Dongmin Choi
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]Dongmin Choi
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level FeatureDongmin Choi
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionBrodmann17
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNNJunho Cho
 
Faster R-CNN
Faster R-CNNFaster R-CNN
Faster R-CNNanna8885
 
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...Universitat Politècnica de Catalunya
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methodsBrodmann17
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 

What's hot (20)

How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
 
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer Vision
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs Retinanet
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]
 
YolactEdge Review [cdm]
YolactEdge Review [cdm]YolactEdge Review [cdm]
YolactEdge Review [cdm]
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level Feature
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNN
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Faster R-CNN
Faster R-CNNFaster R-CNN
Faster R-CNN
 
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Region-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object RetrievalRegion-oriented Convolutional Networks for Object Retrieval
Region-oriented Convolutional Networks for Object Retrieval
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methods
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
 

Similar to CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018

Large Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfLarge Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfSamuCerezo
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep LearningYu Huang
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTIRJET Journal
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceNAVER Engineering
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingYu Huang
 
Lecture 01 frank dellaert - 3 d reconstruction and mapping: a factor graph ...
Lecture 01   frank dellaert - 3 d reconstruction and mapping: a factor graph ...Lecture 01   frank dellaert - 3 d reconstruction and mapping: a factor graph ...
Lecture 01 frank dellaert - 3 d reconstruction and mapping: a factor graph ...mustafa sarac
 
Efficient architecture to condensate visual information driven by attention ...
Efficient architecture to condensate visual information driven by attention ...Efficient architecture to condensate visual information driven by attention ...
Efficient architecture to condensate visual information driven by attention ...Sara Granados Cabeza
 
Multi-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video SurveillanceMulti-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video SurveillanceDiego Tosato
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...Tulipp. Eu
 
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...Hyeongmin Lee
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learningNAVER Engineering
 
Remote Sensing IEEE 2015 Projects
Remote Sensing IEEE 2015 ProjectsRemote Sensing IEEE 2015 Projects
Remote Sensing IEEE 2015 ProjectsVijay Karan
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding捷恩 蔡
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Yu Huang
 
Content-based image retrieval using a mobile device as a novel interface
Content-based image retrieval using a mobile device as a novel interfaceContent-based image retrieval using a mobile device as a novel interface
Content-based image retrieval using a mobile device as a novel interfaceJonathon Hare
 

Similar to CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018 (20)

Large Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfLarge Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdf
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFT
 
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondenceParn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
Parn pyramidal+affine+regression+networks+for+dense+semantic+correspondence
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
Lecture 01 frank dellaert - 3 d reconstruction and mapping: a factor graph ...
Lecture 01   frank dellaert - 3 d reconstruction and mapping: a factor graph ...Lecture 01   frank dellaert - 3 d reconstruction and mapping: a factor graph ...
Lecture 01 frank dellaert - 3 d reconstruction and mapping: a factor graph ...
 
Efficient architecture to condensate visual information driven by attention ...
Efficient architecture to condensate visual information driven by attention ...Efficient architecture to condensate visual information driven by attention ...
Efficient architecture to condensate visual information driven by attention ...
 
lec6a.ppt
lec6a.pptlec6a.ppt
lec6a.ppt
 
Multi-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video SurveillanceMulti-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video Surveillance
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
 
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua...
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
lecture_16_jiajun.pdf
lecture_16_jiajun.pdflecture_16_jiajun.pdf
lecture_16_jiajun.pdf
 
Remote Sensing IEEE 2015 Projects
Remote Sensing IEEE 2015 ProjectsRemote Sensing IEEE 2015 Projects
Remote Sensing IEEE 2015 Projects
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
 
Content-based image retrieval using a mobile device as a novel interface
Content-based image retrieval using a mobile device as a novel interfaceContent-based image retrieval using a mobile device as a novel interface
Content-based image retrieval using a mobile device as a novel interface
 
Kintinuous review
Kintinuous reviewKintinuous review
Kintinuous review
 

More from Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 

CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018

  • 1. Prof. Dr. Laura Leal-Taixé Out with the Old? CNN vs. SIFT-based visual localization
  • 2. Visual Localization Compute exact position and orientation of query image (relative to 3D scene model)
  • 3. Classic Localization Pipeline Extract Local Features Establish 2D-3D Matches Estimate Camera Pose (RANSAC) Learning visual localization?
  • 4. Where do we get training data? Structure-from-Motion gives us plenty of data TRAINING TESTING
  • 5. Related work: PoseNet Kendall et al. PoseNet:. ICCV 2015 Non-Linear Embedding Pretrained y R2048 Linear Regression p R3 FC FC q R4
  • 6. Outdoor localization: SIFT wins • Experiments on Cambridge Landmarks SIFT CNN Walch, Hazirbas, Leal-Taixé, Sattler, Hilsenbeck, Cremers, Image-based localization using LSTMs for structured feature correlation. ICCV, 2017
  • 7. Indoor localization: SIFT suffers • Experiments on 7Scenes Number of images that it failed to localize SIFT CNN
  • 8. SIFT-based methods do not work at all! Our new dataset TUM-LSI: SIFT dies https://github.com/NavVisResearch/NavVis-Indoor-Dataset
  • 9. • One separate model is needed per scene Limitations of current methods Pretrained y R2048 p R3 FC FC q R4 Kendall et al. PoseNet:. ICCV 2015 Brachmann et al. DSAC:. CVPR 2017
  • 10. • One separate model is needed per scene • The most used loss function requires the setting of a hyperparameter which is again scene dependent Limitations of current methods Pretrained y R2048 p R3 FC FC q R4 DOES NOT SCALE
  • 11. • L2 loss pose label pose prediction camera center position camera rotation in quaternion scaling factor to balance the weighting between the rotation error and the position error PoseNet loss functions Kendall et al. PoseNet:. ICCV 2015 Need to find its value per scene à can range from 200 to 2500 No geometry information
  • 12. • Geometric loss function: Minimize re-projection error of 3D points visible in image PoseNet loss functions Kendall et al. Geometric Loss Functions for Camera Pose Regression with Deep Learning. CVPR 2017 Network needs to be pre-trained on L2 loss One network per scene More accurate results
  • 13. Relative Pose Estimation • Use a neural network to predict relative poses • Loss function: PoseNet L2 loss Laskar et al. Camera relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network. ICCVW 2017 Loss still depends on a scene-dependent hyperparameter One network for all scenes
  • 14. • One separate model is needed per scene • The most used loss function requires the setting of a hyperparameter which is again scene dependent Limitations of current methods
  • 15. • One separate model is needed per scene • The most used loss function requires the setting of a hyperparameter which is again scene dependent Proposed method Relative Pose Estimation with Geometric Matching Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018 Essential Matrix Regression One network for all scenes No hyperparameter on the loss
  • 16. • One separate model is needed per scene • The most used loss function requires the setting of a hyperparameter which is again scene dependent Proposed method Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018 Essential Matrix Regression One network for all scenes No hyperparameter on the loss Relative Pose Estimation with Geometric Matching
  • 17. • Siamese architecture – Shared weights on the input image pair – Feature concatenation Regressing relative poses
  • 18. I. Rocco et al. “Convolutional neural network architecture for geometric matching. CVPR 2017. Geometric matching layer • Mimic the geometrical matching process Features from the siamese net Fixed operation, no learnable parameters
  • 19. Relative Pose Estimation with Geometric Matching • One separate model is needed per scene • The most used loss function requires the setting of a hyperparameter which is again scene dependent Proposed method Q. Zhou, T. Sattler, L. Leal-Taixé. Submitted to ECCV 2018 One network for all scenes No hyperparameter on the loss Essential Matrix Regression
  • 20. • Essential matrix between query and training image • L2 loss function on the entries of the essential matrix The loss is hyperparameter free! Essential Matrix loss Fixing the scale of translations to get consistent predictions
  • 21. Are we really learning an essential matrix? • Ratio between first two eigenvalues is close to 1?
  • 22. Are we really learning an essential matrix? • Ratio between first two eigenvalues is close to 1? • Third eigenvalue close to 0?
  • 23. Are we really learning an essential matrix? • Ratio between first two eigenvalues is close to 1? • Third eigenvalue close to 0? • We learn a good approximation of a true essential matrix • Feature-based methods also project the resulting matrix to a rank 2 matrix (due to noise) • Forcing this projection does not have any effect on the results on any of the tested datasets
  • 24. A CNN model for Image retrieval to identify 30 nearest neighbors for the query image F. Radenovic ́et al.. CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples.. ECCV 2016 Full localization pipeline
  • 25. EssNet to regress essential matrices and estimate relative poses Full localization pipeline
  • 26. Triangulation of the position and averaging of the rotation within a RANSAC loop. Full localization pipeline
  • 27. Comparison to SOA • Experiments on outdoor scenes, Cambridge Landmarks
  • 28. Comparison to SOA • Experiments on outdoor scenes, Cambridge Landmarks • SOA positional error with better rotational error
  • 29. Comparison to SOA • Experiments on outdoor scenes, Cambridge Landmarks • Only geometric loss is more accurate in rotation but 21% less accurate in translation
  • 30. Comparison to SOA • Experiments on outdoor scenes, Cambridge Landmarks One single network One network trained per scene
  • 31. What can we do with more data? • Training relative poses allows us to leverage more data Trained on Cambridge Landmarks Pre-trained on a new dataset and fine- tuned on Cambridge Landmarks
  • 32. Out with the Old? • SIFT-based methods still outperform CNN-based methods but there is potential in indoor scenes • EssNet: relative pose estimation – mimics the visual localization pipeline à takes advantage of multiple view geometry – One network to perform localization in any scene • Generalization still has a long way to go: if no images of the test scene are shown to the network, performance drops 40
  • 33. Thank you! Prof. Laura Leal-Taixé https://dvl.in.tum.de