DeepVO
Towards End-to-End Visual Odometry with Deep
Recurrent Convolutional Neural Networks
National Chung Cheng University, Taiwan
Robot Vision Laboratory
2017/11/08
Jacky Liu
About this work
DeepVO : Towards Visual Odometry with Deep Learning
Sen Wang (1, 2), Ronald Clark (2), Hongkai Wen (2) and Niki Trigoni (2)
1. Edinburgh Centre for Robotics, Heriot-Watt University, UK
2. University of Oxford, UK
Download this paper: http://senwang.gitlab.io/DeepVO/#paper
Watch video: http://senwang.gitlab.io/DeepVO/#video
Contributions
1. Shows that monocular VO can be built through end-to-end training
2. The RCNN architecture can generalize to unseen environments
3. Complex motion can be modeled by the RCNN
Related works
Visual odometry
• Geometric: sparse, direct
• Learning
Related works
Sparse
• PTAM
• ORB-SLAM
Direct
• DTAM
Network
• CNN
• RNN
• LSTM
Network design
1. Traditional computer vision networks learn from appearance and image context
2. Visual odometry should learn from geometry instead.
This is what the RCNN is designed to address
Network design
[Figure: network design diagram]
Preprocessing
• Normalize inputs (speeds up training) by subtracting the mean RGB values of the training set
• Resize image to 64x
• Stack two images to form a tensor
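The mean subtraction and stacking steps above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the helper names `subtract_mean` and `stack_pair`, the tiny 1 x 2 frames, and the mean value are all assumptions made for the example. Resizing is left out because the target size on the slide is truncated.

```python
from typing import List, Tuple

Pixel = Tuple[float, ...]
Image = List[List[Pixel]]  # H x W grid of per-pixel channel tuples


def subtract_mean(image: Image, mean_rgb: Pixel) -> Image:
    """Subtract the training-set mean RGB value from every pixel."""
    return [
        [tuple(c - m for c, m in zip(pixel, mean_rgb)) for pixel in row]
        for row in image
    ]


def stack_pair(first: Image, second: Image) -> Image:
    """Concatenate two same-sized frames along the channel axis (3 + 3 = 6)."""
    same_height = len(first) == len(second)
    if not same_height or any(len(a) != len(b) for a, b in zip(first, second)):
        raise ValueError("frames must have the same height and width")
    return [
        [p1 + p2 for p1, p2 in zip(row1, row2)]
        for row1, row2 in zip(first, second)
    ]


# Hypothetical mean and 1 x 2 frames, for illustration only.
mean_rgb = (100.0, 110.0, 120.0)
frame_a = [[(100.0, 110.0, 120.0), (110.0, 120.0, 130.0)]]
frame_b = [[(90.0, 100.0, 110.0), (100.0, 110.0, 120.0)]]

stacked = stack_pair(
    subtract_mean(frame_a, mean_rgb),
    subtract_mean(frame_b, mean_rgb),
)
```

Each pixel of `stacked` carries six channels: the three normalized channels of the first frame followed by those of the second.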
CNN
• What does this work mean by learning "geometric" features?
• Two consecutive monocular RGB images are stacked and fed into the CNN, which is expected to extract features from their concatenation.
RNN
• An RNN is not well suited to learning sequential representations directly from high-dimensional raw data such as images.
• Hidden state: $h_k = \mathcal{H}(W_{xh} x_k + W_{hh} h_{k-1} + b_h)$
• Output: $y_k = W_{hy} h_k + b_y$
• Notation: $W$ weight matrix, $b$ bias vector, $k$ time index, $\mathcal{H}$ activation function
• Limitation: vanishing gradient problem
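The hidden-state and output updates above can be sketched as a single recurrent step. This is an illustrative sketch only: `tanh` is assumed for the activation function (the slide does not name one), and the toy weights and the helper names `matvec` and `rnn_step` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(
    x_k: Vector, h_prev: Vector,
    W_xh: Matrix, W_hh: Matrix, b_h: Vector,
    W_hy: Matrix, b_y: Vector,
) -> Tuple[Vector, Vector]:
    """One step: h_k = tanh(W_xh x_k + W_hh h_{k-1} + b_h), y_k = W_hy h_k + b_y."""
    pre = [
        a + b + c
        for a, b, c in zip(matvec(W_xh, x_k), matvec(W_hh, h_prev), b_h)
    ]
    h_k = [math.tanh(p) for p in pre]  # assumed activation
    y_k = [a + b for a, b in zip(matvec(W_hy, h_k), b_y)]
    return h_k, y_k


# Toy weights (2 inputs, 2 hidden units, 1 output), for illustration only.
W_xh = [[1.0, 0.0], [0.0, 1.0]]
W_hh = [[0.5, 0.0], [0.0, 0.5]]
b_h = [0.0, 0.0]
W_hy = [[1.0, -1.0]]
b_y = [0.0]

h = [0.0, 0.0]  # initial hidden state
outputs = []
for x in [[0.0, 0.0], [1.0, 1.0]]:
    h, y = rnn_step(x, h, W_xh, W_hh, b_h, W_hy, b_y)
    outputs.append(y)
```

The hidden state `h` is carried from one step to the next, which is how the network accumulates information over the sequence.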
LSTM (Long short-term memory)
Depth is needed to learn high-level representations
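Cost function
• Conditional probability of the poses: $p(Y_t \mid X_t) = p(y_1, \ldots, y_t \mid x_1, \ldots, x_t)$
• Optimal parameters: $\theta^* = \arg\max_{\theta} p(Y_t \mid X_t; \theta)$
• Training minimizes the mean squared error over $N$ samples:
  $\theta^* = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{t} \left( \| \hat{p}_k - p_k \|_2^2 + \kappa \| \hat{\varphi}_k - \varphi_k \|_2^2 \right)$
• $(p_k, \varphi_k)$: ground-truth pose (position, orientation); $(\hat{p}_k, \hat{\varphi}_k)$: estimated pose; $\kappa$: scale factor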
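The cost function above can be written out as a short function. This is a sketch under stated assumptions: the name `pose_mse_loss`, the toy poses, and the value passed for `kappa` are illustrative and not taken from the slides.

```python
from typing import List, Sequence, Tuple

# A pose is (position p_k, orientation phi_k), each a sequence of floats.
Pose = Tuple[Sequence[float], Sequence[float]]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pose_mse_loss(
    predicted: List[List[Pose]], truth: List[List[Pose]], kappa: float
) -> float:
    """Average over N sequences of the summed per-step pose error.

    Per step: ||p_hat - p||^2 + kappa * ||phi_hat - phi||^2.
    """
    if not predicted or len(predicted) != len(truth):
        raise ValueError("need the same non-zero number of sequences")
    total = 0.0
    for pred_seq, true_seq in zip(predicted, truth):
        for (p_hat, phi_hat), (p, phi) in zip(pred_seq, true_seq):
            total += squared_l2(p_hat, p) + kappa * squared_l2(phi_hat, phi)
    return total / len(predicted)


# One toy sequence with two time steps (illustrative values).
truth = [[((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
          ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))]]
predicted = [[((0.0, 0.0, 0.0), (0.0, 0.0, 0.1)),
              ((1.0, 0.0, 1.0), (0.0, 0.0, 0.0))]]

loss = pose_mse_loss(predicted, truth, kappa=100.0)  # kappa chosen for illustration
```

With a large `kappa`, a small orientation error (0.1 here) contributes about as much as a much larger position error (1.0), which is the balancing role of the scale factor.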
Experimental results
[Figure: DeepVO compared with VISO2]
Training & testing
1. Dataset: KITTI VO/SLAM benchmark
(22 image sequences / 10 fps / dynamic objects)
2. 7410 training samples (image and trajectory pairs)
3. Implemented in Theano
4. Hardware: Nvidia Tesla K40 GPU
5. 200 epochs
6. Learning rate 0.001
7. Regularization: dropout / early stopping
8. CNN: transfer learning from FlowNet
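The training settings listed above could be collected in a small configuration object, together with a simple early-stopping check. This is a hypothetical sketch: the class `TrainConfig`, the function `should_stop`, and the `patience` value are assumptions for illustration, since the slide names early stopping but gives no criterion.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainConfig:
    """Settings taken from the slide, plus one assumed field."""
    dataset: str = "KITTI VO/SLAM benchmark"
    training_samples: int = 7410
    max_epochs: int = 200
    learning_rate: float = 0.001
    pretrained_cnn: str = "FlowNet"
    patience: int = 10  # assumption: not stated on the slide


def should_stop(val_losses: List[float], patience: int) -> bool:
    """Return True once `patience` epochs have passed since the best loss."""
    if not val_losses:
        return False
    best_epoch = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_epoch >= patience


cfg = TrainConfig(patience=2)
stop_now = should_stop([1.0, 0.8, 0.9, 0.95], cfg.patience)
keep_going = should_stop([1.0, 0.8, 0.7], cfg.patience)
```

In a training loop, `should_stop` would be called after each epoch with the validation losses recorded so far, up to `cfg.max_epochs`.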
Overfitting
• Orientation is more prone to overfitting than position
Comparison with traditional VO
• Baseline: the open-source VO library LIBVISO2
• Monocular / stereo
Trajectory (1/2)
Trajectory (2/2)
• No ground truth is available for sequences 11 to 19
Dynamic objects
• This work does not address how to handle dynamic objects
• Traditional VO: RANSAC (removes outliers)
• Possible remedy: more training data
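To illustrate how RANSAC removes outliers such as matches on moving objects, here is a deliberately simplified sketch. It assumes the motion between two frames is a pure 2D translation, so one match is enough to propose a model; a real VO pipeline estimates a richer motion model. The function name `ransac_translation`, the threshold, and the toy matches are all hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (point in frame k, matched point in frame k+1)


def ransac_translation(
    matches: List[Match], threshold: float = 0.5,
    iterations: int = 50, seed: int = 0,
) -> Tuple[Point, List[Match]]:
    """Estimate a 2D translation while ignoring outlier matches."""
    if not matches:
        raise ValueError("at least one match is required")
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    best_inliers: List[Match] = []
    for _ in range(iterations):
        # Propose a translation from one randomly chosen match.
        (x0, y0), (x1, y1) = rng.choice(matches)
        tx, ty = x1 - x0, y1 - y0
        # Collect the matches that agree with the proposal.
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - tx) <= threshold
            and abs(b[1] - a[1] - ty) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the estimate by averaging over the largest consensus set.
    n = len(best_inliers)
    tx = sum(b[0] - a[0] for a, b in best_inliers) / n
    ty = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (tx, ty), best_inliers


# Seven static points moved by (2, -1), plus three matches on moving objects.
static = [((float(i), 2.0 * i), (i + 2.0, 2.0 * i - 1.0)) for i in range(7)]
dynamic = [((0.0, 0.0), (10.0, 5.0)),
           ((1.0, 1.0), (-5.0, 5.0)),
           ((2.0, 2.0), (10.0, -5.0))]

translation, inliers = ransac_translation(static + dynamic)
```

The three matches on moving objects disagree with the dominant motion, so they fall outside the consensus set and do not affect the final estimate.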
Conclusion
• End-to-end monocular VO based on deep learning
• Deep RCNN
• No need to carefully tune the parameters of the VO system
• Not intended as a replacement for the classic geometry-based approach
