Hierarchical Object Detection with Deep Reinforcement Learning
NIPS 2016 Workshop on Reinforcement Learning
[github] [arXiv]
Míriam Bellver, Xavier Giró i Nieto, Ferran Marqués, Jordi Torres
Outline
● Introduction
● Related Work
● Hierarchical Object Detection Model
● Experiments
● Conclusions
Introduction
Introduction
We present a method for performing hierarchical object detection in images
guided by a deep reinforcement learning agent.
Introduction
What is Reinforcement Learning?
“a way of programming agents by reward and punishment without needing to specify how the task is to be achieved”
[Kaelbling, Littman, & Moore, 1996]
Introduction
Reinforcement Learning
● There is no supervisor, only a reward signal
● Feedback is delayed, not instantaneous
● Time really matters (sequential, non-i.i.d. data)
An agent that is a decision-maker interacts with the environment and learns through trial-and-error.
We model the decision-making process through a Markov Decision Process.
Slide credit: UCL Course on RL by David Silver
Introduction
Contributions:
● Hierarchical object detection in images using a deep reinforcement learning agent
● We define two different hierarchies of regions
● We compare two different strategies to extract features for each candidate proposal to define the state
● We find objects by analyzing just a few regions
Related Work
Related Work
Deep Reinforcement Learning
● Atari 2600: Mnih, V. (2013). Playing Atari with deep reinforcement learning
● AlphaGo: Silver, D. (2016). Mastering the game of Go with deep neural networks and tree search
Related Work
Object Detection
● Region proposals / sliding window + detector: Uijlings, J. R. (2013). Selective search for object recognition
● Sharing convolutions over locations + detector: Girshick, R. (2015). Fast R-CNN
● Sharing convolutions over locations and also with the detector: Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN
● Single-shot detectors: Redmon, J. (2015). YOLO; Liu, W. (2015). SSD
Some of these pipelines rely on a large number of locations; others rely on a number of reference boxes from which bounding boxes are regressed.
Related Work
So far we can cluster object detection pipelines based on how the regions
analyzed are obtained:
● Using object proposals
● Using reference boxes “anchors” to be potentially regressed
There is a third approach:
● Approaches that iteratively refine one initial bounding box (AttentionNet, Active Object Localization with DRL)
Related Work
Refinement of bounding box predictions
AttentionNet:
They cast object detection as an iterative classification problem. Each category corresponds to a weak direction pointing to the target object.
Yoo, D. (2015). AttentionNet: Aggregating weak directions for accurate object detection.
Related Work
Refinement of bounding box predictions
Active Object Localization with Deep Reinforcement Learning:
Caicedo, J. C., & Lazebnik, S. (2015). Active object localization with deep reinforcement learning
Hierarchical Object Detection Model
Reinforcement Learning formulation
Reinforcement Learning Formulation
We cast the problem as a Markov Decision Process
State: The agent decides which action to take based on the concatenation of:
● a visual descriptor of the currently observed region
● a history vector that encodes the past actions performed
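The state construction described above can be sketched as follows. This is only an illustration: the number of remembered actions (`HISTORY_LEN = 4`), the one-hot encoding, and the helper names `update_history` and `build_state` are assumptions, not details given in the slides. The count of 6 actions (5 movements plus 1 terminal) does come from the deck.

```python
# 6 actions: 5 movement actions + 1 terminal action (from the slides).
NUM_ACTIONS = 6
# Assumption: the history vector remembers the last 4 actions.
HISTORY_LEN = 4


def update_history(history, action):
    """Append a one-hot encoding of `action` and drop the oldest entry."""
    one_hot = [0.0] * NUM_ACTIONS
    one_hot[action] = 1.0
    return history[NUM_ACTIONS:] + one_hot


def build_state(visual_descriptor, history):
    """Concatenate the region's visual descriptor with the action history."""
    return list(visual_descriptor) + list(history)


# Toy 3-dimensional descriptor; a real one would come from a CNN.
history = [0.0] * (NUM_ACTIONS * HISTORY_LEN)
history = update_history(history, 2)
state = build_state([0.1, 0.7, 0.3], history)
```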
Actions: There are two kinds of actions:
● movement actions: the agent selects which of the 5 possible regions defined by the hierarchy to move to
● terminal action: the agent indicates that the object has been found
Reinforcement Learning Formulation
Hierarchies of regions
For the first kind of hierarchy, fewer steps are required to reach a certain scale of bounding boxes, but the space of possible regions is smaller.
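As a rough illustration of how a hierarchy with 5 candidate regions per step could be generated, the sketch below assumes a layout of four corner-anchored subregions plus one centred subregion. The layout, the `scale` parameter, and the function name `subregions` are assumptions made for this example; only the count of 5 regions comes from the slides.

```python
def subregions(box, scale=0.75):
    """Return five child boxes of `box`, given as (x0, y0, x1, y1).

    Assumed layout: four boxes anchored at the corners and one centred box,
    each `scale` times the width and height of the parent.
    """
    x0, y0, x1, y1 = box
    w = (x1 - x0) * scale
    h = (y1 - y0) * scale
    cx = (x0 + x1) / 2.0
    cy = (y0 + y1) / 2.0
    return [
        (x0, y0, x0 + w, y0 + h),                          # top-left
        (x1 - w, y0, x1, y0 + h),                          # top-right
        (x0, y1 - h, x0 + w, y1),                          # bottom-left
        (x1 - w, y1 - h, x1, y1),                          # bottom-right
        (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),  # centre
    ]


children = subregions((0, 0, 100, 100), scale=0.5)
```

With a smaller `scale` the boxes shrink faster, so a given box size is reached in fewer steps, but fewer distinct regions are reachable. This is the trade-off the slide describes.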
Reinforcement Learning Formulation
Reward:
● Reward for movement actions
● Reward for terminal action
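A minimal sketch of what the two rewards could look like, assuming the formulation of Caicedo & Lazebnik (2015) cited earlier: a movement is rewarded when it increases the intersection over union (IoU) with the ground-truth box, and the terminal action is rewarded when the final IoU passes a threshold. The threshold `0.5`, the magnitude `3.0`, and all function names are assumptions for illustration only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def movement_reward(prev_box, new_box, gt_box):
    """+1 if the move increased the IoU with the ground truth, otherwise -1."""
    return 1.0 if iou(new_box, gt_box) > iou(prev_box, gt_box) else -1.0


def terminal_reward(box, gt_box, threshold=0.5, magnitude=3.0):
    """Positive if the final box overlaps the ground truth enough (assumed values)."""
    return magnitude if iou(box, gt_box) >= threshold else -magnitude
```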
Hierarchical Object Detection Model
Q-learning
Q-learning
In Reinforcement Learning we want to obtain a function Q(s,a) that predicts the best action a in state s in order to maximize the cumulative reward.
This function can be estimated using Q-learning, which iteratively updates Q(s,a) using the Bellman equation:
$Q(s,a) = r + \gamma \max_{a'} Q(s',a')$
where $r$ is the immediate reward, $\gamma \max_{a'} Q(s',a')$ is the future reward, and the discount factor is $\gamma = 0.90$.
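To make the update concrete, here is a small tabular sketch of Q-learning with the discount factor of 0.90 from the slides. The tabular dictionary, the learning rate `lr`, and the function names are assumptions for illustration; the deck itself estimates Q with a deep network rather than a table.

```python
GAMMA = 0.90  # discount factor given in the slides


def bellman_target(reward, next_q_values, terminal, gamma=GAMMA):
    """Immediate reward plus discounted best future Q-value."""
    if terminal:
        return reward  # no future reward after the terminal action
    return reward + gamma * max(next_q_values)


def q_update(q, state, action, reward, next_state, terminal, lr=0.1):
    """Move Q(state, action) towards the Bellman target (tabular sketch).

    `q` maps each state to a list with one Q-value per action.
    """
    next_q = q.get(next_state, [0.0] * len(q[state]))
    target = bellman_target(reward, next_q, terminal)
    q[state][action] += lr * (target - q[state][action])
    return q


q = {"s0": [0.0, 0.0], "s1": [1.0, 2.0]}
q_update(q, "s0", 0, reward=1.0, next_state="s1", terminal=False, lr=0.5)
```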
Q-learning
What is deep reinforcement learning?
It is when we estimate the Q(s,a) function by means of a deep network, with one output for each action.
Figure credit: Nervana blog post about RL
Hierarchical Object Detection Model
Model
Model
We tested two different configurations of feature extraction:
● Image-Zooms model: We extract features for every region observed
● Pool45-Crops model: We extract features once for the whole image, and ROI-pool features for each subregion
Model
Our RL agent is based on a Q-network. The input is:
● Visual description
● History vector
The output is:
● A fully connected layer of 6 neurons, giving the Q-value for each action
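The shape of such a Q-network can be sketched in plain Python as below. The single hidden layer, its size, the input size, and the random initialisation are all assumptions chosen to keep the example small; only the 6-way output (one Q-value per action) comes from the slides.

```python
import random

NUM_ACTIONS = 6  # one output per action, as in the slides


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for a fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu=False):
    """Apply a fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def q_values(state, hidden_layer, output_layer):
    """Map a state vector to one Q-value per action."""
    return dense(dense(state, hidden_layer, relu=True), output_layer)


rng = random.Random(0)
state_dim, hidden_dim = 27, 16  # hypothetical sizes for this toy example
hidden = init_layer(state_dim, hidden_dim, rng)
output = init_layer(hidden_dim, NUM_ACTIONS, rng)
q = q_values([0.5] * state_dim, hidden, output)
best_action = max(range(NUM_ACTIONS), key=lambda a: q[a])
```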
Hierarchical Object Detection Model
Training
Training
Exploration-Exploitation dilemma
ε-greedy policy
Exploration: With probability ε the agent performs a random action
Exploitation: With probability 1-ε the agent performs the action associated with the highest Q(s,a)
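The ε-greedy policy can be sketched in a few lines. The function name and the `rng` parameter are assumptions for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # exploration
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploitation


greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
random_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0)
```

A common choice, not stated in the slides, is to start with a large ε and decrease it as training progresses.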
Training
Experience Replay
The Bellman equation learns from transitions of the form (s,a,r,s’). Consecutive experiences are highly correlated, which leads to inefficient training.
Experience replay collects a buffer of experiences, and the algorithm randomly samples mini-batches from this replay memory to train the network.
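A replay memory of this kind can be sketched as a bounded buffer with uniform random sampling. The class name `ReplayMemory`, the capacity, and the extra `terminal` flag stored with each transition are assumptions for this example.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (s, a, r, s', terminal) transitions."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random, breaking temporal correlation."""
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=3, seed=0)
for step in range(5):
    memory.push(step, 0, 1.0, step + 1, False)
batch = memory.sample(2)
```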
Experiments
Visualizations
These results were obtained with the Image-Zooms model, which yielded better results.
We observe that the model approaches the object, but that the final bounding box is not accurate.
Experiments
We calculate an upper-bound and a baseline experiment with the hierarchies, and observe that both are very limited in terms of recall.
The Image-Zooms model achieves better precision-recall results.
Experiments
Most of our agent's searches for objects finish within just 1, 2 or 3 steps, so the agent requires very few steps to approach objects.
Conclusions
Conclusions
● The Image-Zooms model yields better results. We argue that with the ROI-pooling approach we do not have as much resolution as with the Image-Zooms features. Although Image-Zooms is more computationally intensive, we can afford it because the agent approaches the object in just a few steps.
● Our agent approaches the object, but the final bounding box is not accurate enough because the hierarchy limits our space of solutions. A solution could be to train a regressor that adjusts the bounding box to the target object.
Acknowledgements
Technical Support Financial Support
Albert Gil (UPC)
Josep Pujal (UPC)
Carlos Tripiana (BSC)
Thank you for your attention!

Más contenido relacionado

La actualidad más candente

Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement LearningUsman Qayyum
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Manohar Mukku
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial NetworksMark Chang
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imagingCheng-Bin Jin
 
Image anomaly detection with generative adversarial networks
Image anomaly detection with generative adversarial networksImage anomaly detection with generative adversarial networks
Image anomaly detection with generative adversarial networksSakshiSingh480
 
Causal discovery and prediction mechanisms
Causal discovery and prediction mechanismsCausal discovery and prediction mechanisms
Causal discovery and prediction mechanismsShiga University, RIKEN
 
Introduction to Multi-armed Bandits
Introduction to Multi-armed BanditsIntroduction to Multi-armed Bandits
Introduction to Multi-armed BanditsYan Xu
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!J On The Beach
 
Deep learning-for-pose-estimation-wyang-defense
Deep learning-for-pose-estimation-wyang-defenseDeep learning-for-pose-estimation-wyang-defense
Deep learning-for-pose-estimation-wyang-defenseWei Yang
 
Efficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter SharingEfficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter SharingJinwon Lee
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionAmar Jindal
 
M.Sc. Thesis - Automatic People Counting in Crowded Scenes
M.Sc. Thesis - Automatic People Counting in Crowded ScenesM.Sc. Thesis - Automatic People Counting in Crowded Scenes
M.Sc. Thesis - Automatic People Counting in Crowded ScenesAhmed Gad
 

La actualidad más candente (20)

Object tracking
Object trackingObject tracking
Object tracking
 
Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement Learning
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
GAN in medical imaging
GAN in medical imagingGAN in medical imaging
GAN in medical imaging
 
Image anomaly detection with generative adversarial networks
Image anomaly detection with generative adversarial networksImage anomaly detection with generative adversarial networks
Image anomaly detection with generative adversarial networks
 
Causal discovery and prediction mechanisms
Causal discovery and prediction mechanismsCausal discovery and prediction mechanisms
Causal discovery and prediction mechanisms
 
Introduction to Multi-armed Bandits
Introduction to Multi-armed BanditsIntroduction to Multi-armed Bandits
Introduction to Multi-armed Bandits
 
Expectation maximization
Expectation maximizationExpectation maximization
Expectation maximization
 
Bayesian network
Bayesian networkBayesian network
Bayesian network
 
Uncertainty in Deep Learning
Uncertainty in Deep LearningUncertainty in Deep Learning
Uncertainty in Deep Learning
 
Bayesian networks
Bayesian networksBayesian networks
Bayesian networks
 
Real-time object detection coz YOLO!
Real-time object detection coz YOLO!Real-time object detection coz YOLO!
Real-time object detection coz YOLO!
 
Deep learning-for-pose-estimation-wyang-defense
Deep learning-for-pose-estimation-wyang-defenseDeep learning-for-pose-estimation-wyang-defense
Deep learning-for-pose-estimation-wyang-defense
 
Efficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter SharingEfficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter Sharing
 
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
Graph Based Pattern Recognition
Graph Based Pattern RecognitionGraph Based Pattern Recognition
Graph Based Pattern Recognition
 
M.Sc. Thesis - Automatic People Counting in Crowded Scenes
M.Sc. Thesis - Automatic People Counting in Crowded ScenesM.Sc. Thesis - Automatic People Counting in Crowded Scenes
M.Sc. Thesis - Automatic People Counting in Crowded Scenes
 

Similar a Hierarchical Object Detection with Deep Reinforcement Learning

Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningKhaled Saleh
 
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...Universitat Politècnica de Catalunya
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Universitat Politècnica de Catalunya
 
Motion and tracking
Motion and trackingMotion and tracking
Motion and trackingpotaters
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overviewNatalia Díaz Rodríguez
 
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Jisu Han
 
Object Discovery using CNN Features in Egocentric Videos
Object Discovery using CNN Features in Egocentric VideosObject Discovery using CNN Features in Egocentric Videos
Object Discovery using CNN Features in Egocentric VideosMarc Bolaños Solà
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Deep reinforcement learning from scratch
Deep reinforcement learning from scratchDeep reinforcement learning from scratch
Deep reinforcement learning from scratchJie-Han Chen
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement LearningDongHyun Kwak
 
Deep Learning in Robotics
Deep Learning in RoboticsDeep Learning in Robotics
Deep Learning in RoboticsSungjoon Choi
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningNAVER Engineering
 
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...Numenta
 
Wang midterm-defence
Wang midterm-defenceWang midterm-defence
Wang midterm-defenceZhipeng Wang
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
 
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...Exploiting User Interaction and Object Candidates for Instance Retrieval and ...
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...Universitat Politècnica de Catalunya
 
Reinforcement learning
Reinforcement learning Reinforcement learning
Reinforcement learning Chandra Meena
 

Similar a Hierarchical Object Detection with Deep Reinforcement Learning (20)

Active Object Localization with Deep Reinforcement Learning
Active Object Localization with Deep Reinforcement LearningActive Object Localization with Deep Reinforcement Learning
Active Object Localization with Deep Reinforcement Learning
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
 
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...
Reinforcement Learning (DLAI D7L2 2017 UPC Deep Learning for Artificial Intel...
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
 
Motion and tracking
Motion and trackingMotion and tracking
Motion and tracking
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overview
 
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
Preference learning for guiding the tree searches in continuous POMDPs (CoRL ...
 
Object Discovery using CNN Features in Egocentric Videos
Object Discovery using CNN Features in Egocentric VideosObject Discovery using CNN Features in Egocentric Videos
Object Discovery using CNN Features in Egocentric Videos
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Deep reinforcement learning from scratch
Deep reinforcement learning from scratchDeep reinforcement learning from scratch
Deep reinforcement learning from scratch
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Deep Learning in Robotics
Deep Learning in RoboticsDeep Learning in Robotics
Deep Learning in Robotics
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
Brains@Bay Meetup: The Increasing Role of Sensorimotor Experience in Artifici...
 
Wang midterm-defence
Wang midterm-defenceWang midterm-defence
Wang midterm-defence
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
 
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...Exploiting User Interaction and Object Candidates for Instance Retrieval and ...
Exploiting User Interaction and Object Candidates for Instance Retrieval and ...
 
Reinforcement learning
Reinforcement learning Reinforcement learning
Reinforcement learning
 

Más de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Más de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Hierarchical Object Detection with Deep Reinforcement Learning

  • 1. Hierarchical Object Detection with Deep Reinforcement Learning NIPS 2016 Workshop on Reinforcement Learning [github] [arXiv] Míriam Bellver, Xavier Giró i Nieto, Ferran Marqués, Jordi Torres
  • 2. Outline ● Introduction ● Related Work ● Hierarchical Object Detection Model ● Experiments ● Conclusions
  • 4. Introduction We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. (Figure label: OBJECT FOUND)
  • 7. Introduction What is Reinforcement Learning? “a way of programming agents by reward and punishment without needing to specify how the task is to be achieved” [Kaelbling, Littman, & Moore, 96]
  • 8. Introduction Reinforcement Learning ● There is no supervisor, only a reward signal ● Feedback is delayed, not instantaneous ● Time really matters (sequential, non-i.i.d. data) Slide credit: UCL Course on RL by David Silver
  • 9. Introduction Reinforcement Learning An agent that is a decision-maker interacts with the environment and learns through trial and error. We model the decision-making process as a Markov Decision Process. Slide credit: UCL Course on RL by David Silver
  • 11. Introduction Contributions: ● Hierarchical object detection in images using a deep reinforcement learning agent ● We define two different hierarchies of regions ● We compare two different strategies to extract features for each candidate proposal to define the state ● We find objects by analyzing just a few regions
  • 13. Related Work Deep Reinforcement Learning. Examples: ATARI 2600 and AlphaGo. References: Mnih, V. (2013). Playing Atari with deep reinforcement learning. Silver, D. (2016). Mastering the game of Go with deep neural networks and tree search.
  • 15. Related Work Object Detection. Families of detectors: ● Region Proposals/Sliding Window + Detector ● Sharing convolutions over locations + Detector ● Sharing convolutions over locations and also with the detector ● Single-shot detectors. Slide annotations: they rely on a large number of locations; they rely on a number of reference boxes from which bounding boxes (bbs) are regressed. References: Uijlings, J. R. (2013). Selective search for object recognition. Girshick, R. (2015). Fast R-CNN. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN. Redmon, J. (2015). YOLO. Liu, W. (2015). SSD.
  • 17. Related Work So far we can cluster object detection pipelines based on how the analyzed regions are obtained: ● Using object proposals ● Using reference boxes (“anchors”) to be potentially regressed. There is a third approach: ● Approaches that iteratively refine one initial bounding box (AttentionNet, Active Object Localization with DRL)
  • 18. Related Work Refinement of bounding box predictions AttentionNet: They cast an object detection problem as an iterative classification problem. Each category corresponds to a weak direction pointing to the target object. Reference: Yoo, D. (2015). AttentionNet: Aggregating weak directions for accurate object detection.
  • 19. Related Work Refinement of bounding box predictions Active Object Localization with Deep Reinforcement Learning. Reference: Caicedo, J. C., & Lazebnik, S. (2015). Active object localization with deep reinforcement learning.
  • 20. Hierarchical Object Detection Model Reinforcement Learning formulation
  • 21. Reinforcement Learning Formulation We cast the problem as a Markov Decision Process
  • 22. Reinforcement Learning Formulation We cast the problem as a Markov Decision Process State: The agent decides which action to choose based on the concatenation of: ● a visual description of the currently observed region ● a history vector that encodes the past actions performed
  • 23. Reinforcement Learning Formulation We cast the problem as a Markov Decision Process Actions: Two kinds of actions: ● movement actions: to which of the 5 possible regions defined by the hierarchy to move ● terminal action: the agent indicates that the object has been found
  • 24. Reinforcement Learning Formulation Hierarchies of regions For the first kind of hierarchy, fewer steps are required to reach a certain scale of bounding boxes, but the space of possible regions is smaller. (Figure label: trigger)
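  • 25. Reinforcement Learning Formulation Reward: one reward is defined for movement actions and another for the terminal action. (The two reward equations are figures on the slide and are not preserved in this transcript.)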
  • 26. Hierarchical Object Detection Model Q-learning
  • 27. Q-learning In Reinforcement Learning we want to obtain a function Q(s,a) that predicts the best action a in state s in order to maximize a cumulative reward. This function can be estimated using Q-learning, which iteratively updates Q(s,a) using the Bellman equation. The update combines the immediate reward with the discounted future reward (discount factor = 0.90).
  • 28. Q-learning What is deep reinforcement learning? It is when we estimate this Q(s,a) function by means of a deep network, with one output for each action. Figure credit: Nervana blog post about RL
  • 30. Model We tested two different configurations of feature extraction: ● Image-Zooms model: We extract features for every region observed ● Pool45-Crops model: We extract features once for the whole image, and ROI-pool features for each subregion
  • 31. Model Our RL agent is based on a Q-network. The input is: ● Visual description ● History vector. The output is: ● A fully connected (FC) layer of 6 neurons, giving the Q-value of each action
  • 32. Hierarchical Object Detection Model Training
  • 33. Training Exploration-Exploitation dilemma: ε-greedy policy. Exploration: With probability ε the agent performs a random action. Exploitation: With probability 1-ε the agent performs the action associated with the highest Q(s,a).
  • 34. Training Experience Replay The Bellman equation learns from transitions of the form (s,a,r,s’). Consecutive experiences are highly correlated, which leads to inefficient training. Experience replay collects a buffer of experiences, and the algorithm randomly draws mini-batches from this replay memory to train the network.
  • 36. Visualizations These results were obtained with the Image-Zooms model, which yielded better results. We observe that the model approaches the object, but that the final bounding box is not accurate.
  • 37. Experiments We calculate an upper-bound and a baseline experiment with the hierarchies, and observe that both are very limited in terms of recall. The Image-Zooms model achieves better precision-recall results.
  • 38. Experiments Most of our agent's searches for objects finish in just 1, 2 or 3 steps, so the agent requires very few steps to approach objects.
  • 40. Conclusions ● The Image-Zooms model yields better results. We argue that with the ROI-pooling approach we do not have as much resolution as with the Image-Zooms features. Although Image-Zooms is more computationally intensive, we can afford it because the agent approaches the object in just a few steps. ● Our agent approaches the object, but the final bounding box is not accurate enough because the hierarchy limits our space of solutions. A solution could be training a regressor that adjusts the bounding box to the target object.
  • 41. Acknowledgements Technical Support: Albert Gil (UPC), Josep Pujal (UPC), Carlos Tripiana (BSC). Financial Support: (logos not preserved in this transcript)
  • 42. Thank you for your attention!
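
Slide 27 names the Bellman equation and labels its terms (immediate reward, future reward, discount factor = 0.90), but the equation itself is a figure that did not survive in the transcript. As a hedged reconstruction, the standard Q-learning target is written below; the symbols r, s', a' and γ are the usual textbook notation and are assumed here, not copied from the slide.

```latex
% Standard Q-learning target (assumed notation):
%   r                          immediate reward
%   \gamma \max_{a'} Q(s',a')  discounted future reward
Q(s, a) \leftarrow r + \gamma \max_{a'} Q(s', a'), \qquad \gamma = 0.90
```

When the action taken is terminal there is no next state, so the target is commonly taken to be the immediate reward r alone.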
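
The ε-greedy policy of slide 33 can be illustrated with a minimal sketch. It assumes the 6 Q-values of slide 31 (5 movement actions plus the terminal action) arrive as a plain Python list; the function name `epsilon_greedy` and the constant `NUM_ACTIONS` are hypothetical and not taken from the authors' code.

```python
import random

NUM_ACTIONS = 6  # 5 movement actions + 1 terminal action, as stated on the slides


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index chosen with an epsilon-greedy policy.

    q_values: list of Q-values, one per action (hypothetical input format).
    epsilon:  probability of exploring with a uniformly random action.
    """
    if rng.random() < epsilon:
        # Exploration: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

A typical choice, assumed here and not stated on the slides, is to start with a large ε and decrease it as training progresses, so that the agent explores early and exploits later.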
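
Experience replay (slide 34) can be sketched as a bounded buffer of transitions from which random mini-batches are drawn, with the training target of each sampled transition computed from the Bellman equation of slide 27. The class and function names, the extra `terminal` flag in each transition, and the `q_function` callable are illustrative assumptions; only the (s,a,r,s') transition format and the discount factor 0.90 come from the slides.

```python
import random
from collections import deque

GAMMA = 0.90  # discount factor quoted on slide 27


class ReplayMemory:
    """Fixed-size buffer of (s, a, r, s', terminal) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size, rng=random):
        # Random sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def q_targets(batch, q_function, gamma=GAMMA):
    """Compute the Bellman target for the action taken in each transition.

    q_function(state) is a hypothetical callable that returns a list of
    Q-values, one per action, standing in for the Q-network.
    """
    targets = []
    for _state, _action, reward, next_state, terminal in batch:
        if terminal:
            # No future reward after the terminal action.
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(q_function(next_state)))
    return targets
```

In a training loop, the network would then be fitted so that its output for the taken action moves toward the corresponding target, leaving the other outputs unchanged.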