PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

bit.ly/PathGAN
@DocXavi
EPIC
Third International
Workshop on
Egocentric Perception,
Interaction and
Computing
Munich, Germany
9th September, 2018
PathGAN:
Visual Scanpath Prediction with
Generative Adversarial Networks
@DocXavi
Marc
Assens
Kevin
McGuinness
Noel E.
O’Connor
Xavier
Giro-i-Nieto

bit.ly/PathGAN
@DocXavi
2
Let’s play a
game!

bit.ly/PathGAN
@DocXavi
The importance of visual attention

bit.ly/PathGAN
@DocXavi
Why don’t we see the changes?
We don’t really see the whole image
We only focus on small specific regions: the salient parts
Human beings reliably attend to the same regions of images
when shown

bit.ly/PathGAN
@DocXavi
What we perceive

bit.ly/PathGAN
@DocXavi
Where we look

bit.ly/PathGAN
@DocXavi
What we actually see

bit.ly/PathGAN
@DocXavi
Why predicting WHEREhumans will look?

bit.ly/PathGAN
@DocXavi
15Johanna Närväinen, Janne Laine, “Looking through their eyes” VTT Impulse 2016.

bit.ly/PathGAN
@DocXavi
16
Yashas Rai, Jesús Gutiérrez, and Patrick Le Callet. 2017. “A Dataset of Head and Eye
Movements for 360 Degree Images“. MMSys 2017

bit.ly/PathGAN
@DocXavi
Can we predict WHEREhumans will look?
Yes! We “just” need to collect data...

bit.ly/PathGAN
@DocXavi
Can we predict WHEREhumans will look?
Yes! We “just” need to collect data...
...and train a deep neural network !

bit.ly/PathGAN
@DocXavi
Image
(input)
Visual
scanpath
(ouput)

bit.ly/PathGAN
@DocXavi
Image
(input)
Saliency
map
(ouput)

bit.ly/PathGAN
@DocXavi
Our brief history of saliency models
Eva
Mohedano
Panagiotis
Linardos
Cristian
Canton
Junting
Pan
Èric
Arazo
Marta
Coll
Elisa
Sayrol
Jordi
Torres

bit.ly/PathGAN
@DocXavi
JuntingNet
Saliency maps

bit.ly/PathGAN
@DocXavi
25
Upsample
+ filter
2D
map
J. Pan, X. Giró-i-Nieto, “End-to-end Convolutional Network for Saliency Prediction” (CVPRW 2015)
JuntingNet

bit.ly/PathGAN
@DocXavi
SalNetJuntingNet
Saliency maps

bit.ly/PathGAN
@DocXavi
28
VGG-M
Pan, Junting, Kevin McGuinness, Elisa Sayrol, Xavier Giro-i-Nieto, and Noel E. O'Connor. "Shallow and
deep convolutional networks for saliency prediction." CVPR 2016.
SalNet

bit.ly/PathGAN
@DocXavi
SalNetJuntingNet SalGAN
Saliency maps

bit.ly/PathGAN
@DocXavi
32
Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier
Giro-i-Nieto. “SalGAN: Visual Saliency Prediction with Generative Adversarial Networks.” CVPRW 2017.
Generator
SalGAN
Discriminator

bit.ly/PathGAN
@DocXavi
SaltiNet
Saliency maps
Scanpaths

bit.ly/PathGAN
@DocXavi
35
Stochastic sampling
SalTiNet
Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto and Noel E. O’Connor. “SaltiNet: Scan-Path
Prediction on 360 Degree Images Using Saliency Volumes.” ICCVW 2017.

bit.ly/PathGAN
@DocXavi
36
Winners of IEEE ICME
2017 Challenge
Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto and Noel E. O’Connor. “SaltiNet: Scan-Path
Prediction on 360 Degree Images Using Saliency Volumes.” ICCVW 2017.

bit.ly/PathGAN
@DocXavi
SaltiNet PathGAN
Saliency maps
Scanpaths

bit.ly/PathGAN
@DocXavi
40
Create a model able to predict visual scanpaths
PathGAN solution:
Generate a sequence of fixation points with
Recurrent Neural Networks (RNNs).
Challenge:

bit.ly/PathGAN
@DocXavi
41
Recurrent Neural Network (RNN)
Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no. 8
(1997): 1735-1780.

bit.ly/PathGAN
@DocXavi
43Credit: Santiago Pascual [slides] [video]

bit.ly/PathGAN
@DocXavi
44
● Pick a real scanpath x from training set
● Show x to D and update weights to
output 1 (real)
Adversarial training in a nutshell

bit.ly/PathGAN
@DocXavi
45
● Pick a real scanpath x from training set
● Show x to D and update weights to predict “real” (1).

bit.ly/PathGAN
@DocXavi
46
● G generates a synthetic scanpath ẍ (adding noise & drop out)
● Show synthetic scanpath ẍ to D and update its weights to predict “fake” (0).

bit.ly/PathGAN
@DocXavi
47
● Freeze D weights
● Keep showing the same synthetic scanpath ẍ to D
● Update G weights to make D predict “real” (1) (only G weights!)
● Unfreeze D Weights and repeat

bit.ly/PathGAN
@DocXavi
48
Adversarial Loss for cGAN
Content Loss
Multiobjective Loss for G

bit.ly/PathGAN
@DocXavi
49
Training curves

bit.ly/PathGAN
@DocXavi
50
Model Architecture (cGAN)
Mirza, Mehdi, and Simon Osindero. "Conditional generative adversarial nets." arXiv preprint
arXiv:1411.1784 (2014).

bit.ly/PathGAN
@DocXavi
51

bit.ly/PathGAN
@DocXavi
52

bit.ly/PathGAN
@DocXavi
53
Qualitative Results: iSUN

bit.ly/PathGAN
@DocXavi
54
Quantitative Results: iSUN

bit.ly/PathGAN
@DocXavi
55
Qualitative Results: Salient360(2017)

bit.ly/PathGAN
@DocXavi
56
Quantitative Results: Salient360(2017)

bit.ly/PathGAN
@DocXavi
59
Next: What about video ?
Check our other poster presented by Eva Mohedano !
Panagiotis Linardos, Eva Mohedano, Monica Cherto, Cathal Gurrin, Xavier Giro-i-Nieto. “Temporal
Saliency Adaptation in Egocentric Videos”, Extended abstract at the ECCV Workshop on Egocentric
Perception, Interaction and Computing (EPIC), 2018.
EgoMon datasetSalGAN + ConvLSTM Saliency maps

bit.ly/PathGAN
@DocXavi
Project page at:
http://bit.ly/pathgan
xavier.giro@upc.edu
Danke
schön !

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

Recomendados

Recomendados

Más contenido relacionado

Similar a PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

Similar a PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks (20)

Más de Universitat Politècnica de Catalunya

Más de Universitat Politècnica de Catalunya (20)

Último

Último (20)

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks