https://telecombcn-dl.github.io/dlai-2020/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand-new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
2. Acknowledgements
Santiago Pascual
santi.pascual@upc.edu
@santty128
PhD
Universitat Politècnica de Catalunya
Technical University of Catalonia
Albert Pumarola
apumarola@iri.upc.edu
@AlbertPumarola
PhD Student
Universitat Politècnica de Catalunya
Technical University of Catalonia
Kevin McGuinness
kevin.mcguinness@dcu.ie
Research Fellow
Insight Centre for Data Analytics
Dublin City University
4. Today
● We are going to make a shift in the modeling paradigm:
discriminative → generative.
● Is this a type of neural network? No → fully connected networks,
convolutional networks, or recurrent networks all fit, depending on
how we condition the data points (sequentially, globally, etc.).
● This is a very fun topic with which we can make a network
paint, sing or write.
6. What we are used to doing with Neural Nets
Slide credit: Albert Pumarola (UPC 2019)
[Figure: examples of discriminative tasks: text classification (probability of being a potential customer), image classification (Jim Carrey), audio tasks (what language?, language translation)]
Discriminative Modeling: P(y|x)
7. What are generative models?
Slide credit: Albert Pumarola (UPC 2019)
[Figure: the discriminative tasks from the previous slide (text classification, image classification, audio language identification and translation) alongside generative ones: a language model writing text (“What about Ron magic?” offered Ron. To Harry, Ron was loud, slow and soft bird. Harry did not like to think about birds.) and MuseNet composing and interpreting music]
Discriminative Modeling: P(y|x)
Generative Modeling: P(x)
8. What are generative models?
Slide credit: Albert Pumarola (UPC 2019)
“Model the data distribution so that we can sample new samples out of the distribution”
[Figure: learn a modeled data space from the training dataset, then draw generated samples out of it]
9. What are generative models?
Slide credit: Albert Pumarola (UPC 2019)
“Model the data distribution so that we can sample new samples out of the distribution”
[Figure: the same pipeline, now showing interpolated samples from the modeled data space]
10. What is a generative model?
1) We want our model, with parameters θ = {weights, biases}, to output samples distributed according to P_model, matching the distribution of our training data P_data (or P(X)).
2) We can then sample points from P_model that plausibly look P_data-distributed.
Note that we have not mentioned any network structure or model input data format, just a requirement on what we want at the output of our model.
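As a minimal illustration of this requirement (not from the slides: a toy 1-D Gaussian stands in for a real dataset, and maximum likelihood stands in for training), fitting the parameters θ of a simple P_model and then sampling from it looks like:

```python
import numpy as np

# Toy "training data": samples from an unknown P_data (here secretly N(2, 0.5)).
rng = np.random.default_rng(0)
x_data = rng.normal(loc=2.0, scale=0.5, size=10_000)

# The simplest generative model: theta = (mu, sigma), fit by maximum likelihood.
theta = {"mu": x_data.mean(), "sigma": x_data.std()}

# Sampling step: draw brand-new points from P_model.
x_gen = rng.normal(theta["mu"], theta["sigma"], size=10_000)
print(theta)
```

A neural generative model replaces the two-parameter Gaussian with a network, but the contract is the same: learn θ so that samples from P_model look P_data-distributed.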
11. What is a generative model?
We can generate any type of data, like speech waveform samples or image pixels.
Each sample has M dimensions. For example, scanning and unrolling a 32×32 image with 3 color channels gives a vector of M = 32×32×3 = 3072 dimensions.
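In code, the "scan and unroll" step is just a reshape; a quick NumPy sketch (the array contents are dummy zeros):

```python
import numpy as np

# A 32x32 RGB image: height x width x 3 color channels.
image = np.zeros((32, 32, 3))

# "Scan and unroll" the pixels into a single flat vector.
x = image.reshape(-1)
print(x.shape)  # (3072,)
```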
12. What is a generative model?
Our learned model should be able to make up new samples from the distribution, not just copy and paste existing samples!
Figure from NIPS 2016 Tutorial: Generative Adversarial Networks (I. Goodfellow)
14. Super-resolution
#TecoGAN Chu, M., Xie, Y., Mayer, J., Leal-Taixé, L., & Thuerey, N. (2020). Learning temporal coherence via self-supervision for GAN-based video generation. ACM Transactions on Graphics (TOG), 39(4), 75-1.
15. Image translation
#pix2pixHD Wang, T. C., Liu, M. Y., Zhu, J. Y., Tao, A., Kautz, J., & Catanzaro, B. (2018). High-resolution image synthesis and semantic manipulation with conditional GANs. CVPR 2018.
16. Image generation
#StyleGAN Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. CVPR 2019.
17. Image generation
#StarGANv2 Choi, Y., Uh, Y., Yoo, J., & Ha, J. W. (2020). StarGAN v2: Diverse image synthesis for multiple domains. CVPR 2020. [code]
18. Human Motion Transfer
#EDN Chan, C., Ginosar, S., Zhou, T., & Efros, A. A. (2019). Everybody dance now. ICCV 2019.
19. Speech Enhancement
Recover lost information / add enhancing details by learning the natural distribution of audio samples.
[Audio: original vs. enhanced sample]
20. Refinement of Synthetic Training Data
Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., & Webb, R. Learning from simulated and unsupervised
images through adversarial training. CVPR 2017 [Best paper award]
Refiner = Generator
28. Generator & Discriminator
We have two modules: the Generator (G) and the Discriminator (D).
● They “fight” against each other during training → Adversarial Learning.
● G’s mission: fool D into misclassifying.
● D’s mission: discriminate between G’s samples and real samples.
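The two modules can be sketched in PyTorch as two small networks (not from the slides: the layer sizes, the 100-dim latent, and the 784-dim flattened-image data space are illustrative assumptions):

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 100, 784  # e.g. 28x28 images, flattened (illustrative sizes)

# Generator G: latent vector z -> fake sample x_hat in data space.
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),   # outputs scaled to [-1, 1]
)

# Discriminator D: sample x -> probability that x is real.
D = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

z = torch.randn(16, latent_dim)   # a batch of latent codes
x_hat = G(z)                      # G's mission: make these fool D
p_real = D(x_hat)                 # D's mission: push these scores towards 0
print(x_hat.shape, p_real.shape)
```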
29. Generator & Discriminator
Generator network G → generates samples x̂ that follow a probability distribution P_g, and makes it close to P_data (the distribution of our dataset samples).
Q: What do we input to our model in order to make up new samples through a neural network?
[Figure: a generator network with an unknown input, producing x̂ → generated samples]
30. Generator & Discriminator
Generator network G → generates samples x̂ that follow a probability distribution P_g, and makes it close to P_data (the distribution of our dataset samples).
Q: What do we input to our model in order to make up new samples through a neural network?
A: We can sample from a simplistic prior, for example, a Gaussian distribution N(0, I).
[Figure: a latent vector z fed into the generator network, producing x̂ → generated samples]
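A toy NumPy sketch of this idea (the fixed, hand-written map below is a stand-in for a trained generator network, not any real model): any deterministic function g applied to Gaussian noise z induces some output distribution P_g; training adjusts g so that P_g approaches P_data.

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.standard_normal(size=(5, 2))   # 5 draws from the prior N(0, I), 2-D

# A fixed linear map + tanh standing in for a trained generator network.
W = np.array([[1.0, -0.5],
              [0.3,  2.0]])

def g(z):
    # Deterministic mapping: pushes the Gaussian prior into some other P_g.
    return np.tanh(z @ W)

x_hat = g(z)
print(x_hat.shape)
```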
31. Generator & Discriminator
Q: How can we evaluate the quality of the generated samples?
A: With another network trained to do so.
Discriminator network D → outputs the probability that a sample came from the training data rather than from G.
[Figure: a sample x fed into the discriminator network, which answers “Real cat? Yes or No”]
33. Example: DCGAN for image generation
[Figure: the DCGAN generator architecture]
#DCGAN Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. ICLR 2016.
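A scaled-down DCGAN-style generator sketch in PyTorch (the channel counts and the 32×32 output size are illustrative assumptions, not the paper's exact configuration): project the latent code, then upsample with fractionally-strided convolutions, BatchNorm, and ReLU, with Tanh at the output.

```python
import torch
import torch.nn as nn

G = nn.Sequential(
    nn.ConvTranspose2d(100, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),  # 1x1 -> 4x4
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),    # 4x4 -> 8x8
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),     # 8x8 -> 16x16
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),                          # 16x16 -> 32x32
)

z = torch.randn(8, 100, 1, 1)   # latent vectors treated as 1x1 "images"
imgs = G(z)
print(imgs.shape)  # torch.Size([8, 3, 32, 32])
```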
34. Outline
1. Generative Models
2. Motivating Applications
3. Generative Adversarial Networks (GANs)
a. Generator & Discriminator Networks
b. Adversarial Training
35. Adversarial Training Analogy: is it fake money?
Imagine we have a counterfeiter (G) trying to make fake money, and the police (D) have to detect whether money is real or fake.
[Figure: a counterfeit $100 bill] D: FAKE: it’s not even green.
36. Adversarial Training Analogy: is it fake money?
Imagine we have a counterfeiter (G) trying to make fake money, and the police (D) have to detect whether money is real or fake.
[Figure: a counterfeit $100 bill] D: FAKE: there is no watermark.
37. Adversarial Training Analogy: is it fake money?
Imagine we have a counterfeiter (G) trying to make fake money, and the police (D) have to detect whether money is real or fake.
[Figure: a counterfeit $100 bill] D: FAKE: the watermark should be rounded.
38. Adversarial Training Analogy: is it fake money?
Imagine we have a counterfeiter (G) trying to make fake money, and the police (D) have to detect whether money is real or fake.
After enough iterations, and if the counterfeiter is good enough (in terms of the G network, this means “has enough parameters”), the police should be confused.
[Figure: a counterfeit $100 bill] D: REAL? FAKE?
40. Adversarial Training: Discriminator
1. Fix generator weights, draw samples from both real world and generated images
2. Train discriminator to distinguish between real world and generated images
[Figure: a latent random variable feeds the Generator to produce fake samples; real-world images provide real samples; the Discriminator classifies each sample as Real or Fake, and the loss is backpropagated to update the discriminator weights]
Figure: Kevin McGuinness
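The discriminator step can be sketched in PyTorch on toy 2-D data (all network sizes and the shifted-Gaussian "real" data are illustrative assumptions):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy setup so the update step is visible end to end.
G = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1), nn.Sigmoid())
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

x_real = torch.randn(64, 2) + 3.0        # stand-in for real-world samples
z = torch.randn(64, 10)                  # latent random variable
x_fake = G(z).detach()                   # 1) generator weights stay fixed (detach)

# 2) train D: real samples -> label 1, generated samples -> label 0
loss_D = bce(D(x_real), torch.ones(64, 1)) + bce(D(x_fake), torch.zeros(64, 1))
opt_D.zero_grad()
loss_D.backward()                        # backprop error to update D's weights only
opt_D.step()
print(float(loss_D))
```

The `detach()` is what "fix generator weights" means in practice: no gradient flows back into G during this step.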
41. Adversarial Training: Generator
1. Fix discriminator weights
2. Sample from generator
3. Backprop error through discriminator to update generator weights
[Figure: the same pipeline, but now the loss is backpropagated through the frozen Discriminator to update the generator weights]
Figure: Kevin McGuinness
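A matching PyTorch sketch of the generator step (toy sizes are illustrative assumptions): G is trained with "real" labels, so the gradient through D pushes D(G(z)) towards 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)  # 1) only G's parameters get updated
bce = nn.BCELoss()

z = torch.randn(64, 10)                  # 2) sample from the generator's prior
x_fake = G(z)                            #    (no detach this time)

# 3) backprop through D into G: G is rewarded when D labels its samples "real" (1)
loss_G = bce(D(x_fake), torch.ones(64, 1))
opt_G.zero_grad()
loss_G.backward()                        # error flows through D down to G's weights
opt_G.step()
print(float(loss_G))
```

"Fix discriminator weights" is enforced here by giving the optimizer only G's parameters: gradients still flow through D, but only G's weights change.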
43. Adversarial Training: Why does it work?
#SRGAN Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., ... & Shi, W. (2017). Photo-realistic single
image super-resolution using a generative adversarial network. CVPR 2017.
44. Adversarial Training: How to make it work?
Soumith Chintala, “How to train a GAN? Tips and tricks to make GANs work”. GitHub 2016.
NeurIPS Barcelona 2016
45. Outline
1. Generative Models
2. Motivating Applications
3. Generative Adversarial Networks (GANs)
a. Generator & Discriminator Networks
b. Adversarial Training
c. Conditional GANs
d. GANs at UPC
50. Conditional GANs (cGAN)
#pix2pix Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-image translation with conditional adversarial
networks." CVPR 2017.
51. Conditional GANs (cGAN)
#pix2pix Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-image translation with conditional adversarial
networks." CVPR 2017.
[Figure: the Generator produces generated pairs; real-world ground-truth pairs come from the dataset; the Discriminator scores pairs and produces the loss]
Figure: Kevin McGuinness
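A minimal sketch of the conditioning mechanism (all sizes are illustrative assumptions, and pix2pix itself uses image-to-image convolutional networks rather than these toy MLPs): both G and D receive the condition y, so D judges (sample, condition) pairs rather than samples alone.

```python
import torch
import torch.nn as nn

cond_dim, latent_dim, data_dim = 5, 10, 2  # illustrative sizes

# Both networks take the condition y concatenated to their usual input.
G = nn.Sequential(nn.Linear(latent_dim + cond_dim, 32), nn.ReLU(),
                  nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim + cond_dim, 32), nn.LeakyReLU(0.2),
                  nn.Linear(32, 1), nn.Sigmoid())

y = torch.randn(8, cond_dim)               # the condition (e.g. a label or input image)
z = torch.randn(8, latent_dim)
x_hat = G(torch.cat([z, y], dim=1))        # generator conditioned on y
p_pair = D(torch.cat([x_hat, y], dim=1))   # discriminator scores the pair (x_hat, y)
print(x_hat.shape, p_pair.shape)
```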
52. Outline
1. Generative Models
2. Motivating Applications
3. Generative Adversarial Networks (GANs)
a. Generator & Discriminator Networks
b. Adversarial Training
c. Conditional GANs
d. GANs at UPC
53. GANs @ UPC: Pioneers
“How to train a GAN” tutorial, NeurIPS Barcelona 2016.
Víctor Garcia, Santi Pascual, Junting Pan
55. GANs @ UPC: Speech denoising
#SEGAN Pascual, Santiago, Antonio Bonafonte, and Joan Serra. "SEGAN: Speech enhancement generative
adversarial network." Interspeech 2017.
56. GANs @ UPC: Visual Saliency prediction
#SalGAN Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O’Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. “SalGAN: Visual Saliency Prediction with Generative Adversarial Networks.” CVPRW 2017.
[Figure: SalGAN’s Generator and Discriminator]
57. GANs @ UPC: Visual Saliency prediction
#PathGAN Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto and Noel E. O’Connor. “PathGan: Visual
Scan-path Prediction with Generative Adversarial Networks“ ECCV EPIC Workshop 2018.
58. GANs @ UPC: Face Hallucination from Speech
#Wav2Pix Amanda Duarte, Francisco Roldan, Miquel Tubau, Janna Escur, Santiago Pascual, Amaia
Salvador, Eva Mohedano et al. “Wav2Pix: Speech-conditioned Face Generation using Generative
Adversarial Networks” ICASSP 2019
Generated faces conditioned on spoken utterances of known YouTubers.
59. GANs @ UPC: Human Motion Transfer
Ventura, L., Duarte, A., Giró-i-Nieto, X. “Can everybody sign now? Exploring sign language video generation from 2D poses”. ECCV Workshop 2020.
60. GANs @ UPC: GANimation
#GANimation Pumarola, A., Agudo, A., Martinez, A. M., Sanfeliu, A., & Moreno-Noguer, F. Ganimation:
Anatomically-aware facial animation from a single image. ECCV 2018. (Best paper honourable mention)
61. GANs @ UPC: GANimation
#GANimation Pumarola, A., Agudo, A., Martinez, A. M., Sanfeliu, A., & Moreno-Noguer, F. Ganimation:
Anatomically-aware facial animation from a single image. ECCV 2018. (Best paper honourable mention)
62. GANs @ UPC: Cloth Transfer
Pumarola, A., Goswami, V., Vicente, F., De la Torre, F., & Moreno-Noguer, F. (2019). Unsupervised image-to-video
clothing transfer. ICCV Workshops 2019.
63. Outline
1. Generative Models
2. Motivating Applications
3. Generative Adversarial Networks (GANs)
a. Generator & Discriminator Networks
b. Adversarial Training
c. Conditional GANs
d. GANs at UPC
64. Learn more
Victòria Miró / Oriol Esteve, “La meitat de les notícies que consumirem el 2022 seran falses” [“Half of the news we consume in 2022 will be fake”]. TN Vespre, TV3 (November 26, 2017).