Speaker: Pierre Richemond, Data Science Institute of Imperial College
Title: Cutting edge generative models: Applications and implications
Abstract: This talk will examine recent developments in deep learning content generation at scale. Whether it be images or text, the latest methods have now reached a level of quality making it hard to discriminate between human- and AI-generated content. We will review recent examples of such generative models, and put their significance in a broader context, in light of such powerful tools’ potential for dual use.
Bio: Pierre is currently researching his PhD in deep reinforcement learning at the Data Science Institute of Imperial College. He also teaches Deep Learning at the Graduate School, helps to run the Deep Learning Network, and organises thematic reading groups. His background is in mathematics: he studied electrical engineering at ENST, probability theory and stochastic processes at Université Paris VI and École Polytechnique, and business management at HEC.
4. Background : everything old is new again
• Advent of ’modern’ deep learning (resnets, batchnorm) : 2015
(arbitrary depth + large scale training)
• If stabilized, deeper is better.
• Invention of the neural network training rule : 1970
(Linnainmaa’s master’s thesis on backpropagation, later applied
to neural networks)
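The core idea, reverse-mode differentiation through a chain of operations, can be sketched on a single composite function (a toy illustration with a made-up function f, not Linnainmaa's original formulation):

```python
import math

def f(x):
    # forward pass: y = sin(x^2), keeping the intermediate value u
    u = x * x
    y = math.sin(u)
    return y, u

def grad_f(x):
    # reverse pass: apply the chain rule backwards through the computation
    _, u = f(x)
    dy_du = math.cos(u)   # derivative of sin(u) w.r.t. u
    du_dx = 2.0 * x       # derivative of x^2 w.r.t. x
    return dy_du * du_dx  # dy/dx = cos(x^2) * 2x

# sanity check against a central finite difference
x, h = 0.7, 1e-6
numeric = (f(x + h)[0] - f(x - h)[0]) / (2 * h)
print(abs(grad_f(x) - numeric) < 1e-6)  # True
```

Modern automatic differentiation software generalises this bookkeeping from one function to arbitrary computation graphs.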
6. Why does it work now ?
• In between : GPUs, fibre broadband, handheld megapixel
cameras, automatic differentiation software.
• All contributed to the explosion of applications we see today.
• Deep networks tend to work better, automatically finding good
features for classifying and explaining natural data:
images/video and sound.
• Other domains like text, structured data (chemistry...) can work
too with careful network design.
7. Deep Learning breakthroughs - 2014
• VGG architecture
• DeepDream
• First formulation of GANs
• First formulation of VAEs
• Black-box Variational Inference
• Neural Turing Machines
• Attention mechanisms
8. Deep Learning breakthroughs - 2015
• Residual Networks
• Batch Normalization
• ADAM optimizer
• ELU activation function
• Neural style transfer
• Graph convolutional networks
• Visual question answering (Karpathy)
• Normalizing Flows
• Deep Q-learning on Atari games
• Keras, TensorFlow
10. Deep Learning breakthroughs - 2017
• Wasserstein GAN
• Progressive Growing of GANs
• Capsule Networks
• Restarted SGD - cyclical learning rates
• AlphaGo Zero (MCTS as policy improvement ; no human
knowledge involved)
• Distributional Reinforcement Learning
• Equivalence of softmax Q-learning and entropic policy gradients
• TensorFlow distributions, Pyro
11. Deep Learning breakthroughs - 2018
• Spectral Normalization for GANs
• Fast initialization of convolutional neural networks
• Deep Video Portraits
• Differentiable neural architecture search
• OpenAI Five
• Video-to-video synthesis
• StyleGAN
12. A success story : Generative models
• Generative modelling aims to automatically explain the features
of datapoints in a dataset, and to generate new instances
• Two main approaches, both of which use 2 different neural
networks instead of 1.
• Applications of deep learning to this field are only about 4 years old
• A success story of engineering
13. Generative models : GANs
Figure 1: Generative Adversarial Networks (GANs). A generator
(counterfeiter) and a discriminator (police) network play an adversarial
game, whose Nash equilibrium is perfect replication of the data distribution.
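A minimal sketch of the adversarial value function at that equilibrium (the single-number stand-ins for the expectations below are an illustrative assumption, not any particular implementation):

```python
import math

def gan_value(d_real, d_fake):
    """GAN minimax value V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))],
    with each expectation replaced here by one representative value."""
    return math.log(d_real) + math.log(1.0 - d_fake)

# If the generator perfectly replicates the data distribution, the best
# the discriminator can do is output 1/2 everywhere (pure guessing):
v_star = gan_value(0.5, 0.5)
print(abs(v_star - (-math.log(4.0))) < 1e-12)  # True: V* = -log 4
```

The generator pushes this value down while the discriminator pushes it up; training alternates gradient steps between the two networks.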
14. Generative models : VAEs
Figure 2: A typical variational autoencoder (VAE) architecture. An encoder
pushes input data through a low-dimensional bottleneck that learns
relevant features in a latent code (optimized probabilistically). The decoder
network then attempts to reconstruct the inputs.
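The probabilistic optimization maximises an evidence lower bound (ELBO): reconstruction quality minus a KL penalty keeping the latent code near a standard-normal prior. A sketch using the closed-form Gaussian KL (function names here are made up for illustration):

```python
import math

def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form:
    0.5 * sum(mu^2 + sigma^2 - log sigma^2 - 1)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def elbo(reconstruction_log_likelihood, mu, log_var):
    # Maximising this trades reconstruction accuracy against keeping
    # the latent code close to the standard-normal prior.
    return reconstruction_log_likelihood - kl_diag_gaussian(mu, log_var)

print(kl_diag_gaussian([0.0, 0.0], [0.0, 0.0]))  # 0.0: code matches the prior exactly
```

The KL term vanishes exactly when the encoder outputs the prior, and grows as the latent code drifts away from it.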
16. Illustrations - 2016
Figure 4: Interpolations varying the strength of a smile vector, computed
by averaging latent codes of labelled pictures. Courtesy of Tom White.
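The latent-space arithmetic behind such attribute vectors can be sketched as follows (the 2-dimensional codes are hypothetical; real latent spaces have hundreds of dimensions):

```python
def mean_vec(vectors):
    # componentwise mean of a list of latent codes
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def attribute_vector(latents_with, latents_without):
    """Difference of latent-space means, e.g. mean(smiling) - mean(neutral)."""
    a, b = mean_vec(latents_with), mean_vec(latents_without)
    return [x - y for x, y in zip(a, b)]

def apply_attribute(z, direction, strength):
    # Move a latent code along the attribute direction; decoding the
    # result at increasing strengths yields the interpolation shown.
    return [zi + strength * di for zi, di in zip(z, direction)]

smile = attribute_vector([[1.0, 2.0], [3.0, 2.0]], [[0.0, 1.0], [2.0, 1.0]])
print(smile)  # [1.0, 1.0]
print(apply_attribute([0.0, 0.0], smile, 0.5))  # [0.5, 0.5]
```

Averaging over many labelled examples cancels out the other attributes, leaving a direction that mostly encodes the labelled one.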
25. From fake news to un-real people
www.thispersondoesnotexist.com
A blog post by Kyle McDonald on how to detect fake portraits:
26. Some photorealistic transformations - 2018
Deep Video Portraits (Portrait reenactment)
Vid2Vid (Photorealistic video-to-video translation)
See also TacoTron for realistic speech generation.
27. OpenAI GPT-2 - Principle
• A generative model trained at scale
• A relatively ’simple’ architecture and prediction objective, but the
scaling makes all the difference (a training set of 8 million web pages)
• While it can be used in a multitask context (question answering,
text summarization, translation...), it really shines in text
generation.
• For the first time, OpenAI declined to release the weights of
the fully trained model, citing misuse risks
• This has caused an unprecedented stir in the (self-regulated)
machine learning research community
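The prediction principle, repeatedly sampling the next token conditioned on everything generated so far, can be sketched with a toy bigram table standing in for GPT-2's transformer (the vocabulary, counts, and function names here are invented for illustration):

```python
import math
import random

# toy next-token statistics: each character strongly prefers one successor
counts = {
    'a': {'b': 8, 'a': 1, 'c': 1},
    'b': {'c': 8, 'a': 1, 'b': 1},
    'c': {'a': 8, 'b': 1, 'c': 1},
}

def next_token_probs(prev, temperature=1.0):
    # softmax over log-counts, with a temperature knob controlling
    # how sharply the sampler favours the most likely token
    scaled = {t: math.log(c) / temperature for t, c in counts[prev].items()}
    z = sum(math.exp(v) for v in scaled.values())
    return {t: math.exp(v) / z for t, v in scaled.items()}

def generate(prompt, n, temperature=1.0, rng=None):
    # autoregressive generation: sample one token at a time,
    # each conditioned on the text produced so far
    rng = rng or random.Random(0)
    out = list(prompt)
    for _ in range(n):
        probs = next_token_probs(out[-1], temperature)
        r, acc = rng.random(), 0.0
        for t, p in sorted(probs.items()):
            acc += p
            if r <= acc:
                out.append(t)
                break
    return ''.join(out)

print(generate('a', 5, temperature=0.1))  # near-greedy sampling: 'abcabc'
```

GPT-2 replaces the bigram table with a transformer conditioned on a long context window; the sampling loop itself is essentially the same.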
28. OpenAI GPT-2 - Coherent paragraph generation - 1
When initialized with the following prompt text :
’In a shocking finding, scientist discovered a herd of unicorns living
in a remote, previously unexplored valley, in the Andes Mountains.
Even more surprising to the researchers was the fact that the
unicorns spoke perfect English.’
the model generated as a follow-up...
29. OpenAI GPT-2 - Coherent paragraph generation - 2
’ “The scientist named the population, after their distinctive horn,
Ovid’s Unicorn. These four-horned, silver-white unicorns were
previously unknown to science. Now, after almost two centuries, the
mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La
Paz, and several companions, were exploring the Andes Mountains
when they found a small valley, with no other animals or humans.
Pérez noticed that the valley had what appeared to be a natural
fountain, surrounded by two peaks of rock and silver snow.
Pérez and the others then ventured further into the valley. ‘By the
time we reached the top of one peak, the water looked blue, with
some crystals on top,’ said Pérez.
Pérez and his friends were astonished to see the unicorn herd. These
creatures could be seen from the air without having to move too
much to see them – they were so close they could touch their horns.”’
30. OpenAI GPT-2 : Test 1 - Neural news generation
GPT2 seed ’DeepMind should have been a UK champion says
ex-Google CFO who took it to America’:
Google executives Mark Zuckerberg and Sergey Brin, formerly head of
the social media giant, have been accused of lying to the US
government about the extent of their ties to the Kremlin.
31. OpenAI GPT-2 : Test 2 - Scientific Generation !
’We compute a stochastic scaling function for data with a weighted
average of the mean as the value for which to apply an Likert
distribution over the data in the gradient descent. We then evaluate
the prediction accuracy and find that the weights are not completely
uniform in this way. Instead, the weight of this residual model is a
weighted average of the data in the gradient descent on the average,
which is also very accurate.’
32. Towards post-truth economics
• Shattering the realism barrier of the uncanny valley
• Legislation will be required (don’t hold your breath)
• Content industries could be notably disrupted
• Is this a serious use case for blockchain tech ?
Is the marginal cost of content creation going to zero ?
34. Are we at AGI yet ?
3 different views :
1. Killer robots are coming, let’s ban research !
2. AI is just matrix multiplication...
3. AI/Deep Learning is in fact letting an optimization algorithm
automatically write code for you
35. ’Science sans conscience’ (science without conscience) : TBC...
Figure 11: Human DNA data embedding and clustering (UMAP), categorized
by ethnicity. bioRxiv 2018, ’Revealing multi-scale population structure in
large cohorts’.