Mostly paper review of Semantic Image Inpainting with Deep Generative Models, R Yeh et al. CVPR 2017.
Prepared for Lab Seminar at SNU Datamining Center on 20180213.
1. Semantic Image Inpainting
Semantic Image Inpainting with Deep Generative Models,
R Yeh et al. CVPR 2017
LAB SEMINAR
2018.02.13
SNU DATAMINING CENTER
MINKI CHUNG
2. TABLE OF CONTENTS
▸ Motivation
▸ What is image inpainting
▸ Problem statement
▸ Baseline
▸ Semantic image inpainting with Deep Generative Models
▸ My work
▸ Discussion
4. MOTIVATION
▸ What is image inpainting?
https://www.youtube.com/watch?v=1F-6iRrgh1s
5. MOTIVATION
▸ Objective: make an attentive inpainter
IF THE BACKGROUND BEHIND THE OBJECT TO REMOVE IS SIMPLE, EXISTING METHODS WORK FINE
HOWEVER, IF THE BACKGROUND IS COMPLEX, ANOTHER METHOD IS NEEDED
7. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS
▸ DCGAN-based
▸ Not end-to-end:
▸ 1. First train the generator G (on uncorrupted data)
▸ 2. Then find z_hat for inpainting
[Figure: objective combines CONTEXTUAL LOSS and PRIOR LOSS]
https://arxiv.org/abs/1607.07539
8. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS
▸ Hypothesis: a trained G is efficient: an image not drawn from p_data (e.g., a corrupted image) should not lie on the learned manifold of encodings z
▸ Objective: find the encoding z_hat "closest" to the corrupted image while being constrained to the manifold,
z_hat = argmin_z { L_c(z | y, M) + L_p(z) }
▸ y: corrupted image
M: binary mask (same size as the image)
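The two-stage recipe above can be sketched end to end. The following is a minimal PyTorch sketch, not the paper's implementation: the tiny linear G and D, the latent size, λ, the learning rate, and the uniform weights W are all toy assumptions standing in for a trained DCGAN; only the structure of the optimization (gradient descent on z with G and D frozen) follows the paper.

```python
import torch

torch.manual_seed(0)

# Toy stand-ins (assumptions) for a trained DCGAN over 64-pixel flattened images.
G = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh())
D = torch.nn.Sequential(torch.nn.Linear(64, 1), torch.nn.Sigmoid())
for p in list(G.parameters()) + list(D.parameters()):
    p.requires_grad_(False)          # stage 2 only optimizes z, never G or D

y = torch.rand(64)                   # corrupted image (flattened)
M = torch.ones(64)
M[20:40] = 0.0                       # binary mask: 0 marks the hole
W = M                                # uniform weights for brevity; the paper
                                     # uses a neighbourhood-based W instead
lam = 0.003                          # prior-loss weight λ (assumed value)

z = torch.zeros(16, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.03)

def loss_fn(z):
    Gz = G(z)
    Lc = torch.sum(W * torch.abs(Gz - y))       # contextual loss
    Lp = lam * torch.log(1.0 - D(Gz) + 1e-8)    # prior loss
    return Lc + Lp.squeeze()

start = loss_fn(z).item()
for _ in range(200):                 # find z_hat by descending the total loss
    opt.zero_grad()
    loss_fn(z).backward()
    opt.step()

# blend: keep known pixels from y, fill the hole from G(z_hat)
x_inpainted = M * y + (1 - M) * G(z.detach())
```

After optimization, the known pixels come from y and the hole is filled from G(z_hat), which is the paper's blending step.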
9. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS
▸ Contextual loss: not simply the l1 norm between G(z) and the uncorrupted portion of the input image y; it also takes the corrupted area into account
▸ Weighting term W:
W_i = Σ_{j ∈ N(i)} (1 − M_j) / |N(i)| if M_i ≠ 0, and W_i = 0 if M_i = 0
▸ So,
L_c(z | y, M) = ‖ W ⊙ (G(z) − y) ‖₁
W_i: importance weight at pixel location i
N(i): set of neighbors of pixel i in a local window
(uncorrupted pixels near the hole get a BIGGER WEIGHT)
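The weighting term is easy to compute directly from the mask. A small NumPy sketch (the zero-padding at image borders, i.e., treating out-of-bounds neighbors as uncorrupted, is an assumption):

```python
import numpy as np

def importance_weights(M, window=7):
    """W_i = fraction of corrupted neighbors of pixel i in a local window,
    for uncorrupted pixels; 0 for corrupted pixels. M: 1 = known, 0 = hole."""
    h, w = M.shape
    r = window // 2
    # pad (1 - M) with zeros so every pixel has a full window
    padded = np.pad(1.0 - M, r, mode="constant")
    W = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            W[i, j] = padded[i:i + window, j:j + window].sum() / (window * window)
    W[M == 0] = 0.0          # corrupted pixels carry no contextual weight
    return W

def contextual_loss(Gz, y, W):
    """L_c = || W ⊙ (G(z) - y) ||_1"""
    return np.sum(W * np.abs(Gz - y))
```

Pixels on the hole boundary see many corrupted neighbors and thus get the largest weights; pixels far from any hole get weight near zero, which is exactly the limitation discussed later in the deck.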
10. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS
▸ Prior loss: measures how realistic the generated image is
▸ Identical to the GAN loss used for training the discriminator D:
L_p(z) = λ log(1 − D(G(z)))
▸ Without L_p, the mapping from y to z may converge to a perceptually implausible result
11. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS
▸ Tackling points:
▸ Object-level occlusion: narrowing down to object removal
▸ Contextual loss: a pixel that is very far away from any hole plays very little role in the inpainting process
▸ What if we extend this?
▸ Interpretation: we want to see which pixels play the key role in deciding z_hat
▸ → Attention
13. MY WORK
▸ Object-level occlusion: narrowing down to object removal
▸ MS-COCO dataset
▸ Train set: 118,287 images
▸ COCO API: get instance annotations
▸ Use images that contain a person instance smaller than 1/4 and bigger than 1/20 of the image
▸ 30,830 images remain (rescaled to 256×256)
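The size filter above can be sketched as a small predicate. The function name and the strict bounds are assumptions; in practice the instance areas would come from the COCO annotations' `area` fields (e.g., via pycocotools), but here they are just a list of pixel areas:

```python
def keep_person_image(person_areas, img_w, img_h, lo=1 / 20, hi=1 / 4):
    """True if any 'person' instance covers strictly between `lo` and `hi`
    of the image area. `person_areas`: segmentation areas in pixels, as
    reported by the COCO annotation 'area' field."""
    img_area = img_w * img_h
    return any(lo * img_area < a < hi * img_area for a in person_areas)
```

For a 256×256 image (65,536 px), this keeps instances between roughly 3,277 and 16,384 pixels.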
14. MY WORK
▸ Limitation of the contextual loss: parts of the image farther from the hole have less influence on inpainting
▸ Naive approach: for each grid cell of the image, find the pixel influence (attention_ratio) on finding the optimal z_hat
▸ Do it sequentially, grid by grid
[Figure: per-grid attention ratios (values 0.1–0.8) around the occluded region]
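One plausible reading of the grid-by-grid procedure is occlusion-style sensitivity: zero out one grid cell at a time and measure how much a scalar objective (e.g., the contextual loss at the current z_hat) changes. This is a sketch under that assumption; `influence_fn` and the normalization to [0, 1] are illustrative, not the slide's exact method:

```python
import numpy as np

def grid_attention_ratios(influence_fn, img, grid=4):
    """Estimate each grid cell's influence on a scalar objective by zeroing
    the cell and measuring the change (occlusion-sensitivity style)."""
    h, w = img.shape
    gh, gw = h // grid, w // grid
    base = influence_fn(img)
    ratios = np.zeros((grid, grid))
    for gi in range(grid):
        for gj in range(grid):
            probe = img.copy()
            probe[gi * gh:(gi + 1) * gh, gj * gw:(gj + 1) * gw] = 0.0
            ratios[gi, gj] = abs(influence_fn(probe) - base)
    if ratios.max() > 0:
        ratios /= ratios.max()   # normalize to [0, 1] like the slide's values
    return ratios
```

Each cell requires a full evaluation of the objective, which hints at the computational-cost problem reported two slides later.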
15. MY WORK
▸ After finding the optimal attention_ratio for each grid cell
▸ Find the noise z_hat based on the 'Original × Attn_Ratio' image to reconstruct the image
▸ Visualization of pixel influence on inpainting
[Figure panels: ORIGINAL | MASKED | ORIGINAL × ATTN_RATIO]
16. MY WORK
▸ However, because of computational inefficiency, the model was unable to learn
▸ (Current status) Rethinking the attention method
[Figure: results WITHOUT ATTENTION after 1000 EPOCHS vs. WITH ATTENTION after 20 EPOCHS]
18. REFERENCE
▸ Semantic Image Inpainting with Deep Generative Models, Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do, CVPR 2017, https://arxiv.org/abs/1607.07539
▸ MS COCO dataset, http://cocodataset.org/#home