Semantic Segmentation with Limited Annotation

Papers covered: (1) Simple Does It: Weakly Supervised Instance and Semantic Segmentation (CVPR 2017), weak supervision; (2) Colorful Image Colorization (ECCV 2016 oral), self-supervision.

1. Semantic Segmentation with Limited Annotation
   Zhedong Zheng, 24 Feb 2018

2. What can we learn from (from Stephen Chow’s film)

3. [no text]

4. Related Works
   1. Simple Does It: Weakly Supervised Instance and Semantic Segmentation (CVPR 2017): weak supervision
   2. Colorful Image Colorization (ECCV 2016 oral): self-supervision

5. Related Works (same list as slide 4)

6. What

7. How
   Start from object bounding box annotations

8. Recall Several Rules
   1. Background: no bounding box -> background
   2. Object extent: bboxes are instance-level and provide information about each object's extent
   3. Objectness: spatial continuity / contrasting boundary

9. How to begin?
   If two boxes overlap, we assume the smaller one is in front.
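
The initialization described on slides 8 and 9 can be sketched as follows. This is a minimal illustration, not the authors' code: the `Box` tuple layout, the function name `boxes_to_labels`, and the exclusive end coordinates are all assumptions. Pixels covered by no box stay background, and boxes are painted from largest to smallest so that a smaller overlapping box ends up in front.

```python
from typing import List, Tuple

# Hypothetical box layout: (x0, y0, x1, y1, class_id), with x1 and y1 exclusive.
Box = Tuple[int, int, int, int, int]

BACKGROUND = 0


def box_area(box: Box) -> int:
    x0, y0, x1, y1, _ = box
    return max(x1 - x0, 0) * max(y1 - y0, 0)


def boxes_to_labels(height: int, width: int, boxes: List[Box]) -> List[List[int]]:
    """Build an initial label map from bounding boxes only."""
    # Rule 1: anything not covered by a box is background.
    labels = [[BACKGROUND] * width for _ in range(height)]
    # Paint large boxes first so that smaller boxes overwrite them
    # (the smaller box is assumed to be in front).
    for x0, y0, x1, y1, cls in sorted(boxes, key=box_area, reverse=True):
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                labels[y][x] = cls
    return labels


# Two overlapping boxes: class 1 is large, class 2 is small.
labels = boxes_to_labels(4, 6, [(0, 0, 5, 4, 1), (3, 1, 6, 3, 2)])
for row in labels:
    print(row)
```

Sorting by area makes the result independent of the order in which the boxes are listed.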

10. How to begin?

11. Post-Process
   • Any pixel outside the bbox is discarded.
   • If IoU < 50%, re-initialize.
   • DenseCRF
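
The first two post-processing steps can be sketched for a single box as below. The slide does not say which two regions the IoU compares; this sketch assumes it is measured between the predicted segment and its own box, and that "re-initialize" means resetting the box area to the box label. The function name `postprocess` and the mask layout are hypothetical, and the DenseCRF step is omitted.

```python
from typing import List, Tuple

BACKGROUND = 0


def postprocess(pred: List[List[int]], box: Tuple[int, int, int, int],
                cls: int, min_iou: float = 0.5) -> List[List[int]]:
    """Clean one predicted segment using its box (x0, y0, x1, y1), ends exclusive."""
    x0, y0, x1, y1 = box
    height, width = len(pred), len(pred[0])
    out = [[BACKGROUND] * width for _ in range(height)]

    # Step 1: keep only the predicted pixels that fall inside the box.
    kept = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            if pred[y][x] == cls:
                out[y][x] = cls
                kept += 1

    box_pixels = (x1 - x0) * (y1 - y0)
    if box_pixels == 0:
        return out

    # Step 2: the clipped segment lies inside the box, so the
    # intersection is the segment and the union is the box.
    iou = kept / box_pixels
    if iou < min_iou:
        # Assumed meaning of "re-initialize": fall back to the full box.
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = cls
    return out


good = postprocess([[1, 1, 0, 1],
                    [1, 0, 0, 0]], box=(0, 0, 2, 2), cls=1)
poor = postprocess([[1, 0, 0, 0],
                    [0, 0, 0, 0]], box=(0, 0, 2, 2), cls=1)
```

In `good` the stray pixel outside the box is dropped and the segment is kept (IoU 0.75). In `poor` the segment covers only a quarter of the box, so the box area is reset.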

12. Result
   Naïve is without post-processing.

13. Result

14. Result

15. Related Works (same list as slide 4)

16. [no text]

17. Grayscale image: L channel
   Color information: ab channels

18. Grayscale image: L channel
   Concatenate (L, ab)
   “Free” supervisory signal
   Semantics? Higher-level abstraction?

19. Inherent Ambiguity
   Grayscale

20. Inherent Ambiguity
   Our Output / Ground Truth

21. Better Loss Function
   Colors in ab space (continuous)
   • Regression with L2 loss inadequate
   • Use multinomial classification
   • Class rebalancing to encourage learning of rare colors

22. Better Loss Function
   Colors in ab space (discrete)
   • Regression with L2 loss inadequate
   • Use multinomial classification
   • Class rebalancing to encourage learning of rare colors
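
The class rebalancing bullet can be illustrated with a small sketch. The slide gives no formula, so the weighting below is an assumption: each color class is weighted by the inverse of its empirical frequency mixed with a uniform distribution, then normalized so the expected weight is 1. The name `rebalance_weights`, the mixing parameter `lam`, and the counts are all hypothetical.

```python
from typing import List


def rebalance_weights(counts: List[int], lam: float = 0.5) -> List[float]:
    """Per-class loss weights that up-weight rare color classes."""
    total = sum(counts)
    num_classes = len(counts)
    # Empirical probability of each color class.
    probs = [c / total for c in counts]
    # Mix with a uniform distribution so that weights stay bounded,
    # then invert: rare classes receive larger weights.
    raw = [1.0 / ((1.0 - lam) * p + lam / num_classes) for p in probs]
    # Normalize so that the expected weight under `probs` equals 1.
    expected = sum(p * w for p, w in zip(probs, raw))
    return [w / expected for w in raw]


# Illustrative counts: one common (desaturated) class and two rare ones.
weights = rebalance_weights([90, 9, 1])
print([round(w, 3) for w in weights])
```

With `lam = 0` the weights are exactly inverse frequency, and with `lam = 1` every class gets weight 1.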

23. Failure Cases

24. Biases

25. Evaluation
   Quantitative
      Visual quality: per-pixel accuracy; perceptual realism; semantic interpretability
      Representation learning: task generalization (ImageNet classification); task & dataset generalization (PASCAL classification, detection, segmentation)
   Qualitative
      Visual quality: low-level stimuli; legacy grayscale photos
      Representation learning: hidden unit activations

26. Hidden Unit (conv5) Activations
   faces, dog faces, flowers

27. Dataset & Task Generalization on PASCAL VOC
   [Plot: % from Gaussian to ImageNet labels (0% = Gaussian initialization, 100% = ImageNet labels) for classification, detection and segmentation. Methods shown: Autoencoder, Agrawal et al., Wang & Gupta, Doersch et al., Krähenbühl et al., Pathak et al., Donahue et al., Ours.]

28. Amateur Family Photo, 1956.

29. Amateur Family Photo, 1956.

30. Henri Cartier-Bresson, Sunday on the Banks of the River Seine, 1938.

31. Henri Cartier-Bresson, Sunday on the Banks of the River Seine, 1938.

Editor's notes

So formally, we are working in the Lab color space. The grayscale information is contained in the L, or lightness channel of the image, and is the input to our system. The output is the ab, or color channels. We’re looking to learn the mapping from L to ab using a CNN. We can then take the predicted ab channels, concatenate them with the input, and hopefully get a plausible colorization of the input image. This is the graphics benefit of this problem.
We note that any image can be broken up into its grayscale and color components, and in this manner, can serve as a free supervisory signal for training a CNN. So perhaps by learning to color, we can achieve a deep representation which has higher level abstractions, or semantics. Now, this learning problem is less straightforward than one may expect.
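
The data flow described in these two notes can be sketched as below: split a Lab image into the L input and the ab target (the free supervisory signal), then concatenate a predicted ab with the input L. The type aliases, `split_lab`, `colorize`, and the stand-in `gray_predictor` are assumptions for illustration. A real system would use a CNN as the predictor and would first convert RGB to Lab, which is not shown here.

```python
from typing import Callable, List, Tuple

LImage = List[List[float]]                         # lightness per pixel
ABImage = List[List[Tuple[float, float]]]          # (a, b) per pixel
LabImage = List[List[Tuple[float, float, float]]]  # (L, a, b) per pixel


def split_lab(lab: LabImage) -> Tuple[LImage, ABImage]:
    """Any Lab image yields a training pair: input L, target ab."""
    l_channel = [[l for l, _, _ in row] for row in lab]
    ab = [[(a, b) for _, a, b in row] for row in lab]
    return l_channel, ab


def colorize(l_channel: LImage,
             predict_ab: Callable[[LImage], ABImage]) -> LabImage:
    """Predict ab from L, then concatenate (L, ab) into a Lab image."""
    ab = predict_ab(l_channel)
    return [[(l, a, b) for l, (a, b) in zip(l_row, ab_row)]
            for l_row, ab_row in zip(l_channel, ab)]


def gray_predictor(l_channel: LImage) -> ABImage:
    """Stand-in for the CNN: predicts a = b = 0, i.e. no color."""
    return [[(0.0, 0.0) for _ in row] for row in l_channel]


lab_image = [[(50.0, 20.0, -30.0), (80.0, -5.0, 10.0)]]
l_in, ab_target = split_lab(lab_image)
output = colorize(l_in, gray_predictor)
```

Feeding the ground-truth ab back through `colorize` reproduces the original image, which is what makes the split usable as supervision without manual labels.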
For example, consider this grayscale image.
This is the output after passing it through our system, and it looks plausible. Now here is the ground truth. Notice that the two look very different. But even though red and blue are far apart in ab space, we are just as happy with the red colorization as we are with the blue, and perhaps the red is even better...
This indicates that any loss which assumes a unimodal output distribution, such as an L2 regression loss, is likely to be inadequate.
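
A small numeric sketch shows why. When two colorizations are equally plausible, the prediction that minimizes the expected squared error is their mean, which is less saturated than either of them. The two ab values below are illustrative stand-ins for a reddish and a bluish color, not values from the talk, and the helper names are hypothetical.

```python
import math
from typing import List, Tuple

AB = Tuple[float, float]


def l2_optimal(targets: List[AB]) -> AB:
    """The minimizer of the mean squared error is the mean of the targets."""
    n = len(targets)
    return (sum(a for a, _ in targets) / n, sum(b for _, b in targets) / n)


def expected_l2(pred: AB, targets: List[AB]) -> float:
    """Mean squared distance from `pred` to equally likely targets."""
    return sum((pred[0] - a) ** 2 + (pred[1] - b) ** 2
               for a, b in targets) / len(targets)


def chroma(ab: AB) -> float:
    """Distance from the gray axis (a = b = 0): a rough saturation measure."""
    return math.hypot(ab[0], ab[1])


red_like = (60.0, 40.0)    # illustrative values only
blue_like = (20.0, -60.0)
modes = [red_like, blue_like]

best = l2_optimal(modes)
print(best, round(chroma(best), 1))
print(expected_l2(best, modes), expected_l2(red_like, modes))
```

The mean has a lower expected L2 loss than either plausible color, yet it sits closer to gray than both, so the loss prefers a washed-out answer that matches neither mode.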
We reformulate the problem as multinomial classification. We divide the output ab space into discrete bins of size 10.
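
The binning can be sketched as follows, under a few assumptions. The ab range of [-110, 110) is a guess, since the notes give only the bin size. The sketch also uses every cell of a square grid, whereas a real system would likely keep only the bins that correspond to displayable colors. All names are hypothetical.

```python
from typing import Tuple

GRID = 10                   # bin size, from the notes
AB_MIN, AB_MAX = -110, 110  # assumed range of a and b
BINS_PER_AXIS = (AB_MAX - AB_MIN) // GRID


def _axis_index(value: float) -> int:
    """Index of the bin containing `value` along one axis, clamped to range."""
    idx = int((value - AB_MIN) // GRID)
    return max(0, min(idx, BINS_PER_AXIS - 1))


def ab_to_bin(a: float, b: float) -> int:
    """Map a continuous (a, b) pair to a single class index."""
    return _axis_index(a) * BINS_PER_AXIS + _axis_index(b)


def bin_to_ab(index: int) -> Tuple[float, float]:
    """Map a class index back to the center of its bin."""
    ia, ib = divmod(index, BINS_PER_AXIS)
    half = GRID / 2
    return (AB_MIN + ia * GRID + half, AB_MIN + ib * GRID + half)


print(ab_to_bin(0.0, 0.0), ab_to_bin(4.9, 9.9), ab_to_bin(10.0, 0.0))
print(bin_to_ab(ab_to_bin(0.0, 0.0)))
```

Colors within the same 10 x 10 cell share a class, so the network can output a distribution over classes and keep several plausible colors alive instead of averaging them.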
The system does have some interesting failure cases. We find that many man-made objects can be multiple colors. The system sometimes has a difficult time deciding which one to go with, leading to this type of tie-dye effect.
Also, we find other curious behaviors and biases. For example, when the system sees a dog, it sometimes expects a tongue underneath. Even when there is none, it will just go ahead and hallucinate one for us anyway.
Due to time constraints, we will not be able to discuss all of the tests, but please come by our poster for more details.
We also see units which correspond to more “thing” categories, such as human and dog faces, and flowers. The network was able to discover these units in an unsupervised regime.
The y=0 line shows the performance when we initialize the network with Gaussian weights. The performance we are hoping to match is that obtained when ImageNet labels are used to train the system. We will see how well each of these methods makes up the difference between Gaussian initialization and using ImageNet labels. One method for learning features is autoencoders, which rely on a bottleneck. The autoencoder does not learn very semantically meaningful features. Using stacked k-means, as implemented by Krähenbühl et al., makes up some of the ground. Previous self-supervision methods are shown here: inpainting, bidirectional GAN, relative context prediction. Finally, our method, outside of the Doersch detection result, performs competitively relative to other self-supervision methods. We found this result surprising, as our project was primarily focused on the graphics task of colorization. However, note the large gap between self-supervision methods and pre-training on ImageNet. There is still work to be done to achieve strong semantic representations without the benefit of labels.
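
The normalization behind this plot can be written as a one-line helper: each method's score is expressed as the percentage of the gap between Gaussian initialization (0%) and ImageNet-label pre-training (100%) that it closes. The function name and the scores below are hypothetical and are not results from the talk.

```python
def percent_of_gap(score: float, gaussian_score: float,
                   imagenet_score: float) -> float:
    """Share of the Gaussian-to-ImageNet gap closed by a method, in percent."""
    return 100.0 * (score - gaussian_score) / (imagenet_score - gaussian_score)


# Made-up scores, for illustration only.
gaussian, imagenet, method = 50.0, 80.0, 65.0
print(percent_of_gap(method, gaussian, imagenet))
```

A method that scores the same as Gaussian initialization sits on the y=0 line, and one that matches ImageNet pre-training reaches 100%.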
This is an amateur family photo from the 1950s of my father and great-grandfather.
This is a professional photograph from Henri Cartier-Bresson.
