Semantic Segmentation with Limited Annotation
Zhedong Zheng
24 Feb 2018
What can we learn from
(from Stephen Chow’s film)
Related Works
1. Simple Does It: Weakly Supervised Instance and Semantic Segmentation (CVPR 2017) [weakly supervised]
2. Colorful Image Colorization (ECCV 2016 oral) [self-supervised]
What
How
Start from object bounding box annotations
Recall Several Rules
1. Background: pixels outside every bounding box are background.
2. Object Extent: bounding boxes are instance-level and bound the extent of each object.
3. Objectness: spatial continuity and a contrasting boundary with the background.
How to begin?
If two boxes overlap, we assume the smaller one is in front.
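The starting labels described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the box format `(x0, y0, x1, y1, class_id)` with exclusive `x1`/`y1`, the `BACKGROUND = 0` id, and the function names are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed box format: (x0, y0, x1, y1, class_id), with x1 and y1 exclusive.
Box = Tuple[int, int, int, int, int]
BACKGROUND = 0  # assumed id for the background class


def box_area(box: Box) -> int:
    x0, y0, x1, y1, _ = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def initial_labels(height: int, width: int, boxes: List[Box]) -> List[List[int]]:
    """Fill each box with its class; pixels covered by no box stay background."""
    labels = [[BACKGROUND] * width for _ in range(height)]
    # Paint large boxes first so that smaller boxes overwrite them where
    # they overlap, i.e. the smaller box is treated as being in front.
    for x0, y0, x1, y1, cls in sorted(boxes, key=box_area, reverse=True):
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                labels[y][x] = cls
    return labels


# A large box of class 1 with a smaller box of class 2 inside it.
labels = initial_labels(4, 6, [(0, 0, 5, 4, 1), (2, 1, 4, 3, 2)])
for row in labels:
    print(row)
```

Sorting by area before painting is just one simple way to realize the "smaller box in front" assumption; any ordering that lets the smaller box win on overlapping pixels would do.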
Post-Process
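• Any pixel outside the bounding box is discarded (reset to background).
• If a segment's IoU with its bounding box is below 50%, it is re-initialized to the box label.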
• DenseCRF
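The first two post-processing rules can be sketched as follows. This is a simplified illustration under stated assumptions: one box per class, the box lying inside the image, a box format of `(x0, y0, x1, y1)` with exclusive `x1`/`y1`, and a hypothetical `postprocess` helper. The DenseCRF step is not shown.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x0, y0, x1, y1), x1/y1 exclusive
BACKGROUND = 0                   # assumed id for the background class


def postprocess(pred: List[List[int]], box: Box, cls: int,
                min_iou: float = 0.5) -> List[List[int]]:
    """Clean the predicted pixels of one class against its box annotation."""
    x0, y0, x1, y1 = box
    out = [row[:] for row in pred]
    inside = 0
    for y, row in enumerate(out):
        for x, label in enumerate(row):
            if label != cls:
                continue
            if x0 <= x < x1 and y0 <= y < y1:
                inside += 1
            else:
                out[y][x] = BACKGROUND  # pixels outside the box are discarded
    area = (x1 - x0) * (y1 - y0)
    # After clipping, the segment lies inside the box, so its IoU with the
    # box is simply (segment area) / (box area).
    if area > 0 and inside / area < min_iou:
        for y in range(y0, y1):
            for x in range(x0, x1):
                out[y][x] = cls  # too small: reset to the initial box label
    return out


pred = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
cleaned = postprocess(pred, box=(0, 0, 2, 2), cls=1)
for row in cleaned:
    print(row)
```

In the example the stray pixel at the bottom right is dropped, and because only one of the four box pixels was predicted (IoU 25%), the whole box is reset to class 1.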
Result
Naïve is without post-processing.
Grayscale image: L channel
Color information: ab channels
Concatenate (L, ab)
“Free” supervisory signal
Semantics? Higher-level abstraction?
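To make the "free" supervisory signal concrete, the sketch below splits a color pixel into its L input and ab target. It uses the standard sRGB to CIE Lab conversion with a D65 white point; the function names `srgb_to_lab` and `training_pair` are hypothetical, and a real pipeline would more likely call an image library on whole arrays.

```python
def srgb_to_lab(r: int, g: int, b: int):
    """Convert one 8-bit sRGB pixel to CIE Lab (D65 white point)."""
    def linearize(c: int) -> float:
        v = c / 255.0
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t: float) -> float:
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def training_pair(rgb):
    """Split a color pixel into (network input, prediction target)."""
    lightness, a, b = srgb_to_lab(*rgb)
    return lightness, (a, b)


# Any color pixel yields an (input, target) pair without human labels.
grey_input, color_target = training_pair((255, 0, 0))
print(round(grey_input, 1), tuple(round(v, 1) for v in color_target))
```

The point of the split is that the target (ab) comes from the image itself, so no manual annotation is needed to build training pairs.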
Inherent Ambiguity
[Figure panels: Grayscale, Our Output, Ground Truth]
Better Loss Function
Colors in ab space: continuous, then discretized
• Regression with L2 loss is inadequate
• Use multinomial classification
• Class rebalancing to encourage learning of rare colors
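One way to realize the class rebalancing bullet is to weight each color bin inversely to a smoothed version of its empirical frequency. The sketch below is an illustration under assumptions: the smoothing parameter `lam`, the mixing with a uniform distribution, and the normalization so that the expected weight is 1 are choices made for the example, and the function name is hypothetical.

```python
from typing import List


def rebalancing_weights(p: List[float], lam: float = 0.5) -> List[float]:
    """Per-bin loss weights from empirical bin probabilities p (summing to 1).

    Each weight is proportional to 1 / ((1 - lam) * p_q + lam / Q), so rare
    bins get larger weights; lam mixes in a uniform distribution to keep the
    weights of very rare bins bounded.
    """
    q = len(p)
    raw = [1.0 / ((1.0 - lam) * pq + lam / q) for pq in p]
    # Normalize so that the expected weight under p equals 1.
    expected = sum(pq * wq for pq, wq in zip(p, raw))
    return [wq / expected for wq in raw]


# Toy distribution: one common (desaturated) bin and two rarer bins.
freqs = [0.7, 0.2, 0.1]
weights = rebalancing_weights(freqs)
print([round(w, 3) for w in weights])
```

During training, each pixel's classification loss would be multiplied by the weight of its ground-truth bin, so errors on rare, saturated colors count for more than errors on common ones.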
Failure Cases
Biases
Evaluation
Visual Quality
• Quantitative: per-pixel accuracy; perceptual realism; semantic interpretability
• Qualitative: low-level stimuli; legacy grayscale photos
Representation Learning
• Quantitative: task generalization (ImageNet classification); task & dataset generalization (PASCAL classification, detection, segmentation)
• Qualitative: hidden unit activations
Hidden Unit (conv5) Activations
[Figure panels: faces, dog, faces, flowers]
Dataset & Task Generalization on PASCAL VOC
[Figure: three panels (Classification, Detection, Segmentation). Axis: % of the way from Gaussian initialization (0%) to ImageNet labels (100%). Methods compared: Pathak et al., Donahue et al., Doersch et al., Krähenbühl et al., Ours, Autoencoder, Wang & Gupta, Agrawal et al.]
Amateur Family Photo, 1956.
Henri Cartier-Bresson, Sunday on the Banks of the River Seine, 1938.

Editor's notes

  1. So formally, we are working in the Lab color space. The grayscale information is contained in the L, or lightness channel of the image, and is the input to our system. The output is the ab, or color channels. We’re looking to learn the mapping from L to ab using a CNN. We can then take the predicted ab channels, concatenate them with the input, and hopefully get a plausible colorization of the input image. This is the graphics benefit of this problem.
  2. We note that any image can be broken up into its grayscale and color components, and in this manner, can serve as a free supervisory signal for training a CNN. So perhaps by learning to color, we can achieve a deep representation which has higher level abstractions, or semantics. Now, this learning problem is less straightforward than one may expect.
  3. For example, consider this grayscale image.
  4. This is the output after passing it through our system. It looks plausible. Now here is the ground truth. Notice that the two look very different. But even though red and blue are far apart in ab space, we are just as happy with the red colorization as we are with the blue, and perhaps the red is even better...
  5. This indicates that any loss which assumes a unimodal output distribution, such as an L2 regression loss, is likely to be inadequate.
  6. We reformulate the problem as multinomial classification. We divide the output ab space into discrete bins of size 10.
  7. The system does have some interesting failure cases. We find that many man-made objects can be multiple colors. The system sometimes has a difficult time deciding which one to go with, leading to this type of tie-dye effect.
  8. We also find other curious behaviors and biases. For example, when the system sees a dog, it sometimes expects a tongue underneath. Even when there is none, it will just go ahead and hallucinate one for us anyway.
  9. Due to time constraints, we will not be able to discuss all of the tests, but please come by our poster for more details.
  10. We also see units which correspond to more “thing” categories, such as human and dog faces, and flowers. The network was able to discover these units in an unsupervised regime.
  11. The y=0 line shows the performance if we initialize the network using Gaussian weights. The performance we are hoping to match is what we get if we use ImageNet labels to train the system. We will see how well each of these methods makes up the difference between Gaussian initialization and using ImageNet labels. One method for learning features is autoencoders, which rely on a bottleneck. Autoencoders do not learn very semantically meaningful features. Using stacked k-means, as implemented by Krähenbühl et al., makes up some of the ground. Previous self-supervision methods are shown here: inpainting, bidirectional GAN, relative context prediction. Finally, our method, outside of the Doersch detection result, performs competitively relative to other self-supervision methods. We found this result surprising, as our project was primarily focused on the graphics task of colorization. However, note the large gap between self-supervision methods and pre-training on ImageNet. There is still work to be done to achieve strong semantic representations without the benefit of labels.
  12. This is an amateur family photo from the 1950s of my father and great grand-father.
  13. This is a professional photograph from Henri Cartier-Bresson.
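
Notes 1, 5 and 6 above describe splitting a Lab image into L and ab, why an L2 regression target is problematic, and discretizing ab space into bins of size 10. The following is a minimal NumPy sketch of those three ideas, not the authors' implementation. The ab range of [-110, 110), the function names, and the example "red" and "blue" ab values are all assumptions chosen for illustration; the sketch also keeps every grid cell rather than restricting to colors that can actually occur.

```python
import numpy as np

BIN = 10                      # bin size in ab space, from note 6
AB_MIN, AB_MAX = -110, 110    # assumed ab range (not stated in the notes)
N = (AB_MAX - AB_MIN) // BIN  # number of bins along each of the a and b axes


def split_lab(lab):
    """Split an (H, W, 3) Lab image into L (H, W, 1) and ab (H, W, 2)."""
    return lab[..., :1], lab[..., 1:]


def ab_to_class(ab):
    """Map continuous ab values (..., 2) to one class index per pixel."""
    idx = np.floor((ab - AB_MIN) / BIN).astype(int)
    idx = np.clip(idx, 0, N - 1)  # keep out-of-range values on the grid
    return idx[..., 0] * N + idx[..., 1]


def class_to_ab(cls):
    """Map class indices back to the ab value at the centre of each bin."""
    a = (cls // N) * BIN + AB_MIN + BIN / 2
    b = (cls % N) * BIN + AB_MIN + BIN / 2
    return np.stack([a, b], axis=-1)


def recombine(L, ab):
    """Concatenate the input L channel with predicted ab channels."""
    return np.concatenate([L, ab], axis=-1)


# Round trip on a single pixel: quantize ab, decode, re-attach L.
lab = np.array([[[50.0, 63.0, -27.0]]])
L, ab = split_lab(lab)
restored = recombine(L, class_to_ab(ab_to_class(ab)))

# Why L2 struggles (note 5): with two equally plausible colours, the
# L2-optimal prediction is their mean, which falls in neither colour's bin.
red_ab = np.array([60.0, 40.0])    # illustrative values only
blue_ab = np.array([20.0, -60.0])  # illustrative values only
mean_ab = (red_ab + blue_ab) / 2
```

In this sketch the decoded ab value is always within half a bin (5 units) of the original on each axis, and the averaged color lands in a different class from both plausible answers, which is the motivation for treating colorization as classification over bins.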