SlideShare una empresa de Scribd logo
1 de 15
Deep learning for object
detection
Wenjing Chen
*Created in March 2017, might be outdated the time you read.
Slide credit: CS231n
Outline
1. Introduction
2. Common methods
Region proposal based methods
R-CNN, Fast R-CNN, Faster R-CNN, R-FCN, Mask R-CNN
Single shot based methods
YOLO, YOLOv2, SSD
1. Comparison
Introduction
one image -> one label one image -> labels + bounding boxes
Region based methods - R-CNN
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer
vision and pattern recognition. 2014.
Region based methods - Fast R-CNN
Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Region based methods - Faster R-CNN
Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems.
2015.
Region based methods - Faster R-CNN
Region based methods - R-FCN
Li, Yi, Kaiming He, and Jian Sun. "R-fcn: Object detection via region-based fully convolutional networks." Advances in Neural Information Processing Systems.
2016.
Average
pooling
Region based methods - Mask R-CNN
He, Kaiming, et al. "Mask R-CNN." arXiv preprint arXiv:1703.06870 (2017).
Object instance segmentation:
 Extend Faster R-CNN by adding a
branch for predicting segmentation
masks on each RoI
 Running at 5 fps
 Without tricks, outperforms all existing,
single-model entries on every task in
all three tracks of the COCO suite of
challenges, including instance
segmentation, bounding-box object
detection, and person keypoint
detection !!!
Single shot based method - YOLO
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
2016.
1. Resize input image to 448*448.
1. Run a single convolutional network.
Predicts B bounding boxes (4 coordinates + confidence) and
C class probabilities for S*S grids, encoded as an
S*S*(B*5+C) tensor.
1. Non-maximum suppression.
S*S*B bounding boxes per image and C class probabilities
for each box.
Single shot based method - YOLOv2
Redmon, Joseph, and Ali Farhadi. "YOLO9000: Better, Faster, Stronger." arXiv preprint arXiv:1612.08242 (2016).
YOLO problem:
1. Significant number of localization errors.
2. Low recall compared to region proposal based methods.
Improvements:
Single shot based method - SSD
Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016.
Improvements:
1. Use a small convolutional filter to predict object categories and offsets in bounding box
locations
2. Use multiple layers for prediction at different scales.
Comparison
From YOLOv2 From SSD
R-FCN
83.6% mAP
5.8fps
R-FCN
PASCAL VOC 2012
http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
Comparison
Speed
single shot > region based
Accuracy
region based > single shot
Complexity
YOLO < SSD ≤ Faster R-CNN < R-FCN < YOLOv2(?)

Más contenido relacionado

La actualidad más candente

Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition Intel Nervana
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detectionchettykulkarni
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basicsBrodmann17
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detectionBrodmann17
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance SegmentationHichem Felouat
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaPreferred Networks
 
Object detection
Object detectionObject detection
Object detectionSomesh Vyas
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object DetectionTaegyun Jeon
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorJinwon Lee
 
Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421穗碧 陳
 

La actualidad más candente (20)

Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detection
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
Yolo
YoloYolo
Yolo
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance Segmentation
 
Yolo releases gianmaria
Yolo releases gianmariaYolo releases gianmaria
Yolo releases gianmaria
 
A Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi KerolaA Brief History of Object Detection / Tommi Kerola
A Brief History of Object Detection / Tommi Kerola
 
Object detection
Object detectionObject detection
Object detection
 
Yolo
YoloYolo
Yolo
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
PR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox DetectorPR-132: SSD: Single Shot MultiBox Detector
PR-132: SSD: Single Shot MultiBox Detector
 
Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421
 
SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)
 
Object detection
Object detectionObject detection
Object detection
 
YOLO
YOLOYOLO
YOLO
 
Yolo
YoloYolo
Yolo
 

Similar a Deep learning for object detection

Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Jihong Kang
 
Object Detection An Overview
Object Detection An OverviewObject Detection An Overview
Object Detection An Overviewijtsrd
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Universitat Politècnica de Catalunya
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...NAVER Engineering
 
Modern convolutional object detectors
Modern convolutional object detectorsModern convolutional object detectors
Modern convolutional object detectorsKwanghee Choi
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesUnited States Air Force Academy
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
PR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIPPR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIPjaewon lee
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detectionMonicaDommaraju
 
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual RepresentationsSimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual Representationsynxm25hpxp
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET Journal
 
IRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET Journal
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IWanjin Yu
 
Object Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetObject Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetIRJET Journal
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level FeatureDongmin Choi
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generationijtsrd
 

Similar a Deep learning for object detection (20)

Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
object-detection.pptx
object-detection.pptxobject-detection.pptx
object-detection.pptx
 
Object Detection An Overview
Object Detection An OverviewObject Detection An Overview
Object Detection An Overview
 
Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...Faster R-CNN: Towards real-time object detection with region proposal network...
Faster R-CNN: Towards real-time object detection with region proposal network...
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
 
Modern convolutional object detectors
Modern convolutional object detectorsModern convolutional object detectors
Modern convolutional object detectors
 
Mobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large RepositoriesMobile Visual Search: Object Re-Identification Against Large Repositories
Mobile Visual Search: Object Re-Identification Against Large Repositories
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
PR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIPPR-110: An Analysis of Scale Invariance in Object Detection – SNIP
PR-110: An Analysis of Scale Invariance in Object Detection – SNIP
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detection
 
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual RepresentationsSimCLR: A Simple Framework for Contrastive Learning of Visual Representations
SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and Python
 
IRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A Survey
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet I
 
Object Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetObject Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNet
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level Feature
 
Deep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text GenerationDeep Learning for X ray Image to Text Generation
Deep Learning for X ray Image to Text Generation
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Deep learning for object detection

  • 1. Deep learning for object detection Wenjing Chen *Created in March 2017, might be outdated the time you read. Slide credit: CS231n
  • 2. Outline 1. Introduction 2. Common methods Region proposal based methods R-CNN, Fast R-CNN, Faster R-CNN, R-FCN, Mask R-CNN Single shot based methods YOLO, YOLOv2, SSD 1. Comparison
  • 3. Introduction one image -> one label one image -> labels + bounding boxes
  • 4. Region based methods - R-CNN Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
  • 5. Region based methods - Fast R-CNN Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.
  • 6. Region based methods - Faster R-CNN Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
  • 7. Region based methods - Faster R-CNN
  • 8. Region based methods - R-FCN Li, Yi, Kaiming He, and Jian Sun. "R-fcn: Object detection via region-based fully convolutional networks." Advances in Neural Information Processing Systems. 2016. Average pooling
  • 9. Region based methods - Mask R-CNN He, Kaiming, et al. "Mask R-CNN." arXiv preprint arXiv:1703.06870 (2017). Object instance segmentation:  Extend Faster R-CNN by adding a branch for predicting segmentation masks on each RoI  Running at 5 fps  Without tricks, outperforms all existing, single-model entries on every task in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection !!!
  • 10. Single shot based method - YOLO Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016. 1. Resize input image to 448*448. 1. Run a single convolutional network. Predicts B bounding boxes (4 coordinates + confidence) and C class probabilities for S*S grids, encoded as an S*S*(B*5+C) tensor. 1. Non-maximum suppression. S*S*B bounding boxes per image and C class probabilities for each box.
  • 11. Single shot based method - YOLOv2 Redmon, Joseph, and Ali Farhadi. "YOLO9000: Better, Faster, Stronger." arXiv preprint arXiv:1612.08242 (2016). YOLO problem: 1. Significant number of localization errors. 2. Low recall compared to region proposal based methods. Improvements:
  • 12. Single shot based method - SSD Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016. Improvements: 1. Use a small convolutional filter to predict object categories and offsets in bounding box locations 2. Use multiple layers for prediction at different scales.
  • 13. Comparison From YOLOv2 From SSD R-FCN 83.6% mAP 5.8fps R-FCN
  • 15. Comparison Speed single shot > region based Accuracy region based > single shot Complexity YOLO < SSD ≤ Faster R-CNN < R-FCN < YOLOv2(?)

Notas del editor

  1. Batch normalization. 2% more in mAP. High resolution classifier. 4% more in mAP. Convolutional with anchor boxes. 69.5 mAP 81% recall to 69.2 mAP 88% recall. Dimension clusters. Better anchor boxes priors. 60.9% to 67.2% in Avg IOU. Direct location prediction. Solve model instability. Fine-Grained features. 1% more in mAP. Multi-scale training.