From Transformer to Detection:
End-to-End Object Detection with
Transformers
Presented by Frost
IVUL@KAUST
Content
• What is DETR?
• DETR pipeline
• Background: Transformer
• Transformer in DETR
• Set-Based global loss
• Main Results
• Conclusion
• Ideas
• References
What is DETR?
• Approaches object detection as a direct set prediction problem.
• Transformer encoder-decoder architecture.
• Set-based global loss: forces unique predictions via bipartite matching.
• Predictions are produced in parallel rather than one at a time, which makes DETR fast and efficient at test time.
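As a rough illustration of "direct set prediction": the model emits a fixed-size set of slots, and slots that found nothing are labelled with a special "no object" class. The sketch below only shows how such an output could be read. The slot values, the `NO_OBJECT` name, the `keep_detections` helper, and the 0.5 threshold are all assumptions made up for this example, not values from the slides.

```python
# Hypothetical example data: labels, scores and boxes are invented for illustration.
NO_OBJECT = "no object"

def keep_detections(predictions, threshold=0.5):
    """Drop 'no object' slots and low-confidence slots from a fixed-size prediction set."""
    return [
        p for p in predictions
        if p["label"] != NO_OBJECT and p["score"] >= threshold
    ]

# A fixed-size set of N = 4 slots; most slots may legitimately be "no object".
predictions = [
    {"label": "cat", "score": 0.92, "box": (0.30, 0.40, 0.20, 0.25)},
    {"label": NO_OBJECT, "score": 0.99, "box": (0.50, 0.50, 0.10, 0.10)},
    {"label": "dog", "score": 0.35, "box": (0.70, 0.60, 0.30, 0.30)},
    {"label": NO_OBJECT, "score": 0.97, "box": (0.10, 0.10, 0.05, 0.05)},
]
detections = keep_detections(predictions)
```

Because every slot is already a complete prediction, no anchor generation or non-maximum suppression step appears in this sketch.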
DETR pipeline
Background: Self-Attention
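A minimal sketch of scaled dot-product self-attention in plain Python, assuming a single head and tiny hand-written matrices. The helper names (`matmul`, `softmax`, `self_attention`) and the identity projection matrices are assumptions chosen to keep the example small. A real model would learn the projections.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x (one row per token)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # transpose keys
    scores = matmul(q, k_t)                       # token-to-token similarity
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights            # weighted sum of values

# Identity projections stand in for learned weight matrices (an assumption).
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much one token looks at every other token, so every output row mixes information from the whole sequence.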
Background: Encoder-Decoder Attention
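A minimal sketch of encoder-decoder (cross) attention, assuming no learned projections and a single head: the queries come from the decoder side, while the keys and values come from the encoder output. The function name `cross_attention` and the toy vectors are assumptions for illustration only.

```python
import math

def cross_attention(decoder_states, encoder_states):
    """Each decoder state attends over all encoder states (no learned projections here)."""
    d = len(encoder_states[0])
    outputs = []
    for query in decoder_states:
        # Similarity between this decoder query and every encoder key.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in encoder_states]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the encoder values.
        outputs.append([sum(w * value[j] for w, value in zip(weights, encoder_states))
                        for j in range(d)])
    return outputs

encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # one row per input position
decoder_states = [[2.0, 0.0], [0.0, 2.0]]               # one row per output position
context = cross_attention(decoder_states, encoder_states)
```

The output has one row per decoder position, not per encoder position, which is what lets a small number of decoder slots summarize a long encoder sequence.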
Background: Transformer
• Sequence to Sequence
• Encoder-Decoder
• Encoder: Self-Attention + FFN
• Decoder: Self-Attention + Encoder-Decoder Attention + FFN
• Multi-Headed Attention
• Positional Encoding
• That’s it!
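Of the items in the list above, positional encoding is the easiest to show in a few lines. The sketch below uses the fixed sinusoidal scheme from the original Transformer paper. The function name and the small sizes are assumptions for this example, and the slides do not say which variant the speaker had in mind.

```python
import math

def sinusoidal_positional_encoding(num_positions, d_model):
    """Return one d_model-sized encoding per position: sine on even dims, cosine on odd dims."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the same wavelength.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

pe = sinusoidal_positional_encoding(num_positions=4, d_model=6)
```

Attention by itself is order-agnostic, so adding these vectors to the token features is what tells the model where each token sits in the sequence.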
Back to DETR
Transformer in DETR
1. Positional encodings are passed to every attention layer.
2. Queries are initially set to zero.
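The two points above can be sketched as follows, under the assumption that positions are added to the queries and keys of an attention layer while the values carry content only, and that learned query embeddings play the role of positions on the decoder side. All names (`add_pos`, `attention_inputs`, `tgt`, `query_pos`, `memory`, `memory_pos`) and numbers are hypothetical stand-ins, not taken from the slides.

```python
def add_pos(states, pos):
    """Element-wise sum of features and their positional encodings."""
    return [[s + p for s, p in zip(srow, prow)] for srow, prow in zip(states, pos)]

def attention_inputs(tgt, query_pos, memory, memory_pos):
    """Build (queries, keys, values) for one decoder cross-attention layer."""
    queries = add_pos(tgt, query_pos)      # position added to the queries
    keys = add_pos(memory, memory_pos)     # position added to the keys
    values = memory                        # values carry content only
    return queries, keys, values

num_queries, d_model = 3, 4
# Stand-in for learned query embeddings (assumption: one vector per query slot).
query_pos = [[0.1 * (i + 1)] * d_model for i in range(num_queries)]
# Point 2: the decoder input starts at zero.
tgt = [[0.0] * d_model for _ in range(num_queries)]
# Stand-ins for the encoder output and its positional encodings.
memory = [[1.0] * d_model, [2.0] * d_model]
memory_pos = [[0.5] * d_model, [0.25] * d_model]

# Point 1: this call would be repeated at every layer, not just once at the input.
queries, keys, values = attention_inputs(tgt, query_pos, memory, memory_pos)
```

With a zero input, the first-layer queries are exactly the query embeddings, so what distinguishes one slot from another at the start is purely its learned embedding.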
Set-based global loss
• Bipartite matching between the N predictions and the ground-truth objects
• The matching is a permutation of N elements
• The box term of the matching favors high overlap (maximize IoU)
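A toy sketch of the matching step, assuming equal numbers of predictions and targets and using box IoU as the only score. It tries every permutation by brute force, which is only feasible for tiny N. DETR itself uses the Hungarian algorithm and a matching cost that also includes a class term, so this is a simplified stand-in. The function names and boxes are made up for illustration.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def best_matching(predictions, targets):
    """Try every permutation and keep the one with the highest total IoU.

    perm[t] is the index of the prediction assigned to target t.
    """
    best_perm, best_score = None, -1.0
    for perm in permutations(range(len(predictions))):
        score = sum(iou(predictions[p], targets[t]) for t, p in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score

# Hypothetical boxes: predictions arrive in a different order than the targets.
targets = [(0, 0, 2, 2), (5, 5, 7, 7), (10, 0, 12, 2)]
predictions = [(5, 5, 7, 8), (10, 0, 12, 2), (0, 0, 2, 3)]
perm, total_iou = best_matching(predictions, targets)
```

Because each prediction is matched to at most one target, two slots cannot both claim the same object, which is how the loss forces unique predictions.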
Main Results
Conclusion
• A fresh design for object detection systems, based on transformers and a bipartite matching loss for direct set prediction.
• Significantly better performance on large objects than Faster R-CNN.
• Global information is captured by self-attention.
• It underperforms on small objects compared with other object detectors of the same magnitude.
• Training takes a long time, and the model does not run in real time.
• The transformer architecture adds significant overhead in training and inference.
Ideas
• Detection in 3D
• Temporal Action Localization
• Segmentation / Super-resolution
• ...
References
• DETR Paper
• http://jalammar.github.io/illustrated-transformer/
• https://ai.facebook.com/blog/end-to-end-object-detection-with-transformers/
• https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04
• https://www.geeksforgeeks.org/maximum-bipartite-matching/
• https://medium.com/visionwizard/detr-b677c7016a47

DETR ECCV20


Editor's notes

  1. Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions.