Keeping Up with Recent Research Trends: With a Focus on Deep Learning
Lecture 1, Fei-Fei Li & Justin Johnson & Serena Yeung, 4/4/2017
[Figure: Computer Vision and its neighboring fields. Labels: Neuroscience; Machine learning; Speech, NLP; Information retrieval; Mathematics; Computer Science (graphics, algorithms, theory, …; systems, architecture, …); Biology; Engineering; Physics (optics); Robotics; Cognitive sciences; Psychology; Image processing.]
R-CNN
Piotr Dollár, Ross Girshick
search (FAIR)
[Figure 1. The Mask R-CNN framework for instance segmentation. Diagram labels: RoIAlign, conv, class, box.]
a fixed set of categories without differentiating object instances. Given this, one might expect a complex method is required to achieve good results. However, we show that a surprisingly simple, flexible, and fast system can surpass
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals, Google, vinyals@google.com
Alexander Toshev, Google, toshev@google.com
Samy Bengio, Google, bengio@google.com
Dumitru Erhan, Google, dumitru@google.com
Abstract
Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU-1 score (the higher the
[Figure 1. NIC, our model, is based end-to-end on a neural network consisting of a vision CNN followed by a language generating RNN. It generates complete sentences in natural language from an input image, as shown on the example above. Diagram: an input image passes through a vision deep CNN and then a language generating RNN, which outputs captions such as "A group of people shopping at an outdoor market." and "There are many vegetables at the fruit stand."]
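The training objective stated in the abstract, maximizing the likelihood of the target sentence given the image, can be illustrated with a toy calculation. This is a minimal sketch and not the authors' implementation: the vocabulary, the per-step probability tables, and the function name `sentence_log_likelihood` are all hypothetical. In a real system the per-step distributions would come from an RNN conditioned on CNN image features.

```python
import math


def sentence_log_likelihood(step_probs, target_tokens):
    """Sum of log p(token_t | image, previous tokens) over the target sentence.

    step_probs[t] is a dict mapping each vocabulary word to the probability
    that a (hypothetical) decoder assigns to it at step t.
    """
    if len(step_probs) != len(target_tokens):
        raise ValueError("need exactly one distribution per target token")
    total = 0.0
    for probs, token in zip(step_probs, target_tokens):
        total += math.log(probs[token])
    return total


# Made-up decoder outputs for a three-token caption (illustrative only).
step_probs = [
    {"a": 0.5, "dog": 0.25, "<end>": 0.25},
    {"a": 0.25, "dog": 0.5, "<end>": 0.25},
    {"a": 0.25, "dog": 0.25, "<end>": 0.5},
]
target = ["a", "dog", "<end>"]

log_lik = sentence_log_likelihood(step_probs, target)
loss = -log_lik  # training would minimize this negative log-likelihood
```

Summing log-probabilities rather than multiplying raw probabilities keeps the value numerically stable for long sentences.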
existing solutions of the above sub-problems, in order to go
from an image to its description [6, 16]. In contrast, we
Perceptual Generative Adversarial Networks for Small Object Detection
Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, Jiashi Feng, Shuicheng Yan
Abstract
Detecting small objects is notoriously challenging due to their low resolution and noisy representation. Existing object detection pipelines usually detect small objects through learning representations of all the objects at multiple scales. However, the performance gain of such ad hoc architectures is usually too limited to pay off the computational cost. In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus more discriminative for detection. For this purpose, we propose a new Perceptual Generative Adversarial Network (Perceptual GAN) model that improves small object
[Figure 1. Large and small objects exhibit different representations from high-level convolutional layers of a CNN detector. The representations of large objects are discriminative while those of small objects are of low resolution, which hurts the detection accuracy. In this work, we introduce the Perceptual GAN model to enhance the representations for small objects to be similar to real large objects, thus improving detection performance on the small objects. Diagram: the Perceptual GAN maps features for a small instance to super-resolved features that approximate (≈) the features for a large instance.]
[cs.CV] 20 Jun 2017
… (top) and Cityscapes (bottom) using a single ResNet-101-FPN network.
Method        PQ    PQ^Th  PQ^St  mIoU  AP
DIN [1]       53.8  42.5   62.1   -     28.6
Panoptic FPN  58.1  52.0   62.5   75.7  33.0
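The PQ columns in the table above report panoptic quality, which this excerpt does not define. As a hedged illustration, the sketch below uses the commonly cited definition: the sum of IoUs over matched segment pairs, divided by the number of true positives plus half the false positives and half the false negatives, where a match requires IoU above 0.5. PQ^Th and PQ^St are the same metric restricted to "thing" and "stuff" classes. The function name and the sample numbers are hypothetical and are not taken from the table.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """PQ = sum(IoU over matched pairs) / (TP + 0.5 * FP + 0.5 * FN).

    matched_ious holds one IoU per true-positive match; under the usual
    definition every match must have IoU strictly greater than 0.5.
    """
    if any(iou <= 0.5 or iou > 1.0 for iou in matched_ious):
        raise ValueError("matched IoUs must lie in (0.5, 1.0]")
    true_positives = len(matched_ious)
    denominator = (true_positives
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        return 0.0  # no segments at all; returning 0 is a convention chosen here
    return sum(matched_ious) / denominator


# Illustrative numbers only: three matches, one spurious and one missed segment.
pq = panoptic_quality([0.9, 0.8, 0.6], num_false_positives=1, num_false_negatives=1)
pq_percent = 100.0 * pq  # tables usually report PQ as a percentage
```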
Features for Amodal 3D Object Detection
Zhixin Wang and Kui Jia
Abstract. In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in an RGB image, our method first generates a sequence of frustums for each region proposal, and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features as frustum-level feature vectors, and arrays these feature vectors as a feature map for use of its subsequent component of fully convolutional network (FCN), which spatially fuses frustum-level features and supports an end-to-end and continuous estimation of oriented boxes in the 3D space. We also propose component variants of F-ConvNet, including an FCN variant that extracts multi-resolution frustum features, and a refined use of F-ConvNet over a reduced 3D space. Careful ablation studies verify the efficacy of these component variants. F-ConvNet assumes no prior knowledge of the working 3D environment, and is thus dataset-agnostic. We present experiments on both the indoor SUN-RGBD and outdoor KITTI datasets. F-ConvNet outperforms all existing methods on SUN-RGBD, and at the time of submission it outperforms all published works on the KITTI benchmark. We will make the code of F-ConvNet publicly available.
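The grouping step described in the abstract, where frustums collect local points and point-wise features are aggregated into one vector per frustum, can be sketched in a simplified form. The code below is an assumption-laden illustration rather than the paper's method: it presumes the points have already been restricted to those projecting inside one 2D region proposal, it slices depth into equal non-overlapping bins, and it uses element-wise max pooling in place of a learned feature extractor. The function name and all numbers are hypothetical.

```python
def group_points_into_frustums(points, features, depth_min, depth_max, num_frustums):
    """Bin points by depth and max-pool their features per depth slice.

    points   : list of (x, y, z) tuples, z being depth along the camera axis
    features : list of per-point feature lists, aligned with points
    Returns a list of num_frustums pooled feature vectors (zeros if a slice is empty).
    """
    if num_frustums <= 0 or depth_max <= depth_min:
        raise ValueError("invalid depth range or frustum count")
    step = (depth_max - depth_min) / num_frustums
    dim = len(features[0]) if features else 0
    pooled = [None] * num_frustums
    for (_, _, z), feat in zip(points, features):
        if not (depth_min <= z < depth_max):
            continue  # point lies outside the depth range covered by the slices
        idx = min(int((z - depth_min) / step), num_frustums - 1)
        if pooled[idx] is None:
            pooled[idx] = list(feat)
        else:
            pooled[idx] = [max(a, b) for a, b in zip(pooled[idx], feat)]
    return [p if p is not None else [0.0] * dim for p in pooled]


# Illustrative data: four points with 2-D features, four depth slices over [0, 4).
points = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.5), (0.0, 0.0, 3.2), (0.0, 0.0, 9.0)]
features = [[1.0, 0.0], [0.0, 2.0], [5.0, 5.0], [7.0, 7.0]]
frustum_features = group_points_into_frustums(points, features, 0.0, 4.0, 4)
```

Stacking the resulting vectors in depth order gives a sequence that a 1D convolution could slide over, which is the role the abstract assigns to the subsequent FCN.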
I. INTRODUCTION
Detection of object instances in 3D sensory data has tremendous importance in many applications including autonomous driving, robotic object manipulation, and augmented reality. Among others, RGB-D images and LiDAR point clouds are the most representative formats of 3D
[Fig. 1: Illustration of how a sequence of frustums is generated for a region proposal in an RGB image.]
or volumes, these methods suffer from loss of critical 3D information in the projection or quantization process.
With the progress of point set deep learning [11], [12], recent methods [13], [14] resort to learning features directly from raw point clouds. For example, the seminal work of F-PointNet [13] first finds local points corresponding to pixels inside a 2D region proposal, and then uses PointNet [11] to segment the foreground points from these local points; the amodal 3D box is finally estimated from the foreground points. The performance of this method is limited because (1) it is not trained end-to-end,
.01864v1 [cs.CV] 5 Mar 2019
[Table: methods compared: MV3D [5], VoxelNet [14], F-PointNet [13], AVOD-FPN [6], SECOND [15], IPOD [22], PointPillars [16], PointRCNN-v1.1 [23], Ours.]
[Fig. 7: Qualitative results on the … different categories, with green f…]
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
Chen Wang (2), Danfei Xu (1), Yuke Zhu (1), Roberto Martín-Martín (1), Cewu Lu (2), Li Fei-Fei (1), Silvio Savarese (1)
(1) Department of Computer Science, Stanford University
(2) Department of Computer Science, Shanghai Jiao Tong University
Abstract
A key technical challenge in performing 6D object pose estimation from RGB-D images is to fully leverage the two complementary data sources. Prior works either extract information from the RGB image and depth separately or use costly post-processing steps, limiting their performance in highly cluttered scenes and real-time applications. In this work, we present DenseFusion, a generic framework for estimating 6D pose of a set of known objects from RGB-D images. DenseFusion is a heterogeneous architecture that processes the two data sources individually and uses a novel dense fusion network to extract pixel-wise dense feature embedding, from which the pose is estimated. Furthermore, we integrate an end-to-end iterative pose refinement
[Figure 1. We develop an end-to-end deep network model for 6D … Diagram labels: RGB-D, DenseFusion.]
1 [cs.CV] 15 Jan 2019
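The abstract above describes processing color and depth separately and then fusing them into a pixel-wise dense feature embedding. The sketch below shows one plausible reading of that idea under loud assumptions: per-pixel color and geometry embeddings are already computed and aligned, fusion is plain concatenation, and a global average over all fused pixels is appended to each pixel. The global-average step, the function name `fuse_pixelwise`, and the numbers are illustrative choices, not details confirmed by this excerpt.

```python
def fuse_pixelwise(color_feats, geo_feats):
    """Concatenate aligned per-pixel color and geometry embeddings.

    Each output row is [color | geometry | global mean of all fused rows].
    Appending the global mean is an assumption made for this illustration.
    """
    if not color_feats or len(color_feats) != len(geo_feats):
        raise ValueError("need the same non-zero number of color and geometry rows")
    fused = [list(c) + list(g) for c, g in zip(color_feats, geo_feats)]
    count = len(fused)
    dim = len(fused[0])
    global_feat = [sum(row[i] for row in fused) / count for i in range(dim)]
    return [row + global_feat for row in fused]


# Two pixels with 2-D color embeddings and 1-D geometry embeddings (made up).
color = [[1.0, 2.0], [3.0, 4.0]]
geometry = [[0.0], [2.0]]
dense_embedding = fuse_pixelwise(color, geometry)
```

Keeping one fused vector per pixel, instead of one vector per image, is what allows a pose estimate to be made from every pixel, which matches the "dense" wording of the abstract.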
Deep Learning for Generic Object Detection: A Survey
Li Liu (1,2) · Wanli Ouyang (3) · Xiaogang Wang (4) · Paul Fieguth (5) · Jie Chen (2) · Xinwang Liu (1) · Matti Pietikäinen (2)
(1) National University of Defense Technology, China
(2) University of Oulu, Finland
(3) University of Sydney, Australia
(4) Chinese University of Hong Kong, China
(5) University of Waterloo, Canada
Li Liu (li.liu@oulu.fi)
Wanli Ouyang (wanli.ouyang@sydney.edu.au)
Xiaogang Wang (xgwang@ee.cuhk.edu.hk)
Paul Fieguth (pfieguth@uwaterloo.ca)
Jie Chen (jie.chen@oulu.fi)
Xinwang Liu (xinwangliu@nudt.edu.cn)
Matti Pietikäinen (matti.pietikainen@oulu.fi)
Received: 12 September 2018
arXiv:1809.02165v1 [cs.CV] 6 Sep 2018
Abstract Generic object detection, aiming at locating object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in the field of generic object detection. Given this time of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought by deep learning techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection research: leading detection frameworks and fundamental subproblems including object feature representation, object proposal generation, context information modeling and training strategies; evaluation issues, specifically benchmark datasets, evaluation metrics, and state of the art performance. We finish by identifying promising directions for future research.
Keywords Object detection · deep learning · convolutional neural networks · object recognition
1 Introduction
As a longstanding, fundamental and challenging problem in computer vision, object detection has been an active area of research for several decades. The goal of object detection is to determine whether or not there are any instances of objects from the given categories (such as humans, cars, bicycles, dogs and cats) in some given image and, if present, to return the spatial location and extent of each object instance (e.g., via a bounding box [53, 179]). As the cornerstone of image understanding and computer vision, object detection forms the basis for solving more complex or high level vision tasks such as segmentation, scene understanding, object tracking, image captioning, event detection, and activity recognition. Object detection has a wide range of applications in many areas of artificial intelligence and information technologies, including robot vision, consumer electronics, security, autonomous driving, human computer interaction, content based image retrieval, intelligent video surveillance, and augmented reality.
[Fig. 1 Recent evolution of object detection performance. We can observe significant performance (mean average precision) improvement since deep learning entered the scene in 2012. The performance of the best detector has been steadily increasing by a significant amount on a yearly basis. (a) Results on the PASCAL VOC datasets: Detection results of winning entries in the VOC2007-2012 competitions (using only provided training data). (b) Top object detection competition results in ILSVRC2013-2017 (using only provided training data). Plot annotations: "Turning Point in 2012: Deep Learning Achieved Record Breaking Image Classification Result"; "Results on VOC2012 Data"; horizontal axes: VOC year, ILSVRC year.]
Recently, deep learning techniques [81, 116] have emerged as powerful methods for learning feature representations automatically from data. In particular, these techniques have provided significant improvement for object detection, a problem which has attracted enormous attention in the last five years, even though it has been studied for decades by psychophysicists, neuroscientists, and engineers.
Object detection can be grouped into one of two types [69, 240]: detection of specific instance and detection of specific categories. The first type aims at detecting instances of a particular object (such as Donald Trump's face, the Pentagon building, or my dog Penny), whereas the goal of the second type is to detect different instances of predefined object categories (for example humans,
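The introduction above describes localizing each object with a bounding box, and Fig. 1 reports mean average precision. Both rest on a way to decide whether a predicted box matches a ground-truth box. A common choice is intersection over union (IoU), sketched below. The corner-coordinate box format `(x1, y1, x2, y2)` and the function name are assumptions made for this example, and the 0.5 threshold in the comment is a widely used convention rather than something stated in this excerpt.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
iou = box_iou(predicted, ground_truth)
# A detection is often counted as correct when IoU >= 0.5 (a convention, assumed here).
is_match = iou >= 0.5
```

Average precision is then computed per category from the ranked list of matched and unmatched detections, and mean average precision averages that value over categories.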

Más contenido relacionado

La actualidad más candente

Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)IJERA Editor
 
Color Based Object Tracking with OpenCV A Survey
Color Based Object Tracking with OpenCV A SurveyColor Based Object Tracking with OpenCV A Survey
Color Based Object Tracking with OpenCV A SurveyYogeshIJTSRD
 
Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015belltailjp
 
Visual Object Category Recognition
Visual Object Category RecognitionVisual Object Category Recognition
Visual Object Category RecognitionAshish Gupta
 
Introduction to Object recognition
Introduction to Object recognitionIntroduction to Object recognition
Introduction to Object recognitionAshiq Ullah
 
Artificial Neural Network For Recognition Of Handwritten Devanagari Character
Artificial Neural Network For Recognition Of Handwritten Devanagari CharacterArtificial Neural Network For Recognition Of Handwritten Devanagari Character
Artificial Neural Network For Recognition Of Handwritten Devanagari CharacterIOSR Journals
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: reviewDmytro Mishkin
 
Survey on video object detection & tracking
Survey on video object detection & trackingSurvey on video object detection & tracking
Survey on video object detection & trackingijctet
 
Occlusion and Abandoned Object Detection for Surveillance Applications
Occlusion and Abandoned Object Detection for Surveillance ApplicationsOcclusion and Abandoned Object Detection for Surveillance Applications
Occlusion and Abandoned Object Detection for Surveillance ApplicationsEditor IJCATR
 
Object tracking a survey
Object tracking a surveyObject tracking a survey
Object tracking a surveyHaseeb Hassan
 
Object Detection & Tracking
Object Detection & TrackingObject Detection & Tracking
Object Detection & TrackingAkshay Gujarathi
 
Object Detection and tracking in Video Sequences
Object Detection and tracking in Video SequencesObject Detection and tracking in Video Sequences
Object Detection and tracking in Video SequencesIDES Editor
 
Object Capturing In A Cluttered Scene By Using Point Feature Matching
Object Capturing In A Cluttered Scene By Using Point Feature MatchingObject Capturing In A Cluttered Scene By Using Point Feature Matching
Object Capturing In A Cluttered Scene By Using Point Feature MatchingIJERA Editor
 
Moving object detection
Moving object detectionMoving object detection
Moving object detectionManav Mittal
 
Object tracking
Object trackingObject tracking
Object trackingchirase44
 
Presentation Object Recognition And Tracking Project
Presentation Object Recognition And Tracking ProjectPresentation Object Recognition And Tracking Project
Presentation Object Recognition And Tracking ProjectPrathamesh Joshi
 

La actualidad más candente (20)

Object detection
Object detectionObject detection
Object detection
 
[IJET V2I3P2] Authors: Shraddha Kallappa Walikar, Dr. Aswatha Kumar M
[IJET V2I3P2] Authors: Shraddha Kallappa Walikar,  Dr. Aswatha Kumar M[IJET V2I3P2] Authors: Shraddha Kallappa Walikar,  Dr. Aswatha Kumar M
[IJET V2I3P2] Authors: Shraddha Kallappa Walikar, Dr. Aswatha Kumar M
 
Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)
 
Jw2517421747
Jw2517421747Jw2517421747
Jw2517421747
 
Color Based Object Tracking with OpenCV A Survey
Color Based Object Tracking with OpenCV A SurveyColor Based Object Tracking with OpenCV A Survey
Color Based Object Tracking with OpenCV A Survey
 
Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015Digest of Human Detection from CVPR2015
Digest of Human Detection from CVPR2015
 
Visual Object Category Recognition
Visual Object Category RecognitionVisual Object Category Recognition
Visual Object Category Recognition
 
Introduction to Object recognition
Introduction to Object recognitionIntroduction to Object recognition
Introduction to Object recognition
 
Artificial Neural Network For Recognition Of Handwritten Devanagari Character
Artificial Neural Network For Recognition Of Handwritten Devanagari CharacterArtificial Neural Network For Recognition Of Handwritten Devanagari Character
Artificial Neural Network For Recognition Of Handwritten Devanagari Character
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: review
 
Survey on video object detection & tracking
Survey on video object detection & trackingSurvey on video object detection & tracking
Survey on video object detection & tracking
 
Occlusion and Abandoned Object Detection for Surveillance Applications
Occlusion and Abandoned Object Detection for Surveillance ApplicationsOcclusion and Abandoned Object Detection for Surveillance Applications
Occlusion and Abandoned Object Detection for Surveillance Applications
 
Object tracking a survey
Object tracking a surveyObject tracking a survey
Object tracking a survey
 
Object Detection & Tracking
Object Detection & TrackingObject Detection & Tracking
Object Detection & Tracking
 
Object Detection and tracking in Video Sequences
Object Detection and tracking in Video SequencesObject Detection and tracking in Video Sequences
Object Detection and tracking in Video Sequences
 
Object Capturing In A Cluttered Scene By Using Point Feature Matching
Object Capturing In A Cluttered Scene By Using Point Feature MatchingObject Capturing In A Cluttered Scene By Using Point Feature Matching
Object Capturing In A Cluttered Scene By Using Point Feature Matching
 
Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
Object tracking
Object trackingObject tracking
Object tracking
 
Object recognition
Object recognitionObject recognition
Object recognition
 
Presentation Object Recognition And Tracking Project
Presentation Object Recognition And Tracking ProjectPresentation Object Recognition And Tracking Project
Presentation Object Recognition And Tracking Project
 

Similar a 最近の研究情勢についていくために - Deep Learningを中心に -

IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET Journal
 
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object TrackingIntegrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object Trackingijsrd.com
 
Computer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an ObjectComputer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an ObjectIOSR Journals
 
fuzzy LBP for face recognition ppt
fuzzy LBP for face recognition pptfuzzy LBP for face recognition ppt
fuzzy LBP for face recognition pptAbdullah Gubbi
 
Object Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetObject Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetIRJET Journal
 
ArtificialIntelligenceInObjectDetection-Report.pdf
ArtificialIntelligenceInObjectDetection-Report.pdfArtificialIntelligenceInObjectDetection-Report.pdf
ArtificialIntelligenceInObjectDetection-Report.pdfAbishek86232
 
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...paperpublications3
 
A Literature Survey: Neural Networks for object detection
A Literature Survey: Neural Networks for object detectionA Literature Survey: Neural Networks for object detection
A Literature Survey: Neural Networks for object detectionvivatechijri
 
Recognition and Detection of Real-Time Objects Using Unified Network of Faste...
Keeping Up with Recent Research Trends - Focusing on Deep Learning -

  • 15. Lecture 1 - Fei-Fei Li & Justin Johnson & Serena Yeung (4/4/2017). Diagram of the fields surrounding Computer Vision: neuroscience, machine learning, speech and NLP, information retrieval, mathematics, computer science (graphics, algorithms, theory, systems, architecture, …), biology, engineering, physics (optics), robotics, cognitive sciences, psychology, image processing.
  • 17. Excerpts from three papers.
    Mask R-CNN (authors visible: Piotr Dollár, Ross Girshick; FAIR). Diagram labels: RoIAlign, conv, class, box. Figure 1. The Mask R-CNN framework for instance segmentation. Text: "… a fixed set of categories without differentiating object instances. Given this, one might expect a complex method is required to achieve good results. However, we show that a surprisingly simple, flexible, and fast system can surpass …"
    Show and Tell: A Neural Image Caption Generator. Oriol Vinyals (vinyals@google.com), Alexander Toshev (toshev@google.com), Samy Bengio (bengio@google.com), Dumitru Erhan (dumitru@google.com), Google. Abstract: Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU-1 score (the higher the … Diagram: Vision Deep CNN, then Language Generating RNN, producing "A group of people shopping at an outdoor market. There are many vegetables at the fruit stand." Figure 1. NIC, our model, is based end-to-end on a neural network consisting of a vision CNN followed by a language generating RNN. It generates complete sentences in natural language from an input image, as shown on the example above. Text: "… existing solutions of the above sub-problems, in order to go from an image to its description [6, 16]. In contrast, we …"
    Perceptual Generative Adversarial Networks for Small Object Detection. Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, Jiashi Feng, Shuicheng Yan. Abstract: Detecting small objects is notoriously challenging due to their low resolution and noisy representation. Existing object detection pipelines usually detect small objects through learning representations of all the objects at multiple scales. However, the performance gain of such ad hoc architectures is usually limited to pay off the computational cost. In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus more discriminative for detection. For this purpose, we propose a new Perceptual Generative Adversarial Network (Perceptual GAN) model that improves small object … Diagram labels: Features For Small Instance, Perceptual GAN, Super-resolved Features ≈ Features For Large Instance. Figure 1. Large and small objects exhibit different representations from high-level convolutional layers of a CNN detector. The representations of large objects are discriminative while those of small objects are of low resolution, which hurts the detection accuracy. In this work, we introduce the Perceptual GAN model to enhance the representations for small objects to be similar to real large objects, thus improve detection performance on the small objects. [cs.CV] 20 Jun 2017
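The Show and Tell excerpt on slide 17 says the model is trained to maximize the likelihood of the target description given the image. A minimal sketch of that objective follows, assuming the usual chain-rule factorization of the sentence probability into per-word terms (the excerpt itself does not spell this out). The function name `caption_log_likelihood` and the toy probabilities are hypothetical and only for illustration; a real model would produce `step_probs` from the CNN and RNN.

```python
import math


def caption_log_likelihood(step_probs, target_ids):
    """Sum of log p(word_t | image, previous words) over the target sentence.

    step_probs[t] is a (hypothetical) model's distribution over the
    vocabulary at step t, already conditioned on the image and on the
    preceding target words; target_ids[t] is the index of the true word.
    """
    if len(step_probs) != len(target_ids):
        raise ValueError("one distribution per target word is required")
    return sum(math.log(probs[word]) for probs, word in zip(step_probs, target_ids))


# Toy example: vocabulary of 3 words, target sentence of 2 words.
step_probs = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1]]
target_ids = [0, 1]

log_lik = caption_log_likelihood(step_probs, target_ids)
loss = -log_lik  # maximizing the likelihood equals minimizing this loss
```

Working with log-probabilities turns the product over words into a sum, which is why the training loss is normally written as a negative log-likelihood.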
  • 18. Excerpts from three papers.
    Panoptic FPN figure and table residue. Caption fragment: "…O (top) and Cityscapes (bottom) using a single ResNet-101-FPN network." Table columns: PQ, PQTh, PQSt, mIoU, AP. DIN [1]: 53.8, 42.5, 62.1, -, 28.6. Panoptic FPN: 58.1, 52.0, 62.5, 75.7, 33.0.
    … Features for Amodal 3D Object Detection. Zhixin Wang and Kui Jia. Abstract: In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in a RGB image, our method first generates a sequence of frustums for each region proposal, and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features as frustum-level feature vectors, and arrays these feature vectors as a feature map for use of its subsequent component of fully convolutional network (FCN), which spatially fuses frustum-level features and supports an end-to-end and continuous estimation of oriented boxes in the 3D space. We also propose component variants of F-ConvNet, including a FCN variant that extracts multi-resolution frustum features, and a refined use of F-ConvNet over a reduced 3D space. Careful ablation studies verify the efficacy of these component variants. F-ConvNet assumes no prior knowledge of the working 3D environment, and is thus dataset-agnostic. We present experiments on both the indoor SUN-RGBD and outdoor KITTI datasets. F-ConvNet outperforms all existing methods on SUN-RGBD, and at the time of submission it outperforms all published works on the KITTI benchmark. We will make the code of F-ConvNet publicly available. I. INTRODUCTION: Detection of object instances in 3D sensory data has tremendous importance in many applications including autonomous driving, robotic object manipulation, and augmented reality. Among others, RGB-D images and LiDAR point clouds are the most representative formats of 3D … Fig. 1: Illustration for how a sequence of frustums are generated for a region proposal in a RGB image. Text: "… or volumes, these methods suffer from loss of critical 3D information in the projection or quantization process. With the progress of point set deep learning [11], [12], recent methods [13], [14] resort to learning features directly from raw point clouds. For example, the seminal work of F-PointNet [13] first finds local points corresponding to pixels inside a 2D region proposal, and then uses PointNet [11] to segment from these local points the foreground ones; the amodal 3D box is finally estimated from the foreground points. Performance of this method is limited due to the reasons that (1) it is not of end-to-end learning, …" Table residue (method column only): MV3D [5], VoxelNet [14], F-PointNet [13], AVOD-FPN [6], SECOND [15], IPOD [22], PointPillars [16], PointRCNN-v1.1 [23], Ours. Fig. 7: Qualitative results on the different categories, with green … arXiv stamp: ….01864v1 [cs.CV] 5 Mar 2019
    DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion. Chen Wang (2), Danfei Xu (1), Yuke Zhu (1), Roberto Martín-Martín (1), Cewu Lu (2), Li Fei-Fei (1), Silvio Savarese (1). (1) Department of Computer Science, Stanford University; (2) Department of Computer Science, Shanghai Jiao Tong University. Abstract: A key technical challenge in performing 6D object pose estimation from RGB-D image is to fully leverage the two complementary data sources. Prior works either extract information from the RGB image and depth separately or use costly post-processing steps, limiting their performances in highly cluttered scenes and real-time applications. In this work, we present DenseFusion, a generic framework for estimating 6D pose of a set of known objects from RGB-D images. DenseFusion is a heterogeneous architecture that processes the two data sources individually and uses a novel dense fusion network to extract pixel-wise dense feature embedding, from which the pose is estimated. Furthermore, we integrate an end-to-end iterative pose refinement … Diagram labels: RGB-D, DenseFusion. Figure 1. We develop an end-to-end deep network model for 6D … arXiv stamp: …1 [cs.CV] 15 Jan 2019
  • 19. Deep Learning for Generic Object Detection: A Survey. Li Liu (1,2) · Wanli Ouyang (3) · Xiaogang Wang (4) · Paul Fieguth (5) · Jie Chen (2) · Xinwang Liu (1) · Matti Pietikäinen (2). Received: 12 September 2018. arXiv:1809.02165v1 [cs.CV] 6 Sep 2018
    Abstract: Generic object detection, aiming at locating object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in the field of generic object detection. Given this time of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought by deep learning techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection research: leading detection frameworks and fundamental subproblems including object feature representation, object proposal generation, context information modeling and training strategies; evaluation issues, specifically benchmark datasets, evaluation metrics, and state of the art performance. We finish by identifying promising directions for future research.
    Keywords: Object detection · deep learning · convolutional neural networks · object recognition
    1 Introduction: As a longstanding, fundamental and challenging problem in computer vision, object detection has been an active area of research for several decades. The goal of object detection is to determine whether or not there are any instances of objects from the given categories (such as humans, cars, bicycles, dogs and cats) in some given image and, if present, to return the spatial location and extent of each object instance (e.g., via a bounding box [53, 179]). As the cornerstone of image understanding and computer vision, object detection forms the basis for solving more complex or high level vision tasks such as segmentation, scene understanding, object tracking, image captioning, event detection, and activity recognition. Object detection has a wide range of applications in many areas of artificial intelligence and information technologies, including robot vision, consumer electronics, security, autonomous driving, human computer interaction, content based image retrieval, intelligent video surveillance, and augmented reality.
    Recently, deep learning techniques [81, 116] have emerged as powerful methods for learning feature representations automatically from data. In particular, these techniques have provided significant improvement for object detection, a problem which has attracted enormous attention in the last five years, even though it has been studied for decades by psychophysicists, neuroscientists, and engineers.
    Object detection can be grouped into one of two types [69, 240]: detection of specific instance and detection of specific categories. The first type aims at detecting instances of a particular object (such as Donald Trump's face, the Pentagon building, or my …
    Fig. 1 (panels: (a) VOC year, (b) ILSVRC year; annotations: "Results on VOC2012 Data", "Turning Point in 2012: Deep Learning Achieved Record Breaking Image Classification Result"). Recent evolution of object detection performance. We can observe significant performance (mean average precision) improvement since deep learning entered the scene in 2012. The performance of the best detector has been steadily increasing by a significant amount on a yearly basis. (a) Results on the PASCAL VOC datasets: Detection results of winning entries in the VOC2007-2012 competitions (using only provided training data). (b) Top object detection competition results in ILSVRC2013-2017 (using only provided training data).
    Contacts: Li Liu (li.liu@oulu.fi), Wanli Ouyang (wanli.ouyang@sydney.edu.au), Xiaogang Wang (xgwang@ee.cuhk.edu.hk), Paul Fieguth (pfieguth@uwaterloo.ca), Jie Chen (jie.chen@oulu.fi), Xinwang Liu (xinwangliu@nudt.edu.cn), Matti Pietikäinen (matti.pietikainen@oulu.fi). Affiliations: 1 National University of Defense Technology, China; 2 University of Oulu, Finland; 3 University of Sydney, Australia; 4 Chinese University of Hong Kong, China.
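The survey excerpt describes a detector's output as the location and extent of each object instance, for example via a bounding box. A minimal sketch of such an output follows, together with intersection over union (IoU). IoU is not mentioned in the excerpt; it is assumed here as the common overlap measure for comparing a predicted box with a ground-truth box. The names `Box`, `Detection` and `iou` are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its top-left (x1, y1) and bottom-right (x2, y2) corners."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Clamp at zero so an "inverted" box (no overlap) has zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


@dataclass(frozen=True)
class Detection:
    """One detected instance: category label, confidence score, and extent."""
    label: str
    score: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area; 0.0 when the boxes do not overlap."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


pred = Detection("dog", 0.9, Box(0, 0, 10, 10))
truth = Box(5, 0, 15, 10)
overlap = iou(pred.box, truth)  # the boxes share half of each box's area
```

In typical benchmark protocols a prediction counts as correct when its IoU with a ground-truth box of the same category exceeds a chosen threshold, and average precision is then computed from the ranked detections.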
  • 23. Deep Learning for Generic Object Detection: A Survey Li Liu 1,2 · Wanli Ouyang 3 · Xiaogang Wang 4 · Paul Fieguth 5 · Jie Chen 2 · Xinwang Liu 1 · Matti Pietik¨ainen 2 Received: 12 September 2018 Abstract Generic object detection, aiming at locating object in- stances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in re- cent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in the field of generic object detection. Given this time of rapid evo- lution, the goal of this paper is to provide a comprehensive sur- vey of the recent achievements in this field brought by deep learn- ing techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection re- search: leading detection frameworks and fundamental subprob- lems including object feature representation, object proposal gen- eration, context information modeling and training strategies; eval- uation issues, specifically benchmark datasets, evaluation metrics, and state of the art performance. We finish by identifying promis- ing directions for future research. Keywords Object detection · deep learning · convolutional neural networks · object recognition 1 Introduction As a longstanding, fundamental and challenging problem in com- puter vision, object detection has been an active area of research for several decades. 
The goal of object detection is to determine whether or not there are any instances of objects from the given categories (such as humans, cars, bicycles, dogs and cats) in some given image and, if present, to return the spatial location and extent of each object instance (e.g., via a bounding box [53, 179]). As the cornerstone of image understanding and computer vision, object detection forms the basis for solving more complex or high level vision tasks such as segmentation, scene understanding, object tracking, image captioning, event detection, and activity recognition.
Li Liu (li.liu@oulu.fi), Wanli Ouyang (wanli.ouyang@sydney.edu.au), Xiaogang Wang (xgwang@ee.cuhk.edu.hk), Paul Fieguth (pfieguth@uwaterloo.ca), Jie Chen (jie.chen@oulu.fi), Xinwang Liu (xinwangliu@nudt.edu.cn), Matti Pietikäinen (matti.pietikainen@oulu.fi)
1 National University of Defense Technology, China; 2 University of Oulu, Finland; 3 University of Sydney, Australia; 4 Chinese University of Hong Kong, China; 5 University of Waterloo, Canada
Fig. 1 Recent evolution of object detection performance. We can observe significant performance (mean average precision) improvement since deep learning entered the scene in 2012. The performance of the best detector has been steadily increasing by a significant amount on a yearly basis. (a) Results on the PASCAL VOC datasets: Detection results of winning entries in the VOC2007-2012 competitions (using only provided training data). (b) Top object detection competition results in ILSVRC2013-2017 (using only provided training data). [Figure labels: "VOC year"; "ILSVRC year"; "Results on VOC2012 Data"; "Turning Point in 2012: Deep Learning Achieved Record Breaking Image Classification Result"]
Object detection has a wide range of applications in many areas of artificial intelligence and information technologies, including robot vision, consumer electronics, security, autonomous driving, human computer interaction, content based image retrieval, intelligent video surveillance, and augmented reality.
Recently, deep learning techniques [81, 116] have emerged as powerful methods for learning feature representations automatically from data. In particular, these techniques have provided significant improvement for object detection, a problem which has attracted enormous attention in the last five years, even though it has been studied for decades by psychophysicists, neuroscientists, and engineers.
Object detection can be grouped into one of two types [69, 240]: detection of specific instances and detection of specific categories. The first type aims at detecting instances of a particular object (such as Donald Trump’s face, the Pentagon building, or my dog Penny), whereas the goal of the second type is to detect different instances of predefined object categories (for example humans,
arXiv:1809.02165v1 [cs.CV] 6 Sep 2018
  • 28. 🍆