One-Stage Detectors
01 SSD
02 RETINANET
03 NAS-FPN
04 EFFICIENTDET
SSD: Single Shot MultiBox Detector
01
SSD : Introduction
History of object detection
Comparison of Faster R-CNN and YOLO
SSD : Introduction
SSD : Introduction
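The state of the art is Faster R-CNN (a two-stage detector)
- It hypothesizes bounding boxes, resamples pixels or features for each box, and then classifies each box
Too computationally intensive for embedded systems
- Even Faster R-CNN runs at only 7 FPS
Significantly increased speed
- But accuracy drops, as with YOLO
- Faster R-CNN: 7 FPS with 73.2% mAP; YOLO: 45 FPS with 63.4% mAP
The first deep-network-based object detector that
- does not resample pixels or features for bounding box hypotheses
- is as accurate as approaches that do
Let's get both at once (speed and accuracy)!
SSD : Single Shot Detector
- Uses multiple default boxes and makes predictions on multiple feature maps
- High-level features are well abstracted, so they find large objects well
- Low-level features carry accurate location information
Something like this?
Instead of looking only at the last feature map, look at the early, middle, and last feature maps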
SSD : Model
SSD
- Modified VGG-16
- Features are extracted from Conv4_3 of VGG-16 and from Conv7, Conv8_2, Conv9_2, Conv10_2, and Conv11_2
- Classifier: 3x3 convolutional filters
- Detections: 8732
- 74.3 mAP, 59 FPS
- Multiple feature maps
YOLO
- Fully connected layers in the middle (?)
- Detections: 98
- 63.4 mAP, 45 FPS (?)
- Last feature map only
SSD : Model
Multi-scale feature maps for detection
- Detection is performed on several different feature maps
- Lower layers locate objects more precisely and higher layers are better abstracted, so combine the two.
Convolutional predictors for detection
The SSD approach is based on a feed-forward convolutional network that
produces a fixed-size collection of bounding boxes and scores for the presence
of object class instances in those boxes, followed by a non-maximum
suppression step to produce the final detections
- Detection uses 3x3xP convolutional filters
- Each filter outputs either a score for a category (1 value) or a shape offset relative to the default box coordinates (4 values)
Default boxes and aspect ratios
- Our default boxes are similar to the anchor boxes used in Faster R-CNN
- As in Faster R-CNN, the default boxes serve as initial boxes, and the network learns the offsets (dx, dy, dw, dh) from them
SSD : Model
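Convolutional predictors for detection, in more detail
- Classifier : Conv: 3x3x(4x(Classes+4)) for a feature map with 4 default boxes per location
- Layout : first box [4 offsets (dx, dy, dh, dw), 20 class scores (the 20 PASCAL VOC classes), + 1 background score],
then the second, the third, and so on up to the sixth box
- Output channels : 150 = 6 x (21 + 4)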
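The channel count and the 8732 detections quoted earlier can be checked with a few lines of arithmetic. This is a minimal sketch: the feature-map sizes and boxes-per-location in `FEATURE_MAPS` are the commonly cited SSD300 configuration and are an assumption here, since the slides do not list them.

```python
# Assumed SSD300 layout (not listed on the slides): feature-map side length
# and number of default boxes per location for each prediction layer.
FEATURE_MAPS = [
    ("conv4_3", 38, 4),
    ("conv7", 19, 6),
    ("conv8_2", 10, 6),
    ("conv9_2", 5, 6),
    ("conv10_2", 3, 4),
    ("conv11_2", 1, 4),
]
NUM_CLASSES = 21  # 20 PASCAL VOC classes + 1 background


def head_channels(boxes_per_location, num_classes=NUM_CLASSES):
    """Output channels of a 3x3 prediction head: per box, class scores + 4 offsets."""
    return boxes_per_location * (num_classes + 4)


def total_default_boxes(feature_maps=FEATURE_MAPS):
    """Total number of default boxes (detections before NMS) over all layers."""
    return sum(side * side * boxes for _, side, boxes in feature_maps)


if __name__ == "__main__":
    print(head_channels(6))       # channels for a 6-box layer
    print(total_default_boxes())  # detections per image
```

With 6 boxes per location the head has 6 x (21 + 4) channels, and summing over the assumed layers reproduces the detection count on the slide.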
SSD : Model
For reference, YOLO v3 : it looks somewhat similar to SSD (?)
SSD : Training
Matching strategy
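- Among the many default boxes, those that overlap a ground-truth (GT) box enough are matched to it and the rest are treated as background; the criterion is IoU 0.5
- Each ground truth box is first matched to the default box with the best jaccard overlap; we then match default boxes to any ground truth with jaccard overlap higher than a threshold (0.5)
- Jaccard overlap is the same thing as IoU
The key difference between training SSD and training a typical detector that
uses region proposals, is that ground truth information needs to be assigned to
specific outputs in the fixed set of detector outputs. Some version of this is also
required for training in YOLO and for the region proposal stage of Faster R-CNN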
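The matching strategy can be sketched as follows. This is a minimal illustration, not the reference implementation: boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples, and the function names `iou` and `match_default_boxes` are hypothetical.

```python
def iou(box_a, box_b):
    """Jaccard overlap (IoU) of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_default_boxes(default_boxes, gt_boxes, threshold=0.5):
    """Return, per default box, the index of its matched GT box or -1 (background)."""
    matches = [-1] * len(default_boxes)
    if not default_boxes or not gt_boxes:
        return matches
    # Match every default box to a ground truth it overlaps by more than the threshold.
    for d, dbox in enumerate(default_boxes):
        overlaps = [iou(dbox, g) for g in gt_boxes]
        best_g = max(range(len(gt_boxes)), key=lambda gi: overlaps[gi])
        if overlaps[best_g] > threshold:
            matches[d] = best_g
    # Make sure every ground truth keeps its single best default box.
    for gi, g in enumerate(gt_boxes):
        best_d = max(range(len(default_boxes)), key=lambda d: iou(default_boxes[d], g))
        matches[best_d] = gi
    return matches
```

Default boxes left at -1 are the negatives (background) that the confidence loss later has to deal with.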
SSD : Training
Training objective
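- Similar to Faster R-CNN: $L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$
● L conf : The confidence loss is the softmax loss over multiple classes confidences
● L loc : we regress to offsets for the center (cx, cy) of the default bounding box (d) and for its width (w) and
height (h); that is, the network learns how far the default box has to be moved and resized
The width and height offsets are in log space,
because the scale can become large.
N : the number of matched default boxes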
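The regression targets described above can be sketched as an encode/decode pair. This assumes boxes in `(cx, cy, w, h)` form and omits the variance scaling that some implementations add; the function names are hypothetical.

```python
import math


def encode(gt, default):
    """Offsets that move a default box (cx, cy, w, h) onto a ground-truth box."""
    g_cx, g_cy, g_w, g_h = gt
    d_cx, d_cy, d_w, d_h = default
    return (
        (g_cx - d_cx) / d_w,   # centre shift, in units of the default width
        (g_cy - d_cy) / d_h,   # centre shift, in units of the default height
        math.log(g_w / d_w),   # log-scale width change
        math.log(g_h / d_h),   # log-scale height change
    )


def decode(offsets, default):
    """Invert encode(): apply predicted offsets to a default box."""
    t_cx, t_cy, t_w, t_h = offsets
    d_cx, d_cy, d_w, d_h = default
    return (
        d_cx + t_cx * d_w,
        d_cy + t_cy * d_h,
        d_w * math.exp(t_w),
        d_h * math.exp(t_h),
    )
```

The log keeps the width and height targets in a small range even when the ground truth is several times larger or smaller than the default box.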
SSD : Training
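Reading the first figure again in light of the matching algorithm and the loss:
- A cat and a dog are present (the cat is small, the dog is large)
- On the 8 x 8 feature map (lower level), only the cat is matched with IoU of 0.5 or more (the dog needs larger boxes)
- On the 4 x 4 feature map (higher level), only the dog is matched with IoU of 0.5 or more (the cat is too small)
- The region of the original image that one cell covers differs from feature map to feature map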
SSD : Training
The different feature maps each try to find the same object
SSD : Training
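Choosing scales and aspect ratios for default boxes
- Explanation of the formula that generates the default boxes
● m : the number of feature maps from which boxes are predicted
● s_min and s_max are constants (0.2 and 0.9)
● k is the index of the feature map (1 to m)
● Example, PASCAL VOC : s_k = 0.1, 0.2, 0.375, 0.55, 0.725, 0.9 (0.1 for conv4_3, then evenly spaced from 0.2 to 0.9)
- Once s_k is computed, choose the aspect ratios of the boxes
● a_r ∈ {1, 2, 3, 1/2, 1/3}
● Box size : width = s_k √a_r, height = s_k / √a_r. A ratio of 1 gives a square, 2 gives a wide box (shorter height), and 1/2 gives a tall box
● This generates 5 boxes with different aspect ratios
● Each location gets 6 or 4 boxes; one extra box with aspect ratio 1 is added, with scale s'_k = √(s_k s_k+1)
● In the 4-box case the ratios 3 and 1/3 are omitted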
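The scale and aspect-ratio rules can be sketched in a few lines. The linear spacing `s_k = s_min + (s_max - s_min)(k - 1)/(m - 1)` and the extra square box with scale `sqrt(s_k * s_next)` are taken from the SSD paper as I recall it; the formula itself did not survive in the slide text, so treat it as an assumption.

```python
import math


def scales(m, s_min=0.2, s_max=0.9):
    """Evenly spaced scales for m feature maps (m >= 2)."""
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]


def box_shapes(s_k, s_next, aspect_ratios=(1, 2, 3, 1 / 2, 1 / 3)):
    """(width, height) of every default box at one location, relative to the image."""
    shapes = [(s_k * math.sqrt(a), s_k / math.sqrt(a)) for a in aspect_ratios]
    extra = math.sqrt(s_k * s_next)  # extra square box between two scales
    shapes.append((extra, extra))
    return shapes


if __name__ == "__main__":
    print([round(s, 3) for s in scales(5)])
    print(len(box_shapes(0.2, 0.375)))                  # 6 boxes
    print(len(box_shapes(0.2, 0.375, (1, 2, 1 / 2))))   # 4 boxes
```

Passing only the ratios 1, 2, and 1/2 gives the 4-box variant mentioned above.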
SSD : Training
Hard negative mining
- After the matching step, most of the default boxes are negatives, especially when the number of possible
default boxes is large
- This is a problem common to all detectors: of the 8732 default boxes only those with IoU above 0.5 are positives,
so the vast majority of the 8732 are negative samples, i.e. background
- The negatives are sorted using the highest confidence loss for each default box, and the top ones are picked
- The ratio between the negatives and positives is at most 3:1.
- In other words, rank the negatives by confidence loss and keep only the hardest ones, up to 3 times the number of positives
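The selection rule for hard negative mining can be sketched as below. The inputs (a per-box confidence loss and a positive/negative flag) and the function name are hypothetical; a real training loop would do this per image on tensors.

```python
def select_hard_negatives(conf_losses, is_positive, neg_pos_ratio=3):
    """Indices of the negatives kept for the loss: highest confidence loss first,
    at most neg_pos_ratio negatives per positive."""
    num_pos = sum(1 for p in is_positive if p)
    negatives = [i for i, p in enumerate(is_positive) if not p]
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    return negatives[: neg_pos_ratio * num_pos]


if __name__ == "__main__":
    losses = [0.1, 2.0, 0.5, 0.05, 1.5, 0.9, 0.3]
    positive = [True, False, False, False, False, False, False]
    print(select_hard_negatives(losses, positive))
```

With one positive, only the three negatives with the largest confidence loss contribute to training; the easy background boxes are ignored.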
Data augmentation
- Use the entire original input image.
- Sample a patch so that the minimum jaccard overlap with the objects is 0.1, 0.3, 0.5, 0.7, or 0.9
- Randomly sample a patch.
- The aspect ratio is between 1/2 and 2
- Horizontally flipped with probability of 0.5
- Applying some photo-metric distortions
SSD : Experimental Results
Base network
- VGG16
- We convert fc6 and fc7 to convolutional layers
- Subsample parameters from fc6 and fc7, change pool5 from 2 × 2 − s2 to 3 × 3 − s1
- We remove all the dropout layers and the fc8 layer
- We fine-tune the resulting model using SGD with initial learning rate 10^-3, 0.9 momentum, 0.0005 weight
decay, and batch size 32
SSD : Experimental Results
- Both Fast and Faster R-CNN use input images whose minimum dimension is 600
- The two SSD models have exactly the same settings except that they have different input sizes (300×300 vs.
512×512)
SSD : Experimental Results
- XS=extra-small; S=small; M=medium; L=large; XL =extra-large. Aspect Ratio: XT=extra-tall/narrow; T=tall;
M=medium; W=wide; XW =extra-wide
- SSD does not detect small objects well.
- It still finds objects reasonably well even when their aspect ratio is unusual
SSD : Experimental Results
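- The paper tries to address this with data augmentation: images with smaller objects are added to the training data
Sensitivity and impact of different object
● We first randomly place an image on a canvas of 16× of the original image size filled with mean values, before we do any random crop operation
● That is, a canvas 16 times the size of the original image is filled with the mean value, and the image is pasted onto it
Small objects are then found reasonably well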
SSD : Experimental Results
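Other reasons? The starting point of FPN
- Small objects are detected in the lower layers.
- The lower layers are not abstracted enough, so detection is difficult
- The higher layers are abstracted enough but struggle with small objects (large objects are found well)
- So propagate the abstraction of the higher layers back down to the lower layers
- This is where FPN begins. Of these methods, we look at RetinaNet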
RetinaNet: Focal Loss for Dense Object Detection
02
RETINA : Introduction
The state of the art is two-stage detectors (Faster R-CNN, ...)
Could a simple one-stage detector achieve similar accuracy?
The problem is class imbalance (far too many negatives, i.e. background)
We propose a new loss function that acts as a more effective alternative to
previous approaches for dealing with class imbalance
- Faster R-CNN uses the RPN to heuristically reduce the number of candidate bounding boxes
- A single-stage detector proposes far too many boxes, and most of them are background
- One Stage : Fast, Simple
- Two Stage : 10~40% better accuracy
- Focal loss is proposed: cross entropy (CE) with a few extra terms
- A loss that down-weights easy samples so that training focuses on hard samples
- YOLOv1 (98 boxes), YOLOv2 (~1K), OverFeat (~1-2K), SSD (~8-26K)
- The more default boxes, the better the performance
RETINA : Introduction
Cross Entropy with Imbalanced Data
- 100000 easy, 100 hard examples
- 40x bigger loss from easy examples
- So CE is modified slightly
RETINA : Focal loss
RETINA : Focal loss
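Focal Loss
- We introduce the focal loss starting from the cross entropy (CE) loss for binary classification: $\mathrm{CE}(p, y) = -\log(p)$ if $y = 1$, and $-\log(1 - p)$ otherwise
● y ∈ {±1} specifies the ground-truth class
● p ∈ [0, 1] is the model’s estimated probability for the class with label y = 1
● $p_t = p$ if $y = 1$ and $1 - p$ otherwise, so $\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$
RETINA : Focal loss
Balanced Cross Entropy
● $\mathrm{CE}(p_t) = -\alpha_t \log(p_t)$
Focal Loss Definition
● $\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$
● For instance, with γ = 2, an example classified with p_t = 0.9 would have 100× lower loss compared with
CE, and with p_t ≈ 0.968 it would have 1000× lower loss
A loss that down-weights easy examples so that training focuses on hard samples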
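The effect of the modulating factor is easy to check numerically. The sketch below assumes the standard form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where `p_t` is the predicted probability of the true class; the function names are illustrative only.

```python
import math


def cross_entropy(p_t):
    """Binary cross entropy written in terms of p_t, the probability of the true class."""
    return -math.log(p_t)


def focal_loss(p_t, gamma=2.0, alpha_t=1.0):
    """Cross entropy scaled by (1 - p_t)^gamma, so easy examples contribute less."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p_t in (0.5, 0.9, 0.968):
        ratio = cross_entropy(p_t) / focal_loss(p_t)
        print(p_t, round(ratio, 1))  # how many times smaller the focal loss is
```

With gamma = 0 the focal loss reduces to plain cross entropy, and the reduction factor grows quickly as p_t approaches 1, which is what shifts the training signal toward hard examples.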
RETINA : Retinanet Detector
RetinaNet Detector
- RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks
- The backbone is responsible for computing a convolutional feature map over an entire input image
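- The first subnet performs convolutional object classification on the backbone's output; the second subnet performs convolutional bounding box regression
- We construct a pyramid with levels P3 through P7
- In the FPN top-down pathway, the spatial resolution is upsampled by a factor of 2 using nearest neighbor for simplicity, and the result is merged with the lateral feature map after a 1x1 convolution
This brings well-abstracted features down to the lower layers so that small objects are also detected well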
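One top-down merge step can be sketched on plain nested lists. This is a single-channel simplification: the 1x1 lateral convolution is reduced to a scalar `lateral_weight`, which is an assumption made only to keep the example dependency-free.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbour upsampling of a 2D feature map by a factor of 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(coarse, fine, lateral_weight=1.0):
    """Upsample the coarser (more abstract) map and add it to the finer map."""
    up = upsample_nearest_2x(coarse)
    return [
        [u + lateral_weight * f for u, f in zip(up_row, fine_row)]
        for up_row, fine_row in zip(up, fine)
    ]


if __name__ == "__main__":
    coarse = [[1.0, 2.0], [3.0, 4.0]]
    fine = [[1.0] * 4 for _ in range(4)]
    for row in top_down_merge(coarse, fine):
        print(row)
```

Each fine-resolution cell receives the value of the coarse cell that covers it, so the semantic signal from the higher level reaches the level where small objects are predicted.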
RETINA : Retinanet Detector
Experiments
RETINA : Retinanet Detector
Further questions
- If the backbone is kept and only the FPN part is designed well, wouldn't performance improve?
- Does FPN have to merge features top-down?
- What is the most efficient way to merge them?
- We don't know, so let's use AutoML to try all sorts of combinations
On to NAS-FPN
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
03
NAS-FPN : Introduction
The challenge of designing feature pyramid architecture is in its huge design space
The key contribution of our work is in designing the search space that
covers all possible cross-scale connections to generate multiscale feature
representations.
The discovered architecture, named NAS-FPN, offers great flexibility in
building object detection architecture.
- Recently, Neural Architecture Search algorithm demonstrates promising results on efficiently
discovering top-performing architectures for image classification in a huge search space
Current state-of-the-art convolutional architectures for object detection are
manually designed. Here we aim to learn a better architecture of feature
pyramid network for object detection.
NAS-FPN : Method
- The architecture of FPN can be stacked N times for better accuracy
- The backbone model and the subnets for class and box predictions follow the original design in RetinaNet
RetinaNet with NAS-FPN
NAS-FPN : Method
- 5 scales {C3, C4, C5, C6, C7} with corresponding feature stride of {8, 16, 32, 64, 128} pixels
- The C6 and C7 are created by simply applying stride 2 and stride 4 max pooling to C5
- A merging cell is proposed: it selects 2 feature maps and combines them with a suitable operation
Merging Cell
- Pick 2 feature maps, choose the output resolution, and combine them with a binary op.
- The input feature layers are adjusted to the output resolution by nearest neighbor
upsampling or max pooling if needed before applying the binary operation
- The merged feature layer is always followed by a ReLU, a 3x3 convolution, and a
batch normalization layer
- The result is added back to the set of feature maps, and this is repeated N times
NAS-FPN : Method
Merging Cell
NAS-FPN : Experiments
Architecture Search for NAS-FPN
Proxy Task
- To speed up the training of the RNN controller we need a proxy task
- Proxy task for 10 epochs, instead of 50 epochs
- A small backbone architecture of ResNet-10 with input 512 × 512 image size
- Reward : We reserve a randomly selected 7392 images from the COCO train2017 set as the validation set,
which we use to obtain rewards
Controller
- Our controller is a recurrent neural network (RNN) and it is trained using the Proximal Policy
Optimization (PPO) algorithm.
- The total number of unique architectures generated by the RNN controller
NAS-FPN : Experiments
Architecture Search for NAS-FPN
- Left : The reward is computed as the AP of sampled architectures on the proxy task
- Right: The number of sampled unique architectures to the total number of sampled architectures
- The number of unique FPN architectures converges at roughly 8000
- And what came out of using that many TPUs? (100 TPUs? 1000 TPUs?)
NAS-FPN : Experiments
Scalable Feature Pyramid Architecture
- 7 merging cells
- RCB : ReLU, Conv, BatchNorm
- GP : Global pooling
- Box regression is performed on the blue features (feature maps at different scales)
NAS-FPN : Experiments
Architecture graph of NAS-FPN
- Feature layers in the same row have identical resolution
- The resolution decreases in the bottom-up direction
- Interpretation: FPN only has connections from low resolution to high resolution
- As NAS finds architectures with higher AP, it tends to add connections from high resolution to low resolution
Performance improves as the search creates features that connect the high-resolution features needed to detect small objects
NAS-FPN : Experiments
Detection accuracy
NAS-FPN : Experiments
Further Improvements with DropBlock
- We apply DropBlock with block size 3x3 after batch normalization layers in the NAS-FPN layers
- Using DropBlock improves performance further
Further questions
- A case of AutoML applied to the detection domain
- Running AutoML takes an enormous amount of hardware and time; can the rest of us really do it?
- Is there a more effective method?
- Multi-resolution features are combined with a plain sum; is there another way?
The starting point of EfficientDet.
NAS-FPN : Experiments
EfficientDet: Scalable and Efficient Object Detection
04
EFFICIENTDET : Introduction
The state-of-the-art object detectors also become increasingly more expensive
- The latest AmoebaNet-based NAS-FPN detector requires 167M parameters and 3045B FLOPs (30x
more than RetinaNet)
- Given these real-world resource constraints, model efficiency becomes increasingly important for
object detection.
Model efficiency has become increasingly important in computer vision. First,
we propose a weighted bi-directional feature pyramid network. Second, we
propose a compound scaling method (as in EfficientNet). We have developed a new
family of object detectors, called EfficientDet
EFFICIENTDET : Introduction
Although these methods tend to achieve better efficiency, they usually sacrifice
accuracy
- Most previous works only focus on a specific or a small range of resource requirements
- the variety of real-world applications, from mobile devices to datacenters
A natural question
Is it possible to build a scalable detection architecture with both higher accuracy
and better efficiency across a wide spectrum of resource constraints?
The question shared by every object detection paper: get accuracy and efficiency at the same time!
EFFICIENTDET : Introduction
Challenge 1: efficient multi-scale feature fusion
- FPN has been widely used for multiscale feature fusion
- PANet, NAS-FPN, and other studies have developed more network structures for cross-scale feature fusion
- Most previous works simply sum them up without distinction
- We propose a simple yet highly effective weighted bi-directional feature pyramid network (BiFPN)
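- PANet adds one more bottom-up path on top of the top-down path used in RetinaNet (FPN)
- The reason: low-level features carry more location information, so passing them up once more gives the
higher-level features more location information, which is expected to improve performance.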
EFFICIENTDET : Introduction
Challenge 2: model scaling
- Inspired by recent works EfficientNet, we propose a compound scaling method for object detectors, which
jointly scales up the resolution/depth/width for all backbone, feature network, box/class prediction network
- There are three ways to make a model bigger (width, depth, and resolution); scale all three together in a balanced way
(applying the EfficientNet approach)
EFFICIENTDET : Introduction
Our contributions can be summarized
- We proposed BiFPN, a weighted bidirectional feature network for easy and fast multi-scale feature fusion
- We proposed a new compound scaling method, which jointly scales up backbone, feature network,
box/class network, and resolution, in a principled way
- Based on BiFPN and compound scaling, we developed EfficientDet
EFFICIENTDET : BiFPN
Problem Formulation
- Formally, given a list of multi-scale features P_in (the features used in the feature pyramid)
- Our goal is to find a transformation f that can effectively aggregate different features
- and output a list of new features: P_out = f(P_in)
EFFICIENTDET : BiFPN
Feature network design
EFFICIENTDET : BiFPN
Cross-Scale Connections
- We observe that PANet achieves better accuracy than FPN and NAS-FPN
- Really? Then why run NAS at all?
- First, we remove those nodes that only have one input edge
- Our intuition is simple: if a node has only one input edge with no feature fusion, then it will have less
contribution to the feature network. The result is called Simplified PANet
- Second, we add an extra edge from the original input to output node if they are at the same level
- Third, unlike PANet that only has one top-down and one bottom-up path, we treat each bidirectional
(top-down & bottom-up) path as one feature network layer, and repeat the same layer multiple times to
enable more high-level feature fusion
[Figure: the first, second, and third modifications, with the resulting layer repeated N times]
EFFICIENTDET : BiFPN
Weighted Feature Fusion
- A common way is to first resize them to the same resolution and then sum them up.
- Pyramid attention network introduces global self-attention upsampling to recover pixel
localization (similar to SENet)
Unbounded fusion
- $O = \sum_i w_i \cdot I_i$
- w_i is a learnable weight that can be a scalar (per-feature), a vector (per-channel), or a multi-dimensional
tensor (per-pixel).
- We find a scalar can achieve comparable accuracy with minimal computational cost. However, since the scalar weight is unbounded, it could cause training instability
- Therefore, we resort to weight normalization to bound the value range of each weight
EFFICIENTDET : BiFPN
Softmax-based fusion
- $O = \sum_i \frac{e^{w_i}}{\sum_j e^{w_j}} \cdot I_i$
- An intuitive idea is to apply softmax to each weight, such that all weights are normalized to be a probability
with value range from 0 to 1, representing the importance of each input.
- The extra softmax leads to significant slowdown on GPU hardware
Fast normalized fusion
- $O = \sum_i \frac{w_i}{\epsilon + \sum_j w_j} \cdot I_i$
- where w_i ≥ 0 is ensured by applying a ReLU after each w_i
- ε = 0.0001 is a small value to avoid numerical instability
- This fast fusion approach has very similar learning behavior and accuracy as the softmax-based fusion,
but runs up to 30% faster on GPUs
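The two normalization schemes can be compared directly. This is a minimal sketch on flat lists of floats; in the real network each input is a feature map and the weights are learned parameters, and the function names here are hypothetical.

```python
import math


def softmax_weights(raw):
    """Softmax-based fusion weights: positive and summing to 1."""
    shift = max(raw)  # subtract the max for numerical stability
    exps = [math.exp(w - shift) for w in raw]
    total = sum(exps)
    return [e / total for e in exps]


def fast_normalized_weights(raw, eps=1e-4):
    """Fast normalized fusion weights: ReLU, then divide by (sum + eps)."""
    relu = [max(0.0, w) for w in raw]
    total = sum(relu) + eps
    return [r / total for r in relu]


def fuse(inputs, weights):
    """Weighted sum of equally sized feature vectors."""
    return [
        sum(w * feat[i] for w, feat in zip(weights, inputs))
        for i in range(len(inputs[0]))
    ]


if __name__ == "__main__":
    raw = [1.0, 1.0]
    print(softmax_weights(raw))
    print(fast_normalized_weights(raw))
    print(fuse([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5]))
```

Both schemes keep each weight between 0 and 1; the fast variant avoids the exponentials, which is where the reported GPU speed-up comes from.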
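EFFICIENTDET : BiFPN
Fast normalized fusion
$P^{td}_6 = \mathrm{Conv}\left(\frac{w_1 \cdot P^{in}_6 + w_2 \cdot \mathrm{Resize}(P^{in}_7)}{w_1 + w_2 + \epsilon}\right)$
$P^{out}_6 = \mathrm{Conv}\left(\frac{w'_1 \cdot P^{in}_6 + w'_2 \cdot P^{td}_6 + w'_3 \cdot \mathrm{Resize}(P^{out}_5)}{w'_1 + w'_2 + w'_3 + \epsilon}\right)$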
EFFICIENTDET : Architecture
EfficientDet architecture
- EfficientNet as the backbone network
- BiFPN as the feature network, repeated n times
- Shared class/box prediction network
EFFICIENTDET : EFFICIENTNET
EfficientNet
- Add more channels (width)
- Stack more layers (depth)
- Enlarge the input image (resolution)
- Scale all of them up together in a balanced way
EFFICIENTDET : EFFICIENTNET
Compound Scaling
- We propose a new compound scaling method for object detection, which uses a simple compound
coefficient φ to jointly scale up all dimensions of backbone network, BiFPN network, class/box
network, and resolution.
- Grid search for all dimensions is prohibitive expensive. Therefore, we use a heuristic-based scaling
approach
Backbone network
- We reuse the same width/depth scaling coefficients of EfficientNet-B0 to B6
EFFICIENTDET : EFFICIENTNET
BiFPN network
- We exponentially grow BiFPN width Wbifpn (#channels)
- Linearly increase depth Dbifpn (#layers)
Box/class prediction network
- We fix their width to be always the same as BiFPN (i.e., Wpred = Wbifpn)
- But linearly increase the depth (#layers)
(width = number of channels, depth = number of layers)
Input image resolution
- Since feature level 3-7 are used in BiFPN, the input resolution must be dividable by 2^7=128
- So the resolution is increased linearly with φ
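The divisibility constraint on the input resolution can be illustrated with a short sketch. The rule `512 + 128 * phi` is my recollection of the resolution formula in the EfficientDet paper and does not appear in the slide text, so treat both the base value and the step as assumptions.

```python
def input_resolution(phi, base=512, step=128):
    """Assumed linear resolution rule: base + phi * step (values are assumptions)."""
    return base + phi * step


def is_valid_resolution(resolution, max_level=7):
    """Levels 3-7 downsample by up to 2**7, so the input must be a multiple of 128."""
    return resolution % (2 ** max_level) == 0


if __name__ == "__main__":
    for phi in range(7):
        res = input_resolution(phi)
        print(phi, res, is_valid_resolution(res))
```

Because the step equals 2^7, every resolution produced by this rule stays divisible by 128 for any integer φ.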
EFFICIENTDET : EFFICIENTNET
Scaling configs for EfficientDet D0-D7
- W_pred = W_bifpn
- Backbone: EfficientNet-B0 to B6
- Scaling up follows the heuristic-based formulas
EFFICIENTDET : Experiments
EfficientDet performance on COCO
EFFICIENTDET : Experiments
Model size and inference latency comparison
EFFICIENTDET : Conclusion
Weighted bidirectional feature network
Customized compound scaling method
Improve accuracy and efficiency
EfficientDet-D7 achieves state-of-the-art accuracy
3.2x faster on GPUs and 8.1x faster on CPU
THE END
Appendix
ntos.gitbooks.io/artificial-inteligence/content/single-shot-detectors/ssd.html
https://uk-kim.github.io/2018/12/07/Focal-loss-for-dense-object-detection.html
Deep Learning for Generic Object Detection: A Survey
https://taeu.github.io/paper/deeplearning-paper-ssd/
https://leonardoaraujosa
https://towardsdatascience.com/review-fpn-feature-pyramid-network-object-detection-262fc7482610
https://www.groundai.com/project/pyramid-attention-network-for-semantic-segmentation/1
https://www.youtube.com/watch?v=11jDC8uZL0E

 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 

Último (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 

Single shot multibox detectors

  • 1. One-Stage Detectors
  • 4. SSD : Introduction History of object detection
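  • 6. SSD : Introduction The SOTA is Faster R-CNN (a two-stage detector) - It hypothesizes bounding boxes, resamples pixels or features for each box, and then classifies them Too computationally intensive for embedded systems - Even Faster R-CNN reaches only 7 FPS Significantly increased speed - comes at the cost of accuracy, e.g. YOLO - Faster R-CNN: 7 FPS with mAP 73.2%; YOLO: 45 FPS with mAP 63.4% The first deep network based object detector - that does not resample pixels or features for bounding box hypotheses - and is as accurate as approaches that do Goal: get both speed and accuracy at once!
  • 7. SSD : Single Shot Detector - Uses multiple default boxes and makes predictions on multiple feature maps - High-level features are well abstracted, so they find large objects well - Low-level features carry more precise location information The idea, roughly: instead of detecting only on the last feature map, detect on early, middle, and last feature maps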
  • 8. SSD : Model SSD: - A modified VGG-16 - Predictions are taken from VGG-16 Conv4_3 and from Conv7, Conv8_2, Conv9_2, Conv10_2, Conv11_2 - Classifier : 3x3x - Detections : 8732 - 74.3 mAP, 59 FPS - Uses multiple feature maps YOLO: - Fully connected layers in the middle (?) - Detections : 98 - 63.4 mAP, 45 FPS (?) - Uses only the last feature map
  • 9. SSD : Model Multi-scale feature maps for detection - Detection is performed on several different feature maps - Lower layers localize objects more precisely, while higher layers are better abstracted, so the two are combined. Convolutional predictors for detection The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections - Detection uses 3x3xP conv filters - Each filter outputs either a score for a category (1 value) or a shape offset relative to the default box coordinates (4 values) Default boxes and aspect ratios - Our default boxes are similar to the anchor boxes used in Faster R-CNN - As in Faster R-CNN, default boxes serve as the initial boxes, and the model learns the offsets in x, y, w, and h
  • 10. SSD : Model Convolutional predictors for detection, in more detail - Classifier : Conv: 3x3x(4x(Classes+4)) - Layout : first box [4 values (dx, dy, dh, dw), 20 values (20 classes for PASCAL VOC) + 1 (background)], then the second, third, and so on up to the sixth box - Output channels : 150 = 6 x (21 + 4)
  • 11. SSD : Model For reference, YOLO v3: it looks somewhat similar to SSD (?)
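The channel arithmetic on this slide, and the 8732 detections quoted on the earlier model slide, can be checked with a few lines of Python. This is only a sketch: the feature-map sizes and boxes-per-location below are the commonly cited SSD300 configuration and are an assumption here, since the slides do not list them, and the function names are made up for illustration.

```python
# Assumed SSD300 configuration (not listed on the slides):
# (feature-map side length, default boxes per location)
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def head_channels(boxes_per_location, num_classes_with_bg):
    """Output channels of one 3x3 prediction conv.

    Every default box needs one score per class (background included)
    plus 4 box offsets.
    """
    return boxes_per_location * (num_classes_with_bg + 4)


def total_default_boxes(layers):
    """Total number of default boxes summed over all prediction layers."""
    return sum(side * side * k for side, k in layers)


print(head_channels(6, 21))                # 6 * (21 + 4) = 150
print(total_default_boxes(SSD300_LAYERS))  # 8732
```

With 4 boxes per location the same head would have 4 x 25 = 100 output channels.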
  • 12. SSD : Training Matching strategy - Among the many default boxes, those that overlap a ground-truth (GT) box enough are matched and the rest are treated as background; the criterion is IoU 0.5 - we then match default boxes to any ground truth with jaccard overlap higher than a threshold (0.5) - Jaccard overlap is the same thing as IoU The key difference between training SSD and training a typical detector that uses region proposals, is that ground truth information needs to be assigned to specific outputs in the fixed set of detector outputs. Some version of this is also required for training YOLO and for the region proposal stage of Faster R-CNN
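A minimal sketch of the threshold-based matching described on this slide. The box format `(xmin, ymin, xmax, ymax)` and both function names are assumptions for illustration. The sketch covers only the "jaccard overlap higher than 0.5" rule and leaves out the additional step of giving every ground-truth box its single best default box.

```python
def iou(a, b):
    """Jaccard overlap (IoU) of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_default_boxes(defaults, ground_truths, threshold=0.5):
    """For each default box, return the index of the best-overlapping
    ground-truth box with IoU above the threshold, or None for background."""
    matches = []
    for d in defaults:
        best_index, best_overlap = None, threshold
        for j, g in enumerate(ground_truths):
            overlap = iou(d, g)
            if overlap > best_overlap:
                best_index, best_overlap = j, overlap
        matches.append(best_index)
    return matches


defaults = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)]
ground_truths = [(0, 0, 2, 2.5)]
print(match_default_boxes(defaults, ground_truths))  # [0, None, None]
```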
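  • 13. SSD : Training Training objective - Similar to Faster R-CNN ● L_conf : The confidence loss is the softmax loss over multiple classes confidences ● L_loc : we regress to offsets for the center (cx, cy) of the default bounding box (d) and for its width (w) and height (h); in other words, the model learns how far the default box has to be moved and resized. Width and height are regressed in log space because their scale can become large. N : the number of matched default boxes
  • 14. SSD : Training Reading the first figure again in light of the matching algorithm and the loss: - The image contains a cat and a dog (the cat is small, the dog is large) - On the 8 x 8 feature map (lower level), only the cat has default boxes with IoU of 0.5 or more (the dog needs a coarser view) - On the 4 x 4 feature map (higher level), only the dog has default boxes with IoU of 0.5 or more (the cat is too small) - The region of the original image that one cell is responsible for depends on the feature map
  • 15. SSD : Training The different feature maps each try to find the same object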
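The formula images on this slide did not survive extraction. As a hedged restatement, the objective and the offset encoding are usually written as follows for SSD; the weighting term α and the hat notation are not named on the slide and are taken from the standard formulation.

```latex
L(x, c, l, g) = \frac{1}{N}\Big( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \Big)

\hat{g}^{cx} = \frac{g^{cx} - d^{cx}}{d^{w}}, \qquad
\hat{g}^{cy} = \frac{g^{cy} - d^{cy}}{d^{h}}, \qquad
\hat{g}^{w} = \log\frac{g^{w}}{d^{w}}, \qquad
\hat{g}^{h} = \log\frac{g^{h}}{d^{h}}
```

Here g is the ground-truth box, d the matched default box, and l the predicted offsets that are regressed toward the encoded targets; the log on width and height is the log scaling mentioned in the bullet.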
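  • 16. SSD : Training - Explanation of the formula that generates the default boxes Choosing scales and aspect ratios for default boxes ● m : the number of feature maps from which boxes are predicted ● s_min and s_max are constants (0.2 and 0.9) ● k selects the feature map ● Example PASCAL VOC : s_k 0.1, 0.2, 0.55, 0.725, 0.9 - Once s_k is computed, choose the aspect ratios of the boxes ● a_r ∈ {1, 2, 3, 1/2, 1/3} ● Compute the shape: width = s_k √a_r, height = s_k / √a_r; a_r = 1 gives a square, 2 gives a wide box (smaller height), and 1/2 gives a tall box (larger height) ● This produces 5 boxes with different aspect ratios ● Each location gets 6 or 4 default boxes; one extra box is generated from the scale alone ● With 4 boxes, the ratios 3 and 1/3 are dropped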
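The scale and shape rules above can be sketched as below. The linear spacing formula and the extra square box of scale sqrt(s_k * s_{k+1}) follow the usual SSD formulation and are assumptions here, since the slide's formula image is missing; the function names are hypothetical.

```python
import math


def scales(m, s_min=0.2, s_max=0.9):
    """Evenly spaced scales: s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1)."""
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]


def box_shapes(s_k, s_next, aspect_ratios=(1, 2, 3, 1 / 2, 1 / 3)):
    """(width, height) pairs for one feature map, relative to the image size."""
    shapes = [(s_k * math.sqrt(ar), s_k / math.sqrt(ar)) for ar in aspect_ratios]
    # Assumed extra box for aspect ratio 1, with scale sqrt(s_k * s_{k+1}).
    extra = math.sqrt(s_k * s_next)
    shapes.append((extra, extra))
    return shapes


s = scales(5)
print([round(v, 3) for v in s])
for width, height in box_shapes(s[0], s[1]):
    print(round(width, 3), round(height, 3))
```

Every box generated from an aspect ratio keeps the area s_k squared; only the shape changes, which is why a ratio of 2 gives a wide box and 1/2 a tall one.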
  • 17. SSD : Training Hard negative mining - After the matching step, most of the default boxes are negatives, especially when the number of possible default boxes is large - This is a problem common to all detectors: there are 8732 boxes, and if only those with IoU of 0.5 or more count as positives, most of the 8732 are negative samples, so almost all of the data is background - Using the highest confidence loss for each default box - The ratio between the negatives and positives is at most 3:1 - So the negatives are sorted by confidence loss, and only the highest ones are kept, up to 3 times the number of positives Data augmentation - Use the entire original input image. - Sample a patch so that the minimum jaccard overlap with the objects is 0.1, 0.3, 0.5, 0.7, or 0.9 - Randomly sample a patch. - The aspect ratio is between 1/2 and 2 - Horizontally flipped with probability of 0.5 - Applying some photo-metric distortions
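Hard negative mining as described on this slide can be sketched in a few lines. The function name and the list-based inputs are assumptions for illustration; a real training loop would do the same thing on tensors, per image.

```python
def hard_negative_mining(conf_losses, is_positive, neg_pos_ratio=3):
    """Return the indices of boxes that contribute to the confidence loss:
    all positives, plus the highest-loss negatives, capped at
    neg_pos_ratio negatives per positive."""
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives first: the ones the model is most wrong about.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    kept_negatives = negatives[: neg_pos_ratio * len(positives)]
    return sorted(positives + kept_negatives)


losses = [0.1, 2.0, 0.5, 0.05, 1.5, 0.9, 0.3]
labels = [False, True, False, False, False, False, False]
print(hard_negative_mining(losses, labels))  # [1, 2, 4, 5]
```

With one positive (index 1), only the three negatives with the largest loss (indices 4, 5, 2) are kept, which enforces the 3:1 ratio.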
  • 18. SSD : Experimental Results Base network - VGG16 - We convert fc6 and fc7 to convolutional layers - Using the highest confidence loss for each default box - Subsample parameters from fc6 and fc7, change pool5 from 2 × 2 − s2 to 3 × 3 − s1 - We remove all the dropout layers and the fc8 layer - We fine-tune the resulting model using SGD with initial learning rate 10^-3, 0.9 momentum, 0.0005 weight decay, and batch size 32
  • 19. SSD : Experimental Results - Both Fast and Faster R-CNN use input images whose minimum dimension is 600 - The two SSD models have exactly the same settings except that they have different input sizes (300×300 vs. 512×512)
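  • 20. SSD : Experimental Results - Object size: XS=extra-small; S=small; M=medium; L=large; XL=extra-large. Aspect ratio: XT=extra-tall/narrow; T=tall; M=medium; W=wide; XW=extra-wide - SSD does not detect small objects well. - It copes reasonably well with unusual aspect ratios
  • 21. SSD : Experimental Results Sensitivity and impact of different object - The paper tries to address this with data augmentation: smaller (zoomed-out) versions of the images are added to the training data ● we first randomly place an image on a canvas of 16× of the original image size filled with mean values (the canvas is 16 times the size of the original image and is filled with the mean values) ● before we do any random crop operation ● The image is then pasted onto the canvas. After this, detection works reasonably well
  • 22. SSD : Experimental Results Other reasons? The start of FPN - Small objects are detected in the lower layers. - The lower layers are not abstracted enough, which makes detection hard - The higher layers are abstracted enough but struggle with small objects (large objects are found well) - So propagate the abstraction of the higher layers back down to the lower layers - This is where FPN starts. Of the FPN-based detectors, RetinaNet is covered next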
  • 24. RETINA : Introduction The SOTA is held by two-stage detectors (Faster R-CNN …) Could a simple one-stage detector achieve similar accuracy? The problem is class imbalance (negatives: far too much background) We propose a new loss function that acts as a more effective alternative to previous approaches for dealing with class imbalance - Faster R-CNN uses the RPN to cut down the number of bounding boxes heuristically - A single-stage detector proposes too many boxes, and most of them are background - One Stage : Fast, Simple - Two Stage : 10~40% better accuracy - Proposes focal loss, which adds a few terms to CE (cross entropy) - A loss that down-weights easy samples even further so that training focuses on hard samples - YOLOv1 (98 boxes), YOLOv2 (1K), OverFeat (1~2K), SSD (~8-26k) - The more default boxes, the better the performance
  • 25. RETINA : Introduction Cross Entropy with Imbalance Data We propose a new loss function that acts as a more effective alternative to previous approaches for dealing with class imbalance - Proposes focal loss, which adds a few terms to CE (cross entropy) - A loss that down-weights easy samples even further so that training focuses on hard samples - 100000 easy, 100 hard examples - 40x bigger loss from easy examples - So CE is modified slightly
  • 27. RETINA : Focal loss Focal Loss - We introduce the focal loss starting from the cross entropy (CE) loss for binary classification ● y ∈ {±1} specifies the ground-truth class ● p ∈ [0, 1] is the model’s estimated probability for the class with label y = 1
  • 28. RETINA : Focal loss Balanced Cross Entropy Focal Loss Definition ● For instance, with γ = 2, an example classified with pt = 0.9 would have 100× lower loss compared with CE, and with pt ≈ 0.968 it would have 1000× lower loss A loss that down-weights easy examples even further so that training concentrates on hard samples
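A small sketch of the focal loss in terms of p_t, useful for checking the reduction factors quoted on the slide. It assumes the standard form FL(p_t) = -α_t (1 - p_t)^γ log(p_t); the function names and the default `alpha_t=1.0` (no class balancing) are choices made here for illustration.

```python
import math


def p_t(p, y):
    """Probability assigned to the true class, for y in {+1, -1}."""
    return p if y == 1 else 1.0 - p


def cross_entropy(pt):
    """Binary cross entropy written in terms of p_t."""
    return -math.log(pt)


def focal_loss(pt, gamma=2.0, alpha_t=1.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    return -alpha_t * (1.0 - pt) ** gamma * math.log(pt)


# How much smaller the focal loss is than CE for easy and hard examples.
for pt in (0.9, 0.968, 0.1):
    print(pt, cross_entropy(pt) / focal_loss(pt))
```

For γ = 2 the ratio is 1 / (1 - p_t)^2, so a well-classified example at p_t = 0.9 is down-weighted by about 100 and one at p_t ≈ 0.968 by nearly 1000, while a hard example at p_t = 0.1 is left almost untouched.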
  • 29. RETINA : Retinanet Detector RetinaNet Detector - RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks - The backbone is responsible for computing a convolutional feature map over an entire input image - The second subnet performs convolutional bounding box regression - We construct a pyramid with levels P3 through P7 - The spatial resolution is upsampled by a factor of 2 using the nearest neighbor for simplicity (FPN, with 1 by 1 conv lateral connections) Well-abstracted features are passed down to the lower layers so that small objects are also detected well
  • 30. RETINA : Retinanet Detector Experiments
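  • 31. RETINA : Retinanet Detector Further questions - Could performance improve if the backbone is kept and only the FPN part is designed well? - Does FPN have to merge features top-down? - What is the most efficient way to merge them? - We do not know, so let AutoML try all kinds of combinations. This leads to NAS-FPN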
  • 34. NAS-FPN : Introduction The challenge of designing feature pyramid architecture is in its huge design space The key contribution of our work is in designing the search space that covers all possible cross-scale connections to generate multiscale feature representations. The discovered architecture, named NAS-FPN, offers great flexibility in building object detection architecture. - Recently, Neural Architecture Search algorithm demonstrates promising results on efficiently discovering top-performing architectures for image classification in a huge search space Current state-of-the-art convolutional architectures for object detection are manually designed. Here we aim to learn a better architecture of feature pyramid network for object detection.
  • 35. NAS-FPN : Method RetinaNet with NAS-FPN - The architecture of FPN can be stacked N times for better accuracy - The backbone model and the subnets for class and box predictions follow the original design in RetinaNet
  • 36. NAS-FPN : Method - 5 scales {C3, C4, C5, C6, C7} with corresponding feature stride of {8, 16, 32, 64, 128} pixels - The C6 and C7 are created by simply applying stride 2 and stride 4 max pooling to C5 - Proposes the Merging Cell, which selects two feature maps and merges them with a suitable operation Merging Cell - Pick two feature maps, choose an output resolution, and merge them with a binary op. - The input feature layers are adjusted to the output resolution by nearest neighbor upsampling or max pooling if needed before applying the binary operation - The merged feature layer is always followed by a ReLU, a 3x3 convolution, and a batch normalization layer - The result is added back to the pool of feature maps, and the process is repeated N times
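A toy sketch of the resizing and merging steps of a merging cell, using plain nested lists as single-channel square feature maps. It only illustrates nearest-neighbor upsampling, max pooling, and a sum as the binary op; the ReLU, 3x3 convolution, and batch normalization that follow the merge are left out, and all function names are hypothetical.

```python
def upsample_nearest(fm, factor):
    """Nearest-neighbor upsampling: repeat every value factor x factor times."""
    return [[v for v in row for _ in range(factor)]
            for row in fm for _ in range(factor)]


def max_pool(fm, factor):
    """Max pooling with window and stride equal to factor (size must divide evenly)."""
    size = len(fm)
    return [[max(fm[i + di][j + dj] for di in range(factor) for dj in range(factor))
             for j in range(0, size, factor)]
            for i in range(0, size, factor)]


def resize_to(fm, out_size):
    """Bring a square feature map to out_size by upsampling or pooling."""
    size = len(fm)
    if size < out_size:
        return upsample_nearest(fm, out_size // size)
    if size > out_size:
        return max_pool(fm, size // out_size)
    return fm


def merge_sum(a, b, out_size):
    """Binary op 'sum' applied after both inputs reach the output resolution."""
    a, b = resize_to(a, out_size), resize_to(b, out_size)
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


coarse = [[1, 2], [3, 4]]
fine = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(merge_sum(coarse, fine, 2))  # [[7, 10], [17, 20]]
print(merge_sum(coarse, fine, 4)[0])  # [2, 3, 5, 6]
```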
  • 38. NAS-FPN : Experiments Architecture Search for NAS-FPN Proxy Task - To speed up the training of the RNN controller we need a proxy task - Proxy task for 10 epochs, instead of 50 epochs - A small backbone architecture of ResNet-10 with input 512 × 512 image size - Reward : We reserve a randomly selected 7392 images from the COCO train2017 set as the validation set, which we use to obtain rewards Controller - Our controller is a recurrent neural network (RNN) and it is trained using the Proximal Policy Optimization (PPO) algorithm. - The total number of unique architectures generated by the RNN controller
  • 39. NAS-FPN : Experiments Architecture Search for NAS-FPN - Left : The reward is computed as the AP of sampled architectures on the proxy task - Right: The number of sampled unique architectures to the total number of sampled architectures - The number of unique FPN architectures converges at roughly 8000 - And what came out of using that many TPUs? (100 TPUs? 1000 TPUs??)
  • 40. NAS-FPN : Experiments Scalable Feature Pyramid Architecture - 7 merging cells - RCB : ReLU, Conv, BatchNorm - GP : Global pooling - Box regression is done on the blue features (feature maps at different scales)
  • 41. NAS-FPN : Experiments Architecture graph of NAS-FPN - Feature layers in the same row have identical resolution - The resolution decreases in the bottom-up direction - Interpretation: FPN only has connections from low to high resolution - The higher the AP of the architectures NAS finds, the more it tends to connect high resolution to low resolution. Performance improves as more features are built by connecting the high-resolution features that detect small objects
  • 43. NAS-FPN : Experiments Further Improvements with DropBlock - We apply DropBlock with block size 3x3 after batch normalization layers in the NAS-FPN layers - Using DropBlock improves performance further
  • 44. NAS-FPN : Experiments Further questions - This is a case of AutoML applied to detection - Running AutoML takes an enormous amount of hardware and time; can we realistically do it ourselves? - Is there a more effective method? - Multi-resolution features are merged with a plain sum; is there a better way? This is where EfficientDet starts.
  • 46. EFFICIENTDET : Introduction The state-of-the-art object detectors also become increasingly more expensive The key contribution of our work is in designing the search space that covers all possible cross-scale connections to generate multiscale feature representations. - The latest AmoebaNet-based NAS-FPN detector requires 167M parameters and 3045B FLOPs (30x more than RetinaNet) - Given these real-world resource constraints, model efficiency becomes increasingly important for object detection. Model efficiency has become increasingly important in computer vision. First, we propose a weighted bi-directional feature pyramid network. Second, we propose a compound scaling method (EfficientNet). We have developed a new family of object detectors, called EfficientDet
  • 47. EFFICIENTDET : Introduction Although these methods tend to achieve better efficiency, they usually sacrifice accuracy - Most previous works only focus on a specific or a small range of resource requirements - the variety of real-world applications, from mobile devices to datacenters A natural question: Is it possible to build a scalable detection architecture with both higher accuracy and better efficiency across a wide spectrum of resource constraints? The question shared by every object detection paper: get accuracy and efficiency at the same time!
  • 48. EFFICIENTDET : Introduction Challenge 1: efficient multi-scale feature fusion - FPN has been widely used for multiscale feature fusion - PANet, NAS-FPN, and other studies have developed more network structures for cross-scale feature fusion - Most previous works simply sum them up without distinction - We propose a simple yet highly effective weighted bi-directional feature pyramid network (BiFPN) - PANet adds one more bottom-up path on top of the top-down path used in RetinaNet - The reason: low-level features carry more location information, so passing them upward once more gives the higher-level features more location information, which is expected to improve performance
  • 49. EFFICIENTDET : Introduction Challenge 2: model scaling - Inspired by recent works EfficientNet, we propose a compound scaling method for object detectors, which jointly scales up the resolution/depth/width for all backbone, feature network, box/class prediction network - There are three ways to make a model larger: width, depth, and resolution; scale all three together in a balanced way (applying the EfficientNet method)
  • 50. EFFICIENTDET : Introduction Our contributions can be summarized - We proposed BiFPN, a weighted bidirectional feature network for easy and fast multi-scale feature fusion - We proposed a new compound scaling method, which jointly scales up backbone, feature network, box/class network, and resolution, in a principled way - Based on BiFPN and compound scaling, we developed EfficientDet
  • 51. EFFICIENTDET : BiFPN Problem Formulation - We proposed BiFPN, a weighted bidirectional feature network for easy and fast multi-scale feature fusion - We proposed a new compound scaling method, which jointly scales up backbone, feature network, box/class network, and resolution, in a principled way - Based on BiFPN and compound scaling, we developed EfficientDet
  • 52. EFFICIENTDET : BiFPN Problem Formulation - Formally, given a list of multi-scale features P^in (the features used in the feature pyramid) - Our goal is to find a transformation f that can effectively aggregate different features - Output a list of new features P^out
  • 54. EFFICIENTDET : BiFPN Cross-Scale Connections - We observe that PANet achieves better accuracy than FPN and NAS-FPN - Really?? Then why run NAS at all?? - First, we remove those nodes that only have one input edge - Our intuition is simple: if a node has only one input edge with no feature fusion then it will have less contribution; the result is called Simplified PANet - Second, we add an extra edge from the original input to output node if they are at the same level - Third, unlike PANet that only has one top-down and one bottom-up path, we treat each bidirectional (top-down & bottom-up) path as one feature network layer, and repeat the same layer multiple times to enable more high-level feature fusion (Figure labels: first, second, and third steps; repeated N times)
  • 55. EFFICIENTDET : BiFPN Weighted Feature Fusion - A common way is to first resize them to the same resolution and then sum them up. - Pyramid attention network introduces global self-attention upsampling to recover pixel localization (similar to SENet) Unbounded fusion - Wi is a learnable weight that can be a scalar (per-feature), a vector (per-channel), or a multi-dimensional tensor (per-pixel). - We find a scalar weight is sufficient, but the scalar weight is unbounded - so we resort to weight normalization to bound the value range of each weight
  • 56. EFFICIENTDET : BiFPN Softmax-based fusion - An intuitive idea is to apply softmax to each weight, such that all weights are normalized to be a probability with value range from 0 to 1, representing the importance of each input. - The extra softmax leads to significant slowdown on GPU hardware Fast normalized fusion - where wi ≥ 0 is ensured by applying a ReLU after each wi - ε = 0.0001 is a small value to avoid numerical instability - This fast fusion approach has very similar learning behavior and accuracy as the softmax-based fusion, but runs up to 30% faster on GPUs
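The two normalization schemes on this slide can be compared with a minimal sketch. Scalars stand in for whole feature maps here, which is a simplification made for illustration, and the function names are hypothetical; the formulas assumed are O = Σ softmax(w)_i · I_i and O = Σ w_i / (ε + Σ w_j) · I_i with w_i passed through a ReLU.

```python
import math


def softmax_fusion(weights, inputs):
    """Weights normalized with a softmax, so each lies in (0, 1) and they sum to 1."""
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    return sum(e / total * x for e, x in zip(exps, inputs))


def fast_normalized_fusion(weights, inputs, eps=1e-4):
    """ReLU keeps every weight non-negative; dividing by (eps + sum)
    bounds each normalized weight to [0, 1] without any exponentials."""
    clipped = [max(0.0, w) for w in weights]
    total = eps + sum(clipped)
    return sum(w / total * x for w, x in zip(clipped, inputs))


print(softmax_fusion([0.0, 0.0], [2.0, 4.0]))           # equal weights
print(fast_normalized_fusion([1.0, 1.0], [2.0, 4.0]))   # close to the mean
print(fast_normalized_fusion([-1.0, 2.0], [10.0, 4.0])) # negative weight is zeroed
```

The fast variant only needs a ReLU, a sum, and a division, which is why it avoids the slowdown attributed to the softmax.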
  • 57. EFFICIENTDET : BiFPN Fast normalized fusion, illustrated at level 6 with P6^td, P6^out, and P5^out
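The equations on this slide were lost in extraction; only the symbol names remain. As a hedged restatement, assuming the slide showed the usual level-6 example of BiFPN fusion, the two fused features can be written as:

```latex
P_6^{td} = \mathrm{Conv}\left( \frac{w_1 \cdot P_6^{in} + w_2 \cdot \mathrm{Resize}(P_7^{in})}{w_1 + w_2 + \epsilon} \right)

P_6^{out} = \mathrm{Conv}\left( \frac{w_1' \cdot P_6^{in} + w_2' \cdot P_6^{td} + w_3' \cdot \mathrm{Resize}(P_5^{out})}{w_1' + w_2' + w_3' + \epsilon} \right)
```

P_6^{td} is the intermediate feature on the top-down path and P_6^{out} the output on the bottom-up path; Resize stands for the upsampling or downsampling needed to match resolutions.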
  • 59. EFFICIENTDET : Architecture EfficientDet architecture - EfficientNet as the backbone network - BiFPN as the feature network n times - Shared class/box prediction network
  • 60. EFFICIENTDET : EFFICIENTNET EfficientNet: increase the number of channels (width), stack more layers (depth), or enlarge the input image (resolution); scale them up together in a balanced way
  • 61. EFFICIENTDET : EFFICIENTNET Compound Scaling - We propose a new compound scaling method for object detection, which uses a simple compound coefficient φ to jointly scale up all dimensions of backbone network, BiFPN network, class/box network, and resolution. - Grid search for all dimensions is prohibitively expensive. Therefore, we use a heuristic-based scaling approach Backbone network - We reuse the same width/depth scaling coefficients of EfficientNet-B0 to B6
  • 62. EFFICIENTDET : EFFICIENTNET BiFPN network - We exponentially grow BiFPN width Wbifpn (#channels) - Linearly increase depth Dbifpn (#layers) Box/class prediction network - We fix their width to be always the same as BiFPN (i.e., Wpred = Wbifpn) - But linearly increase the depth (#layers) (width = number of channels, depth = number of layers) Input image resolution - Since feature level 3-7 are used in BiFPN, the input resolution must be divisible by 2^7=128 - So the resolution is also increased linearly
  • 63. EFFICIENTDET : EFFICIENTNET Scaling configs for EfficientDet D0-D7 Wpred = Wbifpn EfficientNet-B0 to B6 The models are scaled up using the heuristic-based formulas
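The scaling rules above can be sketched as functions of the compound coefficient φ. The constants (base width 64 with growth 1.35, base depths of 3, resolution 512 + 128·φ) are recalled from the EfficientDet paper and should be treated as assumptions: they are not on the slides, they differ slightly between paper versions, and the published D0-D7 table rounds the widths.

```python
def bifpn_width(phi, base=64, growth=1.35):
    """BiFPN channels grow exponentially with phi (assumed constants)."""
    return base * growth ** phi


def bifpn_depth(phi, base=3):
    """BiFPN layers grow linearly with phi (assumed base depth)."""
    return base + phi


def head_depth(phi, base=3):
    """Box/class head layers grow linearly, one extra layer every 3 steps."""
    return base + phi // 3


def input_resolution(phi, base=512, step=128):
    """Resolution grows in steps of 128 so it stays divisible by 2**7."""
    return base + phi * step


for phi in range(7):
    print(phi, round(bifpn_width(phi)), bifpn_depth(phi),
          head_depth(phi), input_resolution(phi))
```

Keeping the resolution a multiple of 128 means the stride-128 level P7 always has an integer spatial size.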
  • 65. EFFICIENTDET : Experiments Model size and inference latency comparison
  • 66. EFFICIENTDET : Conclusion Weighted bidirectional feature network Customized compound scaling method Improve accuracy and efficiency EfficientDet-D7 achieves state-of-the-art accuracy 3.2x faster on GPUs and 8.1x faster on CPU
  • 68. Appendix ntos.gitbooks.io/artificial-inteligence/content/single-shot-detectors/ssd.html https://uk-kim.github.io/2018/12/07/Focal-loss-for-dense-object-detection.html Deep Learning for Generic Object Detection: A Survey https://taeu.github.io/paper/deeplearning-paper-ssd/ https://leonardoaraujosa https://towardsdatascience.com/review-fpn-feature-pyramid-network-object-detection-262fc7482610 https://www.groundai.com/project/pyramid-attention-network-for-semantic-segmentation/1 https://www.youtube.com/watch?v=11jDC8uZL0E