SlideShare una empresa de Scribd logo
1 de 51
Descargar para leer sin conexión
CNNS
FROM THE BASICS TO RECENT ADVANCES
Dmytro Mishkin
Center for Machine Perception
Czech Technical University in Prague
ducha.aiki@gmail.com
MY BACKGROUND
 PhD student of Czech Technical university in
Prague. Now fully working in Deep Learning,
recent paper “All you need is a good init”
added to Stanford CS231n course.
 Kaggler. 9th out of 1049 teams at National
Data Science Bowl
 CTO of Clear Research (clear.sx) Using deep
learning at work since 2014.
2
Short review of the CNN design
Architecture progress
AlexNet → VGGNet → ResNet
→ GoogLeNet
Initialization
Design choices
3
OUTLINE
IMAGENET WINNERS
4
AlexNet (original):Krizhevsky et.al., ImageNet Classification with Deep Convolutional Neural Networks, 2012.
CaffeNet: Jia et.al., Caffe: Convolutional Architecture for Fast Feature Embedding, 2014.
Image credit: Roberto Matheus Pinheiro Pereira, “Deep Learning Talk”.
Srinivas et.al “A Taxonomy of Deep Convolutional Neural Nets for Computer Vision”, 2016.
5
CAFFENET ARCHITECTURE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
6
CAFFENET ARCHITECTURE
VGGNET ARCHITECTURE
7
All convolutions are 3x3
Good performance,
but slow
INCEPTION (GOOGLENET):
BUILDING BLOCK
Szegedy et.al. Going Deeper with Convolutions. CVPR, 2015
Image credit: https://www.udacity.com/course/deep-learning--ud730
8
DEPTH LIMIT NOT REACHED YET
IN DEEP LEARNING
9
DEPTH LIMIT NOT REACHED YET
IN DEEP LEARNING
10
DEPTH LIMIT NOT REACHED YET
IN DEEP LEARNING
11
VGGNet – 19 layers, 19.6 billion FLOPS. Simonyan et.al., 2014
ResNet – 152 layers, 11.3 billion FLOPS. He et.al., 2015
Slide credit Šulc et.al. Very Deep Residual Networks with MaxOut for Plant Identification in the Wild, 2016.
Stochastic ResNet – 1200 layers.
Huang et.al., Deep Networks with Stochastic Depth, 2016
RESNET: RESIDUAL BLOCK
He et.al. Deep Residual Learning for Image Recognition, ICCV 2015
12
RESIDUAL NETWORKS: HOT TOPIC
 Identity mapping https://arxiv.org/abs/1603.05027
 Wide ResNets https://arxiv.org/abs/1605.07146
 Stochastic depth https://arxiv.org/abs/1603.09382
 Residual Inception https://arxiv.org/abs/1602.07261
 ResNets + ELU http://arxiv.org/pdf/1604.04112.pdf
 ResNet in ResNet
http://arxiv.org/pdf/1608.02908v1.pdf
 DC Nets http://arxiv.org/abs/1608.06993
 Weighted ResNet
http://arxiv.org/pdf/1605.08831v1.pdf
13
DATASETS USED IN PRESENTATION:
IMAGENET AND CIFAR-10
CIFAR-10:
50k training images
(32x32 px )
10k validation images
10 classes
ImageNet:
1.2M training images
(~ 256 x 256px )
50k validation images
1000 classes
Russakovskiy et.al, ImageNet Large Scale Visual Recognition Challenge, 2015
Krizhevsky, Learning Multiple Layers of Features from Tiny Images, 2009
14
Image credit: Hu et.al, 2015
Transferring Deep Convolutional Neural Networks for the Scene Classification
of High-Resolution Remote Sensing Imagery
15
CAFFENET ARCHITECTURE
LIST OF HYPER-PARAMETERS TESTED
16
REFERENCE METHODS: IMAGE SIZE SENSITIVE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
17
CHOICE OF NON-LINEARITY
18
CHOICE OF NON-LINEARITY
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
19
NON-LINEARITIES ON CAFFENET
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
20
 Gaussian noise with variance.
 𝑣𝑎𝑟 𝜔𝑙 = 0.01 (AlexNet, Krizhevsky et.al, 2012)
 𝑣𝑎𝑟 𝜔𝑙 = 1/𝑛𝑖𝑛𝑝𝑢𝑡𝑠 (Glorot et.al. 2010)
 𝑣𝑎𝑟 𝜔𝑙 = 2/𝑛𝑖𝑛𝑝𝑢𝑡𝑠 (He et.al. 2015)
 Orthonormal: (Saxe et.al. 2013)
Glorot → SVD → 𝜔𝑙 = V
 Data-dependent: LSUV (Mishkin et.al, 2016)
Mishkin and Matas. All you need is a good init. ICLR, 2016
21
WEIGHT INITIALIZATION FOR A VERY DEEP NET
a) Layer gain 𝐺 𝐿 < 1→ vanishing activations variance
b) Layer gain 𝐺 𝐿 > 1 → exploding activation variance
Mishkin and Matas. All you need is a good init. ICLR, 2016
22
WEIGHT INITIALIZATION INFLUENCES ACTIVATIONS
ACTIVATIONS INFLUENCES MAGNITUDE OF
GRADIENT COMPONENTS
23
Mishkin and Matas. All you need is a good init. ICLR, 2016
Algorithm 1. Layer-sequential unit-variance orthogonal initialization.
𝑳 − convolution or fully-connected layer, 𝑊𝐿 − its weights, 𝑂 𝐿 − layer output,
𝜀 − variance tolerance,
𝑇𝑖 − iteration number, 𝑇 𝑚𝑎𝑥 − max number of iterations.
Pre-initialize network with orthonormal matrices as in Saxe et.al. (2013)
for each convolutional and fully-connected layer 𝑳 do
do forward pass with mini-batch
calculate v𝑎𝑟(𝑂 𝐿)
𝑊𝐿
𝑖+1
= ൗ𝑊𝐿
𝑖
𝑣𝑎𝑟(𝑂 𝐿)
until 𝑣𝑎𝑟 𝑂 𝐿 − 1.0 < 𝜀 or (𝑇𝑖 > 𝑇𝑚𝑎𝑥)
end for
*The LSUV algorithm does not deal with biases and initializes them with zeros
KEEPING PRODUCT OF PER-LAYER GAINS ~ 1:
LAYER-SEQUENTIAL UNIT-VARIANCE
ORTHOGONAL INITIALIZATION
Mishkin and Matas. All you need is a good init. ICLR, 2016
24
COMPARISON OF THE INITIALIZATIONS FOR
DIFFERENT ACTIVATIONS
 CIFAR-10 FitNet, accuracy [%]

 CIFAR-10 FitResNet, accuracy [%]
Mishkin and Matas. All you need is a good init. ICLR, 2016
25
LSUV INITIALIZATION IMPLEMENTATIONS
Mishkin and Matas. All you need is a good init. ICLR, 2016
 Caffe https://github.com/ducha-aiki/LSUVinit
 Keras https://github.com/ducha-aiki/LSUV-keras
 Torch https://github.com/yobibyte/torch-lsuv
26
BATCH NORMALIZATION
(AFTER EVERY CONVOLUTION LAYER)
Ioffe et.al, ICML 2015
27
BATCH NORMALIZATION: WHERE,
BEFORE OR AFTER NON-LINEARITY?
Mishkin and Matas. All you need is a good init. ICLR, 2016
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
ImageNet, top-1 accuracy
CIFAR-10, top-1 accuracy,
FitNet4 network
In short:
better to test
with
your architecture
and dataset :)
28
Network No BN Before ReLU After ReLU
CaffeNet128-FC2048 47.1 47.8 49.9
GoogLeNet128 61.9 60.3 59.6
Non-linearity BN Before BN After
TanH 88.1 89.2
ReLU 92.6 92.5
MaxOut 92.3 92.9
BATCH NORMALIZATION
SOMETIMES WORKS TOO GOOD AND HIDES PROBLEMS
29
Case:
CNN has less number outputs ( just typo),
than classes in dataset: 26 vs. 28
BatchNormed “learns well”
Plain CNN diverges
NON-LINEARITIES ON CAFFENET, WITH
BATCH NORMALIZATION
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
30
NON-LINEARITIES: TAKE AWAY MESSAGE
 Use ELU without batch normalization
 Or ReLU + BN
 Try maxout for the final layers
 Fallback solution (if something goes wrong) – ReLU
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
31
BUT IN SMALL DATA REGIME ( ~50K IMAGES)
TRY LEAKY OR RANDOMIZED RELU
 Accuracy [%], Network in Network architecture
 LogLoss, Plankton VGG architecture
Xu et.al. Empirical Evaluation of Rectified Activations in Convolutional Network ICLR 2015
ReLU VLReLU RReLU PReLU
CIFAR-10 87.55 88.80 88.81 88.20
CIFAR-100 57.10 59.60 59.80 58.4
ReLU VLReLU RReLU PReLU
KNDB 0.77 0.73 0.72 0.74
32
INPUT IMAGE SIZE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
33
PADDING TYPES
Zero padding, stride = 2
Dumoulin and Visin. A guide to convolution arithmetic for deep learning. ArXiv 2016
No padding, stride = 2 Zero padding, stride = 1
34
PADDING
 Zero-padding:
 Preserving spatial size, not “washing out”
information
 Dropout-like augmentation by zeros
Caffenet128
 with conv padding: 47% top-1 acc
 w/o conv padding: 41% top-1 acc.
35
MAX POOLING:
PADDING AND KERNEL
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
36
POOLING METHODS
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
37
POOLING METHODS
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
38
LEARNING RATE POLICY
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
39
IMAGE PREPROCESSING
 Subtract mean pixel (training set), divide by std.
 RGB is the best (standard) colorspace for CNN
 Do nothing more…
 …unless you have specific dataset.
Subtract local mean pixel
B.Graham, 2015
Kaggle Diabetic Retinopathy Competition report
40
IMAGE PREPROCESSING:
WHAT DOESN`T WORK
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
41
IMAGE PREPROCESSING:
LET`S LEARN THE COLORSPACE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
Image credit: https://www.udacity.com/course/deep-learning--ud730
42
DATASET QUALITY AND SIZE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
43
NETWORK WIDTH: SATURATION AND SPEED
PROBLEM
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
44
BATCH SIZE AND LEARNING RATE
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
45
CLASSIFIER DESIGN
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
Ren et.al Object Detection Networks on Convolutional Feature Maps, arXiv 2016
Take home:
put
fully-connected
before final layer,
not earlier
46
APPLYING ALTOGETHER
Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016
47
>5 pp. additional
top-1 accuracy
for free.
THANK YOU FOR THE ATTENTION
 Any questions?
 All logs, graphs and network definitions:
https://github.com/ducha-aiki/caffenet-benchmark
Feel free to add your tests :)
 The paper is here: https://arxiv.org/abs/1606.02228
ducha.aiki@gmail.com
mishkdmy@cmp.felk.cvut.cz 48
ARCHITECTURE
 Use as small filters as possible
 3x3 + ReLU + 3x3 + ReLU > 5x5 + ReLU.
 3x1 + 1x3 > 3x3.
 2x2 + 2x2 > 3x3
 Exception: 1st layer. Too computationally
ineffective to use 3x3 there.
Convolutional Neural Networks at Constrained Time Cost. He and Sun, CVPR 2015
49
CAFFENET TRAINING
Mishkin and Matas. All you need is a good init. ICLR, 2016
50
GOOGLENET TRAINING
Mishkin and Matas. All you need is a good init. ICLR, 2016
51

Más contenido relacionado

La actualidad más candente

Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Simplilearn
 

La actualidad más candente (20)

Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Convolutional neural network
Convolutional neural networkConvolutional neural network
Convolutional neural network
 
Deep learning
Deep learningDeep learning
Deep learning
 
CNN Machine learning DeepLearning
CNN Machine learning DeepLearningCNN Machine learning DeepLearning
CNN Machine learning DeepLearning
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
 
Convolutional Neural Network and Its Applications
Convolutional Neural Network and Its ApplicationsConvolutional Neural Network and Its Applications
Convolutional Neural Network and Its Applications
 
Convolutional Neural Network (CNN) - image recognition
Convolutional Neural Network (CNN)  - image recognitionConvolutional Neural Network (CNN)  - image recognition
Convolutional Neural Network (CNN) - image recognition
 
CNN Tutorial
CNN TutorialCNN Tutorial
CNN Tutorial
 
AlexNet
AlexNetAlexNet
AlexNet
 
Image Classification using deep learning
Image Classification using deep learning Image Classification using deep learning
Image Classification using deep learning
 
Deep learning
Deep learningDeep learning
Deep learning
 
Machine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural NetworkMachine Learning - Convolutional Neural Network
Machine Learning - Convolutional Neural Network
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning Tutorial
 
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
“An Introduction to Data Augmentation Techniques in ML Frameworks,” a Present...
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
An introduction to Deep Learning
An introduction to Deep LearningAn introduction to Deep Learning
An introduction to Deep Learning
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
 
Convolutional Neural Networks
Convolutional Neural NetworksConvolutional Neural Networks
Convolutional Neural Networks
 

Similar a CNNs: from the Basics to Recent Advances

UNetEliyaLaialy (2).pptx
UNetEliyaLaialy (2).pptxUNetEliyaLaialy (2).pptx
UNetEliyaLaialy (2).pptx
NoorUlHaq47
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Sitakanta Mishra
 

Similar a CNNs: from the Basics to Recent Advances (20)

ImageNet Classification with Deep Convolutional Neural Networks
ImageNet Classification with Deep Convolutional Neural NetworksImageNet Classification with Deep Convolutional Neural Networks
ImageNet Classification with Deep Convolutional Neural Networks
 
UNetEliyaLaialy (2).pptx
UNetEliyaLaialy (2).pptxUNetEliyaLaialy (2).pptx
UNetEliyaLaialy (2).pptx
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
 
CNN
CNNCNN
CNN
 
Recent developments in Deep Learning
Recent developments in Deep LearningRecent developments in Deep Learning
Recent developments in Deep Learning
 
Can AI say from our eyes when we read relevant information?
Can AI say from our eyes when we read relevant information?Can AI say from our eyes when we read relevant information?
Can AI say from our eyes when we read relevant information?
 
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
Interpretability of Convolutional Neural Networks - Eva Mohedano - UPC Barcel...
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
Convolutional Neural Networks for Image Classification (Cape Town Deep Learni...
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
PointNet
PointNetPointNet
PointNet
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
BLOOD TISSUE IMAGE TO IDENTIFY MALARIA DISEASE CLASSIFICATION
BLOOD TISSUE IMAGE TO IDENTIFY MALARIA DISEASE CLASSIFICATIONBLOOD TISSUE IMAGE TO IDENTIFY MALARIA DISEASE CLASSIFICATION
BLOOD TISSUE IMAGE TO IDENTIFY MALARIA DISEASE CLASSIFICATION
 
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
 
Reservoir computing fast deep learning for sequences
Reservoir computing   fast deep learning for sequencesReservoir computing   fast deep learning for sequences
Reservoir computing fast deep learning for sequences
 
ct_meeting_final_jcy (1).pdf
ct_meeting_final_jcy (1).pdfct_meeting_final_jcy (1).pdf
ct_meeting_final_jcy (1).pdf
 
DeepFix: a fully convolutional neural network for predicting human fixations...
DeepFix:  a fully convolutional neural network for predicting human fixations...DeepFix:  a fully convolutional neural network for predicting human fixations...
DeepFix: a fully convolutional neural network for predicting human fixations...
 
Surveillance scene classification using machine learning
Surveillance scene classification using machine learningSurveillance scene classification using machine learning
Surveillance scene classification using machine learning
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Pres Tesi LM-2016+transcript_eng
Pres Tesi LM-2016+transcript_engPres Tesi LM-2016+transcript_eng
Pres Tesi LM-2016+transcript_eng
 

Último

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 

Último (20)

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 

CNNs: from the Basics to Recent Advances

  • 1. CNNS FROM THE BASICS TO RECENT ADVANCES Dmytro Mishkin Center for Machine Perception Czech Technical University in Prague ducha.aiki@gmail.com
  • 2. MY BACKGROUND  PhD student of Czech Technical university in Prague. Now fully working in Deep Learning, recent paper “All you need is a good init” added to Stanford CS231n course.  Kaggler. 9th out of 1049 teams at National Data Science Bowl  CTO of Clear Research (clear.sx) Using deep learning at work since 2014. 2
  • 3. Short review of the CNN design Architecture progress AlexNet → VGGNet → ResNet → GoogLeNet Initialization Design choices 3 OUTLINE
  • 5. AlexNet (original):Krizhevsky et.al., ImageNet Classification with Deep Convolutional Neural Networks, 2012. CaffeNet: Jia et.al., Caffe: Convolutional Architecture for Fast Feature Embedding, 2014. Image credit: Roberto Matheus Pinheiro Pereira, “Deep Learning Talk”. Srinivas et.al “A Taxonomy of Deep Convolutional Neural Nets for Computer Vision”, 2016. 5 CAFFENET ARCHITECTURE
  • 6. Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 6 CAFFENET ARCHITECTURE
  • 7. VGGNET ARCHITECTURE 7 All convolutions are 3x3 Good performance, but slow
  • 8. INCEPTION (GOOGLENET): BUILDING BLOCK Szegedy et.al. Going Deeper with Convolutions. CVPR, 2015 Image credit: https://www.udacity.com/course/deep-learning--ud730 8
  • 9. DEPTH LIMIT NOT REACHED YET IN DEEP LEARNING 9
  • 10. DEPTH LIMIT NOT REACHED YET IN DEEP LEARNING 10
  • 11. DEPTH LIMIT NOT REACHED YET IN DEEP LEARNING 11 VGGNet – 19 layers, 19.6 billion FLOPS. Simonyan et.al., 2014 ResNet – 152 layers, 11.3 billion FLOPS. He et.al., 2015 Slide credit Šulc et.al. Very Deep Residual Networks with MaxOut for Plant Identification in the Wild, 2016. Stochastic ResNet – 1200 layers. Huang et.al., Deep Networks with Stochastic Depth, 2016
  • 12. RESNET: RESIDUAL BLOCK He et.al. Deep Residual Learning for Image Recognition, ICCV 2015 12
  • 13. RESIDUAL NETWORKS: HOT TOPIC  Identity mapping https://arxiv.org/abs/1603.05027  Wide ResNets https://arxiv.org/abs/1605.07146  Stochastic depth https://arxiv.org/abs/1603.09382  Residual Inception https://arxiv.org/abs/1602.07261  ResNets + ELU http://arxiv.org/pdf/1604.04112.pdf  ResNet in ResNet http://arxiv.org/pdf/1608.02908v1.pdf  DC Nets http://arxiv.org/abs/1608.06993  Weighted ResNet http://arxiv.org/pdf/1605.08831v1.pdf 13
  • 14. DATASETS USED IN PRESENTATION: IMAGENET AND CIFAR-10 CIFAR-10: 50k training images (32x32 px ) 10k validation images 10 classes ImageNet: 1.2M training images (~ 256 x 256px ) 50k validation images 1000 classes Russakovskiy et.al, ImageNet Large Scale Visual Recognition Challenge, 2015 Krizhevsky, Learning Multiple Layers of Features from Tiny Images, 2009 14
  • 15. Image credit: Hu et.al, 2015 Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery 15 CAFFENET ARCHITECTURE
  • 17. REFERENCE METHODS: IMAGE SIZE SENSITIVE Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 17
  • 19. CHOICE OF NON-LINEARITY Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 19
  • 20. NON-LINEARITIES ON CAFFENET Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 20
  • 21.  Gaussian noise with variance.  𝑣𝑎𝑟 𝜔𝑙 = 0.01 (AlexNet, Krizhevsky et.al, 2012)  𝑣𝑎𝑟 𝜔𝑙 = 1/𝑛𝑖𝑛𝑝𝑢𝑡𝑠 (Glorot et.al. 2010)  𝑣𝑎𝑟 𝜔𝑙 = 2/𝑛𝑖𝑛𝑝𝑢𝑡𝑠 (He et.al. 2015)  Orthonormal: (Saxe et.al. 2013) Glorot → SVD → 𝜔𝑙 = V  Data-dependent: LSUV (Mishkin et.al, 2016) Mishkin and Matas. All you need is a good init. ICLR, 2016 21 WEIGHT INITIALIZATION FOR A VERY DEEP NET
  • 22. a) Layer gain 𝐺 𝐿 < 1→ vanishing activations variance b) Layer gain 𝐺 𝐿 > 1 → exploding activation variance Mishkin and Matas. All you need is a good init. ICLR, 2016 22 WEIGHT INITIALIZATION INFLUENCES ACTIVATIONS
  • 23. ACTIVATIONS INFLUENCES MAGNITUDE OF GRADIENT COMPONENTS 23 Mishkin and Matas. All you need is a good init. ICLR, 2016
  • 24. Algorithm 1. Layer-sequential unit-variance orthogonal initialization. 𝑳 − convolution or fully-connected layer, 𝑊𝐿 − its weights, 𝑂 𝐿 − layer output, 𝜀 − variance tolerance, 𝑇𝑖 − iteration number, 𝑇 𝑚𝑎𝑥 − max number of iterations. Pre-initialize network with orthonormal matrices as in Saxe et.al. (2013) for each convolutional and fully-connected layer 𝑳 do do forward pass with mini-batch calculate v𝑎𝑟(𝑂 𝐿) 𝑊𝐿 𝑖+1 = ൗ𝑊𝐿 𝑖 𝑣𝑎𝑟(𝑂 𝐿) until 𝑣𝑎𝑟 𝑂 𝐿 − 1.0 < 𝜀 or (𝑇𝑖 > 𝑇𝑚𝑎𝑥) end for *The LSUV algorithm does not deal with biases and initializes them with zeros KEEPING PRODUCT OF PER-LAYER GAINS ~ 1: LAYER-SEQUENTIAL UNIT-VARIANCE ORTHOGONAL INITIALIZATION Mishkin and Matas. All you need is a good init. ICLR, 2016 24
  • 25. COMPARISON OF THE INITIALIZATIONS FOR DIFFERENT ACTIVATIONS  CIFAR-10 FitNet, accuracy [%]   CIFAR-10 FitResNet, accuracy [%] Mishkin and Matas. All you need is a good init. ICLR, 2016 25
  • 26. LSUV INITIALIZATION IMPLEMENTATIONS Mishkin and Matas. All you need is a good init. ICLR, 2016  Caffe https://github.com/ducha-aiki/LSUVinit  Keras https://github.com/ducha-aiki/LSUV-keras  Torch https://github.com/yobibyte/torch-lsuv 26
  • 27. BATCH NORMALIZATION (AFTER EVERY CONVOLUTION LAYER) Ioffe et.al, ICML 2015 27
  • 28. BATCH NORMALIZATION: WHERE, BEFORE OR AFTER NON-LINEARITY? Mishkin and Matas. All you need is a good init. ICLR, 2016 Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 ImageNet, top-1 accuracy CIFAR-10, top-1 accuracy, FitNet4 network In short: better to test with your architecture and dataset :) 28 Network No BN Before ReLU After ReLU CaffeNet128-FC2048 47.1 47.8 49.9 GoogLeNet128 61.9 60.3 59.6 Non-linearity BN Before BN After TanH 88.1 89.2 ReLU 92.6 92.5 MaxOut 92.3 92.9
  • 29. BATCH NORMALIZATION SOMETIMES WORKS TOO GOOD AND HIDES PROBLEMS 29 Case: CNN has less number outputs ( just typo), than classes in dataset: 26 vs. 28 BatchNormed “learns well” Plain CNN diverges
  • 30. NON-LINEARITIES ON CAFFENET, WITH BATCH NORMALIZATION Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 30
  • 31. NON-LINEARITIES: TAKE AWAY MESSAGE  Use ELU without batch normalization  Or ReLU + BN  Try maxout for the final layers  Fallback solution (if something goes wrong) – ReLU Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 31
  • 32. BUT IN SMALL DATA REGIME ( ~50K IMAGES) TRY LEAKY OR RANDOMIZED RELU  Accuracy [%], Network in Network architecture  LogLoss, Plankton VGG architecture Xu et.al. Empirical Evaluation of Rectified Activations in Convolutional Network ICLR 2015 ReLU VLReLU RReLU PReLU CIFAR-10 87.55 88.80 88.81 88.20 CIFAR-100 57.10 59.60 59.80 58.4 ReLU VLReLU RReLU PReLU KNDB 0.77 0.73 0.72 0.74 32
  • 33. INPUT IMAGE SIZE Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 33
  • 34. PADDING TYPES Zero padding, stride = 2 Dumoulin and Visin. A guide to convolution arithmetic for deep learning. ArXiv 2016 No padding, stride = 2 Zero padding, stride = 1 34
  • 35. PADDING  Zero-padding:  Preserving spatial size, not “washing out” information  Dropout-like augmentation by zeros Caffenet128  with conv padding: 47% top-1 acc  w/o conv padding: 41% top-1 acc. 35
  • 36. MAX POOLING: PADDING AND KERNEL Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 36
  • 37. POOLING METHODS Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 37
  • 38. POOLING METHODS Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 38
  • 39. LEARNING RATE POLICY Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 39
  • 40. IMAGE PREPROCESSING  Subtract mean pixel (training set), divide by std.  RGB is the best (standard) colorspace for CNN  Do nothing more…  …unless you have specific dataset. Subtract local mean pixel B.Graham, 2015 Kaggle Diabetic Retinopathy Competition report 40
  • 41. IMAGE PREPROCESSING: WHAT DOESN`T WORK Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 41
  • 42. IMAGE PREPROCESSING: LET`S LEARN THE COLORSPACE Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 Image credit: https://www.udacity.com/course/deep-learning--ud730 42
  • 43. DATASET QUALITY AND SIZE Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 43
  • 44. NETWORK WIDTH: SATURATION AND SPEED PROBLEM Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 44
  • 45. BATCH SIZE AND LEARNING RATE Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 45
  • 46. CLASSIFIER DESIGN Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 Ren et.al Object Detection Networks on Convolutional Feature Maps, arXiv 2016 Take home: put fully-connected before final layer, not earlier 46
  • 47. APPLYING ALTOGETHER Mishkin et.al. Systematic evaluation of CNN advances on the ImageNet, arXiv 2016 47 >5 pp. additional top-1 accuracy for free.
  • 48. THANK YOU FOR THE ATTENTION  Any questions?  All logs, graphs and network definitions: https://github.com/ducha-aiki/caffenet-benchmark Feel free to add your tests :)  The paper is here: https://arxiv.org/abs/1606.02228 ducha.aiki@gmail.com mishkdmy@cmp.felk.cvut.cz 48
  • 49. ARCHITECTURE  Use as small filters as possible  3x3 + ReLU + 3x3 + ReLU > 5x5 + ReLU.  3x1 + 1x3 > 3x3.  2x2 + 2x2 > 3x3  Exception: 1st layer. Too computationally ineffective to use 3x3 there. Convolutional Neural Networks at Constrained Time Cost. He and Sun, CVPR 2015 49
  • 50. CAFFENET TRAINING Mishkin and Matas. All you need is a good init. ICLR, 2016 50
  • 51. GOOGLENET TRAINING Mishkin and Matas. All you need is a good init. ICLR, 2016 51