What lies beyond using TensorFlow, GPUs, or TPUs to process images seamlessly? Is there a silver bullet for image processing? Over the years, image processing has attracted a whole new level of attention, and its ease of use is no longer just a promise; it has become a reality. We have started seeing how the Residual Neural Network (ResNet) architecture is applied to different use cases, and how it is tweaked to solve different problems. Along with tweaking ResNet, preprocessing is also being improved to support different architectures.
With mobile phones in our hands, we have almost become cyborgs already, and until AI/ML runs entirely on the phone, researchers are not taking any rest. We are going to see the development of different architectures and algorithms for running AI/ML on low-end devices.
In this session, we will discuss several research papers on these topics, along with some of their implementations.
About Knoldus MachineX
MachineX is a group of data wizards.
We are a team of data scientists and engineers with a product mindset who deliver a competitive business advantage.
● Innovation Labs: Enable organizations to capture new value and business capabilities.
● Blogs: Consistently blogging to share our knowledge and research.
● Certifications: Deeplearning, Coursera, and Stanford certified professionals.
● TOK Sessions: Insight and perspective to help you make the right business decisions.
● Open Source Contribution: It’s great to contribute back to the community. We continuously advance open source technologies to meet demanding business requirements.
Problems
The problems with this pipeline:
● Feature extraction cannot be tweaked according to the classes and images.
● It is completely different from how we humans learn to recognize things.
Transfer Learning
Transfer learning is the application of skills, knowledge, and/or attitudes that were learned in one situation to another learning situation.
In deep learning, transfer learning is usually expressed through the use of pre-trained models.
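As a rough illustration, here is a minimal Keras sketch of transfer learning with a pre-trained model; the ResNet50 backbone, the hypothetical 10-class head, and the optimizer settings are our own choices, not part of the original slides:

```python
import tensorflow as tf

# Minimal transfer-learning sketch: reuse ImageNet-pretrained ResNet50
# features and train only a new classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # hypothetical 10-class task
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```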
AlexNet
● Data augmentation is carried out to reduce overfitting.
● Used ReLU, which reached a 25% training error rate about six times faster than the same network with a tanh non-linearity.
● AlexNet introduced Local Response Normalization (LRN) to aid generalization.
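For illustration only, a small Keras sketch of the first two points above: input augmentation plus a ReLU convolution. The specific augmentation layers and filter sizes are our own choices, not the original AlexNet code:

```python
import tensorflow as tf

# Sketch of AlexNet-style ideas with modern Keras layers: augmentation to
# reduce overfitting and ReLU as the non-linearity.
inputs = tf.keras.Input(shape=(227, 227, 3))
x = tf.keras.layers.RandomFlip("horizontal")(inputs)        # mirroring
x = tf.keras.layers.RandomTranslation(0.1, 0.1)(x)          # shifts stand in for random crops
x = tf.keras.layers.Conv2D(96, 11, strides=4, activation="relu")(x)  # ReLU instead of tanh
x = tf.keras.layers.MaxPooling2D(pool_size=3, strides=2)(x)
# ...remaining AlexNet layers omitted for brevity
model = tf.keras.Model(inputs, x)
```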
VGGNet
● VGG16 has a total of 138 million parameters.
● Conv kernels are 3x3 and max-pool kernels are 2x2 with a stride of 2 (see the sketch below).
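As a rough sketch of the point above, a VGG-style stage in Keras: a stack of 3x3 convolutions followed by a 2x2 max pool with stride 2. The helper name and the two stages shown are illustrative, not the reference implementation:

```python
import tensorflow as tf

def vgg_block(x, filters, num_convs):
    # A VGG-style stage: stacked 3x3 convolutions, then 2x2 max pooling with stride 2.
    for _ in range(num_convs):
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
x = vgg_block(inputs, 64, 2)    # first VGG16 stage: two 64-filter convolutions
x = vgg_block(x, 128, 2)        # second stage
# ...three more stages and the fully connected layers complete VGG16
model = tf.keras.Model(inputs, x)
```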
Hierarchical Features and the Role of Depth
● Low-, mid-, and high-level features
● More layers enrich the “levels” of the features
● Previous leading ImageNet models have depths of 16 to 30 layers
Construction Insight
● Consider a shallow architecture and its deeper counterpart
● The deeper model would just need to copy the shallower model and set its extra layers to identity mappings
● This construction suggests that a deeper model should produce no higher training error than its shallower counterpart
Residual Functions
● We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions
● H(x) = F(x) + x, i.e., the stacked layers learn the residual F(x) = H(x) - x rather than the full mapping H(x) (sketched below)
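A minimal Keras sketch of a basic residual block, assuming the number of filters matches the input channels so the identity shortcut can be added directly (otherwise a projection shortcut would be needed):

```python
import tensorflow as tf

def residual_block(x, filters):
    # The stacked layers learn the residual F(x); the shortcut adds the
    # input back, so the block outputs H(x) = F(x) + x.
    shortcut = x
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.Add()([y, shortcut])   # F(x) + x
    return tf.keras.layers.ReLU()(y)
```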
Experiment
● 152 layers on ImageNet
○ 8x deeper than VGGNet
○ Fewer parameters
● ResNet achieves 3.57% top-5 error on the ImageNet test set
○ 1st place in ILSVRC 2015
Results
● AlexNet and ResNet-152 both have about 60M parameters, but there is about a 10% difference in their top-5 accuracy
● VGGNet not only has more parameters and FLOPs than ResNet-152, but also lower accuracy
● Training AlexNet takes about the same time as training Inception, which has roughly 10x lower memory requirements
History and its importance
● Origin of CNNs (1980s-1999)
● Stagnation of CNNs (early 2000s)
● Revival of CNNs (2006-2011)
● Rise of CNNs (2012-2014)
● Rapid increase in architectural innovations (2015-present)
● Important because we are not done yet.
Attention-based CNNs
● Residual Attention Network
● Convolutional Block Attention Module (CBAM)
● Concurrent Squeeze-and-Excitation
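As a rough illustration of the squeeze-and-excitation idea underlying these attention blocks, a channel-attention sketch in Keras; the helper name is ours and the reduction ratio of 16 follows the common convention rather than any of the papers listed above:

```python
import tensorflow as tf

def se_block(x, reduction=16):
    # Squeeze: global average pooling gives one statistic per channel.
    # Excite: a small bottleneck MLP produces per-channel weights in (0, 1).
    # Rescale: the feature map is multiplied channel-wise by those weights.
    channels = x.shape[-1]
    s = tf.keras.layers.GlobalAveragePooling2D()(x)
    s = tf.keras.layers.Dense(channels // reduction, activation="relu")(s)
    s = tf.keras.layers.Dense(channels, activation="sigmoid")(s)
    s = tf.keras.layers.Reshape((1, 1, channels))(s)
    return tf.keras.layers.Multiply()([x, s])
```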
Improvement summary
● The learning capacity of CNNs has been significantly improved over the years by exploiting depth and other structural modifications.
○ Activation and loss functions, optimization, regularization, learning algorithms, and restructuring of processing units.
● Major improvement in CNNs
○ The main boost in CNN performance has been achieved by replacing the conventional layer structure with blocks.
Challenges
● Deep NNs are generally black boxes and thus may lack interpretability and explainability
● Each layer of a CNN automatically tries to extract better, problem-specific features relevant to the task
● Deep CNNs are based on a supervised learning mechanism, so a large amount of annotated data is required for proper learning
● Hyperparameter selection highly influences the performance of a CNN
● Efficient training of CNNs demands powerful hardware resources such as GPUs
References
● [1] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
● [2] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
● [3] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
● [4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015.
● [5] https://arxiv.org/pdf/1901.06032.pdf