DEEP LEARNING
Dr.S.SHAJUN NISHA, MCA.,M.Phil.,M.Tech.,MBA.,Ph.D.,
Assistant Professor & Head ,
PG & Research Dept. of Computer Science
Sadakathullah Appa College(Autonomous),
Tirunelveli.
shajunnisha_s@yahoo.com
+91 99420 96220
Introduction
• Deep learning is a form of machine learning that uses
a model of computing that is strongly inspired by the
structure of the brain; hence we call this model a
neural network. The basic foundational unit of a
neural network is the neuron.
• Each neuron has a set of inputs, each of which is given
a specific weight. The neuron computes some function
on these weighted inputs. A linear neuron takes a
linear combination of the weighted inputs and applies an
activation function (sigmoid, tanh, etc.).
• The network feeds the weighted sum of the inputs into the
logistic function (in the case of the sigmoid activation function).
The logistic function returns a value between 0 and 1:
when the weighted sum is very negative, the return
value is very close to 0; when the weighted sum is very
large and positive, the return value is very close to 1.
Figures: Biological Neuron; Artificial Neuron; Number of Neurons in Species
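As a rough illustration of the neuron described above, the following minimal Python/NumPy sketch computes a weighted sum of the inputs plus a bias and passes it through the sigmoid (logistic) function; the input values and weights are made-up numbers used only for illustration.

import numpy as np

def sigmoid(z):
    # Logistic function: squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs and weights for one neuron
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.8, 0.1, -0.4])
bias = 0.2

weighted_sum = np.dot(weights, inputs) + bias   # linear combination of weighted inputs
activation = sigmoid(weighted_sum)              # close to 0 for very negative sums,
print(weighted_sum, activation)                 # close to 1 for large positive sums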
Introduction
 Software used in Deep Learning
 Theano: Python-based Deep Learning library
 TensorFlow: Google’s Deep Learning library, runs on top of
Python/C++
 Keras / Lasagne: Lightweight wrappers which sit on top
of Theano/TensorFlow and enable faster model prototyping
 Torch: Lua-based Deep Learning library with wide support for
machine learning algorithms
 Caffe: Deep Learning library primarily used for processing
pictures
FUNDAMENTALS
 Activation Functions: Every activation function takes a single
number and performs a certain fixed mathematical operation on it.
Below are the most popularly used activation functions in Deep Learning:
 Sigmoid
 Tanh
 ReLU
 Sigmoid: It has the mathematical form σ(x) = 1 / (1 + e^(−x)). It takes a
real-valued number and squashes it into the range between 0 and 1.
Sigmoid is a popular choice because its derivative is easy to calculate
and its output is easy to interpret.
 Tanh: Tanh squashes a real-valued number to the range [-1, 1].
Its output is zero-centered. In practice the tanh non-linearity is
always preferred to the sigmoid non-linearity. Also, it can be
shown that tanh is a scaled sigmoid: tanh(x) = 2σ(2x) − 1
Figures: Sigmoid Activation Function; Tanh Activation Function
FUNDAMENTALS
 ReLU (Rectified Linear Unit): ReLU has
become very popular in the last few years. It
computes the function f(x) = max(0, x); the
activation is simply thresholded at zero
 Linear: The linear activation function is used
in linear regression problems; its derivative
is always 1 because the function used is f(x) = x
 ReLU is now popularly used in place
of sigmoid or tanh due to its better
convergence properties
ReLU Activation Function
Linear Activation Function
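A minimal NumPy sketch of the four activation functions listed above, including a numerical check of the identity tanh(x) = 2σ(2x) − 1; the sample input range is arbitrary.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                     # squashes to (-1, 1), zero centered

def relu(x):
    return np.maximum(0.0, x)             # simply thresholds at zero

def linear(x):
    return x                               # identity, derivative is always 1

x = np.linspace(-3, 3, 7)
# Numerical check of the identity tanh(x) = 2 * sigmoid(2x) - 1
print(np.allclose(tanh(x), 2 * sigmoid(2 * x) - 1))   # True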
FUNDAMENTALS
 Forward propagation & Backpropagation:
During the forward propagation stage,
features are fed into the network and
propagated forward through its layers to
produce the outputs. However, the error of
the network can be computed only at the
output layer; to update the weights toward
their optimal values, we must propagate the
network’s errors backwards through its layers
Forward Propagation of Layer 1 Neurons (figure: Input Layer → Hidden Layer 1 → Hidden Layer 2 → Output Layer):
Activation = g(BiasWeight1 + Weight1 * Input1 + Weight2 * Input2)
Forward Propagation of Layer 2 Neurons:
Activation = g(BiasWeight4 + Weight7 * Hidden1 + Weight8 * Hidden2 + Weight9 * Hidden3)
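The forward-propagation formulas above can be sketched in NumPy for the small 2–3–3–2 network shown in the figures. The weight values here are random placeholders and the activation g is assumed to be the sigmoid; this is an illustration of the computation, not the exact network from the slides.

import numpy as np

def g(z):
    # Activation function; sigmoid is assumed here for illustration
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs and weights for the 2-3-3-2 network in the figures
inputs = np.array([0.7, 0.3])                    # Input1, Input2
W1 = np.random.randn(3, 2) * 0.1                  # weights Input Layer -> Hidden Layer 1
b1 = np.zeros(3)                                  # bias weights of Hidden Layer 1
W2 = np.random.randn(3, 3) * 0.1                  # weights Hidden Layer 1 -> Hidden Layer 2
b2 = np.zeros(3)
W3 = np.random.randn(2, 3) * 0.1                  # weights Hidden Layer 2 -> Output Layer
b3 = np.zeros(2)

# Forward propagation, e.g. Hidden1 = g(BiasWeight1 + Weight1*Input1 + Weight2*Input2)
hidden1 = g(b1 + W1 @ inputs)                     # Hidden Layer 1 activations
hidden2 = g(b2 + W2 @ hidden1)                    # Hidden Layer 2 activations
outputs = g(b3 + W3 @ hidden2)                    # Output1, Output2
print(outputs)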
FUNDAMENTALS
Forward Propagation of Output Layer Neurons:
Activation = g(BiasWeight7 + Weight16 * Hidden4 + Weight17 * Hidden5 + Weight18 * Hidden6)
Backpropagation of Output Layer Neurons:
Error(Output1) = g’(Output1) * (True1 - Output1)
FUNDAMENTALS
Backpropagation of Hidden Layer 2 Neurons:
Error(Hidden4) = g’(Hidden4) * (Weight16 * Error(Output1) + Weight19 * Error(Output2))
Backpropagation of Hidden Layer 1 Neurons:
Error(Hidden1) = g’(Hidden1) * (Weight7 * Error(Hidden4) + Weight10 * Error(Hidden5) + Weight13 * Error(Hidden6))
FUNDAMENTALS
Weight updates for Hidden1 (a denotes the learning rate):
BiasWeight1 = BiasWeight1 + a * Error(Hidden1) * 1
Weight1 = Weight1 + a * Error(Hidden1) * Input1
Weight2 = Weight2 + a * Error(Hidden1) * Input2
Weight updates for Hidden4:
BiasWeight4 = BiasWeight4 + a * Error(Hidden4) * 1
Weight7 = Weight7 + a * Error(Hidden4) * Hidden1
Weight8 = Weight8 + a * Error(Hidden4) * Hidden2
Weight9 = Weight9 + a * Error(Hidden4) * Hidden3
Figures: Updating Weights of Hidden Layer 1 – Hidden Layer 2 during forward and backward propagation
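Continuing the hypothetical forward-propagation sketch above, the backpropagation errors and weight updates from these slides can be written as follows. It assumes a sigmoid g, so g'(h) = h * (1 - h) when h is the stored activation; the learning rate a and the target values are made-up numbers.

# Continuation of the forward-propagation sketch above (reuses inputs, hidden1,
# hidden2, outputs, W1..W3, b1..b3); sigmoid g assumed, so g'(h) = h * (1 - h)
a = 0.1                                            # learning rate
true = np.array([1.0, 0.0])                        # True1, True2 (hypothetical targets)

def g_prime(h):
    return h * (1.0 - h)

# Error(Output) = g'(Output) * (True - Output)
err_out = g_prime(outputs) * (true - outputs)

# Error(Hidden) = g'(Hidden) * sum(outgoing weight * downstream error)
err_h2 = g_prime(hidden2) * (W3.T @ err_out)       # e.g. Error(Hidden4)
err_h1 = g_prime(hidden1) * (W2.T @ err_h2)        # e.g. Error(Hidden1)

# Weight updates, e.g. Weight7 = Weight7 + a * Error(Hidden4) * Hidden1
W3 += a * np.outer(err_out, hidden2); b3 += a * err_out
W2 += a * np.outer(err_h2, hidden1); b2 += a * err_h2
W1 += a * np.outer(err_h1, inputs);  b1 += a * err_h1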
FUNDAMENTALS
 Dropout: Dropout is a regularization technique in neural
networks used to avoid overfitting the data. Typically
the keep probability is 0.8 (80% of neurons retained at
random at any time) in the initial layers and 0.5 in the middle layers
 Optimization: Various techniques are used to optimize
the weights, including:
 SGD (Stochastic Gradient Descent)
 Momentum
 NAG (Nesterov Accelerated Gradient)
 Adagrad (Adaptive Gradient)
 Adadelta
 RMSprop
 Adam (Adaptive Moment Estimation)
In practice Adam is a good default choice; if you
can afford full-batch updates, then try out L-BFGS
Figures: Application of Dropout in a Neural Network; Optimization of the Error Surface
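A minimal NumPy sketch of (inverted) dropout with a keep probability of 0.8, as described above; the layer shape is arbitrary and the rescaling by 1/keep_prob is one common convention, not necessarily the one used by any particular library.

import numpy as np

def dropout(activations, keep_prob=0.8, training=True):
    # Inverted dropout: randomly zero out (1 - keep_prob) of the neurons during
    # training and rescale the rest so expected activations match test time
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) < keep_prob) / keep_prob
    return activations * mask

layer_output = np.random.randn(4, 5)               # hypothetical layer activations
print(dropout(layer_output, keep_prob=0.8))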
FUNDAMENTALS
 Stochastic Gradient Descent (SGD): Gradient
descent is a way to minimize an objective function
J(θ), parameterized by a model’s parameters θ ∈ R^d, by
updating the parameters in the opposite direction of
the gradient of the objective function with respect to the
parameters. The learning rate determines the size of the
steps taken to reach the minimum.
 Batch Gradient Descent (all training observations
per iteration)
 SGD (1 observation per iteration)
 Mini-Batch Gradient Descent (about
50 training observations per iteration)
 Momentum: SGD has trouble navigating areas where the surface curves
much more steeply in one dimension than in another; in
these scenarios SGD oscillates across the slopes, and a momentum
term dampens these oscillations and accelerates convergence in the
relevant direction
Figures: Gradient Descent; Comparison of SGD without & with Momentum
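A minimal NumPy sketch of the vanilla SGD and momentum update rules discussed above; the learning rate, momentum coefficient and gradient values are placeholders.

import numpy as np

def sgd_step(theta, grad, lr=0.01):
    # Vanilla SGD: move against the gradient of J(theta)
    return theta - lr * grad

def momentum_step(theta, grad, velocity, lr=0.01, gamma=0.9):
    # Momentum: accumulate a velocity vector that dampens oscillations
    velocity = gamma * velocity + lr * grad
    return theta - velocity, velocity

theta = np.zeros(3)
velocity = np.zeros(3)
grad = np.array([0.2, -0.5, 0.1])                 # hypothetical gradient
theta, velocity = momentum_step(theta, grad, velocity)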
FUNDAMENTALS
• Nesterov Accelerated Gradient (NAG): A ball that rolls down a hill, blindly
following the slope, is highly unsatisfactory; it should have a notion of where it
is going so that it knows to slow down before the hill slopes up again. NAG is a
way to give the momentum term this kind of prescience
• Adagrad: Adagrad is an algorithm for gradient-based optimization that adapts
the learning rate to the parameters, performing larger updates for
infrequent parameters and smaller updates for frequent parameters
Nesterov Momentum Update
FUNDAMENTALS
 Adadelta: Adadelta is an extension of Adagrad that seeks to reduce its
aggressive, monotonically decreasing learning rate. Instead of accumulating all
past squared gradients, Adadelta restricts the window of accumulated past
gradients to some fixed size w
 RMSprop: RMSprop and Adadelta were both developed independently around the
same time to resolve Adagrad’s radically diminishing learning rates
 Adam (Adaptive Moment Estimation): Adam is another method that computes
adaptive learning rates for each parameter. In addition to storing an exponentially
decaying average of past squared gradients like Adadelta and RMSprop, Adam
also keeps an exponentially decaying average of past gradients, similar to
momentum
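A minimal NumPy sketch of a single Adam update using the standard published formulas (decaying averages m and v with bias correction); the hyperparameter values shown are the commonly quoted defaults, used here only for illustration.

import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponentially decaying averages of past gradients (m) and squared gradients (v)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)        # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.zeros(3); m = np.zeros(3); v = np.zeros(3)
grad = np.array([0.2, -0.5, 0.1])        # hypothetical gradient
theta, m, v = adam_step(theta, grad, m, v, t=1)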
Deep Architecture of ANN
(Artificial Neural Network)
 All hidden layers should have the same
number of neurons per layer
 Typically 2 hidden layers are good
enough to solve the majority of problems
 Using scaling/batch normalization
(mean 0, variance 1) for all input variables
and after each layer
improves convergence effectiveness
 Reducing the step size after each
iteration improves convergence, in
addition to the usage of momentum &
Dropout
Decision Boundary of Deep Architecture
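As a rough sketch of these heuristics, the following hypothetical Keras model uses two equal-width hidden layers, batch normalization after each layer, dropout of 0.2 in the early layer and 0.5 in the middle layer, and the Adam optimizer; the input dimension, layer width and binary-classification setup are assumptions for illustration.

# A minimal Keras sketch of the heuristics above (layer sizes and the
# binary-classification setup are assumptions for illustration)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),   # hidden layer 1
    BatchNormalization(),                              # normalize to mean 0, variance 1
    Dropout(0.2),                                      # keep probability ~0.8 in early layers
    Dense(64, activation='relu'),                      # hidden layer 2, same width
    BatchNormalization(),
    Dropout(0.5),                                      # 0.5 in middle layers
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])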
CONVOLUTIONAL NEURAL NETWORKS
 Convolutional Neural Networks are used in
image analysis, including image
captioning, digit recognition and various
visual processing systems, e.g. vision
detection in self-driving cars, handwritten
digit recognition, and Google
DeepMind’s AlphaGo
Figures: Object recognition and classification using Convolutional Networks; CNN application in Self-Driving Cars; CNN application in Handwritten Digit Recognition
CONVOLUTIONAL NEURAL NETWORKS
Figures: Hubel & Wiesel experiments on Cat’s Vision; Formation of features over layers using Neural Networks
• Hubel & Wiesel inserted microscopic electrodes
into the visual cortex of an anesthetized cat to read the
activity of single cells in the visual cortex while
presenting various stimuli to its eyes, in an
experiment in 1959, for which they received the Nobel
Prize in Medicine in 1981. Hubel &
Wiesel discovered that vision is hierarchical,
consisting of simple cells, complex cells & hyper-
complex cells
Object Detection
Vision is Hierarchical phenomenon
CONVOLUTIONAL NEURAL NETWORKS
• The input layer/picture consists of 32 x 32
pixels with 3 color channels (Red, Green & Blue),
i.e. 32 x 32 x 3
• A convolution layer is formed by running a
filter (5 x 5 x 3) over the input layer, which
results in a 28 x 28 x 1 activation map
Figures: Input Layer & Filter; Running filter over Input Layer to form Convolution layer; Complete Convolution Layer from filter
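The 32 → 28 arithmetic above follows the standard convolution output-size formula (W − F + 2P) / S + 1, sketched below; with 6 such filters the stacked output becomes 28 x 28 x 6.

def conv_output_size(input_size, filter_size, padding=0, stride=1):
    # Standard convolution output-size formula: (W - F + 2P) / S + 1
    return (input_size - filter_size + 2 * padding) // stride + 1

# 32 x 32 x 3 input convolved with a 5 x 5 x 3 filter, no padding, stride 1
print(conv_output_size(32, 5))          # 28 -> one 28 x 28 x 1 activation map
# With 6 such filters the stacked output is 28 x 28 x 6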
CONVOLUTIONAL NEURAL NETWORKS
 The 2nd convolution activation map is created in a
similar way with another filter
 After striding/convolving with 6 filters, a
new layer is created with dimensions 28 x 28 x 6
Figures: Complete Convolution layer from Filter 2; Convolution layers created with 6 Filters; Formation of the complete 2nd layer
CONVOLUTIONAL NEURAL NETWORKS
Max pooling working methodology
Max pool layer after performing pooling
• Pooling Layer: The pooling layer makes the
representation smaller and more
manageable. It operates over each
activation map independently. Pooling
is applied on the width and height of the layer,
and the depth remains the same during the
pooling stage
• Padding: The size of the image (width &
height) shrinks with each convolution,
which is undesirable in deep
networks; padding keeps the size of the picture
constant, or controllable in size, throughout
the network
Zero padding on 6 x 6 picture
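A minimal NumPy sketch of 2 x 2 max pooling (stride 2) and zero padding on a 6 x 6 picture, as in the figures above; the input values are arbitrary.

import numpy as np

def max_pool_2x2(activation_map):
    # 2 x 2 max pooling with stride 2: halves width and height, depth unchanged
    h, w = activation_map.shape
    return activation_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def zero_pad(image, pad=1):
    # Pad the width and height with zeros so the spatial size can stay constant
    return np.pad(image, pad_width=pad, mode='constant', constant_values=0)

x = np.arange(36, dtype=float).reshape(6, 6)      # a 6 x 6 "picture"
print(max_pool_2x2(x).shape)                      # (3, 3)
print(zero_pad(x).shape)                          # (8, 8)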
RECURRENT NEURAL NETWORKS
 Recurrent neural networks are very
useful for remembering sequences, time
series forecasting, image captioning,
machine translation, etc.
 RNNs are useful for building A.I. chatbots, in
which sequences of words with all their syntax &
semantics are remembered and answers are
subsequently provided to given
questions
Figures: Recurrent Neural Networks; Image Captioning using Convolutional and Recurrent Neural Networks; Application of RNN in an A.I. Chatbot
RECURRENT NEURAL NETWORKS
• A recurrent neural network processes a
sequence of vectors x by applying a
recurrence formula at every time step
Recurrent Neural Network architectures:
Vanilla Network
Image Captioning (image -> sequence of words)
Sentiment Classification (sequence of words -> sentiment)
Machine Translation (sequence of words -> sequence of words)
Video Classification at the frame level
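A minimal NumPy sketch of the vanilla RNN recurrence, applying the same formula h_t = tanh(W_hh·h_{t-1} + W_xh·x_t), y_t = W_hy·h_t at every time step; the dimensions and random weights are placeholders.

import numpy as np

# Hypothetical dimensions: 4-dimensional inputs, 8-dimensional hidden state, 3 outputs
W_xh = np.random.randn(8, 4) * 0.1
W_hh = np.random.randn(8, 8) * 0.1
W_hy = np.random.randn(3, 8) * 0.1

def rnn_step(x_t, h_prev):
    # The same recurrence formula is applied at every time step
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)
    y_t = W_hy @ h_t
    return h_t, y_t

h = np.zeros(8)
for x_t in np.random.randn(5, 4):                  # a sequence of 5 input vectors
    h, y = rnn_step(x_t, h)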
RECURRENT NEURAL NETWORKS
 Vanishing gradient problem with RNNs:
Gradients vanish quickly as the number
of layers increases, and this issue is severe with RNNs.
Vanishing gradients lead to slow training.
LSTM & GRU are used to avoid this issue
 LSTM (Long Short Term Memory): An LSTM is an
artificial neural network that contains LSTM blocks in
addition to regular network units. An LSTM block
contains gates that determine when the input is
significant enough to remember, when it should
continue to remember or forget
the value, and when it should output the value
Figures: LSTM Working Principle (Backpropagation); LSTM Cell
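A minimal NumPy sketch of one LSTM cell step with input, forget and output gates, matching the gating behaviour described above; the weight layout (a single matrix projecting [h, x] onto the four gates) and the dimensions are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W projects the concatenated [h_prev, x_t] onto the four gate pre-activations
    z = W @ np.concatenate([h_prev, x_t]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
    g = np.tanh(g)                                 # candidate cell values
    c_t = f * c_prev + i * g                       # keep or forget the cell state
    h_t = o * np.tanh(c_t)                         # decide what to output
    return h_t, c_t

hidden, inputs_dim = 8, 4                           # hypothetical sizes
W = np.random.randn(4 * hidden, hidden + inputs_dim) * 0.1
b = np.zeros(4 * hidden)
h, c = lstm_step(np.random.randn(inputs_dim), np.zeros(hidden), np.zeros(hidden), W, b)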
Deep Autoencoders
 Deep Autoencoder: An autoencoder neural network is an unsupervised learning
algorithm that applies backpropagation. Stacking layers of autoencoders produces
a deeper architecture known as a Stacked or Deep Autoencoder
 Applications of autoencoders include face recognition, speech recognition, signal
denoising, etc.
Figures: PCA vs Deep Autoencoder for MNIST Data; Face Recognition using Deep Autoencoders
Deep Autoencoders
• Deep Autoencoder: An autoencoder neural
network is an unsupervised learning
algorithm that applies backpropagation,
setting the target values to be equal to
the inputs, i.e. it uses y(i) = x(i)
• Typically a deep autoencoder is composed of
two segments: an encoding network and a
decoding network.
Figures: Deep Autoencoder Examples; Training a Deep Autoencoder; Autoencoder with Classifier; Reconstruction of features with weight transpose
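A minimal Keras sketch of a deep autoencoder with an encoding network and a decoding network, trained with the targets set equal to the inputs; the 784-dimensional input (e.g. flattened MNIST images) and the layer sizes are assumptions for illustration.

# A minimal Keras sketch of a deep autoencoder (the 784-dimensional input and
# layer sizes are assumptions, e.g. for flattened MNIST images)
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(784,))
encoded = Dense(128, activation='relu')(inputs)     # encoding network
encoded = Dense(32, activation='relu')(encoded)     # compressed representation
decoded = Dense(128, activation='relu')(encoded)    # decoding network
decoded = Dense(784, activation='sigmoid')(decoded) # reconstruction of the input

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# Unsupervised training: the targets are the inputs themselves, y(i) = x(i)
# autoencoder.fit(x_train, x_train, epochs=10, batch_size=128)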