Looking into the Black Box - A Theoretical Insight into Deep Learning Networks
2. What is Deep Learning?
Deep learning is a branch of machine learning based on artificial neural networks. Since neural networks loosely mimic the human brain, deep learning inherits that brain-inspired structure.
Deep learning is a form of representation learning: the automated formation of useful representations from data.
There is a variety of deep learning networks, such as the Multilayer Perceptron (MLP), the Autoencoder (AE), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN).
4. Why is Deep Learning Successful?
Deep learning models are large and deep artificial neural networks. A neural network ("NN") can be represented as a directed acyclic graph.
The input layer takes in signal vectors; one or more hidden layers process the outputs of the previous layer.
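As a rough illustration of that layered, acyclic flow, here is a minimal NumPy sketch of a single forward pass; the layer sizes are made-up choices, not from the slides:

```python
import numpy as np

# One forward pass: input -> hidden layer -> output layer.
rng = np.random.default_rng(0)

x = rng.normal(size=4)            # input signal vector (4 features)
W1 = rng.normal(size=(8, 4))      # hidden layer weights
b1 = np.zeros(8)
W2 = rng.normal(size=(2, 8))      # output layer weights
b2 = np.zeros(2)

h = np.maximum(0, W1 @ x + b1)    # hidden layer: ReLU(W1 x + b1)
y = W2 @ h + b2                   # output layer processes the hidden output

print(y.shape)                    # (2,)
```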
But why does it work now? Why is deep learning more successful now than ever before?
5. Architecture of Deep Learning
[Workflow diagram:
1. Understand the problem and check its feasibility for deep learning.
2. Identify relevant data and prepare it.
3. Choose a deep learning algorithm.
4. Train the algorithm.
5. Test the model's performance.]
7. Why Deep Learning: how do data science techniques scale with the amount of data?
[Plot: performance vs. amount of data. Deep learning keeps improving as data grows, while older learning algorithms plateau.]
8. Why is Deep Learning Successful?
Deep learning networks succeed now for two main reasons:
We have a lot more data.
We have much more powerful computers.
A large and deep neural network has many layers and many nodes in each layer, resulting in many parameters to tune. Without enough data, we cannot learn those parameters effectively. Without powerful computers, learning would be too slow and inefficient.
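To make "many parameters to tune" concrete, here is a small Python sketch that counts the weights and biases of a fully connected network; the layer sizes below are illustrative assumptions:

```python
# A dense layer from n inputs to m outputs has m*n weights plus m biases.
def count_parameters(layer_sizes):
    """Total weights + biases in a fully-connected network."""
    return sum(m * n + m for n, m in zip(layer_sizes, layer_sizes[1:]))

# e.g. a network on 784-dimensional inputs (MNIST-sized images)
print(count_parameters([784, 512, 512, 10]))  # 669,706 parameters
```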
9. Why is Deep Learning Successful?
Neural networks are either encoders, decoders, or a combination of
both:
Encoders find patterns in raw data to form compact, useful
representations.
Decoders generate high-resolution data from those representations.
The generated data is either new examples or descriptive knowledge.
10. Traditional Pattern Recognition: Fixed/Handcrafted Feature Extractor
Mainstream Pattern Recognition: Unsupervised mid-level features
Deep Learning: Representations are hierarchical and trained
[Diagram:
Traditional: Feature Extractor → Trainable Classifier
Mainstream: Feature Extractor → Mid-Level Features → Trainable Classifier
Deep Learning: Low-Level Features → Mid-Level Features → High-Level Features → Trainable Classifier]
12. Feed Forward Neural Networks (FFNNs)
FFNNs, dating back to the 1940s, are networks without any cycles. Data passes from input to output in a single pass, without any "state memory".
Technically, most networks in deep learning can be considered FFNNs, but usually "FFNN" refers to its simplest variant: a densely connected multilayer perceptron (MLP).
Dense encoders are used to map an already compact set of
numbers on the input to a prediction: either a classification (discrete)
or a regression (continuous).
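As a minimal sketch of such a dense encoder (PyTorch is an assumption here; the slides name no framework), an MLP mapping a compact input vector to class scores might look like:

```python
import torch
import torch.nn as nn

# A densely connected MLP; all sizes are illustrative.
mlp = nn.Sequential(
    nn.Linear(16, 64),   # input: a compact vector of 16 numbers
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 3),    # output: scores for 3 classes (discrete)
)

x = torch.randn(1, 16)
logits = mlp(x)          # for regression, the last layer would output 1 continuous value
print(logits.shape)      # torch.Size([1, 3])
```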
13. Feed Forward Neural Networks
[Diagram: input (a few numbers) → dense encoder network → representation → output prediction, compared with the ground-truth prediction.]
14. Convolutional Neural Networks (CNNs)
CNNs are feed forward neural networks that use a spatial-invariance trick to efficiently learn local patterns, most commonly in images. Spatial invariance means that a subject's ear in the top left of an image has the same features as a subject's ear in the bottom right. CNNs share weights across space to make detecting ears and other patterns more efficient.
Instead of using only densely-connected layers, they use
convolutional layers (convolutional encoder). These networks are
used for image classification, object detection, video action
recognition, and any data that has some spatial invariance in its
structure.
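A minimal PyTorch sketch of a convolutional encoder for image classification; the layer sizes are illustrative, and the weight sharing lives in the Conv2d layers, whose kernels slide over every spatial position:

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # shared 3x3 kernels across space
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                     # collapse spatial dimensions
    nn.Flatten(),
    nn.Linear(32, 10),                           # 10-class prediction
)

x = torch.randn(1, 3, 64, 64)   # one 64x64 RGB image
print(cnn(x).shape)             # torch.Size([1, 10])
```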
16. Recurrent Neural Networks (RNNs)
RNNs are networks that have cycles and therefore have “state
memory”. They can be unrolled in time to become feed forward
networks where the weights are shared. Just as CNNs share weights
across “space”, RNNs share weights across “time”. This allows them to
process and efficiently represent patterns in sequential data.
Many variants of RNN modules have been developed, including LSTMs and GRUs, to help learn patterns in longer sequences. Applications include natural language modeling, speech recognition, speech generation, etc.
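A minimal PyTorch sketch of a recurrent encoder, with illustrative sizes; the same LSTM weights are reused at every time step, i.e. shared across "time":

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=32, batch_first=True)

x = torch.randn(1, 20, 8)        # one sequence of 20 steps, 8 features each
outputs, (h_n, c_n) = lstm(x)    # h_n is the final "state memory"
print(outputs.shape, h_n.shape)  # torch.Size([1, 20, 32]) torch.Size([1, 1, 32])
```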
18. Encoder-Decoder Architectures
The FFNNs, CNNs, and RNNs presented in the first three sections are simply networks that make a prediction using a dense encoder, a convolutional encoder, or a recurrent encoder, respectively. These encoders can be combined or swapped depending on the kind of raw data we're trying to form a useful representation of.
The "Encoder-Decoder" architecture is a higher-level concept that builds on the encoding step to, instead of making a prediction, generate a high-dimensional output via a decoding step that upsamples the compressed representation.
Applications include semantic segmentation, machine translation, etc.
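A minimal PyTorch sketch of a convolutional encoder-decoder in the spirit of semantic segmentation; the sizes and the 5-class output are illustrative assumptions:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),   # 64x64 -> 32x32
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32x32 -> 16x16 (compressed)
    nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, 2, stride=2),    # 16x16 -> 32x32 (upsampling)
    nn.ReLU(),
    nn.ConvTranspose2d(16, 5, 2, stride=2),     # 32x32 -> 64x64, 5 classes
)

x = torch.randn(1, 3, 64, 64)
print(decoder(encoder(x)).shape)  # torch.Size([1, 5, 64, 64]): one score per class per pixel
```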
20. Autoencoders
Autoencoders are one of the simpler forms of "unsupervised learning": they take the encoder-decoder architecture and learn to generate an exact copy of the input data. Since the encoded representation is much smaller than the input data, the network is forced to learn how to form the most meaningful representation.
Since the ground truth data comes from the input data, no human
effort is required. In other words, it’s self-supervised.
Applications include unsupervised embeddings, image denoising, etc.
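A minimal PyTorch sketch of an autoencoder on flattened 28x28 inputs; the 32-dimensional bottleneck and the mean-squared-error reconstruction loss are illustrative choices:

```python
import torch
import torch.nn as nn

autoencoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 32),  nn.ReLU(),   # compact encoded representation
    nn.Linear(32, 128),  nn.ReLU(),
    nn.Linear(128, 784),              # reconstruct the input
)

x = torch.rand(16, 784)                            # a batch of inputs
loss = nn.functional.mse_loss(autoencoder(x), x)   # the target IS the input: self-supervised
print(loss.item())
```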
22. Generative Adversarial Networks (GANs)
GANs are a framework for training networks optimized for generating new realistic samples from a particular representation. In its simplest form, the training process involves two networks. One network, the generator, generates new data instances and tries to fool the other network, the discriminator, which classifies images as real or fake.
GANs can generate images from a particular class, map images from one domain to another, and have brought an incredible increase in the realism of generated images.
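A compressed PyTorch sketch of one GAN training step, with stand-in "real" data and illustrative network sizes:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))  # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))  # sample -> real/fake logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2) + 3.0   # stand-in "real" data, not a real dataset
noise = torch.randn(64, 8)

# 1) Train the discriminator to tell real from fake.
fake = G(noise).detach()
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) Train the generator to fool the discriminator.
g_loss = bce(D(G(noise)), torch.ones(64, 1))   # generator wants "real" labels
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```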
23. Generative Adversarial Networks
[Diagram: noise input → Generator network → fake image. Real images serve as the ground truth. The Discriminator network takes real and fake images and outputs a prediction: real or fake. The discriminator is thrown away after training.]
24. Deep Reinforcement Learning (Deep RL)
Deep RL allows us to apply neural networks in simulated or real-world environments where sequences of decisions need to be made. When the learning is done by a neural network, we call it Deep Reinforcement Learning (Deep RL). There are three types of RL frameworks: policy-based, value-based, and model-based; the distinction is what the neural network is tasked with learning.
Applications include game playing, robotics, neural architecture search, and much more.
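A minimal sketch of the policy-based flavor (vanilla REINFORCE, without discounting or a baseline), assuming a Gym-style environment via the gymnasium package, which the slides do not name:

```python
import torch
import torch.nn as nn
import gymnasium as gym   # assumption: a Gym-style simulated environment

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(200):
    obs, _ = env.reset()
    log_probs, rewards = [], []
    done = False
    while not done:
        # The policy network maps an observation to action probabilities.
        dist = torch.distributions.Categorical(logits=policy(torch.tensor(obs)))
        action = dist.sample()
        obs, reward, terminated, truncated, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
        done = terminated or truncated
    # Reinforce the whole episode's decisions in proportion to the total return.
    loss = -torch.stack(log_probs).sum() * sum(rewards)
    opt.zero_grad(); loss.backward(); opt.step()
```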
26. Advantages
Best-in-class performance on many problems.
Reduces the need for feature engineering.
Eliminates unnecessary costs.
Easily identifies defects that are difficult to detect.
27. Applications of Deep Learning Models
Automatic Text Generation – a corpus of text is learned, and new text is generated from the model word-by-word or character-by-character; the model can learn how to spell, punctuate, and form sentences, and may even capture the style of the corpus (see the sketch after this list).
Healthcare – helps in diagnosing and treating various diseases.
Automatic Machine Translation – words, sentences, or phrases in one language are transformed into another language (deep learning is achieving top results for both text and images).
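As referenced above, a toy PyTorch sketch of character-by-character generation; the tiny vocabulary and untrained weights are purely illustrative, showing only the sampling loop:

```python
import torch
import torch.nn as nn

vocab = list("abcdefgh ")                  # toy character vocabulary
embed = nn.Embedding(len(vocab), 16)
lstm = nn.LSTM(16, 32, batch_first=True)
head = nn.Linear(32, len(vocab))

idx = torch.tensor([[0]])                  # start from the character 'a'
state, text = None, "a"
for _ in range(20):
    out, state = lstm(embed(idx), state)   # carry the state forward
    probs = torch.softmax(head(out[:, -1]), dim=-1)
    idx = torch.multinomial(probs, 1)      # sample the next character
    text += vocab[idx.item()]
print(text)   # gibberish until the model is trained on a corpus
```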
28. Applications of Deep Learning Models
Image Recognition – recognizes and identifies people and objects in images, and helps understand content and context; this area is already being used in gaming, retail, tourism, etc.
Predicting Earthquakes – teaches a computer to perform the viscoelastic computations used in predicting earthquakes.
29. To assist you with our services, please reach us at
hello@mitosistech.com
www.mitosistech.com
IND: +91-7824035173
US: +1-(415) 251-2064