1. Problem Statement
The goal is to extract feature representations from a Denoising Autoencoder (DAE) and use them to enhance predictive accuracy in recognizing handwritten digits from the MNIST dataset. The pipeline is as follows:
Load, reshape, scale, and add noise to the data,
Train the DAE on the merged training and testing data,
Extract neuron outputs from the DAE as new features,
Train a classification algorithm on the new features.
2. Motivation
In many machine learning architectures, the encoding of data plays no central role, but in autoencoders encoding and decoding are fundamental. This motivated me to work on understanding and analyzing autoencoders and to apply what I learned to denoising images.
3. Introduction
Autoencoders (AE) belong to the neural network family. The idea is fairly simple: an AE is trained to reproduce its input at its output, which is why we can classify it as an unsupervised machine learning algorithm. The AE compresses the input data into a latent-space representation and then reconstructs the output from it. The algorithm can be divided into two parts:
Encoder – compresses the input data into a lower-dimensional representation, sometimes called the latent-space representation,
Decoder – decompresses the representation to reconstruct the input as closely as possible.
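A minimal sketch of this two-part structure might look as follows (this assumes the Keras functional API and an arbitrary 784-to-32 compression, neither of which is fixed at this point in the text):

```python
from keras.layers import Input, Dense
from keras.models import Model

# Encoder: compress the 784-dimensional input to a 32-dimensional
# latent-space representation.
inputs = Input(shape=(784,))
latent = Dense(32, activation='relu')(inputs)
encoder = Model(inputs, latent)

# Decoder: reconstruct the input from the latent representation.
latent_inputs = Input(shape=(32,))
reconstruction = Dense(784, activation='sigmoid')(latent_inputs)
decoder = Model(latent_inputs, reconstruction)

# The autoencoder chains encoder and decoder and is trained to
# reproduce its own input.
autoencoder = Model(inputs, decoder(encoder(inputs)))
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
```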
4. Where Autoencoders Are Used
AEs are currently used in image and sound compression and in dimensionality reduction. In specific cases they can provide more interesting or efficient data projections than other dimensionality reduction techniques. Moreover, an extension of the AE, the Denoising Autoencoder, is used in representation learning, which uses not only training but also testing data to engineer features.
5. Denoising Autoencoder
While the main purpose of a basic AE is to compress data and reduce its dimensionality, DAEs serve another practical application. Imagine a set of low-quality images corrupted by some noise. Is it possible to remove the noise from these images using machine learning algorithms?
In the following example we show how to clean handwritten MNIST digits of Gaussian random noise.
6. Constructing a Denoising Autoencoder
To introduce Gaussian random noise, we add noise drawn from a normal distribution to the pixel values. A noise factor controls the noisiness of the images, and we clip the values to make sure that the elements of the feature vector representing an image stay between 0 and 1.
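A minimal sketch of that step, assuming x_train and x_test already hold the scaled image data and that noise_factor is 0.5 (the original listing is not reproduced in the text):

```python
import numpy as np

noise_factor = 0.5  # assumed value; controls the noisiness of the images
x_train_noisy = x_train + noise_factor * np.random.normal(
    loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(
    loc=0.0, scale=1.0, size=x_test.shape)

# Clip so that every element of the feature vector stays in [0, 1].
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)
x_test_noisy = np.clip(x_test_noisy, 0.0, 1.0)
```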
We use a basic neural network to encode the 784 input features into 32 neurons with the rectifier activation function, and then decode them back to 784 neurons outputting values in the range [0, 1] thanks to the sigmoid activation function. The only difference from a basic AE is that training is done on the noisy input samples, with the clean samples as targets.
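A sketch of that model; the 784-32-784 layout, the activations, and the noisy-input training follow the text, while the optimizer and epoch count here are assumptions:

```python
from keras.layers import Input, Dense
from keras.models import Model

# 784 inputs -> 32-neuron bottleneck (rectifier) -> 784 outputs (sigmoid).
input_img = Input(shape=(784,))
encoded = Dense(32, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# Training uses the noisy samples as inputs and the clean ones as targets.
autoencoder.fit(x_train_noisy, x_train,
                epochs=50,       # assumed training budget
                batch_size=128, shuffle=True)
```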
The results on the test data are not satisfying: the DAE reconstructed the digit 4 as a digit resembling a 9 (fourth image from the left). We could experiment with the number of epochs, the batch size, and other parameters to try to get better results, but instead we turn to Convolutional Neural Networks, which are notably successful in image processing.
7. Constructing a Convolutional Denoising Autoencoder
Now we will merge the concept of the Denoising Autoencoder with Convolutional Neural Networks.
The input data for the CNN are represented by matrices, not vectors as in a standard fully-connected neural network, so we reshape the arrays accordingly, as shown below. After this operation, the dimensions of the x_train and x_test arrays are num_samples x 28 x 28 x 1. The last dimension (depth/channel) is added because the convolutional layers in Keras expect 3D tensors per sample.
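A sketch of the loading and reshaping step, assuming the standard keras.datasets.mnist loader:

```python
import numpy as np
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale pixel values to [0, 1] and add the depth/channel dimension,
# turning each sample into a 28 x 28 x 1 tensor.
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
print(x_train.shape)  # (60000, 28, 28, 1)
```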
Then we add noise to the training and testing data in the same way as for the previous basic DAE.
We define the CNN architecture with a function, sketched below. The pooling layers in the encoder downsample the input by half using the max-pooling operation.
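A sketch of such a function; only the pooling/upsampling behaviour, the leaky rectifier, and the final sigmoid layer are fixed by the text, so the filter counts and kernel sizes here are assumptions:

```python
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, LeakyReLU
from keras.models import Model

def build_conv_dae():
    input_img = Input(shape=(28, 28, 1))

    # Encoder: each MaxPooling2D halves the spatial dimensions.
    x = Conv2D(32, (3, 3), padding='same')(input_img)
    x = LeakyReLU(alpha=0.1)(x)
    x = MaxPooling2D((2, 2), padding='same')(x)        # 28x28 -> 14x14
    x = Conv2D(32, (3, 3), padding='same')(x)
    x = LeakyReLU(alpha=0.1)(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)  # 14x14 -> 7x7

    # Decoder: each UpSampling2D doubles the spatial dimensions.
    x = Conv2D(32, (3, 3), padding='same')(encoded)
    x = LeakyReLU(alpha=0.1)(x)
    x = UpSampling2D((2, 2))(x)                        # 7x7 -> 14x14
    x = Conv2D(32, (3, 3), padding='same')(x)
    x = LeakyReLU(alpha=0.1)(x)
    x = UpSampling2D((2, 2))(x)                        # 14x14 -> 28x28

    # Final convolutional layer with sigmoid outputs the decoded
    # image with pixel values in [0, 1].
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

    model = Model(input_img, decoded)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model
```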
The upsampling layers in the decoder restore the input size. Finally, the last convolutional layer with a sigmoid activation function outputs the decoded image.
We train the model for 40 epochs with batch_size=128. The leaky rectifier is used to fix a limitation of the standard rectifier: the standard rectifier sometimes causes a neuron to get stuck at zero and never be activated again in subsequent gradient descent iterations. The leaky rectifier solves that problem with the activation function f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where α is a small positive constant.
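A sketch of that training step, again with noisy inputs and clean targets (passing the test data for monitoring is an assumption here):

```python
# Train the convolutional DAE on noisy inputs with clean targets.
conv_dae = build_conv_dae()
conv_dae.fit(x_train_noisy, x_train,
             epochs=40, batch_size=128, shuffle=True,
             validation_data=(x_test_noisy, x_test))  # monitoring assumed
```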
The outputs of the Convolutional Denoising Autoencoder on the test data are much better than before.
8. Implementation
We will extract feature representations from the Denoising Autoencoder and try to enhance predictive accuracy in recognizing handwritten digits from the MNIST dataset. The pipeline is as follows:
Load, reshape, scale, and add noise to the data,
Train the DAE on the merged training and testing data,
Extract neuron outputs from the DAE as new features,
Train a classification algorithm on the new features.
We import the libraries, then load and preprocess the data; a sketch of this step follows below.
Autoencoders belong to the unsupervised machine learning algorithms, in which we do not care about the labels in the data, so we can use both the training and testing data for Representation Learning.
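A sketch of the preprocessing for this fully-connected setting, where the images are flattened to 784-dimensional vectors (the noise_factor value is again an assumption):

```python
import numpy as np
from keras.datasets import mnist

# Flatten each 28x28 image to a 784-dimensional vector, scaled to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(len(x_train), 784).astype('float32') / 255.0
x_test = x_test.reshape(len(x_test), 784).astype('float32') / 255.0

# Noisy copies for DAE training, as in the earlier sections.
noise_factor = 0.5  # assumed value
x_train_noisy = np.clip(
    x_train + noise_factor * np.random.normal(size=x_train.shape), 0.0, 1.0)
x_test_noisy = np.clip(
    x_test + noise_factor * np.random.normal(size=x_test.shape), 0.0, 1.0)
```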
Then we define the architecture of the Denoising Autoencoder. Let's keep it simple: instead of Convolutional Neural Networks, we use a deep fully-connected network with 3 hidden layers, each containing 1024 neurons. We use the rectifier as the activation function and a sigmoid in the output layer to produce pixel values in the range [0, 1].
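A sketch of that architecture; the layer names are hypothetical, introduced here only so the hidden activations can be extracted later:

```python
from keras.layers import Input, Dense
from keras.models import Model

# 784 -> 1024 -> 1024 -> 1024 -> 784 deep Denoising Autoencoder.
input_img = Input(shape=(784,))
h1 = Dense(1024, activation='relu', name='hidden1')(input_img)
h2 = Dense(1024, activation='relu', name='hidden2')(h1)
h3 = Dense(1024, activation='relu', name='hidden3')(h2)
output_img = Dense(784, activation='sigmoid')(h3)  # pixel values in [0, 1]

dae = Model(input_img, output_img)
dae.compile(optimizer='adam', loss='binary_crossentropy')
```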
We train the Denoising Autoencoder for 40 epochs with batch_size=128. The validation_data argument for monitoring validation loss is not provided, so we need to be careful not to overfit the Autoencoder. On the other hand, since we are using both the training and testing data, which together are a good representation of the population data, the chances of overfitting are smaller.
The next step is to extract features from the pretrained Autoencoder. As mentioned earlier, we take the outputs of the neurons located in all hidden layers (the encoder, bottleneck, and decoder layers) as the new representation of the data.
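A sketch of the training on the merged data, plus a feature-extraction model built on the hypothetical layer names from above:

```python
import numpy as np
from keras.models import Model

# Train on the merged noisy training and testing data; the clean
# images are the reconstruction targets.
x_all_noisy = np.concatenate([x_train_noisy, x_test_noisy])
x_all_clean = np.concatenate([x_train, x_test])
dae.fit(x_all_noisy, x_all_clean, epochs=40, batch_size=128, shuffle=True)

# A model that returns the activations of all three hidden layers.
feature_model = Model(
    inputs=dae.input,
    outputs=[dae.get_layer(name).output
             for name in ('hidden1', 'hidden2', 'hidden3')])
```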
Note that the noisy data was used only during Autoencoder training, to improve the quality of the representation. The clean versions of the training and testing data are passed through the Autoencoder network to produce the new representations features_train and features_test, respectively. These representations have a higher dimensionality (1024 + 1024 + 1024 = 3072 > 784) than the original data, allowing them to encode more information. Moreover, the Autoencoder automatically decided for us which features are important, because it was trained with the goal of reconstructing the input as faithfully as possible.
We use these representations in the classification task of recognizing handwritten digits. The new features can be used with any classification algorithm (random forests, support vector machines, and so on).
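A sketch of producing features_train and features_test from the clean data:

```python
import numpy as np

# Concatenate the three 1024-dimensional hidden activations into a
# single 3072-dimensional feature vector per sample.
features_train = np.concatenate(feature_model.predict(x_train), axis=1)
features_test = np.concatenate(feature_model.predict(x_test), axis=1)
print(features_train.shape)  # (60000, 3072)
```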
There are ten possible classes to predict (the digits 0 to 9), and we produce a one-hot encoding of the labels with the np_utils.to_categorical function. We train the model for 20 epochs with batch_size=128. The ModelCheckpoint callback is used to monitor validation accuracy after each epoch and save the model with the best performance.
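A sketch of that classifier; the one-hot encoding, epoch count, batch size, and ModelCheckpoint usage follow the text, while the hidden-layer size and optimizer here are assumptions:

```python
from keras.layers import Input, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils

# One-hot encode the labels for the ten classes (digits 0-9).
y_train_cat = np_utils.to_categorical(y_train, 10)
y_test_cat = np_utils.to_categorical(y_test, 10)

# Hypothetical classifier on top of the 3072-dimensional DAE features.
inp = Input(shape=(3072,))
hidden = Dense(1024, activation='relu')(inp)
out = Dense(10, activation='softmax')(hidden)
clf = Model(inp, out)
clf.compile(optimizer='adam', loss='categorical_crossentropy',
            metrics=['accuracy'])

# Save the weights of the best-performing model after each epoch.
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_acc',
                             save_best_only=True)
clf.fit(features_train, y_train_cat,
        epochs=20, batch_size=128,
        validation_data=(features_test, y_test_cat),
        callbacks=[checkpoint])
```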
Just after the first epoch, our validation accuracy is already 99.3%! Eventually we end up with 99.71% accuracy, compared to 99.5% accuracy for the same model architecture trained on the original features. Admittedly, the MNIST dataset used here for presentation purposes is relatively simple, and in more complex cases the gain could be higher.
9. Conclusions
In this work we constructed Denoising Autoencoders, including one based on Convolutional Neural Networks, and learned the purpose and implementation of Representation Learning with Denoising Autoencoders.