With TensorFlow, deep machine learning transitions from an area of research to mainstream software engineering. In this session, we'll work together to construct and train a neural network that recognises handwritten digits. Along the way, we'll discover some of the "tricks of the trade" used in neural network design, and finally, we'll bring the recognition accuracy of our model above 99%.
Lucio Floretta - TensorFlow and Deep Learning without a PhD - Codemotion Milan 2017
1. TensorFlow and Deep Learning
without a PhD
Lucio Floretta
CODEMOTION MILAN - SPECIAL EDITION
10 – 11 NOVEMBER 2017
2. ?
MNIST = Modified National Institute of Standards and Technology - Download the dataset at http://yann.lecun.com/exdb/mnist/
Hello World: handwritten digits classification - MNIST
3. Very simple model: softmax classification
28x28
pixels
softmax
...
...
0 1 2 9
Logit := weighted sum
of all pixels + bias
neuron outputs
784 pixels
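The softmax classifier above can be sketched in a few lines of plain NumPy (an assumption here; the deck itself uses TensorFlow). The weights and biases below are random stand-ins, not trained values — the point is only the shape of the computation: a logit is a weighted sum of all 784 pixels plus a bias, one per class, and softmax turns the 10 logits into probabilities.

```python
import numpy as np

# One flattened 28x28 image -> 10 class probabilities.
# W and b are random placeholders, not trained values.
rng = np.random.default_rng(0)
x = rng.random(784)                    # one image, flattened to 784 pixels
W = rng.standard_normal((784, 10)) * 0.01
b = np.zeros(10)

logits = x @ W + b                     # weighted sum of all pixels + bias, per class
exp = np.exp(logits - logits.max())    # shift by max for numerical stability
y = exp / exp.sum()                    # softmax: 10 probabilities summing to 1
```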
4. In matrix notation, 100 images at a time:
L = X · W + b
X : 100 images, one per line, flattened (784 pixels each)
W : weight matrix, 784 lines by 10 columns (w0,0 … w783,9)
b : the same 10 biases (b0 … b9), broadcast on all lines
L : logits, one line of 10 weighted sums per image (L0,0 … L99,9)
5. Softmax, on a batch of images
Y[100, 10] = softmax( X[100, 784] · W[784, 10] + b[10] )
Predictions Y, Images X, Weights W, Biases b - tensor shapes in [ ]
matrix multiply; biases broadcast on all lines; softmax applied line by line
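The batch formula can be checked shape-by-shape in NumPy (an assumption; the deck uses TensorFlow): the matrix multiply produces 100 lines of 10 logits, the 10 biases are broadcast onto every line, and softmax is applied line by line.

```python
import numpy as np

# Batch version: 100 flattened images at once.
# Shapes follow the slide: X[100, 784], W[784, 10], b[10].
rng = np.random.default_rng(1)
X = rng.random((100, 784))
W = rng.standard_normal((784, 10)) * 0.01
b = np.zeros(10)

L = X @ W + b                               # matrix multiply; b broadcast on all 100 lines
exp = np.exp(L - L.max(axis=1, keepdims=True))
Y = exp / exp.sum(axis=1, keepdims=True)    # softmax applied line by line
```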
6. Now in TensorFlow (Python)
Y = tf.nn.softmax(tf.matmul(X, W) + b)
tensor shapes: X[100, 784] W[784,10] b[10]
matrix multiply
broadcast
on all lines
10. TensorFlow - initialisation
import tensorflow as tf
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()
this will become the batch size
28 x 28 grayscale images
Training = computing variables W and b
11. # model
Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, 784]), W) + b)
# placeholder for correct answers
Y_ = tf.placeholder(tf.float32, [None, 10])
# loss function
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))
TensorFlow - success metrics
“one-hot” encoded
flattening images
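The cross-entropy loss is small when the predicted probability for the correct (one-hot encoded) class is near 1, and large when it is near 0. A tiny hand-made NumPy example (numpy assumed available) mirrors `-tf.reduce_sum(Y_ * tf.log(Y))`:

```python
import numpy as np

# Two classes; the true class is 1 (one-hot), the model predicts 50/50.
Y_ = np.array([[0.0, 1.0]])     # "correct answers", one-hot encoded
Y  = np.array([[0.5, 0.5]])     # predicted probabilities
cross_entropy = -np.sum(Y_ * np.log(Y))   # only the correct class's term survives
```

A confident correct prediction (e.g. `Y = [[0.01, 0.99]]`) would give a loss near 0 instead.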
12. TensorFlow - training
optimizer = tf.train.GradientDescentOptimizer(0.005)
train_step = optimizer.minimize(cross_entropy)
learning rate
loss function
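What `GradientDescentOptimizer.minimize` does at each step can be hand-rolled for this particular model, since softmax + cross-entropy has a well-known closed-form gradient. This NumPy sketch (numpy, the synthetic data, and the mean-reduced loss are all assumptions for illustration) runs a single descent step and watches the loss drop:

```python
import numpy as np

# One gradient-descent step on a tiny softmax classifier:
# 4 "images" of 8 pixels, 3 classes, mean cross-entropy loss.
rng = np.random.default_rng(2)
X  = rng.random((4, 8))
Y_ = np.eye(3)[[0, 1, 2, 0]]          # one-hot labels
W  = np.zeros((8, 3))
b  = np.zeros(3)
lr = 0.05                             # learning rate

def forward(W, b):
    L = X @ W + b
    exp = np.exp(L - L.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def loss(Y):
    return -np.sum(Y_ * np.log(Y)) / len(X)   # mean cross-entropy

before = loss(forward(W, b))          # = log(3): W=0 gives uniform predictions
Y = forward(W, b)
grad_W = X.T @ (Y - Y_) / len(X)      # closed-form gradient w.r.t. W
grad_b = (Y - Y_).mean(axis=0)        # closed-form gradient w.r.t. b
W -= lr * grad_W                      # the descent step
b -= lr * grad_b
after = loss(forward(W, b))           # smaller than `before`
```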
13. sess = tf.Session()
sess.run(init)
for i in range(1000):
# load batch of images and correct answers
batch_X, batch_Y = mnist.train.next_batch(100)
train_data={X: batch_X, Y_: batch_Y}
# train
sess.run(train_step, feed_dict=train_data)
TensorFlow - run !
running a TensorFlow computation, feeding placeholders
14. import tensorflow as tf
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()
# model
Y=tf.nn.softmax(tf.matmul(tf.reshape(X,[-1, 784]), W) + b)
# placeholder for correct answers
Y_ = tf.placeholder(tf.float32, [None, 10])
# loss function
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))
TensorFlow - full python code
optimizer = tf.train.GradientDescentOptimizer(0.005)
train_step = optimizer.minimize(cross_entropy)
sess = tf.Session()
sess.run(init)
for i in range(10000):
# load batch of images and correct answers
batch_X, batch_Y = mnist.train.next_batch(100)
train_data={X: batch_X, Y_: batch_Y}
# train
sess.run(train_step, feed_dict=train_data)
initialisation
model
success metrics
training step
Run
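The success metric the talk tracks while this loop runs is classification accuracy: the fraction of images whose most probable class matches the one-hot label. A NumPy sketch of that metric (numpy and the hand-made predictions are assumptions; a TF1 version would combine `tf.argmax`, `tf.equal` and `tf.reduce_mean` in the same way):

```python
import numpy as np

# Three predictions vs. three one-hot labels.
Y  = np.array([[0.8, 0.1, 0.1],     # predicted: class 0
               [0.2, 0.7, 0.1],     # predicted: class 1
               [0.3, 0.3, 0.4]])    # predicted: class 2
Y_ = np.array([[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]], dtype=float)

correct  = np.argmax(Y, axis=1) == np.argmax(Y_, axis=1)
accuracy = correct.mean()           # fraction of correct predictions
```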
39. Martin Görner
Google Developer relations
@martin_gorner
plus.google.com/+MartinGorner
goo.gl/pHeXe7
youtu.be/qyvlt7kiQoI
goo.gl/mVZloU
github.com/martin-gorner/tensorflow-mnist-tutorial
goo.gl/UuN41S
youtu.be/vq2nnJ4g6N0
github.com/martin-gorner/tensorflow-rnn-shakespeare
Where to go next
Cartoon images copyright:
alexpokusay / 123RF stock photos
40. TensorFlow : ML for Everyone
It scales from research to production. It supports many platforms.
Internal launch
Search, Gmail, Translate, Maps, Android,
Photos, Speech, YouTube, Play and many others
+
100s of research projects and papers
iOS
Android
TPU
GPU
CPU Compute Engine
Cloud Machine
Learning Engine
Cloud Vision API
Cloud Speech API
Natural Language API
Google Translate API
Video Intelligence API
Cloud Jobs API
PRIVATE BETA
PRIVATE ALPHA