7. Deep RL for Pong:
▪ Three possible actions: {UP, STILL, DOWN}.
▪ Policy Gradient algorithm: maintains probabilities for each action.
▪ Probabilities: softmax from raw pixels with a NN. Weights: random init.
▪ At each step, decide the action by sampling the softmax probabilities.
▪ A good final outcome (win) increases the probabilities of ALL the actions
chosen; a loss decreases all of them. Updated with gradient descent (or RMSProp).
Links: https://karpathy.github.io/2016/05/31/rl/
https://gist.github.com/greydanus/5036f784eec2036252e1990da21eda18
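The sampling step and the policy-gradient signal above can be sketched in NumPy. This is a minimal illustration, not Karpathy's full script: the network sizes (80×80 input, 200 hidden units) and the ReLU hidden layer are assumptions for the sketch, and only the output-layer gradient is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

ACTIONS = ["UP", "STILL", "DOWN"]
D = 80 * 80          # flattened, preprocessed frame (assumed size)
H = 200              # hidden units (assumed size)

# Weights: random init, as in the slide
W1 = rng.standard_normal((H, D)) / np.sqrt(D)
W2 = rng.standard_normal((len(ACTIONS), H)) / np.sqrt(H)

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def policy_forward(x):
    """Raw pixels -> softmax probabilities over the three actions."""
    h = np.maximum(0, W1 @ x)          # ReLU hidden layer (assumption)
    return softmax(W2 @ h), h

# One step: sample an action from a (fake) frame
x = rng.random(D)
p, h = policy_forward(x)
a = rng.choice(len(ACTIONS), p=p)      # sample the softmax probabilities

# After the episode ends with reward r (+1 win, -1 loss), the gradient of
# log p(a) scaled by r pushes ALL chosen actions' probabilities up on a win
# and down on a loss.
r = +1.0
dlogits = -p
dlogits[a] += 1.0                      # d log-softmax / d logits
dW2 = r * np.outer(dlogits, h)         # output-layer policy gradient
```

A real training loop would accumulate `dW2` (and the hidden-layer gradients) over many episodes and apply them with RMSProp, as the links above do.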
9. Deep RL for Neural Nets:
▪ Controller: two-layer LSTM with 35 hidden units each.
▪ Child: multi-layer convolutional neural network (CNN).
▪ Possible actions: filters in {24, 36, 48, 64}, filter height in {1, 3, 5, 7}, etc.
▪ As in the Pong example, the actions are decided sequentially by sampling the
softmax probabilities (à la np.random.choice) for each feature and moving
to the next. This determines the CNN child architecture.
▪ Training: 45,000 CIFAR images. Accuracy R: 5,000 validation images.
▪ Policy Gradient algorithm: REINFORCE (but other choices are possible).
▪ Obtained results on CIFAR-10 are state of the art, with a 3.65% error rate.
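The sequential sampling that builds the child architecture can be sketched as below. The search space follows the slide's examples; the `filter_width` choices and the uniform stand-in logits are assumptions (in the real controller, each step's logits come from the LSTM state).

```python
import numpy as np

rng = np.random.default_rng(0)

# Search space per layer, following the slide's examples
# (filter_width is an assumed extra feature for illustration)
SEARCH_SPACE = {
    "num_filters":   [24, 36, 48, 64],
    "filter_height": [1, 3, 5, 7],
    "filter_width":  [1, 3, 5, 7],
}

def sample_layer(space, logits_fn):
    """Sample one layer's hyperparameters, one feature at a time."""
    layer = {}
    for name, choices in space.items():
        logits = logits_fn(name, len(choices))
        p = np.exp(logits - logits.max())
        p /= p.sum()                                   # softmax
        layer[name] = choices[rng.choice(len(choices), p=p)]
    return layer

# Stand-in for the LSTM controller: uniform logits at every step
uniform = lambda name, n: np.zeros(n)

# Sampling 3 layers in sequence determines one child CNN architecture
child = [sample_layer(SEARCH_SPACE, uniform) for _ in range(3)]
```

REINFORCE then treats the child's validation accuracy R as the reward and nudges the controller's probabilities toward architecture choices that scored well, exactly as the win/loss signal did in the Pong example.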
16. InstaDeep’s platform
▪ Sees neural networks as a graph.
▪ Optimizers, too, are a graph.
▪ For example, the graph on the right
describes a 4-layer neural net of
respectively 256, 256, 128 and 10 units,
with a non-linear ReLU function on the
first layer.
▪ The graph on the right describes
an Adam optimizer with parameters
beta1 = 0.5 and beta2 = 0.9 respectively.
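The networks-and-optimizers-as-graphs idea can be sketched with plain Python dictionaries. This is an illustrative representation only, assuming a nodes/edges layout; it is not InstaDeep's actual format, but it encodes the same two graphs the slide describes.

```python
# Neural net as a graph: the 4 layers from the slide, with ReLU on the first
net_graph = {
    "nodes": [
        {"id": "dense1", "type": "Dense", "units": 256, "activation": "relu"},
        {"id": "dense2", "type": "Dense", "units": 256},
        {"id": "dense3", "type": "Dense", "units": 128},
        {"id": "out",    "type": "Dense", "units": 10},
    ],
    "edges": [("dense1", "dense2"), ("dense2", "dense3"), ("dense3", "out")],
}

# The optimizer is a graph too: here a single Adam node with the slide's betas
opt_graph = {
    "nodes": [{"id": "adam", "type": "Adam", "beta1": 0.5, "beta2": 0.9}],
    "edges": [],
}

def layer_sizes(graph):
    """Read the layer widths back out of the graph representation."""
    return [n["units"] for n in graph["nodes"]]
```

Treating both the model and the optimizer as graphs means an architecture-search controller (like the one in item 9) can manipulate either one with the same sampling machinery.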