A Brief Introduction to Recurrent Neural Networks and Their Applications
1. A Brief Introduction to Recurrent Neural Networks and Their Applications
Qiang Gan
All contents are collected online and listed on the Reference page.
For Nanjing Deep Learning Meetup Only
2. Outline
1. RNN
o Model structure
o Parameters
o Learning algorithm
2. Long-Term Dependencies & Vanishing Gradient Problem
o LSTM / GRU
3. Neural Machine Translation
o Encoder-decoder framework
4. Attention Mechanism
o Extract information needed from source
5. Other RNN applications
o Image captioning
o Question Answering
3. Before we start …
4. Memory
• We are all familiar with the song "Two Tigers"
o Two tigers, two tigers …
• What is the 10th word?
• We learned it as a sequence, a kind of conditional memory.
• More examples: driving steps, movie scenes, …
5. “Memory” in Neural Network
• Traditional Neural Network
o Output relies only on current input
o input -> hidden -> output
• Network with “Memory”
o Output relies on current input and history information
o (input + prev_hidden) -> hidden -> output
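The "(input + prev_hidden) -> hidden -> output" recipe can be sketched in a few lines of NumPy; all dimensions and weight names here are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes for input, hidden state, and output.
input_size, hidden_size, output_size = 3, 4, 2

# Weights: input-to-hidden, hidden-to-hidden (the "memory" path), hidden-to-output.
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))
W_hy = rng.normal(size=(output_size, hidden_size))

def rnn_step(x, h_prev):
    """One step: (input + prev_hidden) -> hidden -> output."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev)  # new hidden state mixes input with history
    y = W_hy @ h                           # output is read from the hidden state
    return h, y

x = rng.normal(size=input_size)
h0 = np.zeros(hidden_size)                 # empty memory before the first step
h1, y1 = rnn_step(x, h0)
```

A traditional feed-forward network is the special case where `W_hh` is absent, so the output depends only on the current input.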
6. “Memory” in Neural Network
• Four Steps in Network with “Memory”
1. (input + empty_hidden) -> hidden -> output
• Memory only contains blue information
2. (input + prev_hidden) -> hidden -> output
• Memory contains blue and red information
3. (input + prev_hidden) -> hidden -> output
• Memory contains blue, red and green information
4. (input + prev_hidden) -> hidden -> output
• Memory contains blue, red, green and purple information
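The four steps above are the same update applied repeatedly; a minimal sketch of the unrolled loop (weights and sizes illustrative), where the hidden state accumulates information from every input seen so far:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden_size, input_size = 4, 3
W_xh = rng.normal(size=(hidden_size, input_size))
W_hh = rng.normal(size=(hidden_size, hidden_size))

xs = rng.normal(size=(4, input_size))   # four inputs ("blue, red, green, purple")
h = np.zeros(hidden_size)               # step 1 starts from an empty hidden state
history = []
for x in xs:
    h = np.tanh(W_xh @ x + W_hh @ h)    # memory now also contains this step's input
    history.append(h)
```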
10. Recurrent Neural Network
• Learning algorithm (Backpropagation Through Time)
o Unfold the RNN over time into a deep network (weights shared across time steps)
o (In the figure: black is the prediction, errors are bright yellow, and derivatives are mustard colored.)
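A toy sketch of Backpropagation Through Time, assuming a plain tanh RNN and a simple squared error on each hidden state (the loss choice and all names are illustrative). The key points from the slide show up directly: the network is unfolded over time in the forward pass, and because the weights are shared, their gradients are summed across steps in the backward pass:

```python
import numpy as np

rng = np.random.default_rng(2)
H, X, T = 3, 2, 4                          # hidden size, input size, sequence length
W_xh = rng.normal(scale=0.5, size=(H, X))
W_hh = rng.normal(scale=0.5, size=(H, H))
xs = rng.normal(size=(T, X))
targets = rng.normal(size=(T, H))

# Forward: unfold the RNN over time, storing every hidden state.
hs = [np.zeros(H)]
for x in xs:
    hs.append(np.tanh(W_xh @ x + W_hh @ hs[-1]))

# Backward (BPTT): walk the unfolded network in reverse.
dW_xh = np.zeros_like(W_xh)
dW_hh = np.zeros_like(W_hh)
dh_next = np.zeros(H)                       # gradient flowing back from step t+1
for t in reversed(range(T)):
    dh = (hs[t + 1] - targets[t]) + dh_next # local error plus error from the future
    dz = dh * (1.0 - hs[t + 1] ** 2)        # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    dW_xh += np.outer(dz, xs[t])            # shared weights: gradients accumulate
    dW_hh += np.outer(dz, hs[t])
    dh_next = W_hh.T @ dz                   # pass the error one step further back
```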
11. Long-Term Dependencies Problem
• Consider trying to predict the last word in the text “I
grew up in France… I speak fluent French.”
• We need the context of France, from further back.
12. Vanishing Gradient Problem
w_1, w_2, … are the weights, b_1, b_2, … are the biases, and C is some cost function.
a_j = σ(z_j), where σ is the activation function, and
z_j = w_j · a_{j−1} + b_j is the weighted input to the neuron.
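Using these definitions, the gradient with respect to an early bias is a long product of factors σ'(z_j) · w_{j+1}. Since the sigmoid's derivative never exceeds 1/4, that product typically shrinks geometrically with depth, which is the vanishing gradient problem. A toy chain of one-neuron layers (depth and weights illustrative) makes this concrete:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)        # peaks at 0.25, at z = 0

# A chain of n one-neuron layers: a_j = sigmoid(w_j * a_{j-1} + b_j).
rng = np.random.default_rng(3)
n = 30
w = rng.normal(size=n)          # typical random weights
b = np.zeros(n)

# Forward pass, storing each weighted input z_j.
a = 1.0
zs = []
for j in range(n):
    z = w[j] * a + b[j]
    zs.append(z)
    a = sigmoid(z)

# Gradient of the final activation wrt the first bias: a product of
# sigmoid'(z_j) terms (each <= 0.25) interleaved with the weights.
grad = 1.0
for j in reversed(range(n)):
    grad *= sigmoid_prime(zs[j])
    if j > 0:
        grad *= w[j]
```

The final `grad` (which would still be multiplied by ∂C/∂a_n in a real cost) is vanishingly small after 30 layers; an unfolded RNN has one such factor per time step, so long-range dependencies are hard to learn.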
14. Long Short-Term Memory
• Standard RNN
• LSTM
o Forget gate, input gate, output gate, cell state
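The four ingredients listed above can be sketched as a single LSTM step in NumPy; stacking the gate weights into one matrix is a common convention, but the shapes and names here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step with forget, input, and output gates plus a cell state.

    W: (4*H, X+H) stacked gate weights, b: (4*H,) biases (illustrative layout).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:H])           # forget gate: how much old cell state to keep
    i = sigmoid(z[H:2 * H])       # input gate: how much new content to write
    o = sigmoid(z[2 * H:3 * H])   # output gate: what to expose as hidden state
    g = np.tanh(z[3 * H:4 * H])   # candidate cell content
    c = f * c_prev + i * g        # cell state: additive update eases gradient flow
    h = o * np.tanh(c)            # hidden state read through the output gate
    return h, c

rng = np.random.default_rng(4)
X, H = 3, 5
W = rng.normal(scale=0.1, size=(4 * H, X + H))
b = np.zeros(4 * H)
h, c = lstm_step(rng.normal(size=X), np.zeros(H), np.zeros(H), W, b)
```

The additive cell-state update `c = f * c_prev + i * g` is the key difference from the standard RNN's `h = tanh(...)`: gradients flow through it without being squashed at every step.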
17. LSTM / GRU
• LSTM vs. GRU (the GRU has fewer parameters)
[1] An Empirical Exploration of Recurrent Network Architectures
[2]Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
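The "fewer parameters" point follows from counting gate blocks: an LSTM has four weight blocks (forget, input, output, candidate) and a GRU has three (reset, update, candidate), so at the same sizes the GRU needs exactly 3/4 as many parameters. A quick sketch, ignoring implementation-specific extras such as duplicated biases:

```python
def lstm_params(input_size, hidden_size):
    # 4 gate blocks, each: hidden_size x (input_size + hidden_size) weights + biases.
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

def gru_params(input_size, hidden_size):
    # 3 gate blocks: one block fewer than the LSTM.
    return 3 * (hidden_size * (input_size + hidden_size) + hidden_size)

# Example (illustrative sizes): GRU uses exactly 3/4 of the LSTM's parameters.
lstm_n = lstm_params(256, 512)
gru_n = gru_params(256, 512)
```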
18. Neural Machine Translation
• Encoder-decoder framework
o Input reversing: "Sequence to Sequence Learning with Neural Networks"
o Input doubling: "Learning to Execute"
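The two preprocessing tricks mentioned above are simple sequence transformations; a toy sketch (token lists stand in for real tokenized sentences). Reversing the source puts early target words closer to their corresponding source words, shortening the paths gradients must travel:

```python
def reverse_source(src_tokens):
    """Input reversing, as used in Sequence to Sequence Learning with Neural Networks."""
    return list(reversed(src_tokens))

def double_source(src_tokens):
    """Input doubling, as explored in Learning to Execute."""
    return list(src_tokens) + list(src_tokens)

reverse_source(["I", "speak", "French"])  # -> ["French", "speak", "I"]
```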
19. Attention Mechanism in NMT
Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
20. Visualization of Attention Matrix
• Translating from English to French
• Elements in each row add up to 1
• in grayscale (0: black, 1: white)
• Alignments found, e.g.
o La Syrie -> Syria
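The rows-sum-to-1 property comes from applying a softmax to the alignment scores: row t of the attention matrix is a probability distribution over source positions for target word t, which is why it can be rendered directly as a grayscale image. A minimal sketch with random scores (shapes illustrative):

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(5)
tgt_len, src_len = 3, 5
scores = rng.normal(size=(tgt_len, src_len))  # alignment scores (illustrative)
A = softmax(scores)  # attention matrix: row t weights the source for target word t
```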
Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015
21. RNN Applications
• Image captioning
o Encode the image with a CNN, and decode the embedded information into a description with an RNN.
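A toy sketch of the decode side, assuming the CNN encoder's output is already a feature vector (every weight, size, and token id here is illustrative, with random untrained weights): the image embedding initializes the RNN's hidden state, which then greedily emits one token per step.

```python
import numpy as np

rng = np.random.default_rng(6)
feat, H, vocab = 8, 6, 5                # CNN feature size, hidden size, toy vocabulary

W_init = rng.normal(size=(H, feat))     # projects the CNN image feature into h0
W_hh = rng.normal(size=(H, H))
W_emb = rng.normal(size=(H, vocab))     # stands in for a word-embedding table
W_out = rng.normal(size=(vocab, H))

image_feature = rng.normal(size=feat)   # stand-in for a CNN encoder's output

# Greedy decoding: the RNN turns the image embedding into a token sequence.
h = np.tanh(W_init @ image_feature)
token = 0                               # assume id 0 is a <start> token
caption = []
for _ in range(4):
    x = W_emb[:, token]
    h = np.tanh(x + W_hh @ h)
    token = int(np.argmax(W_out @ h))   # pick the most likely next word
    caption.append(token)
```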
Fei-Fei Li, Stanford Vision Lab
22. RNN Applications
• Question answering
o Encode the document and query with RNNs, and predict the answer token.
Teaching Machines to Read and Comprehend. NIPS 2015
Attentive Reader
23. Summary
1. RNN
o Model structure
o Parameters
o Learning algorithm
2. Long-Term Dependencies & Vanishing Gradient Problem
o LSTM / GRU
3. Neural Machine Translation
o Encoder-decoder framework
4. Attention Mechanism
o Extract information needed from source
5. Other RNN applications
o Image captioning
o Question Answering
24. Reference
1. Anyone Can Learn To Code an LSTM-RNN in Python
2. Recurrent Neural Network Tutorial, WILDML
3. Attention and Memory in Deep Learning and NLP, WILDML
4. Neural Networks and Deep Learning
5. Understanding LSTM Networks
6. Sequence to Sequence Learning with Neural Networks. NIPS 2014
7. Teaching Machines to Read and Comprehend. NIPS 2015