Generating Sequences
using Deep LSTMs & RNNs
Andre Pemmelaar @QuantixResearch
Julia Tokyo Meetup - April 2015
About Me
Andre Pemmelaar
• 5 yrs Financial System Solutions
• 12 yrs Buy-Side Finance
• 7 yrs Japanese Gov’t Bond Options Market Maker (JGBs)
• 5 yrs Statistical Arbitrage (Global Equities)
• Low-latency & quantitative algorithms
• Primarily use a mixture of basic statistics and machine learning (Java, F#, Python, R)
• Using Julia for most of my real work (90%) since July 2014
• Can be reached at @QuantixResearch
Why my interest in LSTMs & RNNs
• In my field, finance, so much of the work involves sequence models.
• Most deep learning models are not built for use with sequences. You have to jury-rig them to make it work.
• RNNs and LSTMs are specifically designed to work with sequence data.
• Sequence models can be combined with Reinforcement Learning to produce some very nice results (more on this and a demo later)
• They have begun producing amazing results.
    • Better initialization procedures
    • Use of Rectified Linear Units for RNNs and “Memory cells” in LSTMs
So what is a Recurrent
Neural Network?
In a word … Feedback
What are Recurrent Neural Networks?
1. In their simplest form (RNNs), they are just neural networks with a feedback loop.
2. The previous time step’s hidden layer and final outputs are fed back into the network as part of the input to the next time step’s hidden layers.
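The feedback loop described above can be sketched in a few lines of Julia. This is a minimal illustration rather than code from any package: the `SimpleRNN` type and the `rnn_step` and `run_sequence` functions are hypothetical names, only the hidden state is fed back (feeding back the outputs as well is left out for brevity), and tanh is an assumed choice of nonlinearity. It is written for current Julia (1.x).

```julia
# Minimal Elman-style recurrent layer. All names are illustrative assumptions.
struct SimpleRNN
    Wxh::Matrix{Float64}  # input  -> hidden weights
    Whh::Matrix{Float64}  # hidden -> hidden weights (the feedback loop)
    bh::Vector{Float64}   # hidden bias
end

# One time step: the new hidden state depends on the current input
# and on the hidden state from the previous time step.
rnn_step(net::SimpleRNN, x::Vector{Float64}, hprev::Vector{Float64}) =
    tanh.(net.Wxh * x .+ net.Whh * hprev .+ net.bh)

# Run a whole sequence, carrying the hidden state forward step by step.
function run_sequence(net::SimpleRNN, xs::Vector{Vector{Float64}})
    h = zeros(length(net.bh))          # initial hidden state (assumed zero)
    states = Vector{Vector{Float64}}()
    for x in xs
        h = rnn_step(net, x, h)
        push!(states, h)
    end
    return states
end
```

Feeding the same input twice gives two different hidden states, because the second step also sees the result of the first.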
Why Generate Sequences?
• To improve classification?
• To create synthetic training data?
• Practical tasks like speech synthesis?
• To simulate situations?
• To understand the data
This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
Some great examples
Alex Graves
Formerly at University of Toronto
Now part of the Google DeepMind team
Has a great example of generating handwriting using an LSTM
• 3 inputs: Δx, Δy, pen up/down
• 121 output units
    • 20 two-dimensional Gaussians for x,y = 40 means (linear) + 40 std. devs (exp) + 20 correlations (tanh) + 20 weights (softmax)
    • 1 sigmoid for up/down
• 3 hidden layers, 400 LSTM cells in each
• 3.6M weights total
• Trained with RMSprop, learn rate 0.0001, momentum 0.9
• Error clipped during backward pass (lots of numerical problems)
• Trained overnight on fast multicore CPU
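The 121 output units listed above add up as 40 + 40 + 20 + 20 + 1. The sketch below shows one way a raw output vector could be split into those mixture parameters. It is an illustration under assumptions: the function name `split_mixture_outputs`, the helper names, and the ordering of the slices within the vector are all hypothetical and are not taken from Graves's implementation.

```julia
# Softmax over a vector, shifted by the maximum for numerical stability.
function softmax(v::Vector{Float64})
    e = exp.(v .- maximum(v))
    return e ./ sum(e)
end

# Logistic sigmoid, used here for the pen up/down probability.
logistic(z::Float64) = 1 / (1 + exp(-z))

# Split a raw output vector into the parameters of K two-dimensional Gaussians
# plus one pen up/down unit. The slice ordering is an assumption of this sketch.
function split_mixture_outputs(y::Vector{Float64}; K::Int = 20)
    @assert length(y) == 6K + 1                 # 6 * 20 + 1 = 121
    mu      = reshape(y[1:2K], 2, K)            # 2K means, left linear
    sigma   = reshape(exp.(y[2K+1:4K]), 2, K)   # 2K std. devs, exp keeps them positive
    rho     = tanh.(y[4K+1:5K])                 # K correlations, squashed into (-1, 1)
    weights = softmax(y[5K+1:6K])               # K mixture weights that sum to 1
    pen     = logistic(y[6K+1])                 # 1 pen up/down probability
    return (mu = mu, sigma = sigma, rho = rho, weights = weights, pen = pen)
end
```

Each activation is chosen so the parameter lands in its valid range: standard deviations stay positive, correlations stay between -1 and 1, and the mixture weights form a probability distribution.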
handwriting demo
http://www.cs.toronto.edu/~graves/handwriting.html
Some great examples
Andrej Karpathy
Now Stanford University
Has a great example of generating characters using an LSTM
• 51 inputs (unique characters)
• 2 hidden layers, 20 LSTM cells in each
• Trained with RMSprop, learn rate 0.0001, momentum 0.9
• Error clipped during backward pass
Character generation demo
http://cs.stanford.edu/people/karpathy/recurrentjs/
Some great examples
@hardmaru
Tokyo, Japan
Has a great example of an RNN + reinforcement learning using one of the pole-balancing tasks
• Uses a recurrent neural network
• Uses genetic algorithms to train the network
• The demo performs the inverted double pendulum balancing task, which I suspect is quite hard even for humans
• All done in JavaScript, which makes for some great demos
Pole balancing demo
http://otoro.net/ml/pendulum-esp-mobile/index.html
RecurrentNN.jl
• My first public package (Yay!!)
• Based on Andrej Karpathy’s implementation in recurrentjs
• https://github.com/Andy-P/RecurrentNN.jl
• Implements both Recurrent Neural Networks and Long Short-Term Memory (LSTM) networks
• Allows one to compose arbitrary network architectures using graph.jl
• Makes use of RMSprop (a variant of stochastic gradient descent)
graph.jl
• Has functionality to construct arbitrary expression graphs over which the library can perform automatic differentiation
• Similar to what you may find in Theano for Python, or in Torch
• Basic idea is to allow the user to compose neural networks, then call backprop() and have it all work with the solver
• https://github.com/Andy-P/RecurrentNN/src/graph.jl
type Graph
    backprop::Array{Function,1}
    doBackprop::Bool
    function Graph(backPropNeeded::Bool)
        new(Array(Function,0),backPropNeeded)
    end
end

function sigmoid(g::Graph, m::NNMatrix)
    …
    if g.doBackprop
        push!(g.backprop,
            function ()
                …
                @inbounds m.dw[i,j] += out.w[i,j] * (1. - out.w[i,j]) * out.dw[i,j]
            end )
    end
    return out
end
graph.jl
During the forward pass we build up an array of anonymous functions to calculate each of the gradients
graph.jl
…
# use built up graph of backprop functions
# to compute backprop (set .dw fields in matrices)
for i = length(g.backprop):-1:1
    g.backprop[i]()
end
Then we loop backwards through the array, calling each of the functions to propagate the gradients backwards through the network
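The record-then-replay idea above can be shown end to end in a small self-contained sketch. It is an illustration under assumptions, not the package's code: `Mat`, `Tape`, `mul`, and `backward!` are hypothetical stand-ins for the package's own types and functions, and it is written in current Julia (1.x) syntax, so it uses `struct` where the 2015-era slides use `type`.

```julia
# A matrix of values `w` with a matching gradient buffer `dw`.
# Simplified stand-in for the package's NNMatrix (assumption).
struct Mat
    w::Matrix{Float64}
    dw::Matrix{Float64}
end
Mat(w::Matrix{Float64}) = Mat(w, zeros(size(w)))

# The tape: closures recorded during the forward pass.
struct Tape
    backprop::Vector{Function}
    do_backprop::Bool
end
Tape(do_backprop::Bool) = Tape(Function[], do_backprop)

# Forward sigmoid; records a closure that knows how to push gradients back.
function sigmoid(t::Tape, m::Mat)
    out = Mat(1 ./ (1 .+ exp.(-m.w)))
    if t.do_backprop
        push!(t.backprop, () -> begin
            # derivative of sigmoid is s * (1 - s), chained with the upstream gradient
            m.dw .+= out.w .* (1 .- out.w) .* out.dw
        end)
    end
    return out
end

# Forward matrix product; records the matching gradient rule.
function mul(t::Tape, a::Mat, b::Mat)
    out = Mat(a.w * b.w)
    if t.do_backprop
        push!(t.backprop, () -> begin
            a.dw .+= out.dw * b.w'
            b.dw .+= a.w' * out.dw
        end)
    end
    return out
end

# Replay the recorded closures in reverse order of the forward pass.
function backward!(t::Tape)
    for i in length(t.backprop):-1:1
        t.backprop[i]()
    end
end
```

A typical use is to run the forward operations with `Tape(true)`, seed the `dw` field of the final output with the loss gradient, and then call `backward!`. Because each closure captures its own inputs and outputs, the tape needs no separate graph data structure.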
solver.jl
function step(solver::Solver, model::Model, …)
    …
    for k = 1:length(modelMatices)
        @inbounds m = modelMatices[k] # mat ref
        @inbounds s = solver.stepcache[k]
        for i = 1:m.n
            for j = 1:m.d

                # rmsprop adaptive learning rate
                @inbounds mdwi = m.dw[i,j]
                @inbounds s.w[i,j] = s.w[i,j] * solver.decayrate + (1.0 - solver.decayrate) * mdwi^2

                # gradient clip
                …

                # update and regularize
                @inbounds m.w[i,j] +=
                    - stepsize * mdwi / sqrt(s.w[i,j] + solver.smootheps) - regc * m.w[i,j]
            end
        end
    end
    …
end
Now that we have calculated each of the gradients, we can call the solver to loop through and update each of the weights based on the gradients we stored during the backprop pass
RMSprop uses an adaptive learning rate for each individual parameter
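The per-parameter update can be written as a small standalone function. This is a sketch under assumptions rather than the package's solver: the name `rmsprop_step!`, the default hyperparameter values, the clipping step (elided on the slide), and the zeroing of the gradient after the update are all choices made for this illustration.

```julia
# One RMSprop update over a weight array `w`, its gradient `dw`, and the
# running average of squared gradients `cache`. All three are updated in place.
# Default values are placeholders chosen for this sketch (assumption).
function rmsprop_step!(w, dw, cache; stepsize = 0.01, decayrate = 0.999,
                       smootheps = 1e-8, regc = 1e-6, clipval = 5.0)
    for i in eachindex(w, dw, cache)
        g = dw[i]

        # running average of the squared gradient for this one parameter
        cache[i] = decayrate * cache[i] + (1 - decayrate) * g^2

        # gradient clipping (assumed form: clamp to [-clipval, clipval])
        g = clamp(g, -clipval, clipval)

        # scaled gradient step plus L2 regularization
        w[i] -= stepsize * g / sqrt(cache[i] + smootheps) + regc * w[i]

        # reset the gradient for the next pass (assumption)
        dw[i] = 0.0
    end
    return w
end
```

Dividing by the square root of each parameter's own running average is what makes the learning rate adaptive: parameters with consistently large gradients take proportionally smaller steps.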
solver.jl
Examples of RMSprop vs other optimization algorithms
http://imgur.com/a/Hqolp
example.jl
• Based on I. Sutskever et al., “Generating Text with Recurrent Neural Networks”, ICML 2011
• Closely follows Andrej Karpathy’s example
• Reads in about 1400 English sentences from Paul Graham’s essays on what makes a successful start-up
• Learns to predict the next character from the previous character
• Uses perplexity as the cost function
• Takes about 8-12 hrs to get a good model (need to anneal learning rate)
• Letter embedding = 6, hidden units = 100 (note example default is set to 5 & [20,20])
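Perplexity, mentioned above as the cost, can be computed from the probabilities the model assigned to the characters that actually came next. The sketch below uses the standard definition (2 raised to the average negative log2 probability). The function name `perplexity` is hypothetical, and the exact normalization used in example.jl may differ.

```julia
# Perplexity of a sequence, given the probability the model assigned
# to each character that actually occurred next.
function perplexity(probs::Vector{Float64})
    avg_neg_log2 = -sum(log2.(probs)) / length(probs)
    return 2.0^avg_neg_log2
end
```

A model that always spreads its probability evenly over 4 characters has a perplexity of 4, and a model that is always certain and correct has a perplexity of 1, so lower is better.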
sample output - 1 hr
• be bet sroud thir an
• the to be startups dalle a boticast that co thas as tame
goudtent wist
• the dase mede dosle on astasing sandiry if the the op
• that the dor slous seof the pos to they wame mace thas
theming obs and secofcagires morlillers dure t
• you i it stark to fon'te nallof the they coulker imn to suof imas
to ge thas int thals le withe the t
sample output - 5 hrs
• you dire prefor reple take stane to of conwe that there cimh the
don't than high breads them one gro
• but startups you month
• work of have not end a will araing thec sow about startup maunost
matate thinkij the show that's but
• you dire prefor reple take stane to of conwe that there cimh the
don't than high breads them one gro
• but cashe the sowe the mont pecipest fitlid just
• Argmax: it's the startups the the seem the startups the the seem the
startups the the seem the startups the
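The "Argmax" line above repeats itself, while the other samples vary. One plausible reading is that the other lines were drawn at random from the predicted character distribution, whereas argmax always takes the single most likely character. The sketch below contrasts the two choices. The function names are hypothetical and this is not the sampling code from example.jl.

```julia
using Random

# Greedy decoding: always pick the most probable index.
pick_argmax(p::Vector{Float64}) = argmax(p)

# Stochastic decoding: draw an index in proportion to its probability.
function pick_sample(rng::AbstractRNG, p::Vector{Float64})
    r = rand(rng) * sum(p)
    acc = 0.0
    for (i, prob) in enumerate(p)
        acc += prob
        if r < acc
            return i
        end
    end
    return length(p)   # guard against floating-point round-off
end
```

Greedy decoding is deterministic, so once the model enters a loop of high-probability characters it tends to stay there. Sampling breaks such loops at the cost of occasionally picking unlikely characters.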
sample output - 10 hrs
• and if will be dismiss we can all they have to be a demo
every looking
• you stall the right take to grow fast, you won't back
• new rectionally not a lot of that the initial single of optimizing
money you don't prosperity don't pl
• when you she have to probably as one there are on the
startup ideas week
• the startup need of to a company is the doesn't raise in
startups who confident is that doesn't usual
What’s not yet so great
about this package?
Garbage Collection
• Tried to keep close to the original implementation to make regression testing easier
• Karpathy’s version frequently uses JS’s push to build arrays of matrices
• This is appropriate in JavaScript but creates a lot of garbage-collection work in Julia
• The likely fix is to create the arrays only once and then update them in place on each pass (version 0.2!)
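The difference between the two approaches can be sketched as follows. This is an illustration of the general technique, not a preview of the package's version 0.2: the function names and the single tanh layer are assumptions made to keep the example short.

```julia
using LinearAlgebra

# Allocating version: builds a fresh array of hidden states on every pass,
# in the style of repeatedly pushing new matrices onto an array.
function forward_alloc(W::Matrix{Float64}, xs::Vector{Vector{Float64}})
    hs = Vector{Vector{Float64}}()
    for x in xs
        push!(hs, tanh.(W * x))     # a new vector is allocated at each step
    end
    return hs
end

# Preallocated version: the buffers in `hs` are created once by the caller
# and overwritten in place on each pass.
function forward_inplace!(hs::Vector{Vector{Float64}}, W::Matrix{Float64},
                          xs::Vector{Vector{Float64}})
    for (h, x) in zip(hs, xs)
        mul!(h, W, x)               # writes W * x into h without allocating
        h .= tanh.(h)               # elementwise update, in place
    end
    return hs
end
```

With the in-place version, the caller allocates the buffers once (for example `hs = [zeros(size(W, 1)) for _ in xs]`) and reuses them on every training pass, so the garbage collector has far less to clean up.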
Model Types
• Models need some kind of interface that the solver can call to get the collection of matrices
• At the moment that is implemented in the collectNNMat() function
• Could be tightened up by making this part of the initialization of the models
Thank you!
Andre Pemmelaar @QuantixResearch
Julia Tokyo Meetup - April 2015

 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 

Último (20)

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

Generating Sequences with Deep LSTMs & RNNs in Julia

  • 1. Generating Sequences using Deep LSTMs & RNNs Andre Pemmelaar @QuantixResearch Julia Tokyo Meetup - April 2015
  • 2. About Me Andre Pemmelaar • 5-yrs Financial System Solutions • 12-yrs Buy-Side Finance • 7-yrs Japanese Gov’t Bond Options Market Maker (JGBs) • 5-yrs Statistical Arbitrage (Global Equities) • Low latency & Quantitative Algorithm • Primarily use mixture of basic statistics and machine learning (Java, F#, Python, R) • Using Julia for most of my real work (90%) since July, 2014 • Can be reached at @QuantixResearch
  • 3. Why my interest in LSTMs & RNNs • In my field, finance, so much of the work involves sequence models. • Most deep learning models are not built for use with sequences. You have to jury-rig them to make them work. • RNNs and LSTMs are specifically designed to work with sequence data. • Sequence models can be combined with Reinforcement Learning to produce some very nice results (more on this and a demo later) • They have begun producing amazing results. • Better initialization procedures • Use of Rectified Linear Units for RNNs and “Memory cells” in LSTMs
  • 4. So what is a Recurrent Neural Network?
  • 5. In a word … Feedback
  • 6. What are Recurrent Neural Networks? 1. In their simplest form (RNNs), they are just Neural Networks with a feedback loop. 2. The previous time step’s hidden layer and final outputs are fed back into the network as part of the input to the next time step’s hidden layers.
  • 7. Why Generate Sequences? • To improve classification? • To create synthetic training data? • Practical tasks like speech synthesis? • To simulate situations? • To understand the data. This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
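The feedback loop on slide 6 can be sketched in a few lines of Julia. This is only an illustration under simplifying assumptions: it feeds back the hidden state alone (not the final outputs), and the names `rnn_step`, `run_rnn`, `Wxh`, `Whh` and `bh` are hypothetical, not part of RecurrentNN.jl.

```julia
# Minimal Elman-style recurrent step (illustrative sketch only).
# Wxh: input-to-hidden weights, Whh: hidden-to-hidden weights, bh: bias.
function rnn_step(Wxh, Whh, bh, x, hprev)
    # The new hidden state mixes the current input with the previous hidden state.
    return tanh.(Wxh * x .+ Whh * hprev .+ bh)
end

function run_rnn(Wxh, Whh, bh, xs)
    h = zeros(size(Whh, 1))            # initial hidden state
    hs = Vector{Vector{Float64}}()
    for x in xs
        h = rnn_step(Wxh, Whh, bh, x, h)   # feedback: h is reused at the next step
        push!(hs, h)
    end
    return hs
end

Wxh = [0.5 -0.3; 0.1 0.8; -0.6 0.2]               # 3 hidden units, 2 inputs
Whh = [0.1 0.0 0.0; 0.0 0.1 0.0; 0.0 0.0 0.1]
bh  = zeros(3)
xs  = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
hs  = run_rnn(Wxh, Whh, bh, xs)
```

The first and third inputs are identical, yet `hs[1]` and `hs[3]` differ, because the third step also sees the hidden state left behind by the second.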
  • 8. This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
  • 9. This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
  • 10. This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
  • 11. This slide is from “Generating Sequences with Recurrent Neural Networks” - Alex Graves
  • 12. Some great examples: Alex Graves. Formerly at University of Toronto, now part of the Google DeepMind team. Has a great example of generating handwriting using an LSTM • 3 inputs: Δx, Δy, pen up/down • 121 output units • 20 two-dimensional Gaussians for x,y = 40 means (linear) + 40 std. devs (exp) + 20 correlations (tanh) + 20 weights (softmax) • 1 sigmoid for up/down • 3 hidden layers, 400 LSTM cells in each • 3.6M weights total • Trained with RMSprop, learn rate 0.0001, momentum 0.9 • Error clipped during backward pass (lots of numerical problems) • Trained overnight on fast multicore CPU
  • 14. Some great examples: Andrej Karpathy. Now at Stanford University. Has a great example of generating characters using an LSTM • 51 inputs (unique characters) • 2 hidden layers, 20 LSTM cells in each • Trained with RMSprop, learn rate 0.0001, momentum 0.9 • Error clipped during backward pass
  • 16. Some great examples: @hardmaru. Tokyo, Japan. Has a great example of an RNN + reinforcement learning using one of the pole-balancing tasks • Uses a recurrent neural network • Uses genetic algorithms to train the network • The demo does the inverted double pendulum balancing task, which I suspect is quite hard even for humans • All done in JavaScript, which makes for some great demos!
  • 19. RecurrentNN.jl • My first public package (Yay!!) • Based on Andrej Karpathy’s implementation in recurrentjs • https://github.com/Andy-P/RecurrentNN.jl • Implements both Recurrent Neural Networks and Long Short-Term Memory networks • Allows one to compose arbitrary network architectures using graph.jl • Makes use of RMSProp (a variant of stochastic gradient descent)
  • 20. graph.jl • Has functionality to construct arbitrary expression graphs over which the library can perform automatic differentiation • Similar to what you may find in Theano for Python, or in Torch • Basic idea is to allow the user to compose neural networks, then call backprop() and have it all work with the solver • https://github.com/Andy-P/RecurrentNN/src/graph.jl
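The output-layer arithmetic on slide 12 (40 + 40 + 20 + 20 + 1 = 121) can be made concrete with a small sketch that splits a 121-element output vector into mixture parameters. The ordering of the slices, and the names `split_outputs`, `softmax` and `logistic`, are assumptions made for illustration; the slide only gives the counts and activations.

```julia
# Split 6K + 1 raw outputs into the parameters of K two-dimensional Gaussians
# plus one pen up/down probability. Slice order is an assumption.
function softmax(v)
    e = exp.(v .- maximum(v))      # subtract the max for numerical stability
    return e ./ sum(e)
end

logistic(z) = 1 / (1 + exp(-z))

function split_outputs(y::AbstractVector; K::Int = 20)
    @assert length(y) == 6K + 1
    means   = reshape(y[1:2K], 2, K)                      # linear: used as-is
    stddevs = reshape(exp.(y[(2K + 1):(4K)]), 2, K)       # exp keeps them positive
    corrs   = tanh.(y[(4K + 1):(5K)])                     # tanh keeps them in (-1, 1)
    weights = softmax(y[(5K + 1):(6K)])                   # softmax: weights sum to 1
    penup   = logistic(y[6K + 1])                         # sigmoid: a probability
    return (means = means, stddevs = stddevs, corrs = corrs,
            weights = weights, penup = penup)
end

y = collect(range(-1.0, stop = 1.0, length = 121))   # stand-in for network outputs
p = split_outputs(y)
```

Each activation is chosen so that the raw, unconstrained network output lands in the valid range for the parameter it represents.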
  • 21. graph.jl

        type Graph
            backprop::Array{Function,1}
            doBackprop::Bool
            function Graph(backPropNeeded::Bool)
                new(Array(Function,0),backPropNeeded)
            end
        end

        function sigmoid(g::Graph, m::NNMatrix)
            …
            if g.doBackprop
                push!(g.backprop,
                    function ()
                        …
                        @inbounds m.dw[i,j] += out.w[i,j] * (1. - out.w[i,j]) * out.dw[i,j]
                    end )
            end
            return out
        end

    During the forward pass we build up an array of anonymous functions to calculate each of the gradients.
  • 22. graph.jl

        type Graph
            backprop::Array{Function,1}
            doBackprop::Bool
            function Graph(backPropNeeded::Bool)
                new(Array(Function,0),backPropNeeded)
            end
        end

        function sigmoid(g::Graph, m::NNMatrix)
            …
            if g.doBackprop
                push!(g.backprop,
                    function ()
                        …
                        @inbounds m.dw[i,j] += out.w[i,j] * (1. - out.w[i,j]) * out.dw[i,j]
                    end )
            end
            return out
        end

        …
        # use built up graph of backprop functions
        # to compute backprop (set .dw fields in matrices)
        for i = length(g.backprop):-1:1
            g.backprop[i]()
        end

    Then we loop backwards through the array, calling each of the functions to propagate the gradients backwards through the network.
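The slide code uses the Julia syntax of 2015 and elides several parts. A self-contained sketch of the same closure-tape idea in current Julia syntax might look as follows. The names `Mat`, `Tape`, `sumall` and `backward!` are hypothetical stand-ins, not the package's own types, and the elided parts of the original are not reconstructed here.

```julia
# Closure-tape backprop sketch: each forward op pushes a closure that knows
# how to propagate gradients from its output back to its input.
struct Mat
    w::Matrix{Float64}     # values
    dw::Matrix{Float64}    # gradients, same shape as w
end
Mat(w::Matrix{Float64}) = Mat(w, zeros(size(w)))

struct Tape
    backprop::Vector{Function}
    doBackprop::Bool
end
Tape(doBackprop::Bool) = Tape(Function[], doBackprop)

function sigmoid(t::Tape, m::Mat)
    out = Mat(1 ./ (1 .+ exp.(-m.w)))
    if t.doBackprop
        push!(t.backprop, function ()
            # d(sigmoid)/dx = s * (1 - s), chained with the upstream gradient
            m.dw .+= out.w .* (1 .- out.w) .* out.dw
        end)
    end
    return out
end

function sumall(t::Tape, m::Mat)
    out = Mat(fill(sum(m.w), 1, 1))
    if t.doBackprop
        # every input element receives the (scalar) upstream gradient
        push!(t.backprop, () -> (m.dw .+= out.dw[1, 1]))
    end
    return out
end

function backward!(t::Tape)
    # replay the recorded closures in reverse order
    for i in length(t.backprop):-1:1
        t.backprop[i]()
    end
end

x = Mat([0.0 1.0; -1.0 2.0])
t = Tape(true)
loss = sumall(t, sigmoid(t, x))
loss.dw[1, 1] = 1.0     # seed: d(loss)/d(loss) = 1
backward!(t)
```

After `backward!`, `x.dw` holds the derivative of the summed sigmoid with respect to each element of `x.w`, for example 0.25 at the entry where `x.w` is 0.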
  • 23. solver.jl

        function step(solver::Solver, model::Model, …)
            …
            for k = 1:length(modelMatices)
                @inbounds m = modelMatices[k] # mat ref
                @inbounds s = solver.stepcache[k]
                for i = 1:m.n
                    for j = 1:m.d

                        # rmsprop adaptive learning rate
                        @inbounds mdwi = m.dw[i,j]
                        @inbounds s.w[i,j] = s.w[i,j] * solver.decayrate + (1.0 - solver.decayrate) * mdwi^2

                        # gradient clip
                        …

                        # update and regularize
                        @inbounds m.w[i,j] += - stepsize * mdwi / sqrt(s.w[i,j] + solver.smootheps) - regc * m.w[i,j]
                    end
                end
            end
            …
        end

    Now that we have calculated each of the gradients, we can call the solver to loop through and update each of the weights based on the gradients we stored during the backprop pass. RMSProp uses an adaptive learning rate for each individual parameter.
  • 24. solver.jl Examples of RMSProp vs. other optimization algorithms: http://imgur.com/a/Hqolp
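The update rule on the solver slide can be isolated into a small standalone function. This is a sketch under stated assumptions: the function name `rmsprop_step!`, the default hyperparameter values, and the use of `clamp` for the clipping step (which the slide elides) are all illustrative choices, not taken from the package.

```julia
# One RMSProp update over flat arrays w (weights), dw (gradients) and
# cache (running average of squared gradients). Defaults are assumptions.
function rmsprop_step!(w, dw, cache; stepsize = 0.01, decayrate = 0.999,
                       smootheps = 1e-8, regc = 1e-6, clipval = 5.0)
    for i in eachindex(w)
        g = dw[i]
        # running average of squared gradients gives a per-parameter scale
        cache[i] = decayrate * cache[i] + (1 - decayrate) * g^2
        # gradient clip (assumed form; the slide leaves this step out)
        g = clamp(g, -clipval, clipval)
        # scaled gradient step plus L2 regularization
        w[i] -= stepsize * g / sqrt(cache[i] + smootheps) + regc * w[i]
        dw[i] = 0.0     # reset the gradient for the next pass
    end
    return w
end

# Usage: minimize f(w) = sum(w .^ 2), whose gradient is 2w.
w = [1.0, -2.0]
cache = zeros(2)
for _ in 1:500
    dw = 2 .* w
    rmsprop_step!(w, dw, cache; regc = 0.0)
end
```

Because each parameter is divided by the root of its own running average, parameters with consistently large gradients take proportionally smaller steps, which is what the slide means by an adaptive per-parameter learning rate.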
  • 25. example.jl • Based on I. Sutskever et al., “Generating Text with Recurrent Neural Networks”, ICML, 2011 • Closely follows Andrej Karpathy’s example • Reads in about 1400 English sentences from Paul Graham’s essays on what makes a successful start-up • Learns to predict the next character from the previous character • Uses perplexity for cost function • Takes about 8-12 hrs to get a good model (need to anneal learning rate) • letter embedding = 6, hidden units = 100 (note example default is set to 5 & [20,20])
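Perplexity, the cost mentioned on slide 25, is the exponential of the average negative log-probability the model assigned to each correct next character. A minimal sketch follows; the function name `perplexity` and the 51-character vocabulary in the usage line are illustrative assumptions.

```julia
# Perplexity from the probabilities assigned to each correct next character.
# Lower is better; 1.0 means every character was predicted with certainty.
function perplexity(probs::AbstractVector{<:Real})
    return exp(-sum(log.(probs)) / length(probs))
end

# A model guessing uniformly over 51 characters has perplexity of about 51.
uniform_ppl = perplexity(fill(1 / 51, 10))
```

Read this way, perplexity is the effective number of characters the model is still choosing between at each step.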
  • 26. sample output -1hr • be bet sroud thir an • the to be startups dalle a boticast that co thas as tame goudtent wist • the dase mede dosle on astasing sandiry if the the op • that the dor slous seof the pos to they wame mace thas theming obs and secofcagires morlillers dure t • you i it stark to fon'te nallof the they coulker imn to suof imas to ge thas int thals le withe the t
  • 27. sample output -5hrs • you dire prefor reple take stane to of conwe that there cimh the don't than high breads them one gro • but startups you month • work of have not end a will araing thec sow about startup maunost matate thinkij the show that's but • you dire prefor reple take stane to of conwe that there cimh the don't than high breads them one gro • but cashe the sowe the mont pecipest fitlid just • Argmax: it's the startups the the seem the startups the the seem the startups the the seem the startups the
  • 28. sample output -10hrs • and if will be dismiss we can all they have to be a demo every looking • you stall the right take to grow fast, you won't back • new rectionally not a lot of that the initial single of optimizing money you don't prosperity don't pl • when you she have to probably as one there are on the startup ideas week • the startup need of to a company is the doesn't raise in startups who confident is that doesn't usual
  • 29. What’s not yet so great about this package?
  • 30. What’s not yet so great about this package?
    Garbage Collection
    • Tried to keep close to the original implementation to make regression testing easier
    • Karpathy’s version frequently uses JS’s push to build arrays of matrices
    • This is appropriate in JavaScript but creates a lot of garbage-collection work in Julia
    • The likely fix is to create the arrays only once and then update them in place on each pass (version 0.2!)
    Model Types
    • Models need some kind of interface that the solver can call to get the collection of matrices
    • At the moment that is implemented in the collectNNMat() function
    • Could be tightened up by making this part of the initialization of the models
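The garbage-collection fix described on slide 30 (create the arrays once, then update them in place) can be sketched as follows. This is not the package's code; `forward_alloc` and `forward_inplace!` are hypothetical names, and a single `tanh` layer stands in for a full network pass.

```julia
using LinearAlgebra   # provides mul! for in-place matrix-vector products

# Allocating style: builds a fresh output array for every input on every pass.
forward_alloc(W, xs) = [tanh.(W * x) for x in xs]

# In-place style: output buffers are created once and overwritten on each pass.
function forward_inplace!(outs, W, xs)
    for (out, x) in zip(outs, xs)
        mul!(out, W, x)       # writes W * x into out without allocating
        out .= tanh.(out)     # elementwise update, also in place
    end
    return outs
end

W    = [0.2 -0.1; 0.4 0.3]
xs   = [[1.0, 2.0], [0.5, -1.0]]
outs = [zeros(2) for _ in xs]    # allocated once, reused across passes
forward_inplace!(outs, W, xs)
```

Both versions compute the same values; the in-place one simply gives the garbage collector nothing new to clean up after each pass.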
  • 31. Thank you! Andre Pemmelaar @QuantixResearch Julia Tokyo Meetup - April 2015