Introduction  to  Chainer:
A  Flexible  Framework  for  Deep  Learning
2015-06-18 PFI/PFN Weekly Seminar
Seiya  Tokui  (Preferred  Networks)
Self-Introduction
l  Seiya  Tokui    @beam2d  (Twitter,  GitHub)
l  Researcher  at  Preferred  Networks
l  Main  focus:  machine  learning
–  Learning to Hash (master's degree)
–  Deep  Learning,  Representation  Learning  (current  focus)
A Powerful, Flexible, and Intuitive Framework of Neural Networks
Today  I  will  introduce:
l  The  features  of  Chainer
l  How  to  use  Chainer
l  Some  planned  features
l  (Slide  in  English,  talk  in  Japanese)
Chainer: The Concept
Chainer  is  a  framework  of  neural  networks
l  Official  site:  http://chainer.org  
l  Repository:  https://github.com/pfnet/chainer
l  Provided  as  a  Python  library  (PyPI:  chainer)
l  Main  features
–  Powerful: Supports CUDA and multi-GPU computation
–  Flexible: Supports almost arbitrary architectures
–  Intuitive: Forward prop can be written as regular Python code
Elements  of  a  neural  network  framework
l  Multi-‐‑‒dimensional  array  implementations
l  Layer  implementations
–  Called  in  various  names  (layers,  modules,  blocks,  primitives,  etc...)
–  The  smallest  units  of  automatic  differentiation
–  Contain  forward  and  backward  implementations
l  Optimizer  implementations
l  Other  stuffs  (data  loading  scheme,  training  loop,  etc...)
–  These  are  also  very  important,  though  Chainer  currently  does  not  
provide  their  abstraction  (future  work)
Forward  prop  /  Backprop
l  Forward  prop  is  how  we  want  to  process  the  input  data
l  Backprop  computes  its  gradient  for  the  learnable  parameters
l  Given  backward  procedures  of  all  layers,  backprop  can  be  written  as  
their  combination  (a.k.a.  reverse-‐‑‒mode  automatic  differentiation)
[Figure: input → hidden → hidden → output → loss func (compared against the ground truth); grad arrows flow backward through each layer]
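To make the chain-rule combination concrete, here is a minimal NumPy-only sketch of the idea: each layer supplies a forward and a backward procedure, and backprop chains the backward procedures in reverse order. The two-layer net, the squared-error loss, and all array shapes are made up for illustration.
import numpy as np

x = np.random.randn(8, 4).astype(np.float32)    # minibatch of inputs
t = np.random.randn(8, 3).astype(np.float32)    # ground truth
W1 = np.random.randn(4, 5).astype(np.float32)   # hidden layer parameters
W2 = np.random.randn(5, 3).astype(np.float32)   # output layer parameters

# Forward prop: input -> hidden -> output -> loss func
h = np.maximum(x.dot(W1), 0)        # linear + ReLU
y = h.dot(W2)                       # linear
loss = 0.5 * ((y - t) ** 2).sum()   # squared error against the ground truth

# Backprop: each layer's backward procedure, applied in reverse order
gy = y - t                          # grad of the loss w.r.t. y
gW2 = h.T.dot(gy)                   # grad for the output layer parameters
gh = gy.dot(W2.T) * (h > 0)         # grad propagated through the ReLU
gW1 = x.T.dot(gh)                   # grad for the hidden layer parameters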
Backprop  Implementation  Paradigm  (1)
Define-and-Run
l  First, a computational graph is constructed. Then, it is repeatedly fed with minibatches to run the forward/backward computation
l  The computational graph can be seen as a program, and the forward/backward computation is done by its interpreter
u  Caffe: the program is written in Prototxt
u  Torch: the program is constructed by Lua scripts
u  Theano-based frameworks: the program is constructed by Python scripts
Backprop  Implementation  Paradigm  (2)
Define-and-Run (cont.)
l  Pros
–  (Almost) no need for memory management
–  The computational graph can be implicitly optimized (cf. Theano)
l  Cons
–  The program is fixed within the training loop
–  The interpreter must be able to define various forward computations, including control-flow statements like if and for
u  Theano has dedicated functions for them (ifelse and scan), which are unintuitive and not Pythonic
–  Network definition is hard to debug, since an error occurs at the forward computation, far away from the network definition
Backprop  Implementation  Paradigm  (3)
Define-by-Run
l  The forward computation is written as regular program code with special variables and operators; executing it simultaneously performs the forward computation and constructs the graph (just by storing the order of operations)
l  The graph is then used for the backward computation
l  This paradigm enables us to use arbitrary control-flow statements in the forward computation
–  No need for a mini-language and its interpreter
l  It also makes the forward computation intuitive and easy to debug (see the sketch below)
Backprop Implementation Paradigm (4)
Define-by-Run (cont.)
l  The computational graph can be modified within each iteration
l  Example: Truncated BPTT (BackProp Through Time)
–  BPTT: Backprop on a recurrent net
–  Truncated BPTT: Truncate the backprop at some time point
–  Truncation is one type of modification of the computational graph
[Figure: the BPTT graph is truncated at an earlier timestep]
Features  of  Chainer
l  Define-‐‑‒by-‐‑‒Run  scheme
–  Forward  computation  can  contain  any  Python  code
u  if-else,  for-else,  break,  continue,  try-except-finally,  
list,  dict,  class,  etc...
–  User  can  modify  the  graph  within  the  loop
u  E.g.  truncation  can  be  done  by  unchain_̲backward  (which  
unchains  the  graph  backward  from  some  variable)
u  See  the  tutorial  on  recurrent  nets
http://docs.chainer.org/en/latest/tutorial/recurrentnet.html
l  Predefined  functions
l  Support  GPU(s)  via  PyCUDA
Example: Training a multi-layer perceptron in one page
Full code is in the tutorial and the example directory.
# Model definition
model = FunctionSet(
    l1=F.Linear(784, 100),
    l2=F.Linear(100, 100),
    l3=F.Linear(100, 10))
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# Forward computation
def forward(x, t):
    h1 = F.relu(model.l1(x))
    h2 = F.relu(model.l2(h1))
    y = model.l3(h2)
    return F.softmax_cross_entropy(y, t)

# Training loop
for epoch in xrange(n_epoch):
    for i in xrange(0, N, batchsize):
        x = Variable(...)
        t = Variable(...)
        opt.zero_grads()
        loss = forward(x, t)
        loss.backward()
        opt.update()
Example: Recurrent net language model in one page
Full code is in the tutorial and the example directory.
# Model definition
model = FunctionSet(
    emb=F.EmbedID(1000, 100),
    x2h=F.Linear(100, 50),
    h2h=F.Linear(50, 50),
    h2y=F.Linear(50, 1000))
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# Forward computation of one step
def fwd1step(h, w, t):
    x = F.tanh(model.emb(w))
    h = F.tanh(model.x2h(x) + model.h2h(h))
    y = model.h2y(h)
    return h, F.softmax_cross_entropy(y, t)

# Full RNN forward computation
def forward(seq):
    h = Variable(...)  # init state
    loss = 0
    for curw, nextw in zip(seq, seq[1:]):
        x = Variable(curw)
        t = Variable(nextw)
        h, new_loss = fwd1step(h, x, t)
        loss += new_loss
    return loss
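For reference, driving forward(seq) from a training loop follows the same Optimizer calls as the MLP example above; train_sequences here is a hypothetical iterable of word-ID arrays:
for epoch in xrange(n_epoch):
    for seq in train_sequences:   # hypothetical dataset iterator
        opt.zero_grads()
        loss = forward(seq)
        loss.backward()
        opt.update()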
Chainer: How to Use It
Install  Chainer
l  Prepare  a  Python  2.7  environment  with  pip
–  (Pyenv+)Anaconda  is  recommended
l  Install  Chainer  just  by
pip install chainer
l  If  you  want  to  use  GPU(s),  do:
–  Install  CUDA  and  the  corresponding  NVIDIA  driver
–  Install  dependent  packages  by
pip install chainer-cuda-deps
–  You  may  have  to  update  the  six package
pip install -U six
Run  the  MNIST  example  (quick  start)
l  Require  scikit-‐‑‒learn  installed:  pip install scikits.learn
l  Clone  the  repository  of  Chainer:  
git clone https://github.com/pfnet/chainer
l  Go  to  the  example  directory  at  examples/mnist
l  Then,  run  python train_mnist.py
–  Run  on  GPU  by  passing  --gpu=0
l  Other  examples  can  be  similarly  executed  (some  needs  manual  
preparation  of  datasets)
Read  the  documents
l  Read  the  documents  at  http://docs.chainer.org
l  It  includes:
–  Tutorial
–  Reference  manual
l  All  features  given  in  this  talk  are  introduced  by  the  tutorial,  so  please  try  
it  if  you  want  to  know  the  detail.
Basic  concepts  (1)
l  Essential  part  of  Chainer:  Variable  and  Function
l  Variable  is  a  wrapper  of  n-‐‑‒dimensional  arrays  (ndarray  and  GPUArray)
l  Function  is  an  operation  on  Variables
–  Function  application  is  memorized  by  the  returned  Variable(s)
–  All  operations  for  which  you  want  to  backprop  must  be  done  by  
Functions  on  Variables
l  Making  a  Variable  object  is  simple:  just  pass  an  array
x = chainer.Variable(numpy.ndarray(...))
–  The  array  is  stored  in  data  attribute  (x.data)
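A minimal sketch of creating a Variable and reading back the wrapped array (the shape and contents are arbitrary):
import numpy as np
import chainer

x = chainer.Variable(np.zeros((3, 4), dtype=np.float32))
print(x.data.shape)   # the wrapped array lives in the data attribute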
Basic  concepts  (2)
l  Example  of  the  computational  graph  construction
x = chainer.Variable(...)
y = chainer.Variable(...)
z = x**2 + 2*x*y + y
l  Gradient  of  z(x,  y)  can  be  computed  by  z.backward()
l  Results  are  stored  in  x.grad  and  y.grad
[Figure: computational graph of z, built from the operator nodes _ ** 2, 2 * _, _ * _, and _ + _ applied to x and y]
Actually, Split nodes are automatically inserted when a Variable is used more than once (they accumulate the gradients on backprop)
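A runnable version of the example above, using size-1 arrays (as in the official tutorial) so that z.backward() can start from a default output gradient of one:
import numpy as np
import chainer

x = chainer.Variable(np.array([3.0], dtype=np.float32))
y = chainer.Variable(np.array([2.0], dtype=np.float32))
z = x**2 + 2*x*y + y
z.backward()
print(x.grad)   # dz/dx = 2x + 2y -> [ 10.]
print(y.grad)   # dz/dy = 2x + 1  -> [ 7.]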
Basic  concepts  (3)
l  Chainer  provides  many  functions  in  chainer.functions  subpackage
–  This  package  is  often  abbreviated  to  F
l  Parameterized  functions  are  provided  as  classes
–  Linear,  Convolution2D,  EmbedID,  PReLU,  BatchNormalization,  etc.
–  Their  instances  should  be  shared  across  all  iterations
l  Non-‐‑‒parameterized  functions  are  provided  as  Python  functions
–  Activation  functions,  pooling,  array  manipulation,  etc.
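A small sketch of the distinction (sizes are arbitrary): the parameterized F.Linear is instantiated once and reused, while the non-parameterized F.relu is called like a plain function:
import numpy as np
import chainer
import chainer.functions as F

linear = F.Linear(4, 3)    # parameterized: this instance holds W and b
x = chainer.Variable(np.random.randn(2, 4).astype(np.float32))
h = F.relu(linear(x))      # non-parameterized: just a function call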
Basic  concepts  (4)
l  Use  FunctionSet  to  manage  parameterized  functions
–  It  is  an  object  with  Function  attributes
–  Easy  to  migrate  functions  onto  GPU  devices
–  Easy to collect parameters and gradients (collect_parameters)
l  Use  Optimizer  for  numerical  optimization
–  Major  algorithms  are  provided:
SGD,  MomentumSGD,  AdaGrad,  RMSprop,  ADADELTA,  Adam
–  Some parameter/gradient manipulations are done via this class: weight decay, gradient clipping, etc.
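A minimal sketch of this wiring, using only the pieces shown elsewhere in this talk; the clip_grads call mirrors the gradient clipping used in the PTB example and should be treated as an assumption if your version differs:
import chainer.functions as F
from chainer import FunctionSet, optimizers

model = FunctionSet(
    l1=F.Linear(784, 100),
    l2=F.Linear(100, 10))
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# inside the training loop, after loss.backward():
#   opt.clip_grads(5.0)   # assumed gradient-clipping API
#   opt.update()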
Easy  to  debug!
l  If  the  forward  computation  has  a  bug,  then  an  error  occurs  immediately  
at  the  appropriate  line  of  the  forward  definition
l  Example
–  This  code  has  inconsistency  of  the  array  size:
x = Variable(np.ndarray((3, 4), dtype=np.float32)
y = Variable(np.ndarray((3, 3), dtype=np.float32)
a = x ** 2 + x
b = a + y * 2
c = b + x * 2
–  Since  an  exception  is  raised  at  the  appropriate  line,  we  can  easily  find  
the  cause  of  bug  (this  is  one  big  difference  from  Define-‐‑‒and-‐‑‒Run  
frameworks)
← an exception is raised at this line
Graph  manipulation  (1)
l  Backward  unchaining:  y.unchain_backward()
–  It  purges  the  nodes  backward  from  y
–  It  is  useful  to  implement  truncated  BPTT  (see  PTB  example)
[Figure: for the chain x → f → y → g → z, calling y.unchain_backward() leaves only y → g → z]
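A sketch of truncated BPTT built on the RNN example above, following the pattern of the PTB example; bprop_len and the sequence iteration are made up for illustration:
bprop_len = 20                # backprop window (illustrative value)
h = Variable(...)             # init state, as in the RNN example
loss = 0
for i, (curw, nextw) in enumerate(zip(seq, seq[1:])):
    h, new_loss = fwd1step(h, Variable(curw), Variable(nextw))
    loss += new_loss
    if (i + 1) % bprop_len == 0:
        opt.zero_grads()
        loss.backward()
        loss.unchain_backward()   # purge the graph behind the accumulated loss
        opt.update()
        loss = 0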
Graph  manipulation  (2)
l  Volatile  variables:  x = Variable(..., volatile=True)
–  Volatile variables do not build a graph
–  Volatility  can  be  accessed  directly  by  x.volatile
x = Variable(..., volatile=True)
y = f(x)
y.volatile = False
z = h(y)
[Figure: the f application on the volatile x is not recorded; only z = h(y), after y.volatile = False, enters the graph]
Example: Training a multi-layer perceptron in one page
Note: F = chainer.functions
# Model definition
model = FunctionSet(
    l1=F.Linear(784, 100),
    l2=F.Linear(100, 100),
    l3=F.Linear(100, 10))
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# Forward computation
def forward(x, t):
    h1 = F.relu(model.l1(x))
    h2 = F.relu(model.l2(h1))
    y = model.l3(h2)
    return F.softmax_cross_entropy(y, t)

# Training loop
for epoch in xrange(n_epoch):
    for i in xrange(0, N, batchsize):
        x = Variable(...)
        t = Variable(...)
        opt.zero_grads()
        loss = forward(x, t)
        loss.backward()
        opt.update()
Example: Recurrent net language model in one page
# Model definition
model = FunctionSet(
    emb=F.EmbedID(1000, 100),
    x2h=F.Linear(100, 50),
    h2h=F.Linear(50, 50),
    h2y=F.Linear(50, 1000))
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# Forward computation of one step
def fwd1step(h, w, t):
    x = F.tanh(model.emb(w))
    h = F.tanh(model.x2h(x) + model.h2h(h))
    y = model.h2y(h)
    return h, F.softmax_cross_entropy(y, t)

# Full RNN forward computation
def forward(seq):
    h = Variable(...)  # init state
    loss = 0
    for curw, nextw in zip(seq, seq[1:]):
        x = Variable(curw)
        t = Variable(nextw)
        h, new_loss = fwd1step(h, x, t)
        loss += new_loss
    return loss
CUDA  support  (1)
l  Chainer  supports  CUDA  computation
l  Installation
–  Install  CUDA  6.5+
–  Install CUDA-related packages by
pip install chainer-cuda-deps
u  The PyCUDA build may fail if CUDA is installed in a non-standard path. In that case, install PyCUDA from source with the appropriate configuration.
CUDA  support  (2)
l  Call  cuda.init() before  any  CUDA-‐‑‒related  operations
l  Converts  numpy.ndarray  into  GPUArray  by  chainer.cuda.to_gpu
data_gpu = chainer.cuda.to_gpu(data_cpu)
l  A  GPUArray  object  can  be  passed  to  the  Variable  constructor
x = Variable(data_gpu)
l  Most  functions  support  GPU  Variables
–  Parameterized  functions  must  be  sent  to  GPU  beforehand  by  
Function.to_gpu  or  FunctionSet.to_gpu
l  Extracts  the  results  to  host  memory  by  chainer.cuda.to_cpu
l  All  examples  support  CUDA  (pass  --gpu=N,  where  N  is  the  GPU  ID)
MLP example for CUDA
# Model definition
model = FunctionSet(
    l1=F.Linear(784, 100),
    l2=F.Linear(100, 100),
    l3=F.Linear(100, 10)).to_gpu()
opt = optimizers.SGD()
opt.setup(model.collect_parameters())

# Forward computation
def forward(x, t):
    h1 = F.relu(model.l1(x))
    h2 = F.relu(model.l2(h1))
    y = model.l3(h2)
    return F.softmax_cross_entropy(y, t)

# Training loop
for epoch in xrange(n_epoch):
    for i in xrange(0, N, batchsize):
        x = Variable(to_gpu(...))
        t = Variable(to_gpu(...))
        opt.zero_grads()
        loss = forward(x, t)
        loss.backward()
        opt.update()
CUDA  support  (3)
l  Chainer  also  supports  computation  on  multiple  GPUs  (easily!)
l  Model  parallel
–  Send FunctionSets to appropriate devices (to_gpu accepts a GPU ID)
model_0 = FunctionSet(...).to_gpu(0)
model_1 = FunctionSet(...).to_gpu(1)
–  Copy  Variable  objects  across  GPUs  by  copy  function
x_1 = F.copy(x_0, 1)
u  This  copy  is  tracked  by  the  computational  graph,  so  you  donʼ’t  
need  to  deal  with  it  on  backprop
CUDA  support  (4)
l  Chainer  also  supports  computation  on  multiple  GPUs
l  Data  parallel
–  FunctionSet can be copied by copy.copy
model = FunctionSet(...)
model_0 = copy.copy(model).to_gpu(0)
model_1 = model.to_gpu(1)
–  Set  up  the  optimizer  only  for  the  master  model
opt.setup(model_0.collect_parameters())
–  After data-parallel gradient computation, gather them
opt.accumulate_grads(model_1.gradients)
–  After  the  update,  share  them  across  model  copies
model_1.copy_parameters_from(model_0.parameters)
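Putting the steps above together, one data-parallel iteration could look like the sketch below; forward(x, t, model) is a hypothetical three-argument variant of the MLP forward function, the minibatch splitting is illustrative, and moving each half to its matching GPU is elided for brevity:
half = batchsize // 2
x0, t0 = Variable(cuda.to_gpu(x_batch[:half])), Variable(cuda.to_gpu(t_batch[:half]))
x1, t1 = Variable(cuda.to_gpu(x_batch[half:])), Variable(cuda.to_gpu(t_batch[half:]))

opt.zero_grads()
forward(x0, t0, model_0).backward()        # gradients on GPU 0
forward(x1, t1, model_1).backward()        # gradients on GPU 1
opt.accumulate_grads(model_1.gradients)    # gather into the master model
opt.update()
model_1.copy_parameters_from(model_0.parameters)   # re-synchronize the copy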
Model  Zoo  support  (in  the  near  future)
l  Model  Zoo  is  a  place  that  pretrained  models  are  registered
–  Provided  by  BVLC  Caffe  team
–  It  contains  the  Caffe  reference  models
l  We  are  planning  to  support  the  Caffe  reference  models  in  three  weeks  
(the  next  minor  release)
–  Current  design  (it  may  be  changed):
f = CaffeFunction(‘path/to/model.caffemodel’)
x, t = Variable(...), Variable(...)
y = f(inputs={‘data’: x, ‘label’: t}, outputs=[‘loss’])
–  It  emulates  Caffe  networks  by  Chainerʼ’s  functions
Note:  development  process
l  Schedule
–  We  are  planning  to  release  updates  biweekly
–  Updates  are  classified  into  three  groups
u  Revision:  bug  fixes,  updates  without  adding/modifying  interfaces
u  Minor:  Updates  that  add/modify  interfaces  without  lacking  
backward  compatibility
u  Major:  Updates  that  are  not  backward-‐‑‒compatible
l  We  are  using  the  GitHub-‐‑‒flow  process
l  We  welcome  your  PRs!
–  Please  send  them  to  the  master  branch
Wrap  up
l  Chainer  is  a  powerful,  flexible,  and  intuitive  framework  of  neural  
networks  in  Python
l  It  is  based  on  Define-‐‑‒by-‐‑‒Run  scheme,  which  makes  it  intuitive  and  
flexible
l  Chainer  is  a  very  young  project  and  immature
–  Its  development  started  at  mid.  April  (just  two  months  ago)
–  We  will  add  many  functionailities  (especially  more  functions)
–  We  may  add  some  abstraction  of  whole  learning  processes
