2. Neurons communicate through synapses.
In this lecture we will learn:
• Basic anatomy and physiology of synapses
• Rate coding and spike coding
• Hebbian learning
• Spike-timing-dependent plasticity
• Reward-modulated plasticity
9. How does a neuron represent information?
Panzeri et al. (2010) Trends in Neurosciences
10. Rate coding: Number of Spikes matters.
Rate coding hypothesis: a neuron represents information through
its spike rate.
Hartline (1940) Am J Physiol; Hartline (1969) Science
[Figure: compound eye of the horseshoe crab; recording from the optic nerve]
Firing patterns of cortical neurons are highly irregular and are well
approximated by a random Poisson process (Softky & Koch (1993) J Neurosci;
Shadlen & Newsome (1994) Curr Opin Neurobiol).
11. Temporal coding: Spike timing matters.
Temporal coding hypothesis: a neuron represents information
through the timing of its spikes.
Gollisch & Meister (2008) Science; Johansson & Birznieks (2004) Nature Neurosci
12. Hebb’s postulate of activity dependent plasticity.
"Let us assume that the persistence or repetition of a reverberatory
activity (or "trace") tends to induce lasting cellular axon of cell A is
near enough to excite a cell B and repeatedly or persistently takes
part in firing it, some growth process or metabolic change takes
place in one or both cells such that A's efficiency, as one of the cells
firing B, is increased."
Hebbian theory: a theory in neuroscience that proposes an
explanation for the adaptation of neurons in the brain during the
learning process.
Donald O. Hebb (1904-1985)
The Organization of Behavior (1949)
Image source: Wikipedia, Donald O. Hebb
13. Synaptic plasticity: rate-coding model
A linear firing-rate neuron with input rates $\mathbf{u} = (u_1, \dots, u_n)^\mathsf{T}$, synaptic strengths $\mathbf{w} = (w_1, \dots, w_n)^\mathsf{T}$, and output rate $v$:
$\tau \frac{dv}{dt} = -v + \mathbf{w}^\mathsf{T}\mathbf{u}$
If we consider a time scale larger than $\tau$, the output relaxes to its steady state:
$v \approx \mathbf{w}^\mathsf{T}\mathbf{u}$
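A minimal numerical sketch (all values illustrative, not from the slides) confirming that the rate dynamics relax to the steady state $v \approx \mathbf{w}^\mathsf{T}\mathbf{u}$:

% Euler integration of tau*dv/dt = -v + w'*u (illustrative values)
tau = 20/1000; dt = 0.1/1000;
w = [0.5; 0.3; 0.2];              % synaptic strengths
u = [10; 20; 5];                  % input rates
v = 0;
for n = 1:2000                    % 200 ms >> tau
    v = v + (dt/tau)*(-v + w'*u);
end
fprintf('v = %.3f, w''*u = %.3f\n', v, w'*u);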
14. Hebbian plasticity in equation.
Hebbian learning with input vector $\mathbf{u}$ and output $v$.
Vector form:
$\Delta \mathbf{w} = \eta\, v\, \mathbf{u}$
Or component form:
$\Delta w_i = \eta\, v\, u_i, \qquad i = 1, \dots, n$
If the membrane dynamics is fast compared to the timescale of
synaptic plasticity, the output is approximated as $v = \mathbf{w}^\mathsf{T}\mathbf{u}$.
Then the Hebbian rule reads:
$\Delta \mathbf{w} = \eta\, \mathbf{u}\mathbf{u}^\mathsf{T}\mathbf{w}$
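A one-step sketch of this update in the fast-membrane approximation (learning rate and vectors illustrative):

eta = 0.01;
u = [10; 20; 5]; w = [0.5; 0.3; 0.2];
v = w'*u;                 % fast membrane: v = w'*u
w = w + eta*v*u;          % Hebbian update, dw = eta*v*u = eta*(u*u')*w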
15. This form of learning rule is unstable.
Hebbian learning with a single input $\mathbf{u}$:
$\Delta \mathbf{w} = \eta\, \mathbf{u}\mathbf{u}^\mathsf{T}\mathbf{w}$
Hebbian learning with an input ensemble:
$\langle \Delta \mathbf{w} \rangle = \eta\, \langle \mathbf{u}\mathbf{u}^\mathsf{T} \rangle\, \mathbf{w} = \eta\, C\mathbf{w}$
Covariance matrix of (zero-mean) random inputs:
$C = \langle \mathbf{u}\mathbf{u}^\mathsf{T} \rangle$
If the inputs $\mathbf{u}_1, \dots, \mathbf{u}_n$ are i.i.d., their sample covariance matrix is called a
Wishart matrix (Wishart, 1928).
All eigenvalues of a Wishart matrix are non-negative (Exercise 1).
16. This form of learning rule is unstable.
Eigenvalue decomposition:
$C\mathbf{e}_i = \lambda_i \mathbf{e}_i, \quad i = 1, \dots, n, \qquad \lambda_1 \ge \dots \ge \lambda_n \ge 0$
(all eigenvalues of a Wishart matrix are non-negative).
The eigenvectors form a basis of the $n$-dimensional space, so the weight
vector may be decomposed into the eigenvectors:
$\mathbf{w} = \sum_i a_i \mathbf{e}_i$
Then the Hebbian learning rule $\Delta \mathbf{w} = \eta\, C\mathbf{w}$ reduces to
$\Delta a_i = \eta\, \lambda_i\, a_i$
Therefore each $a_i$ with $\lambda_i > 0$ grows exponentially, eventually diverging to infinity.
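A short simulation of the ensemble-averaged rule $\Delta \mathbf{w} = \eta C \mathbf{w}$, using a random sample covariance as a stand-in for $C$, showing the exponential growth of $\|\mathbf{w}\|$:

rng(1);
n = 5; U = randn(n, 1000);
C = (U*U')/1000;                 % sample covariance, eigenvalues >= 0
eta = 0.1; w = randn(n, 1);
for k = 1:100
    w = w + eta*C*w;             % averaged Hebbian update
end
norm(w)                          % has grown roughly like (1 + eta*lambda_1)^100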
17. Covariance matrix of input has non-negative eigenvalues.
Covariance matrix of random inputs: $C = \langle \mathbf{u}\mathbf{u}^\mathsf{T} \rangle$.
For any non-zero vector $\mathbf{x}$:
$\mathbf{x}^\mathsf{T} C\, \mathbf{x} = \mathbf{x}^\mathsf{T} \langle \mathbf{u}\mathbf{u}^\mathsf{T} \rangle \mathbf{x} = \langle (\mathbf{u}^\mathsf{T}\mathbf{x})^2 \rangle \ge 0$
If the vector is decomposed in terms of the eigenvectors, $\mathbf{x} = \sum_{i=1}^n a_i \mathbf{e}_i$, then
$\mathbf{x}^\mathsf{T} C\, \mathbf{x} = \sum_{i,j=1}^n a_i a_j\, \mathbf{e}_i^\mathsf{T} C\, \mathbf{e}_j = \sum_{i,j=1}^n a_i a_j \lambda_j \delta_{ij} = \sum_{i=1}^n \lambda_i a_i^2$
For any $\{a_i\}$ this quantity must be non-negative; therefore the
eigenvalues $\{\lambda_i\}$ must be non-negative, too.
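The claim is easy to check numerically for a sample covariance matrix:

rng(2);
U = randn(4, 500);              % random inputs (zero mean)
C = (U*U')/500;                 % sample covariance matrix
min(eig(C))                     % non-negative (up to floating-point error)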
18. Generalization of Hebbian learning.
Covariance learning (Sejnowski (1977) Biophys J):
$\Delta \mathbf{w} = \eta\, (\mathbf{u} - \langle \mathbf{u} \rangle)(v - \langle v \rangle)$
Synaptic weights change if pre- and postsynaptic activities are positively
correlated.
BCM rule (Bienenstock, Cooper & Munro (1982) J Neurosci):
$\Delta \mathbf{w} = \eta\, \mathbf{u}\, v\, (v - \theta_M)$
[Figure: the BCM nonlinearity $\phi(v)$ as a function of $v$]
Synaptic plasticity depends linearly on presynaptic activity and
nonlinearly on postsynaptic activity (thresholding).
The threshold value changes according to postsynaptic activity (homeostasis).
19. Generalization of Hebbian learning.
BCM rule (Bienenstock, Cooper & Munro (1982) J Neurosci):
$\Delta \mathbf{w} = \eta\, \mathbf{u}\, v\, (v - \theta_M)$
With the sliding threshold
$\theta_M = \langle v^2 \rangle,$
the rule becomes
$\Delta \mathbf{w} = \eta\, \mathbf{u}\, v^2 (1 - v)$
There is only one stable fixed point, at $v = 1$.
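A sketch of the BCM rule with the sliding threshold estimated by a running average of $v^2$ (all constants illustrative); the output settles near the stable fixed point $v = 1$:

eta = 1e-4; tau_theta = 10;                    % illustrative constants
u = [10; 20; 5]; w = 0.005*ones(3,1); theta = 0;
for k = 1:20000
    v = w'*u;                                  % fast-membrane output
    theta = theta + (v^2 - theta)/tau_theta;   % sliding threshold <v^2>
    w = w + eta*u*v*(v - theta);               % BCM update
end
disp(w'*u)                                     % close to 1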
20. Weight normalization: additive or multiplicative.
Hebbian learning, $\Delta \mathbf{w} = \eta\, v\, \mathbf{u}$, is inherently unstable.
One way to avoid this instability (i.e., divergence) is to impose a
constraint on the weight vector $\mathbf{w}$.
Additive (subtractive) normalization keeps $\sum_i w_i = 1$:
$\Delta w_i = \eta\, v\, u_i - \frac{\eta\, v}{n} \sum_j u_j$
Multiplicative normalization keeps $\|\mathbf{w}\| = 1$:
$\mathbf{w}(t+1) = \frac{\mathbf{w}(t) + \Delta\mathbf{w}(t)}{\|\mathbf{w}(t) + \Delta\mathbf{w}(t)\|}$
Oja (1982) J Math Biol
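A sketch of one Hebbian step followed by each normalization (illustrative values):

eta = 0.01; u = [10; 20; 5]; w = [0.5; 0.3; 0.2];
v = w'*u; dw = eta*v*u;
w_add = w + dw - mean(dw);           % subtractive: keeps sum(w) unchanged
w_mul = (w + dw)/norm(w + dw);       % multiplicative: keeps norm(w) = 1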
21. Oja learning rule as a principal component analyzer.
Oja learning rule in discrete time (Oja (1982) J Math Biol). Expanding the
multiplicative normalization to first order in $\eta$ (assuming $\|\mathbf{w}\| = 1$):
$\mathbf{w}(t+1) = \frac{\mathbf{w} + \eta\, v\, \mathbf{u}}{\|\mathbf{w} + \eta\, v\, \mathbf{u}\|} \approx \mathbf{w} + \eta\, v\, (\mathbf{u} - v\, \mathbf{w})$
that is,
$\mathbf{w}(t+1) = \mathbf{w}(t) + \eta\, v(t)\, \big(\mathbf{u}(t) - v(t)\, \mathbf{w}(t)\big)$
Oja learning rule in continuous time:
$\frac{d\mathbf{w}}{dt} = \eta\, v\, (\mathbf{u} - v\, \mathbf{w})$
Averaging over the input ensemble (with $v = \mathbf{w}^\mathsf{T}\mathbf{u}$):
$\frac{d\mathbf{w}}{dt} = \eta\, \big(C\mathbf{w} - (\mathbf{w}^\mathsf{T} C\, \mathbf{w})\, \mathbf{w}\big)$
22. Oja learning rule as a principal component analyzer.
Oja (1982) J Math Biol
$\frac{d\mathbf{w}}{dt} = \eta\, \big(C\mathbf{w} - (\mathbf{w}^\mathsf{T} C\, \mathbf{w})\, \mathbf{w}\big)$
Eigenvector decomposition: $\mathbf{w} = \sum_i a_i \mathbf{e}_i$, with $C\mathbf{e}_i = \lambda_i \mathbf{e}_i$ ($i = 1, \dots, n$) and $\lambda_1 \ge \dots \ge \lambda_n \ge 0$. Then
$\frac{da_i}{dt} = \eta\, \Big(\lambda_i a_i - a_i \sum_{j=1}^n \lambda_j a_j^2\Big)$
Defining $b_i \equiv a_i / a_1$, the quotient rule gives
$\frac{db_i}{dt} = \eta\, (\lambda_i - \lambda_1)\, b_i$
so $b_i \to 0$ for $i \ge 2$ (when $\lambda_1 > \lambda_i$).
$\therefore\; a_1 \to \text{const}, \quad a_i \to 0 \;(i = 2, \dots, n)$: the weight vector converges to the principal eigenvector $\mathbf{e}_1$.
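A sketch verifying that the Oja rule extracts the first principal component (random data with an assumed dominant direction):

rng(3);
n = 5; T = 2000;
U = randn(n, T); U(1,:) = 3*U(1,:);     % make direction 1 dominant
U = U - mean(U, 2);
C = (U*U')/T;
eta = 0.01; w = 0.1*randn(n, 1);
for k = 1:20*T
    u = U(:, mod(k-1, T) + 1);
    v = w'*u;
    w = w + eta*v*(u - v*w);            % Oja rule
end
[E, D] = eig(C); [~, imax] = max(diag(D));
disp(abs(w'*E(:, imax)))                % ~1: aligned with principal eigenvector
disp(norm(w))                           % ~1: implicit normalization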
23. Modeling synapses: conductance-based model.
$\tau_m \frac{dV}{dt} = -(V - V_\mathrm{rest}) + g_\mathrm{ex}(t)\,(E_\mathrm{ex} - V) + g_\mathrm{in}(t)\,(E_\mathrm{in} - V)$
[Figure: LIF neuron receiving excitatory and inhibitory synapses]
Gerstner (2014) Neuronal Dynamics, Chapter 3
Exponential with one decay time constant:
$g_\mathrm{syn}(t) = \bar g_\mathrm{syn} \sum_f e^{-(t - t^f)/\tau_\mathrm{syn}}\, \Theta(t - t^f)$
Exponentials with one rise and two decay time constants:
$g_\mathrm{syn}(t) = \bar g_\mathrm{syn} \sum_f \big(1 - e^{-(t - t^f)/\tau_\mathrm{rise}}\big)\big(a\, e^{-(t - t^f)/\tau_\mathrm{fast}} + (1 - a)\, e^{-(t - t^f)/\tau_\mathrm{slow}}\big)\, \Theta(t - t^f)$
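A sketch plotting the two conductance time courses for a single spike at $t^f = 0$ (time constants illustrative, in the ranges quoted on the next slide):

t = (0:0.05:400)/1000;                           % 0-400 ms
tau_syn = 5/1000;
g1 = exp(-t/tau_syn);                            % single decay
tr = 25/1000; tf = 150/1000; ts = 700/1000; a = 0.8;
g2 = (1 - exp(-t/tr)).*(a*exp(-t/tf) + (1 - a)*exp(-t/ts));
plot(t*1000, g1, t*1000, g2);
xlabel('t - t^f (ms)'); legend('one decay', 'rise + two decays');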
25. Modeling synapses: conductance-based model.
Gerstner (2014) Neuronal Dynamics, Chapter 3
The same two conductance models as above, with typical inhibitory time constants:
[Figure: excitatory and inhibitory conductance time courses]
GABA$_\mathrm{A}$ (fast): $\tau_\mathrm{rise} \approx 1$ ms, $\tau_\mathrm{fast} \approx 6$ ms
GABA$_\mathrm{B}$ (slow): $\tau_\mathrm{rise} \approx 25$–$50$ ms, $\tau_\mathrm{fast} \approx 100$–$300$ ms, $\tau_\mathrm{slow} \approx 500$–$1000$ ms
26. Modeling synapses: conductance-based model.
$\tau_m \frac{dV}{dt} = -(V - V_\mathrm{rest}) + g_\mathrm{ex}(t)\,(E_\mathrm{ex} - V) + g_\mathrm{in}(t)\,(E_\mathrm{in} - V)$
[Figure: LIF neuron receiving excitatory and inhibitory synapses]
Dynamics of conductance:
$\tau_\mathrm{ex} \frac{dg_\mathrm{ex}}{dt} = -g_\mathrm{ex}$, with $g_\mathrm{ex}(t) \leftarrow g_\mathrm{ex}(t) + \bar g_\mathrm{ex}$ at each excitatory input spike
$\tau_\mathrm{in} \frac{dg_\mathrm{in}}{dt} = -g_\mathrm{in}$, with $g_\mathrm{in}(t) \leftarrow g_\mathrm{in}(t) + \bar g_\mathrm{in}$ at each inhibitory input spike
Synaptic plasticity describes how the peak conductances of excitatory and
inhibitory synapses are modified in an activity-dependent manner.
Song et al. (2000) Nature Neurosci
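A minimal Euler-integration sketch of the conductance-based LIF with jump-and-decay conductances (parameter values borrowed from the exercise code on slide 48; input spikes replaced by placeholder Bernoulli draws):

taum = 20/1000; Vrest = -70; Eex = 0; Ein = -70; Vth = -54; Vreset = -60;
tauex = 5/1000; tauin = 5/1000; dt = 0.1/1000;
V = Vrest; gex = 0; gin = 0;
for n = 1:20000                                     % 2 s of simulated time
    gex = gex*(1 - dt/tauex) + 0.015*(rand < 0.1);  % decay + spike increments
    gin = gin*(1 - dt/tauin) + 0.05*(rand < 0.02);
    V = V + (dt/taum)*((Vrest - V) + gex*(Eex - V) + gin*(Ein - V));
    if V > Vth, V = Vreset; end                     % fire and reset
end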
28. STDP in equations.
Sjöström & Gerstner, Scholarpedia, 5(2):1362. doi:10.4249/scholarpedia.1362
$\Delta w_{ij} = \sum_{n:\,\text{post spikes}} \; \sum_{f:\,\text{pre spikes}} W(t_i^n - t_j^f)$
$W(t) = \begin{cases} A_+ \exp(-t/\tau_+) & \text{for } t > 0 \\ -A_- \exp(t/\tau_-) & \text{for } t < 0 \end{cases}$
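The STDP window as a function handle (parameter values as in the exercise code on slide 48):

Ap = 0.005; An = 1.05*Ap; taup = 20/1000; taun = 20/1000;
W = @(t) (t > 0).*Ap.*exp(-t./taup) - (t < 0).*An.*exp(t./taun);
t = (-100:0.1:100)/1000;
plot(t*1000, W(t)); xlabel('t_i^n - t_j^f (ms)'); ylabel('W');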
29. Online implementation of STDP learning
Sjöström & Gerstner, Scholarpedia, 5(2):1362. doi:10.4249/scholarpedia.1362
$\tau_+ \frac{dx_j}{dt} = -x_j + a_+(x_j) \sum_{f:\,\text{presynaptic spikes}} \delta(t - t_j^f)$
$\tau_- \frac{dy_i}{dt} = -y_i + a_-(y_i) \sum_{n:\,\text{postsynaptic spikes}} \delta(t - t_i^n)$
$x_j$: presynaptic trace of neuron $j$ ("remembering when presynaptic neuron $j$ spikes")
$y_i$: postsynaptic trace of neuron $i$ ("remembering when postsynaptic neuron $i$ spikes")
$\frac{dw_{ij}}{dt} = A_+(w_{ij})\, x_j(t) \sum_{n:\,\text{postsynaptic spikes}} \delta(t - t_i^n) - A_-(w_{ij})\, y_i(t) \sum_{f:\,\text{presynaptic spikes}} \delta(t - t_j^f)$
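A discrete-time sketch of the trace-based updates (all-to-all interaction, weight-independent $A_\pm$; Poisson spikes at illustrative rates):

dt = 0.1/1000; taup = 20/1000; taun = 20/1000;
Ap = 0.005; An = 1.05*Ap;
Npre = 10; x = zeros(Npre, 1); y = 0; w = 0.01*ones(Npre, 1);
for n = 1:20000
    pre  = rand(Npre, 1) < 10*dt;     % presynaptic Poisson spikes (10 Hz)
    post = rand < 10*dt;              % postsynaptic Poisson spikes (10 Hz)
    x = x*(1 - dt/taup) + pre;        % presynaptic traces x_j
    y = y*(1 - dt/taun) + post;       % postsynaptic trace y_i
    w = w + Ap*x*post - An*y*pre;     % LTP at post spikes, LTD at pre spikes
end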
30. Weight dependence: hard and soft bounds.
Sjöström & Gerstner, Scholarpedia, 5(2):1362. doi:10.4249/scholarpedia.1362
Weight learning dynamics:
$\frac{dw_{ij}}{dt} = A_+(w_{ij})\, x_j(t) \sum_{n:\,\text{post}} \delta(t - t_i^n) - A_-(w_{ij})\, y_i(t) \sum_{f:\,\text{pre}} \delta(t - t_j^f)$
$A_+(w)$ and $A_-(w)$ determine the weight dependence of the STDP learning rule.
For biological reasons, the synaptic weights should be restricted to $w_\mathrm{min} < w < w_\mathrm{max}$.
Hard-bound rule:
$A_+(w) = \eta_+\, \Theta(w_\mathrm{max} - w), \qquad A_-(w) = \eta_-\, \Theta(w)$
(Linear) soft-bound rule:
$A_+(w) = \eta_+\, (w_\mathrm{max} - w), \qquad A_-(w) = \eta_-\, w$
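The two choices written as function handles ($\eta_\pm$ and $w_\mathrm{max}$ illustrative):

etap = 0.005; etan = 0.00525; wmax = 0.015;
Ap_hard = @(w) etap*(w < wmax);      % eta+ * Theta(wmax - w)
An_hard = @(w) etan*(w > 0);         % eta- * Theta(w)
Ap_soft = @(w) etap*(wmax - w);      % eta+ * (wmax - w)
An_soft = @(w) etan*w;               % eta- * w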
31. Temporal all-to-all versus nearest-neighbor spike
interaction.
Sjöström & Gerstner, Scholarpedia, 5(2):1362. doi:10.4249/scholarpedia.1362
Synaptic trace dynamics:
$\tau_+ \frac{dx_j}{dt} = -x_j + a_+(x_j) \sum_{f:\,\text{pre}} \delta(t - t_j^f), \qquad \tau_- \frac{dy_i}{dt} = -y_i + a_-(y_i) \sum_{n:\,\text{post}} \delta(t - t_i^n)$
$a_+(x)$ determines how much the trace is incremented by each spike.
All-to-all interaction: $a_+(x) = 1$. All spikes contribute additively to the
trace, and the trace is not upper-bounded.
Nearest-neighbor interaction: $a_+(x) = 1 - x$. Only the most recent spike
contributes to the trace, which is reset to 1 at each spike and hence
upper-bounded by 1.
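The difference amounts to one line in an implementation:

x = 0.3;                  % trace value just before a presynaptic spike
x_all  = x + 1;           % all-to-all: a+(x) = 1 (unbounded)
x_near = x + (1 - x);     % nearest-neighbor: a+(x) = 1 - x, i.e. reset to 1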
32. Additive vs multiplicative STDP.
van Rossum et al. (2000) J Neurosci.
Additive STDP: potentiation and depression are independent of the current
weight value:
$W(t) = \begin{cases} A_+ \exp(-t/\tau_+) & \text{for } t > 0 \\ -A_- \exp(t/\tau_-) & \text{for } t < 0 \end{cases}$
Multiplicative STDP: depression is weight-dependent in a multiplicative way,
so a large synapse gets depressed more and a weak synapse less:
$W(t) = \begin{cases} A_+ \exp(-t/\tau_+) & \text{for } t > 0 \\ -A_-\, w\, \exp(t/\tau_-) & \text{for } t < 0 \end{cases}$
33. Triplet law: three-spike interaction
Dynamics of two presynaptic and two postsynaptic traces (time constants $\tau_+, \tau_x$ and $\tau_-, \tau_y$):
$\tau_+ \frac{dx_1}{dt} = -x_1$; if $t = t^\mathrm{pre}$ then $x_1 \leftarrow x_1 + 1$
$\tau_x \frac{dx_2}{dt} = -x_2$; if $t = t^\mathrm{pre}$ then $x_2 \leftarrow x_2 + 1$
$\tau_- \frac{dy_1}{dt} = -y_1$; if $t = t^\mathrm{post}$ then $y_1 \leftarrow y_1 + 1$
$\tau_y \frac{dy_2}{dt} = -y_2$; if $t = t^\mathrm{post}$ then $y_2 \leftarrow y_2 + 1$
Weight update:
$\Delta w(t) = -\delta(t - t^\mathrm{pre})\,\big[A_2^- y_1(t) + A_3^- x_2(t - \epsilon)\, y_1(t)\big] + \delta(t - t^\mathrm{post})\,\big[A_2^+ x_1(t) + A_3^+ y_2(t - \epsilon)\, x_1(t)\big]$
The four terms implement post-pre LTD ($A_2^-$), pre-post-pre LTD ($A_3^-$),
pre-post LTP ($A_2^+$), and post-pre-post LTP ($A_3^+$).
Pfister & Gerstner (2006) J Neurosci
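A discrete-time sketch of the triplet rule (amplitudes and time constants illustrative; updating the traces after the weight change implements the $t - \epsilon$ convention):

dt = 0.1/1000;
A2p = 5e-3; A3p = 6e-3; A2n = 7e-3; A3n = 2e-4;        % illustrative amplitudes
taup = 17/1000; taux = 100/1000; taun = 34/1000; tauy = 114/1000;
x1 = 0; x2 = 0; y1 = 0; y2 = 0; w = 0.5;
for n = 1:20000
    pre  = rand < 10*dt;  post = rand < 10*dt;         % Poisson spikes (10 Hz)
    w = w - pre *(A2n*y1 + A3n*x2*y1) ...              % LTD at pre spikes
          + post*(A2p*x1 + A3p*y2*x1);                 % LTP at post spikes
    x1 = x1*(1 - dt/taup) + pre;  x2 = x2*(1 - dt/taux) + pre;
    y1 = y1*(1 - dt/taun) + post; y2 = y2*(1 - dt/tauy) + post;
end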
35. Relation of STDP to other learning rules.
• STDP and rate-based Hebbian learning rules
Kempter, R., Gerstner, W., & van Hemmen, J. L. (1999). Hebbian learning and spiking
neurons. Physical Review E, 59(4), 4498.
• STDP and Bienenstock-Cooper-Munro (BCM) rule
Izhikevich, E. M., & Desai, N. S. (2003). Relating STDP to BCM. Neural Computation,
15(7), 1511-1523.
Pfister, J. P., & Gerstner, W. (2006). Triplets of spikes in a model of spike timing-
dependent plasticity. The Journal of Neuroscience, 26(38), 9673-9682.
• STDP and temporal-difference learning rule
Rao, R. P., & Sejnowski, T. J. (2001). Spike-timing-dependent Hebbian plasticity as
temporal difference learning. Neural Computation, 13(10), 2221-2237.
Exercise 2
39. Functional consequence: latent pattern detection
Masquelier et al. (2008) PLoS One; (2009) Neural Comput
Spike response model (SRM): membrane potential in integral form:
$v_i(t) = \eta(t - \hat t_i) + \sum_j w_j\, \epsilon(t - t_j)$
where $\eta(\cdot)$ is the action-potential kernel (anchored at the last postsynaptic
spike $\hat t_i$) and $\epsilon(\cdot)$ is the synaptic-potential kernel (triggered by
presynaptic spikes $t_j$).
Spike-timing-dependent plasticity for presynaptic spike $t_j$ and postsynaptic spike $t_i$:
$w_j \to w_j + A_+\, e^{-(t_i - t_j)/\tau_+} \quad \text{if } t_j < t_i$
$w_j \to w_j - A_-\, e^{-(t_j - t_i)/\tau_-} \quad \text{if } t_j > t_i$
43. Tripartite synaptic plasticity
[Figure: songbird circuit with HVC, LMAN, and RA]
$\frac{dW_{ij}}{dt} = \eta\, R(t)\, e_{ij}(t)$
where the eligibility trace $e_{ij}(t)$ is a temporal integral ($\int_0^t dt'\, G(t - t') \cdots$)
of the product of presynaptic activity $s_j$ and the postsynaptic perturbation.
Fiete & Seung (2007) J Neurophysiol.
Exercise 3
This tripartite learning rule indeed leads to reward maximization.
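A purely schematic sketch of the tripartite structure, a global reward signal gating a local eligibility trace. The specific eligibility of Fiete & Seung (2007) (conductance perturbations) is replaced here by a generic filtered pre-post co-activity, so this shows the structure of the rule, not their exact model:

dt = 1/1000; tau_e = 0.5; eta = 0.01;
e = 0; W = 0;
for n = 1:10000
    pre = rand < 10*dt; post = rand < 10*dt;   % placeholder spike trains
    e = e*(1 - dt/tau_e) + pre*post;           % eligibility trace e_ij(t)
    R = double(rand < 0.001);                  % placeholder reward R(t)
    W = W + eta*R*e;                           % dW = eta * R(t) * e_ij(t)
end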
44. Summary
• Synaptic plasticity refers to the activity-dependent change of
synaptic weights between neurons; it is the physiological basis
of learning and memory.
• Hebbian learning: “Fire together, wire together.”
• Synaptic plasticity may be formulated in terms of rate
coding or spike-timing coding.
• Synaptic plasticity is determined not only by the activity of the
two connected neurons but is also modulated by other
factors (e.g., reward, homeostasis).
45. Exercises
1. Prove that all eigenvalues of a Wishart matrix are
non-negative (i.e., that the matrix is positive semidefinite).
2. Read the following paper:
Kempter, R., Gerstner, W., & Van Hemmen, J. L. (1999). Hebbian learning and spiking
neurons. Physical Review E, 59(4), 4498.
From the additive STDP learning rule, derive the
following rate-based Hebbian learning rule, where $f_i$ and $f_j$ are
the pre- and postsynaptic firing rates, respectively:
$\Delta w_{ij} = \alpha f_i f_j + \beta f_j$
3. Read the following paper:
Fiete, I. R., & Seung, H. S. (2006). Gradient learning in spiking neural networks by
dynamic perturbation of conductances. Physical review letters, 97(4), 048104.
Prove that the learning rule on slide 43 can be derived as
a consequence of reward maximization.
46. Exercises: Code Implementation of Song et al. (2000)
Membrane dynamics:
$\tau_m \frac{dV(t)}{dt} = (V_\mathrm{rest} - V(t)) + g_\mathrm{ex}(t)\,(E_\mathrm{ex} - V(t)) + g_\mathrm{in}(t)\,(E_\mathrm{in} - V(t))$
Conductance dynamics:
$\tau_\mathrm{ex} \frac{dg_\mathrm{ex}(t)}{dt} = -g_\mathrm{ex}(t), \qquad \tau_\mathrm{in} \frac{dg_\mathrm{in}(t)}{dt} = -g_\mathrm{in}(t)$
$g_\mathrm{ex}(t) \to g_\mathrm{ex}(t) + \bar g_a$ when the $a$-th excitatory input arrives
$g_\mathrm{in}(t) \to g_\mathrm{in}(t) + \bar g_\mathrm{in}$ when any inhibitory input arrives
Goal: Implement the STDP rule in Song, Miller & Abbott (2000).
47. Exercises: Code Implementation of Song et al. (2000)
Synaptic traces:
$\tau_- \frac{dM(t)}{dt} = -M(t), \qquad \tau_+ \frac{dP_a(t)}{dt} = -P_a(t)$
STDP for presynaptic firing (when the $a$-th excitatory input arrives):
$P_a \to P_a + A_+$
$\bar g_a \to \max\big(\bar g_a + M(t)\, \bar g_\mathrm{max},\; 0\big)$
STDP for postsynaptic firing (when the output neuron fires):
$M \to M - A_-$
$\bar g_a \to \min\big(\bar g_a + P_a(t)\, \bar g_\mathrm{max},\; \bar g_\mathrm{max}\big)$
48. Exercises: Code Implementation of Song et al. (2000)
%% parameter setting:
% LIF-neuron parameters:
taum = 20/1000;
Vrest = -70;
Eex = 0;
Ein = -70;
Vth = -54;
Vreset = -60;
% synapse parameters:
Nex = 1000;
Nin = 200;
tauex = 5/1000;
tauin = 5/1000;
gmaxin = 0.05;
gmaxex = 0.015;
% STDP parameters:
Ap = 0.005;
An = Ap*1.05;
taup = 20/1000;
taun = 20/1000;
%simulation parameters:
dt = 0.1/1000;
T = 200;
t = 0:dt:T;
% input firing rates:
Fex = randi([10 30], 1, Nex);
Fin = 10*ones(1,Nin);
%% simulation:
V = zeros(length(t), 1);
M = zeros(length(t), 1);
P = zeros(length(t), Nex);
gex = zeros(length(t), 1);
gin = zeros(length(t), 1);
V(1) = Vreset;
ga = zeros(length(t), Nex);
ga(1,:) = gmaxex*ones(1,Nex);
disp('Now simulating LIF neuron ...');
tic;
for n=1:length(t)-1
% WRITE YOUR CODE HERE:
end
toc;
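One possible sketch of the loop body, following the update rules on slides 46-47 (Euler steps; the Poisson input generation and array bookkeeping are our choices, not prescribed by the exercise; paste inside the for-loop of the script above):

% inside "for n = 1:length(t)-1":
prespk = rand(1, Nex) < Fex*dt;                 % Poisson excitatory inputs
inspk  = rand(1, Nin) < Fin*dt;                 % Poisson inhibitory inputs
gex(n+1) = gex(n)*(1 - dt/tauex) + sum(ga(n, prespk));
gin(n+1) = gin(n)*(1 - dt/tauin) + gmaxin*sum(inspk);
V(n+1) = V(n) + (dt/taum)*((Vrest - V(n)) + gex(n)*(Eex - V(n)) + gin(n)*(Ein - V(n)));
M(n+1) = M(n)*(1 - dt/taun);                    % decay of depression trace
P(n+1,:) = P(n,:)*(1 - dt/taup);                % decay of potentiation traces
ga(n+1,:) = ga(n,:);
P(n+1, prespk) = P(n+1, prespk) + Ap;           % presynaptic firing:
ga(n+1, prespk) = max(ga(n+1, prespk) + M(n+1)*gmaxex, 0);        % depress
if V(n+1) > Vth                                 % postsynaptic firing:
    V(n+1) = Vreset;
    M(n+1) = M(n+1) - An;
    ga(n+1,:) = min(ga(n+1,:) + P(n+1,:)*gmaxex, gmaxex);         % potentiate
end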