eprop_iaf – Current-based leaky integrate-and-fire neuron model with delta-shaped postsynaptic currents for e-prop plasticity

Description

eprop_iaf is an implementation of a leaky integrate-and-fire neuron model with delta-shaped postsynaptic currents used for eligibility propagation (e-prop) plasticity.

E-prop plasticity was originally introduced and implemented in TensorFlow in [1].

Note

The neuron dynamics of the eprop_iaf model (excluding e-prop plasticity) are similar to the neuron dynamics of the iaf_psc_delta model, with minor differences, such as the propagator of the postsynaptic current and the voltage reset upon a spike.

The membrane voltage time course \(v_j^t\) of the neuron \(j\) is given by:

\[\begin{split}v_j^t &= \alpha v_j^{t-1} + \zeta \sum_{i \neq j} W_{ji}^\text{rec} z_i^{t-1} + \zeta \sum_i W_{ji}^\text{in} x_i^t - z_j^{t-1} v_\text{th} \,, \\ \alpha &= e^{ -\frac{ \Delta t }{ \tau_\text{m} } } \,, \\ \zeta &= \begin{cases} 1 \\ 1 - \alpha \end{cases} \,, \\\end{split}\]

where \(W_{ji}^\text{rec}\) and \(W_{ji}^\text{in}\) are the recurrent and input synaptic weight matrices, and \(z_i^{t-1}\) is the recurrent presynaptic state variable, while \(x_i^t\) represents the input at time \(t\).

Descriptions of further parameters and variables can be found in the table below.
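As a rough illustration, the membrane-voltage update above can be written in a few lines of Python. All numbers here are illustrative rather than the model defaults, and the weighted sums over \(W^\text{rec} z\) and \(W^\text{in} x\) are collapsed into two scalar drives:

```python
import math

dt = 1.0      # time step Delta t (ms)
tau_m = 10.0  # membrane time constant tau_m (ms)
v_th = 1.0    # threshold, here expressed relative to the leak potential

alpha = math.exp(-dt / tau_m)
zeta = 1.0 - alpha  # one of the two normalization choices for zeta

def update_v(v_prev, z_prev, rec_drive, in_drive):
    """One step of v^t = alpha*v^{t-1} + zeta*(rec + in) - z^{t-1}*v_th."""
    return alpha * v_prev + zeta * (rec_drive + in_drive) - z_prev * v_th

v, z = 0.0, 0.0
for _ in range(5):
    v = update_v(v, z, rec_drive=0.5, in_drive=0.5)
    z = 1.0 if v >= v_th else 0.0
```

With a constant total drive of 1.0, the voltage relaxes exponentially toward that drive and stays subthreshold for the first few steps.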

The spike state variable is expressed by a Heaviside function:

\[\begin{split}z_j^t = H \left( v_j^t - v_\text{th} \right) \,. \\\end{split}\]

If the membrane voltage crosses the threshold voltage \(v_\text{th}\), a spike is emitted and the membrane voltage is reduced by \(v_\text{th}\) in the next time step. After the time step of the spike emission, the neuron is not able to spike for an absolute refractory period \(t_\text{ref}\).

An additional state variable and the corresponding differential equation represent a piecewise constant external current.

See the documentation on the iaf_psc_delta neuron model for more information on the integration of the subthreshold dynamics.
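Threshold crossing, reset by subtraction, and the refractory period can be sketched in the same style. The numbers are again illustrative, and the handling of the membrane voltage during refractoriness is simplified:

```python
import math

dt = 1.0
tau_m = 10.0
v_th = 0.3
t_ref = 2.0                         # refractory period (ms)
ref_steps = int(round(t_ref / dt))

alpha = math.exp(-dt / tau_m)
zeta = 1.0 - alpha

v, z, ref_count = 0.0, 0.0, 0
spikes = []
for step in range(30):
    # z from the previous step subtracts v_th (reset by subtraction)
    v = alpha * v + zeta * 1.0 - z * v_th
    if ref_count > 0:
        ref_count -= 1              # no spiking while refractory
        z = 0.0
    else:
        z = 1.0 if v >= v_th else 0.0
        if z:
            spikes.append(step)
            ref_count = ref_steps
```

Under this constant drive the neuron settles into a regular firing pattern, with consecutive spikes separated by more than the refractory period.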

The change of the synaptic weight is calculated from the gradient \(g^t\) of the loss \(E^t\) with respect to the synaptic weight \(W_{ji}\), that is, from \(\frac{ \text{d} E^t }{ \text{d} W_{ji} }\). This gradient depends on the presynaptic spikes \(z_i^{t-2}\) and the surrogate gradient or pseudo-derivative of the spike state variable with respect to the postsynaptic membrane voltage \(\psi_j^{t-1}\) (the product of which forms the eligibility trace \(e_{ji}^{t-1}\)), as well as on the learning signal \(L_j^t\) emitted by the readout neurons.

Surrogate gradients help overcome the challenge of the spiking function’s non-differentiability, facilitating the use of gradient-based learning techniques such as e-prop. The non-existent derivative of the spiking variable with respect to the membrane voltage, \(\frac{\partial z^t_j}{ \partial v^t_j}\), can be effectively replaced with a variety of surrogate gradient functions, as detailed in various studies (see, e.g., [3]). NEST currently provides four different surrogate gradient functions:

  1. A piecewise linear function used among others in [1]:

\[\begin{split}\psi_j^t = \frac{ \gamma }{ v_\text{th} } \text{max} \left( 0, 1-\beta \left| \frac{ v_j^t - v_\text{th} }{ v_\text{th} }\right| \right) \,. \\\end{split}\]
  2. An exponential function used in [4]:

\[\begin{split}\psi_j^t = \gamma \exp \left( -\beta \left| v_j^t - v_\text{th} \right| \right) \,. \\\end{split}\]
  3. The derivative of a fast sigmoid function used in [5]:

\[\begin{split}\psi_j^t = \gamma \left( 1 + \beta \left| v_j^t - v_\text{th} \right| \right)^{-2} \,. \\\end{split}\]
  4. An arctan function used in [6]:

\[\begin{split}\psi_j^t = \frac{\gamma}{\pi} \frac{1}{ 1 + \left( \beta \pi \left( v_j^t - v_\text{th} \right) \right)^2 } \,. \\\end{split}\]
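The four functions can be written out as plain Python, with beta and gamma as in the parameter table below. Note that the derivative of a fast sigmoid decays as the inverse square of the distance to threshold, and that v_th is taken positive in this sketch so the piecewise-linear prefactor gamma / v_th stays positive:

```python
import math

def piecewise_linear(v, v_th, beta=1.0, gamma=0.3):
    return gamma / v_th * max(0.0, 1.0 - beta * abs((v - v_th) / v_th))

def exponential(v, v_th, beta=1.0, gamma=0.3):
    return gamma * math.exp(-beta * abs(v - v_th))

def fast_sigmoid_derivative(v, v_th, beta=1.0, gamma=0.3):
    return gamma / (1.0 + beta * abs(v - v_th)) ** 2

def arctan(v, v_th, beta=1.0, gamma=0.3):
    return gamma / math.pi / (1.0 + (beta * math.pi * (v - v_th)) ** 2)

# All four peak at v = v_th and fall off with distance from threshold.
```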

In the interval between two presynaptic spikes, the gradient is calculated at each time step until the cutoff time point. This computation occurs over the time range:

\(t \in \left[ t_\text{spk,prev}, \min \left( t_\text{spk,prev} + \Delta t_\text{c}, t_\text{spk,curr} \right) \right]\).

Here, \(t_\text{spk,prev}\) represents the time of the previous spike that passed the synapse, while \(t_\text{spk,curr}\) is the time of the current spike, which triggers the application of the learning rule and the subsequent synaptic weight update. The cutoff \(\Delta t_\text{c}\) defines the maximum allowable interval for integration between spikes. The expression for the gradient is given by:

\[\begin{split}\frac{ \text{d} E^t }{ \text{d} W_{ji} } &= L_j^t \bar{e}_{ji}^{t-1} \,, \\ e_{ji}^{t-1} &= \psi_j^{t-1} \bar{z}_i^{t-2} \,, \\\end{split}\]

The eligibility trace and the presynaptic spike trains are low-pass filtered with the following exponential kernels:

\[\begin{split}\bar{e}_{ji}^t &= \mathcal{F}_\kappa \left( e_{ji}^t \right) = \kappa \bar{e}_{ji}^{t-1} + \left( 1 - \kappa \right) e_{ji}^t \,, \\ \bar{z}_i^t &= \mathcal{F}_\alpha \left( z_{i}^t \right)= \alpha \bar{z}_i^{t-1} + \zeta z_i^t \,. \\\end{split}\]
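Both filters are one-line exponential moving averages; a minimal sketch with illustrative values:

```python
import math

dt, tau_m, kappa = 1.0, 10.0, 0.97
alpha = math.exp(-dt / tau_m)
zeta = 1.0 - alpha

def f_kappa(e_bar_prev, e):
    """F_kappa: kappa * e_bar^{t-1} + (1 - kappa) * e^t."""
    return kappa * e_bar_prev + (1.0 - kappa) * e

def f_alpha(z_bar_prev, z):
    """F_alpha: alpha * z_bar^{t-1} + zeta * z^t."""
    return alpha * z_bar_prev + zeta * z

z_bar = 0.0
for z in [1.0, 0.0, 0.0, 1.0]:  # a short presynaptic spike train
    z_bar = f_alpha(z_bar, z)
```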

Furthermore, a firing rate regularization mechanism keeps the exponential moving average of the postsynaptic neuron’s firing rate \(f_j^{\text{ema},t}\) close to a target firing rate \(f^\text{target}\). The gradient \(g_\text{reg}^t\) of the regularization loss \(E_\text{reg}^t\) with respect to the synaptic weight \(W_{ji}\) is given by:

\[\begin{split}\frac{ \text{d} E_\text{reg}^t }{ \text{d} W_{ji}} &\approx c_\text{reg} \left( f^{\text{ema},t}_j - f^\text{target} \right) \bar{e}_{ji}^t \,, \\ f^{\text{ema},t}_j &= \mathcal{F}_{\kappa_\text{reg}} \left( \frac{z_j^t}{\Delta t} \right) = \kappa_\text{reg} f^{\text{ema},t-1}_j + \left( 1 - \kappa_\text{reg} \right) \frac{z_j^t}{\Delta t} \,, \\\end{split}\]

where \(c_\text{reg}\) is a constant scaling factor.
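A sketch of the regularization gradient follows. Here c_reg is set nonzero so the term is active, all numbers are illustrative, and the firing rate is kept in units of spikes per millisecond rather than converted to Hz:

```python
dt = 1.0         # ms
kappa_reg = 0.97
c_reg = 2.0      # nonzero scaling factor (the default 0.0 disables the term)
f_target = 0.01  # target rate in spikes per ms (= 10 Hz)

f_ema = 0.0
for z in [1.0, 0.0, 0.0, 0.0, 1.0]:  # postsynaptic spike train
    f_ema = kappa_reg * f_ema + (1.0 - kappa_reg) * (z / dt)

e_bar = 0.1                                 # filtered eligibility trace
g_reg = c_reg * (f_ema - f_target) * e_bar  # regularization gradient
```

Because the averaged rate exceeds the target here, the gradient is positive, which pushes the weight down and thereby lowers the firing rate.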

The overall gradient is the sum of these two gradients.

As the last step of each iteration of the loop over the time steps \(t\), the new weight is retrieved by feeding the current gradient \(g^t\) to the optimizer (see weight_optimizer for more information on the available optimizers):

\[\begin{split}w^t = \text{optimizer} \left( t, g^t, w^{t-1} \right) \,. \\\end{split}\]
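As a sketch, with plain gradient descent standing in for the optimizers NEST provides (the learning rate eta is an assumption for illustration, not a model default):

```python
def optimizer(t, g, w_prev, eta=0.1):
    """Plain gradient descent: w^t = w^{t-1} - eta * g^t."""
    return w_prev - eta * g

w = 0.5
for t, g in enumerate([0.2, -0.1, 0.05]):  # a short sequence of gradients
    w = optimizer(t, g, w)
```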

After the loop has terminated, the filtered dynamic variables of e-prop are propagated from the end of the cutoff until the next spike:

\[\begin{split}p &= \text{max} \left( 0, t_\text{s}^{t} - \left( t_\text{s}^{t-1} + {\Delta t}_\text{c} \right) \right) \,, \\ \bar{e}_{ji}^{t+p} &= \bar{e}_{ji}^t \kappa^p \,, \\ \bar{z}_i^{t+p} &= \bar{z}_i^t \alpha^p \,. \\\end{split}\]
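In code, this propagation is just an exponential decay over the \(p\) silent steps; the step counts below are illustrative:

```python
import math

dt, tau_m, kappa = 1.0, 10.0, 0.97
alpha = math.exp(-dt / tau_m)

prev_spike_step = 10  # t_s^{t-1} / dt
curr_spike_step = 40  # t_s^t / dt
cutoff_steps = 20     # Delta t_c / dt

# Number of steps between the end of the cutoff window and the next spike.
p = max(0, curr_spike_step - (prev_spike_step + cutoff_steps))

e_bar = 0.2 * kappa ** p  # filtered eligibility trace, decayed p steps
z_bar = 0.3 * alpha ** p  # filtered spike train, decayed p steps
```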

For more information on e-prop plasticity, see the documentation on the other e-prop models.

Details on the event-based NEST implementation of e-prop can be found in [2].

Parameters

The following parameters can be set in the status dictionary.

Neuron parameters

Parameter  Unit  Math equivalent    Default                     Description
=========  ====  =================  ==========================  =============================================
C_m        pF    \(C_\text{m}\)     250.0                       Capacitance of the membrane
E_L        mV    \(E_\text{L}\)     -70.0                       Leak / resting membrane potential
I_e        pA    \(I_\text{e}\)     0.0                         Constant external input current
t_ref      ms    \(t_\text{ref}\)   2.0                         Duration of the refractory period
tau_m      ms    \(\tau_\text{m}\)  10.0                        Time constant of the membrane
V_min      mV    \(v_\text{min}\)   negative maximum value      Absolute lower bound of the membrane voltage
                                    representable by a double
                                    type in C++
V_th       mV    \(v_\text{th}\)    -55.0                       Spike threshold voltage

E-prop parameters

Parameter                    Unit  Math equivalent          Default             Description
===========================  ====  =======================  ==================  ==========================================
c_reg                              \(c_\text{reg}\)         0.0                 Coefficient of firing rate regularization
eprop_isi_trace_cutoff       ms    \({\Delta t}_\text{c}\)  maximum value       Cutoff for integration of e-prop update
                                                            representable by a  between two spikes
                                                            long type in C++
f_target                     Hz    \(f^\text{target}\)      10.0                Target firing rate of rate regularization
kappa                              \(\kappa\)               0.97                Low-pass filter of the eligibility trace
kappa_reg                          \(\kappa_\text{reg}\)    0.97                Low-pass filter of the firing rate for
                                                                                regularization
beta                               \(\beta\)                1.0                 Width scaling of surrogate gradient /
                                                                                pseudo-derivative of membrane voltage
gamma                              \(\gamma\)               0.3                 Height scaling of surrogate gradient /
                                                                                pseudo-derivative of membrane voltage
surrogate_gradient_function        \(\psi\)                 "piecewise_linear"  Surrogate gradient / pseudo-derivative
                                                                                function ["piecewise_linear",
                                                                                "exponential",
                                                                                "fast_sigmoid_derivative", "arctan"]

Recordables

The following state variables evolve during simulation and can be recorded.

Neuron state variables and recordables

State variable  Unit  Math equivalent  Initial value  Description
==============  ====  ===============  =============  ================
V_m             mV    \(v_j\)          -70.0          Membrane voltage

E-prop state variables and recordables

State variable      Unit  Math equivalent  Initial value  Description
==================  ====  ===============  =============  ======================================
learning_signal     pA    \(L_j\)          0.0            Learning signal
surrogate_gradient        \(\psi_j\)       0.0            Surrogate gradient / pseudo-derivative
                                                          of membrane voltage

Usage

This model can only be used in combination with the other e-prop models and the network architecture requires specific wiring, input, and output. The usage is demonstrated in several supervised regression and classification tasks reproducing among others the original proof-of-concept tasks in [1].

References

Sends

SpikeEvent

Receives

SpikeEvent, CurrentEvent, LearningSignalConnectionEvent, DataLoggingRequest
