# Learning with Delayed Synaptic Plasticity

**Authors:** Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher, Mykola Pechenizkiy

arXiv: 1903.09393 · 2020-12-21

## TL;DR

This paper addresses learning in neural networks when reinforcement signals are delayed. It uses genetic algorithms to evolve synaptic plasticity rules that update synapses from per-synapse neuron activation traces, and the evolved rules train more effectively than an analogous hill climbing algorithm.

## Contribution

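The paper extends Hebbian learning rules with neuron activation traces (NATs), which store a record of neuron activations in each synapse. It then uses genetic algorithms to evolve delayed synaptic plasticity (DSP) rules that update synapses from the NATs and a reinforcement signal delivered after each episode.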

## Key findings

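- DSP rules train more effectively than an analogous hill climbing (HC) algorithm that lacks the domain knowledge introduced with the NATs
- NATs give each synapse a record of neuron activations, which enables learning from reinforcement delivered only after an episode ends
- Evolved DSP rules perform synaptic updates in distal reward settings, where reinforcement is not available immediately after each network output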

## Abstract

The plasticity property of biological neural networks allows them to perform learning and optimize their behavior by changing their configuration. Inspired by biology, plasticity can be modeled in artificial neural networks by using Hebbian learning rules, i.e. rules that update synapses based on the neuron activations and reinforcement signals. However, the distal reward problem arises when the reinforcement signals are not available immediately after each network output to associate the neuron activations that contributed to receiving the reinforcement signal. In this work, we extend Hebbian plasticity rules to allow learning in distal reward cases. We propose the use of neuron activation traces (NATs) to provide additional data storage in each synapse to keep track of the activation of the neurons. Delayed reinforcement signals are provided after each episode relative to the networks' performance during the previous episode. We employ genetic algorithms to evolve delayed synaptic plasticity (DSP) rules and perform synaptic updates based on NATs and delayed reinforcement signals. We compare DSP with an analogous hill climbing algorithm that does not incorporate domain knowledge introduced with the NATs, and show that the synaptic updates performed by the DSP rules demonstrate more effective training performance relative to the HC algorithm.
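The mechanism described in the abstract can be illustrated with a minimal sketch. The details below are assumptions made for illustration and are not taken from the abstract: binary threshold neurons, a trace that counts the four (pre, post) activation states per synapse, a lookup-table rule keyed on the most frequent state and the reward, and the hypothetical names `run_episode`, `apply_dsp`, `rule`, and `eta`. In this framing, the `rule` table is the kind of object a genetic algorithm could evolve.

```python
def run_episode(weights, inputs, threshold=0.0):
    """Run one episode and accumulate a neuron activation trace per synapse.

    weights[j][i] connects presynaptic input i to postsynaptic neuron j.
    Returns nat[j][i], a list of four counts for the (pre, post) states
    00, 01, 10, 11 (assumed trace format, for illustration only).
    """
    n_post, n_pre = len(weights), len(weights[0])
    nat = [[[0, 0, 0, 0] for _ in range(n_pre)] for _ in range(n_post)]
    for x in inputs:
        for j in range(n_post):
            total = sum(w * xi for w, xi in zip(weights[j], x))
            post = 1 if total > threshold else 0
            for i in range(n_pre):
                # Index 2*pre + post selects one of the four states.
                nat[j][i][2 * x[i] + post] += 1
    return nat


def apply_dsp(weights, nat, reward, rule, eta=0.1):
    """Update every synapse once, after the episode, from its trace.

    rule maps (dominant_state, reward) to -1, 0, or +1 (hypothetical
    encoding); missing entries mean "no change". Weights are clipped
    to [-1, 1]. Returns a new weight matrix; the input is not modified.
    """
    updated = [row[:] for row in weights]
    for j, row in enumerate(nat):
        for i, counts in enumerate(row):
            dominant = counts.index(max(counts))  # most frequent state
            delta = rule.get((dominant, reward), 0)
            updated[j][i] = max(-1.0, min(1.0, weights[j][i] + eta * delta))
    return updated


# One postsynaptic neuron, two binary inputs, a three-step episode.
weights = [[0.5, -0.5]]
episode_inputs = [(1, 0), (1, 0), (0, 1)]
nat = run_episode(weights, episode_inputs)

# Delayed reward of +1: strengthen synapses dominated by state 11,
# weaken those dominated by state 01 (an arbitrary example rule).
rule = {(3, 1): 1, (1, 1): -1}
weights = apply_dsp(weights, nat, reward=1, rule=rule)
```

The point of the sketch is that no weight changes during the episode. Each synapse only records what its two neurons did, and the single update happens once the delayed reinforcement signal is known.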

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.09393/full.md

## Figures

30 figures with captions in the complete paper: https://tomesphere.com/paper/1903.09393/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1903.09393/full.md

---
Source: https://tomesphere.com/paper/1903.09393