# First-spike based visual categorization using reward-modulated STDP

**Authors:** Milad Mozafari, Saeed Reza Kheradpisheh, Timothée Masquelier, Abbas Nowzari-Dalini, Mohammad Ganjtabesh

arXiv: 1705.09132 · 2018-07-11

## TL;DR

This paper trains a spiking neural network (SNN) with reward-modulated STDP (R-STDP) to recognize objects in natural images online and without an external classifier, and shows that R-STDP outperforms classic unsupervised STDP on the Caltech, ETH-80, and NORB datasets.

## Contribution

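The paper shows, for the first time according to its authors, that reinforcement learning in the form of reward-modulated STDP can train a feedforward convolutional SNN for visual categorization without an external classifier. The decision is read directly from the first output neuron to fire, which supports online learning and energy-efficient spike-based processing.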

## Key findings

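- R-STDP outperforms classic unsupervised STDP on the Caltech, ETH-80, and NORB datasets, because it extracts discriminative features rather than any feature that consistently repeats
- The approach supports online learning and can adapt to drastic changes such as label permutations
- Feature extraction and classification are both done with spikes, using at most one spike per neuron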

## Abstract

Reinforcement learning (RL) has recently regained popularity, with major achievements such as beating the European game of Go champion. Here, for the first time, we show that RL can be used efficiently to train a spiking neural network (SNN) to perform object recognition in natural images without using an external classifier. We used a feedforward convolutional SNN and a temporal coding scheme where the most strongly activated neurons fire first, while less activated ones fire later, or not at all. In the highest layers, each neuron was assigned to an object category, and it was assumed that the stimulus category was the category of the first neuron to fire. If this assumption was correct, the neuron was rewarded, i.e. spike-timing-dependent plasticity (STDP) was applied, which reinforced the neuron's selectivity. Otherwise, anti-STDP was applied, which encouraged the neuron to learn something else. As demonstrated on various image datasets (Caltech, ETH-80, and NORB), this reward modulated STDP (R-STDP) approach extracted particularly discriminative visual features, whereas classic unsupervised STDP extracts any feature that consistently repeats. As a result, R-STDP outperformed STDP on these datasets. Furthermore, R-STDP is suitable for online learning, and can adapt to drastic changes such as label permutations. Finally, it is worth mentioning that both feature extraction and classification were done with spikes, using at most one spike per neuron. Thus the network is hardware friendly and energy efficient.
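The decision and learning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`intensity_to_latency`, `first_to_fire`, `r_stdp_update`), the `1 / activation` latency mapping, the learning rates `a_plus` and `a_minus`, and the `w * (1 - w)` soft-bound term are all assumptions chosen for illustration. Only the overall logic comes from the abstract: stronger activations fire earlier, the first output neuron to fire determines the predicted category, and the winner receives STDP when the prediction is correct and anti-STDP otherwise.

```python
import numpy as np


def intensity_to_latency(activations, threshold=0.0):
    """Map activations to spike latencies: stronger input fires earlier.

    Assumption: latency = 1 / activation; inputs at or below `threshold`
    never fire (latency = inf). Each neuron emits at most one spike.
    """
    a = np.asarray(activations, dtype=float)
    latencies = np.full(a.shape, np.inf)
    fired = a > threshold
    latencies[fired] = 1.0 / a[fired]
    return latencies


def first_to_fire(spike_times):
    """Return the index of the earliest-spiking neuron, or None if none fired."""
    spike_times = np.asarray(spike_times, dtype=float)
    idx = int(np.argmin(spike_times))
    return None if np.isinf(spike_times[idx]) else idx


def r_stdp_update(w, pre_times, post_time, rewarded,
                  a_plus=0.004, a_minus=0.003):
    """Apply one reward-modulated update to the winner neuron's weights.

    Rewarded  -> STDP: strengthen inputs that fired before the neuron,
                 weaken the rest.
    Punished  -> anti-STDP: the same magnitudes with flipped signs.
    The w * (1 - w) factor (an assumption) keeps weights inside [0, 1].
    """
    w = np.asarray(w, dtype=float)
    pre_before_post = np.asarray(pre_times, dtype=float) <= post_time
    sign = np.where(pre_before_post, 1.0, -1.0)
    rate = np.where(pre_before_post, a_plus, a_minus)
    if not rewarded:
        sign = -sign  # anti-STDP reverses the direction of every change
    return np.clip(w + sign * rate * w * (1.0 - w), 0.0, 1.0)


# Hypothetical toy example: two output neurons, one per category.
neuron_category = np.array([0, 1])       # category assigned to each neuron
output_spikes = np.array([5.0, 3.0])     # neuron 1 fires first
winner = first_to_fire(output_spikes)
predicted = neuron_category[winner]
true_category = 1

pre_times = intensity_to_latency([2.0, 0.0])  # second input stays silent
weights = np.array([0.5, 0.5])
weights = r_stdp_update(weights, pre_times, output_spikes[winner],
                        rewarded=(predicted == true_category))
```

In this toy run the prediction is correct, so the weight of the input that fired before the winner grows slightly and the weight of the silent input shrinks. Passing `rewarded=False` reverses both changes, which is how anti-STDP pushes a wrongly responding neuron toward a different feature.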

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.09132/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1705.09132/full.md

## References

87 references — full list in the complete paper: https://tomesphere.com/paper/1705.09132/full.md

---
Source: https://tomesphere.com/paper/1705.09132