Spikformer: When Spiking Neural Network Meets Transformer

Zhaokun Zhou; Yuesheng Zhu; Chao He; Yaowei Wang; Shuicheng Yan; Yonghong Tian; Li Yuan

arXiv:2209.15425 · cs.NE · November 23, 2022 · 103 cites

TL;DR

This paper introduces Spikformer, a spiking transformer that brings self-attention into spiking neural networks (SNNs), achieving state-of-the-art accuracy among SNNs on image classification with low energy consumption.

Contribution

It proposes Spiking Self Attention (SSA) and the Spikformer framework, integrating self-attention into SNNs for improved efficiency and performance.

Findings

01 Spikformer outperforms existing SNN frameworks on image classification.

02 Achieves 74.81% top-1 accuracy on ImageNet with 66.3M parameters.

03 SSA mechanism is efficient, sparse, and avoids multiplication, reducing energy consumption.

Abstract

We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the self-attention mechanism. The former offers an energy-efficient and event-driven paradigm for deep learning, while the latter has the ability to capture feature dependencies, enabling Transformer to achieve good performance. It is intuitively promising to explore the marriage between them. In this paper, we consider leveraging both self-attention capability and biological properties of SNNs, and propose a novel Spiking Self Attention (SSA) as well as a powerful framework, named Spiking Transformer (Spikformer). The SSA mechanism in Spikformer models the sparse visual feature by using spike-form Query, Key, and Value without softmax. Since its computation is sparse and avoids multiplication, SSA is efficient and has low computational energy consumption. It is shown that Spikformer with SSA can…
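The abstract's claim that spike-form Query, Key, and Value make attention sparse and multiplication-free can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`spike_matmul`, `spiking_self_attention`), the `scale` and `threshold` parameters, and the simple threshold used as a stand-in spiking neuron are all assumptions made for illustration. The point it shows is that when one operand is binary, a matrix product reduces to selecting and adding rows, and that the resulting attention scores are non-negative counts, so no softmax is applied.

```python
def transpose(x):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*x)]


def spike_matmul(spikes, values):
    """Compute spikes @ values where `spikes` is a 0/1 matrix.

    Each spike selects one row of `values` to accumulate, so the
    product needs additions only; zero entries are skipped (sparsity).
    """
    rows, inner, cols = len(spikes), len(values), len(values[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for t in range(inner):
            if spikes[i][t]:  # event-driven: work happens only on a spike
                for j in range(cols):
                    out[i][j] += values[t][j]
    return out


def spiking_self_attention(q, k, v, scale=1.0, threshold=1.0):
    """Hypothetical softmax-free attention over binary q, k, v (tokens x dim).

    `scale` and `threshold` are illustrative parameters, not values
    taken from the paper.
    """
    # Scores are coincidence counts between query and key spikes (>= 0).
    scores = spike_matmul(q, transpose(k))
    # scores @ v, rewritten as (v^T @ scores^T)^T so the binary operand
    # comes first and the product again uses additions only.
    mixed = transpose(spike_matmul(transpose(v), transpose(scores)))
    # Stand-in spiking neuron: fire when the scaled input reaches the
    # threshold. The scaling is an ordinary multiply here for readability.
    return [[1 if scale * x >= threshold else 0 for x in row] for row in mixed]


if __name__ == "__main__":
    q = [[1, 1, 0], [0, 1, 0]]
    k = [[1, 0, 0], [1, 1, 1]]
    v = [[1, 0], [1, 1]]
    print(spiking_self_attention(q, k, v, scale=0.5))
```

Because every intermediate value is a non-negative integer count, the output stays in spike form after thresholding and can feed the next layer directly, which is the property the abstract attributes to SSA.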

Code & Models

Videos

Spikformer: When Spiking Neural Network Meets Transformer · SlidesLive

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Neural dynamics and brain function

Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Multi-Head Attention · Adam · Dense Connections · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dropout · Layer Normalization