SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Xinyu Shi; Zecheng Hao; Zhaofei Yu

arXiv:2403.14302 · cs.NE · March 29, 2024 · 2 cites

TL;DR

This paper introduces SpikingResformer, a novel SNN architecture combining ResNet and Vision Transformer features with a new self-attention mechanism, achieving state-of-the-art accuracy and efficiency on ImageNet.

Contribution

The paper proposes Dual Spike Self-Attention (DSSA) and integrates it into SpikingResformer, enhancing local feature extraction, scalability, and energy efficiency in SNNs.

Findings

1. Achieves 79.40% top-1 accuracy on ImageNet with 4 time-steps.

2. Outperforms existing spiking Vision Transformers in accuracy and energy efficiency.

3. Reduces parameters compared to previous models.

Abstract

The remarkable success of Vision Transformers in Artificial Neural Networks (ANNs) has led to a growing interest in incorporating the self-attention mechanism and transformer-based architecture into Spiking Neural Networks (SNNs). While existing methods propose spiking self-attention mechanisms that are compatible with SNNs, they lack reasonable scaling methods, and the overall architectures proposed by these methods suffer from a bottleneck in effectively extracting local features. To address these challenges, we propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method. Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer, which combines the ResNet-based multi-stage architecture with our proposed DSSA to improve both performance and energy efficiency while reducing parameters.…
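To make the idea of a spiking self-attention mechanism with a scaling factor more concrete, here is a minimal sketch in plain Python. It is not DSSA, whose formulation is not reproduced on this page. It only illustrates the general pattern of softmax-free attention over binary spike matrices, followed by a scale factor and threshold firing. The function name `spike_attention`, the `scale` and `threshold` parameters, and the single-time-step simplification (no membrane dynamics) are all assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def spike_attention(q, k, v, scale, threshold=1.0):
    """Hypothetical softmax-free attention over binary spikes.

    q, k, v: binary (0/1) matrices of shape [tokens][dim].
    scale:   assumed constant that keeps the summed spike counts
             in a range where the firing threshold is meaningful.
    Returns a binary spike matrix of shape [tokens][dim].
    """
    # Each score counts how many spikes two tokens have in common.
    scores = matmul(q, transpose(k))
    # Scale instead of applying softmax, so no exponentials are needed.
    attn = [[s * scale for s in row] for row in scores]
    # Weighted sum of value spikes, then fire where the threshold is reached.
    out = matmul(attn, v)
    return [[1 if x >= threshold else 0 for x in row] for row in out]


q = [[1, 0], [1, 1]]
k = [[1, 0], [0, 1]]
v = [[1, 0], [1, 1]]
result = spike_attention(q, k, v, scale=0.5)
print(result)  # [[0, 0], [1, 0]]
```

Because the inputs are binary, the score matrix holds small non-negative integers, so the choice of scale directly decides which outputs cross the firing threshold. This is one reason a principled scaling method matters in this setting.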

Taxonomy

Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · CCD and CMOS Imaging Sensors

Methods: Attention Is All You Need · Softmax · Dropout · Byte Pair Encoding · Absolute Position Encodings · Residual Connection · Position-Wise Feed-Forward Layer · Linear Layer · Dense Connections · Label Smoothing