SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks
Xinyu Shi, Zecheng Hao, Zhaofei Yu

TL;DR
This paper introduces SpikingResformer, an SNN architecture that combines a ResNet-based multi-stage design with a new spiking self-attention mechanism, Dual Spike Self-Attention (DSSA), and reports state-of-the-art accuracy and energy efficiency on ImageNet.
Contribution
The paper proposes Dual Spike Self-Attention (DSSA), a spiking self-attention mechanism with a reasonable scaling method, and integrates it into SpikingResformer to improve local feature extraction and energy efficiency in SNNs.
Findings
Achieves 79.40% top-1 accuracy on ImageNet with 4 time-steps.
Outperforms existing spiking Vision Transformers in accuracy and energy efficiency.
Uses fewer parameters than previous spiking Vision Transformer models.
Abstract
The remarkable success of Vision Transformers in Artificial Neural Networks (ANNs) has led to a growing interest in incorporating the self-attention mechanism and transformer-based architecture into Spiking Neural Networks (SNNs). While existing methods propose spiking self-attention mechanisms that are compatible with SNNs, they lack reasonable scaling methods, and the overall architectures proposed by these methods suffer from a bottleneck in effectively extracting local features. To address these challenges, we propose a novel spiking self-attention mechanism named Dual Spike Self-Attention (DSSA) with a reasonable scaling method. Based on DSSA, we propose a novel spiking Vision Transformer architecture called SpikingResformer, which combines the ResNet-based multi-stage architecture with our proposed DSSA to improve both performance and energy efficiency while reducing parameters.…
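To make the terms "time-steps", "spiking self-attention", and "scaling" concrete, the following is a toy NumPy sketch. It is not the paper's DSSA definition, which this summary does not give. The function names lif_neuron and toy_spiking_attention, the hard-reset neuron model, the single key projection w_k, and the scale factors are all assumptions chosen for illustration.

```python
import numpy as np

def lif_neuron(currents, threshold=1.0, decay=0.5):
    """Leaky integrate-and-fire neuron with hard reset (illustrative model).

    currents: array of shape (T, ...) holding the input at each time-step.
    Returns a binary spike array of the same shape.
    """
    v = np.zeros_like(currents[0], dtype=float)   # membrane potential
    spikes = np.zeros_like(currents, dtype=float)
    for t in range(currents.shape[0]):
        v = decay * v + currents[t]               # leak, then integrate
        fired = v >= threshold
        spikes[t] = fired                         # emit 1 where threshold is reached
        v = np.where(fired, 0.0, v)               # reset neurons that fired
    return spikes

def toy_spiking_attention(x_spikes, w_k, scale_attn, scale_out):
    """Toy attention built from spike matrices (hypothetical, not DSSA itself).

    x_spikes: binary array of shape (T, N, D): time-steps, tokens, channels.
    w_k:      assumed key projection of shape (D, D).
    scale_attn, scale_out: assumed scale factors applied before each neuron,
    so that the summed inputs stay in the range of the firing threshold.
    """
    keys = x_spikes @ w_k                                          # (T, N, D)
    attn_current = (x_spikes @ keys.transpose(0, 2, 1)) * scale_attn  # (T, N, N)
    attn_spikes = lif_neuron(attn_current)                         # binary attention map
    out_current = (attn_spikes @ x_spikes) * scale_out             # (T, N, D)
    return lif_neuron(out_current)                                 # binary output
```

With T = 4 the input carries four time-steps, matching the setting reported in the findings. Without the scale factors, the sums over D channels or N tokens grow with those dimensions and most neurons fire at every step, which is one way to see why a spiking attention mechanism needs some scaling rule.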
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · CCD and CMOS Imaging Sensors
Methods: Attention Is All You Need · Softmax · Dropout · Byte Pair Encoding · Absolute Position Encodings · Residual Connection · Position-Wise Feed-Forward Layer · Linear Layer · Dense Connections · Label Smoothing
