SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer

Yue Liu; Shanlin Xiao; Bo Li; Zhiyi Yu

arXiv:2311.08806·cs.CV·November 16, 2023·1 cites

TL;DR

SparseSpikformer introduces a co-design framework that combines token and weight pruning to significantly reduce model size and computational cost in Spiking Transformers, while maintaining high performance.

Contribution

The paper proposes a co-design framework for Spikformer that combines weight pruning based on the Lottery Ticket Hypothesis (LTH) with a lightweight token selector, reaching at least 90% weight sparsity with minimal accuracy loss.
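As a rough illustration of the weight-pruning half, the sketch below shows one round of iterative magnitude pruning with weight rewinding, the procedure commonly used to search for lottery tickets. This page does not state the paper's actual pruning schedule or criterion, so the function names (`prune_round`, `sparsity`), the flat weight list, and the per-round `prune_fraction` are all assumptions made for illustration; the training step between rounds is omitted.

```python
def prune_round(trained, initial, mask, prune_fraction):
    """One assumed lottery-ticket round on a flat list of weights.

    trained: weights after training with the current mask
    initial: weights at initialization (the rewind target)
    mask:    1 for weights still alive, 0 for pruned weights
    """
    # Indices of weights that have survived earlier rounds.
    alive = [i for i, m in enumerate(mask) if m]
    n_drop = int(len(alive) * prune_fraction)

    # Rank survivors by trained magnitude, smallest first.
    alive.sort(key=lambda i: abs(trained[i]))

    # Prune the smallest-magnitude survivors.
    new_mask = list(mask)
    for i in alive[:n_drop]:
        new_mask[i] = 0

    # Rewind surviving weights to their initial values; pruned ones stay zero.
    rewound = [w if m else 0.0 for w, m in zip(initial, new_mask)]
    return rewound, new_mask


def sparsity(mask):
    """Fraction of weights that have been pruned."""
    return 1.0 - sum(mask) / len(mask)


if __name__ == "__main__":
    trained = [0.05, -0.9, 0.3, -0.01, 0.7, 0.2, -0.6, 0.02]
    initial = [0.1, 0.2, -0.3, 0.4, -0.5, 0.6, -0.7, 0.8]
    weights, mask = prune_round(trained, initial, [1] * 8, prune_fraction=0.5)
    print(mask, sparsity(mask))
```

Repeating such rounds (train, prune, rewind) compounds the sparsity: pruning a fixed fraction of the survivors each time approaches high sparsity levels such as 90% gradually rather than in a single cut.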

Findings

01

Achieves at least 90% sparsity in model weights.

02

Reduces GFLOPs by 20% without accuracy degradation.

03

Maintains competitive performance with a highly sparse model.

Abstract

As the third-generation neural network, the Spiking Neural Network (SNN) has the advantages of low power consumption and high energy efficiency, making it suitable for implementation on edge devices. More recently, the most advanced SNN, Spikformer, combines the self-attention module from Transformer with SNN to achieve remarkable performance. However, it adopts larger channel dimensions in MLP layers, leading to an increased number of redundant model parameters. To effectively decrease the computational complexity and weight parameters of the model, we explore the Lottery Ticket Hypothesis (LTH) and discover a very sparse ($\geq$ 90%) subnetwork that achieves comparable performance to the original network. Furthermore, we also design a lightweight token selector module, which can remove unimportant background information from images based on the average spike firing rate of neurons,…
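The token selector idea can be sketched as follows: score each image token by the average firing rate of its neurons, then keep only the most active tokens. The abstract is truncated here and does not say how the selector decides which tokens to drop, so the top-k rule, the `keep_ratio` parameter, the function names, and the `spikes[t][n][d]` layout (time step, token, channel) are assumptions for illustration only.

```python
def token_firing_rates(spikes):
    """Average firing rate per token.

    spikes[t][n][d] is 0 or 1 for time step t, token n, channel d.
    """
    n_steps = len(spikes)
    n_tokens = len(spikes[0])
    n_channels = len(spikes[0][0])
    total = n_steps * n_channels
    return [
        sum(spikes[t][n][d] for t in range(n_steps) for d in range(n_channels)) / total
        for n in range(n_tokens)
    ]


def select_tokens(spikes, keep_ratio):
    """Keep the tokens with the highest average firing rate (assumed top-k rule)."""
    rates = token_firing_rates(spikes)
    n_keep = max(1, int(len(rates) * keep_ratio))

    # Rank token indices by firing rate, most active first.
    ranked = sorted(range(len(rates)), key=lambda n: rates[n], reverse=True)

    # Restore the original token order among the kept tokens.
    kept = sorted(ranked[:n_keep])

    # Drop the low-activity tokens at every time step.
    pruned = [[step[n] for n in kept] for step in spikes]
    return kept, pruned


if __name__ == "__main__":
    spikes = [
        [[1, 1], [0, 0], [1, 0], [0, 0]],  # time step 0
        [[1, 0], [0, 0], [1, 1], [0, 1]],  # time step 1
    ]
    kept, pruned = select_tokens(spikes, keep_ratio=0.5)
    print(kept)
```

Because later layers then process fewer tokens, a selector of this kind reduces compute in proportion to the number of tokens removed, which is the sort of saving the GFLOPs finding refers to.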

Taxonomy

Topics: Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing · Ferroelectric and Negative Capacitance Devices

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Softmax · Position-Wise Feed-Forward Layer · Label Smoothing · Dense Connections · Absolute Position Encodings · Spiking Neural Networks