Vision SmolMamba: Spike-Guided Token Pruning for Energy-Efficient Spiking State-Space Vision Models

Dewei Bai; Hongxiang Peng; Yunyun Zeng; Ziyu Zhang; Hong Qu; Yi Zhang

arXiv:2604.25570·cs.CV·April 29, 2026


TL;DR

Vision SmolMamba adds spike-guided token pruning to a spiking state-space architecture, cutting energy cost by at least 1.5x relative to prior baselines while keeping accuracy competitive or better on long-range visual modeling tasks.

Contribution

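The paper proposes a Spike-Guided Spatio-Temporal Token Pruner (SST-TP) that scores tokens by spike activation strength and first-spike latency, and it integrates spike-driven dynamics with linear-time selective state-space recurrence for scalable, energy-efficient spiking vision models.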

Findings

01. Reduces energy cost by at least 1.5x compared to prior baselines.

02. Achieves superior accuracy-efficiency trade-offs on multiple benchmarks.

03. Maintains competitive or improved accuracy with token sparsity.

Abstract

Spiking Transformers have shown strong potential for long-range visual modeling through spike-driven self-attention. However, their quadratic token interactions remain fundamentally misaligned with the sparse and event-driven nature of spiking neural computation. To address this limitation, we propose Vision SmolMamba, an energy-efficient spiking state-space architecture that integrates spike-driven dynamics with linear-time selective recurrence. The key idea is a Spike-Guided Spatio-Temporal Token Pruner (SST-TP), which estimates token importance using both spike activation strength and first-spike latency. This mechanism progressively removes redundant tokens while preserving salient spatio-temporal information, enabling efficient scaling with token sparsity. Based on this mechanism, the proposed SmolMamba block incorporates spike events directly into bidirectional state-space…
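
The abstract says only that SST-TP scores tokens from spike activation strength and first-spike latency; it does not give the scoring formula. The sketch below is therefore an illustration under stated assumptions, not the authors' method: the function names, the `(T, N, D)` spike layout, the linear blend weight `alpha`, and the fixed `keep_ratio` are all hypothetical choices made for the example.

```python
import numpy as np


def spike_guided_token_scores(spikes, alpha=0.5):
    """Score tokens from a binary spike tensor of shape (T, N, D).

    T = timesteps, N = tokens, D = channels. The linear blend below is an
    assumed scoring rule, not the formula from the paper.
    """
    T = spikes.shape[0]
    # Activation strength: mean firing rate of each token over time and channels.
    rate = spikes.mean(axis=(0, 2))                      # shape (N,)
    # First-spike latency: first timestep at which any channel of the token fires.
    fired = spikes.any(axis=2)                           # shape (T, N)
    first = np.where(fired.any(axis=0), fired.argmax(axis=0), T)
    # Earlier first spikes score higher; tokens that never fire score 0.
    earliness = 1.0 - first / T                          # shape (N,)
    return alpha * rate + (1.0 - alpha) * earliness


def prune_tokens(spikes, keep_ratio=0.5, alpha=0.5):
    """Keep the highest-scoring tokens, preserving their original order."""
    scores = spike_guided_token_scores(spikes, alpha)
    n_keep = max(1, int(round(keep_ratio * spikes.shape[1])))
    keep = np.sort(np.argsort(-scores, kind="stable")[:n_keep])
    return spikes[:, keep, :], keep


# Toy input: 4 timesteps, 4 tokens, 2 channels.
spikes = np.zeros((4, 4, 2), dtype=np.uint8)
spikes[:, 0, :] = 1     # token 0 fires everywhere from t = 0
spikes[3, 2, 0] = 1     # token 2 fires once, late
spikes[1, 3, 0] = 1     # token 3 fires once, early
pruned, kept = prune_tokens(spikes, keep_ratio=0.5)
print(kept, pruned.shape)
```

In this toy input, the silent token 1 and the late-firing token 2 are dropped, so the downstream sequence length halves. Applying such a step repeatedly across layers is one way to read the "progressive" removal the abstract describes, though the actual schedule is not stated in this excerpt.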
