Ge$^2$mS-T: Multi-Dimensional Grouping for Ultra-High Energy Efficiency in Spiking Transformer

Zecheng Hao; Shenghao Xie; Kang Chen; Wenxuan Liu; Zhaofei Yu; Tiejun Huang

arXiv:2604.08894·cs.NE·April 13, 2026

TL;DR

This paper introduces Ge$^2$mS-T, a Spiking Vision Transformer (S-ViT) architecture that groups computation across the temporal, spatial, and network-structure dimensions to jointly optimize memory, accuracy, and energy consumption.

Contribution

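The paper proposes an architecture that groups computation across multiple dimensions. It includes the ExpG-IF neuron model, which enables lossless ANN-SNN conversion, and a multi-scale attention mechanism. Together these target the memory, accuracy, and energy trade-offs that limit existing S-ViT training paradigms.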

Findings

01. Achieves superior performance on challenging benchmarks.

02. Demonstrates ultra-high energy efficiency compared to existing methods.

03. Presents the first systematic approach to multi-dimensional grouping in S-ViTs.

Abstract

Spiking Neural Networks (SNNs) offer superior energy efficiency over Artificial Neural Networks (ANNs). However, they encounter significant deficiencies in training and inference metrics when applied to Spiking Vision Transformers (S-ViTs). Existing paradigms including ANN-SNN Conversion and Spatial-Temporal Backpropagation (STBP) suffer from inherent limitations, precluding concurrent optimization of memory, accuracy and energy consumption. To address these issues, we propose Ge$^2$mS-T, a novel architecture implementing grouped computation across temporal, spatial and network structure dimensions. Specifically, we introduce the Grouped-Exponential-Coding-based IF (ExpG-IF) model, enabling lossless conversion with constant training overhead and precise regulation for spike patterns. Additionally, we develop Group-wise Spiking Self-Attention (GW-SSA) to reduce computational…
