TDFormer: A Top-Down Attention-Controlled Spiking Transformer

Zizheng Zhu; Yingchao Yu; Zeqi Zheng; Zhaofei Yu; Yaochu Jin

arXiv:2505.15840 · cs.NE · May 26, 2025

TL;DR

TDFormer adds a top-down feedback mechanism to spiking neural networks that improves temporal information representation and gradient flow, yielding state-of-the-art performance on ImageNet.

Contribution

The paper proposes TDFormer, a novel top-down feedback spiking transformer that leverages hierarchical high-order representations to improve temporal information processing.

Findings

01. Increases mutual information across time steps during forward propagation.

02. Alleviates vanishing gradients along the time dimension.

03. Achieves 86.83% accuracy on ImageNet, setting a new state of the art.

Abstract

Traditional spiking neural networks (SNNs) can be viewed as a combination of multiple subnetworks with each running for one time step, where the parameters are shared, and the membrane potential serves as the only information link between them. However, the implicit nature of the membrane potential limits its ability to effectively represent temporal information. As a result, each time step cannot fully leverage information from previous time steps, seriously limiting the model's performance. Inspired by the top-down mechanism in the brain, we introduce TDFormer, a novel model with a top-down feedback structure that functions hierarchically and leverages high-order representations from earlier time steps to modulate the processing of low-order information at later stages. The feedback structure plays a role from two perspectives: 1) During forward propagation, our model increases the…
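
The abstract's opening view of an SNN, one shared-weight subnetwork per time step linked only by the membrane potential, can be illustrated with a toy example. The sketch below is not the TDFormer architecture: it assumes a leaky integrate-and-fire neuron with hard reset, a chain of just two scalar neurons, and a purely additive feedback term. The names `lif_step`, `run`, `w_low`, `w_high`, and `w_fb` are hypothetical and chosen only for illustration. With `w_fb=0.0` the membrane potentials are the only state carried across time steps; a nonzero `w_fb` lets the higher neuron's spike from the previous step influence the lower neuron's input at the current step.

```python
def lif_step(x, v, tau=2.0, v_th=1.0):
    """One time step of a leaky integrate-and-fire neuron (assumed hard reset to 0)."""
    v = v + (x - v) / tau              # leaky integration of the input current
    spike = 1.0 if v >= v_th else 0.0  # fire when the threshold is reached
    if spike:
        v = 0.0                        # hard reset after a spike
    return spike, v


def run(inputs, w_low=1.5, w_high=2.5, w_fb=0.0):
    """Unroll a two-neuron chain over time.

    The same weights are reused at every step. With w_fb == 0.0 the membrane
    potentials v_low and v_high are the only link between time steps.
    w_fb is a hypothetical additive top-down weight, not the paper's mechanism.
    """
    v_low = 0.0
    v_high = 0.0
    high_prev = 0.0                    # high-level spike from the previous step
    low_spikes, high_spikes = [], []
    for x in inputs:
        # Assumed form of feedback: the earlier high-level output is added
        # to the low-level input at the current step.
        drive = w_low * x + w_fb * high_prev
        s_low, v_low = lif_step(drive, v_low)
        s_high, v_high = lif_step(w_high * s_low, v_high)
        high_prev = s_high
        low_spikes.append(s_low)
        high_spikes.append(s_high)
    return low_spikes, high_spikes


if __name__ == "__main__":
    constant_input = [1.0, 1.0, 1.0, 1.0]
    print("no feedback:  ", run(constant_input))
    print("with feedback:", run(constant_input, w_fb=1.0))
```

Comparing the two printed spike trains shows how, in this toy setting, an explicit feedback path changes later time steps in a way the membrane potential alone does not.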
