WaveFormer: Frequency-Time Decoupled Vision Modeling with Wave Equation
Zishan Shu, Juntong Wu, Wei Yan, Xudong Liu, Hongyu Zhang, Chang Liu, Youdong Mao, Jie Chen

TL;DR
WaveFormer introduces a wave equation-based approach to vision modeling that explicitly captures how frequency and spatial information propagate, resulting in efficient, accurate, and versatile vision models that outperform attention-based counterparts.
Contribution
The paper presents a novel wave equation-inspired framework and the Wave Propagation Operator, enabling frequency-time decoupled vision modeling with improved efficiency and accuracy.
Findings
Achieves up to 1.6x higher throughput than attention-based models.
Reduces FLOPs by 30% compared to standard ViTs and CNNs.
Effectively captures both global coherence and high-frequency details.
Abstract
Vision modeling has advanced rapidly with Transformers, whose attention mechanisms capture visual dependencies but lack a principled account of how semantic information propagates spatially. We revisit this problem from a wave-based perspective: feature maps are treated as spatial signals whose evolution over an internal propagation time (aligned with network depth) is governed by an underdamped wave equation. In this formulation, spatial frequency, from low-frequency global layout to high-frequency edges and textures, is modeled explicitly, and its interaction with propagation time is controlled rather than implicitly fixed. We derive a closed-form, frequency-time decoupled solution and implement it as the Wave Propagation Operator (WPO), a lightweight module that models global interactions in O(N log N) time, far lower than attention. Building on WPO, we propose a family of WaveFormer…
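The idea of a closed-form, frequency-time decoupled solution can be illustrated with a minimal NumPy sketch. It is not the paper's WPO: the abstract does not give the exact equation, coefficients, or transform, so everything here is an assumption. The sketch assumes the textbook damped wave equation u_tt + 2*alpha*u_t = c^2 * Laplacian(u) on a periodic grid with zero initial velocity, and the names `wave_propagate`, `c`, and `alpha` are hypothetical. In Fourier space each spatial frequency evolves independently, so the state at propagation time t is the input spectrum multiplied by a per-frequency gain, and the cost is dominated by the FFT, i.e. O(N log N).

```python
import numpy as np

def wave_propagate(u0, t, c=1.0, alpha=0.1):
    """Illustrative sketch only (not the paper's WPO).

    Evolves a 2D map u0 for time t under the assumed damped wave equation
    u_tt + 2*alpha*u_t = c^2 * Laplacian(u), with periodic boundaries and
    zero initial velocity, using the closed-form solution per frequency.
    """
    h, w = u0.shape
    # Angular spatial frequencies of the discrete periodic grid.
    ky = 2 * np.pi * np.fft.fftfreq(h)[:, None]
    kx = 2 * np.pi * np.fft.fftfreq(w)[None, :]
    k2 = kx ** 2 + ky ** 2

    # Damped oscillation frequency per mode. The complex square root also
    # covers the few low-frequency modes where c^2 |k|^2 < alpha^2.
    omega = np.sqrt((c ** 2 * k2 - alpha ** 2).astype(complex))

    # sin(omega * t) / omega, with its limit t where omega is (near) zero.
    near_zero = np.abs(omega) < 1e-12
    safe = np.where(near_zero, 1.0, omega)
    sin_over_omega = np.where(near_zero, t, np.sin(safe * t) / safe)

    # Frequency and time enter only through this per-mode gain.
    gain = np.exp(-alpha * t) * (np.cos(omega * t) + alpha * sin_over_omega)

    # The FFT and inverse FFT dominate the cost: O(N log N) for N = h * w.
    return np.fft.ifft2(np.fft.fft2(u0) * gain).real

# Usage: propagate a random 8x8 "feature map" for two time units.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8))
out = wave_propagate(feat, t=2.0)
```

Under these assumptions the zero-frequency (mean) component is left unchanged, while higher frequencies oscillate and decay at a rate set by `alpha`, which is one way to read the abstract's statement that the interaction between frequency and propagation time is controlled explicitly.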
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Advanced Memory and Neural Computing
