CHASM: Cross-frequency Harmonized Axis-Separable Mixing for Spectral Token Operators
Pengcheng Fang, Hongli Chen, Yuxia Chen, Tengjiao Sun, Jiaxin Liu, Xiaohao Cai

TL;DR
CHASM introduces a structured spectral token mixer that shares one learned channel basis across all frequencies while keeping frequency-specific positive gains, improving MRI reconstruction and segmentation results.
Contribution
The paper proposes a cross-frequency harmonized operator that pairs a shared learned channel eigenbasis with positive per-frequency gains for spectral token mixing in vision models.
Findings
Consistently improves MRI reconstruction and segmentation results.
Shared basis constraint is crucial for performance.
Cross-frequency harmonization acts as a beneficial inductive bias.
Abstract
Spectral token mixers based on Fourier transforms provide an efficient way to model global interactions in visual feature maps. Existing designs often either apply filter-wise spectral responses along fixed channel axes, or learn adaptive frequency-indexed channel mixing without explicitly aligning the channel directions used across frequencies. We propose CHASM, a Cross-frequency Harmonized Axis-Separable Mixer, as a structured middle ground. CHASM separates what should be shared from what should remain frequency-specific: all frequencies share a learned channel eigenbasis, while each frequency retains its own positive spectral gains. The shared basis makes channel directions comparable across the spectrum, whereas the positive gains preserve local spectral adaptivity. CHASM applies this structured operator separably along the height and width axes and is used as a drop-in replacement…
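The structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the shared eigenbasis is an orthogonal matrix Q, that positivity is enforced by exponentiating unconstrained parameters, and that the same Q is reused for the height and width passes; none of these details is confirmed by the text above. The names mix_along_axis, chasm_like_mixer, log_gains_h, and log_gains_w are hypothetical. Under these assumptions, each frequency k applies the channel operator Q diag(g_k) Q^T with g_k > 0.

```python
import numpy as np

def mix_along_axis(x, Q, log_gains, axis):
    """Apply a shared-basis, per-frequency-gain channel mixer along one spatial axis.

    x:         real array of shape (H, W, C)
    Q:         (C, C) orthogonal matrix, the channel basis shared by all frequencies (assumed)
    log_gains: (F, C) unconstrained parameters, F = x.shape[axis] // 2 + 1
    """
    n = x.shape[axis]
    X = np.fft.rfft(x, axis=axis)        # 1D real FFT along the chosen axis
    X = np.moveaxis(X, axis, 0)          # frequencies first: (F, other, C)
    gains = np.exp(log_gains)            # exp keeps every gain positive (assumed parameterization)
    Z = X @ Q                            # coordinates in the shared basis
    Z = Z * gains[:, None, :]            # frequency-specific positive gains
    Y = Z @ Q.T                          # back to the original channel axes
    Y = np.moveaxis(Y, 0, axis)
    return np.fft.irfft(Y, n=n, axis=axis)

def chasm_like_mixer(x, Q, log_gains_h, log_gains_w):
    """Axis-separable application: height pass, then width pass (order assumed)."""
    y = mix_along_axis(x, Q, log_gains_h, axis=0)
    return mix_along_axis(y, Q, log_gains_w, axis=1)

# Toy usage with random parameters; the shapes are illustrative only.
rng = np.random.default_rng(0)
H, W, C = 8, 6, 4
x = rng.standard_normal((H, W, C))
Q, _ = np.linalg.qr(rng.standard_normal((C, C)))   # random orthogonal basis
log_gains_h = 0.1 * rng.standard_normal((H // 2 + 1, C))
log_gains_w = 0.1 * rng.standard_normal((W // 2 + 1, C))
y = chasm_like_mixer(x, Q, log_gains_h, log_gains_w)
```

Because every frequency uses the same Q, the per-frequency operators differ only in their diagonal gains, which is what makes channel directions comparable across the spectrum. With all log-gains set to zero the mixer reduces to the identity.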
