CHASM: Cross-frequency Harmonized Axis-Separable Mixing for Spectral Token Operators

Pengcheng Fang; Hongli Chen; Yuxia Chen; Tengjiao Sun; Jiaxin Liu; Xiaohao Cai

arXiv:2605.14727·cs.CV·May 15, 2026

TL;DR

CHASM is a structured spectral token mixer that shares one learned channel eigenbasis across all frequencies while keeping frequency-specific positive gains, and it is reported to improve MRI reconstruction and segmentation results.

Contribution

The paper proposes a cross-frequency harmonized operator that combines a shared learned channel eigenbasis with positive, frequency-specific gains, applied separably along the height and width axes for spectral token mixing in vision models.

Findings

01

Consistently improves MRI reconstruction and segmentation results.

02

The shared-basis constraint is crucial for performance.

03

Cross-frequency harmonization acts as a beneficial inductive bias.

Abstract

Spectral token mixers based on Fourier transforms provide an efficient way to model global interactions in visual feature maps. Existing designs often either apply filter-wise spectral responses along fixed channel axes, or learn adaptive frequency-indexed channel mixing without explicitly aligning the channel directions used across frequencies. We propose CHASM, a Cross-frequency Harmonized Axis-Separable Mixer, as a structured middle ground. CHASM separates what should be shared from what should remain frequency-specific: all frequencies share a learned channel eigenbasis, while each frequency retains its own positive spectral gains. The shared basis makes channel directions comparable across the spectrum, whereas the positive gains preserve local spectral adaptivity. CHASM applies this structured operator separably along the height and width axes and is used as a drop-in replacement…
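
The operator structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that, along one spatial axis, each frequency k mixes channels with a matrix of the form U diag(g_k) U^T, where U is an orthogonal basis shared by all frequencies and g_k > 0 are per-frequency gains. The function names (`mix_along_axis`, `chasm_like_mixer`, `frequency_operator`), the softplus parameterization of the gains, the random orthogonal stand-in for the learned basis, the reuse of one basis for both axes, and the height-then-width composition are all assumptions made for illustration.

```python
import numpy as np

def softplus(z):
    # Smooth map to strictly positive values: log(1 + exp(z)).
    return np.logaddexp(0.0, z)

def random_orthogonal(c, rng):
    # Hypothetical stand-in for a learned basis: Q factor of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((c, c)))
    return q

def frequency_operator(basis, raw_gains, k):
    # Channel-mixing matrix for frequency k: U diag(g_k) U^T (symmetric, positive definite).
    gains = softplus(raw_gains[k])
    return (basis * gains) @ basis.T

def mix_along_axis(x, basis, raw_gains, axis):
    """Mix channels per frequency along one spatial axis.

    x:         real array of shape (C, H, W)
    basis:     (C, C) orthogonal matrix shared by every frequency
    raw_gains: (F, C) unconstrained values, F = x.shape[axis] // 2 + 1
    axis:      1 for height, 2 for width
    """
    n = x.shape[axis]
    spec = np.fft.rfft(x, axis=axis)            # real FFT along the chosen axis
    spec = np.moveaxis(spec, axis, -1)          # (C, other, F)
    gains = softplus(raw_gains)                 # (F, C), strictly positive
    # Project channels onto the shared basis: U^T X.
    coeffs = np.einsum("cd,cof->dof", basis, spec)
    # Scale each basis direction by its frequency-specific gain.
    coeffs = coeffs * gains.T[:, None, :]
    # Map back to the channel space: U (...).
    mixed = np.einsum("cd,dof->cof", basis, coeffs)
    mixed = np.moveaxis(mixed, -1, axis)
    return np.fft.irfft(mixed, n=n, axis=axis)  # back to the spatial domain

def chasm_like_mixer(x, basis, raw_gains_h, raw_gains_w):
    # Assumed composition: height pass followed by width pass, same basis for both.
    y = mix_along_axis(x, basis, raw_gains_h, axis=1)
    return mix_along_axis(y, basis, raw_gains_w, axis=2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, H, W = 4, 6, 5
    x = rng.standard_normal((C, H, W))
    U = random_orthogonal(C, rng)
    raw_h = rng.standard_normal((H // 2 + 1, C))
    raw_w = rng.standard_normal((W // 2 + 1, C))
    y = chasm_like_mixer(x, U, raw_h, raw_w)
    print(y.shape)
```

Because the basis and gains are real, the mixed spectrum keeps the conjugate symmetry that `irfft` expects, so the output stays real. Setting every gain to 1 reduces each per-frequency matrix to U U^T = I, which makes the mixer an identity map and gives a simple sanity check.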
