Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
Seunghun Oh, Unsang Park

TL;DR
This paper introduces Attention Frequency Modulation (AFM), a training-free spectral control method for diffusion cross-attention that biases the spatial scale at which tokens compete during inference.
Contribution
AFM is a novel, plug-and-play spectral modulation technique that manipulates cross-attention in the Fourier domain without retraining or prompt editing.
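To make the idea of manipulating cross-attention in the Fourier domain concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the hard radial cutoff, the fixed gains `low_gain` and `high_gain`, and the `(H, W, T)` array layout are all hypothetical choices. The progress-aligned schedule mentioned in the abstract is not modeled here.

```python
import numpy as np


def radial_frequency(h, w):
    """Radial spatial frequency (cycles per pixel) of every FFT bin on an h x w grid."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)


def modulate_logits(logits, low_gain=1.0, high_gain=1.0, cutoff=0.125):
    """Reweight low and high spatial-frequency bands of per-token logit maps.

    logits: real array of shape (H, W, T), one pre-softmax map per text token.
    cutoff, low_gain, high_gain: hypothetical parameters chosen for illustration.
    """
    h, w, _ = logits.shape
    radius = radial_frequency(h, w)
    # Assumption: a hard two-band split. Bins at or below the cutoff (including
    # the DC term) get low_gain; all other bins get high_gain.
    gain = np.where(radius <= cutoff, low_gain, high_gain)
    spectrum = np.fft.fft2(logits, axes=(0, 1))
    spectrum = spectrum * gain[:, :, None]
    # The gain is real and radially symmetric, so the inverse transform of a
    # real input is real up to rounding error.
    return np.fft.ifft2(spectrum, axes=(0, 1)).real


def softmax(x, axis=-1):
    """Numerically stable softmax, applied over the token axis by default."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


# Usage: damp fine-scale structure in the logits, then renormalize over tokens.
rng = np.random.default_rng(0)
logits = rng.normal(size=(16, 16, 4))
edited = modulate_logits(logits, low_gain=1.0, high_gain=0.5)
attention = softmax(edited, axis=-1)
```

Because the edit happens before the softmax, the weights at each location still sum to one across tokens. Only the spatial scale at which tokens win or lose the competition changes.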
Findings
AFM reliably redistributes attention spectra in diffusion models.
AFM produces substantial visual edits while maintaining semantic alignment.
Entropy acts as an adaptive gain on frequency-based attention modulation.
Abstract
Cross-attention is the primary interface through which text conditions latent diffusion models, yet its step-wise multi-resolution dynamics remain under-characterized, limiting principled training-free control. We cast diffusion cross-attention as a spatiotemporal signal on the latent grid by summarizing token-softmax weights into token-agnostic concentration maps and tracking their radially binned Fourier power over denoising. Across prompts and seeds, encoder cross-attention exhibits a consistent coarse-to-fine spectral progression, yielding a stable time-frequency fingerprint of token competition. Building on this structure, we introduce Attention Frequency Modulation (AFM), a plug-and-play inference-time intervention that edits token-wise pre-softmax cross-attention logits in the Fourier domain: low- and high-frequency bands are reweighted with a progress-aligned schedule and can be…
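The spectral diagnostic described above (a token-agnostic concentration map followed by radially binned Fourier power) might be sketched as follows. The abstract does not say how the concentration map is computed, so the maximum softmax weight over tokens is used here purely as an assumption. The helper names and the number of bins are likewise hypothetical.

```python
import numpy as np


def concentration_map(attn):
    """Collapse per-token softmax weights of shape (H, W, T) into one (H, W) map.

    Assumption: concentration is taken as the largest token weight at each
    location; the source does not specify the exact summary statistic.
    """
    return attn.max(axis=-1)


def radial_power(field, n_bins=8):
    """Fraction of Fourier power falling in each of n_bins radial frequency bands."""
    h, w = field.shape
    centered = field - field.mean()  # drop the DC term so it does not dominate
    power = np.abs(np.fft.fft2(centered)) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    edges = np.linspace(0.0, radius.max(), n_bins + 1)
    # Map every FFT bin to a radial band; the largest radius goes in the last band.
    idx = np.clip(np.digitize(radius.ravel(), edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
    total = sums.sum()
    return sums / total if total > 0 else sums


# Usage: profile one attention tensor; bands run from coarse (0) to fine (n_bins - 1).
rng = np.random.default_rng(0)
raw = rng.random(size=(16, 16, 4))
attn = raw / raw.sum(axis=-1, keepdims=True)  # stand-in for softmax weights
profile = radial_power(concentration_map(attn))
```

Stacking one such profile per denoising step gives a steps-by-bands array. Under these assumptions, a coarse-to-fine progression would appear as power shifting from the low bands toward the high bands over the course of sampling.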
