Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models
Donghoon Ahn, Jiwon Kang, Sanghyun Lee, Minjae Kim, Jaewon Min, Wooseok Jang, Sangwu Lee, Sayak Paul, Susung Hong, Seungryong Kim

TL;DR
This paper introduces HeadHunter and SoftPAG, novel methods for attention head perturbation in diffusion models, enabling fine-grained, interpretable control over image generation quality and style attributes.
Contribution
It provides the first head-level analysis of attention perturbation, revealing head specialization and proposing systematic, user-centric perturbation strategies for diffusion models.
Findings
Attention heads govern distinct visual concepts such as structure, style, and texture.
HeadHunter effectively selects attention heads aligned with user objectives.
SoftPAG offers continuous control over perturbation strength, improving generation quality.
Abstract
Recent guidance methods in diffusion models steer reverse sampling by perturbing the model to construct an implicit weak model and guide generation away from it. Among these approaches, attention perturbation has demonstrated strong empirical performance in unconditional scenarios where classifier-free guidance is not applicable. However, existing attention perturbation methods lack principled approaches for determining where perturbations should be applied, particularly in Diffusion Transformer (DiT) architectures where quality-relevant computations are distributed across layers. In this paper, we investigate the granularity of attention perturbations, ranging from the layer level down to individual attention heads, and discover that specific heads govern distinct visual concepts such as structure, style, and texture quality. Building on this insight, we propose "HeadHunter", a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsAbsolute Position Encodings · Layer Normalization · Byte Pair Encoding · Label Smoothing · Softmax · Dropout · Dense Connections · Transformer · Attention Is All You Need · Diffusion
