Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance
Kwanyoung Kim

TL;DR
This paper introduces ASAG, a novel adversarial guidance method for diffusion models that improves sample quality and controllability by disrupting attention scores through optimal transport techniques, without retraining the model.
Contribution
The paper proposes a new adversarial attention guidance method, based on the Sinkhorn algorithm, that enhances diffusion model sampling without retraining.
Findings
Consistent improvement in text-to-image diffusion quality.
Enhanced controllability and fidelity in downstream applications.
Lightweight, plug-and-play method that improves reliability.
Abstract
Diffusion models have demonstrated strong generative performance when using guidance methods such as classifier-free guidance (CFG), which enhance output quality by modifying the sampling trajectory. These methods typically improve a target output by intentionally degrading another, often the unconditional output, using heuristic perturbation functions such as identity mixing or blurred conditions. However, these approaches lack a principled foundation and rely on manually designed distortions. In this work, we propose Adversarial Sinkhorn Attention Guidance (ASAG), a novel method that reinterprets attention scores in diffusion models through the lens of optimal transport and intentionally disrupts the transport cost via the Sinkhorn algorithm. Instead of naively corrupting the attention mechanism, ASAG injects an adversarial cost within self-attention layers to reduce pixel-wise similarity…
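To make the optimal-transport reading of attention concrete, here is a minimal NumPy sketch. It is not the paper's implementation. The function names, the uniform marginals, the choice of negated attention scores as the transport cost, and the sign flip used as a stand-in for an "adversarial" cost are all assumptions made for illustration. The guidance combination at the end is the generic extrapolation form used by perturbation-style guidance methods, and is likewise assumed rather than taken from the paper.

```python
import numpy as np

def sinkhorn_plan(cost, eps=1.0, n_iters=500):
    """Entropy-regularized transport plan between uniform marginals.

    Standard Sinkhorn iterations: alternately rescale rows and columns
    of the Gibbs kernel K = exp(-cost / eps).
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # row marginal (assumed uniform)
    b = np.full(m, 1.0 / m)  # column marginal (assumed uniform)
    K = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def guided_prediction(pred, pred_degraded, scale):
    """Generic guidance step: extrapolate away from a degraded prediction."""
    return pred + scale * (pred - pred_degraded)

# Toy self-attention scores for 6 tokens with 4-dimensional queries and keys.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
scores = q @ k.T / np.sqrt(q.shape[1])

# Illustrative assumption: high similarity means low transport cost.
plan = sinkhorn_plan(-scores)

# Illustrative assumption: flipping the sign of the cost rewards matching
# dissimilar tokens, which gives a deliberately degraded attention pattern.
adv_plan = sinkhorn_plan(scores)

# Total similarity carried by each plan; the adversarial one carries less.
print(float((plan * scores).sum()), float((adv_plan * scores).sum()))
```

In a sampler, a degraded attention map of this kind would be used in a second forward pass to produce `pred_degraded`, and `guided_prediction` would then push the final output away from it. How ASAG actually constructs its adversarial cost is not specified in the truncated abstract above.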
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
