Guiding Diffusion Models with Semantically Degraded Conditions
Shilong Han, Yuming Zhang, Hongxia Wang

TL;DR
This paper introduces Condition-Degradation Guidance (CDG), a novel method that replaces the null prompt in classifier-free guidance with strategically degraded conditions, improving semantic precision in text-to-image diffusion models.
Contribution
The paper proposes a new guidance paradigm that uses semantically degraded conditions, enhancing compositional accuracy without additional training or external models.
Findings
CDG improves compositional accuracy across multiple architectures.
It enhances text-image alignment with negligible computational overhead.
The method demonstrates the importance of adaptive, semantically aware negative samples.
Abstract
Classifier-Free Guidance (CFG) is a cornerstone of modern text-to-image models, yet its reliance on a semantically vacuous null prompt generates a guidance signal prone to geometric entanglement. This is a key factor limiting its precision, leading to well-documented failures in complex compositional tasks. We propose Condition-Degradation Guidance (CDG), a novel paradigm that replaces the null prompt with a strategically degraded condition. This reframes guidance from a coarse "good vs. null" contrast to a more refined "good vs. almost good" discrimination, thereby compelling the model to capture fine-grained semantic distinctions. We find that tokens in transformer text encoders split into two functional roles: content tokens encoding object semantics, and context-aggregating tokens capturing global context. By selectively degrading only…
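To make the "good vs. null" and "good vs. almost good" contrast concrete, here is a minimal sketch of the guidance step. It assumes CDG keeps the usual CFG extrapolation form and only swaps the reference branch, which is how the abstract describes it. The function name `guided_prediction` and the toy arrays are hypothetical stand-ins, not the authors' code or real model outputs. How the degraded condition itself is built is not shown, since the abstract is cut off before that point.

```python
import numpy as np


def guided_prediction(eps_cond, eps_ref, scale):
    # Extrapolate from the reference prediction toward the conditional one.
    # scale = 1 returns eps_cond; scale > 1 pushes past it, away from eps_ref.
    return eps_ref + scale * (eps_cond - eps_ref)


# Toy stand-ins for denoiser outputs (hypothetical values for illustration).
eps_cond = np.array([1.0, 2.0])  # prediction given the full prompt
eps_null = np.array([0.0, 0.0])  # prediction given the null prompt (CFG reference)
eps_deg = np.array([0.8, 1.5])   # prediction given a degraded prompt (CDG reference)

cfg_out = guided_prediction(eps_cond, eps_null, scale=2.0)  # "good vs. null"
cdg_out = guided_prediction(eps_cond, eps_deg, scale=2.0)   # "good vs. almost good"
```

Because the degraded reference sits closer to the conditional prediction, the difference `eps_cond - eps_deg` isolates only what the degradation removed, rather than everything the prompt contributes.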
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
