Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion
Hongyu Chen, Yiqi Gao, Min Zhou, Peng Wang, Xubin Li, Tiezheng Ge, Bo Zheng

TL;DR
This paper introduces a training-free method called Mask-guided Prompt Following (MGPF) that uses object masks to improve visual control in text-to-image models, especially when visual controls are misaligned with prompts.
Contribution
The paper proposes a novel training-free approach using object masks and a Masked ControlNet to enhance prompt following with visual control, addressing misalignment issues.
Findings
MGPF outperforms existing methods in aligning visual controls with prompts.
The approach effectively handles misaligned visual controls in T2I models.
Quantitative and qualitative results demonstrate the superiority of MGPF.
Abstract
Recently, integrating visual controls into text-to-image (T2I) models, as in the ControlNet method, has received significant attention for the finer control it enables. While various training-free methods aim to enhance prompt following in T2I models, prompt following under visual control remains rarely studied, especially when visual controls are misaligned with text prompts. In this paper, we address the challenge of "Prompt Following With Visual Control" and propose a training-free approach named Mask-guided Prompt Following (MGPF). Object masks are introduced to distinguish the aligned and misaligned parts of visual controls and prompts. Meanwhile, a network, dubbed Masked ControlNet, is designed to use these object masks for object generation in the misaligned visual control region. Further, to improve attribute matching, a simple yet efficient loss is designed to…
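The abstract does not say how the object masks enter the network, so the following is only an illustrative sketch of one plausible reading: a binary mask marks the misaligned region, and the control signal is suppressed there so that the text prompt can drive generation in that area. The function name `mask_control_residual`, the array shapes, and the choice to zero the residual are all assumptions made for illustration, not the paper's Masked ControlNet.

```python
import numpy as np


def mask_control_residual(residual, misaligned_mask):
    """Suppress a control residual inside the misaligned region.

    residual: array of shape (C, H, W); a hypothetical feature residual
        produced by a control branch (an assumption, not the paper's API).
    misaligned_mask: array of shape (H, W); 1 where the visual control
        conflicts with the prompt, 0 where it agrees.
    """
    residual = np.asarray(residual, dtype=np.float32)
    # Keep the control signal only where control and prompt are aligned.
    keep = 1.0 - np.asarray(misaligned_mask, dtype=np.float32)
    # Broadcast the (H, W) mask over the channel axis.
    return residual * keep[None, :, :]


# Toy example: a 2-channel 4x4 residual with a 2x2 misaligned patch.
residual = np.ones((2, 4, 4), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0
masked = mask_control_residual(residual, mask)
```

In this toy setup the residual is left untouched outside the 2x2 patch and zeroed inside it, which is the simplest way to separate aligned from misaligned regions with a mask.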
Taxonomy
Topics: Piezoelectric Actuators and Control
Methods: ALIGN
