Instilling Multi-round Thinking to Text-guided Image Generation
Lidong Zeng, Zhedong Zheng, Yinwei Wei, Tat-Seng Chua

TL;DR
This paper introduces a multi-round regularization technique for text-guided image editing that enhances detail preservation and consistency across multiple interaction rounds, improving editing fidelity and robustness.
Contribution
It proposes a novel self-supervised multi-round regularization method that maintains consistency regardless of modification order, addressing limitations of single-round editing.
Findings
Achieves high-fidelity local modifications in image editing.
Demonstrates robustness to irregular text inputs.
Improves retrieval performance on FashionIQ and Fashion200k.
Abstract
This paper delves into the text-guided image editing task, focusing on modifying a reference image according to user-specified textual feedback to embody specific attributes. Despite recent advancements, a persistent challenge remains: single-round generation often overlooks crucial details, particularly fine-grained changes such as shoes or sleeves. This issue compounds over multiple rounds of interaction, severely limiting customization quality. To address this challenge, we introduce a new self-supervised regularization, i.e., multi-round regularization, which is compatible with existing methods. Specifically, the multi-round regularization encourages the model to maintain consistency across different modification orders. It builds upon the observation that the modification order generally should not affect the final result. Different from…
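The order-consistency idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss, so the mean-squared-error form, the function `order_consistency_loss`, and the two toy editors below are all hypothetical stand-ins, with plain lists of floats in place of image and text features. The sketch only shows what "the modification order should not affect the final result" might look like as a penalty.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Editor = Callable[[Vector, Vector], List[float]]


def toy_editor(image_feat: Vector, text_feat: Vector) -> List[float]:
    """Hypothetical order-sensitive editor (illustration only, not from the paper)."""
    # The update scales the image feature and mixes it with the text feature,
    # so applying two edits in different orders gives different results.
    return [0.5 * x + t * (1.0 + x) for x, t in zip(image_feat, text_feat)]


def additive_editor(image_feat: Vector, text_feat: Vector) -> List[float]:
    """Hypothetical order-invariant editor: edits simply add up."""
    return [x + t for x, t in zip(image_feat, text_feat)]


def order_consistency_loss(
    editor: Editor, image_feat: Vector, text_a: Vector, text_b: Vector
) -> float:
    """Assumed form of the regularizer: mean squared gap between the two edit orders."""
    a_then_b = editor(editor(image_feat, text_a), text_b)
    b_then_a = editor(editor(image_feat, text_b), text_a)
    return sum((p - q) ** 2 for p, q in zip(a_then_b, b_then_a)) / len(a_then_b)


if __name__ == "__main__":
    image = [1.0, 2.0]
    edit_sleeves = [0.5, 0.0]  # stand-in for a "longer sleeves" text feature
    edit_shoes = [0.0, 0.5]    # stand-in for a "different shoes" text feature

    # A positive value signals that the editor is sensitive to edit order.
    print(order_consistency_loss(toy_editor, image, edit_sleeves, edit_shoes))
    # Zero here: the additive editor gives the same result in either order.
    print(order_consistency_loss(additive_editor, image, edit_sleeves, edit_shoes))
```

In a training setup, a term of this kind would presumably be added to the main objective of an existing editing model, which matches the abstract's description of the regularization as self-supervised and compatible with existing methods; the weighting and the exact feature space are not specified in this summary.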
Taxonomy
Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
