Replace in Translation: Boost Concept Alignment in Counterfactual Text-to-Image
Sifan Li, Ming Tao, Hao Zhao, Ling Shao, Hao Tang

TL;DR
This paper introduces a method to improve concept alignment in counterfactual text-to-image generation by replacing objects in images step-by-step using controllable models and guiding instructions generated by a language model, enhancing the factuality and coherence of synthesized images.
Contribution
It proposes a novel strategy using Explicit Logical Narrative Prompt (ELNP) and DeepSeek to enhance concept alignment in counterfactual T2I, addressing a key challenge in versatile AIGC applications.
Findings
Boosts concept alignment in counterfactual T2I.
Uses ELNP and DeepSeek for guided object replacement.
Demonstrates improved factual consistency through experiments.
Abstract
Text-to-Image (T2I) generation has become prevalent in recent years, with most common condition tasks now well optimized. Counterfactual Text-to-Image, however, still stands in the way of a more versatile AIGC experience. For scenes that cannot occur in the real world or that defy physics, two properties need to improve: factual feel, meaning the synthesized image looks like something that could plausibly be happening, and concept alignment, meaning all the required objects appear in the same frame. In this paper, we focus on concept alignment. Because controllable T2I models already perform satisfactorily in real applications, we use them to replace the objects in a synthesized image step by step in latent space, turning a common scene into a counterfactual scene that matches the prompt. We propose a strategy to instruct this replacing…
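The step-by-step replacement idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: every name here (`Replacement`, `plan_replacements`, `apply_stepwise`, `text_edit`) is hypothetical, the lion/zebra/mouse scene is an invented example, and the string-based `text_edit` function only stands in for a controllable T2I model that would edit the image in latent space. The instruction generation with ELNP and DeepSeek is not reproduced; the replacement order is simply given as input.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Replacement:
    """One hypothetical editing step: swap `source` for `target`."""
    source: str
    target: str


def plan_replacements(common: List[str], counterfactual: List[str]) -> List[Replacement]:
    """Pair objects position by position and keep only the ones that change."""
    if len(common) != len(counterfactual):
        raise ValueError("object lists must have the same length")
    return [Replacement(s, t) for s, t in zip(common, counterfactual) if s != t]


def apply_stepwise(scene: str, steps: List[Replacement],
                   edit_fn: Callable[[str, Replacement], str]) -> List[str]:
    """Apply one replacement at a time and record every intermediate scene."""
    states = [scene]
    for step in steps:
        scene = edit_fn(scene, step)
        states.append(scene)
    return states


def text_edit(scene: str, step: Replacement) -> str:
    """Toy stand-in for an image editor: replace the first mention in the text."""
    return scene.replace(step.source, step.target, 1)


# Start from a common scene and move toward a counterfactual one.
steps = plan_replacements(["lion", "zebra"], ["mouse", "lion"])
states = apply_stepwise("a lion hunting a zebra", steps, text_edit)
for state in states:
    print(state)
```

In this toy version each intermediate state still contains two objects, which loosely mirrors the goal of keeping all required concepts in the frame while the scene moves from common to counterfactual. With a real editor, `edit_fn` would take and return an image or latent rather than a string.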
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Computational and Text Analysis Methods
