Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion
Aoxue Li, Mingyang Yi, Zhenguo Li

TL;DR
This paper introduces MaSaFusion, a novel method that enhances text-to-image editing by incorporating human annotations and a hybrid fusion approach within diffusion models, leading to more consistent and accurate edits.
Contribution
It proposes a mask-informed fusion technique that integrates external human annotations to improve the quality and consistency of diffusion-based text-to-image editing.
Findings
Significant improvement over existing T2I editing methods.
Better preservation of source image details during editing.
Enhanced alignment with textual prompts in generated images.
Abstract
Recently, text-to-image (T2I) editing has been greatly pushed forward by applying diffusion models. Despite the visual promise of the generated images, inconsistencies with the expected textual prompt remain prevalent. This paper aims to systematically improve text-guided image editing techniques based on diffusion models by addressing their limitations. Notably, the common approach in diffusion-based editing first reconstructs the source image via inversion techniques, e.g., DDIM Inversion, and then applies a fusion process that carefully integrates the source intermediate (hidden) states (obtained by inversion) with those of the target image. Unfortunately, such a standard pipeline fails in many cases due to interference between texture retention and the creation of new characters in some regions. To mitigate this, we incorporate human annotation as external knowledge to confine…
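The fusion step described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method, whose details are cut off in the abstract above. It only shows the general idea of using a mask to decide, per location, whether an intermediate state comes from the source (inverted) branch or the target (edited) branch. The function name `mask_fuse`, the 2D list layout, and the convention that a mask value of 1 marks an editable region are all assumptions made for illustration.

```python
def mask_fuse(source_state, target_state, edit_mask):
    """Blend two 2D intermediate states with a mask (hypothetical helper).

    Assumed convention: mask value 1 marks a region that may be edited
    (take the target state); 0 marks a region to preserve (keep the
    source state). Values between 0 and 1 give a soft blend.
    """
    if not (len(source_state) == len(target_state) == len(edit_mask)):
        raise ValueError("states and mask must have the same number of rows")

    fused = []
    for src_row, tgt_row, mask_row in zip(source_state, target_state, edit_mask):
        if not (len(src_row) == len(tgt_row) == len(mask_row)):
            raise ValueError("states and mask must have the same row length")
        # Per-location convex combination of target and source values.
        fused.append([m * t + (1 - m) * s
                      for s, t, m in zip(src_row, tgt_row, mask_row)])
    return fused


# Toy 2x2 example: only the top-left location is marked as editable.
source = [[0.0, 0.0], [0.0, 0.0]]   # stands in for states from inversion
target = [[1.0, 1.0], [1.0, 1.0]]   # stands in for states of the edited image
mask = [[1, 0], [0, 0]]             # stands in for a human-annotated mask
fused = mask_fuse(source, target, mask)
```

In this toy run, only the top-left entry of `fused` takes the target value; every other entry keeps the source value. In a real diffusion pipeline the states would be multi-channel tensors at each denoising step, and where and how the mask is applied is a design choice that this sketch does not settle.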
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
