Improving Cross-modal Alignment for Text-Guided Image Inpainting
Yucheng Zhou, Guodong Long

TL;DR
This paper introduces a novel text-guided image inpainting model that leverages vision-language pre-trained models and cross-modal alignment techniques to improve the quality of restored images, achieving state-of-the-art results.
Contribution
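The work proposes CMA, a model that improves cross-modal alignment in text-guided image inpainting (TGII) through distillation and adversarial training. It targets two weaknesses of earlier methods: computation concentrated in visual encoding, and coarse alignment between text and image.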
Findings
Achieves state-of-the-art performance on benchmark datasets.
Effectively restores complex missing regions guided by text.
Improves cross-modal alignment through distillation techniques.
Abstract
Text-guided image inpainting (TGII) aims to restore the missing regions of a damaged image based on a given text. Existing methods rely on a strong vision encoder and a cross-modal fusion model to integrate cross-modal features. However, these methods allocate most of the computation to visual encoding and comparatively little to modeling modality interactions. Moreover, they perform cross-modal fusion on deep features, which ignores fine-grained alignment between text and image. Recently, vision-language pre-trained models (VLPMs), which encapsulate rich cross-modal alignment knowledge, have advanced most multimodal tasks. In this work, we propose a novel model for TGII that improves cross-modal alignment (CMA). The CMA model consists of a VLPM as a vision-language encoder, an image generator, and global-local discriminators. To explore cross-modal alignment knowledge for image restoration,…
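To make the task setup concrete, the sketch below shows one common inpainting convention: known pixels are kept from the damaged image and only the masked region is filled from a generator's output. The abstract does not say how CMA combines its output with the input, so this is an assumption, and the names `composite`, `damaged`, `generated`, and `mask` are hypothetical.

```python
def composite(damaged, generated, mask):
    """Keep known pixels from `damaged`; fill masked pixels (mask == 1) from `generated`.

    All three arguments are 2D lists of equal shape. This is a generic
    inpainting convention, not a step confirmed by the paper.
    """
    return [
        [g if m == 1 else d for d, g, m in zip(d_row, g_row, m_row)]
        for d_row, g_row, m_row in zip(damaged, generated, mask)
    ]


damaged = [[10, 20], [30, 0]]     # bottom-right pixel is missing
generated = [[11, 19], [29, 40]]  # hypothetical generator output
mask = [[0, 0], [0, 1]]           # 1 marks the missing region

restored = composite(damaged, generated, mask)
```

In a text-guided setting, `generated` would additionally depend on the text description; that conditioning is omitted here.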
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Mycobacterium research and diagnosis
Methods: Inpainting
