DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting
Jihoon Lee, Yunhong Min, Hwidong Kim, Sangtae Ahn

TL;DR
DAFT-GAN uses two affine transformation networks to keep text and image semantically consistent in text-guided image inpainting. It encodes corrupted and uncorrupted regions separately and outperforms existing GAN-based models on MS-COCO, CUB, and Oxford.
Contribution
The paper proposes a novel dual affine transformation GAN that enhances text-image feature integration and reduces information leakage for better inpainting results.
Findings
Outperforms existing GAN-based models on the MS-COCO, CUB, and Oxford datasets in both qualitative and quantitative assessments.
Maintains semantic consistency between text and image during inpainting.
Effectively separates corrupted and uncorrupted regions for fine-grained generation.
Abstract
In recent years, text-guided image inpainting has attracted significant research attention. However, the task remains challenging due to several constraints, such as ensuring alignment between the image and the text, and maintaining consistency in distribution between corrupted and uncorrupted regions. In this paper, we propose a dual affine transformation generative adversarial network (DAFT-GAN) to maintain semantic consistency in text-guided inpainting. DAFT-GAN integrates two affine transformation networks that gradually combine text and image features at each decoding block. Moreover, we minimize information leakage of uncorrupted features for fine-grained image generation by encoding the corrupted and uncorrupted regions of the masked image separately. Our proposed model outperforms existing GAN-based models in both qualitative and quantitative assessments…
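The abstract does not give the exact form of the affine transformation networks. As a rough illustration only, the sketch below shows the general idea of text-conditioned affine modulation as commonly used in text-to-image GANs: a text embedding predicts a per-channel scale and shift that are applied to an image feature map. The function name `affine_modulate`, the linear projections `w_gamma` and `w_beta`, and all shapes are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def affine_modulate(features, text_embedding, w_gamma, w_beta):
    """Scale and shift each feature channel using values predicted from text.

    Hypothetical shapes (assumptions for this sketch):
      features:        (C, H, W) image feature map
      text_embedding:  (D,) sentence or word-level text vector
      w_gamma, w_beta: (C, D) linear projections from text to channel space
    """
    gamma = w_gamma @ text_embedding  # (C,) per-channel scale
    beta = w_beta @ text_embedding    # (C,) per-channel shift
    # Broadcast the per-channel parameters over the spatial dimensions.
    return gamma[:, None, None] * features + beta[:, None, None]

# Toy example: 2 channels, 3x3 spatial grid, 2-dimensional text embedding.
features = np.ones((2, 3, 3))
text_embedding = np.array([1.0, 2.0])
w_gamma = np.array([[1.0, 0.0], [0.0, 1.0]])  # gamma = [1, 2]
w_beta = np.array([[1.0, 1.0], [0.0, 0.0]])   # beta  = [3, 0]

modulated = affine_modulate(features, text_embedding, w_gamma, w_beta)
```

A "dual" design could apply two such modulations with different text features at each decoding block, but how DAFT-GAN actually wires its two networks is not specified in the abstract, so that reading is an assumption as well.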
