DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting

Jihoon Lee; Yunhong Min; Hwidong Kim; Sangtae Ahn

arXiv:2408.04962·cs.CV·August 12, 2024

TL;DR

DAFT-GAN uses two affine transformation networks to keep text and image semantically consistent in text-guided image inpainting. It encodes corrupted and uncorrupted regions separately and outperforms existing GAN-based models on MS-COCO, CUB, and Oxford.

Contribution

The paper proposes a dual affine transformation GAN that combines text and image features gradually at each decoding block, and that limits leakage of uncorrupted features by encoding corrupted and uncorrupted regions of the masked image separately.

Findings

01 Outperforms existing GAN models on MS-COCO, CUB, and Oxford datasets.

02 Maintains semantic consistency between text and image during inpainting.

03 Effectively separates corrupted and uncorrupted regions for fine-grained generation.

Abstract

Text-guided image inpainting has received significant research attention in recent years. However, the task remains challenging due to several constraints, such as ensuring alignment between the image and the text, and maintaining a consistent distribution between corrupted and uncorrupted regions. We therefore propose a dual affine transformation generative adversarial network (DAFT-GAN) to maintain semantic consistency in text-guided inpainting. DAFT-GAN integrates two affine transformation networks to combine text and image features gradually at each decoding block. Moreover, we minimize information leakage of uncorrupted features for fine-grained image generation by encoding corrupted and uncorrupted regions of the masked image separately. Our proposed model outperforms the existing GAN-based models in both qualitative and quantitative assessments…
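
The two ideas in the abstract, separate handling of corrupted and uncorrupted regions and affine modulation of decoder features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say what each of the two affine networks is conditioned on, so the conditioning vectors `cond_a` and `cond_b`, the linear projections, the ReLU between the two steps, and all function names and shapes below are assumptions made for illustration.

```python
import numpy as np


def split_regions(masked_image, mask):
    """Split a masked image into known-pixel and hole-pixel inputs.

    masked_image: (C, H, W) float array.
    mask: (H, W) array, 1 marks a corrupted pixel (assumed convention).
    The two returned arrays sum back to masked_image.
    """
    m = mask.astype(masked_image.dtype)[None, :, :]  # broadcast over channels
    known = masked_image * (1.0 - m)  # uncorrupted pixels, holes zeroed
    hole = masked_image * m           # corrupted pixels, the rest zeroed
    return known, hole


def affine(features, cond, w_gamma, w_beta):
    """Channel-wise affine modulation: gamma(cond) * features + beta(cond).

    features: (C, H, W); cond: (D,); w_gamma, w_beta: (C, D).
    A single linear map stands in for whatever network predicts the
    scale and shift (an assumption of this sketch).
    """
    gamma = w_gamma @ cond  # per-channel scale, shape (C,)
    beta = w_beta @ cond    # per-channel shift, shape (C,)
    return gamma[:, None, None] * features + beta[:, None, None]


def dual_affine(features, cond_a, cond_b, params_a, params_b):
    """Apply two affine modulations in sequence within one decoding block.

    params_a and params_b are (w_gamma, w_beta) pairs. The ReLU between
    the two steps is an assumption, not taken from the paper.
    """
    h = affine(features, cond_a, *params_a)
    h = np.maximum(h, 0.0)
    return affine(h, cond_b, *params_b)
```

In a full model, each encoder stream would consume one output of `split_regions`, and a `dual_affine` step would run once per decoding block so that the conditioning is injected gradually instead of all at once.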
