Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation
Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li

TL;DR
This paper introduces Adma-GAN, a novel text-to-image generation model that uses attribute memory and joint learning schemes to improve image quality and semantic accuracy, especially when multiple attributes are involved.
Contribution
The paper proposes an attribute memory mechanism and joint learning scheme to enhance text-to-image synthesis by capturing key attribute information and aligning multi-modal features.
Findings
Significant FID score improvements on CUB and COCO datasets.
Effective attribute memory control enhances image realism.
Joint training scheme improves semantic consistency.
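For readers unfamiliar with the metric named above: FID (Fréchet Inception Distance) is conventionally computed as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images, and lower values indicate closer distributions. The formula below is the standard definition and is not quoted from this paper; the symbols \mu_r, \Sigma_r (real-image feature mean and covariance) and \mu_g, \Sigma_g (generated-image counterparts) are notation chosen for this note.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```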
Abstract
As a challenging task, text-to-image generation aims to generate photo-realistic and semantically consistent images according to the given text descriptions. Existing methods mainly extract the text information from only one sentence to represent an image, and the text representation strongly affects the quality of the generated image. However, directly utilizing the limited information in one sentence misses some key attribute descriptions, which are crucial for describing an image accurately. To alleviate this problem, we propose an effective text representation method complemented by attribute information. Firstly, we construct an attribute memory to jointly control the text-to-image generation with sentence input. Secondly, we explore two update mechanisms, sample-aware and sample-joint mechanisms, to dynamically optimize a generalized attribute memory. Furthermore,…
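The idea of an attribute memory that conditions generation jointly with the sentence input can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the class name AttributeMemory, the function build_condition, the mean pooling over active attributes, and the concatenation with the sentence embedding are all hypothetical choices. The sample-aware and sample-joint update mechanisms are not modelled, since the abstract does not describe them in detail.

```python
import random


class AttributeMemory:
    """Hypothetical attribute memory: one embedding vector per attribute id."""

    def __init__(self, num_attributes, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # In a real model these rows would be learnable parameters;
        # here they are fixed random values for illustration only.
        self.table = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                      for _ in range(num_attributes)]

    def read(self, attribute_ids):
        """Mean-pool the memory rows of the active attributes."""
        if not attribute_ids:
            return [0.0] * self.dim  # no attributes: zero vector
        rows = [self.table[i] for i in attribute_ids]
        return [sum(col) / len(rows) for col in zip(*rows)]


def build_condition(sentence_embedding, memory, attribute_ids):
    """Assumed fusion: concatenate sentence embedding and pooled attributes."""
    return list(sentence_embedding) + memory.read(attribute_ids)


memory = AttributeMemory(num_attributes=5, dim=4)
sentence = [0.1, -0.3, 0.7]          # stand-in for a text-encoder output
condition = build_condition(sentence, memory, attribute_ids=[0, 2])
print(len(condition))  # 7 = 3 sentence dims + 4 attribute dims
```

In this sketch the resulting condition vector would be what a generator consumes in place of the sentence embedding alone, so attributes missing from the sentence can still influence the image.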
Taxonomy
Methods: ALIGN
