TAGE: Trustworthy Attribute Group Editing for Stable Few-shot Image Generation

Ruicheng Zhang; Guoheng Huang; Yejing Huo; Xiaochen Yuan; Zhizhen Zhou; Xuhang Chen; Guo Zhong

arXiv:2410.17855·cs.CV·October 24, 2024

TL;DR

TAGE is a novel image generation network that enables stable, attribute-controlled image editing in few-shot scenarios by leveraging a codebook and semantic cues within a Transformer framework.

Contribution

The paper introduces TAGE, a few-shot image generation method built from three modules, covering codebook learning, code prediction, and prompt-driven semantic guidance, with the aim of making attribute editing more stable and controllable.

Findings

01. Achieves superior performance on the Animal Faces, Flowers, and VGGFaces datasets.

02. Demonstrates high stability compared to existing few-shot image generation methods.

03. Effectively manipulates category-agnostic attributes for unseen categories.

Abstract

Generative Adversarial Networks (GANs) have emerged as a prominent research focus for image editing tasks, leveraging the powerful image generation capabilities of the GAN framework to produce remarkable results. However, prevailing approaches are contingent upon extensive training datasets and explicit supervision, presenting a significant challenge in manipulating the diverse attributes of new image classes with limited sample availability. To surmount this hurdle, we introduce TAGE, an innovative image generation network comprising three integral modules: the Codebook Learning Module (CLM), the Code Prediction Module (CPM) and the Prompt-driven Semantic Module (PSM). The CPM module delves into the semantic dimensions of category-agnostic attributes, encapsulating them within a discrete codebook. This module is predicated on the concept that images are assemblages of attributes, and…
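
Illustrative note: the abstract's idea of storing attributes in a discrete codebook can be pictured with a generic nearest-neighbor lookup, as used in standard vector quantization. The sketch below is not the authors' implementation, and the truncated abstract does not confirm how TAGE assigns codes. The names `quantize`, `codebook`, and `features`, along with the toy 2-D values, are assumptions chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen code indices and the corresponding code vectors.
    This is plain nearest-neighbor vector quantization (an assumption),
    not the procedure described in the paper.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for f in features:
        # Pick the index of the code vector closest to this feature.
        idx = min(range(len(codebook)),
                  key=lambda k: squared_distance(f, codebook[k]))
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Hypothetical toy codebook: four "attribute" codes in a 2-D feature space.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical continuous features extracted from an image.
features = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.1]]

indices, quantized = quantize(features, codebook)
print(indices)    # [1, 2, 0]
print(quantized)  # [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
```

Under this reading, an image becomes a short sequence of code indices, which fits the abstract's description of images as assemblages of attributes; editing an attribute would then amount to swapping one index for another.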

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Linear Layer · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Attention Is All You Need · Dense Connections · Softmax · Multi-Head Attention · Adam · Dropout