ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning

Weifeng Chen; Jiacheng Zhang; Jie Wu; Hefeng Wu; Xuefeng Xiao; Liang Lin

arXiv:2404.15449 · cs.CV · April 25, 2024 · 2 cites

TL;DR

ID-Aligner is a reward feedback learning framework that improves identity preservation and aesthetic quality in identity-preserving text-to-image generation, and it is compatible with both LoRA-based and Adapter-based model adaptation.

Contribution

The paper introduces a general feedback fine-tuning framework that enhances identity retention and aesthetic appeal in ID-T2I models and works with both LoRA and Adapter methods.

Findings

01. Significant improvement in identity preservation on SD1.5 and SDXL models.

02. Enhanced aesthetic quality with human-annotated preference feedback.

03. Effective integration of face recognition feedback for identity consistency.

Abstract

The rapid development of diffusion models has triggered diverse applications. Identity-preserving text-to-image generation (ID-T2I) in particular has received significant attention due to its wide range of application scenarios, such as AI portraits and advertising. While existing ID-T2I methods have demonstrated impressive results, several key challenges remain: (1) it is hard to accurately maintain the identity characteristics of reference portraits, (2) the generated images lack aesthetic appeal, especially when identity retention is enforced, and (3) existing methods cannot be compatible with LoRA-based and Adapter-based methods simultaneously. To address these issues, we present ID-Aligner, a general feedback learning framework to enhance ID-T2I performance. To resolve the loss of identity features, we introduce identity consistency reward fine-tuning to utilize the feedback…
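The abstract is cut off before it describes how the identity consistency reward is computed, so the following is only a minimal sketch of one plausible form, not the paper's confirmed formulation. It assumes that a face recognition model has already produced an embedding vector for the reference portrait and for the generated image, and it scores their agreement with cosine similarity. The function names `identity_reward` and `identity_loss` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def identity_reward(ref_embedding: Sequence[float],
                    gen_embedding: Sequence[float]) -> float:
    """Hypothetical identity reward in [-1, 1].

    Assumption: both embeddings come from the same face recognition
    model, applied to the reference portrait and the generated image.
    A higher value means the generated face is closer to the reference.
    """
    return cosine_similarity(ref_embedding, gen_embedding)


def identity_loss(ref_embedding: Sequence[float],
                  gen_embedding: Sequence[float]) -> float:
    """Hypothetical loss to minimize during reward fine-tuning: 1 - reward."""
    return 1.0 - identity_reward(ref_embedding, gen_embedding)
```

In an actual fine-tuning loop the embeddings would need to be differentiable tensors so that the loss can update the LoRA or Adapter weights; the plain-Python version here only illustrates the scoring step.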

Taxonomy

Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games

Methods: Adapter · Diffusion