AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation

Junjie He; Yuxiang Tuo; Binghui Chen; Chongyang Zhong; Yifeng Geng; Liefeng Bo

arXiv:2501.09503·cs.CV·May 2, 2025

TL;DR

AnyStory is a unified method for high-fidelity personalization in text-to-image generation that handles both single and multiple subjects without sacrificing subject fidelity.

Contribution

The paper presents an encode-then-route framework. In the encoding step, ReferenceNet together with a CLIP vision encoder encodes the features of each subject. In the routing step, a decoupled instance-aware subject router places each subject precisely in the generated image.

Findings

01. Achieves high-fidelity personalization for single subjects.

02. Effectively handles multiple subjects without loss of detail.

03. Demonstrates superior alignment with text descriptions.

Abstract

Recently, large-scale generative models have demonstrated outstanding text-to-image generation capabilities. However, generating high-fidelity personalized images with specific subjects still presents challenges, especially in cases involving multiple subjects. In this paper, we propose AnyStory, a unified approach for personalized subject generation. AnyStory not only achieves high-fidelity personalization for single subjects, but also for multiple subjects, without sacrificing subject fidelity. Specifically, AnyStory models the subject personalization problem in an "encode-then-route" manner. In the encoding step, AnyStory utilizes a universal and powerful image encoder, i.e., ReferenceNet, in conjunction with CLIP vision encoder to achieve high-fidelity encoding of subject features. In the routing step, AnyStory utilizes a decoupled instance-aware subject router to accurately…
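
The "encode-then-route" idea in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names `route_subjects` and `inject_subjects`, the dot-product router, the fixed background logit, and all tensor shapes are assumptions made for illustration only. It assumes that subject features have already been encoded into tokens, that a router assigns each latent location a soft weight per subject (plus a background slot), and that each subject's features are injected only in proportion to that weight.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def route_subjects(latent_feats, subject_queries):
    """Hypothetical router: soft-assign each latent location to a subject.

    latent_feats:    (N, d) features of N latent locations.
    subject_queries: (S, d) one query vector per subject (assumed).
    Returns (N, S + 1) weights; the last column is a background slot.
    """
    n, d = latent_feats.shape
    logits = latent_feats @ subject_queries.T / np.sqrt(d)  # (N, S)
    background = np.zeros((n, 1))  # fixed background logit (assumption)
    return softmax(np.concatenate([logits, background], axis=1), axis=1)


def inject_subjects(latent_feats, subject_tokens, routing):
    """Add each subject's attended features, scaled by its routing weight.

    subject_tokens: list of (T_s, d) arrays, one per subject, standing in
    for the encoded subject features.
    """
    d = latent_feats.shape[1]
    out = latent_feats.copy()
    for s, tokens in enumerate(subject_tokens):
        # Each location attends over this subject's tokens only.
        attn = softmax(latent_feats @ tokens.T / np.sqrt(d), axis=1)
        # Locations routed elsewhere receive little of this subject.
        out += routing[:, s:s + 1] * (attn @ tokens)
    return out


# Toy usage: 16 latent locations, 2 subjects, feature width 8.
rng = np.random.default_rng(0)
latents = rng.normal(size=(16, 8))
queries = rng.normal(size=(2, 8))
tokens = [rng.normal(size=(4, 8)), rng.normal(size=(6, 8))]

routing = route_subjects(latents, queries)
conditioned = inject_subjects(latents, tokens, routing)
```

In this sketch, a location whose routing weight falls entirely on the background slot is left unchanged, which is the property that keeps one subject's features from leaking into regions assigned to another subject.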

Code & Models

Repositories

junjiehe96/AnyStory (PyTorch, official)

Models

Junjie96/AnyStory (Hugging Face model; 89 downloads, 3 likes)

Taxonomy

Topics: Video Analysis and Summarization · Digital Humanities and Scholarship · Topic Modeling

Methods: Contrastive Language-Image Pre-training