FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation

Yunpeng Zhang; Qiang Wang; Fan Jiang; Yaqi Fan; Mu Xu; Yonggang Qi

arXiv:2502.13995·cs.GR·February 26, 2025

TL;DR

FantasyID is a tuning-free video generation framework that enhances face knowledge in diffusion transformers, ensuring identity preservation and realistic facial dynamics through 3D priors and adaptive feature injection.

Contribution

The paper introduces a face knowledge enhancement method that combines 3D priors with adaptive feature guidance for identity-preserving text-to-video synthesis.

Findings

1. Outperforms existing tuning-free IPT2V methods.
2. Effectively maintains facial identity during dynamic video synthesis.
3. Improves facial expression and pose diversity in generated videos.

Abstract

Tuning-free approaches that adapt large-scale pre-trained video diffusion models for identity-preserving text-to-video generation (IPT2V) have recently gained popularity due to their efficacy and scalability. However, significant challenges remain in achieving satisfactory facial dynamics while keeping the identity unchanged. In this work, we present FantasyID, a novel tuning-free IPT2V framework that enhances the face knowledge of a pre-trained video model built on diffusion transformers (DiT). Essentially, a 3D facial geometry prior is incorporated to ensure plausible facial structures during video synthesis. To prevent the model from learning copy-paste shortcuts that simply replicate the reference face across frames, a multi-view face augmentation strategy is devised to capture diverse 2D facial appearance features, thereby increasing the dynamics of facial expressions and head poses.…
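The copy-paste shortcut mentioned in the abstract can be illustrated with a small sketch. This is not the paper's procedure: the `FaceView` type, the `sample_reference_views` helper, and the yaw-gap threshold are all hypothetical names and assumptions chosen for illustration. The idea shown is only that reference views are drawn from poses that differ from the target frame, so the reference cannot simply be replicated.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class FaceView:
    """Hypothetical record for one rendered or captured view of a face."""
    yaw_deg: float   # head yaw of this view, in degrees
    image_id: str    # placeholder identifier for the image data


def sample_reference_views(
    views: Sequence[FaceView],
    target_yaw_deg: float,
    k: int,
    min_yaw_gap_deg: float = 15.0,   # assumed threshold, not from the paper
    rng: Optional[random.Random] = None,
) -> List[FaceView]:
    """Pick up to k reference views whose pose differs from the target frame.

    Views too close to the target pose are excluded so that the reference
    is never a near-duplicate of what the model must generate.
    """
    rng = rng or random.Random()
    candidates = [
        v for v in views if abs(v.yaw_deg - target_yaw_deg) >= min_yaw_gap_deg
    ]
    # Fall back to all views if the pose filter leaves too few candidates.
    if len(candidates) < k:
        candidates = list(views)
    return rng.sample(candidates, min(k, len(candidates)))


views = [FaceView(yaw, f"view_{i}") for i, yaw in enumerate([-40, -20, 0, 20, 40])]
refs = sample_reference_views(views, target_yaw_deg=0.0, k=2, rng=random.Random(0))
print([r.image_id for r in refs])
```

In this sketch the frontal view (yaw 0) is never chosen as a reference for a frontal target, which is one simple way to discourage direct replication of the reference face.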

Code & Models

Repositories

Fantasy-AMAP/fantasy-id (PyTorch, official)

Models

acvlab/FantasyID (Hugging Face model; 3 downloads, 4 likes)

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques