StableAnimator: High-Quality Identity-Preserving Human Image Animation

Shuyuan Tu; Zhen Xing; Xintong Han; Zhi-Qi Cheng; Qi Dai; Chong Luo; Zuxuan Wu

arXiv:2411.17697 · cs.CV · November 28, 2024

TL;DR

StableAnimator is an end-to-end video diffusion framework for identity-preserving human image animation. It combines a global content-aware Face Encoder, a distribution-aware ID Adapter, and HJB-based optimization to improve ID consistency in generated videos without post-processing.

Contribution

The paper introduces an ID-preserving video diffusion framework that pairs a distribution-aware ID Adapter with Hamilton-Jacobi-Bellman (HJB) based optimization to improve identity consistency.

Findings

01 Effective identity preservation demonstrated on multiple benchmarks.

02 High-quality video synthesis without post-processing.

03 Quantitative and qualitative improvements over existing methods.

Abstract

Current diffusion models for human image animation struggle to ensure identity (ID) consistency. This paper presents StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a sequence of poses. Building upon a video diffusion model, StableAnimator contains carefully designed modules for both training and inference that strive for identity consistency. In particular, StableAnimator begins by computing image and face embeddings with their respective off-the-shelf extractors; the face embeddings are further refined by interacting with the image embeddings through a global content-aware Face Encoder. Then, StableAnimator introduces a novel distribution-aware ID Adapter that prevents interference caused by temporal layers while preserving ID via alignment. During inference, we…
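The abstract names two steps without giving their mechanics: refining face embeddings by letting them interact with image embeddings, and aligning feature distributions in the ID Adapter. The NumPy sketch below is only an illustration of what such steps could look like. It assumes that the interaction is a single-head cross-attention without learned projections and that the alignment is a simple mean and standard deviation match; neither choice is confirmed by the abstract, and the function names `refine_face_embeddings` and `align_distribution` are hypothetical, not taken from the StableAnimator repository.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def refine_face_embeddings(face_emb, image_emb):
    """Hypothetical refinement step (an assumption, not the paper's module).

    Each face token attends to all image tokens, and the attended image
    context is added back to the face token as a residual.
    """
    d = face_emb.shape[-1]
    scores = face_emb @ image_emb.T / np.sqrt(d)  # (n_face, n_image)
    weights = softmax(scores, axis=-1)            # each row sums to 1
    return face_emb + weights @ image_emb         # (n_face, d)


def align_distribution(source, target, eps=1e-6):
    """Hypothetical alignment step (an assumption, not the paper's module).

    Shifts and scales `source` so that its overall mean and standard
    deviation match those of `target`.
    """
    normalized = (source - source.mean()) / (source.std() + eps)
    return normalized * target.std() + target.mean()


# Toy data: 16 image tokens and 4 face tokens, each of dimension 8.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(16, 8))
face_emb = rng.normal(loc=2.0, scale=3.0, size=(4, 8))

refined = refine_face_embeddings(face_emb, image_emb)
aligned = align_distribution(refined, image_emb)
```

In this toy setup, `aligned` keeps the shape of the face embeddings while sharing the first two moments of `image_emb`, which is one simple reading of "preserving ID via alignment". The official repository should be consulted for the actual design.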

Code & Models

Repositories

Francis-Rings/StableAnimator
PyTorch · Official

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis

Methods: Adapter · Diffusion