FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint

Jiapeng Tang; Kai Li; Chengxiang Yin; Liuhao Ge; Fei Jiang; Jiu Xu; Matthias Nießner; Christian Häne; Timur Bagautdinov; Egor Zakharov; Peihong Guo

arXiv:2512.11645·cs.CV·December 15, 2025

TL;DR

FactorPortrait is a novel video diffusion technique that enables realistic, controllable portrait animation from a single image, allowing independent manipulation of expressions, head poses, and viewpoints with high fidelity.

Contribution

It introduces a disentangled control framework that pairs a pre-trained image encoder, which extracts expression latents from the driving video, with a video diffusion transformer, giving separate control over facial expression, head pose, and camera viewpoint.

Findings

1. Outperforms existing methods in realism and expressiveness
2. Achieves accurate control of facial expressions and head movements
3. Enables novel view synthesis from arbitrary viewpoints

Abstract

We introduce FactorPortrait, a video diffusion method for controllable portrait animation that enables lifelike synthesis from disentangled control signals of facial expressions, head movement, and camera viewpoints. Given a single portrait image, a driving video, and camera trajectories, our method animates the portrait by transferring facial expressions and head movements from the driving video while simultaneously enabling novel view synthesis from arbitrary viewpoints. We utilize a pre-trained image encoder to extract facial expression latents from the driving video as control signals for animation generation. Such latents implicitly capture nuanced facial expression dynamics with identity and pose information disentangled, and they are efficiently injected into the video diffusion transformer through our proposed expression controller. For camera and head pose control, we employ…
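To make the idea of injecting per-frame expression latents into a diffusion transformer more concrete, here is a minimal NumPy sketch. The abstract does not say how the proposed expression controller works, so this uses adaptive layer-norm style modulation, a common conditioning technique, purely as a stand-in. The function names (`layer_norm`, `inject_expression`), the tensor shapes, and the weight matrices `w_scale` and `w_shift` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize the last axis to zero mean and unit variance (no learned affine)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def inject_expression(tokens, expr_latents, w_scale, w_shift):
    """Modulate video tokens with per-frame expression latents.

    ASSUMPTION: adaptive layer-norm modulation stands in for the paper's
    expression controller, whose actual design is not given in the abstract.

    tokens:       (T, N, D) video tokens, T frames with N tokens each
    expr_latents: (T, E)    one expression latent per driving frame
    w_scale:      (E, D)    hypothetical projection to a per-channel scale
    w_shift:      (E, D)    hypothetical projection to a per-channel shift
    """
    scale = expr_latents @ w_scale  # (T, D)
    shift = expr_latents @ w_shift  # (T, D)
    # Broadcast each frame's scale and shift over that frame's N tokens.
    return layer_norm(tokens) * (1.0 + scale[:, None, :]) + shift[:, None, :]


rng = np.random.default_rng(0)
T, N, D, E = 4, 16, 32, 8
tokens = rng.normal(size=(T, N, D))
expr_latents = rng.normal(size=(T, E))

# Zero-initialized projections leave the normalized tokens unchanged.
zero_w = np.zeros((E, D))
out_zero = inject_expression(tokens, expr_latents, zero_w, zero_w)

# Non-zero projections make every frame depend on its own expression latent.
w_scale = 0.1 * rng.normal(size=(E, D))
w_shift = 0.1 * rng.normal(size=(E, D))
out = inject_expression(tokens, expr_latents, w_scale, w_shift)
```

With zero projections the call reduces to plain layer normalization, which is why this kind of modulation is often zero-initialized when it is added to a pre-trained backbone. Whether FactorPortrait does the same is not stated in the text above.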

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Emotion and Mood Recognition