FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint
Jiapeng Tang, Kai Li, Chengxiang Yin, Liuhao Ge, Fei Jiang, Jiu Xu, Matthias Nießner, Christian Häne, Timur Bagautdinov, Egor Zakharov, Peihong Guo

TL;DR
FactorPortrait is a novel video diffusion technique that enables realistic, controllable portrait animation from a single image, allowing independent manipulation of expressions, head poses, and viewpoints with high fidelity.
Contribution
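The paper introduces a disentangled control framework: a pre-trained image encoder extracts facial expression latents from the driving video, and a video diffusion transformer consumes them through an expression controller to animate the portrait, with separate control over head pose and camera viewpoint.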
Findings
Outperforms existing methods in realism and expressiveness
Achieves accurate control of facial expressions and head movements
Enables novel view synthesis from arbitrary viewpoints
Abstract
We introduce FactorPortrait, a video diffusion method for controllable portrait animation that enables lifelike synthesis from disentangled control signals of facial expressions, head movement, and camera viewpoints. Given a single portrait image, a driving video, and camera trajectories, our method animates the portrait by transferring facial expressions and head movements from the driving video while simultaneously enabling novel view synthesis from arbitrary viewpoints. We utilize a pre-trained image encoder to extract facial expression latents from the driving video as control signals for animation generation. Such latents implicitly capture nuanced facial expression dynamics with identity and pose information disentangled, and they are efficiently injected into the video diffusion transformer through our proposed expression controller. For camera and head pose control, we employ…
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Emotion and Mood Recognition
