FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
Weijie Lyu, Ming-Hsuan Yang, Zhixin Shu

TL;DR
FaceCam generates portrait video under customizable camera trajectories from a monocular input video. It avoids the geometric distortions and artifacts of earlier camera-control methods by using a face-tailored, scale-aware camera representation that does not rely on 3D priors.
Contribution
We propose a face-specific scale-aware camera representation and training strategies that improve controllability and visual quality in portrait video synthesis.
Findings
Outperforms existing methods in camera controllability and visual quality
Maintains identity and motion consistency in generated videos
Effective on diverse in-the-wild videos
Abstract
We introduce FaceCam, a system that generates video under customizable camera trajectories for monocular human portrait video input. Recent camera control approaches based on large video-generation models have shown promising progress but often exhibit geometric distortions and visual artifacts on portrait videos due to scale-ambiguous camera representations or 3D reconstruction errors. To overcome these limitations, we propose a face-tailored scale-aware representation for camera transformations that provides deterministic conditioning without relying on 3D priors. We train a video generation model on both multi-view studio captures and in-the-wild monocular videos, and introduce two camera-control data generation strategies: synthetic camera motion and multi-shot stitching, to exploit stationary training cameras while generalizing to dynamic, continuous camera trajectories at…
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques
