FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Weijie Lyu; Ming-Hsuan Yang; Zhixin Shu

arXiv:2603.05506·cs.CV·March 6, 2026

TL;DR

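FaceCam re-renders a monocular portrait video along a user-specified camera trajectory. It conditions a video generation model on a face-tailored, scale-aware camera representation that requires no 3D priors, targeting the geometric distortions and visual artifacts that earlier camera-control methods show on portraits.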

Contribution

We propose a face-specific, scale-aware camera representation, together with two camera-control data generation strategies (synthetic camera motion and multi-shot stitching), to improve controllability and visual quality in portrait video synthesis.

Findings

01 Outperforms existing methods in camera controllability and visual quality
02 Maintains identity and motion consistency in generated videos
03 Effective on diverse in-the-wild videos

Abstract

We introduce FaceCam, a system that generates video under customizable camera trajectories for monocular human portrait video input. Recent camera control approaches based on large video-generation models have shown promising progress but often exhibit geometric distortions and visual artifacts on portrait videos due to scale-ambiguous camera representations or 3D reconstruction errors. To overcome these limitations, we propose a face-tailored scale-aware representation for camera transformations that provides deterministic conditioning without relying on 3D priors. We train a video generation model on both multi-view studio captures and in-the-wild monocular videos, and introduce two camera-control data generation strategies: synthetic camera motion and multi-shot stitching, to exploit stationary training cameras while generalizing to dynamic, continuous camera trajectories at…
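The two data generation strategies named in the abstract can be illustrated with a small NumPy sketch. This is only one plausible reading, not the authors' procedure: it assumes synthetic camera motion can be approximated by sliding a crop window over footage from a stationary camera, and that multi-shot stitching joins clips of the same performance from two fixed cameras at a cut frame. The function names, the box format `(top, left, height, width)`, and the nearest-neighbor resampling are all hypothetical choices made for illustration.

```python
import numpy as np


def synthetic_camera_motion(frames, start_box, end_box, out_size):
    """Fake a moving camera from stationary footage (illustrative assumption).

    frames:    array of shape (T, H, W, C) from a fixed camera
    start_box: (top, left, height, width) crop at the first frame
    end_box:   (top, left, height, width) crop at the last frame
    out_size:  (out_h, out_w) resolution of every output frame
    """
    num_frames, height, width, channels = frames.shape
    out_h, out_w = out_size
    start = np.asarray(start_box, dtype=float)
    end = np.asarray(end_box, dtype=float)
    out = np.empty((num_frames, out_h, out_w, channels), dtype=frames.dtype)
    for t in range(num_frames):
        # Linearly interpolate the crop window over time.
        a = t / max(num_frames - 1, 1)
        top, left, box_h, box_w = (1.0 - a) * start + a * end
        # Nearest-neighbor sampling at the center of each output pixel.
        rows = np.floor(top + (np.arange(out_h) + 0.5) * box_h / out_h).astype(int)
        cols = np.floor(left + (np.arange(out_w) + 0.5) * box_w / out_w).astype(int)
        rows = np.clip(rows, 0, height - 1)
        cols = np.clip(cols, 0, width - 1)
        out[t] = frames[t][rows[:, None], cols[None, :]]
    return out


def multi_shot_stitch(view_a, view_b, cut):
    """Join two time-aligned fixed-camera clips at frame index `cut`."""
    return np.concatenate([view_a[:cut], view_b[cut:]], axis=0)
```

Calling `synthetic_camera_motion(frames, (0, 0, 8, 8), (2, 2, 4, 4), (4, 4))` on an 8×8 clip would zoom from the full frame into its central 4×4 region. Each sample built this way pairs stationary-camera footage with a known, time-varying camera change, which is the kind of supervision the abstract describes.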

Code & Models

Models

🤗 AEmotionStudio/facecam-wan2.2-14b-bf16 · model · ♡ 2
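A minimal sketch of fetching the checkpoint listed above, assuming it is hosted as a standard Hugging Face Hub model repository and that the `huggingface_hub` package is installed. The helper name `download_checkpoint` is hypothetical, the repository layout is not described on this page, and the download is likely to be large given the 14B bf16 naming.

```python
REPO_ID = "AEmotionStudio/facecam-wan2.2-14b-bf16"


def download_checkpoint(local_dir=None):
    """Download every file in the model repo and return the local path."""
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

How the downloaded weights are loaded for inference depends on the repository's own instructions, which are not reproduced here.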

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques