Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation

Min-Jung Kim; Jeongho Kim; Hoiyeong Jin; Junha Hyung; Jaegul Choo

arXiv:2512.17040 · cs.CV · December 22, 2025

TL;DR

InfCam introduces a novel depth-free video generation framework that uses infinite homography warping and data augmentation to achieve high-fidelity, camera-controlled video synthesis with diverse trajectories, outperforming existing methods.

Contribution

The paper proposes InfCam, a depth-free, camera-controlled video generation method that encodes 3D rotations in 2D latent space and enhances data diversity, improving pose fidelity and visual quality.
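
For orientation, the term "infinite homography" in standard multi-view geometry refers to the 3x3 map H_inf = K_tgt R K_src^{-1}, which relates pixels of infinitely distant points between two views and depends only on the intrinsics and the relative rotation, not on depth or translation. The sketch below illustrates that textbook definition in NumPy and compares it with a depth-based reprojection. It is not InfCam's implementation: the function names, the pinhole intrinsics, and the example values are all assumptions made for illustration.

```python
import numpy as np


def intrinsics(focal, cx, cy):
    """Hypothetical pinhole intrinsics with square pixels and no skew."""
    return np.array([[focal, 0.0, cx],
                     [0.0, focal, cy],
                     [0.0, 0.0, 1.0]])


def rotation_y(angle_rad):
    """Rotation about the camera y-axis (a pan), used as the example R."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def infinite_homography(K_src, K_tgt, R):
    """H_inf = K_tgt @ R @ inv(K_src).

    R maps source-camera coordinates to target-camera coordinates.
    No depth map and no translation enter this expression.
    """
    return K_tgt @ R @ np.linalg.inv(K_src)


def warp_pixels(H, pixels):
    """Apply a 3x3 homography to an (N, 2) array of pixel coordinates."""
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def reproject_with_depth(K_src, K_tgt, R, t, pixels, depth):
    """Depth-based reprojection, shown only for comparison.

    Each pixel is lifted to a 3D point at the given depth (z-distance),
    moved into the target camera frame, and projected again.
    """
    homog = np.hstack([pixels, np.ones((len(pixels), 1))])
    rays = homog @ np.linalg.inv(K_src).T   # back-projected rays with z = 1
    points = rays * depth                   # 3D points in the source frame
    cam = points @ R.T + t                  # same points in the target frame
    proj = cam @ K_tgt.T
    return proj[:, :2] / proj[:, 2:3]
```

With these definitions, reprojecting at a very large depth approaches the infinite-homography warp, while nearby depths differ by a translation-dependent parallax term. That parallax term is the part that relies on an accurate depth estimate.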

Findings

01 Outperforms baseline methods in camera-pose accuracy

02 Generalizes well from synthetic to real-world data

03 Achieves high visual fidelity in generated videos

Abstract

Recent progress in video diffusion models has spurred growing interest in camera-controlled novel-view video generation for dynamic scenes, aiming to provide creators with cinematic camera control capabilities in post-production. A key challenge in camera-controlled video generation is ensuring fidelity to the specified camera pose while maintaining view consistency and reasoning about occluded geometry from limited observations. To address this, existing methods either train a trajectory-conditioned video generation model on a dataset of trajectory-video pairs, or estimate depth from the input video, reproject it along a target trajectory, and generate the unprojected regions. Nevertheless, existing methods struggle to generate camera-pose-faithful, high-quality videos for two main reasons: (1) reprojection-based approaches are highly susceptible to errors caused by inaccurate depth…

Code & Models

Models

🤗 emjay73/InfCam (model)

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation