ConfCtrl: Enabling Precise Camera Control in Video Diffusion via Confidence-Aware Interpolation
Liudi Yang, George Eskandar, Fengyi Shen, Mohammad Altillawi, Yang Bai, Chi Zhang, Ziyuan Liu, Abhinav Valada

TL;DR
ConfCtrl is a confidence-aware video interpolation framework that lets diffusion models synthesize novel views from two input images, completing unseen regions while following prescribed camera trajectories.
Contribution
The paper presents ConfCtrl, a novel framework that combines confidence-weighted point cloud projections with a Kalman-inspired update to improve view synthesis accuracy.
Findings
Produces geometrically consistent novel views
Effectively reconstructs occluded regions
Maintains stable camera trajectories during synthesis
Abstract
We address the challenge of novel view synthesis from only two input images under large viewpoint changes. Existing regression-based methods lack the capacity to reconstruct unseen regions, while camera-guided diffusion models often deviate from intended trajectories due to noisy point cloud projections or insufficient conditioning from camera poses. To address these issues, we propose ConfCtrl, a confidence-aware video interpolation framework that enables diffusion models to follow prescribed camera poses while completing unseen regions. ConfCtrl initializes the diffusion process by combining a confidence-weighted projected point cloud latent with noise as the conditioning input. It then applies a Kalman-inspired predict-update mechanism, treating the projected point cloud as a noisy measurement and using learned residual corrections to balance pose-driven predictions with noisy…
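The two steps named in the abstract (a confidence-weighted initialization and a Kalman-inspired predict-update) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names `confidence_weighted_init` and `kalman_style_update`, the linear blending formula, and the use of the confidence map itself as the gain are all assumptions made for illustration. The abstract describes learned residual corrections, which are replaced here by a fixed, hand-set gain.

```python
import numpy as np

def confidence_weighted_init(projected_latent, confidence, noise):
    """Blend a projected point-cloud latent with noise (hypothetical form).

    Assumption: high-confidence elements keep the projected latent,
    low-confidence elements fall back to pure noise.
    """
    confidence = np.clip(confidence, 0.0, 1.0)
    return confidence * projected_latent + (1.0 - confidence) * noise

def kalman_style_update(prediction, measurement, gain):
    """Textbook Kalman-style correction: x = pred + K * (meas - pred).

    Assumption: `gain` is an elementwise weight in [0, 1]; the paper's
    learned residual correction is not modeled here.
    """
    gain = np.clip(gain, 0.0, 1.0)
    innovation = measurement - prediction  # how far the measurement is from the prediction
    return prediction + gain * innovation

# Toy tensors standing in for latents (channels, height, width).
rng = np.random.default_rng(0)
projected = rng.normal(size=(4, 8, 8))    # noisy "measurement" from the point cloud
confidence = rng.uniform(size=(1, 8, 8))  # per-pixel confidence, broadcast over channels
noise = rng.normal(size=(4, 8, 8))
prediction = rng.normal(size=(4, 8, 8))   # stand-in for a pose-driven prediction

x_init = confidence_weighted_init(projected, confidence, noise)
x_updated = kalman_style_update(prediction, projected, gain=confidence)
```

In this sketch a gain of 0 trusts the pose-driven prediction entirely and a gain of 1 trusts the projected point cloud entirely, which is the balance the abstract attributes to the predict-update mechanism.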
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
