ConfCtrl: Enabling Precise Camera Control in Video Diffusion via Confidence-Aware Interpolation

Liudi Yang; George Eskandar; Fengyi Shen; Mohammad Altillawi; Yang Bai; Chi Zhang; Ziyuan Liu; Abhinav Valada

arXiv:2603.09819·cs.CV·March 11, 2026

TL;DR

ConfCtrl introduces a confidence-aware interpolation method for diffusion models to synthesize novel views from two images, accurately reconstructing unseen regions while adhering to prescribed camera trajectories.

Contribution

The paper presents ConfCtrl, a novel framework that combines confidence-weighted point cloud projections with a Kalman-inspired update to improve view synthesis accuracy.

Findings

01. Produces geometrically consistent novel views

02. Effectively reconstructs occluded regions

03. Maintains stable camera trajectories during synthesis

Abstract

We address the challenge of novel view synthesis from only two input images under large viewpoint changes. Existing regression-based methods lack the capacity to reconstruct unseen regions, while camera-guided diffusion models often deviate from intended trajectories due to noisy point cloud projections or insufficient conditioning from camera poses. To address these issues, we propose ConfCtrl, a confidence-aware video interpolation framework that enables diffusion models to follow prescribed camera poses while completing unseen regions. ConfCtrl initializes the diffusion process by combining a confidence-weighted projected point cloud latent with noise as the conditioning input. It then applies a Kalman-inspired predict-update mechanism, treating the projected point cloud as a noisy measurement and using learned residual corrections to balance pose-driven predictions with noisy…
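The two ideas named in the abstract, confidence-weighted initialization and a predict-update step that treats the projection as a noisy measurement, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the textbook element-wise Kalman update on plain lists of floats, and the function names, the linear blend with noise, and the use of a per-element measurement variance as a stand-in for low confidence are all assumptions made for illustration. The paper's learned residual corrections are not modeled here.

```python
import random


def confidence_weighted_init(projected, confidence, seed=0):
    """Blend projected values with Gaussian noise, element by element.

    Hypothetical helper: confidence 1.0 keeps the projected value,
    confidence 0.0 falls back to pure noise.
    """
    rng = random.Random(seed)
    return [
        c * p + (1.0 - c) * rng.gauss(0.0, 1.0)
        for p, c in zip(projected, confidence)
    ]


def kalman_style_update(prediction, measurement, pred_var, meas_var):
    """Textbook scalar Kalman update applied to each element.

    A large meas_var (an unreliable measurement) yields a small gain,
    so the result stays close to the prediction.
    """
    fused, fused_var = [], []
    for x, z, p, r in zip(prediction, measurement, pred_var, meas_var):
        gain = p / (p + r)                 # Kalman gain, between 0 and 1
        fused.append(x + gain * (z - x))   # correct prediction toward measurement
        fused_var.append((1.0 - gain) * p) # uncertainty shrinks after the update
    return fused, fused_var


if __name__ == "__main__":
    # Element 0: reliable measurement; element 1: unreliable measurement.
    fused, var = kalman_style_update(
        prediction=[0.0, 0.0],
        measurement=[1.0, 1.0],
        pred_var=[1.0, 1.0],
        meas_var=[0.01, 100.0],
    )
    print(fused, var)
```

In this sketch the first element moves almost all the way to its measurement while the second barely moves, which is the balancing behavior a predict-update scheme is meant to provide when measurement quality varies across the input.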

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis