Vid-CamEdit: Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry

Junyoung Seo; Jisang Han; Jaewoo Jung; Siyoon Jin; Joungbin Lee; Takuya Narihira; Kazumi Fukuda; Takashi Shibuya; Donghoon Ahn; Shoukang Hu; Seungryong Kim; Yuki Mitsufuji

arXiv:2506.13697 · cs.CV · June 17, 2025

TL;DR

Vid-CamEdit is a framework that enables realistic video re-synthesis along new camera paths by combining geometry estimation with generative rendering, even in challenging in-the-wild scenarios.

Contribution

The paper introduces a geometry-guided generative approach to camera trajectory editing that does not require extensive 4D training data.

Findings

01 Outperforms baselines in extreme trajectory extrapolation

02 Produces plausible videos from novel camera paths in real-world footage

03 Handles in-the-wild videos despite limited multi-view video data for training

Abstract

We introduce Vid-CamEdit, a novel framework for video camera trajectory editing, enabling the re-synthesis of monocular videos along user-defined camera paths. This task is challenging due to its ill-posed nature and the limited multi-view video data for training. Traditional reconstruction methods struggle with extreme trajectory changes, and existing generative models for dynamic novel view synthesis cannot handle in-the-wild videos. Our approach consists of two steps: estimating temporally consistent geometry, and generative rendering guided by this geometry. By integrating geometric priors, the generative model focuses on synthesizing realistic details where the estimated geometry is uncertain. We eliminate the need for extensive 4D training data through a factorized fine-tuning framework that separately trains spatial and temporal components using multi-view image and video data.…
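The geometry-guided step described in the abstract can be illustrated with a standard depth-based forward warp. The following is a minimal sketch, not the authors' implementation: it assumes a pinhole camera with intrinsics `K`, a per-pixel depth map, and a 4x4 rigid transform from the source to the target camera. The function name `reproject_with_depth` and all parameter names are hypothetical. It reprojects one frame into a new viewpoint and returns a mask of pixels that received no source sample, which is where a generative renderer would have to synthesize content.

```python
import numpy as np


def reproject_with_depth(image, depth, K, T_src_to_tgt):
    """Forward-warp an (h, w, c) image into a target camera.

    Hypothetical helper for illustration only. Returns the warped image
    and a boolean (h, w) mask that is True where no source pixel landed.
    """
    h, w = depth.shape
    n = h * w

    # Homogeneous pixel coordinates, shape (3, n), in row-major order.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(n)]).astype(np.float64)

    # Unproject to 3D in the source camera frame, then move to the target frame.
    pts_src = (np.linalg.inv(K) @ pix) * depth.ravel()
    R, t = T_src_to_tgt[:3, :3], T_src_to_tgt[:3, 3:4]
    pts_tgt = R @ pts_src + t

    # Project into the target image, keeping points in front of the camera.
    z = pts_tgt[2]
    in_front = z > 1e-6
    proj = K @ pts_tgt
    safe_z = np.where(in_front, z, 1.0)
    ut = np.round(proj[0] / safe_z).astype(int)
    vt = np.round(proj[1] / safe_z).astype(int)
    valid = in_front & (ut >= 0) & (ut < w) & (vt >= 0) & (vt < h)
    idx = np.flatnonzero(valid)

    # Z-buffer: record the nearest depth per target pixel, then keep only
    # the source points that match it so nearer surfaces occlude farther ones.
    zbuf = np.full((h, w), np.inf)
    np.minimum.at(zbuf, (vt[idx], ut[idx]), z[idx])
    win = idx[z[idx] == zbuf[vt[idx], ut[idx]]]

    warped = np.zeros_like(image)
    warped[vt[win], ut[win]] = image.reshape(n, -1)[win]
    holes = np.isinf(zbuf)  # True where the geometry provides no guidance
    return warped, holes
```

Applied per frame along a user-defined camera path, a warp like this yields a partial rendering plus a hole mask. In a pipeline of the kind the abstract outlines, such holes and regions with unreliable depth are presumably where the generative model contributes most, though the exact conditioning used by Vid-CamEdit is not specified here.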

Taxonomy

Topics: Human Motion and Animation · Advanced Vision and Imaging · 3D Shape Modeling and Analysis