Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training

Zhenghong Zhou; Jie An; Jiebo Luo

arXiv:2412.06029·cs.CV·December 10, 2024

TL;DR

Latent-Reframe adds precise camera control to a pre-trained video diffusion model at sampling time, with no fine-tuning. It reframes the latent codes of video frames to follow the input camera trajectory using time-aware point clouds, then inpaints and harmonizes the latents to keep video quality high.

Contribution

The paper introduces a sampling-stage method for camera control in pre-trained video diffusion models that requires neither fine-tuning nor additional datasets.

Findings

01 Achieves comparable or better camera control accuracy than training-based methods.

02 Maintains original model distribution and efficiency during sampling.

03 Produces high-quality videos with precise camera adjustments.

Abstract

Precise camera pose control is crucial for video generation with diffusion models. Existing methods require fine-tuning with additional datasets containing paired videos and camera pose annotations, which are both data-intensive and computationally costly, and can disrupt the pre-trained model distribution. We introduce Latent-Reframe, which enables camera control in a pre-trained video diffusion model without fine-tuning. Unlike existing methods, Latent-Reframe operates during the sampling stage, maintaining efficiency while preserving the original model distribution. Our approach reframes the latent code of video frames to align with the input camera trajectory through time-aware point clouds. Latent code inpainting and harmonization then refine the model latent space, ensuring high-quality video generation. Experimental results demonstrate that Latent-Reframe achieves comparable or…
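The reframing step described in the abstract can be pictured with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera, a per-pixel depth map already available at latent resolution, and a single rigid transform between source and target cameras. The function name `reframe_latent` and all of its parameters are hypothetical. The time-aware aspect of the point clouds, the latent inpainting, and the harmonization are not modeled; the returned mask only marks which cells would still need to be filled.

```python
import numpy as np

def reframe_latent(latent, depth, K, T_src_to_tgt):
    """Forward-warp one frame's latent map into a target camera view.

    Hypothetical helper for illustration only.
    latent:       (C, H, W) latent features of one frame
    depth:        (H, W) depth at latent resolution (assumed to be given)
    K:            (3, 3) pinhole intrinsics at latent resolution (assumed)
    T_src_to_tgt: (4, 4) rigid transform, source camera frame -> target frame
    Returns (warped, mask): warped latent (C, H, W) and a boolean (H, W)
    mask that is True where a source point landed.
    """
    C, H, W = latent.shape
    n = H * W

    # Homogeneous pixel coordinates (u, v, 1) for every latent cell.
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([u.ravel(), v.ravel(), np.ones(n)], axis=0).astype(np.float64)

    # Unproject to a 3D point cloud in the source camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()

    # Move the points into the target camera frame.
    pts_tgt = (T_src_to_tgt @ np.vstack([pts, np.ones((1, n))]))[:3]

    # Project onto the target image plane, guarding against z <= 0.
    z = pts_tgt[2]
    in_front = z > 1e-6
    proj = K @ pts_tgt
    safe_z = np.where(in_front, z, 1.0)
    u_t = np.rint(proj[0] / safe_z).astype(int)
    v_t = np.rint(proj[1] / safe_z).astype(int)
    valid = in_front & (u_t >= 0) & (u_t < W) & (v_t >= 0) & (v_t < H)

    # Z-buffer: for each target cell keep only the nearest source point.
    idx = np.flatnonzero(valid)
    cell = v_t[idx] * W + u_t[idx]
    order = np.lexsort((z[idx], cell))  # primary key: cell, secondary: depth
    idx, cell = idx[order], cell[order]
    cell_unique, first = np.unique(cell, return_index=True)
    src = idx[first]

    warped_flat = np.zeros((C, n), dtype=latent.dtype)
    mask_flat = np.zeros(n, dtype=bool)
    warped_flat[:, cell_unique] = latent.reshape(C, n)[:, src]
    mask_flat[cell_unique] = True
    return warped_flat.reshape(C, H, W), mask_flat.reshape(H, W)
```

Cells where the mask is False correspond to regions the new viewpoint reveals but the source frame never saw; in the pipeline the abstract describes, those are the regions that latent inpainting and harmonization would have to handle.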

Taxonomy

Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Advanced Data Compression Techniques

Methods: ALIGN · Diffusion · Inpainting