CameraCtrl: Enabling Camera Control for Text-to-Video Generation

Hao He; Yinghao Xu; Yuwei Guo; Gordon Wetzstein; Bo Dai; Hongsheng Li; Ceyuan Yang

arXiv:2404.02101·cs.CV·March 17, 2025·2 cites

TL;DR

CameraCtrl introduces a method for precise camera pose control in text-to-video generation, enhancing narrative expressiveness and customization by training a plug-and-play module on top of existing diffusion models.

Contribution

It presents a camera control module that plugs into an existing video diffusion model while leaving the base model's other modules untouched, improving controllability and generalization in video synthesis.

Findings

01. CameraCtrl achieves accurate camera pose control in generated videos.

02. Training on videos with diverse camera distributions, and with an appearance similar to the base model's, improves controllability and generalization.

03. The method enhances dynamic and customized video storytelling capabilities.

Abstract

Controllability plays a crucial role in video generation, as it allows users to create and edit content more precisely. Existing models, however, lack control over camera pose, which serves as a cinematic language for expressing deeper narrative nuances. To alleviate this issue, we introduce CameraCtrl, enabling accurate camera pose control for video diffusion models. Our approach explores effective camera trajectory parameterization along with a plug-and-play camera pose control module that is trained on top of a video diffusion model, leaving other modules of the base model untouched. Moreover, a comprehensive study on the effect of various training datasets is conducted, suggesting that videos with diverse camera distributions and an appearance similar to the base model's indeed enhance controllability and generalization. Experimental results demonstrate the effectiveness of CameraCtrl in…
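
The abstract does not say which trajectory parameterization is used. One option for conditioning on camera pose, and the one the full CameraCtrl paper reports adopting, is a per-pixel Plücker embedding: each pixel's viewing ray is encoded as the 6-vector (o × d, d), where o is the camera center and d is the unit ray direction in world coordinates. The sketch below is a minimal illustration under stated assumptions (a pinhole camera, a camera-to-world rotation, rays sampled at pixel centers). The function name `plucker_embedding` and its arguments are hypothetical and are not taken from the official repository.

```python
import math


def plucker_embedding(fx, fy, cx, cy, R, t, width, height):
    """Per-pixel Plücker ray embedding for one camera (illustrative sketch).

    Assumptions (not from the source page):
      - pinhole intrinsics fx, fy, cx, cy, in pixels
      - R is a 3x3 camera-to-world rotation given as nested lists
      - t is the camera center in world coordinates (length-3 sequence)
    Returns a height x width grid of 6-tuples (o x d, d).
    """
    grid = []
    for v in range(height):
        row = []
        for u in range(width):
            # Back-project the pixel center into a camera-space direction.
            d_cam = ((u + 0.5 - cx) / fx, (v + 0.5 - cy) / fy, 1.0)
            # Rotate the direction into world space and normalize it.
            d = [sum(R[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
            norm = math.sqrt(sum(c * c for c in d))
            d = [c / norm for c in d]
            # Moment vector: cross product of camera center and direction.
            m = [
                t[1] * d[2] - t[2] * d[1],
                t[2] * d[0] - t[0] * d[2],
                t[0] * d[1] - t[1] * d[0],
            ]
            row.append((*m, *d))
        grid.append(row)
    return grid


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # One camera, shifted one unit along x, rendering a 2x2 image.
    emb = plucker_embedding(1.0, 1.0, 1.0, 1.0, identity, (1.0, 0.0, 0.0), 2, 2)
    for row in emb:
        for ray in row:
            print([round(c, 3) for c in ray])
```

For a video, the same computation would be repeated for each frame's camera pose, giving one 6-channel map per frame that a control module could consume alongside the video latents. How those maps are encoded and injected into the base model is not described on this page.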

Code & Models

Repositories

hehao13/cameractrl (PyTorch, official)

Taxonomy

Topics: Video Analysis and Summarization · Human Motion and Animation · Advanced Image and Video Retrieval Techniques