OmniCamera: A Unified Framework for Multi-task Video Generation with Arbitrary Camera Control

Yukun Wang; Ruihuang Li; Jiale Tao; Shiyuan Yang; Liyi Chen; Zhantao Yang; Handz; Yulan Guo; Shuai Shao; Qinglin Lu

arXiv:2604.06010·cs.CV·April 8, 2026

TL;DR

OmniCamera is a unified framework that disentangles scene content and camera motion, enabling flexible, independent control over video generation with high visual quality.

Contribution

The paper introduces OmniCAM, a hybrid dataset combining curated real-world videos with synthetic data, and a dual-level curriculum co-training strategy, both aimed at improving multi-task video generation with arbitrary camera control.

Findings

01. Achieves state-of-the-art performance in flexible camera control.

02. Maintains high visual quality in generated videos.

03. Successfully disentangles content and camera motion for independent manipulation.

Abstract

Video fundamentally intertwines two crucial axes: the dynamic content of a scene and the camera motion through which it is observed. However, existing generation models often entangle these factors, limiting independent control. In this work, we introduce OmniCamera, a unified framework designed to explicitly disentangle and command these two dimensions. This compositional approach enables flexible video generation by allowing arbitrary pairings of camera and content conditions, unlocking unprecedented creative control. To overcome the fundamental challenges of modality conflict and data scarcity inherent in such a system, we present two key innovations. First, we construct OmniCAM, a novel hybrid dataset combining curated real-world videos with synthetic data that provides diverse paired examples for robust multi-task learning. Second, we propose a Dual-level Curriculum Co-Training…
