Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation
Xincheng Shuai, Henghui Ding, Zhenyuan Qin, Hao Luo, Xingjun Ma, Dacheng Tao

TL;DR
This paper introduces SynFMC, a synthetic dataset with comprehensive 6D pose annotations, and proposes FMC, a method for precise 3D-aware motion control in video generation, enabling independent or simultaneous control of camera and object movements.
Contribution
The paper presents SynFMC, a new synthetic dataset with detailed 6D pose annotations, and FMC, a novel motion control method that achieves high-fidelity, controllable video generation and is compatible with various text-to-image (T2I) models.
Findings
FMC outperforms previous methods in multiple scenarios.
SynFMC enables models to learn disentangled motion effects.
FMC allows independent or simultaneous control of camera and object motions.
Abstract
Controlling the movements of dynamic objects and the camera within generated videos is a meaningful yet challenging task. Due to the lack of datasets with comprehensive 6D pose annotations, existing text-to-video methods cannot simultaneously control the motions of both camera and objects in a 3D-aware manner, resulting in limited controllability over generated content. To address this issue and facilitate research in this field, we introduce a Synthetic Dataset for Free-Form Motion Control (SynFMC). The proposed SynFMC dataset includes diverse object and environment categories and covers various motion patterns according to specific rules, simulating common and complex real-world scenarios. The complete 6D pose information helps models learn to disentangle the motion effects of objects and the camera in a video. To provide precise 3D-aware motion control, we further…
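To make the role of separate 6D pose annotations concrete, here is a minimal sketch of standard rigid-body geometry. It is not the paper's FMC method or the SynFMC annotation format. A 6D pose is taken to be three rotational plus three translational degrees of freedom, stored as a 4x4 transform. The helper names `pose_to_matrix` and `object_in_camera`, the axis-angle parameterization, and the example coordinates are all assumptions made for illustration.

```python
import numpy as np

def pose_to_matrix(rotvec, translation):
    """Convert a 6D pose (axis-angle rotation + translation) to a 4x4 transform.

    Hypothetical helper: the axis-angle convention is an assumption,
    not the representation used by the dataset.
    """
    rotvec = np.asarray(rotvec, dtype=float)
    angle = np.linalg.norm(rotvec)
    R = np.eye(3)
    if angle > 1e-12:
        k = rotvec / angle
        # Skew-symmetric cross-product matrix of the unit axis.
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
        # Rodrigues' rotation formula.
        R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = translation
    return T

def object_in_camera(cam_to_world, obj_to_world):
    """Pose of the object expressed in the camera's coordinate frame."""
    return np.linalg.inv(cam_to_world) @ obj_to_world

# Scenario A: the camera moves +1 along x, the object stays at (0, 0, 5).
cam_a = pose_to_matrix([0, 0, 0], [1, 0, 0])
obj_a = pose_to_matrix([0, 0, 0], [0, 0, 5])

# Scenario B: the camera stays at the origin, the object moves to (-1, 0, 5).
cam_b = pose_to_matrix([0, 0, 0], [0, 0, 0])
obj_b = pose_to_matrix([0, 0, 0], [-1, 0, 5])

rel_a = object_in_camera(cam_a, obj_a)
rel_b = object_in_camera(cam_b, obj_b)

# Both scenarios give the same object pose relative to the camera.
same_relative_pose = np.allclose(rel_a, rel_b)
```

In both scenarios the object has the same pose relative to the camera, so the object's apparent motion alone cannot tell you whether the camera or the object moved. Annotating the camera and each object separately in a shared world frame removes that ambiguity.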
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
