Motion Prompting: Controlling Video Generation with Motion Trajectories

Daniel Geng; Charles Herrmann; Junhwa Hur; Forrester Cole; Serena Zhang; Tobias Pfaff; Tatiana Lopez-Guevara; Carl Doersch; Yusuf Aytar; Michael Rubinstein; Chen Sun; Oliver Wang; Andrew Owens; Deqing Sun

arXiv:2412.02700 · cs.CV · March 31, 2025


Open Access

TL;DR

This paper introduces motion prompts, a flexible way to control video generation with motion trajectories. The method supports nuanced dynamic actions, scene control, and editing; the authors demonstrate it on several applications and report strong quantitative and human evaluation results.

Contribution

We propose motion prompts, a novel, flexible motion-conditioning method for video generation that encodes diverse trajectories and translates user requests into detailed motion controls.

Findings

01. Effective control of object and scene motion in generated videos.

02. Versatility demonstrated through applications like motion transfer and editing.

03. Strong quantitative and human evaluation results.

Abstract

Motion control is crucial for generating expressive and compelling video content; however, most existing video generation models rely mainly on text prompts for control, which struggle to capture the nuances of dynamic actions and temporal compositions. To this end, we train a video generation model conditioned on spatio-temporally sparse or dense motion trajectories. In contrast to prior motion conditioning work, this flexible representation can encode any number of trajectories, object-specific or global scene motion, and temporally sparse motion; due to its flexibility we refer to this conditioning as motion prompts. While users may directly specify sparse trajectories, we also show how to translate high-level user requests into detailed, semi-dense motion prompts, a process we term motion prompt expansion. We demonstrate the versatility of our approach through various applications,…
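To make the idea of a trajectory-based prompt concrete, the following is a minimal sketch of one way such a conditioning signal could be held in memory. It is not taken from the paper: the `MotionPrompt` class, the `linear_drag` helper, and the use of `None` to mark frames without a constraint are all hypothetical names and conventions chosen for illustration. The sketch only shows how a single container can hold any number of tracks, each of which may be specified at every frame or at only a few.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical convention: a point is (x, y) in pixels, or None when the
# track places no constraint on that frame (temporal sparsity).
Point = Optional[Tuple[float, float]]


@dataclass
class MotionPrompt:
    """Illustrative container, not the paper's representation."""
    num_frames: int
    tracks: List[List[Point]]  # tracks[i][t] is the i-th track at frame t

    def __post_init__(self) -> None:
        for track in self.tracks:
            if len(track) != self.num_frames:
                raise ValueError("every track needs one entry per frame")

    def density(self) -> float:
        """Fraction of (track, frame) slots that carry a position."""
        total = len(self.tracks) * self.num_frames
        if total == 0:
            return 0.0
        filled = sum(p is not None for track in self.tracks for p in track)
        return filled / total


def linear_drag(start: Tuple[float, float],
                end: Tuple[float, float],
                num_frames: int) -> List[Point]:
    """Hypothetical helper: a straight-line track from start to end."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    steps = num_frames - 1
    return [
        (start[0] + (end[0] - start[0]) * t / steps,
         start[1] + (end[1] - start[1]) * t / steps)
        for t in range(num_frames)
    ]


# One track specified at every frame, one specified only at the endpoints.
drag = linear_drag((0.0, 0.0), (3.0, 6.0), num_frames=4)
sparse = [(10.0, 10.0), None, None, (20.0, 10.0)]
prompt = MotionPrompt(num_frames=4, tracks=[drag, sparse])
```

In this toy layout, adding more tracks moves the prompt toward the dense end of the spectrum, while leaving frames as `None` keeps it temporally sparse; how the actual model encodes trajectories is described in the paper itself.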


Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies