LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer
Changgu Chen, Xiaoyan Yang, Junwei Shu, Changbo Wang, Yang Li

TL;DR
This paper introduces LMP, a framework that leverages motion priors in zero-shot video generation using diffusion transformers, enabling precise motion control from reference videos in both text-to-video and image-to-video tasks.
Contribution
The paper proposes a novel LMP framework with modules for foreground-background disentanglement, motion transfer, and appearance separation, improving motion control and generation quality in zero-shot video synthesis.
Findings
Achieves state-of-the-art results in video quality and control
Demonstrates effective motion transfer from reference videos
Enhances prompt-video consistency and diversity
Abstract
In recent years, large-scale pre-trained diffusion transformer models have made significant progress in video generation. While current DiT models can produce high-definition, high-frame-rate, and highly diverse videos, there is a lack of fine-grained control over the video content. Controlling the motion of subjects in videos using only prompts is challenging, especially when it comes to describing complex movements. Further, existing methods fail to control the motion in image-to-video generation, as the subject in the reference image often differs from the subject in the reference video in terms of initial position, size, and shape. To address this, we propose the Leveraging Motion Prior (LMP) framework for zero-shot video generation. Our framework harnesses the powerful generative capabilities of pre-trained diffusion transformers to enable motion in the generated videos to…
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image and Video Stabilization
Methods: Diffusion
