MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
Muyao Niu, Xiaodong Cun, Xintao Wang, Yong Zhang, Ying Shan, and Yinqiang Zheng

TL;DR
MOFA-Video introduces a controllable image-to-video generation method that uses domain-aware motion adapters to incorporate various control signals, enabling more flexible and stable video animation from a single image.
Contribution
The paper proposes MOFA-Adapters, a novel domain-aware motion control mechanism that improves controllability and stability in image-to-video diffusion models.
Findings
Effective control with multiple signals like landmarks and trajectories.
Stable video generation with multi-scale feature guidance.
Adapters trained separately can be combined for enhanced control.
Abstract
We present MOFA-Video, an advanced controllable image animation method that generates video from a given image using various additional control signals (such as human landmark references, manual trajectories, or even another provided video) or their combinations. This differs from previous methods, which can only work in a specific motion domain or show weak control abilities with a diffusion prior. To achieve our goal, we design several domain-aware motion field adapters (i.e., MOFA-Adapters) to control the generated motions in the video generation pipeline. For MOFA-Adapters, we consider the temporal motion consistency of the video and first generate the dense motion flow from the given sparse control conditions; the multi-scale features of the given image are then warped as a guided feature for stable video diffusion generation. We naively train two motion adapters for…
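The abstract describes warping multi-scale image features with a dense motion flow. The following is a minimal sketch of that one step, not the authors' implementation: it assumes the dense flow is already available (in the paper it is produced from sparse controls by a learned adapter), uses nearest-neighbor backward warping on plain Python lists instead of learned features, and the names `warp_nearest`, `downscale_flow`, and `warp_pyramid` are hypothetical helpers introduced only for illustration.

```python
# Illustrative sketch only; not the MOFA-Video code.
# All function names here are hypothetical.

def warp_nearest(feature, flow):
    """Backward-warp a 2D feature map with a dense flow field.

    feature: H x W list of lists of floats.
    flow:    H x W list of lists of (dx, dy) tuples. Output pixel (y, x)
             samples the source at (y + dy, x + dx), rounded to nearest.
    Samples that fall outside the map are filled with 0.0.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx, sy = int(round(x + dx)), int(round(y + dy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = feature[sy][sx]
    return out


def downscale_flow(flow, factor):
    """Subsample the flow grid and shrink displacements by the same factor."""
    return [
        [(dx / factor, dy / factor) for dx, dy in row[::factor]]
        for row in flow[::factor]
    ]


def warp_pyramid(features, flow):
    """Warp each level of a feature pyramid with a matching-resolution flow.

    features: list of 2D maps; each level's height is assumed to divide
              the full-resolution flow height exactly.
    """
    warped = []
    for feat in features:
        factor = len(flow) // len(feat)
        level_flow = flow if factor == 1 else downscale_flow(flow, factor)
        warped.append(warp_nearest(feat, level_flow))
    return warped


# Toy two-level pyramid: a single active cell at each scale.
full = [[0.0] * 4 for _ in range(4)]
full[1][1] = 1.0
half = [[1.0, 0.0], [0.0, 0.0]]

# Uniform flow: every output pixel samples 2 pixels to its left,
# so content appears shifted 2 pixels to the right.
flow = [[(-2.0, 0.0)] * 4 for _ in range(4)]

warped_full, warped_half = warp_pyramid([full, half], flow)
```

In this toy setup the active cell moves two columns right at full resolution and one column right at half resolution, which is the consistency property that makes a single flow usable as guidance at several feature scales. A real system would typically use bilinear sampling on tensors rather than nearest-neighbor lookups on lists.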
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image and Signal Denoising Methods · Computer Graphics and Visualization Techniques
Methods: Diffusion
