MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Pengyang Ling, Jiazi Bu, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin

TL;DR
MotionClone introduces a training-free method for controllable video generation that transfers motion from reference videos using sparse temporal attention, enabling flexible and efficient motion cloning without model retraining.
Contribution
MotionClone is a training-free framework that uses temporal attention for motion cloning, eliminating the need for model training or fine-tuning.
Findings
Effective in transferring both global and local motions
Outperforms existing methods in motion fidelity and temporal consistency
Enables versatile motion-controlled video generation from various inputs
Abstract
Motion-based controllable video generation offers the potential for creating captivating visual content. Existing methods typically necessitate model training to encode particular motion cues or incorporate fine-tuning to inject certain motion patterns, resulting in limited flexibility and generalization. In this work, we propose MotionClone, a training-free framework that enables motion cloning from reference videos to versatile motion-controlled video generation, including text-to-video and image-to-video. Based on the observation that the dominant components in temporal-attention maps drive motion synthesis, while the rest mainly capture noisy or very subtle motions, MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction…
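The idea of using sparse temporal attention as a motion representation can be illustrated with a toy sketch. The version below is not the authors' implementation: it assumes that "sparse" means keeping the top-k entries of each row of a reference temporal-attention map, and that guidance is a masked squared error between the reference and generated maps. The function names `sparse_topk_mask` and `motion_guidance_loss`, the parameter `k`, and the small 3x3 matrices are all hypothetical, chosen only for illustration.

```python
# Toy sketch (assumptions: top-k sparsification, squared-error guidance).
# Each row is one frame's temporal attention over all frames.

def sparse_topk_mask(attn_row, k=1):
    """Return a 0/1 mask that keeps the k largest entries of one attention row."""
    ranked = sorted(range(len(attn_row)), key=lambda i: attn_row[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(attn_row))]


def motion_guidance_loss(ref_attn, gen_attn, k=1):
    """Squared error between reference and generated attention,
    counted only at the dominant (masked) positions of the reference."""
    total = 0.0
    for ref_row, gen_row in zip(ref_attn, gen_attn):
        mask = sparse_topk_mask(ref_row, k)
        total += sum(m * (r - g) ** 2 for m, r, g in zip(mask, ref_row, gen_row))
    return total


# Hypothetical attention maps for a 3-frame clip.
ref_attn = [[0.7, 0.2, 0.1],
            [0.1, 0.6, 0.3],
            [0.2, 0.2, 0.6]]
gen_attn = [[0.4, 0.3, 0.3],
            [0.3, 0.4, 0.3],
            [0.3, 0.3, 0.4]]

loss = motion_guidance_loss(ref_attn, gen_attn, k=1)
print(loss)
```

In a real diffusion pipeline, a loss of this kind would presumably be differentiated with respect to the noisy latent at each sampling step to steer generation; that step is omitted here because the summary does not describe it.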
Taxonomy
Topics: Video Analysis and Summarization · Advanced Vision and Imaging · Human Motion and Animation
Methods: Diffusion
