MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization
Zhexin Zhang, Yangyang Xu, Yifeng Zhu, Long Chen, Yong Du, Shengfeng He, Jun Yu

TL;DR
MotionAdapter is a content-aware framework for transferring complex motion between videos in DiT-based video diffusion models while keeping the transferred motion semantically aligned with the target content.
Contribution
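The paper proposes a two-part approach to motion transfer in diffusion-based video generation: explicitly disentangling motion from appearance, then adaptively customizing that motion to the target content, with the aim of improving robustness and semantic alignment.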
Findings
Outperforms state-of-the-art methods in both qualitative and quantitative evaluations.
Supports complex motion transfer and motion editing tasks.
Disentangles motion from appearance by analyzing cross-frame attention.
Abstract
Recent advances in diffusion-based text-to-video models, particularly those built on the diffusion transformer (DiT) architecture, have achieved remarkable progress in generating high-quality and temporally coherent videos. However, transferring complex motions between videos remains challenging. In this work, we present MotionAdapter, a content-aware motion transfer framework that enables robust and semantically aligned motion transfer within DiT-based video diffusion models. Our key insight is that effective motion transfer requires 1) explicit disentanglement of motion from appearance and 2) adaptive customization of motion to target content. MotionAdapter first isolates motion by analyzing cross-frame attention within 3D full-attention modules to extract attention-derived motion fields. To bridge the semantic gap between reference and target videos, we further introduce a DINO-guided…
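The abstract does not spell out how a motion field is read off cross-frame attention, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes each frame is a row-major grid of token feature vectors, computes scaled dot-product attention from the tokens of one frame to the tokens of another, and takes the attention-weighted average target position (a soft-argmax) minus the source position as the displacement. The function name `attention_motion_field`, its parameters, and the toy one-hot features are all hypothetical.

```python
import math


def attention_motion_field(q_feats, k_feats, width, temperature=1.0):
    """Soft-argmax displacement per source token (illustrative assumption).

    q_feats, k_feats: lists of token feature vectors for two frames,
    laid out row-major on a grid that is `width` tokens wide.
    Returns a list of (dy, dx) displacements, one per source token.
    """
    dim = len(q_feats[0])
    scale = 1.0 / (math.sqrt(dim) * temperature)
    field = []
    for src, q in enumerate(q_feats):
        # Scaled dot-product logits from this query to every key token.
        logits = [scale * sum(a * b for a, b in zip(q, k)) for k in k_feats]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(v - peak) for v in logits]
        total = sum(weights)
        # Attention-weighted expected position in the target frame.
        exp_y = sum(w * (j // width) for j, w in enumerate(weights)) / total
        exp_x = sum(w * (j % width) for j, w in enumerate(weights)) / total
        field.append((exp_y - src // width, exp_x - src % width))
    return field


# Toy usage: a 2x3 grid whose content shifts one column to the right
# (wrapping around) between frame A and frame B.
H, W = 2, 3
frame_a = [[10.0 if k == i else 0.0 for k in range(H * W)] for i in range(H * W)]
frame_b = [None] * (H * W)
for i, feat in enumerate(frame_a):
    y, x = divmod(i, W)
    frame_b[y * W + (x + 1) % W] = feat

field = attention_motion_field(frame_a, frame_b, W)
print(field[0])  # close to (0.0, 1.0): token (0, 0) moved one column right
```

With sharply peaked attention, as in this toy case, the soft-argmax reduces to a nearest-match lookup; with diffuse attention the displacement becomes a blend of candidate positions, which is one reason a real pipeline would likely need additional filtering or guidance.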
