Video Diffusion Models are Training-free Motion Interpreter and Controller

Zeqi Xiao; Yifan Zhou; Shuai Yang; Xingang Pan

arXiv:2405.14864·cs.CV·November 13, 2024

TL;DR

This paper reveals that video diffusion models inherently encode motion features, introduces a training-free motion control method using these features, and demonstrates its effectiveness in generating natural, controllable video motion.

Contribution

It uncovers the existence of interpretable motion-aware features in video diffusion models and proposes a training-free framework for motion control leveraging these features.

Findings

01. Motion-aware features are inherently encoded in diffusion models.

02. The proposed MOFT method enables training-free motion extraction.

03. The framework achieves competitive results in natural motion generation.

Abstract

Video generation primarily aims to model authentic and customized motion across frames, making understanding and controlling the motion a crucial topic. Most diffusion-based studies on video motion focus on motion customization with training-based paradigms, which, however, demand substantial training resources and necessitate retraining for diverse models. Crucially, these approaches do not explore how video diffusion models encode cross-frame motion information in their features, and so lack interpretability and transparency in their effectiveness. To answer this question, this paper introduces a novel perspective to understand, localize, and manipulate motion-aware features in video diffusion models. Through analysis using Principal Component Analysis (PCA), our work discloses that robust motion-aware features already exist in video diffusion models. We present a new MOtion FeaTure…
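The PCA analysis mentioned in the abstract can be illustrated with a small sketch. This is not the paper's code: the feature matrix is synthetic, and the names `pca_project`, `motion_channels`, and the "pan left / pan right" labels are hypothetical stand-ins for features that would, in practice, be read out of a video diffusion model. The sketch only shows how PCA can surface a small set of channels that vary with motion direction.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project rows of `features` (n_samples, n_channels) onto the top principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # SVD of the centered data: rows of `vt` are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return centered @ vt[:n_components].T, vt[:n_components], explained[:n_components]

# Synthetic stand-in for per-video features (assumption: one vector per video).
rng = np.random.default_rng(0)
n_videos, n_channels = 40, 64
# Hypothetical labels: first half "pan left" (-1), second half "pan right" (+1).
direction = np.repeat([-1.0, 1.0], n_videos // 2)
features = rng.normal(scale=0.1, size=(n_videos, n_channels))
motion_channels = [3, 17, 42]  # arbitrary channels chosen to carry the motion signal
features[:, motion_channels] += 2.0 * direction[:, None]

projected, components, explained = pca_project(features)
# Channels with the largest weight in the first principal component.
top_channels = np.argsort(-np.abs(components[0]))[:3]
print("channels dominating PC1:", sorted(top_channels.tolist()))
print("variance explained by PC1:", round(float(explained[0]), 3))
```

In this toy setting the first principal component is dominated by the channels that were made to depend on motion direction, and the sign of the projection separates the two directions. Whether real diffusion features behave this cleanly is an empirical question that the paper addresses.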

Code & Models

Repositories

ditflow/ditflow
pytorch

Taxonomy

Topics: Model Reduction and Neural Networks

Methods: Sparse Evolutionary Training · Focus · Diffusion