NewMove: Customizing text-to-video models with novel motions

Joanna Materzynska; Josef Sivic; Eli Shechtman; Antonio Torralba; Richard Zhang; Bryan Russell

arXiv:2312.04966 · cs.CV · December 11, 2024 · 2 cites

Open Access

TL;DR

NewMove customizes motion in text-to-video models by learning from a few example videos, enabling diverse, multi-person, and multimodal video generation and outperforming prior appearance-based customization methods.

Contribution

The paper introduces a regularized finetuning approach for customizing motions in text-to-video models from a few samples, and extends it to multiple subjects and multimodal customization.

Findings

01

Outperforms prior appearance-based methods in motion customization

02

Enables multi-person and multimodal video generation with personalized motions

03

Provides a systematic quantitative evaluation and ablation study

Abstract

We introduce an approach for augmenting text-to-video generation models with customized motions, extending their capabilities beyond the motions depicted in the original training data. By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios. Our contributions are threefold. First, to achieve our results, we finetune an existing text-to-video model to learn a novel mapping from the depicted motion in the input examples to a new unique token. To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos. Second, by leveraging the motion priors in a pretrained model, our method can produce novel videos featuring multiple people doing the custom motion, and can invoke the motion in combination with other motions. Furthermore, our…
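The two ingredients named in the abstract, a new unique token and a regularization term over videos, can be illustrated with a minimal sketch. This is not the authors' implementation: the token string `<motion>`, the helper names, and the weighted-sum form of the objective are all assumptions made here for illustration (the weighted sum is borrowed from the general idea of prior-preservation finetuning; the abstract does not specify the form of the regularizer). The per-clip loss values stand in for whatever training loss the text-to-video model produces.

```python
from statistics import fmean


def add_motion_token(vocab, token="<motion>"):
    """Register a new unique token (hypothetical name) and return its id.

    Assumes existing ids are contiguous, starting at 0.
    """
    if token in vocab:
        raise ValueError(f"token {token!r} already exists in the vocabulary")
    vocab[token] = len(vocab)
    return vocab[token]


def combined_loss(custom_losses, reg_losses, reg_weight=1.0):
    """Mean loss on the custom-motion clips plus a weighted mean loss
    on regularization clips (assumed form, not taken from the paper)."""
    return fmean(custom_losses) + reg_weight * fmean(reg_losses)


# Toy vocabulary and a prompt that invokes the new token.
vocab = {"a": 0, "person": 1, "doing": 2}
token_id = add_motion_token(vocab)
prompt_ids = [vocab[w] for w in ["a", "person", "doing", "<motion>"]]

# Placeholder per-clip losses; a real run would compute these from the model.
loss = combined_loss([0.8, 0.6], [0.2, 0.4], reg_weight=0.5)
```

In this sketch, `reg_weight` trades off fitting the few custom clips against staying close to the pretrained model's behavior on the regularization clips; setting it to zero recovers plain finetuning on the custom clips only.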

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques