Still-Moving: Customized Video Generation without Customized Video Data

Hila Chefer; Shiran Zada; Roni Paiss; Ariel Ephrat; Omer Tov; Michael Rubinstein; Lior Wolf; Tali Dekel; Tomer Michaeli; Inbar Mosseri

arXiv:2407.08674 · cs.CV · July 12, 2024 · 1 citation

Open Access

TL;DR

This paper introduces Still-Moving, a framework for customizing text-to-video (T2V) models without customized video data. It starts from a text-to-image (T2I) model customized on still images and uses lightweight adapters trained on static videos.

Contribution

The paper presents a method that uses Spatial and Motion Adapters to customize a T2V model from a T2I model customized on still images only, without requiring customized video data.

Findings

01 Effective in personalized, stylized, and conditional generation tasks

02 Seamlessly integrates spatial priors from T2I with motion priors of T2V

03 Maintains motion prior while adhering to customized spatial features

Abstract

Customizing text-to-image (T2I) models has seen tremendous progress recently, particularly in areas such as personalization, stylization, and conditional generation. However, expanding this progress to video generation is still in its infancy, primarily due to the lack of customized video data. In this work, we introduce Still-Moving, a novel generic framework for customizing a text-to-video (T2V) model, without requiring any customized video data. The framework applies to the prominent T2V design where the video model is built over a text-to-image (T2I) model (e.g., via inflation). We assume access to a customized version of the T2I model, trained only on still image data (e.g., using DreamBooth or StyleDrop). Naively plugging in the weights of the customized T2I model into the T2V model often leads to significant artifacts or insufficient adherence to the customization data. To…
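As a rough illustration of two ideas named on this page, a static video built from a still image and a lightweight adapter, the following is a minimal NumPy sketch. It is not the authors' implementation: the function `make_static_video`, the class `LowRankAdapter`, the low-rank residual form of the adapter, and all shapes are assumptions made only for illustration.

```python
import numpy as np


def make_static_video(image, num_frames):
    """Repeat one still image (H, W, C) along a new time axis -> (T, H, W, C)."""
    return np.repeat(image[np.newaxis, ...], num_frames, axis=0)


class LowRankAdapter:
    """Hypothetical lightweight adapter: features + (features @ down) @ up.

    The low-rank residual form is an assumption, not taken from the paper.
    """

    def __init__(self, dim, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.down = rng.normal(scale=0.02, size=(dim, rank))
        # Zero-initialised up-projection: the adapter starts as the identity,
        # so the base model's features are unchanged before any training.
        self.up = np.zeros((rank, dim))

    def __call__(self, features):
        return features + (features @ self.down) @ self.up


# Toy data standing in for a sample from a customized T2I model.
image = np.random.default_rng(1).random((8, 8, 3))
video = make_static_video(image, num_frames=4)

# Apply the (untrained) adapter over the channel dimension of every frame.
adapter = LowRankAdapter(dim=3, rank=2)
adapted = adapter(video)
```

Because the up-projection starts at zero, the untrained adapter leaves its input unchanged; only training would move it away from the identity. How the actual Spatial and Motion Adapters are placed and trained is not specified in the truncated abstract above.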

Taxonomy

Topics: Multimedia Communication and Technology

Methods: Adapter