DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

Yujie Wei; Shiwei Zhang; Zhiwu Qing; Hangjie Yuan; Zhiheng Liu; Yu Liu; Yingya Zhang; Jingren Zhou; Hongming Shan

arXiv:2312.04433 · cs.CV · December 8, 2023 · 6 cites

TL;DR

DreamVideo is a two-stage, diffusion-based approach to personalized video generation. It learns subject appearance from a few static images and target motion from a few videos, and it outperforms existing methods.

Contribution

It proposes a novel decoupled framework with subject and motion adapters, enabling flexible and high-quality customized video synthesis from minimal input data.

Findings

01 Outperforms state-of-the-art methods in personalized video generation

02 Effectively captures fine subject details from limited images

03 Accurately models target motion patterns from few videos

Abstract

Customized generation using diffusion models has made impressive progress in image generation, but remains unsatisfactory in the challenging video generation task, as it requires the controllability of both subjects and motions. To that end, we present DreamVideo, a novel approach to generating personalized videos from a few static images of the desired subject and a few videos of target motion. DreamVideo decouples this task into two stages, subject learning and motion learning, by leveraging a pre-trained video diffusion model. The subject learning aims to accurately capture the fine appearance of the subject from provided images, which is achieved by combining textual inversion and fine-tuning of our carefully designed identity adapter. In motion learning, we architect a motion adapter and fine-tune it on the given videos to effectively model the target motion pattern. Combining…
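
The two-stage decoupling described in the abstract can be illustrated with a small sketch of which parameter groups are updated in each stage. This is only an illustration under stated assumptions, not the authors' implementation: the group names, the `configure_stage` helper, and the choice to keep the pre-trained video diffusion model frozen and to train the textual-inversion embedding together with the identity adapter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """A named bundle of parameters with a trainable flag (hypothetical)."""
    name: str
    trainable: bool = False


# Hypothetical group names; the abstract only names the components.
GROUP_NAMES = (
    "base_video_diffusion",      # pre-trained model, assumed frozen here
    "subject_token_embedding",   # learned via textual inversion
    "identity_adapter",          # fine-tuned on the subject images
    "motion_adapter",            # fine-tuned on the motion videos
)

# Assumed mapping from stage to the groups updated in that stage.
STAGES = {
    "subject_learning": {"subject_token_embedding", "identity_adapter"},
    "motion_learning": {"motion_adapter"},
}


def make_groups():
    """Create all parameter groups, frozen by default."""
    return {name: ParamGroup(name) for name in GROUP_NAMES}


def configure_stage(groups, stage):
    """Mark only the groups for `stage` as trainable; freeze the rest.

    Returns the sorted names of the trainable groups.
    """
    wanted = STAGES[stage]
    for group in groups.values():
        group.trainable = group.name in wanted
    return sorted(name for name, g in groups.items() if g.trainable)


if __name__ == "__main__":
    groups = make_groups()
    for stage in STAGES:
        print(stage, "->", configure_stage(groups, stage))
```

Because each stage touches a disjoint set of groups in this sketch, a subject learned from one set of images and a motion learned from another set of videos can be trained independently of each other.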

Code & Models

Repositories

ali-vilab/VGen (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Video Analysis and Summarization

Methods: Diffusion · Adapter