VideoComposer: Compositional Video Synthesis with Motion Controllability

Xiang Wang; Hangjie Yuan; Shiwei Zhang; Dayou Chen; Jiuniu Wang; Yingya Zhang; Yujun Shen; Deli Zhao; Jingren Zhou

arXiv:2306.02018·cs.CV·June 7, 2023·43 cites

TL;DR

VideoComposer introduces a novel method for controllable video synthesis that integrates textual, spatial, and temporal conditions, including motion vectors, to generate videos with high inter-frame consistency and flexible content control.

Contribution

It presents a new compositional framework for video synthesis that effectively incorporates motion vectors and a spatio-temporal encoder for enhanced controllability and consistency.

Findings

01. Achieves high inter-frame consistency in synthesized videos.

02. Enables control via multiple input modalities such as text, sketches, and motion.

03. Demonstrates flexible and precise video content creation.

Abstract

The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis. However, achieving controllable video synthesis remains challenging due to the large variation of temporal dynamics and the requirement of cross-frame temporal consistency. Based on the paradigm of compositional generation, this work presents VideoComposer that allows users to flexibly compose a video with textual conditions, spatial conditions, and more importantly temporal conditions. Specifically, considering the characteristic of video data, we introduce the motion vector from compressed videos as an explicit control signal to provide guidance regarding temporal dynamics. In addition, we develop a Spatio-Temporal Condition encoder (STC-encoder) that serves as a unified interface to effectively incorporate the spatial and temporal relations…
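
To make the idea of a motion vector concrete, here is a minimal sketch of block-matching motion estimation, the general technique video codecs use to produce the per-block motion vectors that the abstract describes as a control signal. It is an illustration only, not the paper's extraction pipeline, which this excerpt does not describe. The function names, block size, search range, and the sign convention (displacement from a block in the current frame to its best match in the previous frame) are assumptions made for this example.

```python
import random


def sad(prev, curr, y, x, py, px, block):
    """Sum of absolute differences between a block of `curr` at (y, x)
    and a block of `prev` at (py, px)."""
    return sum(
        abs(curr[y + i][x + j] - prev[py + i][px + j])
        for i in range(block)
        for j in range(block)
    )


def block_motion_vectors(prev, curr, block=8, search=4):
    """Exhaustive block matching (hypothetical helper, not from the paper).

    For each non-overlapping block of `curr`, return the displacement
    (dy, dx) into `prev` with the lowest SAD within +/- `search` pixels.
    Zero displacement is kept unless a strictly better match is found.
    """
    h, w = len(curr), len(curr[0])
    field = []
    for y in range(0, h - block + 1, block):
        row = []
        for x in range(0, w - block + 1, block):
            best = (0, 0)
            best_cost = sad(prev, curr, y, x, y, x, block)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = y + dy, x + dx
                    # Skip candidates that fall outside the previous frame.
                    if py < 0 or px < 0 or py + block > h or px + block > w:
                        continue
                    cost = sad(prev, curr, y, x, py, px, block)
                    if cost < best_cost:
                        best, best_cost = (dy, dx), cost
            row.append(best)
        field.append(row)
    return field


# Synthetic test data: a random texture shifted down 2 and right 3 pixels
# (wrapping at the borders), standing in for two consecutive frames.
random.seed(0)
H = W = 32
prev_frame = [[random.randrange(256) for _ in range(W)] for _ in range(H)]
curr_frame = [
    [prev_frame[(y - 2) % H][(x - 3) % W] for x in range(W)] for y in range(H)
]

field = block_motion_vectors(prev_frame, curr_frame)
print(field[1][1])  # interior block: (-2, -3), pointing back to its source
```

The result is one two-component vector per block, that is, a coarse motion field for each frame pair. Blocks along the top and left borders cannot recover the shift in this toy setup because their true match lies outside the frame.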

Code & Models

Videos

VideoComposer: Compositional Video Synthesis with Motion Controllability · SlidesLive

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging