Video-to-Video Synthesis
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

TL;DR
This paper introduces a generative adversarial network framework for high-resolution, temporally coherent video synthesis from source videos such as sequences of semantic segmentation masks, improving both the quality and the length of synthesized videos.
Contribution
The paper presents a new video-to-video synthesis method with specialized architectures and a spatio-temporal adversarial loss, enabling high-resolution, realistic, and coherent videos from diverse inputs.
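As a rough illustration of what an adversarial objective over whole videos can look like, the following is a generic conditional GAN formulation written for sequences. The notation is assumed here for illustration and does not come from this page: $s_1^T$ is the source sequence (for example, segmentation masks), $x_1^T$ is the corresponding real video, $G$ is the generator, and $D$ is a discriminator that sees a video together with its source. The paper's full spatio-temporal objective is not reproduced on this page and may differ from this sketch.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{(x_1^T,\, s_1^T)} \big[ \log D(x_1^T, s_1^T) \big]
  + \mathbb{E}_{s_1^T} \big[ \log \big( 1 - D(G(s_1^T), s_1^T) \big) \big]
```

Because the discriminator in this sketch scores a sequence rather than a single frame, it can penalize frame-to-frame inconsistencies as well as unrealistic individual frames.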
Findings
Synthesizes street-scene videos at 2K resolution, up to 30 seconds long.
Outperforms existing methods on multiple benchmarks.
Is also effective for future video prediction.
Abstract
We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image synthesis problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality. In this paper, we propose a novel video-to-video synthesis approach under the generative adversarial learning framework. Through carefully-designed generator and discriminator architectures, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic,…
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Image Processing Techniques
