Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Shuai Yang; Yifan Zhou; Ziwei Liu; Chen Change Loy

arXiv:2306.07954 · cs.CV · September 19, 2023 · 1 citation

Open Access · 1 Repo · 1 Dataset

TL;DR

This paper introduces a zero-shot, text-guided video translation framework that adapts image diffusion models to generate temporally consistent videos without re-training, leveraging hierarchical constraints and patch matching.

Contribution

The paper presents a two-part framework for video translation (key frame translation followed by full video translation) that enforces temporal coherence through hierarchical cross-frame constraints and patch matching, and that remains compatible with existing image diffusion models.

Findings

01 Achieves high-quality, temporally coherent videos

02 Operates without re-training or optimization

03 Compatible with existing diffusion techniques

Abstract

Large text-to-image diffusion models have exhibited impressive proficiency in generating high-quality images. However, when applying these models to the video domain, ensuring temporal consistency across video frames remains a formidable challenge. This paper proposes a novel zero-shot text-guided video-to-video translation framework to adapt image models to videos. The framework includes two parts: key frame translation and full video translation. The first part uses an adapted diffusion model to generate key frames, with hierarchical cross-frame constraints applied to enforce coherence in shapes, textures and colors. The second part propagates the key frames to other frames with temporal-aware patch matching and frame blending. Our framework achieves global style and local texture temporal consistency at a low cost (without re-training or optimization). The adaptation is compatible with…
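
The two-part structure described in the abstract can be illustrated with a small scheduling sketch. This is not the authors' implementation: the uniform key frame interval, the linear blending weights, and the function names `select_key_frames`, `blend_weights`, and `plan_propagation` are all assumptions made for illustration. The diffusion step and the patch matching itself are omitted; the sketch only shows which key frames each frame would draw from and with what weight.

```python
# Illustrative sketch only: names, uniform sampling, and linear weights are
# assumptions, not taken from the paper or its repository.

def select_key_frames(num_frames, interval):
    """Return uniformly spaced key frame indices, always including the last frame."""
    if num_frames <= 0 or interval <= 0:
        raise ValueError("num_frames and interval must be positive")
    keys = list(range(0, num_frames, interval))
    if keys[-1] != num_frames - 1:
        keys.append(num_frames - 1)
    return keys


def blend_weights(frame_idx, left_key, right_key):
    """Linear weights for the two neighboring key frames (assumed blending rule)."""
    if left_key == right_key:
        return (1.0, 0.0)
    t = (frame_idx - left_key) / (right_key - left_key)
    return (1.0 - t, t)


def plan_propagation(num_frames, interval):
    """For every frame, list its neighboring key frames and their blend weights."""
    keys = select_key_frames(num_frames, interval)
    plan = []
    for left, right in zip(keys, keys[1:]):
        for i in range(left, right):
            plan.append((i, left, right, blend_weights(i, left, right)))
    # The final key frame is translated directly, so it needs no blending.
    last = keys[-1]
    plan.append((last, last, last, (1.0, 0.0)))
    return plan


if __name__ == "__main__":
    for entry in plan_propagation(num_frames=10, interval=4):
        print(entry)
```

In this sketch, a frame that coincides with a key frame gets weight 1.0 for that key frame, and frames in between shift weight gradually from the earlier key frame to the later one.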

Code & Models

Repositories

williamyang1991/rerender_a_video
pytorch

Datasets

diffusers/community-pipelines-mirror
dataset · 29k dl

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques

Methods: Diffusion