Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Wen Wang, Yan Jiang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen

TL;DR
This paper introduces vid2vid-zero, a zero-shot video editing method that uses pre-existing image diffusion models without training on videos, achieving consistent and high-quality edits in real-world videos.
Contribution
It presents a novel zero-shot video editing approach leveraging off-the-shelf image diffusion models with modules for text-video alignment, temporal consistency, and fidelity, without any video training.
Findings
Effective zero-shot editing of real-world videos.
Maintains temporal consistency across frames.
Enables editing of attributes, subjects, and scenes.
Abstract
Large-scale text-to-image diffusion models achieve unprecedented success in image generation and editing. However, how to extend such success to video editing is unclear. Recent initial attempts at video editing require significant text-to-video data and computational resources for training, which are often not accessible. In this work, we propose vid2vid-zero, a simple yet effective method for zero-shot video editing. Our vid2vid-zero leverages off-the-shelf image diffusion models and does not require training on any video. At the core of our method are a null-text inversion module for text-to-video alignment, a cross-frame modeling module for temporal consistency, and a spatial regularization module for fidelity to the original video. Without any training, we leverage the dynamic nature of the attention mechanism to enable bi-directional temporal modeling at test time. Experiments and…
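The abstract's idea of reusing attention for bi-directional temporal modeling at test time can be illustrated with a small sketch. This is not the authors' implementation: the function name `cross_frame_attention`, the tensor layout, and the choice to let every frame attend to all frames are assumptions made for illustration. The point is only that attention has no learned dependence on sequence length, so the keys and values of several frames can be concatenated without retraining.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_frame_attention(q, k, v):
    """Hypothetical sketch: q, k, v have shape (frames, tokens, dim).

    Each frame's queries attend to the keys/values of ALL frames
    (past and future), instead of only its own frame as in
    ordinary per-image self-attention.
    """
    f, n, d = q.shape
    k_all = k.reshape(f * n, d)           # pool keys across frames
    v_all = v.reshape(f * n, d)           # pool values across frames
    scores = q @ k_all.T / np.sqrt(d)     # (frames, tokens, frames * tokens)
    weights = softmax(scores, axis=-1)
    return weights @ v_all, weights       # output: (frames, tokens, dim)

# Usage: 3 frames, 4 tokens per frame, 8-dimensional features.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(3, 4, 8)) for _ in range(3))
out, weights = cross_frame_attention(q, k, v)
print(out.shape, weights.shape)
```

Because the attention weights are computed from the inputs at inference time, no new parameters are introduced; the same projection weights of a pretrained image model could in principle be reused, which is the property the abstract refers to as the dynamic nature of attention.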
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
Methods: Test · Diffusion
