Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli

arXiv:2405.12211 · cs.CV · May 21, 2024

TL;DR

Slicedit edits videos from a text prompt by applying a pretrained text-to-image diffusion model to spatiotemporal slices of the video in addition to its frames, aiming for temporally consistent, structure-preserving edits even under strong nonrigid motion.

Contribution

The paper proposes applying T2I diffusion models to spatiotemporal slices for zero-shot video editing, which improves temporal consistency without explicit correspondence mechanisms.

Findings

01 Effective editing of real-world videos with preserved motion and structure

02 Outperforms existing methods in temporal consistency and editing quality

03 Works across diverse video content and editing tasks

Abstract

Text-to-image (T2I) diffusion models achieve state-of-the-art results in image synthesis and editing. However, leveraging such pretrained models for video editing is considered a major challenge. Many existing works attempt to enforce temporal consistency in the edited video through explicit correspondence mechanisms, either in pixel space or between deep features. These methods, however, struggle with strong nonrigid motion. In this paper, we introduce a fundamentally different approach, which is based on the observation that spatiotemporal slices of natural videos exhibit similar characteristics to natural images. Thus, the same T2I diffusion model that is normally used only as a prior on video frames can also serve as a strong prior for enhancing temporal consistency by applying it on spatiotemporal slices. Based on this observation, we present Slicedit, a method for text-based…
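
To make the term "spatiotemporal slice" concrete, here is a minimal sketch of how such slices can be indexed out of a video array. It assumes a video stored as a NumPy array of shape (T, H, W, C); the function name `spatiotemporal_slices` and the array layout are illustrative assumptions, not taken from the paper or its repository, and the sketch shows only the slicing, not how Slicedit applies the diffusion model to the slices.

```python
import numpy as np


def spatiotemporal_slices(video: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return x-t and y-t slices of a video.

    Assumption (hypothetical layout): `video` has shape (T, H, W, C),
    i.e. frames, rows, columns, channels.
    """
    if video.ndim != 4:
        raise ValueError("expected a (T, H, W, C) array")
    # Fixing a row y gives an image with axes (time, x):
    # result shape is (H, T, W, C), one x-t slice per row.
    xt = video.transpose(1, 0, 2, 3)
    # Fixing a column x gives an image with axes (time, y):
    # result shape is (W, T, H, C), one y-t slice per column.
    yt = video.transpose(2, 0, 1, 3)
    return xt, yt


# Tiny synthetic "video": 4 frames of 3x5 pixels with 2 channels.
video = np.arange(4 * 3 * 5 * 2).reshape(4, 3, 5, 2)
xt, yt = spatiotemporal_slices(video)
```

Each `xt[y]` or `yt[x]` is itself an image-shaped array, so it could in principle be passed to any model that takes images as input; that is the property the abstract's observation relies on.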

Code & Models

Repositories

fallenshock/Slicedit (PyTorch, official)

Taxonomy

Topics: Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion