DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing
Hyeonho Jeong, Jinho Chang, Geon Yeong Park, Jong Chul Ye

TL;DR
DreamMotion introduces a novel space-time self-similarity matching technique during score distillation to enable effective zero-shot text-driven video editing, preserving original motion and structure.
Contribution
It proposes a new method that maintains motion consistency in video editing by matching space-time self-similarities during score distillation, applicable across diffusion frameworks.
Findings
Outperforms the leading methods it is compared against in both appearance alteration and motion preservation.
Effectively maintains original video structure during editing.
Applicable to both cascaded and non-cascaded diffusion models.
Abstract
Text-driven diffusion-based video editing presents a unique challenge not encountered in image editing literature: establishing real-world motion. Unlike existing video editing approaches, here we focus on score distillation sampling to circumvent the standard reverse diffusion process and initiate optimization from videos that already exhibit natural motion. Our analysis reveals that while video score distillation can effectively introduce new content indicated by target text, it can also cause significant structure and motion deviation. To counteract this, we propose to match space-time self-similarities of the original video and the edited video during score distillation. Thanks to the use of score distillation, our approach is model-agnostic and can be applied to both cascaded and non-cascaded video diffusion frameworks. Through extensive comparisons with leading methods,…
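The self-similarity matching idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say which features are compared, how similarity is measured, or how the loss is weighted. The sketch assumes a feature tensor of shape (T, H, W, C), cosine similarity, and a mean-squared-error penalty; all function names are hypothetical.

```python
import numpy as np


def self_similarity(features):
    """Cosine self-similarity of an (N, C) feature array, returned as (N, N)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-8)  # guard against zero vectors
    return unit @ unit.T


def self_similarity_loss(orig_feats, edit_feats):
    """Mean squared difference between two self-similarity maps."""
    diff = self_similarity(orig_feats) - self_similarity(edit_feats)
    return float(np.mean(diff ** 2))


def space_time_losses(orig, edit):
    """Return (spatial, temporal) losses for (T, H, W, C) feature tensors.

    Assumption: the spatial term compares per-frame token similarities,
    and the temporal term compares similarities between spatially
    averaged per-frame features.
    """
    T, H, W, C = orig.shape
    spatial = float(np.mean([
        self_similarity_loss(orig[t].reshape(H * W, C),
                             edit[t].reshape(H * W, C))
        for t in range(T)
    ]))
    temporal = self_similarity_loss(orig.mean(axis=(1, 2)),
                                    edit.mean(axis=(1, 2)))
    return spatial, temporal


# Usage: compare features of an original and an "edited" video.
rng = np.random.default_rng(0)
original = rng.normal(size=(4, 3, 3, 8))
edited = original + 0.5 * rng.normal(size=original.shape)
spatial_loss, temporal_loss = space_time_losses(original, edited)
```

Because cosine self-similarity depends only on how features relate to one another, an edit that changes appearance uniformly (for example, rescaling every feature) leaves the loss near zero, while an edit that rearranges structure or motion is penalized. In a score distillation loop, terms like these would be added to the distillation objective as regularizers.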
Taxonomy
Topics: Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis · Video Coding and Compression Technologies
Methods: Diffusion · Focus
