StableV2V: Stablizing Shape Consistency in Video-to-Video Editing
Chang Liu, Rui Li, Kaidong Zhang, Yunwei Lan, Dong Liu

TL;DR
StableV2V is a shape-consistent video editing method that aligns the motions carried over from the source video with the user prompt. The authors report better visual consistency and higher inference efficiency than existing methods, evaluated on a new DAVIS-Edit benchmark.
Contribution
The paper proposes a shape-consistent video editing pipeline that aligns the transferred motions with the edited content, and it introduces a testing benchmark, DAVIS-Edit, for evaluation.
Findings
Outperforms existing methods in visual consistency
Achieves higher inference efficiency
Demonstrates robustness across various prompts and difficulties
Abstract
Recent advancements of generative AI have significantly promoted content creation and editing, where prevailing studies further extend this exciting progress to video editing. In doing so, these studies mainly transfer the inherent motion patterns from the source videos to the edited ones, where results with inferior consistency to user prompts are often observed, due to the lack of particular alignments between the delivered motions and edited contents. To address this limitation, we present a shape-consistent video editing method, namely StableV2V, in this paper. Our method decomposes the entire editing pipeline into several sequential procedures, where it edits the first video frame, then establishes an alignment between the delivered motions and user prompts, and eventually propagates the edited contents to all other frames based on such alignment. Furthermore, we curate a testing…
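
The three sequential procedures named in the abstract (edit the first frame, align motions with the prompt, propagate to the remaining frames) can be sketched as a simple control flow. This is only an illustrative skeleton, not the authors' implementation: the function `edit_video` and the callables `edit_first_frame`, `align_motion`, and `propagate` are hypothetical names, and the assumption that each remaining frame is produced from the edited first frame plus one motion entry is ours, not stated in the abstract.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Motion = TypeVar("Motion")


def edit_video(
    frames: Sequence[Frame],
    prompt: str,
    edit_first_frame: Callable[[Frame, str], Frame],
    align_motion: Callable[[Sequence[Frame], Frame, str], Sequence[Motion]],
    propagate: Callable[[Frame, Motion], Frame],
) -> List[Frame]:
    """Hypothetical three-stage flow; every stage is supplied by the caller."""
    if not frames:
        return []

    # Stage 1: edit only the first frame according to the prompt.
    first = edit_first_frame(frames[0], prompt)

    # Stage 2: derive motions aligned with the edited content.
    # Assumption: one motion entry per remaining frame.
    motions = align_motion(frames, first, prompt)
    if len(motions) != len(frames) - 1:
        raise ValueError("expected one motion entry per remaining frame")

    # Stage 3: propagate the edited first frame along each aligned motion.
    return [first] + [propagate(first, m) for m in motions]
```

Passing the stages in as callables keeps the sketch independent of any particular image editor, motion estimator, or video model; each would need to be replaced with a real component.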
