MLV-Edit: Towards Consistent and Highly Efficient Editing for Minute-Level Videos

Yangyi Cao; Yuanhang Li; Lan Chen; Qi Mao

arXiv:2602.02123·cs.CV·March 4, 2026

TL;DR

MLV-Edit is a flow-based, training-free framework that enables consistent and efficient minute-level video editing by addressing temporal coherence and structural drift through segment-wise processing and specialized modules.

Contribution

It introduces a novel divide-and-conquer approach with Velocity Blend and Attention Sink modules to improve long-duration video editing quality without training overhead.
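To make the boundary-blending idea concrete, here is a minimal sketch, not the authors' implementation. It assumes adjacent segments share a few overlapping frames and that each segment yields a per-frame velocity field from the flow model. The function name `blend_velocities`, the `overlap` parameter, and the linear cross-fade are all hypothetical choices made for illustration; the paper's actual Velocity Blend rule may differ.

```python
import numpy as np


def blend_velocities(v_prev, v_next, overlap):
    """Cross-fade two per-frame velocity fields over their shared frames.

    Hypothetical illustration only (assumed linear ramp, assumed overlap).
    v_prev:  array (T1, ...) of velocities for the earlier segment.
    v_next:  array (T2, ...) of velocities for the later segment; its first
             `overlap` frames cover the same video frames as the last
             `overlap` frames of v_prev.
    Returns an array of shape (T1 + T2 - overlap, ...).
    """
    if not 0 < overlap <= min(len(v_prev), len(v_next)):
        raise ValueError("overlap must be in 1..min(len(v_prev), len(v_next))")

    # Weights ramp from near 0 to near 1 across the overlap, so the result
    # leans on the earlier segment first and on the later segment last.
    w = np.linspace(0.0, 1.0, overlap + 2)[1:-1]
    w = w.reshape((overlap,) + (1,) * (v_prev.ndim - 1))

    mixed = (1.0 - w) * v_prev[-overlap:] + w * v_next[:overlap]
    return np.concatenate([v_prev[:-overlap], mixed, v_next[overlap:]], axis=0)
```

Frames outside the overlap are passed through unchanged, so only the seam between two segments is affected by the blend.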

Findings

01 Outperforms state-of-the-art methods in temporal stability

02 Reduces flickering and boundary artifacts

03 Maintains semantic fidelity in long videos

Abstract

We propose MLV-Edit, a training-free, flow-based framework that addresses the unique challenges of minute-level video editing. While existing techniques excel at short-form video manipulation, scaling them to long-duration videos remains challenging due to prohibitive computational overhead and the difficulty of maintaining global temporal consistency across thousands of frames. To address these challenges, MLV-Edit employs a divide-and-conquer strategy for segment-wise editing, facilitated by two core modules. Velocity Blend rectifies motion inconsistencies at segment boundaries by aligning the flow fields of adjacent chunks, eliminating the flickering and boundary artifacts commonly observed in fragmented video processing. Attention Sink anchors local segment features to global reference frames, effectively suppressing cumulative structural drift. Extensive quantitative and qualitative experiments…
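One way to picture anchoring local segment features to global reference frames is to let every segment's attention also see keys and values taken from those reference frames. The sketch below is an assumption-laden illustration, not the paper's module: single-head attention in NumPy, with the hypothetical names `attention_with_sink`, `k_sink`, and `v_sink` standing in for features of the reference frames.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_with_sink(q, k, v, k_sink, v_sink):
    """Single-head attention over local tokens plus shared reference tokens.

    Hypothetical illustration only.
    q, k, v:         arrays (n, d) from the segment being edited.
    k_sink, v_sink:  arrays (m, d) from the global reference frames
                     (assumed to be computed once and reused per segment).
    Returns (output of shape (n, d), attention weights of shape (n, m + n)).
    """
    # Prepend the reference tokens so every segment attends to the same anchor.
    k_all = np.concatenate([k_sink, k], axis=0)
    v_all = np.concatenate([v_sink, v], axis=0)

    scores = q @ k_all.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v_all, weights
```

Because the same `k_sink` and `v_sink` are reused for every segment in this sketch, each segment's output is pulled toward a common reference, which is the intuition behind limiting drift across segments.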

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization · Video Coding and Compression Technologies