TL;DR
FlowAnchor is a training-free, flow-based video editing framework that stabilizes editing signals in high-dimensional latent spaces to achieve coherent and efficient multi-object video editing.
Contribution
It introduces explicit anchoring mechanisms, Spatial-aware Attention Refinement and Adaptive Magnitude Modulation, to stabilize editing signals in inversion-free video editing.
Findings
Achieves more faithful and temporally coherent video edits.
Handles multi-object scenes and fast motion effectively.
Offers a computationally efficient editing process.
Abstract
We propose FlowAnchor, a training-free framework for stable and efficient inversion-free, flow-based video editing. Inversion-free editing methods have recently shown impressive efficiency and structure preservation in images by directly steering the sampling trajectory with an editing signal. However, extending this paradigm to videos remains challenging, often failing in multi-object scenes or with increased frame counts. We identify the root cause as the instability of the editing signal in high-dimensional video latent spaces, which arises from imprecise spatial localization and length-induced magnitude attenuation. To overcome this challenge, FlowAnchor explicitly anchors both where to edit and how strongly to edit. It introduces Spatial-aware Attention Refinement, which enforces consistent alignment between textual guidance and spatial regions, and Adaptive Magnitude Modulation,…
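As a rough illustration of what anchoring "where to edit" and "how strongly to edit" could look like, here is a toy sketch. It is not the paper's algorithm: the abstract does not give formulas for either module. The sketch assumes the editing signal is the difference between a target-prompt velocity and a source-prompt velocity, that a spatial mask is already available, and that magnitude is stabilized by rescaling each frame to a fixed RMS. The function name `anchored_edit_signal` and the parameter `target_rms` are hypothetical.

```python
import numpy as np


def anchored_edit_signal(v_src, v_tgt, mask, target_rms=1.0, eps=1e-8):
    """Toy localized, magnitude-normalized editing signal (illustrative only).

    v_src, v_tgt: arrays of shape (frames, height, width, channels), standing in
                  for velocities predicted under the source and target prompts.
    mask:         array of shape (frames, height, width) with values in [0, 1],
                  standing in for a spatial localization map (assumed given).
    target_rms:   hypothetical per-frame magnitude the signal is rescaled to.
    """
    # Assumed form of the editing signal: difference of the two velocities.
    delta = v_tgt - v_src

    # "Where to edit": keep the signal only inside the masked region.
    localized = delta * mask[..., None]

    # "How strongly to edit": per-frame RMS over the masked elements.
    channels = delta.shape[-1]
    masked_count = mask.sum(axis=(1, 2)) * channels
    sq_sum = (localized ** 2).sum(axis=(1, 2, 3))
    rms = np.sqrt(sq_sum / np.maximum(masked_count, eps))

    # Rescale every frame to the same magnitude; frames with an empty mask
    # stay at zero because their localized signal is already zero.
    scale = target_rms / np.maximum(rms, eps)
    return localized * scale[:, None, None, None]
```

With a binary mask, every frame of the output has RMS equal to `target_rms` inside the mask and is exactly zero outside it, regardless of how small the raw difference was. That is the property the sketch is meant to show; the actual refinement and modulation rules used by FlowAnchor may differ substantially.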
