Re-Attentional Controllable Video Diffusion Editing
Yuanzhi Wang, Yong Li, Mengyi Liu, Xiaoya Zhang, Xin Liu, Zhen Cui, Antoni B. Chan

TL;DR
This paper introduces ReAtCo, a novel method for controllable video editing using diffusion models, which improves spatial alignment and preserves invariant regions, resulting in more accurate and high-fidelity edited videos.
Contribution
The paper proposes Re-Attentional Diffusion and Invariant Region-guided Joint Sampling strategies to enhance controllability and fidelity in text-guided video diffusion editing without additional training.
Findings
ReAtCo improves spatial alignment of edited objects.
ReAtCo reduces border artifacts in invariant regions.
ReAtCo achieves superior editing performance compared to existing methods.
Abstract
Editing videos with textual guidance has garnered popularity due to its streamlined process, which requires users only to edit the text prompt corresponding to the source video. Recent studies have explored and exploited large-scale text-to-image diffusion models for text-guided video editing, resulting in remarkable video editing capabilities. However, they may still suffer from limitations such as mislocated objects or an incorrect number of objects. Therefore, the controllability of video editing remains a formidable challenge. In this paper, we aim to address the above limitations by proposing a Re-Attentional Controllable Video Diffusion Editing (ReAtCo) method. Specifically, to align the spatial placement of the target objects with the edited text prompt in a training-free manner, we propose a Re-Attentional Diffusion (RAD) to refocus the cross-attention activation responses…
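The abstract is cut off before it describes how RAD works, so the following is only a generic illustration of what "refocusing" a cross-attention response toward a target region could look like, not the paper's method. It is a minimal NumPy sketch under stated assumptions: the function names (`region_mass`, `refocus`), the additive `boost` on attention logits, and the toy 8x8 layout are all hypothetical and do not come from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_mass(attn, token, mask):
    # Fraction of one token's attention (summed over pixels) that
    # falls inside the boolean region mask.
    a = attn[:, token]
    return float(a[mask].sum() / a.sum())

def refocus(logits, token, mask, boost=2.0):
    # Hypothetical refocusing rule (an assumption, not the paper's RAD):
    # raise the token's logits inside the region, lower them outside,
    # then renormalise over tokens at every pixel.
    out = logits.copy()
    out[mask, token] += boost
    out[~mask, token] -= boost
    return softmax(out, axis=-1)

# Toy setup: 8x8 latent grid, 4 prompt tokens, target region = left half.
rng = np.random.default_rng(0)
h, w, n_tokens = 8, 8, 4
logits = rng.normal(size=(h * w, n_tokens))
mask2d = np.zeros((h, w), dtype=bool)
mask2d[:, :4] = True
mask = mask2d.ravel()

before = region_mass(softmax(logits), token=1, mask=mask)
after = region_mass(refocus(logits, token=1, mask=mask), token=1, mask=mask)
print(f"attention mass inside region: {before:.3f} -> {after:.3f}")
```

In this toy, the share of token 1's attention inside the region always increases, because its per-pixel probability rises inside the mask and falls outside it. Practical training-free approaches may instead optimise the latents against an attention-based objective, and the abstract excerpt does not say which route ReAtCo takes.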
Taxonomy
Topics: Video Coding and Compression Technologies · Multimedia Communication and Technology · Video Analysis and Summarization
Methods: Diffusion · ALIGN
