TL;DR
CoEdit introduces a coopetitive, training-free image editing framework that enhances semantic alignment and consistency by transforming attention control into a negotiation process, improving editing harmony across spatial and temporal domains.
Contribution
The paper proposes a novel zero-shot, coopetitive attention control mechanism with dual-entropy manipulation and latent refinement, advancing training-free text-guided image editing.
Findings
Outperforms existing methods in editing quality and structural preservation.
Improves localization of editable and preservable regions.
Ensures consistent semantic transitions during editing.
Abstract
Text-guided image editing, a pivotal task in modern multimedia content creation, has seen remarkable progress with training-free methods that eliminate the need for additional optimization. Despite this progress, existing methods are typically constrained by a competitive paradigm in which the editing and reconstruction branches are independently driven by their respective objectives to maximize alignment with the target and source prompts. This adversarial strategy causes semantic conflicts and unpredictable outcomes due to the lack of coordination between branches. To overcome these issues, we propose Coopetitive Training-Free Image Editing (CoEdit), a novel zero-shot framework that transforms attention control from competition to coopetitive negotiation, achieving editing harmony across spatial and temporal dimensions. Spatially, CoEdit introduces Dual-Entropy Attention Manipulation,…
