TL;DR
CPAM (Context-Preserving Adaptive Manipulation) is a zero-shot, context-preserving framework for complex editing of real images with diffusion models. It maintains object identity and background details and works across several diffusion architectures.
Contribution
A novel zero-shot editing method with modules for preserving object features and mitigating interference, compatible with multiple diffusion models, and validated on a new benchmark dataset.
Findings
Outperforms existing methods in human preference tests
Maintains object shape, texture, and identity during editing
Demonstrates strong generalization across different diffusion architectures
Abstract
Editing natural images using textual descriptions in text-to-image diffusion models remains a significant challenge, particularly in achieving consistent generation and handling complex, non-rigid objects. Existing methods often struggle to preserve textures and identity, require extensive fine-tuning, and exhibit limitations in editing specific spatial regions or objects while retaining background details. This paper proposes Context-Preserving Adaptive Manipulation (CPAM), a novel zero-shot framework for complicated, non-rigid real image editing. Specifically, we propose a preservation adaptation module that adjusts self-attention mechanisms to preserve and independently control the object and background effectively. This ensures that the objects' shapes, textures, and identities are maintained while keeping the background undistorted during the editing process using the mask guidance…
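The abstract describes adjusting self-attention under mask guidance so that the object and the background are preserved and controlled independently. The sketch below illustrates one generic way such mask-guided attention can work; it is not CPAM's actual implementation, which the truncated abstract does not specify. The function name `masked_source_attention`, the `obj_mask` argument, and the choice to take keys and values from the source image are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def masked_source_attention(q_edit, k_src, v_src, obj_mask):
    """Hypothetical mask-guided self-attention (illustrative only).

    q_edit:   (n, d) queries from the editing branch
    k_src:    (n, d) keys taken from the source image (assumption)
    v_src:    (n, d) values taken from the source image (assumption)
    obj_mask: (n,) boolean, True for object tokens, False for background
    """
    d = q_edit.shape[-1]
    scores = q_edit @ k_src.T / np.sqrt(d)  # (n, n) scaled dot products

    # Object queries may only attend to object tokens and background
    # queries only to background tokens, so the regions cannot mix.
    same_region = obj_mask[:, None] == obj_mask[None, :]
    scores = np.where(same_region, scores, -np.inf)

    weights = softmax(scores, axis=-1)
    return weights @ v_src, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 6, 4
    q = rng.normal(size=(n, d))
    k = rng.normal(size=(n, d))
    v = rng.normal(size=(n, d))
    mask = np.array([True, True, True, False, False, False])

    out, w = masked_source_attention(q, k, v, mask)
    print(out.shape)                 # one output vector per token
    print(np.round(w[:3, 3:], 3))    # object-to-background weights
```

In this toy setup the background outputs depend only on background values, so changing the object's features leaves the background untouched, which is the kind of independence the abstract attributes to mask guidance.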
