LooseRoPE: Content-aware Attention Manipulation for Semantic Harmonization
Etai Sella, Yoav Baron, Hadar Averbuch-Elor, Daniel Cohen-Or, Or Patashnik

TL;DR
LooseRoPE introduces a novel attention manipulation technique that balances identity preservation and contextual harmonization in prompt-free image editing by relaxing positional constraints.
Contribution
We propose LooseRoPE, a saliency-guided modulation of rotational positional encoding, enabling continuous control over attention focus for improved image editing without text prompts.
Findings
Achieves seamless object insertion with balanced identity retention and blending.
Provides flexible, intuitive control over image editing results.
Operates without reliance on textual guidance or complex user input.
Abstract
Recent diffusion-based image editing methods commonly rely on text or high-level instructions to guide the generation process, offering intuitive but coarse control. In contrast, we focus on explicit, prompt-free editing, where the user directly specifies the modification by cropping and pasting an object or sub-object into a chosen location within an image. This operation affords precise spatial and visual control, yet it introduces a fundamental challenge: preserving the identity of the pasted object while harmonizing it with its new context. We observe that attention maps in diffusion-based editing models inherently govern whether image regions are preserved or adapted for coherence. Building on this insight, we introduce LooseRoPE, a saliency-guided modulation of rotational positional encoding (RoPE) that loosens the positional constraints to continuously control the attention field…
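To make the idea of loosening positional constraints concrete, here is a minimal NumPy sketch of standard 1D rotary positional encoding in which each token's position is shrunk by a per-token factor before the rotation. It is an illustration under stated assumptions, not the authors' method: the abstract is truncated and does not say how saliency maps to the modulation, so the `looseness` parameter, the function names, and the choice to scale positions are all hypothetical.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply standard 1D rotary positional encoding to x of shape (n, d), d even."""
    n, d = x.shape
    half = d // 2
    # One rotation frequency per feature pair (x[:, i], x[:, i + half]).
    freqs = base ** (-np.arange(half) / half)
    angles = positions[:, None] * freqs[None, :]   # shape (n, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

def loosened_logits(q, k, positions, looseness):
    """Hypothetical 'loosened' attention logits (illustrative assumption only).

    looseness is a scalar or per-token array in [0, 1]:
      0 -> ordinary RoPE, logits depend on relative position;
      1 -> positions collapse to 0, logits reduce to plain q @ k.T.
    """
    scaled = np.asarray(positions, dtype=float) * (1.0 - np.asarray(looseness))
    return rope(q, scaled) @ rope(k, scaled).T

# Usage: compare fully positional, fully loosened, and per-token loosened logits.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
positions = np.arange(4, dtype=float)

strict = loosened_logits(q, k, positions, looseness=0.0)
free = loosened_logits(q, k, positions, looseness=1.0)
# A made-up per-token map standing in for a saliency signal.
mixed = loosened_logits(q, k, positions, looseness=np.array([0.0, 0.5, 0.5, 1.0]))
```

In this toy setting, intermediate `looseness` values interpolate between position-bound attention and purely content-based attention, which is one plausible reading of "continuously control the attention field".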
