Addressing Text Embedding Leakage in Diffusion-based Image Editing

Sunung Mun; Jinhwan Nam; Sunghyun Cho; Jungseul Ok

arXiv:2412.04715 · cs.CV · August 26, 2025

TL;DR

This paper identifies attribute leakage in diffusion-based image editing caused by semantic entanglement in text embeddings and introduces ALE, a framework with new techniques and benchmarks to significantly reduce leakage and improve editing accuracy.

Contribution

The paper proposes ALE, a novel framework combining disentangled embeddings, spatial attention, and background preservation to address attribute leakage in text-guided image editing.
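As a rough illustration of what "spatial attention" and "background preservation" can mean in practice, the sketch below restricts each pixel's cross-attention to the prompt tokens of the object it belongs to, then blends the edited result back into the source so that background pixels are untouched. This is a generic toy under stated assumptions, not the paper's implementation: the helper names (`masked_softmax`, `blend`), the flat four-pixel "image", and the region and token assignments are all hypothetical.

```python
import math

def masked_softmax(scores, allowed):
    """Softmax over token scores; tokens with allowed=False get zero weight."""
    kept = [s for s, ok in zip(scores, allowed) if ok]
    if not kept:
        return [0.0] * len(scores)
    m = max(kept)
    exps = [math.exp(s - m) if ok else 0.0 for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]

def blend(source, edited, mask):
    """Keep source values where mask is 0 and edited values where mask is 1."""
    return [m * e + (1.0 - m) * s for s, e, m in zip(source, edited, mask)]

# Hypothetical prompt tokens and the object each token describes.
tokens = ["red", "car", "blue", "bench"]
token_object = [0, 0, 1, 1]

# Hypothetical flat image: pixels 0-1 belong to object 0, pixel 2 to object 1,
# and pixel 3 is background (None).
pixel_region = [0, 0, 1, None]

# Raw attention scores are uniform here purely for illustration.
raw_scores = [0.0] * len(tokens)

attention = []
for region in pixel_region:
    allowed = [obj == region for obj in token_object]
    attention.append(masked_softmax(raw_scores, allowed))

# Blend: only pixels inside some edit region take the edited value.
source = [0.1, 0.2, 0.3, 0.4]
edited = [0.9, 0.8, 0.7, 0.6]
edit_mask = [0.0 if region is None else 1.0 for region in pixel_region]
result = blend(source, edited, edit_mask)
```

In this toy, a pixel inside the car region gives zero attention weight to "blue" and "bench", and the background pixel keeps its source value after blending.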

Findings

01. ALE significantly reduces attribute leakage in experiments.

02. The proposed ALE-Bench provides comprehensive evaluation metrics.

03. ALE enables more accurate multi-object, text-driven image editing.

Abstract

Text-based image editing, powered by generative diffusion models, lets users modify images through natural-language prompts and has dramatically simplified traditional workflows. Despite these advances, current methods still suffer from a critical problem: attribute leakage, where edits meant for specific objects unintentionally affect unrelated regions or other target objects. Our analysis reveals the root cause as the semantic entanglement inherent in End-of-Sequence (EOS) embeddings generated by autoregressive text encoders, which indiscriminately aggregate attributes across prompts. To address this issue, we introduce Attribute-Leakage-free Editing (ALE), a framework that tackles attribute leakage at its source. ALE combines Object-Restricted Embeddings (ORE) to disentangle text embeddings, Region-Guided Blending for Cross-Attention Masking (RGB-CAM) for spatially precise attention,…
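To make the EOS entanglement claim concrete, here is a minimal sketch of causal attention weights for a two-object prompt. It assumes a single attention layer with uniform scores, which is a deliberate simplification: real text encoders stack many layers and use learned scores. The prompt and the function name `causal_attention_weights` are illustrative assumptions, not taken from the paper.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_weights(scores):
    """Row i attends only to positions 0..i; later positions get weight 0."""
    n = len(scores)
    weights = []
    for i in range(n):
        row = softmax(scores[i][: i + 1])
        weights.append(row + [0.0] * (n - i - 1))
    return weights

# Hypothetical prompt with two objects and two attributes.
tokens = ["a", "red", "car", "and", "a", "blue", "bench", "<eos>"]
n = len(tokens)

# Uniform scores, purely for illustration.
scores = [[0.0] * n for _ in range(n)]
weights = causal_attention_weights(scores)

car_row = weights[tokens.index("car")]
eos_row = weights[-1]

# "car" cannot see "blue" under the causal mask, but <eos> sees everything.
print("car -> blue:", car_row[tokens.index("blue")])
print("eos -> red :", eos_row[tokens.index("red")])
print("eos -> blue:", eos_row[tokens.index("blue")])
```

Because the final position attends to every earlier token, its output mixes "red" and "blue" together, whereas the "car" position never receives information from "blue". This is the structural reason an EOS embedding can carry attributes of all objects in the prompt at once.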

Taxonomy

Topics: Medical Image Segmentation Techniques · 3D Shape Modeling and Analysis · Reinforcement Learning in Robotics

Methods: Softmax · Attention Is All You Need · ALIGN