Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing

Yitong Yang; Yinglin Wang; Tian Zhang; Jing Wang; Shuting He

arXiv:2408.13623 · cs.CV · August 12, 2025



TL;DR

This paper analyzes text embeddings in Stable Diffusion XL to improve image editing precision and introduces PSP, a training-free method that uses free-text embedding control for object manipulation and style transfer.

Contribution

The paper provides a detailed analysis of text embeddings and proposes PSP, a novel, training-free image editing technique leveraging free-text embedding control within diffusion models.

Findings

01. PSP enables precise object addition and replacement.

02. PSP achieves effective style transfer.

03. Text embeddings can be manipulated for targeted image editing.

Abstract

While text-driven diffusion models demonstrate remarkable performance in image editing, the critical components of their text embeddings remain underexplored. The ambiguity and entanglement of these embeddings pose challenges for precise editing. In this paper, we provide a comprehensive analysis of text embeddings in Stable Diffusion XL, offering three key insights: (1) aug embedding (obtained by combining the pooled output of the final text encoder with the timestep embeddings; see https://github.com/huggingface/diffusers) retains complete textual semantics but contributes minimally to image generation, as it is only fused via the ResBlocks. More text information weakens its local semantics while preserving most global semantics. (2) BOS and padding embedding do not contain any semantic information. (3) EOS holds the…
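To make the token roles named in insights (2) and (3) concrete, the following is a minimal sketch of how a fixed-length prompt sequence splits into BOS, word, EOS, and padding positions. It is an illustration only, not the paper's method: the context length of 77 is an assumption (the usual CLIP-style default, not stated in the abstract), and `token_roles` and `role_indices` are hypothetical helper names. Real tokenizers also split text into subword units, so the number of word positions will generally differ from the number of words in the prompt.

```python
from typing import Dict, List

MAX_LENGTH = 77  # assumed CLIP-style context length; not stated in the abstract


def token_roles(num_prompt_tokens: int, max_length: int = MAX_LENGTH) -> List[str]:
    """Label every position of a fixed-length prompt sequence (hypothetical helper)."""
    # Reserve one slot for BOS and one for EOS; truncate prompts that are too long.
    kept = max(0, min(num_prompt_tokens, max_length - 2))
    roles = ["BOS"] + ["word"] * kept + ["EOS"]
    # Fill the remaining positions with padding.
    roles += ["padding"] * (max_length - len(roles))
    return roles


def role_indices(roles: List[str]) -> Dict[str, List[int]]:
    """Group position indices by role, e.g. to select which embeddings to edit."""
    groups: Dict[str, List[int]] = {}
    for index, role in enumerate(roles):
        groups.setdefault(role, []).append(index)
    return groups


if __name__ == "__main__":
    groups = role_indices(token_roles(5))
    for role, indices in groups.items():
        print(role, indices[0], "to", indices[-1])
```

Under these assumptions, a prompt of five tokens places BOS at position 0, the word tokens at positions 1 to 5, EOS at position 6, and padding at positions 7 to 76. An index map of this kind is one way to pick out the embeddings that the analysis treats separately.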


Taxonomy

Topics: Recommender Systems and Techniques

Methods: Diffusion