Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing
Yitong Yang, Yinglin Wang, Tian Zhang, Jing Wang, Shuting He

TL;DR
This paper analyzes text embeddings in Stable Diffusion XL to improve image editing precision and introduces PSP, a training-free method that uses free-text embedding control for object manipulation and style transfer.
Contribution
The paper provides a detailed analysis of text embeddings and proposes PSP, a novel, training-free image editing technique leveraging free-text embedding control within diffusion models.
Findings
PSP enables precise object addition and replacement.
PSP achieves effective style transfer.
Text embeddings can be manipulated for targeted image editing.
Abstract
While text-driven diffusion models demonstrate remarkable performance in image editing, the critical components of their text embeddings remain underexplored. The ambiguity and entanglement of these embeddings pose challenges for precise editing. In this paper, we provide a comprehensive analysis of text embeddings in Stable Diffusion XL, offering three key insights: (1) The aug embedding (obtained by combining the pooled output of the final text encoder with the timestep embeddings; see https://github.com/huggingface/diffusers) retains complete textual semantics but contributes minimally to image generation as it is only fused via the ResBlocks. More text information weakens its local semantics while preserving most global semantics. (2) The BOS and padding embeddings do not contain any semantic information. (3) EOS holds the…
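The abstract distinguishes the BOS, word, EOS, and padding positions of a prompt's text embedding. As a minimal sketch (not the authors' code), the following shows one way to label those positions in a CLIP-style token id sequence so that the matching embedding rows could be inspected or swapped. The special-token ids, the helper names `label_positions` and `indices_of`, and the example word ids are assumptions made for illustration; check them against the tokenizer you actually use.

```python
from typing import List

# Assumed CLIP-style special token ids; verify against your tokenizer.
BOS_ID = 49406
EOS_ID = 49407


def label_positions(token_ids: List[int],
                    bos_id: int = BOS_ID,
                    eos_id: int = EOS_ID) -> List[str]:
    """Label each position as 'bos', 'word', 'eos', or 'pad'.

    The first EOS token marks the 'eos' position. Every position after it
    is treated as padding, whatever id the tokenizer pads with.
    """
    labels: List[str] = []
    seen_eos = False
    for i, tok in enumerate(token_ids):
        if seen_eos:
            labels.append("pad")
        elif i == 0 and tok == bos_id:
            labels.append("bos")
        elif tok == eos_id:
            labels.append("eos")
            seen_eos = True
        else:
            labels.append("word")
    return labels


def indices_of(labels: List[str], role: str) -> List[int]:
    """Return the positions that carry the given role."""
    return [i for i, label in enumerate(labels) if label == role]


# Hypothetical token ids for a two-word prompt, padded to length 6.
example_ids = [BOS_ID, 320, 2368, EOS_ID, EOS_ID, EOS_ID]
example_labels = label_positions(example_ids)
word_rows = indices_of(example_labels, "word")
```

With such an index list, an experiment along the lines of insight (2) could replace the `bos` and `pad` rows of an embedding matrix and check whether the generated image changes; how PSP itself manipulates these rows is not specified in the text available here.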
Taxonomy
Topics: Recommender Systems and Techniques
Methods: Diffusion
