Uncovering the Text Embedding in Text-to-Image Diffusion Models
Hu Yu, Hao Luo, Fan Wang, Feng Zhao

TL;DR
This paper explores the text embedding space in text-to-image diffusion models, revealing its semantic properties and potential for controllable image editing without additional learning, thereby improving understanding of these models.
Contribution
It uncovers the importance of per-word embeddings, their contextual correlations, and the semantic diversity within text embeddings using a learning-free approach.
Findings
Per-word embedding and contextual correlation are crucial for image editing.
Text embedding inherently contains diverse semantic potentials.
Singular value decomposition reveals the semantic structure of embeddings.
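To make the singular value decomposition finding more concrete, here is a minimal NumPy sketch of decomposing a text-embedding matrix and rescaling its singular values. It is an illustration under stated assumptions, not the authors' procedure: the matrix is random placeholder data, the 77 x 768 shape is assumed from the CLIP text encoder used by common Stable Diffusion checkpoints, and the helper names `svd_components` and `rescale_singular_values` are hypothetical.

```python
import numpy as np


def svd_components(embedding):
    """Factor a (tokens x dim) matrix as U @ diag(S) @ Vt (thin SVD)."""
    U, S, Vt = np.linalg.svd(embedding, full_matrices=False)
    return U, S, Vt


def rescale_singular_values(embedding, scales):
    """Rebuild the matrix after multiplying each singular value by a scale.

    Hypothetical helper for illustration only; the paper may manipulate
    the decomposition differently.
    """
    U, S, Vt = svd_components(embedding)
    # Broadcasting scales each column of U by its (rescaled) singular value.
    return (U * (S * scales)) @ Vt


# Placeholder data standing in for a real text embedding (assumed shape).
rng = np.random.default_rng(0)
embedding = rng.standard_normal((77, 768))

U, S, Vt = svd_components(embedding)

# Example edit: suppress the dominant singular direction, keep the rest.
scales = np.ones_like(S)
scales[0] = 0.0
edited = rescale_singular_values(embedding, scales)
```

In a real pipeline the placeholder matrix would be replaced by the text encoder's output for a prompt, and `edited` would be passed to the diffusion model in place of the original embedding; which singular directions correspond to which semantics is something the sketch does not attempt to show.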
Abstract
The correspondence between input text and the generated image exhibits opacity, wherein minor textual modifications can induce substantial deviations in the generated image. Meanwhile, text embedding, as the pivotal intermediary between text and images, remains relatively underexplored. In this paper, we address this research gap by delving into the text embedding space, unleashing its capacity for controllable image editing and explicable semantic direction attributes within a learning-free framework. Specifically, we identify two critical insights regarding the importance of per-word embedding and their contextual correlations within text embedding, providing instructive principles for learning-free image editing. Additionally, we find that text embedding inherently possesses diverse semantic potentials, and further reveal this property through the lens of singular value decomposition…
Taxonomy
Topics: Digital Humanities and Scholarship · Authorship Attribution and Profiling · Computational and Text Analysis Methods
Methods: Diffusion
