Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation

Chihaya Matsuhira; Marc A. Kastner; Takahiro Komamizu; Takatsugu Hirayama; Ichiro Ide

arXiv:2411.03595·cs.MM·November 7, 2024

TL;DR

This paper analyzes how diffusion models blend concepts in nonword-to-image generation, proposing a new embedding conversion method that enhances image quality and conceptual blending for more intuitive and creative outputs.

Contribution

It introduces a quantitative analysis of conceptual blending in diffusion models and proposes an improved embedding space conversion method for better nonword-to-image generation.

Findings

01

A high percentage of the generated images depict blended concepts.

02

Proposed embedding conversion improves image quality.

03

Blending occurs mainly in specific dimensions of the high-dimensional embedding.

Abstract

Text-to-image diffusion models sometimes depict blended concepts in the generated images. One promising use case of this effect would be the nonword-to-image generation task which attempts to generate images intuitively imaginable from a non-existing word (nonword). To realize nonword-to-image generation, an existing study focused on associating nonwords with similar-sounding words. Since each nonword can have multiple similar-sounding words, generating images containing their blended concepts would increase intuitiveness, facilitating creative activities and promoting computational psycholinguistics. Nevertheless, no existing study has quantitatively evaluated this effect in either diffusion models or the nonword-to-image generation paradigm. Therefore, this paper first analyzes the conceptual blending in a pretrained diffusion model, Stable Diffusion. The analysis reveals that a high…
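
As a rough illustration of what "blended concepts" can mean at the embedding level, the sketch below linearly interpolates between two vectors. It rests on assumptions: this page does not say how the paper constructs its inputs, the function name `interpolate` is hypothetical, and the 4-dimensional vectors are made-up stand-ins for real text embeddings, which are far larger and come from the model's text encoder.

```python
from typing import List


def interpolate(emb_a: List[float], emb_b: List[float], alpha: float) -> List[float]:
    """Return (1 - alpha) * emb_a + alpha * emb_b, element by element.

    alpha = 0 returns emb_a, alpha = 1 returns emb_b, and values in
    between give a mixture of the two vectors.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]


# Toy vectors standing in for the embeddings of two different concepts
# (illustrative values only, not taken from any model).
emb_first = [1.0, 0.0, 2.0, -1.0]
emb_second = [0.0, 1.0, 2.0, 1.0]

# The midpoint between the two concepts.
blended = interpolate(emb_first, emb_second, 0.5)
print(blended)
```

In a real pipeline the mixed vector would be passed to the image generator in place of a single prompt's embedding; whether the resulting image actually shows both concepts is the kind of question the paper evaluates quantitatively.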
