Kiki or Bouba? Sound Symbolism in Vision-and-Language Models

Morris Alper; Hadar Averbuch-Elor

arXiv:2310.16781 · cs.CV · April 3, 2024 · 2 cites

Open Access

TL;DR

This paper investigates whether vision-and-language models such as CLIP and Stable Diffusion exhibit sound symbolism. Zero-shot probing shows that these models do reflect the kiki-bouba effect, paralleling a well-known phenomenon in human psycholinguistics.

Contribution

It introduces a novel computational method to detect sound symbolism in vision-and-language models, demonstrating their inherent knowledge of cross-modal associations.

Findings

01

Models show the kiki-bouba effect in zero-shot probing.

02

Sound symbolism is reflected in vision-and-language models.

03

The method provides a new way to study cross-modal associations.

Abstract

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and well-demonstrated with regard to cross-modal associations between language and the visual domain. In this work, we address the question of whether sound symbolism is reflected in vision-and-language models such as CLIP and Stable Diffusion. Using zero-shot knowledge probing to investigate the inherent knowledge of these models, we find strong evidence that they do show this pattern, paralleling the well-known kiki-bouba effect in psycholinguistics. Our work provides a novel method for demonstrating sound…
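As a rough illustration of what zero-shot probing of a text encoder can look like, the sketch below scores a pseudoword by how much closer its embedding lies to "sharp" terms than to "round" terms. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the adjective lists, and the two-dimensional toy vectors are all hypothetical, and in practice `embed` would wrap a real text encoder such as CLIP's.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def probe_score(
    word: str,
    embed: Callable[[str], Vector],
    sharp_terms: Sequence[str],
    round_terms: Sequence[str],
) -> float:
    """Mean similarity to sharp terms minus mean similarity to round terms.

    Positive values mean the word sits closer to the sharp terms,
    negative values mean it sits closer to the round terms.
    """
    w = embed(word)
    sharp = sum(cosine(w, embed(t)) for t in sharp_terms) / len(sharp_terms)
    rounded = sum(cosine(w, embed(t)) for t in round_terms) / len(round_terms)
    return sharp - rounded


# Hypothetical, hand-made vectors used only to exercise the arithmetic.
# They are NOT model outputs and carry no empirical meaning.
TOY_VECTORS = {
    "kiki": (0.9, 0.1),
    "bouba": (0.2, 0.8),
    "sharp": (1.0, 0.0),
    "spiky": (0.8, 0.2),
    "round": (0.0, 1.0),
    "curvy": (0.2, 0.8),
}


def toy_embed(text: str) -> Vector:
    """Stand-in for a real text encoder (assumption for this sketch)."""
    return TOY_VECTORS[text]


SHARP_TERMS = ["sharp", "spiky"]   # illustrative choice, not from the paper
ROUND_TERMS = ["round", "curvy"]   # illustrative choice, not from the paper

kiki_score = probe_score("kiki", toy_embed, SHARP_TERMS, ROUND_TERMS)
bouba_score = probe_score("bouba", toy_embed, SHARP_TERMS, ROUND_TERMS)
print(f"kiki: {kiki_score:+.3f}, bouba: {bouba_score:+.3f}")
```

Passing the encoder in as a function keeps the scoring logic independent of any particular model, so the same probe could in principle be pointed at different text encoders without changes.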

Taxonomy

Topics: Language, Metaphor, and Cognition · Language and cultural evolution · Categorization, perception, and language

Methods: Diffusion · Contrastive Language-Image Pre-training