Uncovering Visual-Semantic Psycholinguistic Properties from the Distributional Structure of Text Embedding Space

Si Wu; Sebastian Bruch

arXiv:2505.23029 · cs.CL · August 8, 2025

TL;DR

This paper introduces the Neighborhood Stability Measure (NSM), an unsupervised measure that estimates the imageability and concreteness of words directly from text embeddings and correlates more strongly with ground-truth ratings than existing methods.

Contribution

The paper proposes a novel, distribution-free, unsupervised approach to estimating psycholinguistic properties from text embeddings, without relying on visual data.

Findings

01. NSM correlates more strongly with ground-truth ratings than existing methods.

02. NSM is an effective predictor of imageability and concreteness in classification tasks.

03. The approach requires only text data, avoiding the need for multimodal datasets.

Abstract

Imageability (potential of text to evoke a mental image) and concreteness (perceptibility of text) are two psycholinguistic properties that link visual and semantic spaces. It is little surprise that computational methods that estimate them do so using parallel visual and semantic spaces, such as collections of image-caption pairs or multi-modal models. In this paper, we work on the supposition that text itself in an image-caption dataset offers sufficient signals to accurately estimate these properties. We hypothesize, in particular, that the peakedness of the neighborhood of a word in the semantic embedding space reflects its degree of imageability and concreteness. We then propose an unsupervised, distribution-free measure, which we call Neighborhood Stability Measure (NSM), that quantifies the sharpness of peaks. Extensive experiments show that NSM correlates more strongly with…
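The abstract as shown here does not define NSM, so the following is not the authors' measure. It is only a minimal sketch of what "peakedness of a neighborhood" could look like in an embedding space, using a hypothetical proxy (the function name `neighborhood_peakedness`, the parameter `k`, and the toy data are all assumptions made for illustration): the mean cosine similarity of a point to its k nearest neighbors minus its mean similarity to every other point.

```python
import numpy as np

def neighborhood_peakedness(embeddings: np.ndarray, k: int = 3) -> np.ndarray:
    """Hypothetical proxy, NOT the paper's NSM.

    For each row, returns the mean cosine similarity to its k nearest
    neighbors minus the mean cosine similarity to all remaining points.
    """
    n = embeddings.shape[0]
    if not 1 <= k < n - 1:
        raise ValueError("k must satisfy 1 <= k < n - 1")
    # L2-normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sims = unit @ unit.T
    # Mask self-similarity so a point is never its own neighbor.
    np.fill_diagonal(sims, -np.inf)
    # Sort each row in descending order; the masked self entry ends up last.
    sorted_sims = -np.sort(-sims, axis=1)
    top_k = sorted_sims[:, :k].mean(axis=1)
    rest = sorted_sims[:, k:n - 1].mean(axis=1)
    return top_k - rest

# Toy data (assumed): rows 0-3 form a tight cluster around the first axis,
# rows 4-7 are mutually orthogonal and have no close neighbors.
toy = np.zeros((8, 8))
toy[:4, 0] = 1.0
toy[0, 1] = toy[1, 2] = toy[2, 3] = 0.01
toy[np.arange(4, 8), np.arange(4, 8)] = 1.0
scores = neighborhood_peakedness(toy, k=3)
```

On this toy input the clustered rows receive higher scores than the isolated ones, which is the kind of contrast the hypothesis in the abstract relies on. How NSM itself quantifies the sharpness of peaks is specified in the paper, not here.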

Code & Models

Repositories

artificial-memory-lab/imageability (Official)