The aftermath of compounds: Investigating Compounds and their Semantic Representations

Swarang Joshi

arXiv:2510.27477·cs.CL·November 3, 2025

TL;DR

This paper compares static (GloVe) and contextualized (BERT) embeddings as models of human judgments of English compound word semantics. It finds that BERT captures compositional semantics better than GloVe and that predictability ratings are strong predictors of semantic transparency.

Contribution

The paper shows that BERT embeddings reflect human semantic judgments of compounds more accurately than GloVe embeddings, and it clarifies which factors (association strength, frequency, predictability) relate to those judgments.

Findings

01 BERT outperforms GloVe in capturing compound semantics.

02 Predictability ratings strongly predict semantic transparency.

03 Embedding-based metrics correlate with human judgments.

Abstract

This study investigates how well computational embeddings align with human semantic judgments in the processing of English compound words. We compare static word vectors (GloVe) and contextualized embeddings (BERT) against human ratings of lexeme meaning dominance (LMD) and semantic transparency (ST) drawn from a psycholinguistic dataset. Using measures of association strength (Edinburgh Associative Thesaurus), frequency (BNC), and predictability (LaDEC), we compute embedding-derived LMD and ST metrics and assess their relationships with human judgments via Spearman's correlation and regression analyses. Our results show that BERT embeddings better capture compositional semantics than GloVe, and that predictability ratings are strong predictors of semantic transparency in both human and model data. These findings advance computational psycholinguistics by clarifying the factors that…
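To make the evaluation step in the abstract concrete, the sketch below shows one plausible way to score an embedding-derived transparency metric against human ratings with Spearman's correlation. It is an illustration under stated assumptions, not the paper's method: the abstract does not give the exact formula for the embedding-derived ST metric, so the version here (mean cosine similarity between a compound vector and its two constituent vectors) is assumed, and the vectors and ratings in `items` are invented placeholders rather than data from the study.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def model_transparency(compound_vec, modifier_vec, head_vec):
    """ASSUMED metric: mean similarity of the compound to its constituents."""
    return (cosine(compound_vec, modifier_vec) + cosine(compound_vec, head_vec)) / 2


# Hypothetical items: (compound_vec, modifier_vec, head_vec, human_st_rating).
# All numbers are invented placeholders, not values from the paper.
items = [
    ([1.0, 0.2, 0.1], [0.9, 0.3, 0.0], [0.8, 0.1, 0.3], 6.1),
    ([0.1, 1.0, 0.2], [0.9, 0.1, 0.1], [0.2, 0.8, 0.4], 3.4),
    ([0.2, 0.1, 1.0], [1.0, 0.0, 0.1], [0.1, 1.0, 0.0], 1.8),
]

model_scores = [model_transparency(c, m, h) for c, m, h, _ in items]
human_scores = [rating for _, _, _, rating in items]
rho = spearman(model_scores, human_scores)
print(f"Spearman rho between model and human ST: {rho:.3f}")
```

In practice the placeholder vectors would be replaced by GloVe vectors or pooled BERT representations of the compound and its constituents, and the same `spearman` comparison could be run once per embedding type to compare them.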

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Action Observation and Synchronization · Categorization, perception, and language