The Grounding Gap: How LLMs Anchor the Meaning of Abstract Concepts Differently from Humans
Odysseas S. Chlapanis, Orfeas Menis Mastromichalakis, Christos H. Papadimitriou

TL;DR
Compared to humans, large language models rely too heavily on word associations and underproduce properties tied to emotion and internal states when describing abstract concepts, which yields a large and consistent grounding gap.
Contribution
It systematically compares LLMs to humans in grounding abstract concepts, identifies the grounding gap, and explores internal features related to grounding dimensions.
Findings
No model exceeds a Pearson correlation of r=0.37 with human responses on property-generation tasks, compared to a human-to-human ceiling above r=0.9.
Alignment with human judgment improves with larger models.
Internal features related to grounding dimensions are identifiable in LLMs.
Abstract
Abstract concepts - justice, theory, availability - have no single perceivable referent; in the human brain, their meaning emerges from a web of experiences, affect, and social context. Do large language models (LLMs) ground abstract concepts in a similar way? We study this by replicating property-generation experiments from cognitive science on 21 frontier and open-weight LLMs. Across models and experiments, we find a consistent pattern: when compared to humans, models rely too heavily on word associations, and underproduce properties tied to emotion and internal states. This yields a large and consistent grounding gap: no model exceeds a Pearson correlation r=0.37 with human responses, compared to a human-to-human ceiling above r=0.9. To better interpret this gap, we also replicate a rating experiment on grounding categories and find that here LLMs align more closely with human…
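The grounding gap above is reported as a Pearson correlation between model and human responses. As a rough illustration only, the sketch below correlates the share of generated properties that fall into each property category for a human group and for a model. The category names, the counts, and the helper names (`proportions`, `pearson_r`, `profile_correlation`) are hypothetical and invented for this example; the paper's actual coding scheme and aggregation procedure are not given in this summary and may differ.

```python
import math


def proportions(counts):
    """Turn raw category counts into shares that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no properties were counted")
    return {category: n / total for category, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def profile_correlation(human_counts, model_counts):
    """Correlate two category profiles over the union of their categories."""
    categories = sorted(set(human_counts) | set(model_counts))
    human = proportions(human_counts)
    model = proportions(model_counts)
    # A category missing from one side counts as a share of 0 there.
    human_vec = [human.get(c, 0.0) for c in categories]
    model_vec = [model.get(c, 0.0) for c in categories]
    return pearson_r(human_vec, model_vec)


# Toy counts, made up for illustration (NOT data from the paper).
human_counts = {"word_association": 10, "emotion": 30,
                "internal_state": 25, "situation": 35}
model_counts = {"word_association": 50, "emotion": 10,
                "internal_state": 5, "situation": 35}

if __name__ == "__main__":
    r = profile_correlation(human_counts, model_counts)
    print(f"toy profile correlation: r = {r:.2f}")
```

A model whose profile is dominated by word associations and thin on emotion and internal-state properties, as in the toy counts, would score well below a profile that matches the human one, which is the kind of mismatch the reported r values summarize.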
