Do language models have coherent mental models of everyday things?
Yuling Gu, Bhavana Dalvi Mishra, Peter Clark

TL;DR
This paper investigates whether language models possess coherent mental models of everyday objects by testing their understanding of parts and relationships, revealing partial knowledge and proposing a constraint-based method to improve consistency.
Contribution
The paper introduces a benchmark dataset for evaluating LM understanding of everyday objects and proposes a constraint satisfaction layer to enhance model coherence.
Findings
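State-of-the-art LMs such as GPT-3 and Macaw have only fragments of knowledge about everyday objects (54-59% accuracy on the true/false probes, with 19-43% conditional constraint violation).
Adding a constraint satisfaction layer on top of the LM's raw predictions improves accuracy by 16-20%.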
Models still lack fully coherent mental models.
Abstract
When people think of everyday things like an egg, they typically have a mental image associated with it. This allows them to correctly judge, for example, that "the yolk surrounds the shell" is a false statement. Do language models similarly have a coherent picture of such everyday things? To investigate this, we propose a benchmark dataset consisting of 100 everyday things, their parts, and the relationships between these parts, expressed as 11,720 "X relation Y?" true/false questions. Using these questions as probes, we observe that state-of-the-art pre-trained language models (LMs) like GPT-3 and Macaw have fragments of knowledge about these everyday things, but do not have fully coherent "parts mental models" (54-59% accurate, 19-43% conditional constraint violation). We propose an extension where we add a constraint satisfaction layer on top of the LM's raw predictions to apply…
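To make "conditional constraint violation" and the idea of a constraint layer over raw predictions concrete, here is a minimal sketch. It is not the authors' implementation: the confidence values are invented, the single constraint (treating "surrounds" as asymmetric) is an assumption chosen for illustration, and the brute-force search stands in for whatever solver the paper actually uses, which the truncated abstract does not specify. The names `raw`, `ASYMMETRIC`, `conditional_violation_rate`, and `repair` are all hypothetical.

```python
from itertools import product

# Hypothetical LM confidences that each "X relation Y?" statement is True.
raw = {
    ("shell", "surrounds", "yolk"): 0.9,
    ("yolk", "surrounds", "shell"): 0.6,
    ("yolk", "surrounds", "white"): 0.3,
    ("white", "surrounds", "yolk"): 0.7,
}

# Assumption for this sketch: "surrounds" is asymmetric, so
# "X surrounds Y" and "Y surrounds X" cannot both be True.
ASYMMETRIC = {"surrounds"}


def violated_pairs(assign):
    """Statements predicted True whose reverse is also predicted True."""
    return [
        (x, rel, y)
        for (x, rel, y), value in assign.items()
        if value and rel in ASYMMETRIC and assign.get((y, rel, x), False)
    ]


def conditional_violation_rate(assign):
    """Among constraints whose condition holds (statement predicted True and
    its reverse was also asked), the fraction whose conclusion is violated."""
    applicable = [
        (x, rel, y)
        for (x, rel, y), value in assign.items()
        if value and rel in ASYMMETRIC and (y, rel, x) in assign
    ]
    if not applicable:
        return 0.0
    return len(violated_pairs(assign)) / len(applicable)


def repair(confidences):
    """Pick the constraint-satisfying True/False assignment that agrees most
    with the raw confidences (exhaustive search; fine only for tiny inputs)."""
    keys = list(confidences)
    best, best_score = None, float("-inf")
    for values in product([True, False], repeat=len(keys)):
        assign = dict(zip(keys, values))
        if violated_pairs(assign):
            continue
        score = sum(
            confidences[k] if v else 1.0 - confidences[k]
            for k, v in assign.items()
        )
        if score > best_score:
            best, best_score = assign, score
    return best


# Raw predictions: threshold each confidence at 0.5.
thresholded = {k: p >= 0.5 for k, p in raw.items()}
repaired = repair(raw)

print(conditional_violation_rate(thresholded))
print(conditional_violation_rate(repaired))
print(repaired[("yolk", "surrounds", "shell")])
```

In this toy case the thresholded predictions assert both "the shell surrounds the yolk" and "the yolk surrounds the shell"; the repair step keeps the higher-confidence statement and flips the other. Exhaustive search is exponential in the number of statements, so anything beyond a handful of probes would need a proper constraint or MaxSAT solver.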
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Inverse Square Root Schedule · Weight Decay · Cosine Annealing · Adafactor · Dropout · Linear Layer
