Probing Neural Language Models for Human Tacit Assumptions
Nathaniel Weir, Adam Poliak, Benjamin Van Durme

TL;DR
This paper investigates whether neural language models capture human stereotypic tacit assumptions (STAs) by testing how well they retrieve a concept from its associated properties, using word prediction prompts built from human responses in a psychological study of conceptual associations.
Contribution
The study introduces a diagnostic set of word prediction prompts to assess whether neural contextualized language models capture human-like stereotypic assumptions about concepts, and reports empirical evidence that they do.
Findings
Models effectively retrieve concepts from associated properties.
Neural models encode stereotypic conceptual associations.
Empirical evidence supports the presence of tacit assumptions in models.
Abstract
Humans carry stereotypic tacit assumptions (STAs) (Prince, 1978), or propositional beliefs about generic concepts. Such associations are crucial for understanding natural language. We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs. Our prompts are based on human responses in a psychological study of conceptual associations. We find models to be profoundly effective at retrieving concepts given associated properties. Our results demonstrate empirical evidence that stereotypic conceptual representations are captured in neural models derived from semi-supervised linguistic exposure.
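The probing setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The prompt template, the `[MASK]` placeholder, the helper names, and the toy scores are all assumptions made for illustration. Mean reciprocal rank is included only as one common way to summarize retrieval quality, since the text above does not say which metric the paper reports. In practice the scores would come from a masked language model's probabilities for the blank.

```python
from typing import Dict, List

MASK = "[MASK]"  # assumed placeholder; the real mask token depends on the model


def build_prompt(properties: List[str], mask: str = MASK) -> str:
    """Join property phrases into one cloze sentence (hypothetical template)."""
    if not properties:
        raise ValueError("need at least one property")
    if len(properties) <= 2:
        body = " and ".join(properties)
    else:
        body = ", ".join(properties[:-1]) + ", and " + properties[-1]
    return f"A {mask} {body}."


def concept_rank(scores: Dict[str, float], target: str) -> int:
    """Return the 1-based rank of `target` when candidates are sorted by score."""
    ordered = sorted(scores, key=lambda concept: scores[concept], reverse=True)
    return ordered.index(target) + 1


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1/rank over all prompts; 1.0 means every target ranked first."""
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    prompt = build_prompt(["has fur", "is big", "has claws"])
    print(prompt)

    # Toy scores standing in for a model's probabilities for the blank.
    toy_scores = {"bear": 0.6, "cat": 0.3, "table": 0.1}
    print(concept_rank(toy_scores, "bear"))
```

Adding properties one at a time to the prompt and recomputing the rank of the target concept is one way to see whether extra context helps a model narrow in on the intended concept.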
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
