Underspecification in Language Modeling Tasks: A Causality-Informed Study of Gendered Pronoun Resolution
Emily McMilin

TL;DR
This paper introduces a causal framework to understand underspecification in language models, revealing how it causes spurious correlations in gendered pronoun resolution and providing new evaluation methods across diverse models.
Contribution
The study presents a simple causal model of underspecification, which informs two lightweight black-box evaluation methods for detecting inference-time task underspecification in language models.
Findings
Detected previously unreported gender vs. time and gender vs. location spurious correlations in a wide range of LLMs.
Evaluation methods apply across model sizes from BERT-base to GPT-4 Turbo Preview.
Insights applicable to models with different training objectives and stages.
Abstract
Modern language modeling tasks are often underspecified: for a given token prediction, many words may satisfy the user's intent of producing natural language at inference time; however, only one word will minimize the task's loss function at training time. We introduce a simple causal mechanism to describe the role underspecification plays in the generation of spurious correlations. Despite its simplicity, our causal model directly informs the development of two lightweight black-box evaluation methods that we apply to gendered pronoun resolution tasks on a wide range of LLMs to 1) aid in the detection of inference-time task underspecification by exploiting 2) previously unreported gender vs. time and gender vs. location spurious correlations on LLMs with a range of A) sizes: from BERT-base to GPT-4 Turbo Preview, B) pre-training objectives: from masked & autoregressive language…
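As a rough illustration of what a black-box probe for a gender vs. time correlation could look like, the sketch below varies only the year in a gender-neutral sentence and fits a least-squares slope to the share of probability a model assigns to "she" versus "he". This is not the paper's actual procedure: the template, the function names (`female_share`, `gender_vs_time_slope`), and the `toy_scorer` stand-in are all hypothetical, and the toy scorer's numbers are synthetic and exist only so the example runs without a real model.

```python
import re
from typing import Callable, Dict, Sequence

# A scorer maps a masked sentence to probabilities for candidate fill-in tokens.
# In practice this would wrap a real model's output; here it is an assumption.
Scorer = Callable[[str], Dict[str, float]]

# Hypothetical template: the year is the only thing that changes, and the
# sentence itself gives no information about the referent's gender.
TEMPLATE = "In {year}, [MASK] was a teenager."


def female_share(scorer: Scorer, text: str) -> float:
    """Share of gendered-pronoun probability mass assigned to 'she'."""
    probs = scorer(text)
    p_she = probs.get("she", 0.0)
    p_he = probs.get("he", 0.0)
    total = p_she + p_he
    if total == 0:
        raise ValueError("scorer assigned no mass to either pronoun")
    return p_she / total


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def gender_vs_time_slope(scorer: Scorer, template: str, years: Sequence[int]) -> float:
    """Change in female pronoun share per year; near zero suggests no dependence."""
    years = list(years)
    shares = [female_share(scorer, template.format(year=y)) for y in years]
    return least_squares_slope(years, shares)


def toy_scorer(text: str) -> Dict[str, float]:
    """Synthetic stand-in for a model: fabricates a linear dependence on the year."""
    year = int(re.search(r"\d{4}", text).group())
    p_she = 0.3 + 0.002 * (year - 1900)
    # Half of the total mass goes to the two pronouns, the rest to other tokens.
    return {"she": 0.5 * p_she, "he": 0.5 * (1.0 - p_she)}


if __name__ == "__main__":
    slope = gender_vs_time_slope(toy_scorer, TEMPLATE, range(1900, 2001, 25))
    print(f"female-share slope per year: {slope:.4f}")
```

To probe a real model, `toy_scorer` would be replaced by a function that queries the model for the probabilities at the masked position; a slope clearly different from zero on a gender-neutral template would then be one sign that the prediction depends on the date rather than on anything the text specifies.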
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Attention Dropout · Weight Decay · Discriminative Fine-Tuning · Residual Connection · Adam · Layer Normalization
