Greedy or not, here I come: Language production under vocabulary constraints in humans and resource-rational models
Thomas Hikaru Clark, Sihan Chen, Laura Nicolae

TL;DR
This study examines how humans communicate under vocabulary constraints and compares their behavior to greedy and globally optimal sampling algorithms. Humans are mostly greedy, though more skilled participants sometimes backtrack and revise, which is a non-greedy behavior.
Contribution
It provides an empirical comparison of human language production under vocabulary constraints with greedy and globally optimal sampling algorithms, implemented using Sequential Monte Carlo inference with large language models.
Findings
Humans resemble greedy sampling more than optimal sampling.
More skilled humans tend to backtrack and revise, showing non-greedy behavior.
In high-constraint settings, humans rely on semantically light words, a pattern that emerges under both greedy and globally optimal sampling.
Abstract
Communicating using only a limited vocabulary is a common but challenging cognitive phenomenon, requiring an ideal communicator to plan carefully to optimize for intelligibility while circumventing a constrained lexicon. In this work, we investigate how humans respond to a broad array of questions under variable vocabulary limitations, consisting of only 250 highly frequent words at the most restrictive. We provide theoretically motivated comparisons to greedy and globally optimal sampling algorithms using Sequential Monte Carlo inference with large language models. Humans generally resemble greedy sampling more than globally optimal sampling, though more skilled humans are more likely to backtrack and revise -- a non-greedy behavior. An observed human pattern of leaning on semantically light words in high-constraint settings falls out of both greedy and globally optimal sampling. We…
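To make the greedy versus globally optimal distinction concrete, here is a minimal sketch on a toy two-word language model. It is not the authors' implementation: the model `LM`, the word list `ALLOWED`, and all function names are hypothetical, and exact enumeration stands in for the Sequential Monte Carlo inference over large language models that the abstract describes. Greedy sampling renormalizes over allowed words at each step; the globally optimal distribution conditions the unconstrained model on the whole utterance being in-vocabulary. The two are related by an importance weight equal to the product of allowed probability mass at each step.

```python
# Hypothetical toy next-word model: P(next | previous). "<s>" marks the start.
LM = {
    "<s>": {"big": 0.6, "very": 0.4},
    "big": {"elephant": 0.9, "animal": 0.1},
    "very": {"large": 0.5, "good": 0.5},
}
# Assumed vocabulary constraint: "elephant" is not an allowed word.
ALLOWED = {"big", "very", "animal", "large", "good"}
LENGTH = 2  # every utterance in this toy example has exactly two words


def enumerate_sequences():
    """Return {sequence: (greedy_prob, weight)} for all in-vocabulary sequences.

    greedy_prob is the probability under step-by-step renormalized sampling;
    weight is the product of allowed probability mass at each step.
    """
    out = {}

    def expand(prefix, q, w):
        if len(prefix) == LENGTH:
            out[prefix] = (q, w)
            return
        prev = prefix[-1] if prefix else "<s>"
        options = {t: p for t, p in LM[prev].items() if t in ALLOWED}
        mass = sum(options.values())  # allowed mass at this step
        for t, p in options.items():
            expand(prefix + (t,), q * p / mass, w * mass)

    expand((), 1.0, 1.0)
    return out


def greedy_distribution():
    """Locally constrained sampling: never looks ahead."""
    return {s: q for s, (q, _) in enumerate_sequences().items()}


def global_distribution():
    """Unconstrained model conditioned on the full sequence being allowed.

    greedy_prob * weight equals the unconstrained sequence probability,
    so normalizing it gives the global posterior.
    """
    scored = {s: q * w for s, (q, w) in enumerate_sequences().items()}
    total = sum(scored.values())
    return {s: v / total for s, v in scored.items()}


if __name__ == "__main__":
    greedy = greedy_distribution()
    optimal = global_distribution()
    for seq in sorted(greedy):
        print(" ".join(seq), round(greedy[seq], 3), round(optimal[seq], 3))
```

In this toy setup, greedy sampling commits to "big" 60% of the time even though its most likely continuation is forbidden, whereas the global distribution assigns "big animal" only about 13% and shifts mass toward openings whose continuations stay in-vocabulary.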
