How Well Do LLMs Predict Human Behavior? A Measure of their Pretrained Knowledge
Wayne Gao, Sukjin Han, Annie Liang

TL;DR
This paper introduces a measure, the equivalent sample size, of how much a pretrained large language model (LLM) knows about predicting human behavior: the amount of domain-specific training data a model would need in order to match the LLM's predictive accuracy.
Contribution
It proposes a novel measure for assessing LLM knowledge in prediction tasks and develops a statistical inference method based on asymptotic theory for cross-validated errors.
Findings
LLMs encode considerable predictive information for some economic variables in the Panel Study of Income Dynamics, but much less for others.
The predictive value of LLMs therefore differs markedly across domains.
The equivalent sample size indicates when a pretrained LLM can substitute for domain-specific data.
Abstract
Large language models (LLMs) are increasingly used to predict human behavior. We propose a measure for evaluating how much knowledge a pretrained LLM brings to such a prediction: its equivalent sample size, defined as the amount of task-specific data needed to match the predictive accuracy of the LLM. We estimate this measure by comparing the prediction error of a fixed LLM in a given domain to that of flexible machine learning models trained on increasing samples of domain-specific data. We further provide a statistical inference procedure by developing a new asymptotic theory for cross-validated prediction error. Finally, we apply this method to the Panel Study of Income Dynamics. We find that LLMs encode considerable predictive information for some economic variables but much less for others, suggesting that their value as substitutes for domain-specific data differs markedly across…
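The estimation idea in the abstract (compare a fixed predictor's error with the error of models trained on increasing amounts of data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' procedure: the data are synthetic rather than from the Panel Study of Income Dynamics, ordinary least squares stands in for the flexible machine learning models, a hypothetical fixed linear predictor (`fixed_beta`) stands in for the LLM, errors are measured on a holdout set rather than by cross-validation, and the paper's inference procedure is not reproduced. The function names `learning_curve` and `equivalent_sample_size` are likewise hypothetical.

```python
import numpy as np


def ols_fit_predict(X_train, y_train, X_test):
    """Fit OLS with an intercept on the training data and predict on X_test."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta


def learning_curve(X, y, X_test, y_test, sizes, n_reps=50, seed=0):
    """Mean holdout MSE of OLS trained on random subsamples of each size."""
    rng = np.random.default_rng(seed)
    curve = []
    for n in sizes:
        mses = []
        for _ in range(n_reps):
            idx = rng.choice(len(y), size=n, replace=False)
            pred = ols_fit_predict(X[idx], y[idx], X_test)
            mses.append(np.mean((y_test - pred) ** 2))
        curve.append(float(np.mean(mses)))
    return curve


def equivalent_sample_size(sizes, curve, benchmark_error):
    """Smallest training size whose mean error is at or below the benchmark.

    Returns None if no size on the grid reaches the benchmark error.
    """
    for n, err in zip(sizes, curve):
        if err <= benchmark_error:
            return n
    return None


# Synthetic data (assumption): linear outcome with unit-variance noise.
rng = np.random.default_rng(1)
d = 5
true_beta = rng.normal(size=d)
X_pool = rng.normal(size=(5000, d))
y_pool = X_pool @ true_beta + rng.normal(size=5000)
X_test = rng.normal(size=(2000, d))
y_test = X_test @ true_beta + rng.normal(size=2000)

# Hypothetical stand-in for the pretrained LLM: a fixed predictor that is
# close to the truth but not trained on any of the pooled data.
fixed_beta = true_beta + 0.2
fixed_error = float(np.mean((y_test - X_test @ fixed_beta) ** 2))

sizes = [10, 20, 40, 80, 160, 320]
curve = learning_curve(X_pool, y_pool, X_test, y_test, sizes)
ess = equivalent_sample_size(sizes, curve, fixed_error)

print("fixed-predictor MSE:", round(fixed_error, 3))
for n, err in zip(sizes, curve):
    print(f"n={n:4d}  trained-model MSE={err:.3f}")
print("equivalent sample size on this grid:", ess)
```

In this sketch the equivalent sample size is read off a coarse grid of training sizes, so it is only resolved up to the grid spacing. A finer grid or interpolation along the learning curve would give a sharper estimate, and a fixed predictor that is better than every trained model on the grid yields `None`.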
Taxonomy
Topics: Language and cultural evolution · Computational and Text Analysis Methods · Text Readability and Simplification
