AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction
Junsol Kim, Byungkyu Lee

TL;DR
This paper introduces an LLM-based framework to predict missing survey responses, enabling the recovery of historical opinion trends and enhancing survey analysis.
Contribution
It presents novel applications of LLMs for retrodiction and unasked opinion prediction in survey research, improving trend recovery and response prediction accuracy.
Findings
LLM-based models effectively retrodict missing opinions in historical survey data.
Models outperform benchmarks in predicting unasked opinions for certain topics.
The approach helps identify when public attitudes shifted, such as support for same-sex marriage.
Abstract
Nationally representative surveys track public opinion, yet they ask only a limited set of questions each year, which restricts their potential to capture historical changes. To fill this gap, we develop a large language model (LLM)-based framework for predicting missing responses in repeated cross-sectional surveys by incorporating embeddings for questions, respondents, and survey periods. We introduce two new applications of LLMs to survey research: retrodiction (predicting year-level missing opinions) and unasked opinion prediction (predicting entirely missing opinions). Using data from the 1972-2021 General Social Surveys (GSS), our LLM-based models perform strongly in retrodicting both masked GSS opinions, evaluated through cross-validation, and public opinions measured by other organizations in years when the GSS did not ask them. These capabilities enable us to recover missing trends and pinpoint when public…
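To make the idea of combining question, respondent, and survey-period embeddings concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the embedding size, the random lookup tables, the logistic scoring rule, and the function name `predict_proba` are all assumptions chosen for illustration. In an LLM-based setup, the question vectors would presumably be derived from a language model rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical embedding size

# Hypothetical lookup tables (random here, purely for illustration).
question_emb = rng.normal(size=(5, DIM))      # 5 survey questions
respondent_emb = rng.normal(size=(100, DIM))  # 100 respondents
period_emb = rng.normal(size=(10, DIM))       # 10 survey periods


def predict_proba(q_idx, r_idx, t_idx, weights, bias=0.0):
    """Probability of a positive binary response for each
    (question, respondent, period) triple, via a logistic score
    over the concatenated embeddings (an assumed scoring rule)."""
    features = np.concatenate(
        [question_emb[q_idx], respondent_emb[r_idx], period_emb[t_idx]],
        axis=-1,
    )
    logits = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))


# Untrained, random weights: the outputs only demonstrate the data flow.
weights = rng.normal(size=3 * DIM)
probs = predict_proba(
    np.array([0, 1]),   # question indices
    np.array([3, 42]),  # respondent indices
    np.array([2, 9]),   # period indices
    weights,
)
```

Under this sketch, retrodiction would amount to scoring a question in a period when it was not asked, and the weights would be fit on observed responses with some responses masked for cross-validation.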
Taxonomy
Topics: Computational and Text Analysis Methods
