Semantic Scaling: Bayesian Ideal Point Estimates with Large Language Models
Michael Burnham

TL;DR
Semantic Scaling is a new method that uses large language models and item response theory to estimate ideological positions from text, offering flexibility and improved accuracy over existing methods.
Contribution
It introduces a flexible, text-based scaling approach that leverages large language models and IRT, enabling ideological measurement beyond traditional surveys.
Findings
Outperforms Tweetscores in public opinion estimation.
Recovers DW-NOMINATE estimates in Congress data.
Works with documents of varying length and content.
Abstract
This paper introduces "Semantic Scaling," a novel method for ideal point estimation from text. I leverage large language models to classify documents based on their expressed stances and extract survey-like data. I then use item response theory to scale subjects from these data. Semantic Scaling significantly improves on existing text-based scaling methods and allows researchers to explicitly define the ideological dimensions they measure. This represents the first scaling approach that allows such flexibility outside of survey instruments and opens new avenues of inquiry for populations difficult to survey. Additionally, it works with documents of varying length and produces valid estimates of both mass and elite ideology. I demonstrate that the method can differentiate between policy preferences and in-group/out-group affect. Among the public, Semantic Scaling outperforms…
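The scaling step described in the abstract can be illustrated with a minimal sketch. It assumes the LLM stage has already produced a subjects-by-items matrix of binary stance labels (1 = supports, 0 = opposes, NaN = no stance expressed); here that matrix is simulated rather than produced by a language model. The sketch fits a one-dimensional two-parameter logistic (2PL) IRT model by maximum a posteriori gradient ascent with normal priors. This is a simplification chosen for brevity, not the paper's Bayesian estimation procedure, and every name in it (`fit_2pl_map`, the prior choices, the step size) is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fit_2pl_map(y, n_steps=2000, lr=0.05, seed=0):
    """MAP fit of a 1-D 2PL IRT model (illustrative, not the paper's sampler).

    y: (n_subjects, n_items) array; 1 = supports, 0 = opposes, NaN = missing.
    Model: P(y_ij = 1) = sigmoid(a_j * theta_i + b_j).
    Priors (assumed): theta ~ N(0, 1), a ~ N(1, 1), b ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    n, m = y.shape
    observed = ~np.isnan(y)
    y0 = np.where(observed, y, 0.0)

    theta = rng.normal(0.0, 0.1, n)  # subject ideal points
    a = np.ones(m)                   # item discriminations
    b = np.zeros(m)                  # item intercepts

    for _ in range(n_steps):
        p = sigmoid(a * theta[:, None] + b)
        # Residuals on observed cells only; missing cells contribute nothing.
        resid = np.where(observed, y0 - p, 0.0)
        grad_theta = resid @ a - theta
        grad_a = resid.T @ theta - (a - 1.0)
        grad_b = resid.sum(axis=0) - b
        # Rescale each block by the number of terms it sums over.
        theta += lr * grad_theta / m
        a += lr * grad_a / n
        b += lr * grad_b / n
    return theta, a, b


# Simulated stand-in for LLM-extracted stance labels.
rng = np.random.default_rng(42)
n_subjects, n_items = 200, 30
true_theta = rng.normal(size=n_subjects)
true_a = rng.uniform(0.5, 2.0, size=n_items)
true_b = rng.normal(size=n_items)
prob = sigmoid(true_a * true_theta[:, None] + true_b)
y = (rng.random((n_subjects, n_items)) < prob).astype(float)
y[rng.random(y.shape) < 0.2] = np.nan  # not every subject addresses every item

theta_hat, a_hat, b_hat = fit_2pl_map(y)
corr = np.corrcoef(true_theta, theta_hat)[0, 1]
print(f"correlation between true and estimated ideal points: {corr:.2f}")
```

Centering the prior on the discriminations at 1 is one simple way to fix the direction of the scale, since an IRT model is otherwise identified only up to reflection. A full Bayesian treatment would return posterior distributions for each ideal point rather than the point estimates shown here.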
Taxonomy
Topics: Topic Modeling
