RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation
Weronika Łajewska, Paul Missault, George Davidson, Saab Mansour

TL;DR
RADIUS is a new evaluation suite for survey simulation with LLMs that measures ranking and distribution alignment, including significance testing, addressing limitations of existing metrics.
Contribution
It introduces a comprehensive, standardized alignment suite with significance testing for survey simulation evaluation, improving upon existing ad hoc metrics.
Findings
Existing metrics are fragmented and non-standardized.
RADIUS effectively captures ranking and distribution alignment.
RADIUS provides open-source tools for reproducible assessment.
Abstract
Simulation of surveys using LLMs is emerging as a powerful application for generating human-like responses at scale. Prior work evaluates survey simulation using metrics borrowed from other domains, which are often ad hoc, fragmented, and non-standardized, leading to results that are difficult to compare. Moreover, existing metrics focus mainly on accuracy or distributional measures, overlooking the critical dimension of ranking alignment. In practice, a simulation can achieve high accuracy while still failing to capture the option most preferred by humans - a distinction that is critical in decision-making applications. We introduce RADIUS, a comprehensive two-dimensional alignment suite for survey simulation that captures: 1) RAnking alignment and 2) DIstribUtion alignment, each complemented by statistical Significance testing. RADIUS highlights the limitations of existing metrics,…
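The gap between distribution alignment and ranking alignment described in the abstract can be illustrated with a toy example. The sketch below is not the RADIUS implementation, and the abstract as shown here does not specify which metrics the suite uses. It assumes total variation distance as a stand-in distributional measure and a simple top-option comparison as a stand-in ranking check. The response shares for options A, B, and C are made-up illustrative numbers.

```python
def total_variation(p, q):
    """Total variation distance between two distributions over the same options."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


def ranking(dist):
    """Options ordered from most to least preferred."""
    return sorted(dist, key=dist.get, reverse=True)


# Hypothetical response shares for one survey question (illustrative only).
human = {"A": 0.40, "B": 0.38, "C": 0.22}
simulated = {"A": 0.37, "B": 0.41, "C": 0.22}

tv = total_variation(human, simulated)
same_top = ranking(human)[0] == ranking(simulated)[0]

print(f"Total variation distance: {tv:.3f}")
print(f"Human ranking:     {ranking(human)}")
print(f"Simulated ranking: {ranking(simulated)}")
print(f"Same top option:   {same_top}")
```

Here the two distributions differ by only a few percentage points per option, yet the simulation picks B as the most preferred option while the human respondents pick A. A purely distributional metric would score this simulation as well aligned, which is the kind of case a separate ranking dimension is meant to expose.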
Taxonomy
Topics: Survey Methodology and Nonresponse · Statistical Methods and Bayesian Inference · Human Mobility and Location-Based Analysis
