Adaptive Budget Allocation in LLM-Augmented Surveys
Zikun Ye, Jiameng Lyu, Rui Tao

TL;DR
This paper introduces an adaptive algorithm that allocates a limited human-verification budget across survey questions, improving the reliability of LLM-augmented survey estimates while wasting fewer human labels.
Contribution
The paper presents an adaptive allocation method that learns question difficulty and LLM reliability in real time, without prior knowledge of question-level LLM accuracy, and supports it with formal guarantees and empirical validation.
Findings
Reduces human labeling waste from 10-12% to 2-6% on real survey data.
Achieves the same estimation quality with fewer human samples compared to uniform allocation.
Validated on synthetic and real datasets, with formal performance guarantees.
Abstract
Large language models (LLMs) can generate survey responses at low cost, but their reliability varies substantially across questions and is unknown before data collection. Deploying LLMs in surveys still requires costly human responses for verification and correction. How should a limited human-labeling budget be allocated across questions in real time? We propose an adaptive allocation algorithm that learns which questions are hardest for the LLM while simultaneously collecting human responses. Each human label serves a dual role: it improves the estimate for that question and reveals how well the LLM predicts human responses on it. The algorithm directs more budget to questions where the LLM is least reliable, without requiring any prior knowledge of question-level LLM accuracy. We prove that the allocation gap relative to the best possible allocation vanishes as the budget grows, and…
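The allocation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the goal is to minimize the summed variance of per-question estimates, that each human label yields one residual (human response minus LLM prediction), and that a greedy Neyman-style rule (label the question with the largest estimated residual standard deviation per label collected so far) is a reasonable stand-in. The names `adaptive_allocate`, `draw_residual`, and `warmup` are hypothetical.

```python
import random
import statistics


def adaptive_allocate(draw_residual, num_questions, budget, warmup=5):
    """Greedy Neyman-style allocation (illustrative, not the paper's method).

    draw_residual(k) returns one (human - LLM) residual for question k and
    stands in for collecting one human label on that question.
    Returns the number of human labels spent on each question.
    """
    if warmup < 2:
        raise ValueError("warmup must be at least 2 to estimate a std dev")
    if budget < warmup * num_questions:
        raise ValueError("budget too small for the warm-up phase")

    # Warm-up: a few labels per question so every std dev is estimable.
    residuals = [
        [draw_residual(k) for _ in range(warmup)] for k in range(num_questions)
    ]

    for _ in range(budget - warmup * num_questions):
        # Assumed rule: under a sum-of-variances objective the ideal label
        # count is proportional to the residual std dev, so the question
        # with the largest (estimated std dev / labels so far) is the most
        # under-sampled and receives the next label.
        scores = [statistics.stdev(r) / len(r) for r in residuals]
        k = max(range(num_questions), key=scores.__getitem__)
        residuals[k].append(draw_residual(k))

    return [len(r) for r in residuals]


# Synthetic demo: the LLM is assumed accurate on question 0, poor on question 2.
rng = random.Random(0)
true_sigmas = [0.1, 0.5, 1.0]  # made-up residual std devs, for illustration only
counts = adaptive_allocate(lambda k: rng.gauss(0.0, true_sigmas[k]), 3, 300)
print(counts)
```

In this toy setup, most of the budget should flow to the question where the simulated LLM is least reliable, even though the residual spreads are never given to the allocator. One known limitation of the sketch: if a question's early residuals happen to be identical (for example, with binary responses), its estimated spread is zero and it receives no further labels, so a practical rule would need some form of forced exploration.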
