Labels or Preferences? Budget-Constrained Learning with Human Judgments over AI-Generated Outputs
Zihan Dong, Xiaotian Hou, Ruijia Wu, and Linjun Zhang

TL;DR
This paper introduces a statistically principled method, PCAL, for optimally allocating a fixed annotation budget between ground-truth labels and human preferences to improve AI data quality efficiently.
Contribution
It formulates the budget allocation as a semi-parametric inference problem and develops PCAL, a novel active learning method with proven asymptotic optimality and robustness guarantees.
Findings
PCAL outperforms baseline methods in simulations.
The method achieves lower estimator variance.
Real-data experiments confirm practical benefits.
Abstract
The increasing reliance on human preference feedback to judge AI-generated pseudo-labels has created a pressing need for principled, budget-conscious data acquisition strategies. We address the crucial question of how to optimally allocate a fixed annotation budget between ground-truth labels and pairwise preferences over AI-generated outputs. Our solution, grounded in semi-parametric inference, casts the budget allocation problem as a monotone missing data problem. Building on this formulation, we introduce Preference-Calibrated Active Learning (PCAL), a novel method that learns the optimal data acquisition strategy and develops a statistically efficient estimator for functionals of the data distribution. Theoretically, we prove the asymptotic optimality of our PCAL estimator and establish a key robustness guarantee that preserves performance even when the nuisance models are poorly estimated. Our flexible…
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Ethics and Social Impacts of AI
