Incentivizing High-Quality Human Annotations with Golden Questions
Shang Liu, Zhongze Cai, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li

TL;DR
This paper models how to incentivize human annotators to produce high-quality data for training language models using a principal-agent framework, highlighting the importance of carefully designed golden questions.
Contribution
It introduces a novel hypothesis testing approach based on a principal-agent model to effectively incentivize annotators and proposes criteria for selecting golden questions.
Findings
Golden questions should be of high certainty and similar to normal questions.
Incentive-compatible experiments reveal annotator behavior more effectively.
The hypothesis testing rate is of order Θ(1/√(n log n)), unlike the exponential rate given by traditional large deviation results.
Abstract
Human-annotated data plays a vital role in training large language models (LLMs), for example in supervised fine-tuning and human preference alignment. However, it is not guaranteed that paid human annotators produce high-quality data. In this paper, we study how to incentivize human annotators to do so. We start from a principal-agent model of the dynamics between the company (the principal) and the annotator (the agent), where the principal can only monitor the annotation quality by examining samples. We investigate the maximum likelihood estimators (MLE) and the corresponding hypothesis testing to incentivize annotators: the agent is given a bonus if the MLE passes the test. By analyzing the variance of the outcome, we show that the strategic behavior of the agent makes the hypothesis testing very different from traditional ones: Unlike the exponential rate proved by the large…
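The bonus mechanism described in the abstract (pay the agent only if an MLE computed on examined samples passes a test) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's actual test: labels are binary, the annotator's answers on golden questions are modeled as independent Bernoulli trials, and the function names and the `threshold` parameter are hypothetical.

```python
import random


def mle_accuracy(answers, gold):
    """MLE of a Bernoulli accuracy: the fraction of golden questions answered correctly."""
    correct = sum(a == g for a, g in zip(answers, gold))
    return correct / len(gold)


def pays_bonus(answers, gold, threshold):
    """Hypothetical test: the bonus is paid if the estimated accuracy reaches the threshold."""
    return mle_accuracy(answers, gold) >= threshold


def simulate_annotator(gold, accuracy, rng):
    """Assumed annotator model: each binary label is correct with probability `accuracy`."""
    return [g if rng.random() < accuracy else 1 - g for g in gold]


if __name__ == "__main__":
    rng = random.Random(0)
    gold = [rng.randint(0, 1) for _ in range(200)]  # golden answers known to the principal
    for effort in (0.95, 0.60):
        answers = simulate_annotator(gold, effort, rng)
        estimate = mle_accuracy(answers, gold)
        print(f"effort={effort:.2f} estimate={estimate:.3f} "
              f"bonus={pays_bonus(answers, gold, threshold=0.85)}")
```

In this toy version the annotator's accuracy is fixed in advance. The abstract's point is that a strategic agent chooses effort in response to the test, which is what changes the behavior of the hypothesis testing rate.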
