Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Zhaoyi Zhou, Yuda Song, Andrea Zanette

TL;DR
This paper introduces a statistically principled framework that combines human and synthetic feedback to evaluate large language models more efficiently, reducing reliance on costly human annotations while maintaining unbiased performance metrics.
Contribution
The paper proposes a method that integrates human and synthetic feedback for unbiased LLM evaluation, reducing the number of human annotations required.
Findings
Up to 12.2% reduction in human annotations with an off-the-shelf synthetic evaluator.
Up to 24.8% reduction in human annotations with a finetuned synthetic evaluator.
Method is scalable, generalizable, and free of hyper-parameter tuning.
Abstract
When developing new large language models (LLMs), a key step is evaluating their final performance, often by computing the win-rate against a reference model based on external feedback. Human feedback is the gold standard, particularly for capturing nuanced qualities like coherence, readability, and alignment with human expectations. However, human evaluations are costly -- even for large tech companies -- and when conducted with active users, they may negatively impact user experience. A promising alternative is synthetic feedback, where evaluations are conducted by other large language models, including reward models. While this eliminates the need for costly human annotations, it introduces biases that may distort the evaluation process. In this work, we propose a statistically principled framework that integrates human and synthetic feedback to reduce reliance on human annotations…
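The abstract as shown here does not spell out the estimator. One standard way to combine a small human-labeled sample with cheap synthetic scores while keeping the win-rate estimate unbiased is a control-variate correction, and the sketch below assumes that approach. The function name, parameter names, and data layout are hypothetical illustrations, not the authors' implementation.

```python
import statistics


def control_variate_win_rate(human, synthetic_paired, synthetic_pool_mean):
    """Estimate the human win-rate, using synthetic scores as a control variate.

    Assumed inputs (hypothetical layout, not taken from the paper):
      human               -- 0/1 human preferences on n prompts
      synthetic_paired    -- synthetic evaluator scores on the same n prompts
      synthetic_pool_mean -- mean synthetic score over a large unlabeled pool
    """
    if len(human) != len(synthetic_paired) or len(human) < 2:
        raise ValueError("need at least two paired samples of equal length")

    mean_h = statistics.fmean(human)
    mean_s = statistics.fmean(synthetic_paired)
    var_s = statistics.variance(synthetic_paired)
    if var_s == 0:
        # Constant synthetic scores carry no information about the human labels.
        return mean_h

    # Sample covariance between human labels and synthetic scores.
    cov_hs = sum(
        (h - mean_h) * (s - mean_s) for h, s in zip(human, synthetic_paired)
    ) / (len(human) - 1)

    # Coefficient that minimizes the variance of the corrected estimate.
    alpha = cov_hs / var_s

    # Shift the human mean by how far the paired synthetic mean sits from
    # the pool mean. The shift has expectation zero for a fixed alpha, so a
    # systematic bias in the synthetic evaluator does not leak into the result.
    return mean_h - alpha * (mean_s - synthetic_pool_mean)
```

Under this assumption, the variance of the corrected estimate shrinks by roughly a factor of 1 - rho^2, where rho is the correlation between human and synthetic judgments. A better-correlated evaluator therefore needs fewer human labels to reach the same precision. Estimating alpha from the same paired sample introduces a small finite-sample bias that this sketch ignores.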
