Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments

Han Zhou; Xingchen Wan; Yinhong Liu; Nigel Collier; Ivan Vulić; Anna Korhonen

arXiv:2406.11370·cs.CL·October 15, 2024

TL;DR

This paper identifies preference biases in LLM-based evaluators and introduces ZEPO, a zero-shot prompt optimization method that makes their preferences fairer and better aligned with human judgments of language generation quality.

Contribution

The paper proposes ZEPO, a novel zero-shot prompt optimization framework that improves fairness and human alignment of LLM evaluators without needing labeled data.

Findings

01 ZEPO significantly outperforms state-of-the-art evaluators.

02 Fairer preferences lead to better human alignment.

03 ZEPO requires no labeled data for optimization.

Abstract

Large language models (LLMs) have shown promising abilities as cost-effective and reference-free evaluators for assessing language generation quality. In particular, pairwise LLM evaluators, which compare two generated texts and determine the preferred one, have been employed in a wide range of applications. However, LLMs exhibit preference biases and worrying sensitivity to prompt designs. In this work, we first reveal that the predictive preference of LLMs can be highly brittle and skewed, even with semantically equivalent instructions. We find that fairer predictive preferences from LLMs consistently lead to judgments that are better aligned with humans. Motivated by this phenomenon, we propose an automatic Zero-shot Evaluation-oriented Prompt Optimization framework, ZEPO, which aims to produce fairer preference decisions and improve the alignment of LLM evaluators with human…
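
To make the idea of "fairer predictive preferences" concrete, the following is a minimal sketch, not the paper's implementation. It assumes a pairwise evaluator that returns "A" or "B" for each comparison, and it scores a prompt by how evenly its decisions split between the two positions. The names `preference_fairness` and `pick_fairest_prompt`, the scoring formula, and the example prompts and decisions are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, List


def preference_fairness(decisions: List[str]) -> float:
    """Illustrative fairness score in [0, 1] for pairwise decisions.

    1.0 means "A" and "B" are preferred equally often; 0.0 means one
    option is always preferred. This formula is an assumption for the
    sketch, not a metric taken from the paper.
    """
    if not decisions:
        raise ValueError("need at least one pairwise decision")
    if any(d not in ("A", "B") for d in decisions):
        raise ValueError("decisions must be 'A' or 'B'")
    counts = Counter(decisions)
    p_a = counts["A"] / len(decisions)
    # Distance of p_a from the uniform value 0.5, rescaled to [0, 1].
    return 1.0 - 2.0 * abs(p_a - 0.5)


def pick_fairest_prompt(decisions_by_prompt: Dict[str, List[str]]) -> str:
    """Return the prompt whose decisions are closest to an even split."""
    return max(
        decisions_by_prompt,
        key=lambda prompt: preference_fairness(decisions_by_prompt[prompt]),
    )


# Hypothetical decisions from two semantically equivalent instructions.
judgments = {
    "Which summary is better?": ["A", "A", "A", "B"],
    "Select the higher-quality summary.": ["A", "B", "B", "A"],
}
best = pick_fairest_prompt(judgments)
```

Because the score depends only on the evaluator's own decisions, selecting a prompt this way needs no human labels, which is the property the abstract highlights. A balanced split is only a reasonable target when the compared pairs are presented in random order.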

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems