Aligning Black-box Language Models with Human Judgments

Gerrit J. J. van den Burg; Gen Suzuki; Wei Liu; Murat Sensoy

arXiv:2502.04997·cs.CL·February 10, 2025

TL;DR

This paper presents a simple linear mapping method to align large language model judgments with human evaluations, significantly improving agreement without retraining, and enabling smaller models to match larger ones in human-aligned performance.

Contribution

The paper introduces a calibration framework that aligns LLM judgments with human judgments using minimal data, without retraining or fine-tuning the models.
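To make the idea of a learned linear mapping concrete, here is a minimal sketch, not the authors' implementation. It assumes each judgment is a single numeric score and fits human ≈ slope × LLM + intercept by ordinary least squares on a few calibration pairs. The abstract is cut off before it says what the mapping is defined over, so the scalar-score setup, the function names `fit_linear_map` and `apply_linear_map`, and the 1-to-5 rating range are all assumptions made for illustration.

```python
from statistics import mean


def fit_linear_map(llm_scores, human_scores):
    """Fit human ~= slope * llm + intercept by ordinary least squares.

    Hypothetical helper: assumes scalar scores, which the source does not confirm.
    """
    if len(llm_scores) != len(human_scores) or len(llm_scores) < 2:
        raise ValueError("need at least two paired calibration scores")
    x_bar = mean(llm_scores)
    y_bar = mean(human_scores)
    # Sum of squared deviations of the LLM scores from their mean.
    sxx = sum((x - x_bar) ** 2 for x in llm_scores)
    if sxx == 0:
        raise ValueError("LLM scores are constant; slope is undefined")
    # Co-deviation of LLM and human scores.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(llm_scores, human_scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def apply_linear_map(score, slope, intercept, lo=1.0, hi=5.0):
    """Map a new LLM score and clip it to an assumed rating range [lo, hi]."""
    return min(hi, max(lo, slope * score + intercept))


if __name__ == "__main__":
    # Toy calibration pairs, invented purely for illustration.
    llm = [1.0, 2.0, 3.0, 4.0]
    human = [1.5, 2.5, 3.5, 4.5]
    slope, intercept = fit_linear_map(llm, human)
    print(slope, intercept, apply_linear_map(2.0, slope, intercept))
```

The LLM is only queried for scores and never updated, which is what makes a calibration step of this kind usable with a black-box model. Fitting one mapping per evaluator, or one on aggregated scores, would correspond to the two alignment targets the abstract mentions.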

Findings

1. Over 142% improvement in agreement across 29 tasks.
2. Works effectively in zero-shot and few-shot settings.
3. Smaller LLMs can match larger models in human-aligned performance.

Abstract

Large language models (LLMs) are increasingly used as automated judges to evaluate recommendation systems, search engines, and other subjective tasks, where relying on human evaluators can be costly, time-consuming, and unscalable. LLMs offer an efficient solution for continuous, automated evaluation. However, since the systems that are built and improved with these judgments are ultimately designed for human use, it is crucial that LLM judgments align closely with human evaluators to ensure such systems remain human-centered. On the other hand, aligning LLM judgments with human evaluators is challenging due to individual variability and biases in human judgments. We propose a simple yet effective framework to align LLM judgments with individual human evaluators or their aggregated judgments, without retraining or fine-tuning the LLM. Our approach learns a linear mapping between the…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: ALIGN