Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons

Adian Liusie; Vatsal Raina; Yassir Fathullah; Mark Gales

arXiv:2405.05894·cs.CL·November 13, 2024

TL;DR

This paper presents a Product of Experts framework for efficient pairwise comparison of large candidate sets using LLMs, significantly reducing computational costs while maintaining high correlation with human judgments.

Contribution

Introduces a flexible PoE framework for scalable LLM-based pairwise assessment, enabling high-quality rankings with minimal comparisons.

Findings

01. Achieves performance comparable to using all comparisons with as few as 2% of them.

02. Provides closed-form solutions for the optimal ranking when Gaussian experts are used.

03. Demonstrates significant computational savings on multiple NLG tasks.
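To make the 2% figure concrete, the short sketch below counts how many comparisons such a budget leaves. It assumes comparisons are counted as ordered pairs (i, j) with i ≠ j, which is an assumption of this example, and the candidate count of 100 is purely illustrative; `comparison_budget` is a hypothetical helper, not part of the paper's code.

```python
def comparison_budget(num_candidates, percent):
    """Return (total ordered pairs, pairs allowed under a percentage budget).

    Assumption: each ordered pair (i, j), i != j, counts as one comparison.
    """
    total = num_candidates * (num_candidates - 1)  # grows quadratically in N
    budget = total * percent // 100                # integer arithmetic avoids float rounding
    return total, budget


total, budget = comparison_budget(100, 2)
print(total, budget)  # 9900 198
```

Under these assumptions, ranking 100 candidates exhaustively would need 9900 LLM calls, while a 2% budget needs 198.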

Abstract

LLM-as-a-judge approaches are a practical and effective way of assessing a range of text tasks. However, when pairwise comparisons are used to rank a set of candidates, the computational cost scales quadratically with the number of candidates, which limits the approach in practice. This paper introduces a Product of Experts (PoE) framework for efficient LLM Comparative Assessment. Here, individual comparisons are treated as experts that provide information on a pair's score difference. The PoE framework combines the information from these experts to yield an expression that can be maximized with respect to the underlying set of candidates, and it is highly flexible because any form of expert can be assumed. When Gaussian experts are used, one can derive simple closed-form solutions for the optimal candidate ranking, and expressions for selecting which comparisons should be made to maximize the…
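The Gaussian case can be illustrated with a minimal sketch. It assumes each comparison (i, j) acts as a Gaussian expert on the score difference s_i - s_j with a given mean and a shared variance; under that assumption, maximizing the product of experts reduces to ordinary least squares. The function name, the input format, and the use of the minimum-norm solution to handle shift invariance are choices made for this example, not necessarily the paper's exact formulation.

```python
import numpy as np


def gaussian_poe_scores(num_candidates, comparisons):
    """Estimate candidate scores from sparse pairwise comparisons.

    comparisons: list of (i, j, mu_ij), where mu_ij is the assumed mean of a
    Gaussian expert on the score difference s_i - s_j (hypothetical input format).
    """
    W = np.zeros((len(comparisons), num_candidates))
    mu = np.zeros(len(comparisons))
    for k, (i, j, m) in enumerate(comparisons):
        W[k, i] = 1.0   # +1 for the first item of the pair
        W[k, j] = -1.0  # -1 for the second item of the pair
        mu[k] = m
    # With equal variances, the product of Gaussian experts is maximized by
    # the least-squares solution of W s = mu. Scores are only defined up to a
    # constant shift, so lstsq's minimum-norm solution is used here.
    scores, *_ = np.linalg.lstsq(W, mu, rcond=None)
    return scores - scores.mean()  # centre the scores for readability


# Two comparisons are enough to rank three candidates in this toy example.
scores = gaussian_poe_scores(3, [(1, 0, 1.0), (2, 1, 2.0)])
ranking = np.argsort(-scores)  # candidate indices, best first
print(ranking)
```

The point of the sketch is that the comparison set only needs to connect the candidates, not cover every pair, for a full ranking to be recovered.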

Code & Models

Repositories

adianliusie/poe-llm-comparative-assessment (Official)

Taxonomy

Topics: Statistical and Computational Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training