Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons
Adian Liusie, Vatsal Raina, Yassir Fathullah, Mark Gales

TL;DR
This paper presents a Product of Experts framework for efficient pairwise comparison of large candidate sets using LLMs, significantly reducing computational costs while maintaining high correlation with human judgments.
Contribution
Introduces a flexible PoE framework for scalable LLM-based pairwise assessment, enabling high-quality rankings with minimal comparisons.
Findings
Achieves performance comparable to using the full set of comparisons with as few as 2% of them.
Provides closed-form solutions for Gaussian expert cases.
Demonstrates significant computational savings on multiple NLG tasks.
Abstract
LLM-as-a-judge approaches are a practical and effective way of assessing a range of text tasks. However, when using pairwise comparisons to rank a set of candidates, the computational cost scales quadratically with the number of candidates, which limits practical use. This paper introduces a Product of Experts (PoE) framework for efficient LLM Comparative Assessment. Here, individual comparisons are treated as experts that provide information on a pair's score difference. The PoE framework combines the information from these experts to yield an expression that can be maximized with respect to the underlying set of candidates, and it is highly flexible in that any form of expert can be assumed. When Gaussian experts are used, one can derive simple closed-form solutions for the optimal candidate ranking, and expressions for selecting which comparisons should be made to maximize the…
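To make the Gaussian case concrete, here is a minimal sketch rather than the authors' implementation. It assumes every expert is a Gaussian with the same variance, so maximizing the product of experts reduces to a least-squares fit of score differences. The function name `gaussian_poe_scores`, the linear mapping `alpha * (p - 0.5)` from a comparison probability to an expected score difference, and the choice to anchor the first candidate's score at zero are all assumptions made for illustration.

```python
import numpy as np


def gaussian_poe_scores(n_candidates, comparisons, alpha=1.0):
    """Estimate candidate scores from a sparse set of pairwise comparisons.

    comparisons: list of (i, j, p_ij), where p_ij is the judge's probability
    that candidate i is better than candidate j (hypothetical input format).
    """
    k = len(comparisons)
    # One row per comparison (+1 for i, -1 for j), plus one anchoring row.
    W = np.zeros((k + 1, n_candidates))
    mu = np.zeros(k + 1)
    for row, (i, j, p) in enumerate(comparisons):
        W[row, i] = 1.0
        W[row, j] = -1.0
        # Assumed linear mapping from probability to expected score difference.
        mu[row] = alpha * (p - 0.5)
    # Scores are only defined up to a shift, so fix s_0 = 0 (assumption).
    W[k, 0] = 1.0
    # Least squares is the maximum-likelihood solution for equal-variance
    # Gaussian experts.
    scores, *_ = np.linalg.lstsq(W, mu, rcond=None)
    return scores


# Three comparisons instead of all 4 * 3 = 12 ordered pairs.
comparisons = [(0, 1, 0.9), (1, 2, 0.8), (0, 3, 0.3)]
scores = gaussian_poe_scores(4, comparisons)
ranking = np.argsort(-scores)  # best candidate first
```

In this toy run the three comparisons connect all four candidates, which is what allows every score to be estimated. If the comparison graph were disconnected, the relative scores of the separate groups would not be identifiable from the data.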
Taxonomy
Topics: Statistical and Computational Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
