Learning What Evaluators Value: A Reliable Approach to Modeling Evaluator Preferences
Madeline Celi Kitch, Nihar B. Shah

TL;DR
This paper introduces a method for learning evaluator preferences that makes minimal assumptions about the preference model, aiming for accurate preference learning across diverse real-world evaluation scenarios.
Contribution
The paper proposes a preference-learning algorithm that makes minimal assumptions, is robust to model mismatch, and comes with theoretical performance guarantees.
Findings
The algorithm accurately learns preferences on both synthetic and real-world data.
Model mismatch can significantly impair preference learning if not properly addressed.
The method performs well even when traditional linearity assumptions are violated.
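To make the model-mismatch finding concrete, here is a minimal sketch of how a linear model can mis-rank items when the evaluator's true preference function is not linear. It is not the paper's algorithm: the `min`-based evaluator, the grid of items, and the names `true_preference`, `fit_linear`, `item_a`, and `item_b` are all assumptions chosen for illustration.

```python
from itertools import product


def true_preference(x1, x2):
    # Hypothetical evaluator: the overall score is limited by the weaker
    # of the two criteria. This is non-linear in (x1, x2).
    return min(x1, x2)


def fit_linear(points, scores):
    # Ordinary least squares for score ~ w1*x1 + w2*x2 + b.
    # On a full grid the two centered criteria are orthogonal, so each
    # weight equals the simple-regression slope cov(x_j, y) / var(x_j).
    n = len(points)
    mean_y = sum(scores) / n
    weights, means = [], []
    for j in range(2):
        mean_x = sum(p[j] for p in points) / n
        cov = sum((p[j] - mean_x) * (y - mean_y) for p, y in zip(points, scores))
        var = sum((p[j] - mean_x) ** 2 for p in points)
        weights.append(cov / var)
        means.append(mean_x)
    intercept = mean_y - sum(w * m for w, m in zip(weights, means))
    return weights, intercept


# Items scored on two criteria, each taking values 0.0, 0.1, ..., 1.0.
grid = [i / 10 for i in range(11)]
points = list(product(grid, grid))
scores = [true_preference(x1, x2) for x1, x2 in points]

weights, intercept = fit_linear(points, scores)


def linear_score(x1, x2):
    return weights[0] * x1 + weights[1] * x2 + intercept


# One lopsided item and one balanced item.
item_a = (1.0, 0.0)
item_b = (0.4, 0.4)

# The hypothetical evaluator prefers the balanced item B ...
true_prefers_b = true_preference(*item_b) > true_preference(*item_a)
# ... but the fitted linear model prefers the lopsided item A.
linear_prefers_a = linear_score(*item_a) > linear_score(*item_b)

print("true preference ranks B above A:", true_prefers_b)
print("linear model ranks A above B:", linear_prefers_a)
```

In this toy setup the linear fit assigns both criteria the same positive weight, so it rewards the larger sum of criteria even though the evaluator cares about the weaker one. The ranking it learns therefore contradicts the evaluator's actual preferences.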
Abstract
In many applications, human and LLM evaluators use assessments of relevant criteria to create an overall evaluation for an item or individual. For example, in admissions, committees assess candidates on attributes such as test scores, GPA, and research experience to evaluate their overall fit for the program. Another example arises in medical care where clinicians use patient reports of symptoms to consider preliminary diagnoses and assess risks. Each setting involves mapping multiple criteria to an overall evaluation -- a process that reflects the evaluator's underlying preferences. We focus on the fundamental question of learning these preferences. Many applications of this problem make specific modeling assumptions on evaluator preferences that may be substantially violated in the real world. We make the minimal assumption that the preference function is coordinate-wise…
