Human Preferences as Dueling Bandits

Xinyi Yan; Chengxi Luo; Charles L. A. Clarke; Nick Craswell; Ellen M. Voorhees; Pablo Castells

arXiv:2204.10362·cs.IR·April 25, 2022

TL;DR

This paper explores using dueling bandits algorithms to evaluate neural rankers through human preference judgments, proposing a framework for offline evaluation that accounts for ties and minimizes the number of judgments required.

Contribution

The paper introduces a novel application of dueling bandits to offline, human preference-based ranking evaluation and proposes modifications to an existing algorithm to improve its performance in that setting.

Findings

01

Simulations identify one dueling bandits algorithm as a promising fit for human preference judgments.

02

The modified algorithm performs well when collecting preference data.

03

Over 10,000 judgments collected for TREC submissions validate the approach.

Abstract

The dramatic improvements in core information retrieval tasks engendered by neural rankers create a need for novel evaluation methods. If every ranker returns highly relevant items in the top ranks, it becomes difficult to recognize meaningful differences between them and to build reusable test collections. Several recent papers explore pairwise preference judgments as an alternative to traditional graded relevance assessments. Rather than viewing items one at a time, assessors view items side-by-side and indicate the one that provides the better response to a query, allowing fine-grained distinctions. If we employ preference judgments to identify the probably best items for each query, we can measure rankers by their ability to place these items as high as possible. We frame the problem of finding best items as a dueling bandits problem. While many papers explore dueling bandits for…
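
To make the framing above concrete, here is a minimal sketch of picking a best item from side-by-side preference judgments that may end in a tie. It is not the algorithm used in the paper: it judges every pair exhaustively and scores items Copeland-style, whereas a dueling bandits algorithm would choose pairs adaptively to reduce the number of judgments. The names `best_item_by_preferences`, `simulated_judge`, and the `quality` grades are hypothetical and exist only for illustration.

```python
from itertools import combinations
from typing import Callable, Hashable, Optional, Sequence


def best_item_by_preferences(
    items: Sequence[Hashable],
    judge: Callable[[Hashable, Hashable], Optional[Hashable]],
) -> Hashable:
    """Return the item with the highest Copeland-style score.

    judge(a, b) returns the preferred item, or None for a tie.
    A win scores 1 point; a tie scores 0.5 for each item.
    """
    scores = {item: 0.0 for item in items}
    for a, b in combinations(items, 2):
        preferred = judge(a, b)
        if preferred is None:
            scores[a] += 0.5
            scores[b] += 0.5
        else:
            scores[preferred] += 1.0
    # max() keeps the first item with the top score, so input order breaks ties.
    return max(items, key=lambda item: scores[item])


# Hypothetical quality grades standing in for a human assessor (assumption).
quality = {"doc_a": 2, "doc_b": 3, "doc_c": 3, "doc_d": 1}


def simulated_judge(a, b):
    """Prefer the higher-graded item; report a tie when grades are equal."""
    if quality[a] == quality[b]:
        return None
    return a if quality[a] > quality[b] else b


best = best_item_by_preferences(list(quality), simulated_judge)
print(best)
```

With n items this sketch needs n(n-1)/2 judgments per query, which is the cost that an adaptive pair-selection strategy aims to cut down.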
