One-Shot Labeling for Automatic Relevance Estimation

Sean MacAvaney; Luca Soldaini

arXiv:2302.11266 · cs.IR · July 12, 2023 · 5 cites

Open Access · 1 Repo

TL;DR

This paper introduces One-Shot Labelers (1SL), which use large language models to predict the relevance of unjudged documents, making offline search system evaluations more accurate and more statistically reliable.

Contribution

The paper presents methods that use large language models to fill holes in relevance assessments, improving evaluation accuracy and confidence in comparisons between search systems.

Findings

01

1SL predictions often disagree with human assessments.

02

1SL labels still produce more reliable system rankings than the single known relevant label alone.

03

The strongest approaches reach system ranking correlations above 0.86 with the full rankings.
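As a rough illustration of what a system ranking correlation measures, the sketch below compares how two sets of relevance labels order the same retrieval systems. It assumes Kendall's tau as the correlation, which this page does not specify, and the system names and scores are hypothetical, not results from the paper.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts (ties count as neither)."""
    systems = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # A pair is concordant when both label sets order the two systems the same way.
        diff_a = scores_a[s] - scores_a[t]
        diff_b = scores_b[s] - scores_b[t]
        if diff_a * diff_b > 0:
            concordant += 1
        elif diff_a * diff_b < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0

# Hypothetical effectiveness scores for four systems (illustrative only).
full_labels = {"bm25": 0.30, "dense": 0.42, "hybrid": 0.47, "rerank": 0.51}
filled_labels = {"bm25": 0.28, "dense": 0.45, "hybrid": 0.44, "rerank": 0.50}

tau = kendall_tau(full_labels, filled_labels)
print(f"Kendall's tau: {tau:.3f}")
```

Here the two label sets disagree on one of the six system pairs (dense vs. hybrid), so tau is (5 - 1) / 6. A value near 1 means the predicted labels rank systems almost exactly as the full labels do.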

Abstract

Dealing with unjudged documents ("holes") in relevance assessments is a perennial problem when evaluating search systems with offline experiments. Holes can reduce the apparent effectiveness of retrieval systems during evaluation and introduce biases in models trained with incomplete data. In this work, we explore whether large language models can help us fill such holes to improve offline evaluations. We examine an extreme, albeit common, evaluation setting wherein only a single known relevant document per query is available for evaluation. We then explore various approaches for predicting the relevance of unjudged documents with respect to a query and the known relevant document, including nearest neighbor, supervised, and prompting techniques. We find that although the predictions of these One-Shot Labelers (1SL) frequently disagree with human assessments, the labels they produce…
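The hole-filling setting described in the abstract can be sketched as follows. This is a minimal toy example, not the authors' implementation: it stands in for the nearest neighbor family with bag-of-words cosine similarity against the known relevant document, and the threshold, function names, and documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(text_a, text_b):
    """Bag-of-words cosine similarity (a stand-in for a learned similarity model)."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[term] * vb[term] for term in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def fill_holes(ranking, docs, qrels, known_rel_id, threshold=0.5):
    """Label each unjudged document by its similarity to the known relevant one.

    The threshold of 0.5 is an arbitrary assumption for this sketch.
    """
    filled = dict(qrels)
    rel_text = docs[known_rel_id]
    for doc_id in ranking:
        if doc_id not in filled:  # a "hole": no human judgment exists
            filled[doc_id] = 1 if cosine(docs[doc_id], rel_text) >= threshold else 0
    return filled

def precision_at_k(ranking, qrels, k):
    """Precision@k, treating any remaining unjudged document as non-relevant."""
    return sum(qrels.get(doc_id, 0) for doc_id in ranking[:k]) / k

# Hypothetical collection with a single known relevant document (d1).
docs = {
    "d1": "solar panels convert sunlight into electricity",
    "d2": "solar panels convert sunlight into electric power",
    "d3": "recipes for baking sourdough bread",
}
qrels = {"d1": 1}
ranking = ["d2", "d1", "d3"]

filled = fill_holes(ranking, docs, qrels, known_rel_id="d1")
print(precision_at_k(ranking, qrels, 2))   # holes counted as non-relevant
print(precision_at_k(ranking, filled, 2))  # holes filled by the one-shot labeler
```

With the original labels, the unjudged near-duplicate d2 is counted as non-relevant and drags precision@2 down; after filling, it is credited as relevant. A prompting-based labeler would replace `cosine` with a language model call that sees the query, the known relevant document, and the candidate.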

Code & Models

Repositories

seanmacavaney/autoqrels
none · Official

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications