Active Sampling for Large-scale Information Retrieval Evaluation

Dan Li; Evangelos Kanoulas

arXiv:1709.01709·cs.IR·September 7, 2017

TL;DR

This paper introduces an active sampling method for large-scale information retrieval evaluation that combines sampling with active selection of documents, aiming to reduce human judging effort and bias while improving evaluation accuracy.

Contribution

The paper proposes an active sampling approach that balances system quality assessment against sampling variance, with the goal of making evaluation more efficient and less biased.

Findings

01. Validated with TREC data, showing improved evaluation accuracy

02. Reduces human judgment effort in large-scale retrieval evaluation

03. Balances bias and variance in system evaluation

Abstract

Evaluation is crucial in Information Retrieval. The development of models, tools and methods has significantly benefited from the availability of reusable test collections formed through a standardized and thoroughly tested methodology, known as the Cranfield paradigm. Constructing these collections requires obtaining relevance judgments for a pool of documents retrieved by the systems participating in an evaluation task, and thus involves immense human labor. To alleviate this effort, different methods for constructing collections have been proposed in the literature, falling under two broad categories: (a) sampling, and (b) active selection of documents. The former devises a smart sampling strategy by choosing only a subset of documents to be assessed and inferring evaluation measures on the basis of the obtained sample; the sampling distribution is fixed at the beginning of the process.…
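To make the sampling category concrete, the following is a minimal sketch of estimating a measure from a judged sample drawn under a fixed distribution. It is not the paper's method: the abstract shown here does not specify an estimator, so the sketch assumes a standard Hansen-Hurwitz estimator with sampling with replacement. The function names, the 1/rank distribution, and the use of a label list as a stand-in for the human assessor are all hypothetical.

```python
import random


def sample_and_estimate_precision(relevance, probs, n_samples, rng):
    """Estimate the fraction of relevant documents in a ranked list.

    relevance -- 0/1 labels per rank; stands in for the human assessor,
                 and only the sampled ranks are ever looked up
    probs     -- fixed sampling distribution over ranks (must sum to 1)
    n_samples -- number of draws, made with replacement
    rng       -- a random.Random instance
    """
    ranks = list(range(len(relevance)))
    # Draw ranks with replacement from the fixed distribution.
    drawn = rng.choices(ranks, weights=probs, k=n_samples)
    # Hansen-Hurwitz: weight each judged label by 1 / draw probability,
    # which gives an unbiased estimate of the number of relevant documents.
    est_relevant = sum(relevance[i] / probs[i] for i in drawn) / n_samples
    return est_relevant / len(relevance)


def top_heavy_distribution(n_docs):
    """Hypothetical distribution that favours early ranks (weight 1/rank)."""
    weights = [1.0 / (r + 1) for r in range(n_docs)]
    total = sum(weights)
    return [w / total for w in weights]
```

A single estimate can fall outside [0, 1] because of the reweighting; the estimator is unbiased only on average over repeated samples, and its variance depends on how well the fixed distribution matches where the relevant documents actually sit.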
