How to Measure the Reproducibility of System-oriented IR Experiments

Timo Breuer; Nicola Ferro; Norbert Fuhr; Maria Maistro; Tetsuya Sakai; Philipp Schaer; Ian Soboroff

arXiv:2010.13447·cs.IR·October 27, 2020

TL;DR

This paper compares measures that quantify how closely a system-oriented IR experiment has been replicated or reproduced, and introduces a reproducibility-oriented dataset for validating them. Together these address the lack of any means to assess when an experiment counts as reproduced.

Contribution

It compares various measures for assessing IR experiment reproducibility and creates a dataset for validation and future research.

Findings

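01. The measures operate at different levels of granularity, from the fine-grained comparison of ranked lists to the more general comparison of the obtained effects and significant differences.

02. The reproducibility-oriented dataset enables validation of these measures.

03. The measures quantify the extent to which a system-oriented IR experiment has been replicated or reproduced.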
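
To make the idea of granularity levels concrete, here is a minimal Python sketch that compares an original and a reproduced run at two levels: the ordering of documents in a ranked list, and the per-topic effectiveness scores. This summary does not name the specific measures, so Kendall's tau and root mean square error (RMSE) are used as common illustrative choices. The function names, document ids, topic ids, and scores are all hypothetical and not taken from the paper or its repository.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(original, reproduced):
    """Kendall's tau-a between two rankings of the same set of document ids."""
    if len(original) < 2 or len(set(original)) != len(original):
        raise ValueError("need at least two distinct documents")
    if len(original) != len(reproduced) or set(original) != set(reproduced):
        raise ValueError("both rankings must contain the same documents")
    # Position of each document in the reproduced ranking.
    pos = {doc: rank for rank, doc in enumerate(reproduced)}
    concordant = discordant = 0
    # Each pair (a, b) is ordered a-before-b in the original ranking.
    for a, b in combinations(original, 2):
        if pos[a] < pos[b]:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(original) * (len(original) - 1) // 2
    return (concordant - discordant) / n_pairs


def rmse(original_scores, reproduced_scores):
    """Root mean square error between per-topic effectiveness scores."""
    topics = sorted(original_scores)
    if topics != sorted(reproduced_scores):
        raise ValueError("both runs must cover the same topics")
    squared = [(original_scores[t] - reproduced_scores[t]) ** 2 for t in topics]
    return sqrt(sum(squared) / len(squared))


# Hypothetical toy data for illustration only.
orig_ranking = ["d1", "d2", "d3", "d4"]
repr_ranking = ["d1", "d3", "d2", "d4"]
orig_ap = {"t1": 0.50, "t2": 0.30}
repr_ap = {"t1": 0.40, "t2": 0.30}

tau = kendall_tau(orig_ranking, repr_ranking)
error = rmse(orig_ap, repr_ap)
print(f"tau={tau:.3f} rmse={error:.3f}")
```

A tau of 1 means the reproduced run orders the documents identically, while an RMSE of 0 means the per-topic scores match exactly. The two can disagree: a reproduction may reach similar scores with a noticeably different ranking, which is one reason to look at more than one level.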

Abstract

Replicability and reproducibility of experimental results are primary concerns in all the areas of science and IR is not an exception. Besides the problem of moving the field towards more reproducible experimental practices and protocols, we also face a severe methodological issue: we do not have any means to assess when reproduced is reproduced. Moreover, we lack any reproducibility-oriented dataset, which would allow us to develop such methods. To address these issues, we compare several measures to objectively quantify to what extent we have replicated or reproduced a system-oriented IR experiment. These measures operate at different levels of granularity, from the fine-grained comparison of ranked lists, to the more general comparison of the obtained effects and significant differences. Moreover, we also develop a reproducibility-oriented dataset, which allows us to validate our…

Code & Models

Repositories

irgroup/sigir2020-measure-reproducibility (official)
