ir_metadata: An Extensible Metadata Schema for IR Experiments

Timo Breuer; Jüri Keller; Philipp Schaer

arXiv:2207.08922·cs.IR·July 20, 2022

TL;DR

This paper introduces ir_metadata, an extensible schema for annotating IR experiment run files with metadata based on the PRIMAD model, enhancing reproducibility and reuse of experimental data.

Contribution

It proposes a new metadata schema aligned with PRIMAD for IR experiments and demonstrates its application in reproducibility studies and dataset curation.

Findings

01. Metadata annotations improve experiment reproducibility.

02. Implementation supports reproducibility studies in IR.

03. Annotated dataset facilitates reuse and validation.

Abstract

The information retrieval (IR) community has a strong tradition of making the computational artifacts and resources available for future reuse, allowing the validation of experimental results. Besides the actual test collections, the underlying run files are often hosted in data archives as part of conferences like TREC, CLEF, or NTCIR. Unfortunately, the run data itself does not provide much information about the underlying experiment. For instance, the single run file is not of much use without the context of the shared task's website or the run data archive. In other domains, like the social sciences, it is good practice to annotate research data with metadata. In this work, we introduce ir_metadata - an extensible metadata schema for TREC run files based on the PRIMAD model. We propose to align the metadata annotations to PRIMAD, which considers components of computational…
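To make the idea of annotating a run file concrete, here is a minimal sketch of attaching metadata grouped by the six PRIMAD components (platform, research goal, implementation, method, actor, data) to a TREC-style run. The commented header layout, the helper names `annotate_run` and `split_run`, and every field name and value are assumptions made for illustration; they are not taken from the paper or from the irgroup/ir_metadata repository.

```python
# Hypothetical sketch: the "# "-prefixed header layout and all field names
# are illustrative assumptions, not the schema defined by the paper.

PRIMAD_KEYS = ("platform", "research goal", "implementation",
               "method", "actor", "data")


def annotate_run(run_lines, metadata):
    """Return the run lines preceded by a commented metadata header."""
    unknown = set(metadata) - set(PRIMAD_KEYS)
    if unknown:
        raise ValueError(f"not a PRIMAD component: {sorted(unknown)}")
    header = []
    # Emit components in PRIMAD order so headers are comparable across runs.
    for key in PRIMAD_KEYS:
        if key in metadata:
            header.append(f"# {key}:")
            for field, value in metadata[key].items():
                header.append(f"#   {field}: {value}")
    return header + list(run_lines)


def split_run(lines):
    """Separate commented header lines from the ranking lines."""
    header = [line for line in lines if line.startswith("#")]
    rankings = [line for line in lines if not line.startswith("#")]
    return header, rankings


# TREC run line: query id, "Q0", document id, rank, score, run tag.
run = ["301 Q0 doc42 1 12.5 my_run", "301 Q0 doc7 2 11.9 my_run"]
meta = {"actor": {"name": "Jane Doe"}, "method": {"retrieval": "BM25"}}

annotated = annotate_run(run, meta)
header, rankings = split_run(annotated)
```

Keeping the metadata in comment lines means the annotations travel with the run file itself, while a reader that skips comment lines still sees only the rankings.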

Code & Models

Repositories

irgroup/ir_metadata

Taxonomy

Methods: Test · ALIGN