Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard

Ehsan Kamalloo; Nandan Thakur; Carlos Lassance; Xueguang Ma; Jheng-Hong Yang; Jimmy Lin

arXiv:2306.07471·cs.IR·June 14, 2023·1 cites

TL;DR

This paper introduces resources for the BEIR benchmark, including reproducible reference implementations of retrieval models and an official leaderboard, to enhance reproducibility, comparability, and research in zero-shot information retrieval across diverse domains.

Contribution

It provides reproducible reference implementations for dense and sparse retrieval models and establishes an official BEIR leaderboard for consistent model evaluation.

Findings

01

Reproducible implementations ease entry for new researchers.

02

The leaderboard enables fair comparison of retrieval models.

03

The resources facilitate future research in domain-specific information retrieval.

Abstract

BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across 18 different domain/task combinations. In recent years, we have witnessed the growing popularity of a representation learning approach to building retrieval models, typically using pretrained transformers in a supervised setting. This naturally raises the question: How effective are these models when presented with queries and documents that differ from the training data? Examples include searching in different domains (e.g., medical or legal text) and with different types of queries (e.g., keywords vs. well-formed questions). While BEIR was designed to answer these questions, our work addresses two shortcomings that prevent the benchmark from achieving its full potential: First, the sophistication of modern neural methods and the complexity of current software infrastructure create barriers to…

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications