Evaluating Dense Passage Retrieval using Transformers

Nima Sadri

arXiv:2208.06959 · cs.IR · August 16, 2022 · 1 citation

TL;DR

This paper proposes a standardized evaluation framework for Transformer-based dense passage retrieval models, enabling fairer comparisons and consistent benchmarking on the dev set of the MSMARCO dataset, with MRR@100 as the primary metric.

Contribution

The work formalizes best practices for evaluating Transformer-based retrieval models, providing a clear, reproducible framework for embedding, scoring, and assessing dense retrieval methods.

Findings

1. The framework facilitates consistent evaluation of dense retrieval models.
2. Experiments demonstrate the framework's application to well-known models.
3. The framework enables fairer comparison and benchmarking of retrieval techniques.

Abstract

Although representational retrieval models based on Transformers have made major advances in the past few years, and despite widely accepted conventions and best practices for testing such models, a standardized evaluation framework for testing them has not been developed. In this work, we formalize the best practices and conventions followed by researchers in the literature, paving the way for more standardized evaluations, and therefore fairer comparisons between models. Our framework (1) embeds the documents and queries; (2) for each query-document pair, computes the relevance score as the dot product of the document and query embeddings; (3) uses the dev set of the MSMARCO dataset to evaluate the models; (4) uses the trec_eval script to calculate MRR@100, which is the primary metric used to evaluate the models. Most…
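Steps (2) and (4) of the framework can be illustrated with a minimal sketch. It assumes the embeddings from step (1) already exist as plain lists of floats; the function names (`rank_passages`, `mrr_at_k`), the toy vectors, and the relevance judgments are all hypothetical and not taken from the paper. The MRR computation here is a pure-Python stand-in for illustration only; the paper itself relies on the trec_eval script for the reported numbers.

```python
from typing import Dict, List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors (the relevance score)."""
    return sum(a * b for a, b in zip(u, v))


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Dict[str, Sequence[float]],
    k: int = 100,
) -> List[str]:
    """Return the ids of the top-k passages, highest dot product first."""
    scored = sorted(
        passage_vecs.items(),
        key=lambda item: dot(query_vec, item[1]),
        reverse=True,
    )
    return [pid for pid, _ in scored[:k]]


def mrr_at_k(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k.

    A query with no relevant passage in its top k contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for qid, ranked in rankings.items():
        relevant = qrels.get(qid, set())
        for rank, pid in enumerate(ranked[:k], start=1):
            if pid in relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Toy data (hypothetical): 2-dimensional "embeddings" for three passages
# and two queries, plus one relevant passage per query.
passages = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 1.0]}
queries = {"q1": [1.0, 0.5], "q2": [0.2, 1.0]}
qrels = {"q1": {"p1"}, "q2": {"p3"}}

rankings = {qid: rank_passages(vec, passages) for qid, vec in queries.items()}
score = mrr_at_k(rankings, qrels)
print(rankings)
print(f"MRR@100 = {score:.3f}")
```

In this toy run, the relevant passage for q1 lands at rank 2 and the one for q2 at rank 1, so the mean reciprocal rank is (1/2 + 1) / 2 = 0.75. A real evaluation would replace the exhaustive sort with an index over the full passage collection and write the rankings to a run file for trec_eval.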

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications