Overview of the TREC 2021 deep learning track

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin

arXiv:2507.08191 · cs.IR · July 14, 2025 · 57 cites

TL;DR

This paper reviews the third year of the TREC Deep Learning track, highlighting dataset updates, the performance of neural ranking models, and challenges related to data quality and collection size.

Contribution

It provides an overview of dataset refreshes, evaluates neural ranking models' performance, and discusses data quality issues in the TREC 2021 deep learning track.

Findings

01. Neural models outperform traditional methods
02. Single-stage retrieval performs well but lags behind multi-stage pipelines
03. Dataset size increase raises questions about data quality and relevance
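
To make the contrast in finding 02 concrete, here is a minimal sketch of single-stage retrieval versus a two-stage retrieve-then-rerank pipeline. It is illustrative only and is not taken from the paper: the function names, the toy corpus, and both scoring functions are assumptions. In particular, `coverage_score` is a hypothetical stand-in for a neural reranker, and `term_overlap_score` is a toy stand-in for a lexical first-stage ranker.

```python
from collections import Counter


def term_overlap_score(query, doc):
    """Toy first-stage score: total occurrences of query terms in the doc."""
    doc_counts = Counter(doc.lower().split())
    return sum(doc_counts[term] for term in set(query.lower().split()))


def coverage_score(query, doc):
    """Hypothetical reranker stand-in: fraction of distinct query terms present."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & set(doc.lower().split())) / len(query_terms)


def first_stage(query, corpus, k):
    """Single-stage retrieval: score every document once, keep the top k ids."""
    scored = [(term_overlap_score(query, text), doc_id)
              for doc_id, text in corpus.items()]
    # Sort by descending score; break ties by doc id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def rerank(query, candidates, corpus, scorer):
    """Second stage: reorder only the candidates using a costlier scorer."""
    return sorted(candidates, key=lambda d: (-scorer(query, corpus[d]), d))


# Toy data (assumed for illustration, not from the track's collections).
corpus = {
    "d1": "deep learning deep learning deep learning tutorial",
    "d2": "trec deep learning track overview",
    "d3": "cooking recipes",
}
query = "trec deep learning track"

single_stage = first_stage(query, corpus, k=2)
multi_stage = rerank(query, single_stage, corpus, coverage_score)
print(single_stage)
print(multi_stage)
```

In this toy setup the first stage favors the document that repeats a few query terms many times, while the second stage promotes the document that covers all query terms. A real pipeline would replace both scorers with an actual retriever and a trained reranking model.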

Abstract

This is the third year of the TREC Deep Learning track. As in previous years, we leverage the MS MARCO datasets, which made hundreds of thousands of human-annotated training labels available for both the passage and document ranking tasks. In addition, this year we refreshed both the document and the passage collections, which led to a nearly fourfold increase in the size of the document collection and a nearly 16-fold increase in the size of the passage collection. Deep neural ranking models that employ large-scale pretraining continued to outperform traditional retrieval methods this year. We also found that single-stage retrieval can achieve good performance on both tasks, although it still does not perform on par with multistage retrieval pipelines. Finally, the increase in the collection size and the general data refresh raised some questions about completeness of NIST judgments and the…

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Computational and Text Analysis Methods