Overview of the TREC 2023 deep learning track

Nick Craswell; Bhaskar Mitra; Emine Yilmaz; Hossein A. Rahmani; Daniel Campos; Jimmy Lin; Ellen M. Voorhees; Ian Soboroff

arXiv:2507.08890 · cs.IR · July 15, 2025

TL;DR

The paper reviews the TREC 2023 Deep Learning track, highlighting the use of large language models (LLMs) for ranking, the creation of synthetic queries, and the finding that prompt-based methods outperformed previous approaches.

Contribution

It reports on the use of LLM prompting in ranking tasks, compares synthetic and human queries, and shows that prompt-based methods were effective in the TREC 2023 track.

Findings

01

Runs using LLM prompting outperformed runs using the "nnlm" approach.

02

Synthetic queries yielded system rankings similar to those from human queries.

03

No clear bias was observed: runs using GPT-4 were not clearly favored on GPT-4-generated queries, nor were runs using T5 favored on T5-generated queries.
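
To make the second finding concrete, agreement between two system rankings is often summarized with a rank correlation such as Kendall's tau. The sketch below assumes that measure and uses invented run names and scores; nothing in it is taken from the track's data or tooling. It counts concordant and discordant pairs of runs under the two query sets.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts over their shared systems.

    Tied pairs count as neither concordant nor discordant (no tie correction).
    """
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-run effectiveness scores, invented for illustration only.
human_scores = {"run_a": 0.62, "run_b": 0.55, "run_c": 0.48, "run_d": 0.40}
synthetic_scores = {"run_a": 0.60, "run_b": 0.50, "run_c": 0.52, "run_d": 0.37}

tau = kendall_tau(human_scores, synthetic_scores)
print(f"Kendall's tau = {tau:.3f}")
```

In this toy example only one pair of runs (run_b and run_c) swaps order between the two query sets, so five of the six pairs agree. A value near 1 would mean that the synthetic queries order systems almost the same way as the human queries.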

Abstract

This is the fifth year of the TREC Deep Learning track. As in previous years, we leverage the MS MARCO datasets, which made hundreds of thousands of human-annotated training labels available for both passage and document ranking tasks. We mostly repeated last year's design to get another matching test set, based on the larger, cleaner, less-biased v2 passage and document set, with passage ranking as the primary task and document ranking as a secondary task (using labels inferred from passages). As we did last year, we sample from MS MARCO queries that were completely held out and unused in corpus construction, unlike the test queries in the first three years. This approach yields a more difficult test with more headroom for improvement. Alongside the usual human queries from MS MARCO, this year we generated synthetic queries using a fine-tuned T5 model and using a GPT-4 prompt. The new…
