NeuCLIRBench: A Modern Evaluation Collection for Monolingual, Cross-Language, and Multilingual Information Retrieval
Dawn Lawrie, James Mayfield, Eugene Yang, Andrew Yates, Sean MacAvaney, Ronak Pradeep, Scott Miller, Paul McNamee, Luca Soldani

TL;DR
NeuCLIRBench is an evaluation collection for monolingual, cross-language, and multilingual information retrieval. It is built on documents written natively in Chinese, Persian, and Russian, plus machine translations of those documents into English, and it provides extensive relevance judgments for robust system comparison.
Contribution
The paper introduces a multilingual test collection that combines the TREC NeuCLIR track topics of 2022, 2023, and 2024, supports monolingual, cross-language, and multilingual retrieval scenarios, and includes a strong neural baseline against which retrieval systems can be compared.
Findings
Supports diverse retrieval scenarios including monolingual, cross-language, and multilingual tasks.
Contains over 250,000 relevance judgments across approximately 150 queries.
Includes a strong neural retrieval baseline for improved system evaluation.
Abstract
To measure advances in retrieval, test collections with relevance judgments that can faithfully distinguish systems are required. This paper presents NeuCLIRBench, an evaluation collection for cross-language and multilingual retrieval. The collection consists of documents written natively in Chinese, Persian, and Russian, as well as those same documents machine translated into English. The collection supports several retrieval scenarios including: monolingual retrieval in English, Chinese, Persian, or Russian; cross-language retrieval with English as the query language and one of the other three languages as the document language; and multilingual retrieval, again with English as the query language and relevant documents in all three languages. NeuCLIRBench combines the TREC NeuCLIR track topics of 2022, 2023, and 2024. The 250,128 judgments across approximately 150 queries for the…
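The retrieval scenarios listed in the abstract can be enumerated as pairs of query language and document language(s). The following is a minimal sketch of that enumeration only; the `Scenario` class, the `enumerate_scenarios` function, and the two-letter language codes are illustrative assumptions and are not identifiers taken from the collection itself.

```python
from dataclasses import dataclass

# Illustrative language labels (an assumption, not the collection's own IDs):
# zh = Chinese, fa = Persian, ru = Russian, en = English.
NATIVE_DOC_LANGS = ("zh", "fa", "ru")


@dataclass(frozen=True)
class Scenario:
    kind: str          # "monolingual", "cross-language", or "multilingual"
    query_lang: str    # language of the queries
    doc_langs: tuple   # language(s) of the documents searched


def enumerate_scenarios():
    """List the retrieval scenarios described in the abstract."""
    scenarios = []
    # Monolingual: queries and documents share one language. For English,
    # the documents would be the machine-translated versions.
    for lang in ("en",) + NATIVE_DOC_LANGS:
        scenarios.append(Scenario("monolingual", lang, (lang,)))
    # Cross-language: English queries against one non-English language.
    for lang in NATIVE_DOC_LANGS:
        scenarios.append(Scenario("cross-language", "en", (lang,)))
    # Multilingual: English queries, relevant documents in all three languages.
    scenarios.append(Scenario("multilingual", "en", NATIVE_DOC_LANGS))
    return scenarios


if __name__ == "__main__":
    for s in enumerate_scenarios():
        print(f"{s.kind}: {s.query_lang} -> {', '.join(s.doc_langs)}")
```

Under these assumptions the enumeration yields four monolingual, three cross-language, and one multilingual scenario.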
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Biomedical Text Mining and Ontologies
