Overview of the TREC 2023 NeuCLIR Track

Dawn Lawrie; Sean MacAvaney; James Mayfield; Paul McNamee; Douglas W. Oard; Luca Soldaini; Eugene Yang

arXiv:2404.08071 · cs.IR · April 15, 2024 · 2 cites

Open Access 5 Datasets

TL;DR

The TREC 2023 NeuCLIR track evaluates neural cross-language information retrieval methods across multiple languages and document types, introducing new tasks and collections to advance multilingual IR research.

Contribution

This paper presents the second year of the NeuCLIR track, introducing new collections, tasks, and baseline results to advance neural cross-language IR research.

Findings

01. 220 runs across all tasks, submitted by six participating teams and, as baselines, by the track coordinators

02. Effective neural approaches demonstrated across multiple languages

03. New pilot task for ranked retrieval of Chinese technical documents using English topics

Abstract

The principal goal of the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track is to study the impact of neural approaches to cross-language information retrieval. The track has created four collections: large collections of Chinese, Persian, and Russian newswire, and a smaller collection of Chinese scientific abstracts. The principal tasks are ranked retrieval of news in one of the three languages, using English topics. Results for a multilingual task, also with English topics but with documents from all three newswire collections, are also reported. New in this second year of the track is a pilot technical documents CLIR task for ranked retrieval of Chinese technical documents using English topics. A total of 220 runs across all tasks were submitted by six participating teams and, as baselines, by track coordinators. Task descriptions and results are presented.
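To make the multilingual task concrete, here is a minimal sketch of one way a system might combine per-language ranked lists into a single ranking over all three newswire collections. It is an illustration under stated assumptions, not the method used by the track or by any participant: min-max score normalization is just one common merging heuristic, and the function names, language codes, document IDs, and run tag are all hypothetical. The output lines follow the conventional six-column TREC run layout (topic, `Q0`, document ID, rank, score, run tag).

```python
def merge_rankings(per_language_runs, depth=1000):
    """Merge per-language ranked lists into one multilingual ranking.

    per_language_runs maps a language code to a list of (doc_id, score)
    pairs. Scores are min-max normalized within each language so that
    lists produced by different systems become roughly comparable.
    This normalization is an assumption chosen for illustration.
    """
    merged = []
    for lang, ranked in per_language_runs.items():
        if not ranked:
            continue
        scores = [score for _, score in ranked]
        lo, hi = min(scores), max(scores)
        span = hi - lo
        for doc_id, score in ranked:
            # A single-score list has no spread; map it to 0.0.
            norm = (score - lo) / span if span > 0 else 0.0
            merged.append((doc_id, norm, lang))
    # Highest normalized score first; ties broken by document ID.
    merged.sort(key=lambda item: (-item[1], item[0]))
    return merged[:depth]


def to_trec_run(topic_id, merged, tag="sketch"):
    """Format a merged ranking as six-column TREC run lines."""
    return [
        f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {tag}"
        for rank, (doc_id, score, _lang) in enumerate(merged, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical scores for one English topic against three collections.
    runs = {
        "zho": [("z1", 10.0), ("z2", 5.0)],
        "fas": [("f1", 0.9), ("f2", 0.1)],
        "rus": [("r1", 3.0)],
    }
    for line in to_trec_run("200", merge_rankings(runs)):
        print(line)
```

Normalizing within each language keeps a collection whose retrieval scores happen to lie on a larger scale from dominating the merged list; other merging strategies (for example, round-robin interleaving) would be equally valid choices for a sketch like this.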

Taxonomy

Topics: Educational Technology and Assessment · Topic Modeling