Naver Labs Europe (SPLADE) @ TREC NeuCLIR 2022

Carlos Lassance; Stéphane Clinchant

arXiv:2303.11171·cs.IR·March 21, 2023·1 cites

Open Access

TL;DR

This paper details Naver Labs Europe's participation in the 2022 TREC NeuCLIR challenge, comparing monolingual and Adhoc retrieval strategies across Farsi and Russian, highlighting the effectiveness of back-translation of documents.

Contribution

The paper presents a monolingual approach to the NeuCLIR retrieval task (pretraining on the target language, then fine-tuning on MSMARCO translated to that language) and compares it with translation-based Adhoc strategies.

Findings

01

Initial result analysis shows that the monolingual strategy is strong.

02

Back-translation of documents outperforms query translation.

03

The Adhoc approach achieved the best overall results.

Abstract

This paper describes our participation in the 2022 TREC NeuCLIR challenge. We submitted runs for two of the three languages (Farsi and Russian), focusing on first-stage rankers and comparing monolingual strategies to Adhoc ones. For monolingual runs, we start by pretraining models on the target language using MLM+FLOPS and then fine-tune them on MSMARCO translated to that language, with either ColBERT or SPLADE as the retrieval model. For the Adhoc task, we test both query translation (to the target language) and back-translation of the documents (to English). Initial result analysis shows that the monolingual strategy is strong, but that for the moment Adhoc achieved the best results, with back-translating documents being better than translating queries.
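
The abstract names SPLADE and the FLOPS regularizer without defining them. The following is a minimal sketch of the formulation commonly given in the SPLADE literature, not the authors' code: each vocabulary term receives the maximum over input tokens of log(1 + ReLU(logit)), and the FLOPS penalty sums the squared mean activation of each term across a batch. The function names `splade_pool` and `flops_penalty` and the toy logits are hypothetical, and whether the paper uses exactly this pooling variant is an assumption.

```python
import math


def splade_pool(token_logits):
    """Collapse per-token vocabulary logits into one sparse weight per term.

    token_logits: list of rows, one per input token, each of vocabulary size.
    Assumed variant: max over tokens of log(1 + ReLU(logit)).
    """
    vocab_size = len(token_logits[0])
    return [
        max(math.log1p(max(row[j], 0.0)) for row in token_logits)
        for j in range(vocab_size)
    ]


def flops_penalty(batch_weights):
    """FLOPS regularizer: sum over terms of the squared batch-mean weight.

    A term that is active in many representations is penalized more than
    one that is active in few, which pushes the model toward sparsity.
    """
    n = len(batch_weights)
    vocab_size = len(batch_weights[0])
    return sum(
        (sum(rep[j] for rep in batch_weights) / n) ** 2
        for j in range(vocab_size)
    )


# Toy example with a 3-term vocabulary (illustrative values only).
doc_a = splade_pool([[2.0, -1.0, 0.0], [0.5, -3.0, 0.0]])
doc_b = splade_pool([[-1.0, -1.0, 1.0]])
penalty = flops_penalty([doc_a, doc_b])
```

In training, a penalty of this kind would be added to the ranking loss with a weighting coefficient; the coefficient and the point at which it is applied are not stated in the abstract.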

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Test