Learning Cross-Lingual IR from an English Retriever
Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-Suk Lee, Avirup Sil

TL;DR
This paper introduces DR.DECR, a cross-lingual IR system trained via knowledge distillation from an English retriever, achieving high accuracy in multilingual retrieval tasks without extensive labeled data.
Contribution
The paper proposes a novel KD-based training method for cross-lingual IR that leverages multilingual encoders and a cross-lingual token alignment algorithm, outperforming traditional fine-tuning.
Findings
DR.DECR outperforms direct fine-tuning in accuracy.
Achieves state-of-the-art results on the XOR-TyDi benchmark.
Remains effective in zero-shot out-of-domain evaluation.
Abstract
We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD). The teacher of DR.DECR relies on a highly effective but computationally expensive two-stage inference process consisting of query translation and monolingual IR, while the student, DR.DECR, executes a single CLIR step. We teach DR.DECR powerful multilingual representations as well as CLIR by optimizing two corresponding KD objectives. Learning useful representations of non-English text from an English-only retriever is accomplished through a cross-lingual token alignment algorithm that relies on the representation capabilities of the underlying multilingual encoders. In both in-domain and zero-shot out-of-domain evaluation, DR.DECR demonstrates far superior accuracy over direct…
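To make the two KD objectives mentioned in the abstract more concrete, here is a minimal sketch of what a retrieval-level and a representation-level distillation loss could look like. It is not the authors' implementation: the function names, the choice of KL divergence over passage scores, the mean-squared-error term over token vectors, and the assumption that teacher and student token vectors arrive already aligned are all illustrative assumptions.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw relevance scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def retrieval_kd_loss(teacher_scores, student_scores, temperature=1.0):
    """Hypothetical retrieval objective: KL(teacher || student).

    Both lists score the same candidate passages for one query.
    The teacher would score them from a translated English query,
    the student from the original non-English query (assumption).
    """
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def representation_kd_loss(teacher_vecs, student_vecs):
    """Hypothetical representation objective: mean squared error.

    Assumes the i-th student token vector has already been paired
    with the i-th teacher token vector by some token alignment step.
    """
    total, count = 0.0, 0
    for t_vec, s_vec in zip(teacher_vecs, student_vecs):
        for a, b in zip(t_vec, s_vec):
            total += (a - b) ** 2
            count += 1
    return total / count


# Illustrative values only, not data from the paper.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 1.0]
print(retrieval_kd_loss(teacher, student))
print(representation_kd_loss([[1.0, 2.0]], [[1.0, 4.0]]))
```

In a sketch like this, the retrieval loss reaches zero when the student ranks passages with the same score distribution as the teacher, and the representation loss reaches zero when aligned token vectors coincide.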
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms
Methods: Knowledge Distillation
