Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models
Suraj Nair, Eugene Yang, Dawn Lawrie, Kevin Duh, Paul McNamee, Kenton Murray, James Mayfield, Douglas W. Oard

TL;DR
This paper presents ColBERT-X, a cross-language dense retrieval model built on the XLM-R encoder and trained with either a zero-shot or a translate-train method; it significantly outperforms traditional lexical baselines on cross-language document ranking tasks.
Contribution
It introduces ColBERT-X, which generalizes the ColBERT multi-representation dense retrieval model to cross-language information retrieval (CLIR) by using the XLM-R encoder, and describes two strategies for training it: zero-shot and translate-train.
Findings
Significant improvements over lexical CLIR baselines.
Effective zero-shot and translate-train training methods.
Statistically significant results across multiple languages.
Abstract
The advent of transformer-based models such as BERT has led to the rise of neural ranking models. These models have improved the effectiveness of retrieval systems well beyond that of lexical term matching models such as BM25. While monolingual retrieval tasks have benefited from large-scale training collections such as MS MARCO and advances in neural architectures, cross-language retrieval tasks have fallen behind these advancements. This paper introduces ColBERT-X, a generalization of the ColBERT multi-representation dense retrieval model that uses the XLM-RoBERTa (XLM-R) encoder to support cross-language information retrieval (CLIR). ColBERT-X can be trained in two ways. In zero-shot training, the system is trained on the English MS MARCO collection, relying on the XLM-R encoder for cross-language mappings. In translate-train, the system is trained on the MS MARCO English queries…
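The multi-representation scoring that ColBERT-style models are known for can be illustrated with a small sketch. This is not the authors' implementation: it assumes the per-token query and document vectors have already been produced by some encoder (XLM-R in the case of ColBERT-X), and the names `normalize`, `maxsim_score`, and `rank`, as well as the toy two-dimensional vectors, are hypothetical. Each query token vector is matched to its most similar document token vector by cosine similarity, and those maxima are summed to give the document score.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best-matching
    document token similarity. Inputs are lists of per-token vectors
    (hypothetical stand-ins for encoder outputs)."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        # Best cosine similarity between this query token and any document token.
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def rank(query_vecs, docs):
    """Return document ids sorted by descending late-interaction score."""
    scores = {doc_id: maxsim_score(query_vecs, vecs) for doc_id, vecs in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy example with made-up 2-dimensional token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "doc_b": [[1.0, 1.0]],
}
ranking = rank(query, docs)
print(ranking)
```

In this toy setup `doc_a` contains an exact match for each query token, so it ranks above `doc_b`, whose single token vector only partially aligns with either query token. In a cross-language setting the query and document vectors would come from text in different languages, which is why the shared multilingual encoder matters.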
Code & Models
- 🤗 eugene-yang/colbertx-xlmr-large-tt-eng.zho · model · 1 download
- 🤗 eugene-yang/colbertx-xlmr-large-tt-eng.fas · model · 1 download
- 🤗 eugene-yang/colbertx-xlmr-large-tt-eng.rus · model · 1 download
- 🤗 hltcoe/plaidx-large-zho-tdist-mt5xxl-engzho · model · 4 downloads
- 🤗 hltcoe/plaidx-large-zho-tdist-mt5xxl-zhozho · model · 1 download
- 🤗 hltcoe/plaidx-large-zho-tdist-t53b-engeng · model · 2 downloads
- 🤗 hltcoe/plaidx-large-fas-tdist-mt5xxl-engfas · model · 2 downloads
- 🤗 hltcoe/plaidx-large-fas-tdist-mt5xxl-fasfas · model · 1 download
- 🤗 hltcoe/plaidx-large-fas-tdist-t53b-engeng · model · 2 downloads
- 🤗 hltcoe/plaidx-large-rus-tdist-mt5xxl-engrus · model · 1 download
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Attention Is All You Need · XLM-R · Linear Layer · Layer Normalization · Dense Connections · Linear Warmup With Linear Decay · Softmax · Multi-Head Attention · Weight Decay · Adam
