Cross-Lingual Relevance Transfer for Document Retrieval
Peng Shi, Jimmy Lin

TL;DR
This paper demonstrates that multilingual BERT can effectively transfer relevance models across languages, improving document retrieval performance in multiple languages without additional processing.
Contribution
It shows that relevance models built on multilingual BERT and trained with English data improve both non-English monolingual retrieval and cross-lingual retrieval across diverse languages, relying on the model's zero-shot transfer ability.
Findings
Improved ranking quality in five diverse languages.
Effective zero-shot cross-lingual relevance transfer.
No special processing needed for multilingual retrieval.
Abstract
Recent work has shown the surprising ability of multi-lingual BERT to serve as a zero-shot cross-lingual transfer model for a number of language processing tasks. We combine this finding with a similarly recent proposal on sentence-level relevance modeling for document retrieval to demonstrate the ability of multi-lingual BERT to transfer models of relevance across languages. Experiments on test collections in five different languages from diverse language families (Chinese, Arabic, French, Hindi, and Bengali) show that models trained with English data improve ranking quality, without any special processing, both for (non-English) mono-lingual retrieval and for cross-lingual retrieval.
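The abstract refers to sentence-level relevance modeling for document retrieval but does not spell out how sentence evidence is combined into a document ranking. The sketch below is one plausible reading, not the authors' formula: each candidate document carries a first-stage retrieval score and a list of per-sentence relevance scores (in practice these would come from a multi-lingual BERT classifier trained with English data; here they are placeholder numbers). The function names, the `alpha` interpolation weight, the `top_k` cutoff, and the use of a mean over the top sentences are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def combine_scores(
    doc_score: float,
    sentence_scores: Sequence[float],
    alpha: float = 0.5,   # assumed interpolation weight, not from the paper
    top_k: int = 3,       # assumed number of top sentences to use
) -> float:
    """Interpolate a document score with the mean of its top-k sentence scores."""
    if not sentence_scores:
        # No sentence evidence available: fall back to the document score.
        return doc_score
    top = sorted(sentence_scores, reverse=True)[:top_k]
    sentence_evidence = sum(top) / len(top)
    return alpha * doc_score + (1.0 - alpha) * sentence_evidence


def rerank(
    candidates: Sequence[Tuple[str, float, Sequence[float]]],
    alpha: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rerank (doc_id, doc_score, sentence_scores) triples by combined score."""
    scored = [
        (doc_id, combine_scores(doc_score, sent_scores, alpha, top_k))
        for doc_id, doc_score, sent_scores in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates: the second document has a lower first-stage score
# but contains sentences the relevance model rates highly.
candidates = [
    ("d1", 0.9, [0.1, 0.2]),
    ("d2", 0.6, [0.95, 0.9, 0.1]),
]
ranking = rerank(candidates)
```

With these placeholder numbers, strong sentence-level evidence lifts `d2` above `d1` even though its first-stage score is lower, which is the effect sentence-level reranking is meant to capture.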
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Test · Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece
