XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
Peiqin Lin, André F. T. Martins, Hinrich Schütze

TL;DR
XAMPLER is a novel cross-lingual example retrieval method that enhances few-shot in-context learning for multiple languages using only English data, leveraging multilingual models to improve performance on diverse benchmarks.
Contribution
Introduces XAMPLER, a cross-lingual retrieval approach that uses English annotations and multilingual models to improve in-context learning across many languages.
Findings
Significant performance gains on multilingual benchmarks.
Effective retrieval of cross-lingual examples using only English data.
Demonstrates scalability across 176 languages.
Abstract
Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of English. However, adapting these methods to other languages, especially low-resource ones, poses challenges due to the scarcity of cross-lingual retrievers and annotated data. Thus, we introduce XAMPLER: Cross-Lingual Example Retrieval, a method tailored to tackle the challenge of cross-lingual in-context learning using only annotated English data. XAMPLER first trains a retriever based on Glot500, a multilingual small language model, using positive and negative English examples constructed from the predictions of a multilingual large language model, i.e., MaLA500. Leveraging the cross-lingual capacity of the retriever, it can directly retrieve English examples as few-shot examples for…
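The two steps named in the abstract (building positive and negative English examples from LLM predictions, then retrieving English examples for a query in another language) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the hand-written 2-d vectors stand in for embeddings from a Glot500-based retriever, `toy_predict` stands in for a call to MaLA500, and the labeling rule (a candidate is positive if the LLM answers correctly when shown that candidate as its demonstration) is one plausible reading of the abstract, not a confirmed detail of the method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_candidates(query, gold, candidates, predict):
    """Split candidates into positives and negatives.

    Assumed rule: a candidate is positive when the LLM, shown that
    candidate as a one-shot demonstration, predicts the gold label.
    """
    positives, negatives = [], []
    for cand in candidates:
        if predict(cand, query) == gold:
            positives.append(cand)
        else:
            negatives.append(cand)
    return positives, negatives


def retrieve(query_vec, pool, k):
    """Return the k pool examples most similar to the query embedding."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]


# Hypothetical English pool; "vec" values are toy stand-ins for retriever embeddings.
pool = [
    {"text": "The team won the final.", "label": "sports", "vec": [0.9, 0.1]},
    {"text": "Shares fell sharply today.", "label": "business", "vec": [0.1, 0.9]},
    {"text": "The striker scored twice.", "label": "sports", "vec": [0.8, 0.3]},
]


def toy_predict(demo, query):
    """Stand-in for the LLM: simply copies the demonstration's label."""
    return demo["label"]


# Step 1: build training signal for the retriever from LLM predictions.
positives, negatives = label_candidates(
    "The coach praised the players.", "sports", pool, toy_predict
)

# Step 2: at inference, embed a (possibly non-English) query and pull
# the nearest English examples to use as few-shot demonstrations.
top2 = retrieve([1.0, 0.2], pool, k=2)
```

In a real setup the positives and negatives would train the retriever (for example with a contrastive objective), and the query vector would come from encoding target-language text with the same multilingual encoder, which is what allows English examples to be retrieved directly.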
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
