Nearest Neighbour Few-Shot Learning for Cross-lingual Classification
M Saiful Bari, Batool Haider, Saab Mansour

TL;DR
This paper introduces a simple nearest neighbor few-shot inference method for cross-lingual classification, which improves performance over traditional fine-tuning in low-resource scenarios across multiple languages and tasks.
Contribution
It proposes a novel nearest neighbor few-shot approach for cross-lingual NLP tasks, demonstrating consistent improvements over fine-tuning with limited labeled data.
Findings
Improves classification accuracy over traditional fine-tuning using fewer than 15 labeled samples per target language.
Effective across 16 languages and two NLP tasks (XNLI and PAWS-X).
Generalizes across tasks as well as across languages.
Abstract
Even though large pre-trained multilingual models (e.g. mBERT, XLM-R) have led to significant performance gains on a wide range of cross-lingual NLP tasks, success on many downstream tasks still relies on the availability of sufficient annotated data. Traditional fine-tuning of pre-trained models using only a few target samples can cause over-fitting. This can be quite limiting, as most languages in the world are under-resourced. In this work, we investigate cross-lingual adaptation using a simple nearest neighbor few-shot (<15 samples) inference technique for classification tasks. We experiment using a total of 16 distinct languages across two NLP tasks: XNLI and PAWS-X. Our approach consistently improves over traditional fine-tuning using only a handful of labeled samples in target locales. We also demonstrate its generalization capability across tasks.
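The general idea of nearest neighbor few-shot inference can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance metric, the encoder layer used, or any normalization, so cosine similarity over precomputed sentence embeddings is an assumption here, and the function names and toy vectors are hypothetical. In practice the embeddings would come from a multilingual encoder such as mBERT or XLM-R.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def nearest_neighbor_predict(support_emb, support_labels, query_emb):
    """Give each query the label of its most similar support example.

    Assumption: cosine similarity is the metric; the source does not state one.
    """
    support = l2_normalize(np.asarray(support_emb, dtype=float))
    query = l2_normalize(np.asarray(query_emb, dtype=float))
    sims = query @ support.T          # shape: (n_query, n_support)
    nearest = sims.argmax(axis=1)     # index of the closest support example
    return np.asarray(support_labels)[nearest]


# Toy stand-ins for encoder embeddings of a few labeled target-language samples.
support_emb = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]])
support_labels = np.array([0, 0, 1, 1])

# Unlabeled target-language inputs to classify.
query_emb = np.array([[2.0, 0.3], [0.2, 3.0]])

predictions = nearest_neighbor_predict(support_emb, support_labels, query_emb)
print(predictions)
```

Because no gradient updates are made on the handful of target samples, a scheme like this avoids the over-fitting risk the abstract attributes to few-shot fine-tuning.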
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Speech Recognition and Synthesis
Methods: mBERT
