Analysing Cross-Lingual Transfer in Low-Resourced African Named Entity Recognition
Michael Beukman, Manuel Fokam

TL;DR
This paper examines how cross-lingual transfer learning performs for named entity recognition across ten low-resourced African languages, focusing on how adaptive fine-tuning, the choice of transfer language, and data overlap affect zero-shot transfer performance.
Contribution
It analyses the properties of cross-lingual transfer in low-resourced African NER and shows that data overlap between source and target datasets predicts transfer performance better than geographical or genetic distance.
Findings
Models with high single-language performance often lack cross-lingual generalization.
Models with better cross-lingual generalization tend to perform worse on individual languages.
Data overlap between source and target datasets predicts transfer performance better than geographical or genetic distance.
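
To make the data-overlap finding concrete, here is a minimal sketch of one way such an overlap could be measured. This summary does not give the paper's exact definition, so the measure below (the fraction of distinct target entity tokens that also occur as entity tokens in the source) is an assumption, and the names `entity_tokens` and `overlap_fraction`, as well as the toy sentences, are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# A sentence is a list of (token, tag) pairs with BIO-style tags,
# e.g. ("Nairobi", "B-LOC"). The tag "O" marks non-entity tokens.
Sentence = List[Tuple[str, str]]


def entity_tokens(sentences: Iterable[Sentence]) -> Set[str]:
    """Collect the lowercased tokens that carry a non-"O" tag."""
    return {tok.lower() for sent in sentences for tok, tag in sent if tag != "O"}


def overlap_fraction(source: Iterable[Sentence], target: Iterable[Sentence]) -> float:
    """Assumed overlap measure: share of distinct target entity tokens
    that also appear as entity tokens in the source dataset."""
    src = entity_tokens(source)
    tgt = entity_tokens(target)
    if not tgt:
        return 0.0  # no target entities, so there is nothing to overlap with
    return len(src & tgt) / len(tgt)


# Toy data for illustration only, not taken from the paper.
source = [[("Kofi", "B-PER"), ("visited", "O"), ("Lagos", "B-LOC")]]
target = [[("Lagos", "B-LOC"), ("and", "O"), ("Nairobi", "B-LOC")]]

print(overlap_fraction(source, target))
```

A score like this could be computed for every source and target language pair and then compared against zero-shot transfer scores. Other definitions (all tokens rather than entity tokens, or full entity spans) are equally plausible and may rank language pairs differently.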
Abstract
Transfer learning has led to large gains in performance for nearly all NLP tasks while making downstream models easier and faster to train. This has also been extended to low-resourced languages, with some success. We investigate the properties of cross-lingual transfer learning between ten low-resourced languages, from the perspective of a named entity recognition task. We specifically investigate how much adaptive fine-tuning and the choice of transfer language affect zero-shot transfer performance. We find that models that perform well on a single language often do so at the expense of generalising to others, while models with the best generalisation to other languages suffer in individual language performance. Furthermore, the amount of data overlap between the source and target datasets is a better predictor of transfer performance than either the geographical or genetic distance…
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques
