Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology
Ajitesh Bankula, Praney Bankula

TL;DR
This paper explores how linguistic relationships like language families and morphology influence cross-lingual transfer in multilingual NLP models, analyzing performance patterns and potential improvements through typological integration.
Contribution
It provides a comprehensive analysis of the impact of linguistic proximity and morphology on transfer performance and reviews methods incorporating typological data into pre-training.
Findings
Language family proximity correlates with transfer success.
Morphological similarity influences model performance.
Integrating typological information can improve transfer to low-resource languages.
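As an illustration of what "correlates with transfer success" can mean in practice, the following is a minimal sketch of a Spearman rank correlation between a linguistic distance measure and a transfer score. The function names (`ranks`, `pearson`, `spearman`) and the numbers are hypothetical placeholders chosen for the example. They are not data, metrics, or code from the paper, which may use a different distance measure or statistic.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, NOT results from the paper:
# distance of each target language from the source language, and
# the zero-shot score a model obtained on that target language.
distance = [0.1, 0.3, 0.5, 0.7, 0.9]
score = [0.9, 0.8, 0.6, 0.5, 0.2]

rho = spearman(distance, score)
print(f"Spearman rho = {rho:.2f}")
```

A strongly negative value would indicate that transfer scores fall as linguistic distance grows. A rank-based statistic is used in this sketch because it only assumes a monotone relationship, not a linear one.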
Abstract
Cross-lingual transfer has become a crucial part of multilingual NLP because it allows models trained on resource-rich languages to be applied more effectively to low-resource languages. Recent massively multilingual pre-trained language models (e.g., mBERT, XLM-R) demonstrate strong zero-shot transfer capabilities [14][13]. This paper investigates cross-linguistic transfer through the lens of language families and morphology, examining how language family proximity and morphological similarity affect performance across NLP tasks. We discuss our results and how they relate to findings from recent literature. Overall, we compare multilingual model performance and review how linguistic distance metrics correlate with transfer outcomes. We also look into emerging approaches that integrate typological and morphological information into model pre-training to improve transfer…
Taxonomy
Topics: Natural Language Processing Techniques
Methods: mBERT
