The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer
Pavel Efimov, Leonid Boytsov, Elena Arslanova, Pavel Braslavski

TL;DR
This paper investigates how cross-lingual adjustment of multilingual models affects zero-shot transfer across diverse languages and tasks, revealing benefits and limitations of fine-tuning and continual learning strategies.
Contribution
It extends previous methods to more diverse languages and tasks, analyzing the effects of fine-tuning and continual learning on cross-lingual transfer performance.
Findings
Reproduced NLI gains in four languages
Improved NER, XSR, and cross-lingual QA results in three languages
Fine-tuning can cause loss of cross-lingual alignment information
Abstract
Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. (2020) proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast, we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementations to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages and showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved…
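To make the idea of cross-lingual adjustment concrete, the sketch below shows one plausible form of the objective: pull the contextual vectors of aligned word pairs together while keeping source-language vectors near their pre-adjustment values. It is an illustration only, not the authors' implementation. The squared-L2 distance, the regularization term, the function names, and the `reg_weight` parameter are all assumptions introduced here, and plain Python lists stand in for the embeddings a model such as mBERT would produce.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment_loss(
    src_vecs: List[Vector],
    tgt_vecs: List[Vector],
    aligned_pairs: List[Tuple[int, int]],
) -> float:
    """Sum of squared distances over aligned (source index, target index) pairs.

    Assumption: word alignments come from an external aligner run on the
    parallel corpus; here they are simply given as index pairs.
    """
    return sum(squared_l2(src_vecs[i], tgt_vecs[j]) for i, j in aligned_pairs)


def regularization_loss(
    current_vecs: List[Vector], initial_vecs: List[Vector]
) -> float:
    """Penalty for drifting away from the pre-adjustment representations."""
    return sum(squared_l2(c, p) for c, p in zip(current_vecs, initial_vecs))


def adjustment_objective(
    src_vecs: List[Vector],
    tgt_vecs: List[Vector],
    aligned_pairs: List[Tuple[int, int]],
    src_initial_vecs: List[Vector],
    reg_weight: float = 1.0,  # hypothetical trade-off weight
) -> float:
    """Alignment term plus weighted regularization term (assumed form)."""
    return alignment_loss(src_vecs, tgt_vecs, aligned_pairs) + (
        reg_weight * regularization_loss(src_vecs, src_initial_vecs)
    )


# Toy 2-dimensional example with two aligned word pairs.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[1.0, 1.0], [0.0, 1.0]]
pairs = [(0, 0), (1, 1)]
src_initial = [[1.0, 0.0], [0.0, 0.0]]

objective = adjustment_objective(src, tgt, pairs, src_initial, reg_weight=0.5)
```

In an actual adjustment run, the vectors would be recomputed by the model at every step and this quantity would be minimized by gradient descent over the model parameters; the toy example only evaluates the objective once.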
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
Methods: XLM-R · mBERT
