Cross-lingual Transfer of Monolingual Models
Evangelia Gogoulou, Ariel Ekgren, Tim Isbister, Magnus Sahlgren

TL;DR
This paper proposes a cross-lingual transfer method for monolingual models based on domain adaptation. Models transferred to English from four source languages outperform a native English model on GLUE, and probing shows which linguistic knowledge is retained from the source language and which is learned during transfer.
Contribution
It introduces a cross-lingual transfer approach for monolingual models based on domain adaptation, motivated by recent evidence that shared vocabulary and joint pre-training are not the keys to cross-lingual generalization.
Findings
On GLUE, the transferred models outperform the native English model regardless of source language.
Semantic knowledge is retained after transfer, while syntactic knowledge is learned during transfer.
Performance in source language tasks deteriorates after transfer.
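The probing finding above relies on training a simple classifier on frozen representations and reading its accuracy as a measure of how much of a property those representations encode. This summary does not describe the paper's probing setup, so the following is only a generic sketch of a linear probe in plain Python. The synthetic vectors stand in for sentence embeddings, and every name (`train_linear_probe`, `probe_accuracy`, `toy_representations`, the `signal` parameter) is hypothetical.

```python
import math
import random


def train_linear_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe on frozen feature vectors."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() stable
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Full-batch gradient step on the mean log-loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of examples the probe classifies correctly."""
    correct = 0
    for x, y in zip(features, labels):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((z > 0) == (y == 1))
    return correct / len(labels)


def toy_representations(n, signal, seed):
    """Hypothetical stand-in for frozen embeddings (an assumption, not real data).

    The first dimension carries the label with strength `signal`;
    the remaining three dimensions are pure noise.
    """
    rng = random.Random(seed)
    feats, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [signal * (2 * y - 1) + rng.gauss(0, 1)]
        x += [rng.gauss(0, 1) for _ in range(3)]
        feats.append(x)
        labels.append(y)
    return feats, labels


# Compare representations that encode the property with ones that do not.
for name, signal in [("encodes property", 3.0), ("no information", 0.0)]:
    train_x, train_y = toy_representations(200, signal, seed=0)
    test_x, test_y = toy_representations(200, signal, seed=1)
    w, b = train_linear_probe(train_x, train_y)
    print(name, probe_accuracy(w, b, test_x, test_y))
```

In a before/after comparison of the kind described above, the same probe would be trained once on representations from the model before transfer and once on those from the model after transfer, and the held-out accuracies compared.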
Abstract
Recent studies in zero-shot cross-lingual learning using multilingual models have falsified the previous hypothesis that shared vocabulary and joint pre-training are the keys to cross-lingual generalization. Inspired by this advancement, we introduce a cross-lingual transfer method for monolingual models based on domain adaptation. We study the effects of such transfer from four different languages to English. Our experimental results on GLUE show that the transferred models outperform the native English model independently of the source language. After probing the English linguistic knowledge encoded in the representations before and after transfer, we find that semantic information is retained from the source language, while syntactic information is learned during transfer. Additionally, the results of evaluating the transferred models in source language tasks reveal that their…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
