Analyzing Zero-shot Cross-lingual Transfer in Supervised NLP Tasks
Hyunjin Choi, Judong Kim, Seongho Joe, Seungjai Min, Youngjune Gwon

TL;DR
This paper investigates the effectiveness of XLM-RoBERTa in enabling zero-shot cross-lingual transfer across various NLP tasks, revealing that transfer strength varies with task complexity and is most evident in semantic similarity.
Contribution
It empirically validates the cross-lingual transfer capabilities of XLM-RoBERTa across multiple NLP tasks and analyzes how task complexity affects transfer effectiveness.
Findings
Cross-lingual transfer is strongest in semantic textual similarity.
Sentiment analysis shows moderate transfer effectiveness.
Machine reading comprehension exhibits the least transfer.
Abstract
In zero-shot cross-lingual transfer, a supervised NLP task trained on a corpus in one language is directly applicable to another language without any additional training. A source of cross-lingual transfer can be as straightforward as lexical overlap between languages (e.g., use of the same scripts, shared subwords) that naturally forces text embeddings to occupy a similar representation space. Recently introduced cross-lingual language model (XLM) pretraining brings out neural parameter sharing in Transformer-style networks as the most important factor for the transfer. In this paper, we aim to validate the hypothetically strong cross-lingual transfer properties induced by XLM pretraining. Particularly, we take XLM-RoBERTa (XLMR) in our experiments that extend semantic textual similarity (STS), SQuAD and KorQuAD for machine reading comprehension, sentiment analysis, and alignment of…
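The lexical-overlap idea mentioned in the abstract (shared scripts and shared subwords) can be illustrated with a small sketch. This is not the paper's method or XLM-R's tokenizer: character trigrams stand in for learned subwords here, and the function names `char_ngrams` and `jaccard_overlap` are hypothetical helpers introduced only for illustration.

```python
def char_ngrams(text, n=3):
    """Collect character n-grams per whitespace-separated word.

    Assumption: character n-grams are a rough proxy for shared subwords;
    a real model such as XLM-R uses a learned subword vocabulary instead.
    """
    grams = set()
    for word in text.lower().split():
        if len(word) < n:
            # Words shorter than n are kept whole so they still count.
            grams.add(word)
        else:
            for i in range(len(word) - n + 1):
                grams.add(word[i:i + n])
    return grams


def jaccard_overlap(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two texts."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


if __name__ == "__main__":
    # Same script, cognate words: some overlap is expected.
    print(jaccard_overlap("information", "información"))
    # Different scripts: no surface overlap at all.
    print(jaccard_overlap("information", "정보"))
```

Under this proxy, English and Spanish cognates share several trigrams, while an English and Korean pair shares none. That gap is why surface overlap alone cannot account for transfer between languages with different scripts, and why the abstract turns to parameter sharing in the pretrained network as the candidate explanation.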
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Linear Layer · Layer Normalization · Residual Connection · Attention Dropout · Attention Is All You Need · Byte Pair Encoding · Dense Connections · Adam · Dropout
