Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

Kuan-Hao Huang; Wasi Uddin Ahmad; Nanyun Peng; Kai-Wei Chang

arXiv:2104.08645·cs.CL·September 13, 2021

TL;DR

This paper introduces a robust training approach using adversarial training and randomized smoothing to enhance zero-shot cross-lingual transfer in multilingual models, especially for low-resource languages, without relying on costly parallel corpora.

Contribution

It proposes a novel robust training strategy that improves cross-lingual transfer by making models tolerant to embedding noise, reducing dependence on language alignment data.

Findings

01

Robust training significantly improves zero-shot transfer performance.

02

Enhanced results in generalized cross-lingual transfer with mixed-language inputs.

03

Robust methods outperform standard fine-tuning in low-resource scenarios.

Abstract

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these multilingual encoders do not precisely align words and phrases across languages. In particular, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to obtain for low-resource languages. An alternative is to make the multilingual encoders more robust: when fine-tuning the encoder on a downstream task, we train the encoder to tolerate noise in the contextual embedding space, so that even if the representations of different languages are not well aligned, the model can still achieve good performance on zero-shot cross-lingual transfer. In this work, we propose a learning strategy for training robust models by drawing connections between…
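
The noise-tolerance idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the linear classifier, the Gaussian noise model, and the values of `sigma` and `n_samples` are all assumptions chosen for illustration. The sketch perturbs an embedding vector with random noise and takes a majority vote over the noisy copies, which is one common way to describe randomized smoothing.

```python
import random


def perturb(embedding, sigma, rng):
    """Return a copy of `embedding` with i.i.d. Gaussian noise added per coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def linear_classifier(embedding, weights, bias):
    """Toy binary classifier (hypothetical): label 1 if w.x + b > 0, else 0."""
    score = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1 if score > 0 else 0


def smoothed_predict(embedding, weights, bias, sigma=0.1, n_samples=200, seed=0):
    """Majority vote of the toy classifier over noisy copies of the embedding."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    votes = [0, 0]
    for _ in range(n_samples):
        noisy = perturb(embedding, sigma, rng)
        votes[linear_classifier(noisy, weights, bias)] += 1
    return 0 if votes[0] >= votes[1] else 1


if __name__ == "__main__":
    weights, bias = [1.0, 1.0], 0.0
    # An embedding far from the decision boundary keeps its label under small noise.
    print(smoothed_predict([1.0, 2.0], weights, bias))
    print(smoothed_predict([-1.0, -2.0], weights, bias))
```

In a real fine-tuning setup the same kind of perturbation would presumably be applied to the encoder's embeddings during training, so that the task classifier learns to give stable predictions in a neighborhood of each input rather than only at the exact point.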

Code & Models

Repositories

uclanlp/robust-xlt
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: XLM-R · Linear Layer · Dropout · Adam · Dense Connections · Attention Is All You Need · Softmax · Linear Warmup With Linear Decay · WordPiece