Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer

Fei Wang; Kuan-Hao Huang; Kai-Wei Chang; Muhao Chen

arXiv:2309.10891·cs.CL·September 21, 2023·1 cites

Open Access

TL;DR

The paper introduces SALT, a simple self-augmentation method using code-switching and embedding mixup to enhance zero-shot cross-lingual transfer in multilingual models without external data.

Contribution

The paper presents SALT, a novel self-augmentation technique that improves the cross-lingual transferability of multilingual models without relying on external alignment resources.

Findings

01. Improves zero-shot transfer on XNLI and PAWS-X datasets.

02. Enhances transferability without external data.

03. Effective distillation of cross-lingual knowledge.

Abstract

Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained on languages with ample training resources to generalize to low-resource languages. Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data to improve cross-lingual transferability, but such data are typically expensive to obtain. In this paper, we propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer of multilingual pretrained language models (PLMs) without the help of such external data. By incorporating code-switching and embedding mixup with self-augmentation, SALT effectively distills cross-lingual knowledge from the multilingual PLM and enhances its transferability on downstream tasks. Experimental results on XNLI and PAWS-X show that our method is able to improve zero-shot cross-lingual…
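The abstract names embedding mixup as one ingredient but does not spell out the procedure. As a rough illustration of the general idea only, the sketch below shows generic mixup: a convex combination of two embedding vectors with a mixing coefficient drawn from a Beta distribution. The function names, the toy vectors, and the choice of `alpha` are assumptions made for this example, and none of it should be read as SALT's actual implementation.

```python
import random


def mixup_embeddings(emb_a, emb_b, lam):
    """Return the convex combination lam * emb_a + (1 - lam) * emb_b."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(emb_a, emb_b)]


def sample_lambda(alpha=0.2, rng=random):
    """Draw a mixing coefficient from Beta(alpha, alpha), as in generic mixup.

    The default alpha is an assumption for illustration, not a value from the paper.
    """
    return rng.betavariate(alpha, alpha)


# Toy 4-dimensional vectors standing in for the embedding of an original token
# and the embedding of a substitute token (hypothetical values).
original = [1.0, 0.0, 2.0, -1.0]
substitute = [0.0, 1.0, 2.0, 1.0]

# With lam = 0.75 the result stays closer to the original embedding.
mixed = mixup_embeddings(original, substitute, lam=0.75)
```

In practice the coefficient would typically come from `sample_lambda()` rather than being fixed, so that each training example is interpolated by a different amount.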

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Mixup