Bilingual Alignment Pre-Training for Zero-Shot Cross-Lingual Transfer

Ziqing Yang; Wentao Ma; Yiming Cui; Jiani Ye; Wanxiang Che; Shijin Wang

arXiv:2106.01732·cs.CL·November 29, 2021

TL;DR

This paper introduces WEAM, a pre-training task that uses statistical alignment to improve zero-shot cross-lingual transfer in multilingual models, leading to significant performance gains on MLQA and XNLI tasks.

Contribution

The paper proposes WEAM, a novel pre-training task that leverages alignment information to enhance cross-lingual transfer in multilingual models.

Findings

1. WEAM significantly improves zero-shot transfer performance.

2. The model outperforms baseline multilingual models on MLQA and XNLI.

3. Alignment-guided pre-training enhances cross-lingual understanding.

Abstract

Multilingual pre-trained models have achieved remarkable performance on cross-lingual transfer learning. Some multilingual models, such as mBERT, have been pre-trained on unlabeled corpora, so the embeddings of different languages in these models may not be well aligned. In this paper, we aim to improve zero-shot cross-lingual transfer performance by proposing a pre-training task named Word-Exchange Aligning Model (WEAM), which uses statistical alignment information as prior knowledge to guide cross-lingual word prediction. We evaluate our model on the multilingual machine reading comprehension task MLQA and the natural language inference task XNLI. The results show that WEAM can significantly improve zero-shot performance.
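
The abstract does not spell out the training objective, so the following is only a toy illustration of what "alignment-guided cross-lingual word prediction" could look like, not the authors' implementation. It assumes a parallel sentence pair and a list of word-alignment index pairs (for example from a statistical aligner); the function name `cross_lingual_targets`, the `[MASK]` placeholder, and the example sentences are all hypothetical.

```python
from typing import List, Tuple

MASK = "[MASK]"  # assumed placeholder token, not taken from the paper


def cross_lingual_targets(
    src_tokens: List[str],
    tgt_tokens: List[str],
    alignment: List[Tuple[int, int]],
) -> List[Tuple[List[str], int, str]]:
    """Build toy prediction examples from word-alignment pairs.

    For each aligned pair (i, j), source token i is masked and the
    aligned target-language token j becomes the prediction label.
    """
    examples = []
    for i, j in alignment:
        masked = list(src_tokens)  # copy so each example masks one position
        masked[i] = MASK
        examples.append((masked, i, tgt_tokens[j]))
    return examples


# Hypothetical English-German pair with a one-to-one alignment.
src = ["the", "cat", "sleeps"]
tgt = ["die", "Katze", "schläft"]
alignment = [(0, 0), (1, 1), (2, 2)]

examples = cross_lingual_targets(src, tgt, alignment)
for masked, position, label in examples:
    print(masked, position, label)
```

Each example asks a model to predict a word in the other language at the masked position, which is one simple way alignment information could act as a prior for cross-lingual prediction.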

Code & Models

Repositories

Katherinaxxx/Paper-ReadingNotes

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: mBERT