Bilingual Alignment Pre-Training for Zero-Shot Cross-Lingual Transfer
Ziqing Yang, Wentao Ma, Yiming Cui, Jiani Ye, Wanxiang Che, Shijin Wang

TL;DR
This paper introduces the Word-Exchange Aligning Model (WEAM), a pre-training task that uses statistical alignment information to improve zero-shot cross-lingual transfer in multilingual models, with significant gains on MLQA and XNLI.
Contribution
The paper proposes WEAM, a novel pre-training task that leverages alignment information to enhance cross-lingual transfer in multilingual models.
Findings
WEAM significantly improves zero-shot transfer performance.
The model outperforms baseline multilingual models on MLQA and XNLI.
Alignment-guided pre-training enhances cross-lingual understanding.
Abstract
Multilingual pre-trained models have achieved remarkable performance on cross-lingual transfer learning. Some multilingual models, such as mBERT, have been pre-trained on unlabeled corpora; therefore, the embeddings of different languages in the models may not be aligned very well. In this paper, we aim to improve zero-shot cross-lingual transfer performance by proposing a pre-training task named Word-Exchange Aligning Model (WEAM), which uses statistical alignment information as prior knowledge to guide cross-lingual word prediction. We evaluate our model on the multilingual machine reading comprehension task MLQA and the natural language inference task XNLI. The results show that WEAM can significantly improve zero-shot performance.
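To make the idea of alignment-guided cross-lingual word prediction concrete, here is a minimal sketch of how training examples could be built from a sentence pair and its word alignment. This is an illustration under stated assumptions, not the procedure from the paper: the function name `make_exchange_examples`, the `mask_prob` parameter, and the choice to mask a source token and use its aligned target-language word as the label are all hypothetical. The alignment pairs are assumed to come from an external statistical aligner.

```python
import random

MASK = "[MASK]"  # placeholder mask token (assumed, not taken from the paper)


def make_exchange_examples(src_tokens, tgt_tokens, alignment, mask_prob=0.15, seed=0):
    """Build a masked source sentence with target-language labels.

    alignment: list of (src_idx, tgt_idx) pairs, assumed to be produced
    by an external statistical word aligner.
    Returns (masked_tokens, labels), where labels maps each masked source
    position to the aligned target-language word.
    """
    rng = random.Random(seed)
    masked = list(src_tokens)
    labels = {}
    for src_idx, tgt_idx in alignment:
        # Mask an aligned source word with probability mask_prob and
        # ask the model to predict its counterpart in the other language.
        if rng.random() < mask_prob:
            masked[src_idx] = MASK
            labels[src_idx] = tgt_tokens[tgt_idx]
    return masked, labels


if __name__ == "__main__":
    src = ["the", "cat", "sleeps"]
    tgt = ["die", "Katze", "schläft"]
    pairs = [(0, 0), (1, 1), (2, 2)]
    print(make_exchange_examples(src, tgt, pairs, mask_prob=0.5))
```

In a setup like this, the model sees the source-language context and is trained to predict a word in the other language, so the alignment acts as a prior that ties the two vocabularies together.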
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: mBERT
