Multi-Level Contrastive Learning for Cross-Lingual Alignment
Beiduo Chen, Wu Guo, Bin Gu, Quan Liu, Yongchao Wang

TL;DR
This paper introduces a multi-level contrastive learning framework that enhances cross-lingual transfer in pre-trained models by integrating word-level information and a novel CZ-NCE loss, leading to improved performance on zero-shot tasks.
Contribution
The paper presents a multi-level contrastive learning approach that adds word-level alignment and a CZ-NCE loss to boost the cross-lingual capabilities of mBERT.
Findings
Significant improvement in cross-lingual transfer performance.
Outperforms previous models on XTREME benchmark tasks.
Effective use of word-level information in contrastive learning.
Abstract
Cross-language pre-trained models such as multilingual BERT (mBERT) have achieved significant performance in various cross-lingual downstream NLP tasks. This paper proposes a multi-level contrastive learning (ML-CTL) framework to further improve the cross-lingual ability of pre-trained models. The proposed method uses translated parallel data to encourage the model to generate similar semantic embeddings for different languages. However, unlike the sentence-level alignment used in most previous studies, in this paper, we explicitly integrate the word-level information of each pair of parallel sentences into contrastive learning. Moreover, cross-zero noise contrastive estimation (CZ-NCE) loss is proposed to alleviate the impact of the floating-point error in the training process with a small batch size. The proposed method significantly improves the cross-lingual transfer ability of our…
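To make the sentence-level alignment idea concrete, here is a minimal sketch of a standard InfoNCE-style contrastive loss over parallel sentence embeddings. It is not the paper's CZ-NCE loss and it does not include the word-level component; the function names, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration. Each source sentence is pulled toward its translation and pushed away from the other sentences in the batch.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(src, tgt, temperature=0.1):
    """Mean InfoNCE loss over a batch of parallel sentence embeddings.

    src[i] and tgt[i] are assumed to be embeddings of translations of each
    other (the positive pair); every tgt[j] with j != i acts as an in-batch
    negative. The temperature is an illustrative choice, not a value taken
    from the paper.
    """
    src = [l2_normalize(v) for v in src]
    tgt = [l2_normalize(v) for v in tgt]
    total = 0.0
    for i, s in enumerate(src):
        logits = [dot(s, t) / temperature for t in tgt]
        # Log-sum-exp with the maximum subtracted, a common stability trick.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the true translation.
        total += log_denom - logits[i]
    return total / len(src)


# Toy embeddings (hypothetical): two "English" and two "German" sentences.
en = [[1.0, 0.0], [0.0, 1.0]]
de_aligned = [[0.9, 0.1], [0.1, 0.9]]   # translations sit near their sources
de_swapped = [[0.1, 0.9], [0.9, 0.1]]   # translations sit near the wrong source

loss_aligned = info_nce(en, de_aligned)
loss_swapped = info_nce(en, de_swapped)
print(loss_aligned, loss_swapped)
```

In this toy batch the loss is lower when each translation lies close to its own source sentence, which is the behavior a contrastive objective rewards during training.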
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Contrastive Learning · WordPiece · Residual Connection · Layer Normalization · Attention Dropout · Dropout
