Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment
Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong

TL;DR
This paper introduces a cross-lingual alignment framework that enhances the cross-lingual capabilities of multilingual generative models by aligning internal representations and outputs, significantly reducing performance bias towards high-resource languages.
Contribution
It proposes a simple contrastive-learning-based alignment method that improves knowledge transfer across languages in multilingual generative models while using less than 0.1‰ of the pre-training tokens.
Findings
Significant boost in cross-lingual abilities using less than 0.1‰ of the pre-training tokens.
Reduced performance gap between high-resource and low-resource languages.
Improved internal multilingual representation distribution.
Abstract
Multilingual generative models obtain remarkable cross-lingual in-context learning capabilities through pre-training on large-scale corpora. However, they still exhibit a performance bias toward high-resource languages and learn isolated distributions of multilingual sentence representations, which may hinder knowledge transfer across languages. To bridge this gap, we propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences. It aligns the internal sentence representations across different languages via multilingual contrastive learning and aligns outputs by following cross-lingual instructions in the target language. Experimental results show that even with less than 0.1‰ of pre-training tokens, our alignment framework significantly boosts the cross-lingual abilities of generative language models and mitigates the…
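The abstract describes aligning sentence representations of translation pairs through multilingual contrastive learning. As a rough illustration of that idea only, the sketch below computes a generic InfoNCE-style loss with in-batch negatives over paired sentence embeddings. The function name `info_nce_loss`, the temperature value, the cosine-similarity scoring, and the use of precomputed embeddings are all assumptions made for this example; the summary above does not specify the paper's exact objective, pooling, or hyperparameters.

```python
import numpy as np


def info_nce_loss(src, tgt, temperature=0.05):
    """Generic InfoNCE loss over translation pairs (illustrative sketch).

    src, tgt: arrays of shape (batch, dim). Row i of `src` and row i of
    `tgt` are assumed to be embeddings of a sentence and its translation.
    All other rows in the batch act as negatives.
    The temperature of 0.05 is an assumed value, not taken from the paper.
    """
    # L2-normalise so the dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)

    # Similarity of every source sentence to every target sentence.
    logits = (src @ tgt.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct "class" for row i is column i (its own translation).
    return float(-np.mean(np.diag(log_probs)))


# Toy usage: random vectors stand in for sentence embeddings.
rng = np.random.default_rng(0)
english = rng.normal(size=(8, 16))
translated = english + 0.01 * rng.normal(size=(8, 16))  # nearly aligned pairs
print(info_nce_loss(english, translated))
```

Minimising a loss of this shape pulls each sentence toward its translation and pushes it away from the other sentences in the batch, which is one common way to reduce the language-specific clustering of representations that the abstract mentions.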
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Computational and Text Analysis Methods
Methods: Contrastive Learning
