Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Muhammad Reza Qorib, Junyi Li, Hwee Tou Ng

TL;DR
This paper systematically investigates how incorporating parallel data can enhance the multilingual translation and reasoning abilities of large language models, challenging the notion that scale alone suffices.
Contribution
It provides a controlled experimental analysis showing that adding parallel data significantly boosts LLMs' multilingual performance, especially in translation and reasoning tasks.
Findings
- Parallel data improves translation accuracy.
- Multilingual reasoning capabilities are enhanced with parallel data.
- Scale alone does not fully account for multilingual abilities.
Abstract
Large language models (LLMs) have demonstrated impressive translation capabilities even without being explicitly trained on parallel data. This remarkable property has led some to believe that parallel data is no longer necessary for building multilingual language models. While some attribute this to the emergent abilities of LLMs due to scale, recent work suggests that it is actually caused by incidental bilingual signals present in the training data. Various methods have been proposed to maximize the utility of parallel data to enhance the multilingual capabilities of multilingual encoder-based and encoder-decoder language models. However, some decoder-based LLMs opt to ignore parallel data instead. In this work, we conduct a systematic study on the impact of adding parallel data on LLMs' multilingual capabilities, focusing specifically on translation and multilingual common-sense…
Code & Models
- 🤗 nusnlp/JGP-Parallel-Last-all (model, 3 downloads)
- 🤗 nusnlp/JGP-Parallel-Distributed (model, 147 downloads)
- 🤗 nusnlp/JGP-Parallel-First (model, 247 downloads)
- 🤗 nusnlp/JGP-Parallel-Non-Adjacent (model, 2 downloads)
- 🤗 nusnlp/JGP-No-Parallel (model, 1 download)
- 🤗 nusnlp/JGP-Multilingual (model, 2 downloads)
- 🤗 nusnlp/JGP-Parallel-Last-ID-EN (model, 1 download)
- 🤗 nusnlp/JGP-Parallel-Last-ZH-EN (model, 1 download)
- 🤗 nusnlp/JGP-Parallel-Last-EN-ZH (model, 1 download)
- 🤗 nusnlp/JGP-Parallel-Last-EN-ID (model, 4 downloads)
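A minimal sketch of how one of the checkpoints listed above might be tried for translation with the Hugging Face transformers library. It assumes the checkpoints load as causal (decoder-only) language models through AutoModelForCausalLM, which this page does not confirm. The prompt layout in build_prompt and the helper names are hypothetical illustrations, not the format used by the authors, and the default target language is only a guess based on the "ID" suffix in the repository names.

```python
# Repository IDs copied from the list above.
JGP_CHECKPOINTS = [
    "nusnlp/JGP-Parallel-Last-all",
    "nusnlp/JGP-Parallel-Distributed",
    "nusnlp/JGP-Parallel-First",
    "nusnlp/JGP-Parallel-Non-Adjacent",
    "nusnlp/JGP-No-Parallel",
    "nusnlp/JGP-Multilingual",
    "nusnlp/JGP-Parallel-Last-ID-EN",
    "nusnlp/JGP-Parallel-Last-ZH-EN",
    "nusnlp/JGP-Parallel-Last-EN-ZH",
    "nusnlp/JGP-Parallel-Last-EN-ID",
]


def build_prompt(source_text, src_lang="English", tgt_lang="Indonesian"):
    """Hypothetical prompt layout; the authors' actual format may differ."""
    return f"{src_lang}: {source_text}\n{tgt_lang}:"


def translate(repo_id, source_text, max_new_tokens=64):
    """Generate a continuation of the prompt with the given checkpoint.

    Assumption: the checkpoint is compatible with AutoModelForCausalLM.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(build_prompt(source_text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling translate("nusnlp/JGP-Parallel-Last-EN-ID", "Good morning.") would download the checkpoint on first use. Running the same prompt against nusnlp/JGP-No-Parallel is one informal way to see the contrast the paper studies, though it is no substitute for the reported evaluation.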
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
