X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu, Kenton Murray, Philipp Koehn, Hieu Hoang, Akiko Eriguchi, Huda Khayrallah

TL;DR
X-ALMA is a multilingual translation model that targets high-quality results across 50 languages by combining plug-and-play language-specific modules with a novel adaptive rejection optimization method. It outperforms open-source multilingual LLMs such as Aya-101 and Aya-23 on the FLORES-200 and WMT'23 test sets.
Contribution
The paper introduces X-ALMA, a multilingual translation model that pairs a plug-and-play, language-specific module architecture with adaptive rejection preference optimization (ARPO), aiming for consistently high translation quality across high-, mid-, and low-resource languages.
Findings
X-ALMA outperforms state-of-the-art open-source multilingual LLMs, such as Aya-101 and Aya-23, in every translation direction on the FLORES-200 and WMT'23 test sets according to COMET-22.
The plug-and-play language-specific module architecture prevents language conflicts during training.
ARPO surpasses existing preference optimization methods in translation quality.
Abstract
Large language models (LLMs) have achieved remarkable success across various NLP tasks, but with a focus on English, owing to English-centric pre-training and limited multilingual data. In this work, we focus on the problem of translation. While some multilingual LLMs claim to support hundreds of languages, they often fail to provide high-quality responses for mid- and low-resource languages, leading to imbalanced performance heavily skewed in favor of high-resource languages. We introduce **X-ALMA**, a model designed to ensure top-tier performance across 50 diverse languages, regardless of their resource levels. X-ALMA surpasses state-of-the-art open-source multilingual LLMs, such as Aya-101 and Aya-23, in every single translation direction on the FLORES-200 and WMT'23 test datasets according to COMET-22. This is achieved by a plug-and-play language-specific module architecture to…
Code & Models
- 🤗 haoranxu/X-ALMA-13B-Pretrain (model): 2.1k downloads, 11 likes
- 🤗 haoranxu/X-ALMA-13B-Group2 (model): 350 downloads, 1 like
- 🤗 haoranxu/X-ALMA-13B-Group3 (model): 135 downloads, 1 like
- 🤗 haoranxu/X-ALMA-13B-Group5 (model): 153 downloads, 1 like
- 🤗 haoranxu/X-ALMA-13B-Group4 (model): 419 downloads
- 🤗 haoranxu/X-ALMA-13B-Group6 (model): 693 downloads, 4 likes
- 🤗 haoranxu/X-ALMA-13B-Group7 (model): 4 downloads
- 🤗 haoranxu/X-ALMA-13B-Group1 (model): 217 downloads, 2 likes
- 🤗 haoranxu/X-ALMA-13B-Group8 (model): 197 downloads, 1 like
- 🤗 haoranxu/X-ALMA (model): 12 downloads, 20 likes
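The Group1 through Group8 checkpoints listed above suggest that a caller first picks the checkpoint whose language-specific module covers the language of interest. A minimal sketch of that selection step follows. It assumes only the `haoranxu/X-ALMA-13B-Group{N}` naming pattern visible in the list; `select_group_repo` and `EXAMPLE_LANG_TO_GROUP` are hypothetical names, and the table uses placeholder codes (`lang_a`, `lang_b`) because the actual language-to-group assignment is not given on this page and should be taken from the model cards.

```python
from typing import Mapping

# Naming pattern taken from the checkpoint list above.
REPO_TEMPLATE = "haoranxu/X-ALMA-13B-Group{group}"
NUM_GROUPS = 8  # Group1 .. Group8 appear in the list

# Placeholder assignments for illustration only. These are NOT the real
# language groups; replace them with the mapping from the model cards.
EXAMPLE_LANG_TO_GROUP = {"lang_a": 1, "lang_b": 2}


def select_group_repo(lang: str, lang_to_group: Mapping[str, int]) -> str:
    """Return the repository name of the group checkpoint covering `lang`."""
    if lang not in lang_to_group:
        raise KeyError(f"no group registered for language {lang!r}")
    group = lang_to_group[lang]
    if not 1 <= group <= NUM_GROUPS:
        raise ValueError(f"group must be in 1..{NUM_GROUPS}, got {group}")
    return REPO_TEMPLATE.format(group=group)


if __name__ == "__main__":
    print(select_group_repo("lang_a", EXAMPLE_LANG_TO_GROUP))
```

Keeping the mapping as an argument rather than a hard-coded constant makes it easy to swap in the real assignment once it is known, without touching the selection logic.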
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Focus
