Scaling Model and Data for Multilingual Machine Translation with Open Large Language Models
Yuzhe Shang, Pengzhi Gao, Wei Liu, Jian Luan, Jinsong Su

TL;DR
This paper studies how model scaling and data scaling affect multilingual machine translation with open large language models, and introduces a new model that outperforms recent open systems and is competitive with strong proprietary systems across 46 languages.
Contribution
The paper presents MiLMMT-46, an open multilingual translation model built on the Gemma3 model family through continual pretraining and instruction finetuning, which achieves top-tier translation performance across 46 languages.
Findings
MiLMMT-46 consistently outperforms recent SOTA models, including Seed-X, HY-MT-1.5, and TranslateGemma.
MiLMMT-46 is competitive with strong proprietary systems such as Google Translate and Gemini 3 Pro.
Scaling models and data improves multilingual translation performance.
Abstract
Open large language models (LLMs) have demonstrated improving multilingual capabilities in recent years. In this paper, we present a study of open LLMs for multilingual machine translation (MT) across a range of languages, and investigate the effects of model scaling and data scaling when adapting open LLMs to multilingual MT through continual pretraining and instruction finetuning. Based on the Gemma3 model family, we develop MiLMMT-46, which achieves top-tier multilingual translation performance across 46 languages. Extensive experiments show that MiLMMT-46 consistently outperforms recent state-of-the-art (SOTA) models, including Seed-X, HY-MT-1.5, and TranslateGemma, and achieves competitive performance with strong proprietary systems such as Google Translate and Gemini 3 Pro. Models are released at https://huggingface.co/collections/xiaomi-research/milmmt-46. Codes are released at…
Code & Models
- xiaomi-research/MiLMMT-46-1B-v0.1 (model, 10.0k downloads, 5 likes)
- xiaomi-research/MiLMMT-46-1B-Pretrain (model, 5 downloads, 1 like)
- xiaomi-research/MiLMMT-46-4B-Pretrain (model, 68 downloads, 1 like)
- xiaomi-research/MiLMMT-46-4B-v0.1 (model, 263 downloads, 2 likes)
- xiaomi-research/MiLMMT-46-12B-Pretrain (model, 16 downloads, 1 like)
- xiaomi-research/MiLMMT-46-12B-v0.1 (model, 301 downloads, 2 likes)
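As a rough illustration of how one of the checkpoints listed above might be tried out, the sketch below loads the 1B instruction-tuned model with the Hugging Face `transformers` library and generates a translation. It rests on several assumptions that are not stated on this page: that the checkpoint loads through `AutoModelForCausalLM`, and that a plain instruction prompt is acceptable. The `build_prompt` template and the `translate` helper are hypothetical names introduced here for illustration only; the model card should be consulted for the prompt format the authors actually used.

```python
# Checkpoint name taken from the list above; everything else is illustrative.
MODEL_ID = "xiaomi-research/MiLMMT-46-1B-v0.1"


def build_prompt(text: str, src_lang: str, tgt_lang: str) -> str:
    """Hypothetical instruction template; the official format may differ."""
    return (
        f"Translate the following text from {src_lang} to {tgt_lang}.\n"
        f"{src_lang}: {text}\n"
        f"{tgt_lang}:"
    )


def translate(text: str, src_lang: str, tgt_lang: str, max_new_tokens: int = 128) -> str:
    """Greedy-decode a translation (assumes the checkpoint is a causal LM)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(text, src_lang, tgt_lang), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `translate("Hello, world!", "English", "German")` would download the checkpoint on first use. Greedy decoding (`do_sample=False`) is chosen here only to make runs repeatable, not because the paper prescribes it.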
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Big Data and Digital Economy
