Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs
Ricardo Rei, Nuno M. Guerreiro, José Pombal, João Alves, Pedro Teixeirinha, Amin Farajian, André F. T. Martins

TL;DR
Tower+ introduces a suite of multilingual LLMs that balance translation specialization and general-purpose skills, achieving state-of-the-art results across diverse tasks through a novel training approach.
Contribution
The paper presents Tower+, a new training recipe and model suite that effectively combines translation and general-purpose capabilities in multilingual LLMs, with multiple scales and competitive performance.
Findings
Smaller models outperform larger open-weight models on certain tasks.
The largest model achieves top translation performance for high-resource languages.
Models perform well on both translation and instruction-following benchmarks.
Abstract
Fine-tuning pretrained LLMs has been shown to be an effective strategy for reaching state-of-the-art performance on specific tasks like machine translation. However, this process of adaptation often implies sacrificing general-purpose capabilities, such as conversational reasoning and instruction-following, hampering the utility of the system in real-world applications that require a mixture of skills. In this paper, we introduce Tower+, a suite of models designed to deliver strong performance across both translation and multilingual general-purpose text capabilities. We achieve a Pareto frontier between translation specialization and multilingual general-purpose capabilities by introducing a novel training recipe that builds on Tower (Alves et al., 2024), comprising continued pretraining, supervised fine-tuning, preference optimization, and reinforcement learning with verifiable…
Code & Models
- 🤗 Unbabel/Tower-Plus-2B (model) · 1.4k downloads · ♡ 18
- 🤗 Unbabel/Tower-Plus-9B (model) · 11k downloads · ♡ 36
- 🤗 Unbabel/Tower-Plus-72B (model) · 1.3k downloads · ♡ 21
- 🤗 pinzhenchen/Unbabel_Tower-Plus-9B (model)
- 🤗 MultiSynt/nemotron-cc-finnish-tower72b (model) · 5 downloads
- 🤗 Ryex/Tower-Plus-9B-abliterated-hf-data (model) · 5 downloads
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
