SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

TL;DR
SMaLL-100 is a compact, distilled multilingual translation model that outperforms similar-sized models on low-resource languages, offering faster inference and lower memory usage while maintaining high translation quality.
Contribution
The paper introduces SMaLL-100, a smaller, efficient distilled version of M2M-100 that preserves performance on low-resource languages and improves inference speed and memory efficiency.
Findings
SMaLL-100 outperforms comparable-sized multilingual models on low-resource benchmarks.
It achieves similar results to M2M-100 (1.2B) while being 3.6x smaller and 4.3x faster.
The model effectively balances performance and resource constraints in multilingual translation.
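The size and speed ratios above can be turned into rough absolute figures. The sketch below is only back-of-the-envelope arithmetic: it assumes "3.6x smaller" refers to parameter count and takes 1.2B as the baseline. The function names are illustrative and do not come from the paper.

```python
def implied_size(baseline_params: float, size_ratio: float) -> float:
    """Parameter count implied by a baseline size and a 'times smaller' ratio."""
    return baseline_params / size_ratio


def relative_time(speedup: float) -> float:
    """Fraction of the baseline inference time left after a given speedup."""
    return 1.0 / speedup


if __name__ == "__main__":
    # Assumption: the 3.6x ratio is measured against the 1.2B-parameter M2M-100.
    small = implied_size(1.2e9, 3.6)
    print(f"implied size: about {small / 1e6:.0f}M parameters")
    print(f"inference time: about {relative_time(4.3):.0%} of the baseline")
```

Under these assumptions the implied size is roughly a third of a billion parameters, which falls inside the 200-600M range the abstract uses for comparable models.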
Abstract
In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation. To overcome the "curse of multilinguality", these models often opt for scaling up the number of parameters, which makes their use in resource-constrained environments challenging. We introduce SMaLL-100, a distilled version of the M2M-100 (12B) model, a massively multilingual machine translation model covering 100 languages. We train SMaLL-100 with uniform sampling across all language pairs and therefore focus on preserving the performance of low-resource languages. We evaluate SMaLL-100 on different low-resource benchmarks: FLORES-101, Tatoeba, and TICO-19 and demonstrate that it outperforms previous massively multilingual models of comparable sizes (200-600M) while improving…
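To illustrate what uniform sampling across language pairs means in practice, here is a minimal sketch that contrasts it with sampling proportional to corpus size. The corpus sizes, language pairs, and function names are hypothetical and chosen only for illustration; this is not the authors' training code.

```python
import random
from collections import Counter

# Hypothetical sentence-pair counts per language pair (illustrative, not from the paper).
corpus_sizes = {
    ("en", "fr"): 1_000_000,
    ("en", "sw"): 10_000,
    ("en", "ha"): 1_000,
}


def proportional_probs(sizes):
    """Sampling probability proportional to each pair's corpus size."""
    total = sum(sizes.values())
    return {pair: n / total for pair, n in sizes.items()}


def uniform_probs(sizes):
    """Every language pair gets the same probability, regardless of corpus size."""
    return {pair: 1.0 / len(sizes) for pair in sizes}


def sample_pairs(probs, k, seed=0):
    """Draw k language pairs according to the given probabilities."""
    rng = random.Random(seed)
    pairs = list(probs)
    weights = [probs[p] for p in pairs]
    return Counter(rng.choices(pairs, weights=weights, k=k))


if __name__ == "__main__":
    print("proportional:", proportional_probs(corpus_sizes))
    print("uniform:     ", uniform_probs(corpus_sizes))
    print("uniform draws:", sample_pairs(uniform_probs(corpus_sizes), k=9))
```

With size-proportional sampling, the smallest pair in this toy setup would be seen in about 0.1% of draws; with uniform sampling it is seen as often as the largest pair, which is how the training signal for low-resource languages is kept from being drowned out.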
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: OPT
