Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Zhi Qu, Yiran Wang, Jiannan Mao, Chenchen Ding, Hideki Tanaka, Masao Utiyama, Taro Watanabe

TL;DR
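This paper introduces 'registering', a method for multilingual neural machine translation that inserts artificial target-language tokens (registers) between the source and target tokens and restricts target generation to attend only to those registers, enabling a small MNMT-specific model to compete with large language models.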
Contribution
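The paper proposes the 'registering' technique: registers that specify the target language are added to the input sequence, and a modified attention mask makes them the only view of the source available during target generation, so that they represent the source tokens in the target language space.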
Findings
Outperforms the previous MNMT state of the art on the EC-40 benchmark.
MITRE-913M surpasses NLLB-3.3B and rivals commercial LLMs.
Models are open-sourced for community use.
Abstract
Multilingual neural machine translation (MNMT) aims for arbitrary translation across multiple languages. Although MNMT-specific models trained on parallel data offer low costs in training and deployment, their performance consistently lags behind that of large language models (LLMs). In this work, we introduce registering, a novel method that enables a small MNMT-specific model to compete with LLMs. Specifically, we insert a set of artificial tokens specifying the target language, called registers, into the input sequence between the source and target tokens. By modifying the attention mask, target token generation attends only to the activations of the registers, which represent the source tokens in the target language space. Experiments on EC-40, a large-scale benchmark, show that our method advances the state of the art of MNMT. We further pre-train two models, namely MITRE…
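The attention-mask idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name build_register_mask, the bidirectional attention among source tokens, and the causal attention among registers are assumptions made for illustration. The only property taken from the abstract is that target positions attend to the registers and not to the source tokens.

```python
def build_register_mask(n_src, n_reg, n_tgt):
    """Return a boolean mask where mask[i][j] is True if position i may attend to j.

    Assumed sequence layout: [source tokens | registers | target tokens].
    The function name and the source/register attention patterns are
    illustrative assumptions, not details confirmed by the paper summary.
    """
    total = n_src + n_reg + n_tgt
    reg_start = n_src
    tgt_start = n_src + n_reg
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < reg_start:
                # Source rows: assumed to attend to every source token.
                mask[i][j] = j < reg_start
            elif i < tgt_start:
                # Register rows: all source tokens plus registers up to
                # and including position i (assumed causal).
                mask[i][j] = j <= i
            else:
                # Target rows: source tokens are hidden; only the registers
                # and the causal target history are visible.
                mask[i][j] = reg_start <= j <= i
    return mask


if __name__ == "__main__":
    # Print a small mask: 3 source tokens, 2 registers, 2 target tokens.
    for row in build_register_mask(3, 2, 2):
        print("".join("x" if allowed else "." for allowed in row))
```

In this sketch the target rows have no allowed entries in the source columns, so any information about the source has to flow through the register positions.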
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training
