Integrating Multi-scale Contextualized Information for Byte-based Neural Machine Translation
Langlin Huang, Yang Feng

TL;DR
This paper introduces Multi-Scale Contextualization (MSC), a novel method for byte-based neural machine translation that dynamically integrates multi-scale contextual information, improving translation quality especially in multilingual and out-of-domain settings.
Contribution
The paper proposes MSC, which learns contextualized information at multiple scales and dynamically integrates it in byte-based NMT models. This addresses the inability of earlier local-contextualization methods to choose a contextualizing scope based on the input.
Findings
MSC outperforms subword-based methods in multilingual translation.
MSC improves translation quality in out-of-domain scenarios.
The method effectively leverages multi-scale context for better translation accuracy.
Abstract
Subword tokenization is a common method for vocabulary building in Neural Machine Translation (NMT) models. However, increasingly complex tasks have revealed its disadvantages. First, a vocabulary cannot be modified once it is learned, making it hard to adapt to new words. Second, in multilingual translation, the imbalance in data volumes across different languages spreads to the vocabulary, degrading translations involving low-resource languages. While byte-based tokenization addresses these issues, byte-based models struggle with the low information density inherent in UTF-8 byte sequences. Previous works enhance token semantics through local contextualization but fail to select an appropriate contextualizing scope based on the input. Consequently, we propose the Multi-Scale Contextualization (MSC) method, which learns contextualized information of varying scales across different…
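To make the "low information density" point concrete, the following is a minimal illustrative sketch, not code from the paper. It maps text to UTF-8 byte IDs, which is the fixed 256-entry vocabulary that byte-based tokenization relies on, and compares byte counts with character counts. The helper names byte_tokens and bytes_per_char and the sample words are hypothetical and chosen only for illustration.

```python
def byte_tokens(text: str) -> list[int]:
    """Map text to UTF-8 byte IDs; every ID falls in the range 0..255."""
    return list(text.encode("utf-8"))


def bytes_per_char(text: str) -> float:
    """Average number of byte tokens needed per character of text."""
    return len(byte_tokens(text)) / len(text)


# Hypothetical sample words: ASCII, Latin with a diacritic, and CJK.
samples = ["cat", "Tür", "猫"]

for word in samples:
    ids = byte_tokens(word)
    print(word, ids, f"{bytes_per_char(word):.2f} bytes/char")
```

No new word is ever out of vocabulary under this scheme, but a single CJK character occupies several byte tokens, so each token carries little meaning on its own. This is the gap that contextualization over neighbouring bytes is meant to close.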
Taxonomy
Topics: Natural Language Processing Techniques
