Multilingual Neural Machine Translation with Task-Specific Attention
Graeme Blackwood, Miguel Ballesteros, Todd Ward

TL;DR
This paper introduces task-specific attention models for multilingual neural machine translation. Most parameters stay shared across languages, while the attention model is specialized to a language pair or target language. This improves translation quality in every direction tested, including zero-shot directions.
Contribution
It proposes a simple but effective task-specific attention approach that improves multilingual NMT quality while retaining as much parameter sharing as possible. The gains extend to low-resource, zero-shot translation directions.
Findings
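A target-specific attention model gives consistent gains in translation quality for all translation directions among four Europarl languages, compared to a model in which all parameters are shared.
Translation quality also improves in zero-shot directions, for which no explicit training data was available.
Specializing only the attention model keeps most of the parameter sharing while still allowing language-specific behavior.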
Abstract
Multilingual machine translation addresses the task of translating between multiple source and target languages. We propose task-specific attention models, a simple but effective technique for improving the quality of sequence-to-sequence neural multilingual translation. Our approach seeks to retain as much of the parameter sharing generalization of NMT models as possible, while still allowing for language-specific specialization of the attention model to a particular language-pair or task. Our experiments on four languages of the Europarl corpus show that using a target-specific model of attention provides consistent gains in translation quality for all possible translation directions, compared to a model in which all parameters are shared. We observe improved translation quality even in the (extreme) low-resource zero-shot translation directions for which the model never saw…
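To make the idea of target-specific attention concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the bilinear scoring function, the class name TargetSpecificAttention, the two-dimensional toy states, and the language codes "de" and "fr" are all assumptions chosen for illustration. The point it shows is that the encoder and decoder states are shared, and only the attention parameters are selected by target language.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class TargetSpecificAttention:
    """Hypothetical bilinear attention with one weight matrix per target language.

    Everything outside this class (encoder states, decoder state) is shared
    across languages; only the attention weights are language-specific.
    """

    def __init__(self, weights_by_target):
        self.weights_by_target = weights_by_target

    def __call__(self, target_lang, decoder_state, encoder_states):
        # Select the attention parameters for the requested target language.
        weight = self.weights_by_target[target_lang]
        query = matvec(weight, decoder_state)
        # Score each encoder state against the projected decoder state.
        scores = [sum(q * h for q, h in zip(query, enc)) for enc in encoder_states]
        attn = softmax(scores)
        # Context vector: attention-weighted sum of the encoder states.
        dim = len(encoder_states[0])
        context = [
            sum(a * enc[i] for a, enc in zip(attn, encoder_states))
            for i in range(dim)
        ]
        return attn, context


# Toy usage: identical shared states, different attention per target language.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
decoder_state = [1.0, 0.0]
attention = TargetSpecificAttention({
    "de": [[1.0, 0.0], [0.0, 1.0]],  # illustrative weights, not learned
    "fr": [[0.0, 1.0], [1.0, 0.0]],  # illustrative weights, not learned
})
weights_de, context_de = attention("de", decoder_state, encoder_states)
weights_fr, context_fr = attention("fr", decoder_state, encoder_states)
```

With these toy weights, the "de" attention puts more mass on the first encoder state and the "fr" attention on the second, even though the shared inputs are identical. In a trained model the per-language matrices would be learned alongside the shared parameters.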
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
