Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism

Orhan Firat; Kyunghyun Cho; Yoshua Bengio

arXiv:1601.01073 · cs.CL · January 7, 2016

TL;DR

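This paper introduces a multi-way, multilingual neural machine translation model in which a single attention mechanism is shared across all language pairs. One model translates between multiple languages with a parameter count that grows only linearly with the number of languages, and it outperforms models trained on a single language pair, especially for low-resource pairs.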

Contribution

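The paper presents a single attention mechanism shared across all language pairs. This lets one model handle multiple language pairs while its parameter count grows only linearly with the number of languages, and it improves translation quality over models trained on one language pair.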
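A minimal sketch of how one attention module might be reused across language-specific encoders and decoders. Everything here is an illustrative assumption rather than the paper's implementation: hidden states are scalars instead of vectors, the names `SharedAttention` and `MultiWayModel` are hypothetical, the attention is a generic additive (tanh) scorer, and the toy lambda encoders and decoders stand in for real recurrent networks.

```python
import math


class SharedAttention:
    """Toy additive attention with scalar states (an assumption for brevity)."""

    def __init__(self, w_enc, w_dec, v):
        self.w_enc, self.w_dec, self.v = w_enc, w_dec, v

    def weights(self, enc_states, dec_state):
        # score_i = v * tanh(w_enc * h_i + w_dec * s), then a stable softmax
        scores = [self.v * math.tanh(self.w_enc * h + self.w_dec * dec_state)
                  for h in enc_states]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def context(self, enc_states, dec_state):
        # Context is the attention-weighted sum of encoder states.
        attn = self.weights(enc_states, dec_state)
        return sum(a * h for a, h in zip(attn, enc_states))


class MultiWayModel:
    """One encoder per source language, one decoder per target language,
    and a single attention instance used for every (source, target) pair."""

    def __init__(self, encoders, decoders, attention):
        self.encoders = encoders
        self.decoders = decoders
        self.attention = attention

    def decode_step(self, src_lang, tgt_lang, tokens, dec_state):
        enc_states = self.encoders[src_lang](tokens)
        ctx = self.attention.context(enc_states, dec_state)
        return self.decoders[tgt_lang](ctx, dec_state)


attention = SharedAttention(w_enc=0.5, w_dec=0.5, v=1.0)
model = MultiWayModel(
    encoders={  # hypothetical stand-ins for language-specific encoders
        "en": lambda tokens: [0.1 * len(t) for t in tokens],
        "fi": lambda tokens: [0.2 * len(t) for t in tokens],
    },
    decoders={  # hypothetical stand-ins for language-specific decoders
        "en": lambda ctx, state: math.tanh(ctx + state),
        "de": lambda ctx, state: math.tanh(ctx - state),
    },
    attention=attention,
)

# Both language pairs go through the same attention object.
en_de = model.decode_step("en", "de", ["a", "small", "test"], dec_state=0.0)
fi_en = model.decode_step("fi", "en", ["pieni", "testi"], dec_state=0.0)
```

Adding a language in this sketch means adding one encoder entry and one decoder entry; the attention object is untouched.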

Findings

01

Improved translation performance over models trained on a single language pair.

02

Significant gains for low-resource language pairs.

03

Efficient parameter sharing: the parameter count grows only linearly with the number of languages.

Abstract

We propose multi-way, multilingual neural machine translation. The proposed approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages. This is made possible by having a single attention mechanism that is shared across all language pairs. We train the proposed multi-way, multilingual model on ten language pairs from WMT'15 simultaneously and observe clear performance improvements over models trained on only one language pair. In particular, we observe that the proposed model significantly improves the translation quality of low-resource language pairs.
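The linear-growth claim can be illustrated with a back-of-the-envelope count. The sketch below assumes, hypothetically, that every language serves as both source and target, that every directed pair is covered, and that each encoder, decoder, and attention module has a fixed size in arbitrary units; none of these numbers come from the paper.

```python
def params_separate_pair_models(num_languages, enc, dec, att):
    """One independent encoder, decoder, and attention per directed pair."""
    num_pairs = num_languages * (num_languages - 1)
    return num_pairs * (enc + dec + att)


def params_shared_attention(num_languages, enc, dec, att):
    """One encoder and one decoder per language, plus a single attention."""
    return num_languages * (enc + dec) + att


# Hypothetical component sizes in arbitrary units.
ENC, DEC, ATT = 10, 10, 1

for n in (2, 4, 6, 8):
    separate = params_separate_pair_models(n, ENC, DEC, ATT)
    shared = params_shared_attention(n, ENC, DEC, ATT)
    print(f"{n} languages: separate={separate}, shared={shared}")
```

Under these assumptions the separate-model count grows quadratically with the number of languages, while the shared-attention count grows by a constant `enc + dec` per added language.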
