Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional Translation Modeling using a Two-Dimensional Grid

Parnia Bahar; Christopher Brix; Hermann Ney

arXiv:2011.12165·cs.CL·November 25, 2020

TL;DR

This paper introduces a bidirectional neural machine translation model that uses a two-dimensional grid to learn source-to-target and target-to-source translation jointly within a single end-to-end system, with experiments on German-English and Turkish-English tasks.

Contribution

The paper presents an approach to joint bidirectional translation modeling on a two-dimensional grid: a single network is trained to translate in both directions, in place of two independently trained models.

Findings

01 Effective bidirectional translation on the German-English and Turkish-English tasks.

02 A single model achieves quality comparable to that of separately trained models.

03 The approach has the potential to influence future research in bidirectional translation modeling.

Abstract

Neural translation models have proven to be effective in capturing sufficient information from a source sentence and generating a high-quality target sentence. However, it is not easy to get the best effect for bidirectional translation, i.e., both source-to-target and target-to-source translation using a single model. If we exclude some pioneering attempts, such as multilingual systems, all other bidirectional translation approaches are required to train two individual models. This paper proposes to build a single end-to-end bidirectional translation model using a two-dimensional grid, where the left-to-right decoding generates source-to-target, and the bottom-to-up decoding creates target-to-source output. Instead of training two models independently, our approach encourages a single network to jointly learn to translate in both directions. Experiments on the WMT 2018…
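To make the grid idea in the abstract concrete, here is a minimal toy sketch in Python. It is not the authors' architecture: the plain tanh cell, the parameter names (`W_src`, `W_tgt`, `U_left`, `U_below`), the choice to place the source along the horizontal axis and the target along the vertical axis, and the two readouts are all assumptions made for illustration. It only shows how one shared grid of states can be read along rows for one direction and along columns for the other.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(embed_dim, hidden_dim, seed=0):
    """Hypothetical parameter set shared by both translation directions."""
    rng = random.Random(seed)
    return {
        "W_src": random_matrix(hidden_dim, embed_dim, rng),     # source input
        "W_tgt": random_matrix(hidden_dim, embed_dim, rng),     # target input
        "U_left": random_matrix(hidden_dim, hidden_dim, rng),   # horizontal recurrence
        "U_below": random_matrix(hidden_dim, hidden_dim, rng),  # vertical recurrence
    }


def fill_grid(src, tgt, params):
    """Compute a len(tgt) x len(src) grid of hidden states.

    Cell (i, j) combines source vector j, target vector i, the state to its
    left (i, j-1) and the state below it (i-1, j). Row 0 is the bottom row.
    """
    hidden_dim = len(params["U_left"])
    zero = [0.0] * hidden_dim
    grid = [[None] * len(src) for _ in tgt]
    for i, y in enumerate(tgt):
        for j, x in enumerate(src):
            left = grid[i][j - 1] if j > 0 else zero
            below = grid[i - 1][j] if i > 0 else zero
            pre_activation = [
                a + b + c + d
                for a, b, c, d in zip(
                    matvec(params["W_src"], x),
                    matvec(params["W_tgt"], y),
                    matvec(params["U_left"], left),
                    matvec(params["U_below"], below),
                )
            ]
            grid[i][j] = [math.tanh(v) for v in pre_activation]
    return grid


def two_way_readouts(grid):
    """Read the shared grid in two directions (an illustrative assumption).

    - End of each row (left-to-right sweep over the source): one state per
      target position, usable for source-to-target prediction.
    - Top row (bottom-to-top sweep over the target): one state per source
      position, usable for target-to-source prediction.
    """
    src_to_tgt = [row[-1] for row in grid]
    tgt_to_src = list(grid[-1])
    return src_to_tgt, tgt_to_src


if __name__ == "__main__":
    rng = random.Random(1)
    src = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 4 source vectors
    tgt = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]  # 5 target vectors
    grid = fill_grid(src, tgt, init_params(embed_dim=3, hidden_dim=2))
    s2t, t2s = two_way_readouts(grid)
    print(len(s2t), "source-to-target states;", len(t2s), "target-to-source states")
```

In a real system each readout would feed an output layer over the corresponding vocabulary, and the losses for both directions would be combined during training. Those details are not given in the text above, so the sketch leaves them out.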
