Agreement-based Joint Training for Bidirectional Attention-based Neural Machine Translation

Yong Cheng; Shiqi Shen; Zhongjun He; Wei He; Hua Wu; Maosong Sun; Yang Liu

arXiv:1512.04650 · cs.CL · April 25, 2016 · 20 cites

TL;DR

This paper introduces an agreement-based joint training method for bidirectional attention-based neural machine translation, encouraging models to agree on alignments, which improves translation quality.

Contribution

It proposes a novel joint training approach that enforces agreement between source-to-target and target-to-source models, enhancing translation performance.

Findings

01 Significant improvement in translation quality over independent models

02 Better word alignment accuracy achieved

03 Effective on Chinese-English and English-French tasks

Abstract

The attentional mechanism has proven to be effective in improving end-to-end neural machine translation. However, due to the intricate structural divergence between natural languages, unidirectional attention-based models might only capture partial aspects of attentional regularities. We propose agreement-based joint training for bidirectional attention-based end-to-end neural machine translation. Instead of training source-to-target and target-to-source translation models independently, our approach encourages the two complementary models to agree on word alignment matrices on the same training data. Experiments on Chinese-English and English-French translation tasks show that agreement-based joint training significantly improves both alignment and translation quality over independent training.
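The idea of encouraging two directional models to agree on alignment matrices can be illustrated with a small sketch. The abstract does not state the exact form of the agreement term, so the squared-difference penalty below is an assumption, as are the names `disagreement`, `joint_loss`, and `weight`. The sketch treats each attention matrix as a plain list of rows and combines the two models' negative log-likelihoods with a penalty that is zero when the source-to-target matrix equals the transpose of the target-to-source matrix.

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns of a rectangular matrix."""
    return [list(col) for col in zip(*m)]


def disagreement(a_s2t: Matrix, a_t2s: Matrix) -> float:
    """Squared-difference penalty between two alignment matrices.

    a_s2t: rows are target positions, columns are source positions.
    a_t2s: rows are source positions, columns are target positions.
    ASSUMPTION: squared difference is an illustrative choice, not
    necessarily the agreement term used in the paper.
    """
    a_t2s_t = transpose(a_t2s)  # now target x source, same shape as a_s2t
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a_s2t, a_t2s_t)
        for x, y in zip(row_a, row_b)
    )


def joint_loss(
    nll_s2t: float,
    nll_t2s: float,
    a_s2t: Matrix,
    a_t2s: Matrix,
    weight: float = 1.0,  # hypothetical trade-off hyperparameter
) -> float:
    """Sum of both translation losses plus the weighted disagreement."""
    return nll_s2t + nll_t2s + weight * disagreement(a_s2t, a_t2s)


if __name__ == "__main__":
    # Two target words attending over three source words.
    s2t = [[0.7, 0.2, 0.1],
           [0.1, 0.3, 0.6]]
    # Three source words attending over two target words.
    t2s = [[0.9, 0.1],
           [0.4, 0.6],
           [0.2, 0.8]]
    print(joint_loss(2.5, 3.0, s2t, t2s, weight=0.5))
```

In an actual training setup the penalty would be computed on differentiable tensors so that gradients flow into both models; the pure-Python version here only shows how the joint objective is assembled.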

Code & Models

Repositories

bagequan/tencent-transformer-with-disagreement

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications