A GRU-Gated Attention Model for Neural Machine Translation

Biao Zhang; Deyi Xiong; Jinsong Su

arXiv:1704.08430·cs.CL·November 14, 2019·27 cites

Open Access

TL;DR

This paper introduces a GRU-gated attention model for neural machine translation that produces more discriminative context vectors by making source representations sensitive to partial translations, leading to improved translation quality.

Contribution

The paper proposes a novel GRU-gated attention mechanism (GAtt) that makes the context vectors used in NMT more discriminative across target words, and it outperforms vanilla attention models.

Findings

01. GAtt models significantly improve translation quality over vanilla attention.

02. Enhanced discrimination of context vectors reduces over-translation issues.

03. Experimental results on Chinese-English translation demonstrate effectiveness.

Abstract

Neural machine translation (NMT) relies heavily on an attention network to produce a context vector for each target word prediction. In practice, we find that context vectors for different target words are quite similar to one another and are therefore insufficient for predicting target words discriminatively. The reason for this might be that context vectors produced by the vanilla attention network are just a weighted sum of source representations that are invariant to decoder states. In this paper, we propose a novel GRU-gated attention model (GAtt) for NMT which enhances the degree of discrimination of context vectors by enabling source representations to be sensitive to the partial translation generated by the decoder. GAtt uses a gated recurrent unit (GRU) to combine two types of information: treating a source annotation vector originally produced by the bidirectional encoder as…
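
The contrast the abstract draws can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It makes several assumptions that are not stated in the abstract: dot-product scoring in `attend`, equal dimensions for annotations and decoder state, random untrained weights, and, since the abstract is cut off at that point, the choice to feed the decoder state to the GRU as its input while each source annotation serves as the GRU's previous hidden state. All function and parameter names (`attend`, `gru_cell`, `gated_attend`, `random_params`) are hypothetical.

```python
import math
import random


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(annotations, state):
    # Vanilla attention: the context is a weighted sum of the annotations.
    # Dot-product scoring is an assumption made to keep the sketch short.
    scores = [sum(a * s for a, s in zip(h, state)) for h in annotations]
    weights = softmax(scores)
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return weights, context


def gru_cell(x, h, p):
    # Standard GRU update with input x and previous hidden state h.
    # Conventions differ on which term the update gate z multiplies.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def gated_attend(annotations, state, p):
    # Assumption: each annotation is the GRU's previous hidden state and the
    # decoder state is the GRU input, so the refined annotations depend on
    # the partial translation before attention is applied.
    refined = [gru_cell(state, h, p) for h in annotations]
    return attend(refined, state)


def random_params(dim, rng):
    # Untrained square weight matrices, for illustration only.
    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    return {name: mat() for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}


rng = random.Random(0)
dim, n_source = 4, 3
annotations = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_source)]
state = [rng.uniform(-1, 1) for _ in range(dim)]
params = random_params(dim, rng)

vanilla_weights, vanilla_context = attend(annotations, state)
gated_weights, gated_context = gated_attend(annotations, state, params)
```

In `attend`, the annotations are the same at every decoding step and only the weights change. In `gated_attend`, the vectors being summed are themselves recomputed from the decoder state, which is the property the abstract credits for more discriminative context vectors.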

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques

Methods: Gated Recurrent Unit