Learning When to Attend for Neural Machine Translation

Junhui Li; Muhua Zhu

arXiv:1705.11160 · cs.CL · June 1, 2017 · 1 cites

Open Access

TL;DR

This paper introduces a novel attention mechanism for neural machine translation that decides when the decoder should attend to source words and when it should not, improving over a state-of-the-art baseline by 0.8 BLEU on NIST Chinese-English translation tasks.

Contribution

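The paper proposes a new attention model that determines when the decoder should attend to source words. This addresses a limitation of previous attention models, which always refer to some source words even when the target word being predicted has no corresponding source word.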

Findings

01 Achieves an improvement of 0.8 BLEU over a state-of-the-art baseline

02 Demonstrates effectiveness on NIST Chinese-English translation tasks

03 Addresses target words that have no corresponding source words

Abstract

In the past few years, attention mechanisms have become an indispensable component of end-to-end neural machine translation models. However, previous attention models always refer to some source words when predicting a target word, which contradicts the fact that some target words have no corresponding source words. Motivated by this observation, we propose a novel attention model that is capable of determining when a decoder should attend to source words and when it should not. Experimental results on NIST Chinese-English translation tasks show that the new model achieves an improvement of 0.8 BLEU over a state-of-the-art baseline.
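
The idea of deciding when to attend can be illustrated with a small sketch. The abstract does not give the model's equations, so the code below is not the authors' formulation. It is one hypothetical way to gate attention: a scalar sigmoid gate computed from the decoder state scales the context vector, so a gate near 0 means the decoder effectively ignores the source. The function name `gated_attention`, the dot-product scoring, and the parameters `gate_weights` and `gate_bias` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gated_attention(decoder_state, source_states, gate_weights, gate_bias):
    """Hypothetical gated attention step (an assumption, not the paper's model).

    Returns (gated_context, attention_weights, gate).
    """
    # Standard soft attention: score every source position, then normalize.
    scores = [dot(decoder_state, h) for h in source_states]
    weights = softmax(scores)

    # Context vector: attention-weighted sum of the source states.
    dim = len(source_states[0])
    context = [
        sum(w * h[k] for w, h in zip(weights, source_states))
        for k in range(dim)
    ]

    # Scalar gate in (0, 1): close to 1 means "attend", close to 0 means
    # "do not attend" (e.g. a target word with no source counterpart).
    gate = 1.0 / (1.0 + math.exp(-(dot(gate_weights, decoder_state) + gate_bias)))

    gated_context = [gate * c for c in context]
    return gated_context, weights, gate


if __name__ == "__main__":
    source = [[1.0, 0.0], [0.0, 1.0]]
    state = [1.0, 0.0]
    # A strongly negative bias closes the gate; a strongly positive one opens it.
    for bias in (-20.0, 20.0):
        ctx, w, g = gated_attention(state, source, [0.0, 0.0], bias)
        print(f"bias={bias:+.0f} gate={g:.6f} context={ctx}")
```

In a trained model the gate parameters would be learned jointly with the rest of the network; here they are set by hand only to show the two regimes (gate closed versus gate open).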

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications