Modeling Coverage for Neural Machine Translation

Zhaopeng Tu; Zhengdong Lu; Yang Liu; Xiaohua Liu; Hang Li

arXiv:1601.04811 · cs.CL · August 9, 2016 · 160 cites

TL;DR

This paper introduces a coverage mechanism in neural machine translation to track attention history, reducing translation errors and improving both translation and alignment quality.

Contribution

It proposes a novel coverage-based NMT model that incorporates a coverage vector into the attention mechanism to better handle untranslated words.

Findings

01 Translation quality improves significantly over standard attention-based NMT.

02 Alignment quality improves as well.

03 Over-translation and under-translation are reduced.

Abstract

The attention mechanism has advanced state-of-the-art Neural Machine Translation (NMT) by jointly learning to align and translate. However, it tends to ignore past alignment information, which often leads to over-translation (some source words are translated repeatedly) and under-translation (some source words are left untranslated). To address this problem, we propose coverage-based NMT. We maintain a coverage vector to keep track of the attention history. The coverage vector is fed to the attention model to help adjust future attention, which lets the NMT system pay more attention to untranslated source words. Experiments show that the proposed approach significantly improves both translation quality and alignment quality over standard attention-based NMT.
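The coverage idea in the abstract can be illustrated with a toy sketch. This is not the paper's model: it assumes scalar attention scores, a coverage vector that simply accumulates past attention weights, and a fixed hypothetical `penalty` weight in place of learned parameters. The function name `attend_with_coverage` and all values are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(base_scores, coverage, penalty=1.0):
    """One decoding step of coverage-adjusted attention (simplified).

    base_scores: raw attention score per source position.
    coverage: accumulated attention each source position has received so far.
    penalty: hypothetical fixed weight; a real model would learn this.
    """
    # Lower the score of source positions that were already attended to.
    adjusted = [s - penalty * c for s, c in zip(base_scores, coverage)]
    weights = softmax(adjusted)
    # Update the coverage vector with this step's attention weights.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


# Toy run: three source words, identical raw scores at every step.
base_scores = [2.0, 1.0, 0.5]
coverage = [0.0, 0.0, 0.0]
history = []
for _ in range(3):
    weights, coverage = attend_with_coverage(base_scores, coverage, penalty=4.0)
    history.append(weights)
```

Without the coverage term, every step in this toy run would attend to the first source word in the same way. With it, attention on that word drops after the first step and shifts toward the positions that have received less attention, which is the behavior the abstract describes.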

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification