Modeling Coverage for Neural Machine Translation
Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, Hang Li

TL;DR
This paper adds a coverage mechanism to attention-based neural machine translation that tracks attention history, reducing over-translation and under-translation and improving both translation and alignment quality.
Contribution
It proposes a coverage-based NMT model that feeds a coverage vector into the attention mechanism so that the model pays more attention to untranslated source words.
Findings
Significant improvement in translation quality over standard attention-based NMT.
Improved alignment quality over standard attention-based NMT.
Reduction in over- and under-translation issues.
Abstract
The attention mechanism has enhanced state-of-the-art Neural Machine Translation (NMT) by jointly learning to align and translate. It tends to ignore past alignment information, however, which often leads to over-translation and under-translation. To address this problem, we propose coverage-based NMT in this paper. We maintain a coverage vector to keep track of the attention history. The coverage vector is fed to the attention model to help adjust future attention, which lets the NMT system pay more attention to untranslated source words. Experiments show that the proposed approach significantly improves both translation quality and alignment quality over standard attention-based NMT.
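The idea in the abstract, keeping a running record of attention and feeding it back into the next attention step, can be illustrated with a small sketch. This is not the authors' model: the function name `attend_with_coverage`, the scalar `penalty`, and the rule of subtracting accumulated attention from the raw scores are all assumptions made for illustration. A trained model would presumably learn how coverage influences attention rather than use a fixed penalty.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(match_scores, coverage, penalty=1.0):
    """One decoding step of coverage-aware attention (illustrative only).

    match_scores: raw relevance of each source word to the current decoder state
    coverage:     attention mass each source word has received so far
    penalty:      hypothetical fixed weight; an assumption, not from the paper
    """
    # Source words that already received attention get their score lowered.
    adjusted = [s - penalty * c for s, c in zip(match_scores, coverage)]
    weights = softmax(adjusted)
    # The coverage vector accumulates the attention history.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5]      # made-up scores for three source words
    coverage = [0.0, 0.0, 0.0]    # nothing has been attended to yet
    for step in range(3):
        weights, coverage = attend_with_coverage(scores, coverage)
        print(step, [round(w, 3) for w in weights])
```

With identical raw scores at every step, plain attention would keep favoring the first source word. Here its weight shrinks from step to step as its coverage grows, which is the behavior the abstract describes as shifting attention toward untranslated source words.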
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
