Document Context Neural Machine Translation with Memory Networks
Sameen Maruf, Gholamreza Haffari

TL;DR
This paper introduces a document-level neural machine translation model that uses memory networks to incorporate both source-side and target-side document context. The authors report statistically significant improvements in translation quality.
Contribution
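The paper casts document translation as a structured prediction problem and proposes a neural translation model with two memory components, one for source context and one for target context. The model is trained end-to-end and decoded with an iterative algorithm based on block coordinate descent.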
Findings
The model outperforms previous methods on BLEU and METEOR.
It exploits both source and target document context to improve translation.
Improvements hold for translation into English from French, German, and Estonian.
Abstract
We present a document-level neural machine translation model which takes both source and target document context into account using memory networks. We model the problem as a structured prediction problem with interdependencies among the observed and hidden variables, i.e., the source sentences and their unobserved target translations in the document. The resulting structured prediction problem is tackled with a neural translation model equipped with two memory components, one each for the source and target side, to capture the documental interdependencies. We train the model end-to-end, and propose an iterative decoding algorithm based on block coordinate descent. Experimental results of English translations from French, German, and Estonian documents show that our model is effective in exploiting both source and target document context, and statistically significantly outperforms the…
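The iterative decoding idea in the abstract can be illustrated with a minimal sketch. It assumes that each sentence's translation is one "block", that a first pass translates every sentence without document context, and that later passes re-translate one sentence at a time while the other sentences' current drafts are held fixed. The function names, the pass limit, and the toy translators below are hypothetical stand-ins for illustration only; they are not the authors' model or code.

```python
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not taken from the paper):
#   ContextFreeFn: translate one source sentence in isolation.
#   ContextualFn:  translate sentence i given all source sentences and the
#                  current target drafts for the whole document.
ContextFreeFn = Callable[[str], str]
ContextualFn = Callable[[int, Sequence[str], Sequence[str]], str]


def iterative_decode(
    sources: Sequence[str],
    translate_without_context: ContextFreeFn,
    translate_with_context: ContextualFn,
    max_passes: int = 3,
) -> List[str]:
    """Coordinate-descent-style decoding over the sentences of a document."""
    # Initial drafts: every sentence is translated independently.
    targets = [translate_without_context(s) for s in sources]
    for _ in range(max_passes):
        changed = False
        # Update one block (sentence) at a time, keeping the others fixed.
        for i in range(len(sources)):
            candidate = translate_with_context(i, sources, targets)
            if candidate != targets[i]:
                targets[i] = candidate
                changed = True
        if not changed:  # No draft changed in a full pass: stop early.
            break
    return targets


# Toy stand-ins for real translation models (purely illustrative).
def toy_context_free(sentence: str) -> str:
    return sentence.upper()


def toy_contextual(i: int, sources: Sequence[str], targets: Sequence[str]) -> str:
    # Mark a sentence once the previous draft is marked, so that information
    # propagates through the document via the target-side drafts.
    marked = i == 0 or targets[i - 1].endswith("!")
    return sources[i].upper() + ("!" if marked else "")


result = iterative_decode(["a", "b", "c"], toy_context_free, toy_contextual)
```

In this toy run the second pass changes nothing, so the loop stops before reaching the pass limit. A real system would replace the two toy functions with a sentence-level decoder and a context-aware decoder that reads from the source and target memories.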
