Does Neural Machine Translation Benefit from Larger Context?
Sebastien Jean, Stanislas Lauly, Orhan Firat, Kyunghyun Cho

TL;DR
This paper introduces a neural machine translation model that incorporates surrounding context. It improves translation quality and pronoun prediction on small datasets, but the gains largely disappear on larger corpora.
Contribution
It presents a context-aware neural machine translation architecture and shows that it helps most when training data is small. It also finds that attention-based NMT is well suited to pronoun prediction.
Findings
Improved translation quality on small datasets with context modeling
Enhanced pronoun prediction accuracy using attention-based NMT
Gains from context largely disappear on large datasets
Abstract
We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence. These models lead to better performance, both in terms of general translation quality and pronoun prediction, when trained on small corpora, although this improvement largely disappears when trained with a larger corpus. We also discover that attention-based neural machine translation is well suited for pronoun prediction and compares favorably with other approaches that were specifically designed for this task.
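To make the idea of modeling the surrounding text alongside the source sentence concrete, here is a minimal sketch of one decoding step that attends separately to the source sentence and to a context sentence. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the attention form or how the two summaries are combined, so dot-product attention, concatenation, and all function names here (`attend`, `context_aware_step`) are hypothetical choices.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, annotations):
    # Dot-product attention (an assumption; the abstract does not
    # specify the scoring function). Returns the attention weights and
    # the weighted sum of the annotation vectors.
    scores = [sum(q * a for q, a in zip(query, ann)) for ann in annotations]
    weights = softmax(scores)
    dim = len(annotations[0])
    summary = [sum(w * ann[i] for w, ann in zip(weights, annotations))
               for i in range(dim)]
    return weights, summary

def context_aware_step(decoder_state, source_annotations, context_annotations):
    # Attend to the source sentence and to the surrounding text
    # independently, then concatenate the two summaries (hypothetical
    # combination rule chosen for illustration).
    src_weights, src_vec = attend(decoder_state, source_annotations)
    ctx_weights, ctx_vec = attend(decoder_state, context_annotations)
    return src_vec + ctx_vec, src_weights, ctx_weights

# Toy inputs: 2-dimensional annotations for a 2-token source sentence
# and a 3-token context sentence.
state = [1.0, 0.0]
source = [[1.0, 0.0], [0.0, 1.0]]
context = [[0.5, 0.5], [2.0, 0.0], [0.0, 0.0]]
combined, src_w, ctx_w = context_aware_step(state, source, context)
```

In this toy setup, `combined` has twice the annotation dimension, and each weight list sums to one. A real model would learn the annotations with encoders and feed `combined` into the decoder's next-word prediction.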
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
