Does Neural Machine Translation Benefit from Larger Context?

Sebastien Jean; Stanislas Lauly; Orhan Firat; Kyunghyun Cho

arXiv:1704.05135 · stat.ML · April 19, 2017 · 128 cites

Open Access

TL;DR

This paper introduces a neural machine translation model that conditions on the surrounding text as well as the source sentence. It improves translation quality and pronoun prediction when trained on small corpora; the gain largely disappears with a larger training corpus.

Contribution

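The paper presents a context-aware neural machine translation architecture that models surrounding text alongside the source sentence, shows that it helps most when training data is limited, and finds that attention-based NMT is well suited to pronoun prediction, comparing favorably with approaches designed specifically for that task.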

Findings

01 Improved translation quality on small datasets with context modeling
02 Enhanced pronoun prediction accuracy using attention-based NMT
03 Diminishing benefits of context on large datasets

Abstract

We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence. These models lead to better performance, both in terms of general translation quality and pronoun prediction, when trained on small corpora, although this improvement largely disappears when trained with a larger corpus. We also discover that attention-based neural machine translation is well suited for pronoun prediction and compares favorably with other approaches that were specifically designed for this task.
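
The abstract does not spell out how the surrounding text enters the model, so the following is only a minimal sketch of one plausible reading: the decoder attends separately over the source sentence and over a context sentence, and the two summaries are concatenated. It uses plain dot-product attention as a simplification, and the names `attend` and `context_aware_summary` are hypothetical. This is not the paper's exact formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention (a simplification chosen for this sketch).

    Returns the attention-weighted average of `states` and the weights.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    summary = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return summary, weights


def context_aware_summary(decoder_state, source_states, context_states):
    """Hypothetical helper: attend over the source sentence and over the
    surrounding text separately, then concatenate the two summaries."""
    source_summary, source_weights = attend(decoder_state, source_states)
    context_summary, context_weights = attend(decoder_state, context_states)
    return source_summary + context_summary, source_weights, context_weights
```

In a sketch like this, the concatenated vector would feed the decoder's next-word prediction in place of the source-only summary, so a pronoun's antecedent in the previous sentence can influence the output through the second attention distribution.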

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications