Improving Context-aware Neural Machine Translation with Target-side Context
Hayahide Yamagishi, Mamoru Komachi

TL;DR
This paper proposes a weight sharing approach for incorporating target-side context into neural machine translation. It shows that this context is useful when supplied as decoder states, contrary to earlier conclusions that it helps less than source-side context.
Contribution
The study introduces a weight sharing method for context-aware NMT: the model saves the decoder states produced for previous sentences and computes an attention vector over them while translating the current sentence.
Findings
Target-side context improves NMT when integrated as decoder states.
The proposed method outperforms previous models that used target-side context.
Target-side context is more useful than previously thought when modeled properly.
Abstract
In recent years, several studies on neural machine translation (NMT) have attempted to use document-level context by using a multi-encoder and two attention mechanisms to read the current and previous sentences to incorporate the context of the previous sentences. These studies concluded that the target-side context is less useful than the source-side context. However, we considered that the reason why the target-side context is less useful lies in the architecture used to model these contexts. Therefore, in this study, we investigate how the target-side context can improve context-aware neural machine translation. We propose a weight sharing method wherein NMT saves decoder states and calculates an attention vector using the saved states when translating a current sentence. Our experiments show that the target-side context is also useful if we plug it into NMT as the decoder state…
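The mechanism described in the abstract (save decoder states, then attend over them while translating the current sentence) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions made for illustration, and the weight sharing itself is not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, saved_states):
    """Attend over saved decoder states (assumed dot-product scoring).

    query        -- current decoder state, a list of floats
    saved_states -- decoder states kept from the previous sentence
    Returns (context_vector, attention_weights).
    """
    # Score each saved state against the current decoder state.
    scores = [sum(q * h for q, h in zip(query, state)) for state in saved_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the saved states.
    dim = len(query)
    context = [
        sum(w * state[i] for w, state in zip(weights, saved_states))
        for i in range(dim)
    ]
    return context, weights


# Hypothetical states saved while decoding the previous sentence.
prev_decoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical decoder state at the current step of the current sentence.
current_state = [2.0, 0.0]

context, weights = attend(current_state, prev_decoder_states)
```

In a full model the resulting context vector would be combined with the usual source-side attention before predicting the next target word; how that combination is done is not specified in the text above.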
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
