Context-Aware Machine Translation with Source Coreference Explanation

Huy Hien Vu; Hidetaka Kamigaito; Taro Watanabe

arXiv:2404.19505 · cs.CL · May 1, 2024 · 1 citation

TL;DR

This paper introduces a context-aware machine translation model that explains translation decisions using coreference features, improving translation quality by addressing the explain-away effect in long contexts.

Contribution

It proposes a novel method that predicts coreference features to enhance context utilization in machine translation models, addressing a key limitation of existing approaches.

Findings

1. Improves BLEU by more than 1.0 points on multiple datasets.

2. Uses coreference features effectively for better context understanding.

3. Remains robust across different language pairs and datasets.

Abstract

Despite significant improvements in translation quality, context-aware machine translation (MT) models still underperform in many cases. One of the main reasons is that they fail to use the correct features from the context when the context is too long or the model is overly complex. This can lead to the explain-away effect, in which a model considers only the features that most easily explain its predictions, resulting in inaccurate translations. To address this issue, we propose a model that explains the decisions made for translation by predicting coreference features in the input. We construct a model for input coreference by exploiting contextual features from both the input and translation output representations on top of an existing MT model. We evaluate and analyze our method on the WMT document-level translation task using the English-German dataset, the English-Russian dataset, and…
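
The abstract describes a coreference predictor built on top of an existing MT model, using both input and translation output representations. The sketch below illustrates that idea in plain Python. It is an assumption-laden toy, not the authors' implementation: the names `fuse`, `coref_loss`, and `joint_loss`, the dot-product attention, the antecedent-ranking loss, and the weighted-sum objective with `weight` are all hypothetical, and the small vectors stand in for real encoder and decoder states.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(src_states, tgt_states):
    """Add to each source state an attention-weighted summary of target states.

    Assumption: simple dot-product attention stands in for whatever fusion
    the real model uses.
    """
    fused = []
    for s in src_states:
        weights = softmax([dot(s, t) for t in tgt_states])
        summary = [
            sum(w * t[i] for w, t in zip(weights, tgt_states))
            for i in range(len(s))
        ]
        fused.append([a + b for a, b in zip(s, summary)])
    return fused


def coref_loss(fused, gold_antecedent):
    """Mean negative log-likelihood of the gold antecedent of each mention.

    gold_antecedent maps a mention index i to the index j < i of its
    antecedent (hypothetical annotation format).
    """
    losses = []
    for i, j in gold_antecedent.items():
        probs = softmax([dot(fused[i], fused[k]) for k in range(i)])
        losses.append(-math.log(probs[j]))
    return sum(losses) / len(losses)


def joint_loss(mt_loss, coref, weight=0.5):
    """Assumed objective: translation loss plus a weighted coreference loss."""
    return mt_loss + weight * coref


# Toy source (input) and target (translation output) states, dimension 2.
src_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
tgt_states = [[0.5, 0.5], [1.0, 0.0]]
fused = fuse(src_states, tgt_states)

# Token 2 is a mention whose gold antecedent is token 0.
c_loss = coref_loss(fused, {2: 0})
total = joint_loss(mt_loss=2.0, coref=c_loss)
```

In this toy setup the coreference term acts as an auxiliary signal: a model that predicts the right antecedent from the fused states is pushed to keep the relevant context features, which is the intuition the abstract gives for countering the explain-away effect.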

Code & Models

Repositories

hienvuhuy/transcoref
PyTorch · Official

Videos

Context-Aware Machine Translation with Source Coreference Explanation · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling