Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite

Prathyusha Jwalapuram; Shafiq Joty; Irina Temnikova; Preslav Nakov

arXiv:1909.00131 · cs.CL · September 4, 2019

TL;DR

This paper introduces a specialized test suite and evaluation measure for pronominal anaphora in machine translation, addressing the limitations of traditional metrics like BLEU in capturing discourse-level translation quality.

Contribution

It provides an extensive, multilingual test suite for pronoun translation errors and proposes an evaluation measure correlated with human judgments.

Findings

01

The test suite covers multiple source languages and real system errors.

02

The evaluation measure effectively differentiates good and bad pronoun translations.

03

A user study shows that the measure correlates strongly with human judgments.

Abstract

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation, covering multiple source languages and different pronoun errors drawn from real system translations, for English. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification