Translationese in Machine Translation Evaluation

Yvette Graham; Barry Haddow; Philipp Koehn

arXiv:1906.09833·cs.CL·June 25, 2019·57 cites

TL;DR

This paper examines how translationese affects machine translation evaluation accuracy, highlights issues with past assessments, and offers guidelines and statistical analyses to improve future evaluation reliability.

Contribution

It provides a detailed analysis of translationese effects, re-evaluates past human-parity claims, and offers a comprehensive checklist for more reliable future MT evaluations.

Findings

01

Translationese can bias MT evaluation results.

02

Past human-parity claims may be unreliable due to statistical issues.

03

A checklist is proposed to improve future MT evaluation practices.

Abstract

The term translationese has been used to describe the presence of unusual features in translated text. In this paper, we provide a detailed analysis of the adverse effects of translationese on machine translation evaluation results. Our analysis shows evidence of differences between text originally written in a given language and translated text, and these differences can potentially reduce the accuracy of machine translation evaluations. For this reason we recommend that reverse-created test data be omitted from future machine translation test sets. In addition, we provide a re-evaluation of a past high-profile machine translation evaluation claiming human-parity of MT, as well as an analysis of the subsequent re-evaluations of it. We find potential ways of improving the reliability of all three past evaluations. One important issue not previously considered is the statistical power of…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification