AI-Assisted Human Evaluation of Machine Translation

Vilém Zouhar; Tom Kocmi; Mrinmaya Sachan

arXiv:2406.12419·cs.CL·January 30, 2025

TL;DR

This paper introduces an AI-assisted annotation protocol for machine translation evaluation that pre-fills error spans for annotators. The protocol halves the time per error-span annotation and reduces costs, while maintaining annotation quality and minimizing bias.

Contribution

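The paper presents ESA$^{\mathrm{AI}}$, an AI-assisted variant of the Error Span Annotation (ESA) protocol in which recall-oriented automatic quality estimation pre-fills error spans, making human evaluation of machine translation quality faster and cheaper.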

Findings

01

AI assistance halves the time per error-span annotation (71s to 31s per error span)

02

Pre-filled error spans accurately prime annotators before they assign the final score, with annotation quality kept at the same level

03

Filtering reduces annotation budget by nearly 25%

Abstract

Annually, research teams spend large amounts of money to evaluate the quality of machine translation systems (WMT, inter alia). This is expensive because it requires a lot of expert human labor. In the recently adopted annotation protocol, Error Span Annotation (ESA), annotators mark erroneous parts of the translation and then assign a final score. A lot of the annotator time is spent on scanning the translation for possible errors. In our work, we help the annotators by pre-filling the error annotations with recall-oriented automatic quality estimation. With this AI assistance, we obtain annotations at the same quality level while cutting down the time per span annotation by half (71s/error span $\to$ 31s/error span). The biggest advantage of the ESA$^{\mathrm{AI}}$ protocol is an accurate priming of annotators (pre-filled error spans) before they assign the final score. This…
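
The pre-filling and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Span` and `Segment` types, the `confidence` field, the `min_confidence` threshold, and the rule "skip a segment when no span is proposed" are all hypothetical choices made for illustration. The only numbers taken from the abstract are the 71s and 31s per error span.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    start: int         # character offset into the translation, inclusive
    end: int           # character offset into the translation, exclusive
    confidence: float  # hypothetical QE confidence that the span is an error


@dataclass
class Segment:
    source: str
    translation: str
    qe_spans: list = field(default_factory=list)  # spans proposed by a QE system


def prefill(segment, min_confidence=0.2):
    """Keep every proposed span above a deliberately low threshold.

    A low threshold favours recall: annotators can delete a wrong span
    faster than they can find a missed error (assumption of this sketch).
    """
    return [s for s in segment.qe_spans if s.confidence >= min_confidence]


def needs_human(segment, min_confidence=0.2):
    """Hypothetical filter: skip segments with no pre-filled span at all."""
    return len(prefill(segment, min_confidence)) > 0


segments = [
    Segment("Guten Morgen", "Good morning"),  # QE proposes nothing
    Segment("Ich habe Hunger", "I have hungry", [Span(2, 13, 0.9)]),
]

# Annotation queue: each remaining segment is paired with its pre-filled spans.
queue = [(seg, prefill(seg)) for seg in segments if needs_human(seg)]

# Relative time saving per error span, using the figures from the abstract.
relative_saving = 1 - 31 / 71
print(f"{len(queue)} segment(s) to annotate, saving {relative_saving:.0%} per span")
```

In this toy setup the first segment is filtered out and the second reaches the annotator with one span already marked; how the actual protocol decides which segments to skip may differ.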

Code & Models

Repositories

wmt-conference/ErrorSpanAnnotation (Official)

Videos

AI-Assisted Human Evaluation of Machine Translation · underline

Taxonomy

Topics: Natural Language Processing Techniques