MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators

Qingyu Lu; Liang Ding; Kanjian Zhang; Jinxia Zhang; Dacheng Tao

arXiv:2409.14335 · cs.CL · December 17, 2024

TL;DR

This paper introduces MQM-APE, a universal, training-free framework that improves the error annotations produced by LLM-based MT evaluators. Each predicted error is used to automatically post-edit the translation, and errors whose edits do not improve quality are filtered out as non-impactful, which makes the remaining annotations more interpretable and reliable.

Contribution

MQM-APE is a training-free approach that refines LLM error annotations by keeping only the errors whose automatic post-edits improve the translation, improving evaluation accuracy and interpretability.

Findings

01. Improves error span reliability and quality across eight LLMs.

02. Enhances evaluation consistency in high- and low-resource languages.

03. Complements existing translation evaluators like Tower.

Abstract

Large Language Models (LLMs) have shown significant potential as judges for Machine Translation (MT) quality assessment, providing both scores and fine-grained feedback. Although approaches such as GEMBA-MQM have shown state-of-the-art performance on reference-free evaluation, the predicted errors do not align well with those annotated by humans, limiting their interpretability as feedback signals. To enhance the quality of error annotations predicted by LLM evaluators, we introduce a universal and training-free framework, MQM-APE, based on the idea of filtering out non-impactful errors by Automatically Post-Editing (APE) the original translation based on each error, leaving only those errors that contribute to quality improvement. Specifically, we prompt the LLM to act as 1) evaluator to provide error annotations, 2) post-editor to determine whether…
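
The filtering idea in the abstract can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ErrorAnnotation`, `filter_impactful_errors`, and the three callables are hypothetical, the roles would be prompted LLM calls in practice, and because the abstract is truncated here, the way an improvement is judged is left as a pluggable `improves` function rather than guessed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ErrorAnnotation:
    """One predicted error. Field names are illustrative assumptions."""
    span: str
    category: str
    severity: str


# Hypothetical role signatures. In the framework each role is a prompted LLM;
# plain callables stand in for them here.
Evaluator = Callable[[str, str], List[ErrorAnnotation]]
PostEditor = Callable[[str, str, ErrorAnnotation], str]
ImprovementCheck = Callable[[str, str, str], bool]


def filter_impactful_errors(
    source: str,
    translation: str,
    evaluate: Evaluator,
    post_edit: PostEditor,
    improves: ImprovementCheck,
) -> List[ErrorAnnotation]:
    """Keep only the errors whose post-edit improves the translation."""
    kept = []
    for error in evaluate(source, translation):
        # Post-edit the original translation based on this single error.
        edited = post_edit(source, translation, error)
        # Discard the error if fixing it does not improve quality.
        if improves(source, translation, edited):
            kept.append(error)
    return kept


# Toy stand-ins so the sketch runs without any model (assumptions only).
def toy_evaluate(source: str, translation: str) -> List[ErrorAnnotation]:
    return [
        ErrorAnnotation("mornign", "spelling", "minor"),
        ErrorAnnotation("Good", "style", "minor"),
    ]


def toy_post_edit(source: str, translation: str, error: ErrorAnnotation) -> str:
    # Only the spelling error leads to an actual change in this toy example.
    if error.category == "spelling":
        return translation.replace(error.span, "morning")
    return translation


def toy_improves(source: str, original: str, edited: str) -> bool:
    # Crude proxy: an edit that changes nothing cannot improve anything.
    return edited != original
```

Calling `filter_impactful_errors("Guten Morgen", "Good mornign", toy_evaluate, toy_post_edit, toy_improves)` keeps the spelling annotation and drops the style one, because the style "fix" leaves the translation unchanged. Passing the roles in as callables keeps the loop independent of any particular LLM or prompt.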

Code & Models

Repositories

coldmist-lu/mqm_ape (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: ALIGN