HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation

Shijie Zhang; Renhao Li; Songsheng Wang; Philipp Koehn; Min Yang; Derek F. Wong

arXiv:2505.16281 · cs.CL · September 17, 2025

TL;DR

HiMATE introduces a hierarchical multi-agent system built on the Multidimensional Quality Metrics (MQM) error typology for more accurate, human-aligned machine translation evaluation. It detects error spans and assesses their severity more effectively than existing methods.

Contribution

The paper presents a novel hierarchical multi-agent framework that exploits MQM's structure, incorporating self-reflection and discussion strategies to enhance translation error evaluation.
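The hierarchical idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a small illustrative subset of MQM categories and subtypes, and a hypothetical `toy_punctuation_agent` standing in for an LLM-backed subtype agent. The names `MQM_TYPOLOGY`, `ErrorSpan`, and `evaluate` are also invented for illustration, and the self-reflection and discussion strategies are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Illustrative subset of MQM categories and subtypes (an assumption,
# not the exact typology used in the paper).
MQM_TYPOLOGY: Dict[str, List[str]] = {
    "Accuracy": ["Addition", "Omission", "Mistranslation"],
    "Fluency": ["Grammar", "Spelling", "Punctuation"],
}


@dataclass(frozen=True)
class ErrorSpan:
    category: str
    subtype: str
    text: str
    severity: str  # e.g. "minor" or "major"


# A subtype agent maps (source, translation) to (span_text, severity) pairs.
SubtypeAgent = Callable[[str, str], List[Tuple[str, str]]]


def evaluate(source: str, translation: str,
             agents: Dict[str, SubtypeAgent]) -> List[ErrorSpan]:
    """Walk the typology top-down and collect spans from each subtype agent."""
    spans: List[ErrorSpan] = []
    for category, subtypes in MQM_TYPOLOGY.items():
        for subtype in subtypes:
            agent = agents.get(subtype)
            if agent is None:
                continue  # no agent registered for this subtype
            for text, severity in agent(source, translation):
                spans.append(ErrorSpan(category, subtype, text, severity))
    return spans


def toy_punctuation_agent(source: str, translation: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for an LLM agent: flags a missing final mark."""
    src, tgt = source.strip(), translation.strip()
    if src and src[-1] in ".?!" and not tgt.endswith(src[-1]):
        last_word = tgt.split()[-1] if tgt else ""
        return [(last_word, "minor")]
    return []


if __name__ == "__main__":
    result = evaluate("Das ist gut.", "This is good",
                      {"Punctuation": toy_punctuation_agent})
    print(result)
```

Keying agents by subtype means each one only has to judge a single narrow error type, which is the property the hierarchy is meant to provide. In a real system each callable would wrap a model call.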

Findings

01 Outperforms baselines in human-aligned evaluation tasks

02 Achieves 89% F1-score improvement in error span detection

03 Effectively assesses error severity with higher accuracy

Abstract

The advancement of Large Language Models (LLMs) enables flexible and interpretable automatic evaluations. In the field of machine translation evaluation, utilizing LLMs with translation error annotations based on Multidimensional Quality Metrics (MQM) yields more human-aligned judgments. However, current LLM-based evaluation methods still face challenges in accurately identifying error spans and assessing their severity. In this paper, we propose HiMATE, a Hierarchical Multi-Agent Framework for Machine Translation Evaluation. We argue that existing approaches inadequately exploit the fine-grained structural and semantic information within the MQM hierarchy. To address this, we develop a hierarchical multi-agent system grounded in the MQM error typology, enabling granular evaluation of subtype errors. Two key strategies are incorporated to further mitigate systemic hallucinations within…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification