MetricX-24: The Google Submission to the WMT 2024 Metrics Shared Task

Juraj Juraska; Daniel Deutsch; Mara Finkelstein; Markus Freitag

arXiv:2410.03983·cs.CL·October 8, 2024

TL;DR

MetricX-24 is a hybrid reference-based/reference-free translation quality metric. It improves on MetricX-23 through two-stage training on DA and MQM ratings and through synthetic training examples that target common failure modes, and it shows a significant performance increase on the WMT23 MQM ratings.

Contribution

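The paper introduces MetricX-24, a hybrid metric that can score a translation whether it is given the source segment, the reference, or both. Its training data is augmented with synthetic examples to make it more robust to failure modes such as fluent but unrelated translation and undertranslation.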

Findings

01. Significant performance boost over MetricX-23 on WMT23 MQM ratings.

02. Effective handling of fluent but unrelated translations and undertranslation.

03. Ablation study confirms impact of individual modifications.

Abstract

In this paper, we present the MetricX-24 submissions to the WMT24 Metrics Shared Task and provide details on the improvements we made over the previous version of MetricX. Our primary submission is a hybrid reference-based/-free metric, which can score a translation irrespective of whether it is given the source segment, the reference, or both. The metric is trained on previous WMT data in a two-stage fashion, first on the DA ratings only, then on a mixture of MQM and DA ratings. The training set in both stages is augmented with synthetic examples that we created to make the metric more robust to several common failure modes, such as fluent but unrelated translation, or undertranslation. We demonstrate the benefits of the individual modifications via an ablation study, and show a significant performance increase over MetricX-23 on the WMT23 MQM ratings, as well as our new synthetic…
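To illustrate what "hybrid" means in the abstract, here is a minimal sketch of how a single scoring function could accept the source, the reference, or both. The function name `build_metric_input` and the `source:` / `candidate:` / `reference:` field labels are assumptions made for illustration; they are not taken from the paper or from the google-research/metricx repository.

```python
from typing import Optional


def build_metric_input(
    candidate: str,
    source: Optional[str] = None,
    reference: Optional[str] = None,
) -> str:
    """Assemble one text input for a hybrid metric.

    Hypothetical format: the field labels below are illustrative
    assumptions, not the format used by the actual MetricX-24 models.
    """
    # A hybrid metric needs at least one of the two context segments.
    if source is None and reference is None:
        raise ValueError("Provide the source, the reference, or both.")

    parts = []
    if source is not None:
        parts.append(f"source: {source}")
    # The candidate translation is always present.
    parts.append(f"candidate: {candidate}")
    if reference is not None:
        parts.append(f"reference: {reference}")
    return " ".join(parts)


# The same function covers all three input modes.
reference_free = build_metric_input("Hello world", source="Hallo Welt")
reference_based = build_metric_input("Hello world", reference="Hello, world")
both = build_metric_input(
    "Hello world", source="Hallo Welt", reference="Hello, world"
)
```

Under these assumptions, one model can be trained on a mix of all three input modes, so the same model can serve as a reference-based metric or a reference-free (quality estimation) metric at inference time.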

Code & Models

Repositories

google-research/metricx (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques
