A Measure of the System Dependence of Automated Metrics

Pius von Däniken; Jan Deriu; Mark Cieliebak

arXiv:2412.03152 · cs.CL · December 31, 2024


TL;DR

This paper argues that automated machine translation metrics should treat all systems fairly and consistently, and proposes a method to evaluate whether they do.

Contribution

It introduces a novel approach to assess the system dependence of automated metrics, focusing on fairness and consistency across systems.

Findings

01 Metrics vary in their treatment of different systems

02 The proposed method effectively evaluates system dependence

03 The method supports fair comparison of translation systems

Abstract

Automated metrics for Machine Translation have made significant progress, with the goal of replacing expensive and time-consuming human evaluations. These metrics are typically assessed by their correlation with human judgments, which captures the monotonic relationship between human and metric scores. However, we argue that it is equally important to ensure that metrics treat all systems fairly and consistently. In this paper, we introduce a method to evaluate this aspect.
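The contrast the abstract draws can be illustrated with a small sketch. This is not the paper's method; it only shows, on hypothetical toy scores, how a metric can be perfectly rank-correlated with human judgments within each system and still favor one system over another. The system names, the scores, and the helper functions `ranks`, `pearson`, and `spearman` are all assumptions made for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: humans rate both systems identically,
# but the metric scores system B consistently 0.2 lower.
human_a = [0.2, 0.4, 0.6, 0.8]
metric_a = [0.25, 0.45, 0.65, 0.85]
human_b = [0.2, 0.4, 0.6, 0.8]
metric_b = [0.05, 0.25, 0.45, 0.65]

rho_a = spearman(human_a, metric_a)
rho_b = spearman(human_b, metric_b)
rho_pooled = spearman(human_a + human_b, metric_a + metric_b)

human_gap = mean(human_a) - mean(human_b)
metric_gap = mean(metric_a) - mean(metric_b)

print(f"within-system Spearman: A={rho_a:.2f}, B={rho_b:.2f}")
print(f"pooled Spearman: {rho_pooled:.2f}")
print(f"human mean gap (A - B): {human_gap:.2f}")
print(f"metric mean gap (A - B): {metric_gap:.2f}")
```

In this toy setup the metric orders the segments of each system exactly as the humans do, yet it would rank system A above system B even though the human scores are identical. Correlation alone does not expose that kind of system-dependent offset.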


Taxonomy

Topics: Simulation Techniques and Applications · Advanced Database Systems and Queries