A Measure of the System Dependence of Automated Metrics
Pius von Däniken, Jan Deriu, Mark Cieliebak

TL;DR
This paper argues that automated machine translation metrics should treat all systems fairly and consistently, and proposes a method to evaluate whether they do.
Contribution
The paper introduces a method for assessing the system dependence of automated metrics, that is, whether a metric treats all translation systems fairly and consistently.
Findings
Metrics vary in their treatment of different systems
The proposed method effectively evaluates system dependence
Evaluating system dependence supports fair comparison of translation systems
Abstract
Automated metrics for Machine Translation have made significant progress, with the goal of replacing expensive and time-consuming human evaluations. These metrics are typically assessed by their correlation with human judgments, which captures the monotonic relationship between human and metric scores. However, we argue that it is equally important to ensure that metrics treat all systems fairly and consistently. In this paper, we introduce a method to evaluate this aspect.
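The abstract's point that correlation only captures the monotonic relationship between human and metric scores can be illustrated with a small sketch. The code below is not the paper's proposed method: it is a toy example with made-up scores and hypothetical names (`human_a`, `metric_b`, and so on), assuming Spearman rank correlation computed separately for each system as the correlation measure. It shows two systems whose outputs receive identical human scores and perfect rank correlation with the metric, while the metric still scores one system consistently lower.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up scores for illustration only (not data from the paper).
human_a = [0.2, 0.4, 0.6, 0.8]       # human scores for outputs of system A
human_b = [0.2, 0.4, 0.6, 0.8]       # human scores for outputs of system B
metric_a = [0.30, 0.45, 0.60, 0.75]  # metric scores for system A
metric_b = [0.10, 0.25, 0.40, 0.55]  # same ordering, shifted down by 0.2

rho_a = spearman(human_a, metric_a)
rho_b = spearman(human_b, metric_b)
gap = mean(metric_a) - mean(metric_b)

print(f"Spearman A: {rho_a:.2f}, Spearman B: {rho_b:.2f}, metric gap: {gap:.2f}")
```

In this toy setting both systems are equally good according to the human scores and the metric ranks each system's outputs perfectly, yet a comparison of mean metric scores would favor system A. This is one possible reading of a metric treating systems inconsistently; how the paper itself quantifies system dependence is not stated in the abstract.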
Taxonomy
Topics: Simulation Techniques and Applications · Advanced Database Systems and Queries
