Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation
Luke Zhang, Justin Vasselli, Aditya Khan, York Hay Ng, En-Shiun Annie Lee

TL;DR
This paper introduces Dynamic Meta-Metrics (DMM), a framework for machine translation evaluation that learns how to combine existing metrics conditioned on properties of the source sentence, rather than using one static ensemble or language-specific weighting.
Contribution
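The paper presents a source-conditioned approach to combining MT evaluation metrics, in two variants: hard conditioning, which fits an interpretable combiner per source cluster, and an exploratory soft-conditioned extension whose weights vary continuously with source-cluster responsibilities.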
Findings
MLP-based combinations outperform linear and Gaussian process ensembles.
Soft conditioning improves over linear models.
Results are reported on WMT Metrics Shared Task data across multiple language pairs, using pairwise agreement measures at the system and segment levels.
Abstract
We propose Dynamic Meta-Metrics (DMM), a framework for machine translation evaluation that learns source-sentence conditioned combinations of existing metrics. Rather than relying on a single static ensemble or language-specific weighting, DMM adapts the metric combination based on properties of the source segment. We study hard conditioning, which fits an interpretable combiner per cluster, and an exploratory soft-conditioned extension whose weights vary continuously with source-cluster responsibilities. We evaluate DMM on the WMT Metrics Shared Task data across multiple language pairs using pairwise agreement measures at the system and segment levels. Across settings, MLP-based combinations outperform linear and Gaussian process-based ensembles, and introducing soft conditioning yields gains over linear models.
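The difference between hard and soft conditioning can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the abstract does not say how source segments are clustered, how responsibilities are computed, or how the combiners are trained. The sketch assumes source embeddings with given centroids, responsibilities from a softmax over negative squared distances, and a least-squares linear combiner per cluster (the abstract reports that MLP-based combiners perform best). All function and variable names are hypothetical, and the data is random.

```python
import numpy as np


def responsibilities(src_emb, centroids, temperature=1.0):
    """Soft cluster memberships (assumed form): softmax over negative squared distances."""
    d2 = ((src_emb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def fit_hard(metric_scores, human, hard_labels, n_clusters):
    """Fit one least-squares linear combiner per cluster; returns a (K, M) weight matrix."""
    weights = np.zeros((n_clusters, metric_scores.shape[1]))
    for k in range(n_clusters):
        mask = hard_labels == k
        if mask.any():
            weights[k], *_ = np.linalg.lstsq(
                metric_scores[mask], human[mask], rcond=None
            )
    return weights


def score_hard(metric_scores, hard_labels, weights):
    """Hard conditioning: each segment uses only its own cluster's weights."""
    return (metric_scores * weights[hard_labels]).sum(axis=1)


def score_soft(metric_scores, resp, weights):
    """Soft conditioning: per-segment weights are a responsibility-weighted mix."""
    per_segment_w = resp @ weights  # shape (N, M)
    return (metric_scores * per_segment_w).sum(axis=1)


# Random placeholder data: 200 segments, 4-dim source embeddings, 3 base metrics.
rng = np.random.default_rng(0)
src_emb = rng.normal(size=(200, 4))
metric_scores = rng.normal(size=(200, 3))
human = rng.normal(size=200)
centroids = src_emb[:2].copy()  # placeholder; a clustering step would normally supply these

resp = responsibilities(src_emb, centroids)
hard_labels = resp.argmax(axis=1)
weights = fit_hard(metric_scores, human, hard_labels, n_clusters=2)
hard = score_hard(metric_scores, hard_labels, weights)
soft = score_soft(metric_scores, resp, weights)
```

Under these assumptions, hard conditioning is the limiting case of soft conditioning: as `temperature` approaches zero the responsibilities become one-hot and `score_soft` reproduces `score_hard`.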
