Uncertainty-Aware Machine Translation Evaluation

Taisiya Glushkova; Chrysoula Zerva; Ricardo Rei; André F. T. Martins

arXiv:2109.06352 · cs.CL · March 28, 2022

TL;DR

This paper introduces uncertainty-aware machine translation evaluation: quality scores are reported together with confidence intervals, which makes the predictions more trustworthy and helps flag potentially critical translation errors.

Contribution

The paper combines the COMET framework with two uncertainty estimation methods, Monte Carlo dropout and deep ensembles, so that each quality score comes with a confidence interval.
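As a rough illustration of the Monte Carlo dropout idea named above, the sketch below keeps dropout active at inference time and runs several stochastic forward passes, so that the spread of the outputs can serve as an uncertainty estimate. The tiny one-hidden-layer regressor, its random weights, and the names `mc_dropout_scores`, `features`, `w_hidden`, and `w_out` are all hypothetical stand-ins; this is not the COMET model or its API.

```python
import math
import random
import statistics


def mc_dropout_scores(features, w_hidden, w_out, p_drop=0.1, n_passes=100, seed=0):
    """Run n_passes stochastic forward passes of a toy regressor.

    Dropout stays active at inference: each hidden unit is dropped with
    probability p_drop and the surviving units are rescaled by 1 / (1 - p_drop).
    """
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    scores = []
    for _ in range(n_passes):
        score = 0.0
        for row, w in zip(w_hidden, w_out):
            hidden = math.tanh(sum(a * b for a, b in zip(row, features)))
            if rng.random() >= p_drop:  # this unit survives the dropout mask
                score += w * hidden / keep
        scores.append(score)
    return scores


# Hypothetical toy inputs: 8 features, 16 hidden units, random weights.
init = random.Random(42)
features = [init.gauss(0, 1) for _ in range(8)]
w_hidden = [[init.gauss(0, 1) for _ in range(8)] for _ in range(16)]
w_out = [init.gauss(0, 1) for _ in range(16)]

scores = mc_dropout_scores(features, w_hidden, w_out)
print("mean score:", statistics.mean(scores))
print("std (uncertainty):", statistics.stdev(scores))
```

The mean over passes plays the role of the point estimate, and the standard deviation is the quantity an uncertainty-aware metric would report alongside it. A deep ensemble would follow the same pattern, with the samples coming from independently trained models instead of dropout masks.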

Findings

01. Uncertainty-aware metrics outperform point estimates in reliability.

02. Confidence intervals help identify potentially critical translation errors.

03. Method is effective across multiple language pairs and datasets.

Abstract

Several neural-based metrics have been recently proposed to evaluate machine translation quality. However, all of them resort to point estimates, which provide limited information at segment level. This is made worse as they are trained on noisy, biased and scarce human judgements, often resulting in unreliable quality predictions. In this paper, we introduce uncertainty-aware MT evaluation and analyze the trustworthiness of the predicted quality. We combine the COMET framework with two uncertainty estimation methods, Monte Carlo dropout and deep ensembles, to obtain quality scores along with confidence intervals. We compare the performance of our uncertainty-aware MT evaluation methods across multiple language pairs from the QT21 dataset and the WMT20 metrics task, augmented with MQM annotations. We experiment with varying numbers of references and further discuss the usefulness of…
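To make "quality scores along with confidence intervals" concrete, here is a minimal sketch of one way to turn a set of sampled scores for a segment (from dropout passes or ensemble members) into an interval and to flag segments that may deserve a closer look. It assumes the samples are summarized by a Gaussian with their mean and standard deviation; the function names, the example scores, and the threshold of 0.3 are illustrative assumptions, not values from the paper.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(scores, level=0.95):
    """Gaussian interval around the mean of the sampled quality scores."""
    mu, sigma = mean(scores), stdev(scores)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    return mu - z * sigma, mu + z * sigma


def flag_uncertain(segment_scores, threshold, level=0.95):
    """Indices of segments whose interval lower bound falls below threshold."""
    return [
        i
        for i, scores in enumerate(segment_scores)
        if confidence_interval(scores, level)[0] < threshold
    ]


# Hypothetical samples: five ensemble members scoring two segments.
segment_scores = [
    [0.80, 0.82, 0.81, 0.79, 0.80],  # members agree: narrow interval
    [0.60, 0.20, 0.75, 0.10, 0.55],  # members disagree: wide interval
]

for scores in segment_scores:
    low, high = confidence_interval(scores)
    print(f"mean={mean(scores):.3f}  interval=[{low:.3f}, {high:.3f}]")

print("flagged segments:", flag_uncertain(segment_scores, threshold=0.3))
```

In this toy example only the second segment is flagged: its mean score is moderate, but the disagreement between members makes a very low true quality plausible, which a point estimate alone would not reveal.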

Taxonomy

Methods: Dropout · Monte Carlo Dropout