Unsupervised Quality Estimation for Neural Machine Translation

Marina Fomicheva; Shuo Sun; Lisa Yankovskaya; Frédéric Blain; Francisco Guzmán; Mark Fishel; Nikolaos Aletras; Vishrav Chaudhary; Lucia Specia

arXiv:2005.10608·cs.CL·July 21, 2020

TL;DR

This paper introduces an unsupervised method for estimating the quality of machine translation outputs by extracting information directly from the MT system, eliminating the need for annotated data and training.

Contribution

The paper presents an unsupervised QE approach that applies uncertainty quantification to information extracted from the MT system itself, rivalling state-of-the-art supervised QE models without additional resources.

Findings

01

Achieves very good correlation with human quality judgments, rivalling state-of-the-art supervised QE models.

02

Requires no training and no resources beyond the MT system itself.

03

Introduces the first dataset that enables work on both black-box and glass-box approaches to QE.

Abstract

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user about the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Different from most of the current work that treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By employing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivalling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box…
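To make the idea of "information extracted from the MT system as a by-product of translation" more concrete, here is a minimal Python sketch of three uncertainty signals that could serve as unsupervised quality indicators. It assumes the MT system exposes per-token log-probabilities and output distributions, and that decoding can be repeated with dropout left active. The function names and input format are hypothetical, and the abstract above does not state which indicators the authors actually use, so this should be read as an illustration rather than the paper's method.

```python
import math
from statistics import mean, pvariance


def mean_log_prob(token_log_probs):
    """Length-normalised log-probability of one translation.

    Values closer to 0 mean the model was more confident in its output.
    """
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return sum(token_log_probs) / len(token_log_probs)


def mean_entropy(token_distributions):
    """Average entropy (in nats) of the per-token output distributions.

    Each distribution is a list of probabilities summing to 1.
    Higher entropy suggests the model was less certain at that step.
    """
    if not token_distributions:
        raise ValueError("token_distributions must not be empty")
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0)
        for dist in token_distributions
    ]
    return mean(entropies)


def dropout_statistics(passes):
    """Mean and variance of sentence scores over stochastic forward passes.

    `passes` holds one list of token log-probabilities per pass, each
    assumed to come from scoring the same translation with dropout on.
    A large variance suggests the model's confidence is unstable.
    """
    scores = [mean_log_prob(p) for p in passes]
    return mean(scores), pvariance(scores)


if __name__ == "__main__":
    # Hypothetical values, not taken from any real MT system.
    print(mean_log_prob([-0.2, -0.9, -0.4]))
    print(mean_entropy([[0.7, 0.2, 0.1], [0.5, 0.5]]))
    print(dropout_statistics([[-0.2, -0.9], [-0.3, -1.1], [-0.2, -0.7]]))
```

Each function returns a plain float (or a pair of floats), so the scores can be correlated directly with human quality judgments without training a separate QE model.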
