Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort

Vânia Mendonça (1, 2), Ricardo Rei (1, 2, 3), Luisa Coheur (1, 2), Alberto Sardinha (1, 2), Ana Lúcia Santos (4, 5) ((1) INESC-ID Lisboa, (2) Instituto Superior Técnico, (3) Unbabel AI, (4) Centro de Linguística da Universidade de Lisboa, (5) Faculdade de Letras da Universidade de Lisboa)

arXiv:2105.13385·cs.CL·May 31, 2021

TL;DR

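This paper applies online learning to an ensemble of machine translation systems so that the best systems can be identified from whatever human feedback is available, reducing the human evaluation effort required. On WMT'19 data, the approach quickly converges to the top-3 ranked systems for the language pairs considered.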

Contribution

It presents a novel online learning framework that leverages limited human feedback to rapidly find top-performing translation systems among an ensemble.

Findings

01 Quickly converges to the top-3 ranked systems for the WMT'19 language pairs considered

02 Reduces the human evaluation effort needed to compare multiple systems

03 Remains effective even when human feedback is missing for many translations

Abstract

In Machine Translation, assessing the quality of a large amount of automatic translations can be challenging. Automatic metrics are not reliable when it comes to high performing systems. In addition, resorting to human evaluators can be expensive, especially when evaluating multiple systems. To overcome the latter challenge, we propose a novel application of online learning that, given an ensemble of Machine Translation systems, dynamically converges to the best systems, by taking advantage of the human feedback available. Our experiments on WMT'19 datasets show that our online approach quickly converges to the top-3 ranked systems for the language pairs considered, despite the lack of human feedback for many translations.
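The idea of converging to the best systems from partial human feedback can be illustrated with a minimal sketch. The abstract does not name the online learning algorithm, so the code below assumes a generic multiplicative-weights (exponentially weighted) update; the system names, the scores, the learning rate `eta`, and the helpers `update_weights` and `rank_systems` are all hypothetical and are not taken from the paper or its repository.

```python
import math


def update_weights(weights, scores, eta=0.5):
    """Apply one multiplicative-weights step (an assumed update rule).

    weights: dict mapping system name -> current weight
    scores:  dict mapping system name -> human score in [0, 1],
             or None when no human feedback exists for that translation
    """
    updated = {}
    for system, weight in weights.items():
        score = scores.get(system)
        if score is None:
            # No human feedback for this translation: keep the weight as is.
            updated[system] = weight
        else:
            loss = 1.0 - score  # higher human score means lower loss
            updated[system] = weight * math.exp(-eta * loss)
    total = sum(updated.values())
    # Renormalise so the weights remain a probability distribution.
    return {system: w / total for system, w in updated.items()}


def rank_systems(weights, k=3):
    """Return the k systems with the largest weights, best first."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Toy data (hypothetical): four systems, three source segments,
# with some human scores missing.
systems = ["sys_a", "sys_b", "sys_c", "sys_d"]
weights = {s: 1.0 / len(systems) for s in systems}
feedback = [
    {"sys_a": 0.9, "sys_b": 0.4, "sys_c": None, "sys_d": 0.2},
    {"sys_a": 0.8, "sys_b": None, "sys_c": 0.7, "sys_d": 0.1},
    {"sys_a": None, "sys_b": 0.5, "sys_c": 0.6, "sys_d": 0.3},
]

for scores in feedback:
    weights = update_weights(weights, scores)

top3 = rank_systems(weights, k=3)
print(top3)
```

In this sketch, a system whose translation received no human score keeps its weight for that round. This is only one possible way to handle missing feedback, and it mildly favours systems that are scored less often, since they accumulate less loss.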

Code & Models

Repositories

vania-mendonca/MTOL (official)
