Unbabel's Participation in the WMT20 Metrics Shared Task

Ricardo Rei; Craig Stewart; Catarina Farinha; Alon Lavie

arXiv:2010.15535·cs.CL·October 30, 2020·26 cites

TL;DR

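Unbabel's team built on the COMET framework to train machine translation evaluation models for the segment-level, document-level, system-level and 'QE as a Metric' tracks of the WMT20 Metrics Shared Task. On the previous year's test sets, the models achieve strong results for all language pairs and in many cases set a new state of the art.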

Contribution

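The paper presents several COMET estimator models trained to regress on different human-generated quality scores, a novel ranking model trained on relative ranks obtained from Direct Assessments, and a simple technique for converting segment-level predictions into a document-level score.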
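To make the idea of training on relative ranks concrete, here is a minimal sketch of one common ranking objective, a triplet margin loss over embeddings. This page does not state which loss the ranking model uses, so the choice of loss, the function names (`euclidean`, `triplet_margin_loss`), the `margin` value and the toy vectors are all assumptions for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, better, worse, margin=1.0):
    """Hypothetical ranking objective (an assumption, not taken from the paper page).

    The loss is zero once the hypothesis that humans ranked higher ("better")
    is closer to the anchor than the lower-ranked one ("worse") by at least
    `margin`; otherwise it grows with the size of the violation.
    """
    return max(0.0, euclidean(anchor, better) - euclidean(anchor, worse) + margin)


# Toy embeddings: the anchor could stand for a reference or source embedding.
anchor = [0.0, 0.0]
better = [0.0, 1.0]   # distance 1 from the anchor
worse = [3.0, 4.0]    # distance 5 from the anchor

loss_correct_order = triplet_margin_loss(anchor, better, worse)
loss_wrong_order = triplet_margin_loss(anchor, worse, better)
```

With these toy vectors the correctly ordered triplet incurs no loss, while swapping the two hypotheses is penalised, which is the behaviour a model trained on relative ranks is meant to learn.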

Findings

01

Achieved strong results for all language pairs on the previous year's test sets

02

Set a new state of the art in many cases

03

Demonstrated effectiveness of COMET-based models

Abstract

We present the contribution of the Unbabel team to the WMT 2020 Shared Task on Metrics. We intend to participate on the segment-level, document-level and system-level tracks on all language pairs, as well as the 'QE as a Metric' track. Accordingly, we illustrate results of our models in these tracks with reference to test sets from the previous year. Our submissions build upon the recently proposed COMET framework: We train several estimator models to regress on different human-generated quality scores and a novel ranking model trained on relative ranks obtained from Direct Assessments. We also propose a simple technique for converting segment-level predictions into a document-level score. Overall, our systems achieve strong results for all language pairs on previous test sets and in many cases set a new state-of-the-art.
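The abstract does not spell out how segment-level predictions are converted into a document-level score. As a sketch of one plausible reading, the snippet below averages segment scores weighted by segment length. The weighting scheme, the function name `document_score` and the example numbers are assumptions, not details confirmed by this page.

```python
def document_score(segment_scores, segment_lengths):
    """Length-weighted average of segment-level scores.

    Assumption: longer segments contribute proportionally more to the
    document score. The source only says a "simple technique" is used.
    """
    if len(segment_scores) != len(segment_lengths):
        raise ValueError("scores and lengths must have the same number of items")
    total_length = sum(segment_lengths)
    if total_length == 0:
        raise ValueError("document has no tokens")
    weighted_sum = sum(s * n for s, n in zip(segment_scores, segment_lengths))
    return weighted_sum / total_length


# Two segments: a short one scored 0.5 and a longer one scored 1.0.
example = document_score([0.5, 1.0], [10, 30])
```

Under this assumption the longer segment dominates, so the example document lands closer to 1.0 than a plain mean of the two segment scores would.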

Code & Models

Repositories

Unbabel/COMET (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies