Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

Ricardo Rei; Nuno M. Guerreiro; José Pombal; Daan van Stigt; Marcos Treviso; Luisa Coheur; José G.C. de Souza; André F.T. Martins

arXiv:2309.11925·cs.CL·September 22, 2023

TL;DR

This paper describes how Unbabel and IST scaled up the COMETKIWI model to rank first in the WMT 2023 Quality Estimation shared task at sentence, word, and span levels, improving on COMETKIWI-22 by up to 10 Spearman points.

Contribution

The paper introduces a scaled-up version of the COMETKIWI-22 model whose multilingual systems rank first on every task of the WMT 2023 Quality Estimation shared task.

Findings

01. Achieved first place in all shared task categories.

02. Significant improvements in correlation with human judgments.

03. Surpassed previous models by up to 10 Spearman points.

Abstract

We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches are ranked first for all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level granularity. Compared to the previous state-of-the-art COMETKIWI-22, we show large improvements in correlation with human judgements (up to 10 Spearman points). Moreover, we surpass the second-best multilingual submission to the shared task by up to 3.8 absolute points.
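
To make the "Spearman points" figure concrete, here is a minimal sketch of how a Spearman rank correlation between system scores and human judgements could be computed, and how a difference between two systems might be read in points. The assumptions are loud ones: the scores below are invented for illustration and are not results from the paper, the function names are hypothetical, and a "point" is taken to mean the correlation coefficient multiplied by 100, which is a common reporting convention rather than something this page states.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every item tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Invented toy data: human quality scores and two hypothetical systems.
human = [0.9, 0.4, 0.7, 0.1, 0.5]
baseline_scores = [0.8, 0.5, 0.4, 0.2, 0.6]
scaled_scores = [0.85, 0.35, 0.6, 0.2, 0.5]

rho_baseline = spearman(human, baseline_scores)
rho_scaled = spearman(human, scaled_scores)

# Difference expressed in "points" (correlation x 100, an assumed convention).
gain_in_points = (rho_scaled - rho_baseline) * 100
print(f"baseline={rho_baseline:.3f} scaled={rho_scaled:.3f} gain={gain_in_points:.1f} points")
```

Because only the ranking of the scores matters, a system can gain Spearman points without its raw scores moving closer to the human scale, which is why the metric suits comparisons between QE systems with different output ranges.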

Code & Models

Repositories

Unbabel/COMET
pytorch

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling