CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task

Ricardo Rei; Marcos Treviso; Nuno M. Guerreiro; Chrysoula Zerva; Ana C. Farinha; Christine Maroti; José G. C. de Souza; Taisiya Glushkova; Duarte M. Alves; Alon Lavie; Luisa Coheur; André F. T. Martins

arXiv:2209.06243 · cs.CL · September 15, 2022 · 33 cites

TL;DR

This paper introduces CometKiwi, a quality estimation system for machine translation that builds on COMET and OpenKiwi. It uses references during pretraining, trains jointly on sentence- and word-level objectives, and combines attention and gradient information to extract explanations. The submissions achieved top results in the WMT 2022 shared task.

Contribution

The paper connects the COMET framework with the predictor-estimator architecture of OpenKiwi and adds a word-level sequence tagger and an explanation extractor. It pairs this with pretraining that incorporates references and with joint sentence- and word-level training.

Findings

01 Incorporating references during pretraining improves downstream performance across several language pairs.

02 Jointly training with sentence- and word-level objectives yields a further boost.

03 Combining attention and gradient information was the top strategy for extracting explanations of sentence-level QE models.
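The third finding, combining attention with gradient information, can be illustrated with a minimal sketch. This page does not state how the two signals are combined, so the rule below (attention weight multiplied by the L2 norm of the token's gradient, then normalized) is an assumption chosen for illustration. The function names `l2_norm` and `token_relevance` are hypothetical and are not taken from the COMET codebase.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a gradient vector."""
    return math.sqrt(sum(x * x for x in vec))


def token_relevance(attention, gradients):
    """Score each token by attention weight times gradient norm.

    attention: one attention weight per token.
    gradients: one gradient vector per token (same order as attention).
    Returns scores normalized to sum to 1 (or all zeros if nothing is relevant).
    NOTE: this combination rule is an illustrative assumption.
    """
    if len(attention) != len(gradients):
        raise ValueError("attention and gradients must cover the same tokens")
    raw = [a * l2_norm(g) for a, g in zip(attention, gradients)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw


# Two tokens with equal attention; only the first has a non-zero gradient.
scores = token_relevance([0.5, 0.5], [[3.0, 4.0], [0.0, 0.0]])
print(scores)  # → [1.0, 0.0]
```

Tokens with the highest scores would be the ones flagged as most responsible for a low sentence-level quality prediction.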

Abstract

We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) Sentence and Word-level Quality Prediction; (ii) Explainable QE; and (iii) Critical Error Detection. For all tasks, we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi, and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance across several language pairs on downstream tasks, and that jointly training with sentence and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations of sentence-level QE models. Overall, our submissions achieved the best results for…
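The joint sentence- and word-level training mentioned in the abstract can be sketched as a weighted sum of two losses. This is a minimal illustration under stated assumptions and not the authors' implementation: the squared-error sentence loss, the binary OK/BAD tag loss, and the mixing coefficient `word_weight` are all hypothetical choices, since the abstract does not specify them.

```python
import math


def sentence_loss(pred, target):
    """Squared error between predicted and gold sentence-level quality scores."""
    return (pred - target) ** 2


def word_loss(bad_probs, gold_tags):
    """Mean negative log-likelihood of the gold OK/BAD tags.

    bad_probs[i] is the predicted probability that token i is BAD.
    """
    if not gold_tags or len(bad_probs) != len(gold_tags):
        raise ValueError("need one probability per (non-empty) gold tag")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p_bad, tag in zip(bad_probs, gold_tags):
        p_gold = p_bad if tag == "BAD" else 1.0 - p_bad
        total -= math.log(max(p_gold, eps))
    return total / len(gold_tags)


def joint_loss(pred, target, bad_probs, gold_tags, word_weight=0.5):
    """Sentence loss plus a weighted word-level loss.

    word_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return sentence_loss(pred, target) + word_weight * word_loss(bad_probs, gold_tags)


# One training example: a sentence score plus per-token tags.
loss = joint_loss(0.7, 0.5, [0.1, 0.9, 0.2], ["OK", "BAD", "OK"])
print(round(loss, 4))
```

Sharing one encoder across both terms is what lets the word-level signal influence the sentence-level predictions; the weighting between the two terms would normally be tuned on held-out data.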

Code & Models

Repositories

Unbabel/COMET (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification