Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

Shuo Sun; Ahmed El-Kishky; Vishrav Chaudhary; James Cross; Francisco Guzmán; Lucia Specia

arXiv:2109.08627·cs.CL·September 20, 2021

Open Access

TL;DR

This paper proposes reframing sentence-level quality estimation of machine translation as a classification task, enabling the use of smaller, more efficient models suitable for real-world applications, instead of relying on large, computationally expensive models.

Contribution

It demonstrates that classification-based QE models can match state-of-the-art performance while being more efficient, challenging the need for large regression-based QE models in practice.

Findings

01

Model compression techniques perform poorly for QE regression.

02

A full model parameterization is needed for state-of-the-art results in the regression setting.

03

Reframing QE as classification better aligns with real-world application needs.

Abstract

Sentence-level quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels. Recent QE models have achieved previously unseen levels of correlation with human judgments, but they rely on large multilingual contextualized language models that are computationally expensive, which makes them infeasible for real-world applications. In this work, we evaluate several model compression techniques for QE and find that, despite their popularity in other NLP tasks, they lead to poor performance in this regression setting. We observe that a full model parameterization is required to achieve SoTA results in a regression task. However, we argue that the level of expressiveness of a model in a continuous range is unnecessary given the downstream applications of QE, and show…
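To make the contrast between the two framings concrete, here is a minimal sketch in Python. It is not the authors' implementation: the scores, the `THRESHOLD` value, and the helper names `pearson`, `to_labels`, and `accuracy` are all hypothetical, and the paper's actual labeling scheme is not given in the text above. The sketch scores the same predictions once as a regression output (Pearson correlation with human labels) and once as a binary accept/reject decision.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def to_labels(scores, threshold):
    """Binarize continuous quality scores: 1 = acceptable, 0 = not acceptable."""
    return [1 if s >= threshold else 0 for s in scores]


def accuracy(pred, gold):
    """Fraction of positions where predicted and gold labels agree."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


# Made-up scores for five translations (illustrative only, not real data).
human = [10.0, 35.0, 60.0, 80.0, 95.0]
model = [20.0, 30.0, 55.0, 90.0, 85.0]

# Hypothetical cut-off; a real application would choose its own.
THRESHOLD = 50.0

# Regression view: how well do the continuous scores track human labels?
r = pearson(human, model)

# Classification view: do model and humans agree on accept/reject?
gold = to_labels(human, THRESHOLD)
pred = to_labels(model, THRESHOLD)
acc = accuracy(pred, gold)

print(f"Pearson r = {r:.3f}, accuracy at threshold {THRESHOLD} = {acc:.2f}")
```

In this toy data the model's continuous scores deviate from the human scores, yet every accept/reject decision matches. That is the intuition behind the argument that fine-grained continuous scores may be more than downstream applications need.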

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research