Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model
Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit, Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers

TL;DR
This paper introduces a novel neural machine translation model that simultaneously estimates translation quality and generates translations, significantly improving efficiency and quality without requiring additional models.
Contribution
The paper proposes a single model that integrates quality estimation with translation, enabling efficient decoding and quality improvement in neural machine translation.
Findings
Two-orders of magnitude speed-up in decoding.
Comparable or better translation quality than reranking methods.
Efficient single-pass decoding with quality-aware capabilities.
Abstract
Maximum-a-posteriori (MAP) decoding is the most widely used decoding strategy for neural machine translation (NMT) models. The underlying assumption is that model probability correlates well with human judgment, with better translations getting assigned a higher score by the model. However, research has shown that this assumption does not always hold, and generation quality can be improved by decoding to optimize a utility function backed by a metric or quality-estimation signal, as is done by Minimum Bayes Risk (MBR) or quality-aware decoding. The main disadvantage of these approaches is that they require an additional model to calculate the utility function during decoding, significantly increasing the computational cost. In this paper, we propose to make the NMT models themselves quality-aware by training them to estimate the quality of their own output. Using this approach for MBR…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
Taxonomy
TopicsNatural Language Processing Techniques · Machine Learning and Data Classification · Topic Modeling
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
