Learning Compact Metrics for MT

Amy Pu; Hyung Won Chung; Ankur P. Parikh; Sebastian Gehrmann; Thibault Sellam

arXiv:2110.06341·cs.CL·October 14, 2021

TL;DR

This paper studies how model size affects learned multilingual machine translation evaluation metrics and shows that distillation lets a much smaller model retain most of a large model's performance.

Contribution

The paper introduces a distillation approach for multilingual metrics that balances model capacity against multilinguality, improving the performance of models with fewer parameters.

Findings

01. Model size limits cross-lingual transfer in evaluation metrics.

02. Distillation with synthetic data improves the performance of smaller models.

03. The distilled model reaches 92.6% of RemBERT's performance with one-third of its parameters.

Abstract

Recent developments in machine translation and multilingual text generation have led researchers to adopt trained metrics such as COMET or BLEURT, which treat evaluation as a regression problem and use representations from multilingual pre-trained models such as XLM-RoBERTa or mBERT. Yet studies on related tasks suggest that these models are most efficient when they are large, which is costly and impractical for evaluation. We investigate the trade-off between multilinguality and model capacity with RemBERT, a state-of-the-art multilingual language model, using data from the WMT Metrics Shared Task. We present a series of experiments which show that model size is indeed a bottleneck for cross-lingual transfer, then demonstrate how distillation can help address this bottleneck by leveraging synthetic data generation and transferring knowledge from one teacher to multiple students…
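The distillation idea in the abstract (a teacher scores synthetic examples, and smaller students are fit to those scores as a regression problem) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `teacher_score`, `make_synthetic_pairs`, and `distill_student` are hypothetical names, the "teacher" is a fixed linear function rather than RemBERT, and two-number feature vectors stand in for (reference, candidate) sentence pairs. It is not the authors' pipeline.

```python
import random


def teacher_score(features):
    """Hypothetical stand-in for a large trained metric (not RemBERT)."""
    x1, x2 = features
    return 0.7 * x1 - 0.2 * x2 + 0.1


def make_synthetic_pairs(n, rng):
    """Hypothetical synthetic data: random feature vectors standing in
    for unlabeled (reference, candidate) pairs."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def distill_student(examples, teacher, epochs=300, lr=0.1):
    """Fit a small linear student to the teacher's scores by
    full-batch gradient descent on mean squared error."""
    targets = [teacher(x) for x in examples]  # teacher labels the synthetic data
    n = len(examples)
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in zip(examples, targets):
            err = (w1 * x1 + w2 * x2 + b) - y
            g1 += 2 * err * x1 / n
            g2 += 2 * err * x2 / n
            gb += 2 * err / n
        w1 -= lr * g1
        w2 -= lr * g2
        b -= lr * gb
    return w1, w2, b


rng = random.Random(0)
# One teacher, several students, each distilled on its own synthetic sample.
students = {
    name: distill_student(make_synthetic_pairs(200, rng), teacher_score)
    for name in ("student_a", "student_b")
}
```

Because the toy teacher is itself linear, each student can recover it almost exactly; with a real teacher and a smaller student, some gap would normally remain, which is the trade-off the paper measures.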

Code & Models

Repositories

google-research/bleurt (TensorFlow, official)

Taxonomy

Methods: mBERT