The Multi-Range Theory of Translation Quality Measurement: MQM scoring models and Statistical Quality Control
Arle Lommel, Serge Gladkoff, Alan Melby, Sue Ellen Wright, Ingemar Strandvik, Katerina Gasova, Angelika Vaasa, Andy Benzo, Romina Marazzato Sparano, Monica Foresi, Johani Innis, Lifeng Han, Goran Nenadic

TL;DR
This paper reviews the 10-year development of the MQM framework for translation quality evaluation, introduces new scoring models, and advocates for statistical quality control for small sample sizes.
Contribution
It presents new linear and non-linear MQM scoring models and a universal approach for different sample sizes, emphasizing statistical quality control for small samples.
Findings
Introduction of the Linear Calibrated Scoring Model
Presentation of the Non-Linear Scoring Model
Advocacy for statistical quality control for small samples
Abstract
The year 2024 marks the 10th anniversary of the Multidimensional Quality Metrics (MQM) framework for analytic translation quality evaluation. The MQM error typology has been widely used by practitioners in the translation and localization industry and has served as the basis for many derivative projects. The annual Conference on Machine Translation (WMT) shared tasks on both human and automatic translation quality evaluations used the MQM error typology. The metric stands on two pillars: error typology and the scoring model. The scoring model calculates the quality score from annotation data, detailing how to convert error type and severity counts into numeric scores to determine if the content meets specifications. Previously, only the raw scoring model had been published. This April, the MQM Council published the Linear Calibrated Scoring Model, officially presented herein, along…
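As a rough illustration of what "converting error type and severity counts into numeric scores" can look like, here is a minimal sketch of a raw, penalty-per-word style score. It is not the Linear Calibrated Scoring Model or the non-linear model that the paper introduces. The severity weights, the function names `raw_quality_score` and `meets_specification`, and the pass threshold are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative severity weights. These values are an assumption for this
# sketch and are not taken from the abstract above.
SEVERITY_WEIGHTS = {"neutral": 0, "minor": 1, "major": 5, "critical": 25}


def raw_quality_score(annotations, word_count, weights=SEVERITY_WEIGHTS):
    """Turn annotated errors into a score where 1.0 means no penalties.

    annotations: iterable of (error_type, severity) pairs from an annotator.
    word_count:  size of the evaluated sample in words.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    # Count how many errors were found at each severity level.
    severity_counts = Counter(severity for _error_type, severity in annotations)
    # Weight each count by its severity and add up the penalty points.
    penalty_total = sum(weights[sev] * n for sev, n in severity_counts.items())
    # Normalize by sample size and subtract from a perfect score.
    return 1.0 - penalty_total / word_count


def meets_specification(score, threshold):
    """Pass/fail decision against a hypothetical, project-specific threshold."""
    return score >= threshold


# Hypothetical annotation data for a 1000-word sample.
sample = [
    ("accuracy", "minor"),
    ("fluency", "minor"),
    ("terminology", "major"),
]
score = raw_quality_score(sample, word_count=1000)
passed = meets_specification(score, threshold=0.99)
```

In this sketch the three errors add up to 7 penalty points, so the 1000-word sample scores 0.993 and passes the hypothetical 0.99 threshold. With a much smaller sample, a single major error moves the score sharply, which is the kind of sensitivity that motivates the paper's interest in statistical quality control for small samples.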
Taxonomy
Topics: Natural Language Processing Techniques
