Non-Linear Scoring Model for Translation Quality Evaluation

Serge Gladkoff; Lifeng Han; Katerina Gasova

arXiv:2511.13467·cs.CL·January 15, 2026

Open Access

TL;DR

This paper introduces a non-linear translation quality scoring model based on psychophysical principles, improving fairness and reliability in evaluating translations of varying lengths.

Contribution

The paper presents a calibrated, non-linear error model that aligns more closely with human perception and corrects the sample-size bias of traditional linear scoring, which over-penalizes short samples and under-penalizes long ones.

Findings

01 Acceptable error counts grow logarithmically, not linearly, with sample size

02 The non-linear model improves inter-rater reliability

03 The model enhances interpretability and fairness in evaluation

Abstract

Analytic Translation Quality Evaluation (TQE), based on Multidimensional Quality Metrics (MQM), traditionally uses a linear error-to-penalty scale calibrated to a reference sample of 1000-2000 words. However, linear extrapolation biases judgment on samples of different sizes, over-penalizing short samples and under-penalizing long ones, producing misalignment with expert intuition. Building on the Multi-Range framework, this paper presents a calibrated, non-linear scoring model that better reflects how human content consumers perceive translation quality across samples of varying length. Empirical data from three large-scale enterprise environments shows that acceptable error counts grow logarithmically, not linearly, with sample size. Psychophysical and cognitive evidence, including the Weber-Fechner law and Cognitive Load Theory, supports this premise by explaining why the…
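
To make the contrast between linear and logarithmic scaling concrete, here is a minimal sketch in Python. It is not the authors' calibrated model: the functional form `a * ln(1 + b * words)`, the names `linear_tolerance` and `log_tolerance`, and all three constants are assumptions chosen only for illustration, since the listing above does not give the paper's formula or parameters.

```python
import math

# All constants below are hypothetical illustration values, not taken from the paper.
REF_WORDS = 1000       # assumed reference sample size (words)
REF_TOLERANCE = 10.0   # assumed acceptable error count at the reference size
CURVATURE = 0.01       # assumed shape parameter b of the logarithmic curve


def linear_tolerance(words: int) -> float:
    """Acceptable error count under straight-line extrapolation from the reference."""
    return REF_TOLERANCE * words / REF_WORDS


def log_tolerance(words: int) -> float:
    """Acceptable error count under an assumed a * ln(1 + b * words) curve.

    The scale a is chosen so that the curve matches the linear model at REF_WORDS.
    """
    a = REF_TOLERANCE / math.log1p(CURVATURE * REF_WORDS)
    return a * math.log1p(CURVATURE * words)


if __name__ == "__main__":
    # Compare both models below, at, and above the reference size.
    for words in (200, 1000, 5000):
        print(f"{words:>5} words: linear={linear_tolerance(words):6.2f}  "
              f"log={log_tolerance(words):6.2f}")
```

Under these assumptions the two curves agree at the reference size. Below it, the logarithmic curve tolerates more errors than the linear one, and above it, fewer. That is the direction of bias the abstract attributes to linear extrapolation.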

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Authorship Attribution and Profiling