Quality Estimation with $k$-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation
Tu Anh Dinh, Tobias Palzer, Jan Niehues

TL;DR
This paper introduces an unsupervised, model-specific quality estimation method for machine translation that leverages k-nearest neighbors on training data and proposes an automatic evaluation approach using reference-based metrics.
Contribution
It presents an unsupervised, model-specific QE approach using kNN, together with an automatic evaluation method based on reference-based metrics, and reports the first detailed analyses of that evaluation method.
Findings
The automatic evaluation method is effective for measuring QE performance.
Reference-based MetricX-23 is the most suitable metric for this task.
The proposed approach provides reliable quality scores without human annotations.
Abstract
Providing quality scores along with Machine Translation (MT) output, so-called reference-free Quality Estimation (QE), is crucial to inform users about the reliability of the translation. We propose a model-specific, unsupervised QE approach, termed $k$NN-QE, that extracts information from the MT model's training data using $k$-nearest neighbors. Measuring the performance of model-specific QE is not straightforward: such methods provide quality scores on their own MT output, so they cannot be evaluated using benchmark QE test sets containing human quality scores on premade MT output. Therefore, we propose an automatic evaluation method that uses quality scores from reference-based metrics as the gold standard instead of human-generated ones. We are the first to conduct detailed analyses and conclude that this automatic method is sufficient, and that the reference-based MetricX-23 is best for the task.
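The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which representations or distance the method uses, so the sketch assumes fixed-size vectors (hypothetical `train_vecs` and `query_vecs`), Euclidean distance, and the negated mean distance to the k nearest training vectors as the quality score. The evaluation step correlates those scores with scores from a reference-based metric (hypothetical `metric_scores`, assumed here to be higher-is-better).

```python
import numpy as np

def knn_quality_scores(train_vecs, query_vecs, k=4):
    """Score each query by its closeness to the training data.

    Assumption (not from the paper): the score is the negated mean
    Euclidean distance to the k nearest training vectors, so queries
    that resemble the training data receive higher scores.
    """
    train_vecs = np.asarray(train_vecs, dtype=float)
    query_vecs = np.asarray(query_vecs, dtype=float)
    # Pairwise distances, shape (num_queries, num_train).
    diffs = query_vecs[:, None, :] - train_vecs[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep the k smallest distances for each query.
    nearest = np.sort(dists, axis=1)[:, :k]
    return -nearest.mean(axis=1)

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for representations of training data
    # and of the model's own translations.
    train_vecs = rng.normal(size=(100, 8))
    query_vecs = rng.normal(size=(5, 8))
    qe_scores = knn_quality_scores(train_vecs, query_vecs, k=4)
    # Hypothetical reference-based metric scores used as the gold standard.
    metric_scores = rng.normal(size=5)
    print("QE scores:", qe_scores)
    print("Correlation with metric:", pearson(qe_scores, metric_scores))
```

With random stand-in data the printed correlation carries no meaning; the sketch only shows the shape of the pipeline, in which reference-based metric scores take the place of human quality annotations.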
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Rough Sets and Fuzzy Logic
