Textual Similarity as a Key Metric in Machine Translation Quality Estimation

Kun Sun; Rong Wang

arXiv:2406.07440 · cs.CL · July 2, 2024 · 2 cites

Open Access

TL;DR

This paper proposes using textual similarity, measured via sentence transformers and cosine similarity, as a new and more effective metric for evaluating machine translation quality without reference texts, outperforming traditional metrics.

Contribution

Introduces textual similarity as a novel metric for MT quality estimation, demonstrating its superior correlation with human judgments across multiple language pairs.

Findings

01. Textual similarity correlates more strongly with human scores than traditional metrics (HTER, model evaluation, sentence probability).

02. In the GAMM analyses, textual similarity predicts human scores better than the other metrics across multiple language pairs.

03. The HTER metric fails to predict human scores effectively in QE.
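
To make the correlation finding concrete, here is a minimal sketch of how one might correlate a metric's scores with human scores. This page does not say which correlation coefficient the authors used, so Pearson's r is an assumption; the function name `pearson` and every number below are hypothetical toy values, not data from the paper. The GAMM analysis is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for five translations (illustration only).
similarity_scores = [0.62, 0.71, 0.80, 0.88, 0.93]
human_scores = [35, 50, 48, 72, 90]

r = pearson(similarity_scores, human_scores)
print(f"Pearson r between metric and human scores: {r:.3f}")
```

A metric that tracks human judgment well would give a value of r close to 1 on real data; comparing r across metrics is one simple way to rank them.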

Abstract

Machine Translation (MT) Quality Estimation (QE) assesses translation reliability without reference texts. This study introduces "textual similarity" as a new metric for QE, using sentence transformers and cosine similarity to measure semantic closeness. Analyzing data from the MLQE-PE dataset, we found that textual similarity exhibits stronger correlations with human scores than traditional metrics (hter, model evaluation, sentence probability etc.). Employing GAMMs as a statistical tool, we demonstrated that textual similarity consistently outperforms other metrics across multiple language pairs in predicting human scores. We also found that "hter" actually failed to predict human scores in QE. Our findings highlight the effectiveness of textual similarity as a robust QE metric, recommending its integration with other metrics into QE frameworks and MT system training for improved…
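
The cosine-similarity step mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the two texts being compared are a source sentence and its machine translation, and that each has already been encoded into a fixed-length vector by a sentence-transformer model. The short vectors and the name `cosine_similarity` are hypothetical placeholders; real sentence embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 4-dimensional embeddings standing in for the output of a
# sentence-transformer encoder (assumption: source vs. machine translation).
source_embedding = [0.20, 0.10, 0.70, 0.40]
translation_embedding = [0.25, 0.05, 0.65, 0.50]

similarity = cosine_similarity(source_embedding, translation_embedding)
print(f"textual similarity: {similarity:.3f}")
```

The score ranges from -1 to 1, with values near 1 indicating that the two embeddings point in nearly the same direction, which is read as semantic closeness.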

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies