Improve the Evaluation of Fluency Using Entropy for Machine Translation Evaluation Metrics
Hui Yu, Xiaofeng Wu, Wenbin Jiang, Qun Liu, Shouxun Lin

TL;DR
This paper introduces an entropy-based method that measures translation fluency from the distribution of matched words. Combined with existing automatic evaluation metrics, it improves their sentence-level correlation with human judgments.
Contribution
It proposes a novel entropy-based method that can be integrated with existing metrics like BLEU and METEOR to better evaluate translation fluency.
Findings
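BLEU and METEOR correlate better with human judgments at the sentence level once combined with the entropy-based method.
The improvement is reported on both WMT 2010 and WMT 2012.
The distribution of matched words captures fluency that fixed n-gram lengths and raw chunk counts do not.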
Abstract
Widely used automatic evaluation metrics cannot adequately reflect the fluency of translations. N-gram-based metrics such as BLEU limit the maximum length of matched fragments to n and cannot capture matched fragments longer than n, so they reflect fluency only indirectly. METEOR, which is not limited by n-grams, uses the number of matched chunks but does not consider the length of each chunk. In this paper, we propose an entropy-based method that reflects the fluency of translations through the distribution of matched words. The method can easily be combined with widely used automatic evaluation metrics to improve their evaluation of fluency. Experiments on WMT 2010 and WMT 2012 show that the sentence-level correlations of BLEU and METEOR with human judgments improve after they are combined with the entropy-based method.
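The idea of reading fluency off the distribution of matched words can be illustrated with a minimal sketch. It is not the paper's formula, which this summary does not give. It assumes a greedy chunker (the hypothetical helper `matched_chunk_lengths`) and takes the Shannon entropy of the share of matched words falling in each chunk (the hypothetical helper `chunk_entropy`). Under these assumptions, a few long chunks give low entropy and many short chunks give high entropy, even when the number of matched words is identical.

```python
import math


def matched_chunk_lengths(hyp, ref):
    """Lengths of contiguous hypothesis runs that also occur contiguously in ref.

    Assumption: a greedy longest-match chunker that does not enforce a
    one-to-one word alignment. It is an illustration, not the paper's aligner.
    """
    lengths = []
    i = 0
    while i < len(hyp):
        best = 0
        # Find the longest run starting at hyp[i] that appears anywhere in ref.
        for j in range(len(ref)):
            k = 0
            while i + k < len(hyp) and j + k < len(ref) and hyp[i + k] == ref[j + k]:
                k += 1
            best = max(best, k)
        if best > 0:
            lengths.append(best)
            i += best
        else:
            i += 1  # unmatched word: skip it
    return lengths


def chunk_entropy(lengths):
    """Shannon entropy (natural log) of matched words distributed over chunks."""
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in lengths)


ref = "the cat sat on the mat".split()
candidates = {
    "identical": "the cat sat on the mat".split(),
    "two blocks swapped": "on the mat the cat sat".split(),
    "fully scrambled": "mat the on sat cat the".split(),
}
for name, hyp in candidates.items():
    lengths = matched_chunk_lengths(hyp, ref)
    print(f"{name}: chunks={lengths}, entropy={chunk_entropy(lengths):.3f}")
```

All three candidates match every reference word, so a unigram count cannot separate them. The chunk entropy can: it is zero for the identical sentence, log 2 for the two swapped blocks, and log 6 for the fully scrambled order. How such a score would be folded into BLEU or METEOR is left open here, since the summary does not specify the combination.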
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
