Enhanced Bilingual Evaluation Understudy

Krzysztof Wołk; Krzysztof Marasek

arXiv:1509.09088·cs.CL·October 1, 2015·2 cites

TL;DR

This paper enhances the BLEU evaluation metric for machine translation by making it more adaptable and better aligned with human judgment, considering linguistic variations like synonyms and word order.

Contribution

It introduces a modified BLEU metric that accounts for linguistic variations, improving correlation with human evaluations in machine translation assessment.

Findings

01 Improved correlation with human judgments
02 Enhanced robustness to linguistic variations
03 Better alignment with human evaluation methods

Abstract

Our research extends the Bilingual Evaluation Understudy (BLEU) evaluation technique for statistical machine translation (SMT) to make it more adjustable and robust. We aim to adapt it so that it more closely resembles human evaluation. When human translators translate a text, they often use synonyms, different word orders or styles, and other similar variations. We propose an SMT evaluation technique that enhances the BLEU metric to take such variations into account. We perform experiments to evaluate our technique against the primary existing evaluation methods, and we describe the improvements it makes over those methods as well as its correlation with them.
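
To make the idea of crediting synonyms concrete, here is a minimal Python sketch of a clipped unigram precision (the basic building block of BLEU) with an optional synonym lookup. It is an illustration under stated assumptions, not the authors' method: the function name `clipped_unigram_precision`, the `SYNONYMS` table, and the greedy exact-match-first strategy are all hypothetical, and the abstract does not specify how the enhanced metric treats synonyms or word order.

```python
from collections import Counter

# Hypothetical synonym table, invented purely for illustration.
SYNONYMS = {"quick": {"fast"}, "fast": {"quick"}}


def clipped_unigram_precision(candidate, reference, synonyms=None):
    """Clipped unigram precision, optionally crediting synonym matches.

    Assumption: a candidate token may match a reference token either
    exactly or through the synonym table; each reference token can be
    consumed at most once (the usual BLEU-style clipping).
    """
    synonyms = synonyms or {}
    remaining = Counter(reference)  # reference tokens still available
    matched = 0
    for token in candidate:
        # Prefer an exact match, then fall back to any listed synonym.
        options = [token] + sorted(synonyms.get(token, ()))
        for option in options:
            if remaining[option] > 0:
                remaining[option] -= 1
                matched += 1
                break
    return matched / len(candidate) if candidate else 0.0


candidate = "the fast brown fox".split()
reference = "the quick brown fox".split()

exact = clipped_unigram_precision(candidate, reference)
relaxed = clipped_unigram_precision(candidate, reference, SYNONYMS)
print(exact, relaxed)
```

With exact matching only, "fast" earns no credit against "quick", so three of the four candidate tokens match; with the synonym table, all four do. A full BLEU-style score would additionally combine higher-order n-gram precisions and a brevity penalty, which this sketch omits.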

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling