Machine Translation Evaluation using Bi-directional Entailment
Rakesh Khobragade, Heaven Patel, Anand Namdev, Anish Mishra, Pushpak Bhattacharyya

TL;DR
This paper introduces a novel machine translation evaluation metric based on bi-directional entailment using deep learning, which correlates better with human judgment than traditional n-gram overlap metrics.
Contribution
The paper presents a new semantic similarity metric for MT evaluation using BERT-based entailment in both directions, improving correlation with human scores.
Findings
The proposed metric correlates better with human judgments than BLEU and METEOR.
Because it scores semantic similarity rather than n-gram overlap, the approach captures paraphrasing and semantic equivalence.
The metric is evaluated on systems from the WMT'14 and WMT'17 translation tasks.
Abstract
In this paper, we propose a new metric for Machine Translation (MT) evaluation, based on bi-directional entailment. We show that machine generated translation can be evaluated by determining paraphrasing with a reference translation provided by a human translator. We hypothesize, and show through experiments, that paraphrasing can be detected by evaluating entailment relationship in the forward and backward direction. Unlike conventional metrics, like BLEU or METEOR, our approach uses deep learning to determine the semantic similarity between candidate and reference translation for generating scores rather than relying upon simple n-gram overlap. We use BERT's pre-trained implementation of transformer networks, fine-tuned on MNLI corpus, for natural language inferencing. We apply our evaluation metric on WMT'14 and WMT'17 dataset to evaluate systems participating in the translation task…
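The bi-directional idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: `bidirectional_score`, the choice of `min` to combine the two directions, the convention that "forward" means reference-entails-candidate, and `toy_entail_prob`, a crude token-coverage placeholder standing in for a real NLI model such as BERT fine-tuned on MNLI.

```python
from typing import Callable

# An entailment scorer maps (premise, hypothesis) to a probability in [0, 1].
EntailFn = Callable[[str, str], float]


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (assumption): NOT a real NLI model.

    Returns the fraction of hypothesis tokens that also occur in the premise,
    only so the sketch runs without downloading a model.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    covered = sum(tok in premise_tokens for tok in hypothesis_tokens)
    return covered / len(hypothesis_tokens)


def bidirectional_score(
    candidate: str,
    reference: str,
    entail_prob: EntailFn,
    combine: Callable[[float, float], float] = min,
) -> float:
    """Score a candidate translation against a reference in both directions.

    The direction naming and the `min` combiner are assumptions of this sketch;
    the source text does not say how the two directions are merged.
    """
    forward = entail_prob(reference, candidate)   # reference entails candidate
    backward = entail_prob(candidate, reference)  # candidate entails reference
    return combine(forward, backward)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat sat"
    print(bidirectional_score(candidate, reference, toy_entail_prob))
```

With a real NLI model, `entail_prob` would return the model's entailment-class probability for the sentence pair. A strict combiner such as `min` gives a high score only when entailment holds in both directions, which matches the paraphrase intuition; averaging the two directions would be a softer alternative.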
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
