DATScore: Evaluating Translation with Data Augmented Translations
Moussa Kamal Eddine, Guokan Shang, Michalis Vazirgiannis

TL;DR
DATScore is a novel evaluation metric for machine translation that leverages data augmentation and improved scoring strategies to better align with human judgments, especially in low-resource language scenarios.
Contribution
It introduces data augmentation techniques and new scoring strategies to enhance BARTScore for more accurate translation quality evaluation.
Findings
DATScore correlates better with human evaluations than existing metrics.
Data augmentation significantly improves evaluation accuracy.
The method performs well across multiple NLG tasks.
Abstract
The rapid development of large pretrained language models has revolutionized not only the field of Natural Language Generation (NLG) but also its evaluation. Inspired by the recent work of BARTScore: a metric leveraging the BART language model to evaluate the quality of generated text from various aspects, we introduce DATScore. DATScore uses data augmentation techniques to improve the evaluation of machine translation. Our main finding is that introducing data augmented translations of the source and reference texts is greatly helpful in evaluating the quality of the generated translation. We also propose two novel score averaging and term weighting strategies to improve the original score computing process of BARTScore. Experimental results on WMT show that DATScore correlates better with human meta-evaluations than the other recent state-of-the-art metrics, especially for…
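As a rough illustration of the scoring idea described in the abstract, a BARTScore-style metric scores a hypothesis by the log-probability a sequence-to-sequence model assigns to its tokens given some conditioning text. The sketch below assumes those per-token log-probabilities have already been computed. The function names, the weights, and the plain mean over conditioning texts are hypothetical placeholders, not the averaging or term-weighting strategies of the paper, which this excerpt does not specify.

```python
import math

def weighted_log_likelihood(token_logprobs, weights=None):
    """Weighted average of per-token log-probabilities.

    With weights=None every token counts equally (uniform weighting).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    if weights is None:
        weights = [1.0] * len(token_logprobs)
    if len(weights) != len(token_logprobs):
        raise ValueError("weights and token_logprobs must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * lp for w, lp in zip(weights, token_logprobs)) / total

def average_scores(scores_by_condition):
    """Plain mean of scores obtained under different conditioning texts
    (assumed here: source, reference, augmented translations)."""
    if not scores_by_condition:
        raise ValueError("need at least one score")
    return sum(scores_by_condition.values()) / len(scores_by_condition)

# Hypothetical log-probabilities of the same hypothesis tokens under
# three different conditioning texts (made-up numbers for illustration).
scores = {
    "source": weighted_log_likelihood([-1.0, -3.0]),
    "reference": weighted_log_likelihood([-0.5, -1.5]),
    "augmented": weighted_log_likelihood([-1.0, -3.0], weights=[3.0, 1.0]),
}
final_score = average_scores(scores)
```

Scores closer to zero indicate that the model finds the hypothesis more likely given the conditioning text. Non-uniform weights let informative tokens count more than frequent ones, which is the general motivation behind term weighting.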
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Dropout · Adam · Dense Connections
