Variance-Aware Machine Translation Test Sets

Runzhe Zhan; Xuebo Liu; Derek F. Wong; Lidia S. Chao

arXiv:2111.04079 · cs.CL · November 9, 2021 · 1 cites

TL;DR

This paper introduces variance-aware test sets (VAT) for machine translation evaluation. They are created automatically, correlate better with human judgment than the original test sets, and highlight challenging linguistic features, which can guide future test set construction.

Contribution

The paper proposes a novel variance-aware filtering method that automatically generates discriminative MT test sets without human labor, improving evaluation reliability.

Findings

01. VAT correlates better with human judgment than the original WMT test sets.

02. VAT highlights challenging linguistic features such as low-frequency words.

03. The method is applicable across multiple language pairs and test sets.
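The first finding concerns how well system rankings on a test set agree with human judgment. The following is a minimal sketch of one such check, assuming a system-level Pearson correlation between automatic metric scores and human scores. The choice of Pearson and the function name `pearson` are assumptions made for illustration; this page does not state which coefficient or metric the authors use.

```python
from math import sqrt
from typing import Sequence


def pearson(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Pearson correlation between per-system metric scores and human scores.

    Both sequences hold one value per MT system, in the same system order.
    """
    if len(metric_scores) != len(human_scores):
        raise ValueError("both sequences need one score per system")
    n = len(metric_scores)
    if n < 2:
        raise ValueError("at least two systems are required")

    mean_m = sum(metric_scores) / n
    mean_h = sum(human_scores) / n

    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((m - mean_m) * (h - mean_h) for m, h in zip(metric_scores, human_scores))
    ss_m = sum((m - mean_m) ** 2 for m in metric_scores)
    ss_h = sum((h - mean_h) ** 2 for h in human_scores)

    if ss_m == 0 or ss_h == 0:
        raise ValueError("correlation is undefined when one side is constant")
    return cov / sqrt(ss_m * ss_h)
```

Under this sketch, a test set would count as better when the coefficient computed on it is higher than the coefficient computed on the original set for the same systems.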

Abstract

We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from the WMT16 to WMT20 competitions. VAT is created automatically by a novel variance-aware filtering method that filters out the indiscriminative test instances of current MT test sets without any human labor. Experimental results show that VAT outperforms the original WMT test sets in terms of correlation with human judgement across mainstream language pairs and test sets. Further analysis of the properties of VAT reveals linguistic features that are challenging for competitive MT systems (e.g., translation of low-frequency words and proper nouns), providing guidance for constructing future MT test sets. The test sets and the code for preparing variance-aware MT test sets are freely available at…
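The filtering idea in the abstract can be illustrated with a short sketch. It assumes that each test instance has a segment-level metric score for every MT system, and that an instance is "indiscriminative" when those scores barely differ across systems. The function name `variance_aware_filter`, the `keep_ratio` parameter, and its default value are hypothetical choices for illustration, not details taken from this page or from the released code.

```python
from statistics import pvariance
from typing import List, Sequence


def variance_aware_filter(
    scores: Sequence[Sequence[float]], keep_ratio: float = 0.4
) -> List[int]:
    """Return indices of the test instances whose scores vary most across systems.

    scores[s][i] is the segment-level metric score of system s on instance i.
    keep_ratio is an illustrative (assumed) fraction of instances to keep.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not scores or not scores[0]:
        return []

    n_instances = len(scores[0])
    if any(len(row) != n_instances for row in scores):
        raise ValueError("every system must score every instance")

    # Variance of each instance's scores across systems: a low value means
    # the instance does little to tell the systems apart.
    variances = [
        pvariance([row[i] for row in scores]) for i in range(n_instances)
    ]

    # Keep the highest-variance instances, returned in original test-set order.
    n_keep = max(1, round(n_instances * keep_ratio))
    ranked = sorted(range(n_instances), key=lambda i: variances[i], reverse=True)
    return sorted(ranked[:n_keep])
```

The returned indices can then be used to select the matching source and reference lines from the original test set. Returning indices rather than sentences keeps the sketch independent of any particular file format.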

Code & Models

Repositories

nlp2ct/variance-aware-mt-test-sets (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications