USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation

Jonas Belouadi; Steffen Eger

arXiv:2202.10062 · cs.CL · March 5, 2024 · 1 citation

TL;DR

USCORE introduces fully unsupervised metrics for evaluating machine translation that leverage pseudo-parallel data and multilingual embeddings, outperforming supervised metrics on 4 out of 5 datasets.

Contribution

The paper presents a novel fully unsupervised evaluation framework for machine translation, combining metric induction, pseudo-parallel data mining, and multilingual embeddings.

Findings

01. Outperforms supervised metrics on 4 out of 5 datasets

02. Develops an iterative process for remapping vector spaces

03. Induces unsupervised multilingual sentence embeddings

Abstract

The vast majority of evaluation metrics for machine translation are supervised, i.e., (i) are trained on human scores, (ii) assume the existence of reference translations, or (iii) leverage parallel data. This hinders their applicability to cases where such supervision signals are not available. In this work, we develop fully unsupervised evaluation metrics. To do so, we leverage similarities and synergies between evaluation metric induction, parallel corpus mining, and MT systems. In particular, we use an unsupervised evaluation metric to mine pseudo-parallel data, which we use to remap deficient underlying vector spaces (in an iterative manner) and to induce an unsupervised MT system, which then provides pseudo-references as an additional component in the metric. Finally, we also induce unsupervised multilingual sentence embeddings from pseudo-parallel data. We show that our fully…
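The mine-then-remap loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes mutual nearest neighbours under cosine similarity as the mining step and an orthogonal Procrustes solution as the remapping step, and the function names (`mine_pairs`, `procrustes`, `iterative_remap`) are hypothetical. The paper's actual metric, mining criterion, and remapping technique may differ.

```python
import numpy as np


def normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mine_pairs(src, tgt):
    """Mine pseudo-parallel pairs (assumed criterion: mutual nearest neighbours)."""
    sim = normalize(src) @ normalize(tgt).T
    fwd = sim.argmax(axis=1)  # best target for each source sentence
    bwd = sim.argmax(axis=0)  # best source for each target sentence
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]


def procrustes(src, tgt):
    """Orthogonal W minimizing ||src @ W - tgt||_F (assumed remapping step)."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


def iterative_remap(src, tgt, rounds=5):
    """Alternate between mining pairs and re-estimating the mapping."""
    w = np.eye(src.shape[1])
    for _ in range(rounds):
        pairs = mine_pairs(src @ w, tgt)
        if not pairs:
            break
        i, j = map(list, zip(*pairs))
        # Re-fit the map on the original source vectors of the mined pairs.
        w = procrustes(src[i], tgt[j])
    return w
```

Here `src` and `tgt` stand for sentence embeddings of monolingual source and target corpora, one row per sentence. Each round uses the current mapping to score candidate pairs, then uses the mined pairs to improve the mapping, which mirrors the iterative structure the abstract describes.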

Code & Models

Repositories

potamides/unsupervised-metrics (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification