As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning

Jannis Vamvas; Rico Sennrich

arXiv:2203.01927 · cs.CL · March 4, 2022

TL;DR

This paper uses contrastive conditioning to detect over- and undertranslations in neural machine translation, pinpointing added and omitted content without reference translations.

Contribution

It presents a reference-free approach that uses off-the-shelf translation models to identify omissions and additions by comparing the likelihood of a full sequence with the likelihood of its parts.

Findings

01. Comparable accuracy to supervised quality estimation methods

02. Effective detection of superfluous words and untranslated content

03. Works without reference translations

Abstract

Omission and addition of content is a typical issue in neural machine translation. We propose a method for detecting such phenomena with off-the-shelf translation models. Using contrastive conditioning, we compare the likelihood of a full sequence under a translation model to the likelihood of its parts, given the corresponding source or target sequence. This makes it possible to pinpoint superfluous words in the translation and untranslated words in the source even in the absence of a reference translation. The accuracy of our method is comparable to a supervised method that requires a custom quality estimation model.
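The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes word-level deletions to form the "parts" and a length-normalized score, neither of which the abstract specifies. The names `flag_unsupported_words`, `toy_score`, and `ALIGNED` are hypothetical, and `toy_score` is a hand-written stand-in for the log-likelihood that a real translation model would assign.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical scorer type: score(condition, sequence) returns a
# length-normalized log-likelihood of `sequence` given `condition`.
Scorer = Callable[[Sequence[str], Sequence[str]], float]

# Toy word alignments standing in for a real translation model (assumption).
ALIGNED = {("die", "the"), ("katze", "cat"), ("trinkt", "drinks"), ("milch", "milk")}


def toy_score(condition: Sequence[str], sequence: Sequence[str]) -> float:
    """Average per-token log-probability under a toy lexicon 'model'."""
    def supported(tok: str) -> bool:
        return any((c, tok) in ALIGNED or (tok, c) in ALIGNED for c in condition)

    logps = [math.log(0.9) if supported(t) else math.log(0.1) for t in sequence]
    return sum(logps) / len(logps)


def flag_unsupported_words(
    full: Sequence[str], condition: Sequence[str], score: Scorer
) -> List[str]:
    """Return words whose deletion makes `full` more likely given `condition`."""
    full_score = score(condition, full)
    flagged = []
    for i, word in enumerate(full):
        partial = list(full[:i]) + list(full[i + 1:])
        # A part that scores higher than the full sequence suggests the
        # deleted word has no counterpart in the conditioning sequence.
        if partial and score(condition, partial) > full_score:
            flagged.append(word)
    return flagged


# Overtranslation: score the translation and its parts given the source.
source = ["die", "katze", "trinkt", "milch"]
translation = ["the", "cat", "drinks", "milk", "quietly"]
print(flag_unsupported_words(translation, source, toy_score))  # ['quietly']

# Undertranslation: score the source and its parts given the translation.
source = ["die", "kleine", "katze", "trinkt", "milch"]
translation = ["the", "cat", "drinks", "milk"]
print(flag_unsupported_words(source, translation, toy_score))  # ['kleine']
```

In practice, `toy_score` would be replaced by scores from forward and backward translation models. The same function then covers both directions because only the roles of the scored and the conditioning sequence are swapped.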

Code & Models

Repositories

zurichnlp/coverage-contrastive-conditioning
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis