Translationese as a Language in "Multilingual" NMT

Parker Riley; Isaac Caswell; Markus Freitag; David Grangier

arXiv:1911.03823 · cs.CL · July 13, 2020

TL;DR

This paper models translationese and original (natural) text as separate languages in a multilingual NMT model. Sentence-level classifiers tag the training data, so the model can be biased towards either style at test time and the effects of translationese can be analyzed.

Contribution

It introduces a method to bias NMT models towards natural or translationese outputs: sentence-level classifiers that distinguish translationese from original target text are used to tag the training data, which enables control over translation style.

Findings

01

Biasing the model towards natural text improves human evaluation scores on both accuracy and fluency.

02

Biasing towards translationese increases BLEU scores but reduces human preference.

03

The classifier-based tagging reveals limitations of heuristic data tagging.

Abstract

Machine translation has an undesirable propensity to produce "translationese" artifacts, which can lead to higher BLEU scores while being liked less by human raters. Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target text? There is no data with original source and original target, so we train sentence-level classifiers to distinguish translationese from original target text, and use this classifier to tag the training data for an NMT model. Using this technique we bias the model to produce more natural outputs at test time, yielding gains in human evaluation scores on both accuracy and fluency. Additionally, we demonstrate that it is possible to bias the model to produce translationese and game the BLEU…
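
The tagging procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tag tokens `<natural>` and `<translationese>`, the choice to prepend the tag to the source sentence (a common convention in multilingual NMT), the 0.5 threshold, and the helper names `tag_pair`, `tag_corpus`, `tag_for_inference`, and `toy_p_natural` are all hypothetical. The classifier is passed in as any callable that returns the probability that a target sentence is original text.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical tag tokens, chosen for illustration only.
NATURAL_TAG = "<natural>"
TRANSLATIONESE_TAG = "<translationese>"


def tag_pair(
    source: str,
    target: str,
    p_natural: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> Tuple[str, str]:
    """Prepend a style tag to the source, based on the classifier score of the target."""
    tag = NATURAL_TAG if p_natural(target) >= threshold else TRANSLATIONESE_TAG
    return f"{tag} {source}", target


def tag_corpus(
    pairs: Iterable[Tuple[str, str]],
    p_natural: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Tag every (source, target) training pair."""
    return [tag_pair(s, t, p_natural, threshold) for s, t in pairs]


def tag_for_inference(source: str, natural: bool = True) -> str:
    """At test time, choose the tag explicitly to bias the output style."""
    tag = NATURAL_TAG if natural else TRANSLATIONESE_TAG
    return f"{tag} {source}"


def toy_p_natural(target: str) -> float:
    """Stand-in for a trained classifier; NOT a real translationese detector."""
    return 0.1 if target.endswith("(translated)") else 0.9
```

In this sketch the tagged pairs would be fed to an ordinary NMT training pipeline, and `tag_for_inference(source, natural=True)` would request natural-sounding output at test time, while `natural=False` would request translationese.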
