Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training

Nathaniel Berger; Miriam Exel; Matthias Huck; Stefan Riezler

arXiv:2307.08416 · cs.CL · July 18, 2023

Open Access

TL;DR

This paper introduces a contrastive marking objective that augments supervised neural machine translation training. It improves translation quality without changing inference and is especially effective when the training data is human-edited.

Contribution

The paper proposes a simple, automatic contrastive marking method that is integrated into maximum likelihood training for NMT and improves translation quality over standard supervised training.

Findings

01 Improved translation quality with contrastive markings

02 Effective for learning from human post-edits

03 Requires only one additional translation pass over the training set per epoch

Abstract

Supervised learning in Neural Machine Translation (NMT) typically follows a teacher forcing paradigm where reference tokens constitute the conditioning context in the model's prediction, instead of its own previous predictions. In order to alleviate this lack of exploration in the space of translations, we present a simple extension of standard maximum likelihood estimation by a contrastive marking objective. The additional training signals are extracted automatically from reference translations by comparing the system hypothesis against the reference, and used for up/down-weighting correct/incorrect tokens. The proposed new training procedure requires one additional translation pass over the training set per epoch, and does not alter the standard inference setup. We show that training with contrastive markings yields improvements on top of supervised learning, and is especially useful…
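The marking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how hypothesis and reference are compared or which weights are used, so the alignment via Python's difflib, the function names contrastive_markings and marked_nll, and the weights 1.0 and -0.5 are all assumptions made for illustration.

```python
import math
from difflib import SequenceMatcher


def contrastive_markings(hypothesis, reference):
    """Mark each hypothesis token 1 if it aligns with the reference, else 0.

    Assumption: alignment uses difflib's matching blocks. The paper's own
    comparison procedure is not given in the abstract.
    """
    markings = [0] * len(hypothesis)
    matcher = SequenceMatcher(a=hypothesis, b=reference, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            markings[i] = 1
    return markings


def marked_nll(token_log_probs, markings, w_correct=1.0, w_incorrect=-0.5):
    """Weighted negative log-likelihood over the hypothesis tokens.

    Tokens marked correct are up-weighted (minimizing the loss raises their
    log-probability); tokens marked incorrect get a negative weight
    (minimizing the loss lowers their log-probability).
    The weight values are hypothetical, not taken from the paper.
    """
    if len(token_log_probs) != len(markings):
        raise ValueError("need one log-probability per marked token")
    loss = 0.0
    for log_p, mark in zip(token_log_probs, markings):
        weight = w_correct if mark == 1 else w_incorrect
        loss -= weight * log_p
    return loss


# Toy example: the system hypothesis differs from the reference in one token.
hypothesis = "the cat sat on a mat".split()
reference = "the cat sat on the mat".split()
markings = contrastive_markings(hypothesis, reference)
print(markings)  # [1, 1, 1, 1, 0, 1]

# Pretend the model assigned probability 0.5 to every hypothesis token.
log_probs = [math.log(0.5)] * len(hypothesis)
print(marked_nll(log_probs, markings))
```

In a real training loop the log-probabilities would come from the NMT model scoring its own hypothesis, and this term would be used alongside the standard maximum likelihood loss on the reference. How the two terms are combined is not specified in this excerpt.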

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling