Controlling Neural Machine Translation Formality with Synthetic Supervision
Xing Niu, Marine Carpuat

TL;DR
This paper presents a method for controlling the formality of neural machine translation output. The model is trained on automatically generated synthetic data and produces translations that better match the desired formality level while preserving the source meaning.
Contribution
It introduces a novel training scheme for multi-task models that automatically generates synthetic training triplets (a bilingual sentence pair labeled with target-language formality), enabling end-to-end training for formality control in neural machine translation.
Findings
The best model outperforms existing models in both automatic and human evaluations
Its translations match the desired formality level more closely
Source meaning is preserved
Abstract
This work aims to produce translations that convey source language content at a formality level that is appropriate for a particular audience. Framing this problem as a neural sequence-to-sequence task ideally requires training triplets consisting of a bilingual sentence pair labeled with target language formality. However, in practice, available training examples are limited to English sentence pairs of different styles, and bilingual parallel sentences of unknown formality. We introduce a novel training scheme for multi-task models that automatically generates synthetic training triplets by inferring the missing element on the fly, thus enabling end-to-end training. Comprehensive automatic and human assessments show that our best model outperforms existing models by producing translations that better match desired formality levels while preserving the source meaning.
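The abstract describes filling in the missing element of a training triplet on the fly. The sketch below illustrates one reading of that idea for the case of a bilingual pair whose formality is unknown: each candidate formality level is scored and the best one becomes the label. Everything here is an assumption made for illustration, not the authors' implementation: the `Triplet` type, the `infer_formality` and `synthesize_triplets` helpers, the two-level label set, and especially `toy_score`, which is a crude stand-in for whatever model-based scorer a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple

# Assumed label set; the abstract does not enumerate the formality levels.
FORMALITY_LEVELS = ("formal", "informal")


@dataclass(frozen=True)
class Triplet:
    """A bilingual sentence pair labeled with target-language formality."""
    source: str
    target: str
    formality: str


# Hypothetical scorer: how well `target` fits `source` at a given level.
Scorer = Callable[[str, str, str], float]


def infer_formality(source: str, target: str, score: Scorer) -> str:
    """Pick the formality level that the scorer rates highest."""
    return max(FORMALITY_LEVELS, key=lambda level: score(source, target, level))


def synthesize_triplets(
    pairs: Iterable[Tuple[str, str]], score: Scorer
) -> Iterator[Triplet]:
    """Turn unlabeled bilingual pairs into labeled triplets, one at a time."""
    for source, target in pairs:
        yield Triplet(source, target, infer_formality(source, target, score))


def toy_score(source: str, target: str, level: str) -> float:
    """Toy stand-in only: counts a few informal markers in the target."""
    markers = sum(target.lower().count(m) for m in ("n't", "gonna", "wanna"))
    return float(markers) if level == "informal" else 0.5


pairs = [
    ("Je ne peux pas venir.", "I can't come."),
    ("Je ne peux pas venir.", "I cannot come."),
]
triplets = list(synthesize_triplets(pairs, toy_score))
for t in triplets:
    print(t.formality, "|", t.target)
```

Because `synthesize_triplets` is a generator, labels are produced lazily as training data is consumed, which is one simple way to model "on the fly" inference. In a real system the scorer would presumably be the translation model itself or a formality classifier rather than a marker count.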
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
